Vercel eve: When an Agent is No Longer Code, But a Directory

An in-depth deconstruction of the design philosophy behind Vercel's newly released AI Agent framework eve — why it defines an Agent as a filesystem rather than a code API, and how this fundamentally differs from other mainstream frameworks.


On June 17, 2026, Vercel open-sourced the Agent framework eve. It is not just another "better LangChain" but an entirely different approach to building Agents.

TL;DR

Dimension	Mainstream Frameworks	eve
Defining an Agent	Code API (`@tool`, `Agent(name=...)`)	File system (drop a file to register)
Production capabilities	Bolt-on (deployment/sandbox/observability handled separately)	Built-in (durable execution + sandbox + tracing)
Channel integration	Post-hoc integration (build agent first, then add Slack)	First-class citizen (files alongside tools)
Deployment model	Self-hosted services or cloud functions	`vercel deploy`, just like a frontend project
Core metaphor	Agent = Workflow / Computation Graph	Agent = Deployable Software Product

1. The Problem: The "Multi-Layer Cake" Dilemma of Agent Frameworks

2025 was the year of Agent framework proliferation. LangGraph, CrewAI, OpenAI Agents SDK, Google ADK, Anthropic Agent SDK… Every framework tries to solve the same problem: make Agent development faster and more reliable.

But they all share one starting assumption: An Agent is code.

LangGraph defines an Agent as a state graph (StateGraph), CrewAI defines it as a role object, and the OpenAI Agents SDK defines it as a model invocation chain. Each abstraction has its merits, but they all lead to the same outcome: you have to read the code to understand what an Agent is, what it can do, and where it's deployed.

Vercel experienced this pain internally throughout 2025 — they built hundreds of Agents, each with its own state management approach, credential management method, and logging format. It looked like a productivity revolution, but in reality, every team was reinventing the wheel.

eve starts from a different assumption: An Agent should be like a website — you know what it is at a glance.

2. Core Difference: File System as API

2.1 Mainstream Frameworks: Explicit Registration

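# NOTE: `model`, `sql_tool`, and `chart_tool` are assumed to be defined
# elsewhere, in whatever form each framework expects.

# LangGraph
from langgraph.prebuilt import create_react_agent
agent = create_react_agent(model, tools=[sql_tool, chart_tool])

# CrewAI
from crewai import Agent as CrewAgent
agent = CrewAgent(
    role="data_analyst",
    goal="Answer team data questions",
    backstory="Senior analyst on the data team",  # CrewAI expects a backstory; placeholder text
    tools=[sql_tool, chart_tool],
)

# OpenAI Agents SDK
from agents import Agent as OpenAIAgent
agent = OpenAIAgent(
    name="data_analyst",
    instructions="You are a senior data analyst...",
    tools=[sql_tool, chart_tool],
)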

Each framework has its own registration syntax. Tools must be imported, passed as parameters, and manually maintained in a list. The Agent's capability inventory is scattered across the codebase.

2.2 eve: Implicit Discovery

agent/
├── agent.ts              # Model config (one line of code)
├── instructions.md       # System prompt
├── tools/
│   ├── run_sql.ts        # Filename = tool name
│   └── post_chart.ts
├── skills/
│   └── revenue-definitions.md  # On-demand knowledge
├── channels/
│   └── slack.ts
└── schedules/
    └── monday-summary.ts

No add_tool(), no register(), no import list. Drop run_sql.ts into the tools/ directory, and eve automatically discovers and registers it at build time. The filename is the tool name, the directory location is the capability type.

This follows the same philosophy as Next.js: convention over configuration. Next.js maps folders to routes; eve maps files to Agent capabilities.
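To make the convention concrete, here is a minimal sketch of how path-based discovery could work. It is not eve's actual implementation: the names `discoverCapabilities`, `Capability`, and `KIND_BY_DIR` are hypothetical, and the sketch takes a plain list of paths rather than reading the disk, so it only illustrates the rule "directory = capability type, filename = capability name."

```typescript
// Hypothetical sketch of convention-based discovery; NOT eve's real API.
type CapabilityKind = "tool" | "skill" | "channel" | "schedule";

interface Capability {
  kind: CapabilityKind;
  name: string; // filename without its extension
  path: string;
}

// Directory name -> capability type, mirroring the layout shown above.
const KIND_BY_DIR = new Map<string, CapabilityKind>([
  ["tools", "tool"],
  ["skills", "skill"],
  ["channels", "channel"],
  ["schedules", "schedule"],
]);

function discoverCapabilities(paths: string[]): Capability[] {
  const found: Capability[] = [];
  for (const path of paths) {
    const parts = path.split("/");
    const [root, dir, file] = parts;
    // Only agent/<dir>/<file> counts; everything else is ignored.
    if (parts.length !== 3 || root !== "agent") continue;
    if (dir === undefined || file === undefined) continue;
    const kind = KIND_BY_DIR.get(dir);
    if (kind === undefined) continue;
    // Strip the final extension: "run_sql.ts" -> "run_sql".
    const name = file.replace(/\.[^.]+$/, "");
    found.push({ kind, name, path });
  }
  return found;
}

const capabilities = discoverCapabilities([
  "agent/agent.ts",
  "agent/tools/run_sql.ts",
  "agent/channels/slack.ts",
]);
// capabilities holds two entries: the tool "run_sql" and the channel "slack".
```

Because registration is derived purely from paths, there is no list to keep in sync: adding or removing a file is the whole change.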

2.3 Why Does This Matter?

Scenario	Code API	File System
Newcomer understanding Agent capabilities	Read code, search for `@tool` decorators	Look at directory structure, instantly clear
Removing a tool	Find registration point, delete code, check references	Delete the file, done
Diff for Agent capability changes	Code changes mixed in with business logic	File additions/deletions = capability changes
Team collaboration	Multiple tools registered in one file, merge conflicts	One file per person, naturally conflict-free

The file system is humanity's most familiar "database." We already organize documents, code, and knowledge into folders and directories. eve brings this intuition to Agent construction.
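The "file additions/deletions = capability changes" row in the table above can be illustrated with a small sketch. It assumes capability files are identified by their paths; `diffCapabilities` and `CapabilityDiff` are hypothetical names, and in a real repository this information would come from the version-control diff itself.

```typescript
// Hypothetical sketch: a capability change is just a set difference on paths.
interface CapabilityDiff {
  added: string[];
  removed: string[];
}

function diffCapabilities(before: string[], after: string[]): CapabilityDiff {
  const beforeSet = new Set(before);
  const afterSet = new Set(after);
  return {
    // Present now but not before: a new capability.
    added: after.filter((p) => !beforeSet.has(p)).sort(),
    // Present before but gone now: a removed capability.
    removed: before.filter((p) => !afterSet.has(p)).sort(),
  };
}

const change = diffCapabilities(
  ["agent/tools/run_sql.ts", "agent/tools/post_chart.ts"],
  ["agent/tools/run_sql.ts", "agent/channels/slack.ts"],
);
// change.added   is ["agent/channels/slack.ts"]
// change.removed is ["agent/tools/post_chart.ts"]
```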

3. Production Is Built-In, Not Bolt-On

3.1 Other Frameworks' "Production Assembly"

Take LangGraph as an example. To build a production-grade Agent you need:

Core framework (LangGraph)
  + Deployment (LangServe / self-hosted FastAPI)
  + State persistence (Redis / PostgreSQL)
  + Sandbox (self-hosted Docker / cloud functions)
  + Observability (LangSmith or custom OpenTelemetry)
  + Human-in-the-loop (custom approval flow)
  + Evaluation framework (LangSmith evals or custom)

Each layer is a technical decision, and each layer requires integration and maintenance.

3.2 eve's "Production-Ready Out of the Box"

eve bakes all of the following capabilities directly into the framework as primitives:

Capability	Implementation	Description
Durable Execution	Based on open-source Workflow SDK	Each conversation is a persistent workflow, checkpointed at every step, crash-recoverable
Sandbox	Isolated runtime environment	Agent-generated code cannot touch the host environment (Vercel Sandbox / Docker / microsandbox)
Human-in-the-loop	Tool-level `needsApproval` field	Agent pauses waiting for approval, no compute resources consumed
Tracing	OpenTelemetry spans	Every model call and tool call has a complete trace, exportable to any platform
Evals	Built-in testing framework	`eve eval` runs locally, CI integration, validated before deployment
Multi-channel	File adapters	Same Agent serves Slack, Discord, Teams, HTTP

This isn't "we've connected these tools for you." Rather, these capabilities are first-class components of the Agent, on par with tools, skills, and channels.
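The durable-execution row is the least intuitive of these, so here is a minimal sketch of checkpoint-and-resume under simplifying assumptions: steps are synchronous, results are strings, and an in-memory `Map` stands in for durable storage. The names `Step`, `CheckpointStore`, and `runWorkflow` are hypothetical and do not reproduce the Workflow SDK's API.

```typescript
// Hypothetical sketch of checkpoint-and-resume; not the Workflow SDK's API.
interface Step {
  id: string;
  run: () => string;
}

// Stands in for durable storage (a database or object store in practice).
type CheckpointStore = Map<string, string>;

function runWorkflow(steps: Step[], store: CheckpointStore): string[] {
  const results: string[] = [];
  for (const step of steps) {
    const saved = store.get(step.id);
    if (saved !== undefined) {
      // This step completed in an earlier run: reuse its result.
      results.push(saved);
      continue;
    }
    const output = step.run(); // may throw, which simulates a crash
    store.set(step.id, output); // checkpoint before moving to the next step
    results.push(output);
  }
  return results;
}
```

If a run crashes at step three, calling `runWorkflow` again with the same store replays the saved results of steps one and two and resumes at step three, which is the behavior the table describes as "crash-recoverable."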

3.3 A Concrete Example: Human-in-the-loop

// agent/tools/run_sql.ts
import { z } from "zod";
// defineTool comes from the eve package; estimateScanGb and runReadOnlySql
// are project helpers assumed to be defined elsewhere.

export default defineTool({
  description: "Run a read-only SQL query against the warehouse.",
  inputSchema: z.object({ sql: z.string() }),

  // One field determines whether human approval is needed
  needsApproval: ({ toolInput }) => estimateScanGb(toolInput.sql) > 50,

  async execute({ sql }) {
    const { columns, rows } = await runReadOnlySql(sql);
    // Return at most 500 rows
    return { columns, rows: rows.slice(0, 500) };
  },
});

To do the same thing in other frameworks, you need to define an approval middleware, register it in the Agent loop, implement pause/resume, and integrate a notification channel. In eve, it takes a single field: a predicate that returns a boolean.
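What a runtime might do with such a predicate can be sketched as follows. This is an illustration of the pause/resume idea only, not eve's internals: `GatedTool`, `CallResult`, and `callTool` are hypothetical names, and "pausing" is modeled simply as returning a pending result that holds the input until someone approves.

```typescript
// Hypothetical sketch of an approval gate; names are illustrative, not eve's API.
interface GatedTool<I, O> {
  needsApproval: (input: I) => boolean;
  execute: (input: I) => O;
}

type CallResult<I, O> =
  | { status: "done"; output: O }
  | { status: "pending"; input: I }; // parked until a human approves

function callTool<I, O>(
  tool: GatedTool<I, O>,
  input: I,
  approved: boolean = false,
): CallResult<I, O> {
  if (!approved && tool.needsApproval(input)) {
    // Nothing runs here: the call is recorded and waits for approval.
    return { status: "pending", input };
  }
  return { status: "done", output: tool.execute(input) };
}

// A stand-in tool: scans above 50 GB require approval.
const scanTool: GatedTool<{ scanGb: number }, string> = {
  needsApproval: (input) => input.scanGb > 50,
  execute: (input) => `scanned ${input.scanGb} GB`,
};

const smallScan = callTool(scanTool, { scanGb: 10 }); // status "done"
const largeScan = callTool(scanTool, { scanGb: 80 }); // status "pending"
```

Resuming is then a second call with `approved` set to `true` and the stored input, which is why a pending call does not need to hold any compute while it waits.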

4. Channels Are First-Class Citizens, Not Post-Hoc Integrations

4.1 Traditional Approach

1. Build Agent core logic
2. Agent is running
3. Want Slack → write Slack adapter
4. Want Discord → write another adapter
5. Each channel is an independent integration project

4.2 eve's Approach

agent/channels/
├── http.ts        # Default, included out of the box
├── slack.ts       # eve channels add slack (one command generates)
├── discord.ts     # eve channels add discord
└── telegram.ts

Sessions can also migrate across channels: a question asked on Slack can be continued in the web interface, and an event triggered by an HTTP webhook can open a Slack investigation thread.
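One way to picture cross-channel sessions is a conversation history keyed by session ID rather than by channel. The sketch below assumes exactly that; `SessionStore`, `ChannelMessage`, and the session ID format are hypothetical, and how eve actually links a Slack thread to a web session is not described in this article.

```typescript
// Hypothetical sketch: the session, not the channel, owns the history.
type ChannelName = "slack" | "http" | "discord";

interface ChannelMessage {
  channel: ChannelName;
  text: string;
}

class SessionStore {
  private sessions = new Map<string, ChannelMessage[]>();

  append(sessionId: string, message: ChannelMessage): void {
    const history = this.sessions.get(sessionId) ?? [];
    history.push(message);
    this.sessions.set(sessionId, history);
  }

  history(sessionId: string): ChannelMessage[] {
    return this.sessions.get(sessionId) ?? [];
  }
}

const sessionStore = new SessionStore();
// The question starts on Slack...
sessionStore.append("session-1", { channel: "slack", text: "What was Q3 revenue?" });
// ...and the follow-up arrives over HTTP, landing in the same history.
sessionStore.append("session-1", { channel: "http", text: "Break it down by region." });
```

Each channel adapter only has to translate its platform's events into this shared message shape; the Agent sees one continuous conversation.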

The conceptual difference: Other frameworks treat the Agent as "a bot living inside some IM platform"; eve treats the Agent as "a service that exists wherever the user is."

5. Deeper Design Philosophy Differences

5.1 Different "Ontology" of Agents

Framework	What is an Agent?	Core Abstraction
LangGraph	A stateful computation graph	StateGraph + Node + Edge
CrewAI	A role-playing team	Agent + Task + Crew
OpenAI SDK	An extension of the model	Agent + Tool + Handoff
Anthropic SDK	Claude's tool usage	Tool + Agent Loop
eve	A deployable software product	File + Directory + Channel

This isn't a question of "which is better" — it's that different starting assumptions lead to entirely different abstraction systems.

LangGraph's starting point is "how to model an Agent's reasoning process," so its core is a graph. CrewAI's starting point is "how to simulate team collaboration," so its core is roles. eve's starting point is "how to make an Agent as maintainable as a website," so its core is the file system.

5.2 Different Understanding of "Developer Experience"

Other frameworks' DX focus: the experience of writing code. Type hints, IDE autocomplete, streaming output.

eve's DX focus: the experience of understanding, maintaining, and collaborating. Look at the directory to know Agent capabilities, look at the diff to know what changed, look at the trace to know what happened.

These are two different levels of DX. The former optimizes for "writing the first Agent," the latter optimizes for "maintaining the hundredth Agent."

5.3 Different Understanding of "Production"

Other frameworks: "We help you build the Agent, production deployment is your problem."

eve: "An Agent is production software by nature; the framework should be designed for production from day one."

This is reflected in:

Deployment: Not "deploy to your server," but vercel deploy, just like a Next.js project
Version control: Every component of the Agent lives in Git, prompt changes are commits with diffs, reviews, and history
Rollback: Vercel's instant rollback applies directly
CI/CD: eve eval as a deployment gate, regression checks in CI
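The idea of an eval acting as a deployment gate can be sketched generically. The code below does not reproduce `eve eval` or its configuration: `EvalCase`, `EvalReport`, `runEvalGate`, and the pass-rate threshold are all assumptions made for illustration, and the "agent" is a plain function standing in for a real Agent call.

```typescript
// Hypothetical sketch of an eval gate; it does not reproduce `eve eval`.
interface EvalCase {
  name: string;
  input: string;
  check: (output: string) => boolean;
}

interface EvalReport {
  passed: number;
  total: number;
  failures: string[];
  ok: boolean; // true when the pass rate meets the threshold
}

function runEvalGate(
  agent: (input: string) => string,
  cases: EvalCase[],
  minPassRate: number,
): EvalReport {
  const failures: string[] = [];
  for (const evalCase of cases) {
    if (!evalCase.check(agent(evalCase.input))) {
      failures.push(evalCase.name);
    }
  }
  const total = cases.length;
  const passed = total - failures.length;
  // An empty suite never passes the gate.
  const ok = total > 0 && passed / total >= minPassRate;
  return { passed, total, failures, ok };
}

// A stub agent that upper-cases its input, plus two regression cases.
const stubAgent = (input: string): string => input.toUpperCase();
const report = runEvalGate(
  stubAgent,
  [
    { name: "shouts", input: "hi", check: (out) => out === "HI" },
    { name: "keeps-lowercase", input: "hi", check: (out) => out === "hi" },
  ],
  1.0,
);
// report.ok is false: one of the two cases fails, below the 100% threshold.
```

In a CI pipeline, a script would exit with a non-zero status when `ok` is false, so a regression blocks the deploy instead of reaching production.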

6. Vercel's Internal Practice

eve is not theoretical — Vercel itself has been running over a hundred Agents in production:

Agent	Purpose	Data
d0	Data analysis	30K+ questions processed monthly, permission isolation
Lead Agent	Autonomous SDR	$5K annual cost, 32x ROI
Athena	Sales dashboard	Built by RevOps in 6 weeks without engineers, pipeline coverage doubled
Vertex	Customer support	Resolves 92% of tickets independently
draft0	Content moderation	Automated review pipeline
V	Routing Agent	Distributes requests to the right specialized Agent

Having migrated from various tech stacks to eve, these Agents now share the same tools, the same conventions, and the same observability. One hundred Agents run the same way as one.

7. Limitations and Use Cases

Limitations

Issue	Description
Beta stage	APIs and behaviors may change; not yet suitable for business-critical workloads
Vercel lock-in	Sandbox, deployment, and OIDC are all deeply tied to the Vercel platform
TypeScript only	No Python SDK, excluding Python ecosystem Agent developers
Small ecosystem	2k stars on GitHub, community still early
Local development depends on Node	Unlike LangGraph, which can stay entirely within the Python ecosystem

Suitable Use Cases

Teams already on Vercel: Zero-friction deployment
Products needing multi-channel Agents: Channels are first-class citizens
Large numbers of Agents needing unified management: File conventions naturally suit monorepos
Teams that value engineering discipline: Git-native, CI/CD, tracing all built-in

Less Suitable Scenarios

Pure Python data science teams: TypeScript barrier
Rapid prototyping/validation: CrewAI may be faster
Deeply customized Agent reasoning flows: LangGraph's graph model is more flexible
Teams that want to avoid cloud platform lock-in: Hermes self-hosting offers more freedom

8. Comparison with Hermes

Hermes, an Agent framework geared toward daily use, shares several design ideas with eve:

Dimension	Hermes	eve
Files as config	✅ skills/ directory	✅ tools/ + skills/ directories
Multi-channel	✅ QQ/Feishu/Local	✅ Slack/Discord/HTTP
Sub-agents	✅ delegate_task	✅ subagents/ directory
Observability	✅ trace-review	✅ OpenTelemetry
Deployment	Self-hosted	Vercel platform
Model flexibility	Multi-provider	AI Gateway
Target users	Personal AI co-pilot	Team/enterprise Agents

The core difference: Hermes is "my AI assistant," eve is "our Agent product." The former emphasizes personal productivity, the latter emphasizes team collaboration and production reliability.

9. Summary

eve's true innovation is not in the new features it provides, but in how it redefines "what an Agent is":

Not a code object, but a file directory
Not a workflow graph, but a software product
Not an AI experiment, but a production service
Not a framework, but an engineering convention

This echoes the history of web development. Before frameworks, every website was hand-written HTML + CGI scripts. Next.js wasn't just "better tools" — it was a convention for "what a website should look like." What eve is doing for the Agent domain may be what Next.js did for the web.

Of course, eve is still in beta with many limitations. But its design direction deserves attention from every Agent developer — as Agents move from experimentation to production, we need not just better APIs, but better engineering paradigms.
