Vercel eve: When an Agent is No Longer Code, But a Directory
An in-depth deconstruction of the design philosophy behind Vercel's newly released AI Agent framework eve — why it defines an Agent as a filesystem rather than a code API, and how this fundamentally differs from other mainstream frameworks.
On June 17, 2026, Vercel open-sourced its Agent framework, eve. It is not just another "better LangChain" but an entirely different approach to building Agents.
TL;DR
| Dimension | Mainstream Frameworks | eve |
|---|---|---|
| Defining an Agent | Code API (@tool, Agent(name=...)) | File system (drop a file to register) |
| Production capabilities | Bolt-on (deployment/sandbox/observability handled separately) | Built-in (durable execution + sandbox + tracing) |
| Channel integration | Post-hoc integration (build agent first, then add Slack) | First-class citizen (files alongside tools) |
| Deployment model | Self-hosted services or cloud functions | vercel deploy, just like a frontend project |
| Core metaphor | Agent = Workflow / Computation Graph | Agent = Deployable Software Product |
1. The Problem: The "Multi-Layer Cake" Dilemma of Agent Frameworks
2025 was the year of Agent framework proliferation. LangGraph, CrewAI, OpenAI Agents SDK, Google ADK, Anthropic Agent SDK… Every framework tries to solve the same problem: make Agent development faster and more reliable.
But they all share one starting assumption: An Agent is code.
LangGraph defines an Agent as a state graph (StateGraph), CrewAI defines it as a role object, and the OpenAI Agents SDK defines it as a model invocation chain. Each abstraction has its merits, but they all lead to the same outcome: you have to read the code to understand what an Agent is, what it can do, and where it is deployed.
Vercel experienced this pain internally throughout 2025 — they built hundreds of Agents, each with its own state management approach, credential management method, and logging format. It looked like a productivity revolution, but in reality, every team was reinventing the wheel.
eve starts from a different assumption: An Agent should be like a website — you know what it is at a glance.
2. Core Difference: File System as API
2.1 Mainstream Frameworks: Explicit Registration
# LangGraph
from langgraph.prebuilt import create_react_agent
agent = create_react_agent(model, tools=[sql_tool, chart_tool])
# CrewAI
from crewai import Agent
agent = Agent(
    role="data_analyst",
    goal="Answer team data questions",
    tools=[sql_tool, chart_tool]
)
# OpenAI Agents SDK
from agents import Agent
agent = Agent(
    name="data_analyst",
    instructions="You are a senior data analyst...",
    tools=[sql_tool, chart_tool]
)
Each framework has its own registration syntax. Tools must be imported, passed as parameters, and manually maintained in a list. The Agent's capability inventory is scattered across the codebase.
2.2 eve: Implicit Discovery
agent/
├── agent.ts # Model config (one line of code)
├── instructions.md # System prompt
├── tools/
│   ├── run_sql.ts # Filename = tool name
│   └── post_chart.ts
├── skills/
│   └── revenue-definitions.md # On-demand knowledge
├── channels/
│   └── slack.ts
└── schedules/
    └── monday-summary.ts
No add_tool(), no register(), no import list. Drop run_sql.ts into the tools/ directory, and eve automatically discovers and registers it at build time. The filename is the tool name, the directory location is the capability type.
This follows the same philosophy as Next.js: convention over configuration. Next.js makes folders = routes, eve makes files = Agent capabilities.
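The convention can be sketched in a few lines. This is an illustration only, not eve's implementation: the `discoverCapabilities` function, the `Capability` shape, and the directory-to-type mapping are all assumptions made for the example, and a real build step would read the paths from disk instead of taking them as an array.

```typescript
// Hypothetical sketch of convention-based discovery; not eve's actual code.
type CapabilityKind = "tool" | "skill" | "channel" | "schedule";

interface Capability {
  kind: CapabilityKind;
  name: string; // derived from the filename, extension stripped
  file: string; // path relative to the agent directory
}

// Assumed mapping from directory name to capability type.
const KIND_BY_DIR = new Map<string, CapabilityKind>([
  ["tools", "tool"],
  ["skills", "skill"],
  ["channels", "channel"],
  ["schedules", "schedule"],
]);

function discoverCapabilities(files: string[]): Capability[] {
  const found: Capability[] = [];
  for (const file of files) {
    const slash = file.indexOf("/");
    // Keep only direct children of a directory: exactly one "/" in the path.
    if (slash === -1 || file.indexOf("/", slash + 1) !== -1) continue;
    const kind = KIND_BY_DIR.get(file.slice(0, slash));
    if (kind === undefined) continue; // not a capability directory
    const name = file.slice(slash + 1).replace(/\.[^.]+$/, "");
    found.push({ kind, name, file });
  }
  return found;
}

const capabilities = discoverCapabilities([
  "agent.ts",
  "instructions.md",
  "tools/run_sql.ts",
  "tools/post_chart.ts",
  "skills/revenue-definitions.md",
  "channels/slack.ts",
  "schedules/monday-summary.ts",
]);
console.log(capabilities.map((c) => `${c.kind}:${c.name}`));
```

Nothing in the sketch is registered by hand: adding or deleting a path in the input is the only way to change the result, which is the property the directory layout is meant to provide.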
2.3 Why Does This Matter?
| Scenario | Code API | File System |
|---|---|---|
| Newcomer understanding Agent capabilities | Read code, search for @tool decorators | Look at directory structure, instantly clear |
| Removing a tool | Find registration point, delete code, check references | Delete the file, done |
| Diff for Agent capability changes | Code changes mixed in with business logic | File additions/deletions = capability changes |
| Team collaboration | Multiple tools registered in one file, merge conflicts | One file per person, naturally conflict-free |
The file system is humanity's most familiar "database." We manage documents with folders, code with directories, and knowledge bases the same way. eve brings this intuition to Agent construction.
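To make the "file additions/deletions = capability changes" row concrete, here is a minimal sketch that derives a capability diff from two file listings, such as the trees before and after a commit. The `diffCapabilities` function and the path pattern are assumptions for illustration; eve is not known to ship such a helper.

```typescript
// Hypothetical sketch: directory names mirror the layout shown earlier;
// the function itself is an illustration, not part of eve.
interface CapabilityDiff {
  added: string[];
  removed: string[];
}

// Only direct children of the capability directories count as capabilities.
const CAPABILITY_PATH = /^(tools|skills|channels|schedules)\/[^/]+$/;

function diffCapabilities(before: string[], after: string[]): CapabilityDiff {
  const prev = new Set(before.filter((p) => CAPABILITY_PATH.test(p)));
  const next = new Set(after.filter((p) => CAPABILITY_PATH.test(p)));
  return {
    added: Array.from(next).filter((p) => !prev.has(p)).sort(),
    removed: Array.from(prev).filter((p) => !next.has(p)).sort(),
  };
}

const change = diffCapabilities(
  ["agent.ts", "instructions.md", "tools/run_sql.ts", "tools/post_chart.ts"],
  ["agent.ts", "instructions.md", "tools/run_sql.ts", "channels/slack.ts"],
);
console.log(change);
```

A reviewer reading this output sees that the Agent gained a Slack channel and lost a charting tool without opening a single source file.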
3. Production Is Built-In, Not Bolt-On
3.1 Other Frameworks' "Production Assembly"
Take LangGraph as an example. To build a production-grade Agent you need:
Core framework (LangGraph)
+ Deployment (LangServe / self-hosted FastAPI)
+ State persistence (Redis / PostgreSQL)
+ Sandbox (self-hosted Docker / cloud functions)
+ Observability (LangSmith or custom OpenTelemetry)
+ Human-in-the-loop (custom approval flow)
+ Evaluation framework (LangSmith evals or custom)
Each layer is a technical decision, and each layer requires integration and maintenance.
3.2 eve's "Production-Ready Out of the Box"
eve bakes all of the following capabilities directly into the framework as primitives:
| Capability | Implementation | Description |
|---|---|---|
| Durable Execution | Based on open-source Workflow SDK | Each conversation is a persistent workflow, checkpointed at every step, crash-recoverable |
| Sandbox | Isolated runtime environment | Agent-generated code cannot touch the host environment (Vercel Sandbox / Docker / microsandbox) |
| Human-in-the-loop | Tool-level needsApproval field | Agent pauses waiting for approval, no compute resources consumed |
| Tracing | OpenTelemetry spans | Every model call and tool call has a complete trace, exportable to any platform |
| Evals | Built-in testing framework | eve eval runs locally, CI integration, validated before deployment |
| Multi-channel | File adapters | Same Agent serves Slack, Discord, Teams, HTTP |
This isn't "we've connected these tools for you." These capabilities are first-class components of the Agent, on par with tools, skills, and channels.
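The durable execution row is the least intuitive of these, so a small sketch may help. It shows the general checkpoint-and-replay idea under stated assumptions: `runDurable`, `Step`, and the in-memory `CheckpointStore` are hypothetical names, not the Workflow SDK's API, and a real system would persist checkpoints outside the process.

```typescript
// Hypothetical sketch of checkpoint-and-replay; not the Workflow SDK's API.
type Step = { name: string; run: () => Promise<string> };

// Stand-in for durable storage, keyed by step name.
type CheckpointStore = Map<string, string>;

async function runDurable(steps: Step[], store: CheckpointStore): Promise<string[]> {
  const outputs: string[] = [];
  for (const step of steps) {
    const saved = store.get(step.name);
    if (saved !== undefined) {
      outputs.push(saved); // finished in an earlier attempt: reuse, do not re-run
      continue;
    }
    const result = await step.run();
    store.set(step.name, result); // checkpoint before moving to the next step
    outputs.push(result);
  }
  return outputs;
}

async function demo(): Promise<{ outputs: string[]; calls: { plan: number; query: number } }> {
  const store: CheckpointStore = new Map();
  const calls = { plan: 0, query: 0 };
  let crashOnce = true;
  const steps: Step[] = [
    { name: "plan", run: async () => { calls.plan += 1; return "plan-ok"; } },
    {
      name: "query",
      run: async () => {
        calls.query += 1;
        if (crashOnce) { crashOnce = false; throw new Error("simulated crash"); }
        return "query-ok";
      },
    },
  ];
  try {
    await runDurable(steps, store);
  } catch {
    // The first attempt dies mid-workflow; the "plan" checkpoint survives.
  }
  const outputs = await runDurable(steps, store); // resume from checkpoints
  return { outputs, calls };
}

demo().then((result) => console.log(result));
```

On the second attempt the `plan` step is replayed from its checkpoint instead of being executed again, which is what makes a crash recoverable without repeating completed model or tool calls.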
3.3 A Concrete Example: Human-in-the-loop
// agent/tools/run_sql.ts
export default defineTool({
  description: "Run a read-only SQL query against the warehouse.",
  inputSchema: z.object({ sql: z.string() }),
  // One field decides whether a call needs human approval:
  // here, any query estimated to scan more than 50 GB
  needsApproval: ({ toolInput }) => estimateScanGb(toolInput.sql) > 50,
  async execute({ sql }) {
    const { columns, rows } = await runReadOnlySql(sql);
    // Cap the result at 500 rows
    return { columns, rows: rows.slice(0, 500) };
  },
});
To do the same thing in other frameworks you need to define an approval middleware, register it in the Agent loop, implement pause/resume, and integrate a notification channel. In eve, it is a single predicate on the tool definition.
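For intuition about what the framework does with that field, here is a minimal sketch of a dispatcher that honors it. Everything here is an assumption for illustration: `callTool`, `ToolDef`, and `CallOutcome` are hypothetical names rather than eve internals, and `estimateScanGb` is a toy stand-in for the helper used in the tool above.

```typescript
// Hypothetical sketch of how a loop could honor needsApproval; not eve's internals.
interface ToolDef<I, O> {
  needsApproval?: (ctx: { toolInput: I }) => boolean;
  execute: (input: I) => Promise<O>;
}

type CallOutcome<O> =
  | { status: "done"; output: O }
  | { status: "awaiting_approval" };

async function callTool<I, O>(
  tool: ToolDef<I, O>,
  toolInput: I,
  approved = false,
): Promise<CallOutcome<O>> {
  const gated = tool.needsApproval?.({ toolInput }) ?? false;
  if (gated && !approved) {
    // Nothing executes here; the caller persists the pending call and waits.
    return { status: "awaiting_approval" };
  }
  return { status: "done", output: await tool.execute(toolInput) };
}

// Toy stand-in for the scan-size estimator.
const estimateScanGb = (sql: string): number => (sql.includes("events") ? 120 : 1);

const runSql: ToolDef<{ sql: string }, { rows: number }> = {
  needsApproval: ({ toolInput }) => estimateScanGb(toolInput.sql) > 50,
  execute: async () => ({ rows: 3 }),
};

callTool(runSql, { sql: "SELECT * FROM events" }).then((r) => console.log(r.status));
```

The tool author only writes the predicate. Pausing, persisting the pending call, and resuming with `approved` set are the dispatcher's job, which is why the tool file stays so small.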
4. Channels Are First-Class Citizens, Not Post-Hoc Integrations
4.1 Traditional Approach
1. Build Agent core logic
2. Agent is running
3. Want Slack → write Slack adapter
4. Want Discord → write another adapter
5. Each channel is an independent integration project
4.2 eve's Approach
agent/channels/
├── http.ts # Default, included out of the box
├── slack.ts # eve channels add slack (one command generates)
├── discord.ts # eve channels add discord
└── telegram.ts
And sessions can migrate across channels — a question asked on Slack can be continued on the web interface. An HTTP webhook-triggered event can open a Slack investigation thread.
The conceptual difference: Other frameworks treat the Agent as "a bot living inside some IM platform"; eve treats the Agent as "a service that exists wherever the user is."
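One way to picture cross-channel sessions is a set of thin adapters that normalize each platform's payload into a shared message shape, with history keyed by session instead of by channel. The sketch below is an assumption-laden illustration: `ChannelAdapter`, `SessionStore`, the payload shapes, and the choice to key sessions by user ID are all hypothetical, not eve's API.

```typescript
// Hypothetical sketch: adapters normalize platform events into one session store.
interface InboundMessage {
  sessionId: string;
  channel: string;
  text: string;
}

interface ChannelAdapter<Raw> {
  readonly name: string;
  toMessage(raw: Raw): InboundMessage;
}

// Assumed payload shapes, simplified for the example.
interface SlackEvent { user: string; text: string }
interface HttpRequestBody { userId: string; message: string }

// Assumption: both channels resolve to the same user ID, which keys the session.
const slackAdapter: ChannelAdapter<SlackEvent> = {
  name: "slack",
  toMessage: (raw) => ({ sessionId: raw.user, channel: "slack", text: raw.text }),
};

const httpAdapter: ChannelAdapter<HttpRequestBody> = {
  name: "http",
  toMessage: (raw) => ({ sessionId: raw.userId, channel: "http", text: raw.message }),
};

class SessionStore {
  private sessions = new Map<string, InboundMessage[]>();

  // Appends to the session's history regardless of the originating channel.
  append(msg: InboundMessage): InboundMessage[] {
    const history = this.sessions.get(msg.sessionId) ?? [];
    history.push(msg);
    this.sessions.set(msg.sessionId, history);
    return history;
  }
}

const sessionStore = new SessionStore();
sessionStore.append(slackAdapter.toMessage({ user: "u-42", text: "Why did revenue dip last week?" }));
const history = sessionStore.append(
  httpAdapter.toMessage({ userId: "u-42", message: "Break that down by region." }),
);
console.log(history.map((m) => `${m.channel}: ${m.text}`));
```

Because the Agent only ever sees `InboundMessage`, adding a channel means adding one adapter file, and the follow-up asked over HTTP lands in the same history as the question asked on Slack.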
5. Deeper Design Philosophy Differences
5.1 Different "Ontology" of Agents
| Framework | What is an Agent? | Core Abstraction |
|---|---|---|
| LangGraph | A stateful computation graph | StateGraph + Node + Edge |
| CrewAI | A role-playing team | Agent + Task + Crew |
| OpenAI SDK | An extension of the model | Agent + Tool + Handoff |
| Anthropic SDK | Claude's tool usage | Tool + Agent Loop |
| eve | A deployable software product | File + Directory + Channel |
This isn't a question of "which is better" — it's that different starting assumptions lead to entirely different abstraction systems.
LangGraph's starting point is "how to model an Agent's reasoning process," so its core is a graph. CrewAI's starting point is "how to simulate team collaboration," so its core is roles. eve's starting point is "how to make an Agent as maintainable as a website," so its core is the file system.
5.2 Different Understanding of "Developer Experience"
Other frameworks' DX focus: the experience of writing code. Type hints, IDE autocomplete, streaming output.
eve's DX focus: the experience of understanding, maintaining, and collaborating. Look at the directory to know Agent capabilities, look at the diff to know what changed, look at the trace to know what happened.
These are two different levels of DX. The former optimizes for "writing the first Agent," the latter optimizes for "maintaining the hundredth Agent."
5.3 Different Understanding of "Production"
Other frameworks: "We help you build the Agent, production deployment is your problem."
eve: "An Agent is production software by nature; the framework should be designed for production from day one."
This is reflected in:
- Deployment: Not "deploy to your server," but vercel deploy, just like a Next.js project
- Version control: Every component of the Agent lives in Git; prompt changes are commits with diffs, reviews, and history
- Rollback: Vercel's instant rollback applies directly
- CI/CD: eve eval as a deployment gate, regression checks in CI
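The deployment-gate idea reduces to a small decision function. The sketch below is a hypothetical illustration: the `EvalResult` shape, the `evalGate` function, the case names, and the 90% threshold are assumptions, since the output format of eve eval is not described here.

```typescript
// Hypothetical sketch of an eval gate; result shape and threshold are assumptions.
interface EvalResult {
  name: string;
  passed: boolean;
}

interface GateReport {
  passRate: number;
  ok: boolean;
  failed: string[];
}

function evalGate(results: EvalResult[], minPassRate: number): GateReport {
  const failed = results.filter((r) => !r.passed).map((r) => r.name);
  // An empty suite counts as a 0% pass rate rather than passing vacuously.
  const passRate =
    results.length === 0 ? 0 : (results.length - failed.length) / results.length;
  return { passRate, ok: passRate >= minPassRate, failed };
}

const report = evalGate(
  [
    { name: "answers-weekly-revenue", passed: true },
    { name: "refuses-write-queries", passed: true },
    { name: "caps-result-rows", passed: true },
    { name: "large-scan-needs-approval", passed: false },
  ],
  0.9,
);

// A CI step would exit with this code so that a failing gate blocks the deploy.
const exitCode = report.ok ? 0 : 1;
console.log(report, exitCode);
```

Three of four cases pass here, so the 75% pass rate falls short of the 90% bar and the step would report failure, keeping the regression out of production.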
6. Vercel's Internal Practice
eve is not theoretical — Vercel itself has been running over a hundred Agents in production:
| Agent | Purpose | Data |
|---|---|---|
| d0 | Data analysis | 30K+ questions processed monthly, permission isolation |
| Lead Agent | Autonomous SDR | $5K annual cost, 32x ROI |
| Athena | Sales dashboard | Built by RevOps in 6 weeks without engineers, pipeline coverage doubled |
| Vertex | Customer support | Resolves 92% of tickets independently |
| draft0 | Content moderation | Automated review pipeline |
| V | Routing Agent | Distributes requests to the right specialized Agent |
After these Agents migrated from various tech stacks to eve, they share the same set of tools, the same conventions, and the same observability. One hundred Agents run in the same way as one Agent.
7. Limitations and Use Cases
Limitations
| Issue | Description |
|---|---|
| Beta stage | APIs and behaviors may change, not suitable for core business |
| Vercel lock-in | Sandbox, deployment, and OIDC are all deeply tied to the Vercel platform |
| TypeScript only | No Python SDK, excluding Python ecosystem Agent developers |
| Small ecosystem | 2k stars on GitHub, community still early |
| Local development depends on Node | Unlike LangGraph, which can stay entirely within the Python ecosystem |
Suitable Use Cases
- Teams already on Vercel: Zero-friction deployment
- Products needing multi-channel Agents: Channels are first-class citizens
- Large numbers of Agents needing unified management: File conventions naturally suit monorepos
- Teams that value engineering discipline: Git-native, CI/CD, tracing all built-in
Less Suitable Scenarios
- Pure Python data science teams: TypeScript barrier
- Rapid prototyping/validation: CrewAI may be faster
- Deeply customized Agent reasoning flows: LangGraph's graph model is more flexible
- Don't want cloud platform lock-in: Hermes self-hosting offers more freedom
8. Comparison with Hermes
As a daily-use Agent framework, Hermes shares several design ideas with eve:
| Dimension | Hermes | eve |
|---|---|---|
| Files as config | ✅ skills/ directory | ✅ tools/ + skills/ directories |
| Multi-channel | ✅ QQ/Feishu/Local | ✅ Slack/Discord/HTTP |
| Sub-agents | ✅ delegate_task | ✅ subagents/ directory |
| Observability | ✅ trace-review | ✅ OpenTelemetry |
| Deployment | Self-hosted | Vercel platform |
| Model flexibility | Multi-provider | AI Gateway |
| Target users | Personal AI co-pilot | Team/enterprise Agents |
The core difference: Hermes is "my AI assistant," eve is "our Agent product." The former emphasizes personal productivity, the latter emphasizes team collaboration and production reliability.
9. Summary
eve's true innovation is not in the new features it provides, but in how it redefines "what an Agent is":
- Not a code object, but a file directory
- Not a workflow graph, but a software product
- Not an AI experiment, but a production service
- Not a framework, but an engineering convention
This echoes the history of web development. Before frameworks, every website was hand-written HTML + CGI scripts. Next.js wasn't just "better tools" — it was a convention for "what a website should look like." What eve is doing for the Agent domain may be what Next.js did for the web.
Of course, eve is still in beta with many limitations. But its design direction deserves attention from every Agent developer — as Agents move from experimentation to production, we need not just better APIs, but better engineering paradigms.