Vercel eve: When an Agent is No Longer Code, But a Directory

An in-depth deconstruction of the design philosophy behind Vercel's newly released AI Agent framework eve — why it defines an Agent as a filesystem rather than a code API, and how this fundamentally differs from other mainstream frameworks.

On June 17, 2026, Vercel open-sourced the Agent framework eve. It is not just another "better LangChain" but an entirely different approach to building Agents.


TL;DR

| Dimension | Mainstream Frameworks | eve |
| --- | --- | --- |
| Defining an Agent | Code API (@tool, Agent(name=...)) | File system (drop a file to register) |
| Production capabilities | Bolt-on (deployment/sandbox/observability handled separately) | Built-in (durable execution + sandbox + tracing) |
| Channel integration | Post-hoc integration (build agent first, then add Slack) | First-class citizen (files alongside tools) |
| Deployment model | Self-hosted services or cloud functions | vercel deploy, just like a frontend project |
| Core metaphor | Agent = Workflow / Computation Graph | Agent = Deployable Software Product |

1. The Problem: The "Multi-Layer Cake" Dilemma of Agent Frameworks

2025 was the year of Agent framework proliferation. LangGraph, CrewAI, OpenAI Agents SDK, Google ADK, Anthropic Agent SDK… Every framework tries to solve the same problem: make Agent development faster and more reliable.

But they all share one starting assumption: An Agent is code.

LangGraph defines an Agent as a state graph (StateGraph), CrewAI defines an Agent as a role object, OpenAI SDK defines an Agent as a model invocation chain. Each abstraction has its merits, but they all lead to the same outcome: you have to "read the code" to understand what an Agent is, what it can do, and where it's deployed.

Vercel experienced this pain internally throughout 2025 — they built hundreds of Agents, each with its own state management approach, credential management method, and logging format. It looked like a productivity revolution, but in reality, every team was reinventing the wheel.

eve starts from a different assumption: An Agent should be like a website — you know what it is at a glance.


2. Core Difference: File System as API

2.1 Mainstream Frameworks: Explicit Registration

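# Each snippet is independent; model, sql_tool, and chart_tool are assumed to be defined elsewhere.

# LangGraph
from langgraph.prebuilt import create_react_agent
agent = create_react_agent(model, tools=[sql_tool, chart_tool])

# CrewAI
from crewai import Agent
agent = Agent(
    role="data_analyst",
    goal="Answer team data questions",
    backstory="A senior analyst on the data team.",  # illustrative; CrewAI agents also take a backstory
    tools=[sql_tool, chart_tool]
)

# OpenAI Agents SDK
from agents import Agent
agent = Agent(
    name="data_analyst",
    instructions="You are a senior data analyst...",
    tools=[sql_tool, chart_tool]
)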

Each framework has its own registration syntax. Tools must be imported, passed as parameters, and manually maintained in a list. The Agent's capability inventory is scattered across the codebase.

2.2 eve: Implicit Discovery

agent/
├── agent.ts              # Model config (one line of code)
├── instructions.md       # System prompt
├── tools/
│   ├── run_sql.ts        # Filename = tool name
│   └── post_chart.ts
├── skills/
│   └── revenue-definitions.md  # On-demand knowledge
├── channels/
│   └── slack.ts
└── schedules/
    └── monday-summary.ts

No add_tool(), no register(), no import list. Drop run_sql.ts into the tools/ directory, and eve automatically discovers and registers it at build time. The filename is the tool name, the directory location is the capability type.

This follows the same philosophy as Next.js: convention over configuration. Next.js makes folders = routes, eve makes files = Agent capabilities.
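To make the convention concrete, here is a minimal sketch of how discovery from a directory layout could work. It is not eve's implementation: the names `discoverCapabilities`, `Capability`, and `DIRECTORY_KINDS` are hypothetical, and the sketch works on a plain list of paths rather than reading the disk.

```typescript
// Hypothetical sketch of convention-based discovery. None of these names
// come from eve; they only illustrate "directory = kind, filename = name".

type CapabilityKind = "tool" | "skill" | "channel" | "schedule";

interface Capability {
  kind: CapabilityKind;
  name: string; // filename without its extension
  path: string;
}

// The convention itself: which directory maps to which capability kind.
const DIRECTORY_KINDS = new Map<string, CapabilityKind>([
  ["tools", "tool"],
  ["skills", "skill"],
  ["channels", "channel"],
  ["schedules", "schedule"],
]);

function discoverCapabilities(paths: string[]): Capability[] {
  const found: Capability[] = [];
  for (const path of paths) {
    const parts = path.split("/");
    // Only paths shaped like agent/<directory>/<file> register a capability.
    if (parts.length !== 3) continue;
    const [root, dir, file] = parts;
    if (root !== "agent" || dir === undefined || file === undefined) continue;
    const kind = DIRECTORY_KINDS.get(dir);
    if (kind === undefined) continue; // unknown directory: ignored
    const name = file.replace(/\.[^.]+$/, ""); // strip the extension
    found.push({ kind, name, path });
  }
  return found;
}

// The directory tree from the article, flattened to paths.
const registry = discoverCapabilities([
  "agent/agent.ts",
  "agent/instructions.md",
  "agent/tools/run_sql.ts",
  "agent/tools/post_chart.ts",
  "agent/skills/revenue-definitions.md",
  "agent/channels/slack.ts",
  "agent/schedules/monday-summary.ts",
]);
```

In this sketch `agent.ts` and `instructions.md` are skipped because they sit at the root, and each remaining file becomes one registry entry with no registration call anywhere.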

2.3 Why Does This Matter?

| Scenario | Code API | File System |
| --- | --- | --- |
| Newcomer understanding Agent capabilities | Read code, search for @tool decorators | Look at directory structure, instantly clear |
| Removing a tool | Find registration point, delete code, check references | Delete the file, done |
| Diff for Agent capability changes | Code changes mixed in with business logic | File additions/deletions = capability changes |
| Team collaboration | Multiple tools registered in one file, merge conflicts | One file per person, naturally conflict-free |

The file system is humanity's most familiar "database." We already organize documents, code, and knowledge into folders. eve brings this intuition to Agent construction.
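The "file additions/deletions = capability changes" row can be illustrated with a small sketch. Assuming capabilities are identified purely by file path, a capability diff is a set difference between two path listings. The function name `diffCapabilities` and the file `export_csv.ts` are hypothetical and used only for illustration.

```typescript
// Hypothetical sketch: when capabilities are files, a capability diff is a
// set difference between two listings of paths.

interface CapabilityDiff {
  added: string[];
  removed: string[];
}

function diffCapabilities(before: string[], after: string[]): CapabilityDiff {
  const beforeSet = new Set(before);
  const afterSet = new Set(after);
  return {
    added: after.filter((p) => !beforeSet.has(p)).sort(),
    removed: before.filter((p) => !afterSet.has(p)).sort(),
  };
}

// Example: one tool dropped, one (made-up) tool introduced.
const capabilityDiff = diffCapabilities(
  ["agent/tools/run_sql.ts", "agent/tools/post_chart.ts", "agent/channels/slack.ts"],
  ["agent/tools/run_sql.ts", "agent/tools/export_csv.ts", "agent/channels/slack.ts"],
);
// capabilityDiff.added   -> ["agent/tools/export_csv.ts"]
// capabilityDiff.removed -> ["agent/tools/post_chart.ts"]
```

A reviewer reading such a diff sees the capability change without opening any file, which is the point the table makes.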


3. Production Is Built-In, Not Bolt-On

3.1 Other Frameworks' "Production Assembly"

Take LangGraph as an example. To build a production-grade Agent you need:

Core framework (LangGraph)
  + Deployment (LangServe / self-hosted FastAPI)
  + State persistence (Redis / PostgreSQL)
  + Sandbox (self-hosted Docker / cloud functions)
  + Observability (LangSmith or custom OpenTelemetry)
  + Human-in-the-loop (custom approval flow)
  + Evaluation framework (LangSmith evals or custom)

Each layer is a technical decision, and each layer requires integration and maintenance.

3.2 eve's "Production-Ready Out of the Box"

eve bakes all of the following capabilities directly into the framework as primitives:

| Capability | Implementation | Description |
| --- | --- | --- |
| Durable Execution | Based on open-source Workflow SDK | Each conversation is a persistent workflow, checkpointed at every step, crash-recoverable |
| Sandbox | Isolated runtime environment | Agent-generated code cannot touch the host environment (Vercel Sandbox / Docker / microsandbox) |
| Human-in-the-loop | Tool-level needsApproval field | Agent pauses waiting for approval, no compute resources consumed |
| Tracing | OpenTelemetry spans | Every model call and tool call has a complete trace, exportable to any platform |
| Evals | Built-in testing framework | eve eval runs locally, CI integration, validated before deployment |
| Multi-channel | File adapters | Same Agent serves Slack, Discord, Teams, HTTP |

This isn't "we've connected these tools for you." These capabilities are first-class components of the Agent, on par with tools, skills, and channels.
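The "checkpointed at every step, crash-recoverable" idea from the table can be sketched in a few lines. This is a generic journal-and-replay pattern under simplifying assumptions (synchronous steps, an in-memory `Map` standing in for durable storage), not the Workflow SDK's API. All names here (`runStep`, `Journal`, `conversation`) are hypothetical.

```typescript
// Hypothetical sketch of durable execution: each step's result is recorded in
// a journal, so a re-run after a crash replays finished steps instead of
// repeating them.

type Journal = Map<string, unknown>; // stand-in for durable storage

function runStep<T>(journal: Journal, stepId: string, fn: () => T): T {
  if (journal.has(stepId)) {
    return journal.get(stepId) as T; // replay: step already completed
  }
  const result = fn();
  journal.set(stepId, result); // checkpoint before moving on
  return result;
}

let modelCalls = 0; // counts how often the "expensive" step really runs
let crashOnce = true; // simulates a single crash in the second step

function conversation(journal: Journal): string {
  const plan = runStep(journal, "plan", () => {
    modelCalls += 1;
    return "query revenue by region";
  });
  const rowCount = runStep(journal, "run_sql", () => {
    if (crashOnce) {
      crashOnce = false;
      throw new Error("simulated crash");
    }
    return 3;
  });
  return `${plan}: ${rowCount} rows`;
}

// Resume once after a failure, reusing the same journal.
function runWithResume(journal: Journal): string {
  try {
    return conversation(journal);
  } catch {
    return conversation(journal);
  }
}

const journal: Journal = new Map();
const answer = runWithResume(journal);
```

After the simulated crash, the second run reuses the recorded "plan" result, so `modelCalls` stays at 1 and only the failed step executes again.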

3.3 A Concrete Example: Human-in-the-loop

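// agent/tools/run_sql.ts
import { z } from "zod";
// defineTool comes from eve (its import path is not shown in the source);
// estimateScanGb and runReadOnlySql are application helpers defined elsewhere.

export default defineTool({
  description: "Run a read-only SQL query against the warehouse.",
  inputSchema: z.object({ sql: z.string() }),

  // One field determines whether human approval is needed
  needsApproval: ({ toolInput }) => estimateScanGb(toolInput.sql) > 50,

  async execute({ sql }) {
    const { columns, rows } = await runReadOnlySql(sql);
    // Cap the payload returned to the model at 500 rows
    return { columns, rows: rows.slice(0, 500) };
  },
});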

Doing the same thing in other frameworks typically means defining an approval middleware, registering it in the Agent loop, implementing pause/resume, and integrating a notification channel. In eve it is a single needsApproval predicate that returns a boolean.
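For intuition about what a framework has to do with such a predicate, here is a simplified sketch of an approval gate. It is not eve's runtime: `ToolDef`, `CallResult`, and `callTool` are hypothetical, the predicate takes the input directly rather than a `{ toolInput }` object, and `estimateScanGb` is a made-up stand-in.

```typescript
// Hypothetical sketch: a tool call either completes or returns a pending
// state with a resume function, so nothing runs while approval is pending.

interface ToolDef<I, O> {
  needsApproval?: (input: I) => boolean;
  execute: (input: I) => O;
}

type CallResult<O> =
  | { status: "done"; output: O }
  | { status: "rejected" }
  | { status: "awaiting_approval"; resume: (approved: boolean) => CallResult<O> };

function callTool<I, O>(tool: ToolDef<I, O>, input: I): CallResult<O> {
  const gated = tool.needsApproval?.(input) ?? false;
  if (!gated) {
    return { status: "done", output: tool.execute(input) };
  }
  return {
    status: "awaiting_approval",
    // The tool body only runs once a human decision arrives.
    resume: (approved) =>
      approved ? { status: "done", output: tool.execute(input) } : { status: "rejected" },
  };
}

// Made-up stand-in for a real scan estimate: unfiltered queries look expensive.
const estimateScanGb = (sql: string): number => (sql.includes("WHERE") ? 1 : 120);

let executions = 0;
const runSql: ToolDef<{ sql: string }, string> = {
  needsApproval: ({ sql }) => estimateScanGb(sql) > 50,
  execute: ({ sql }) => {
    executions += 1;
    return `ran: ${sql}`;
  },
};

const cheap = callTool(runSql, { sql: "SELECT id FROM orders WHERE id = 1" });
const expensive = callTool(runSql, { sql: "SELECT * FROM events" });
```

Here `cheap` completes immediately while `expensive` comes back as `awaiting_approval`. A durable runtime would persist that pending state instead of holding a closure in memory, which is how waiting can cost no compute.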


4. Channels Are First-Class Citizens, Not Post-Hoc Integrations

4.1 Traditional Approach

1. Build Agent core logic
2. Agent is running
3. Want Slack → write Slack adapter
4. Want Discord → write another adapter
5. Each channel is an independent integration project

4.2 eve's Approach

agent/channels/
├── http.ts        # Default, included out of the box
├── slack.ts       # eve channels add slack (one command generates)
├── discord.ts     # eve channels add discord
└── telegram.ts

And sessions can migrate across channels — a question asked on Slack can be continued on the web interface. An HTTP webhook-triggered event can open a Slack investigation thread.
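One way to picture cross-channel sessions is a store keyed by session ID rather than by channel, with each channel adapter only translating messages in and out. The sketch below assumes that design; `SessionStore`, `Turn`, and `handleIncoming` are hypothetical names, and the "agent" is a placeholder that reports how much context it saw.

```typescript
// Hypothetical sketch: conversation state belongs to the session, not to the
// channel, so any channel can continue a session started elsewhere.

type ChannelName = "http" | "slack" | "discord" | "telegram";

interface Turn {
  channel: ChannelName;
  role: "user" | "agent";
  text: string;
}

class SessionStore {
  private sessions = new Map<string, Turn[]>();

  append(sessionId: string, turn: Turn): void {
    const turns = this.sessions.get(sessionId) ?? [];
    turns.push(turn);
    this.sessions.set(sessionId, turns);
  }

  history(sessionId: string): readonly Turn[] {
    return this.sessions.get(sessionId) ?? [];
  }
}

// Shared entry point that every channel adapter would call.
function handleIncoming(
  store: SessionStore,
  channel: ChannelName,
  sessionId: string,
  text: string,
): string {
  const earlierTurns = store.history(sessionId).length;
  store.append(sessionId, { channel, role: "user", text });
  // Placeholder for the model call: it only reports the context it received.
  const reply = `Replying on ${channel} with ${earlierTurns} earlier turn(s) of context`;
  store.append(sessionId, { channel, role: "agent", text: reply });
  return reply;
}

const store = new SessionStore();
const slackReply = handleIncoming(store, "slack", "session-1", "Why did revenue dip?");
const webReply = handleIncoming(store, "http", "session-1", "Break it down by region");
```

The second message arrives over HTTP but still sees the two turns that happened on Slack, because both adapters write to the same session.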

The conceptual difference: Other frameworks treat the Agent as "a bot living inside some IM platform"; eve treats the Agent as "a service that exists wherever the user is."


5. Deeper Design Philosophy Differences

5.1 Different "Ontology" of Agents

| Framework | What is an Agent? | Core Abstraction |
| --- | --- | --- |
| LangGraph | A stateful computation graph | StateGraph + Node + Edge |
| CrewAI | A role-playing team | Agent + Task + Crew |
| OpenAI SDK | An extension of the model | Agent + Tool + Handoff |
| Anthropic SDK | Claude's tool usage | Tool + Agent Loop |
| eve | A deployable software product | File + Directory + Channel |

This isn't a question of "which is better" — it's that different starting assumptions lead to entirely different abstraction systems.

LangGraph's starting point is "how to model an Agent's reasoning process," so its core is a graph. CrewAI's starting point is "how to simulate team collaboration," so its core is roles. eve's starting point is "how to make an Agent as maintainable as a website," so its core is the file system.

5.2 Different Understanding of "Developer Experience"

Other frameworks' DX focus: the experience of writing code. Type hints, IDE autocomplete, streaming output.

eve's DX focus: the experience of understanding, maintaining, and collaborating. Look at the directory to know Agent capabilities, look at the diff to know what changed, look at the trace to know what happened.

These are two different levels of DX. The former optimizes for "writing the first Agent," the latter optimizes for "maintaining the hundredth Agent."

5.3 Different Understanding of "Production"

Other frameworks: "We help you build the Agent, production deployment is your problem."

eve: "An Agent is production software by nature; the framework should be designed for production from day one."

This is reflected in:

  • Deployment: Not "deploy to your server," but vercel deploy, just like a Next.js project
  • Version control: Every component of the Agent lives in Git, prompt changes are commits with diffs, reviews, and history
  • Rollback: Vercel's instant rollback applies directly
  • CI/CD: eve eval as a deployment gate, regression checks in CI
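The "eval as a deployment gate" idea can be sketched independently of any tool. The following is not the `eve eval` command or its configuration format: `EvalCase`, `runEvals`, and `passesGate` are hypothetical, and the agent is replaced by a trivial function so the example stays self-contained.

```typescript
// Hypothetical sketch of an eval gate: run a fixed set of cases against the
// agent and allow deployment only if the pass rate meets a threshold.

interface EvalCase {
  name: string;
  input: string;
  expect: (output: string) => boolean;
}

interface EvalReport {
  passed: number;
  total: number;
  passRate: number;
  failures: string[];
}

function runEvals(agent: (input: string) => string, cases: EvalCase[]): EvalReport {
  const failures: string[] = [];
  for (const c of cases) {
    if (!c.expect(agent(c.input))) failures.push(c.name);
  }
  const total = cases.length;
  const passed = total - failures.length;
  // An empty suite proves nothing, so it counts as a pass rate of 0.
  return { passed, total, passRate: total === 0 ? 0 : passed / total, failures };
}

function passesGate(report: EvalReport, minPassRate: number): boolean {
  return report.passRate >= minPassRate;
}

// Stand-in agent: uppercases its input.
const standInAgent = (input: string): string => input.toUpperCase();

const report = runEvals(standInAgent, [
  { name: "uppercases", input: "hello", expect: (out) => out === "HELLO" },
  { name: "keeps length", input: "abc", expect: (out) => out.length === 3 },
  { name: "adds punctuation", input: "hi", expect: (out) => out.endsWith("!") },
]);

const deployAllowed = passesGate(report, 0.9);
```

In a CI job, a script built on this pattern would exit with a nonzero status when the gate returns false, which blocks the deployment step that follows.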

6. Vercel's Internal Practice

eve is not theoretical — Vercel itself has been running over a hundred Agents in production:

| Agent | Purpose | Data |
| --- | --- | --- |
| d0 | Data analysis | 30K+ questions processed monthly, permission isolation |
| Lead Agent | Autonomous SDR | $5K annual cost, 32x ROI |
| Athena | Sales dashboard | Built by RevOps in 6 weeks without engineers, pipeline coverage doubled |
| Vertex | Customer support | Resolves 92% of tickets independently |
| draft0 | Content moderation | Automated review pipeline |
| V | Routing Agent | Distributes requests to the right specialized Agent |

After these Agents migrated from various tech stacks to eve, they share the same set of tools, the same conventions, and the same observability. One hundred Agents run in the same way as one Agent.


7. Limitations and Use Cases

Limitations

| Issue | Description |
| --- | --- |
| Beta stage | APIs and behaviors may change, not suitable for core business |
| Vercel lock-in | Sandbox, deployment, and OIDC are all deeply tied to the Vercel platform |
| TypeScript only | No Python SDK, excluding Python ecosystem Agent developers |
| Small ecosystem | 2k stars on GitHub, community still early |
| Local development depends on Node | Unlike LangGraph which can be purely Python ecosystem |

Suitable Use Cases

  • Teams already on Vercel: Zero-friction deployment
  • Products needing multi-channel Agents: Channels are first-class citizens
  • Large numbers of Agents needing unified management: File conventions naturally suit monorepos
  • Teams that value engineering discipline: Git-native, CI/CD, tracing all built-in

Less Suitable Scenarios

  • Pure Python data science teams: TypeScript barrier
  • Rapid prototyping/validation: CrewAI may be faster
  • Deeply customized Agent reasoning flows: LangGraph's graph model is more flexible
  • Don't want cloud platform lock-in: Hermes self-hosting offers more freedom

8. Comparison with Hermes

Hermes, an Agent framework designed for daily use, shares several design ideas with eve:

| Dimension | Hermes | eve |
| --- | --- | --- |
| Files as config | ✅ skills/ directory | ✅ tools/ + skills/ directories |
| Multi-channel | ✅ QQ/Feishu/Local | ✅ Slack/Discord/HTTP |
| Sub-agents | ✅ delegate_task | ✅ subagents/ directory |
| Observability | ✅ trace-review | ✅ OpenTelemetry |
| Deployment | Self-hosted | Vercel platform |
| Model flexibility | Multi-provider | AI Gateway |
| Target users | Personal AI co-pilot | Team/enterprise Agents |

The core difference: Hermes is "my AI assistant," eve is "our Agent product." The former emphasizes personal productivity, the latter emphasizes team collaboration and production reliability.


9. Summary

eve's true innovation is not in the new features it provides, but in how it redefines "what an Agent is":

  • Not a code object, but a file directory
  • Not a workflow graph, but a software product
  • Not an AI experiment, but a production service
  • Not a framework, but an engineering convention

This echoes the history of web development. Before frameworks, every website was hand-written HTML + CGI scripts. Next.js wasn't just "better tools" — it was a convention for "what a website should look like." What eve is doing for the Agent domain may be what Next.js did for the web.

Of course, eve is still in beta with many limitations. But its design direction deserves attention from every Agent developer — as Agents move from experimentation to production, we need not just better APIs, but better engineering paradigms.

