AI Agent Amnesia: Every New Session Starts From Zero
> Claude Code / Codex / Cursor starts as a blank slate every session. It's not the model's fault — you haven't given it a memory system. A three-layer architecture + lifecycle hooks to make agents remember.

I've been using AI coding assistants for almost a year. At first I was amazed. Gradually, I got frustrated.
Not because they can't write code — but because every new session, they're like a new hire on day one:
- Don't know the project structure → grep for half an hour
- Don't know the tech choices → generated code doesn't match existing style
- Don't know what was changed last time → redo work, re-introduce fixed bugs
- Don't know cross-project dependencies → change backend API, frontend doesn't sync, CI breaks
Until I realized: the problem isn't that AI isn't smart enough. It's that I hadn't given it a context engineering system.
Why Can't Agents Remember?
Two 2026 studies provided the answer.
ETH Zurich's ICSE 2026 paper found: Giving an agent a long, generic AGENTS.md actually decreased task success rates by 2-3% while increasing costs by 20%+. Too much "nice to know" info pushed out the "need to use" info — the model couldn't prioritize.
GitHub's analysis of 2,500+ repositories found: Most AGENTS.md failures aren't technical limitations — they're vagueness.
// ❌ Anti-pattern: prose paragraphs
"We value code quality and follow TDD principles.
Please ensure all changes are properly tested."
// ✅ Correct: command-first
## Commands
# Test (fail the run if coverage drops below 80%)
uv run pytest tests/ -v --cov --cov-fail-under=80
# Format
uv run ruff format . && uv run ruff check --fix
The first example gets ignored. "Value code quality" is a human value, not a machine instruction. The second is exactly what an agent needs — a concrete shell command.
Three-Layer Memory Architecture
Based on these findings, I built a three-layer workspace memory architecture — from passive to active, each layer solving a different aspect of agent amnesia.
Layer 1: Static Context Files
AGENTS.md + HANDOVER.md + ADR
→ Read by agent, tells it about the project
Layer 2: Knowledge Graph Engine
CodeGraph / GitNexus (MCP)
→ Lets the agent discover code dependencies on its own
Layer 3: Lifecycle Hooks
SessionStart → PreToolUse → SessionEnd
→ Fires automatically, no "remembering" needed
Layer 1: Static Files
Three files per project:
| File | Purpose | Size Limit |
|---|---|---|
| AGENTS.md | Tech stack + commands + constraints | ≤150 lines |
| HANDOVER.md | Session log + changelog | 80 lines, then auto-archive |
| docs/decisions/ADR-YYYYMMDD | Architecture decision records | One decision per file |
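The 80-line auto-archive rule for HANDOVER.md could look roughly like the sketch below. It is not the template's actual implementation. It assumes session entries start with a `## Session End:` heading and that overflow goes to a separate archive file; the archive filename used in the usage note is hypothetical.

```python
from pathlib import Path

MAX_LINES = 80                    # limit from the table above
ENTRY_PREFIX = "## Session End:"  # assumed heading that opens each session entry


def split_entries(text: str) -> tuple[list[str], list[list[str]]]:
    """Return (header lines, session entries) in file order."""
    header: list[str] = []
    entries: list[list[str]] = []
    for line in text.splitlines():
        if line.startswith(ENTRY_PREFIX):
            entries.append([line])
        elif entries:
            entries[-1].append(line)
        else:
            header.append(line)
    return header, entries


def archive_overflow(handover: Path, archive: Path, max_lines: int = MAX_LINES) -> int:
    """Move the oldest session entries to `archive` until `handover` fits.

    Returns the number of entries moved.
    """
    header, entries = split_entries(handover.read_text(encoding="utf-8"))
    moved: list[list[str]] = []
    # Always keep the newest entry so the next session has something to read.
    while len(entries) > 1 and len(header) + sum(map(len, entries)) > max_lines:
        moved.append(entries.pop(0))
    if moved:
        with archive.open("a", encoding="utf-8") as fh:
            for entry in moved:
                fh.write("\n".join(entry) + "\n")
        kept = header + [line for entry in entries for line in entry]
        handover.write_text("\n".join(kept) + "\n", encoding="utf-8")
    return len(moved)
```

Calling `archive_overflow(Path("HANDOVER.md"), Path("HANDOVER.archive.md"))` at the end of a session keeps the live file short while preserving history in the archive.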
Multi-project workspaces get an index layer:
workspace/
├── AGENTS.md ← Project map: what projects exist, dependencies
└── shared/ ← Cross-project docs
    ├── api-contracts.md
    └── architecture-overview.md
service-a/
├── AGENTS.md ← Tech guide
├── HANDOVER.md ← Session log (cross-session memory)
└── docs/decisions/ ← ADR decisions
The golden rule for AGENTS.md: Commands first, no prose, numbered priority constraints.
// ❌ Bad (agent ignores)
"We prefer async programming patterns with proper error handling."
// ✅ Good (agent acts on)
## Constraints (by priority)
1. All API keys from .env, never hardcode
2. DB migrations must be additive only
3. Test coverage ≥ 80%
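The golden rule can be checked mechanically. The following is a minimal sketch of such a check, not part of any existing tool. The function names and the exact rules are my own assumptions, drawn from the 150-line limit and the `## Commands` / `## Constraints` headings shown in the examples.

```python
import re
from typing import Optional

MAX_LINES = 150  # size limit for AGENTS.md used in this article


def section(lines: list[str], title: str) -> Optional[list[str]]:
    """Lines under the first '## <title>' heading, or None if it is absent."""
    body: Optional[list[str]] = None
    for ln in lines:
        if ln.startswith("## "):
            if body is not None:
                break  # reached the next section
            if ln[3:].strip().lower().startswith(title.lower()):
                body = []
        elif body is not None:
            body.append(ln)
    return body


def check_agents_md(text: str) -> list[str]:
    """Return human-readable problems; an empty list means the file passes."""
    problems: list[str] = []
    lines = text.splitlines()
    if len(lines) > MAX_LINES:
        problems.append(f"too long: {len(lines)} lines (limit {MAX_LINES})")
    if section(lines, "Commands") is None:
        problems.append("missing '## Commands' section")
    constraints = section(lines, "Constraints")
    if constraints is None:
        problems.append("missing '## Constraints' section")
    elif not any(re.match(r"\d+\.\s", ln) for ln in constraints):
        problems.append("constraints are not a numbered list")
    return problems
```

A prose-only file such as `"We value code quality."` fails on both missing sections, which matches the anti-pattern above.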
HANDOVER.md is the agent's cross-session memory:
# HANDOVER
## Current Goal
Implement user registration API
## Changelog
| Date | Type | Scope | Description |
|------|------|-------|-------------|
| 2026-06-25 | Added | auth | Email verification |
## Completed
- [x] 2026-06-25 User registration API
## In Progress
- [ ] 2026-06-26 OAuth login — token refresh pending
## Key Decisions
| Date | Decision | Reason |
|------|----------|--------|
| 2026-06-25 | FastAPI over Flask | Native async + auto OpenAPI |
Layer 2: Knowledge Graph
AGENTS.md covers "project background," but for "this function change affects 47 callers," static files aren't enough.
CodeGraph (47.4k ★, MIT) and GitNexus (42k ★) are two breakout open-source projects from 2026. They parse your entire codebase with tree-sitter, pre-index imports, calls, and class hierarchies into a local database, and expose it to agents via MCP.
Benchmarks:
- CodeGraph: 58-70% fewer agent tool calls
- GitNexus: 88% fewer tool calls in a 17-agent production audit
Agent receives task: modify handleLogin function
→ queries CodeGraph: "who calls handleLogin?"
→ finds 3 routes + 1 middleware depend on it
→ plans modification order → changes + runs tests
→ passes first time, nothing broken
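To make the "who calls this function?" query concrete, here is a toy reverse-call index built with Python's standard `ast` module. It is only an illustration of the idea. It is not how CodeGraph or GitNexus work internally (they use tree-sitter and a local database), and the file names and the `handle_login` function are made up for the example.

```python
import ast

# Hypothetical three-file project, inlined so the example is self-contained.
SOURCES = {
    "auth.py": "def handle_login(user):\n    return user\n",
    "routes.py": (
        "from auth import handle_login\n"
        "def login_route(req):\n    return handle_login(req)\n"
        "def refresh_route(req):\n    return handle_login(req)\n"
    ),
    "middleware.py": (
        "import auth\n"
        "def require_user(req):\n    return auth.handle_login(req)\n"
    ),
}


def build_caller_index(sources: dict[str, str]) -> dict[str, set[str]]:
    """Map each called name to the set of 'file:function' locations calling it."""
    index: dict[str, set[str]] = {}
    for filename, code in sources.items():
        tree = ast.parse(code, filename=filename)
        for func in ast.walk(tree):
            if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
                continue
            # Calls inside nested functions are also attributed to the outer one.
            for node in ast.walk(func):
                if isinstance(node, ast.Call):
                    callee = node.func
                    # Plain call f() has .id; attribute call obj.f() has .attr.
                    name = getattr(callee, "id", None) or getattr(callee, "attr", None)
                    if name:
                        index.setdefault(name, set()).add(f"{filename}:{func.name}")
    return index


index = build_caller_index(SOURCES)
print(sorted(index["handle_login"]))
# ['middleware.py:require_user', 'routes.py:login_route', 'routes.py:refresh_route']
```

Matching by bare name is crude (two unrelated functions with the same name collide), which is part of why a pre-built graph with resolved imports is worth having.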
Layer 3: Lifecycle Hooks (Most Critical)
The first two layers tell the agent "what to do" — and the agent might or might not comply. Hooks are deterministic: they always execute.
Claude Code exposes a number of hook events, including SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, PreCompact, and SessionEnd. The memory loop needs only three of them:
SessionStart hook — Runs automatically when a session boots.
- Reads HANDOVER.md and displays where we left off
- Checks environment: AGENTS.md exists? CLAUDE.md symlink intact?
- Warns if HANDOVER.md hasn't been updated in 14+ days
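A SessionStart script covering those three checks might look like the sketch below. It assumes the hook runs in the project root and that whatever the script prints is surfaced to the agent at session start; check the current Claude Code hook documentation for how output is handled. The function name and message wording are my own.

```python
import time
from pathlib import Path
from typing import Optional

STALE_AFTER_DAYS = 14  # staleness threshold from the list above


def session_start_report(project: Path, now: Optional[float] = None) -> str:
    """Build the text a SessionStart hook would print for the agent."""
    now = time.time() if now is None else now
    out: list[str] = []

    if not (project / "AGENTS.md").is_file():
        out.append("WARNING: AGENTS.md is missing")
    claude_md = project / "CLAUDE.md"
    # A dangling symlink reports is_symlink() True but exists() False.
    if claude_md.is_symlink() and not claude_md.exists():
        out.append("WARNING: CLAUDE.md symlink is broken")

    handover = project / "HANDOVER.md"
    if not handover.is_file():
        out.append("WARNING: HANDOVER.md is missing; starting without session memory")
        return "\n".join(out)

    age_days = (now - handover.stat().st_mtime) / 86400
    if age_days >= STALE_AFTER_DAYS:
        out.append(f"WARNING: HANDOVER.md last updated {age_days:.0f} days ago")
    out.append("Where we left off:")
    out.append(handover.read_text(encoding="utf-8"))
    return "\n".join(out)
```

Wired up as a hook, the script body would simply be `print(session_start_report(Path.cwd()))`.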
PreToolUse hook — Fires before file writes.
- Detects multi-file edits → reminds to query CodeGraph for blast radius
- Detects .env file access → prevents secret leaks
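Both checks fit in one small decision function. This sketch rests on my understanding of the hook contract, which you should verify against the current Claude Code docs: the event arrives as JSON on stdin with `tool_name` and `tool_input.file_path` fields, and exit code 2 blocks the tool call with stderr passed back to the agent. The tool names in `WRITE_TOOLS` and the "remind at the second distinct file" rule are assumptions for illustration.

```python
import json
import sys
from pathlib import PurePath

WRITE_TOOLS = {"Write", "Edit", "MultiEdit"}  # assumed names of file-writing tools


def decide(event: dict, edited: set[str]) -> tuple[int, str]:
    """Return (exit_code, message) for one PreToolUse event.

    `edited` holds the files already written this session and is updated in place.
    """
    tool = event.get("tool_name", "")
    path = event.get("tool_input", {}).get("file_path", "")
    name = PurePath(path).name

    # Block .env and variants like .env.local, but allow .env.example templates.
    is_env = name == ".env" or (name.startswith(".env.") and not name.endswith(".example"))
    if is_env:
        return 2, f"Blocked: {tool} on {path} (secrets file)"

    if tool in WRITE_TOOLS and path:
        edited.add(path)
        if len(edited) == 2:  # remind once, when the change first spans two files
            return 0, "Reminder: query CodeGraph for the blast radius of this change"
    return 0, ""


def main() -> int:
    """Hook entry point: read the event from stdin, report on stderr."""
    event = json.load(sys.stdin)
    # A real hook would persist `edited` between calls (e.g. in a temp file per session).
    code, message = decide(event, edited=set())
    if message:
        print(message, file=sys.stderr)
    return code
```

To use it as a hook script, end the file with `sys.exit(main())`. Keeping the logic in `decide` makes it testable without a live session.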
SessionEnd hook — Fires when the session ends.
- Extracts change summary from git diff
- Appends changelog to HANDOVER.md
- Records branch name, file list, change stats
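A sketch of that SessionEnd step, assuming the hook runs inside a git repository that has at least one commit. The git subcommands are standard; the helper names and the entry layout are my own, modeled on the sample entry in this article.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run a git command in `repo` and return its trimmed stdout."""
    result = subprocess.run(
        ["git", "-C", str(repo), *args],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def format_entry(when: datetime, branch: str, status: str, stat: str) -> str:
    """Render one changelog entry from already-collected git output."""
    files = [ln for ln in status.splitlines() if ln.strip()]
    lines = [
        f"## Session End: {when:%Y-%m-%d %H:%M}",
        f"- Branch: {branch}",
        f"- Uncommitted files: {len(files)}",
        "- Changes:",
        *stat.splitlines(),
    ]
    return "\n".join(lines) + "\n"


def append_session_end(repo: Path) -> str:
    """Collect branch, status, and diff stats, then append them to HANDOVER.md."""
    entry = format_entry(
        datetime.now(),
        git(repo, "rev-parse", "--abbrev-ref", "HEAD"),
        git(repo, "status", "--porcelain"),
        git(repo, "diff", "--stat", "HEAD"),
    )
    with (repo / "HANDOVER.md").open("a", encoding="utf-8") as fh:
        fh.write("\n" + entry)
    return entry
```

Separating `format_entry` from the git calls keeps the formatting testable without a repository.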
This means: the agent doesn't need to "remember" to update the handover — the SessionEnd hook does it automatically.
# Auto-appended by SessionEnd hook
## Session End: 2026-06-26 15:30
- Branch: feat/user-auth
- Uncommitted files: 5
- Changes:
src/api/auth.py | 45 +++++++++++++++++++
tests/test_auth.py | 78 +++++++++++++++++++++++++++++++++++++
Anti-rot Measures
AGENTS.md files rot. The codebase evolves — directories get renamed, scripts change, dependencies get swapped — but AGENTS.md stays the same.
agents-lint is a lightweight tool that detects AGENTS.md rot:
- Validates every referenced path still exists (directory renamed? file moved?)
- Checks npm scripts are still valid (`npm run test` but package.json removed it?)
- Detects outdated framework patterns (AGENTS.md says Angular `@NgModule` but project is on standalone?)
- Cross-file consistency check (AGENTS.md says yarn, CLAUDE.md says npm: conflict)
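The first of these checks, path validation, is simple enough to sketch. This is not agents-lint's actual code. It is a minimal illustration that assumes paths in AGENTS.md are written in backticks and contain at least one slash.

```python
import re
from pathlib import Path

# Backtick-quoted tokens made only of path-like characters, e.g. `src/api/`.
TOKEN_RE = re.compile(r"`([\w./\-]+)`")


def missing_paths(agents_md: str, root: Path) -> list[str]:
    """Return referenced paths that no longer exist under `root`."""
    # Require a slash so bare words like `pytest` are not treated as paths.
    candidates = [tok for tok in TOKEN_RE.findall(agents_md) if "/" in tok]
    unique = list(dict.fromkeys(candidates))  # de-duplicate, keep first-seen order
    return [p for p in unique if not (root / p).exists()]
```

Run against the repository root in CI, a non-empty result is the signal that a directory was renamed or a script moved without AGENTS.md being updated.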
We added a weekly CI check, because code changes but AGENTS.md doesn't — without automated detection, it silently becomes useless in two months.
The Combined Effect
Initializing a workspace takes one command:
$ pnpm create agent-workspace ./project api-gateway user-service --hooks --ci
Your daily workflow becomes:
| When | What happens |
|---|---|
| New session starts | Auto-loads HANDOVER.md, shows where we left off |
| Before writing code | PreToolUse checks blast radius |
| After writing code | PostToolUse logs changes |
| Session ends | SessionEnd auto-writes changelog to HANDOVER.md |
| git commit | pre-commit hook validates AGENTS.md |
| Every Monday | GitHub Actions runs agents-lint for stale detection |
The agent is no longer "a blank slate every time" — it's working with yesterday's memory.
Get Started
The full reference architecture is open-source:
- GitHub template: github.com/lennney/agent-workspace-refarch
- npm package: npx create-agent-workspace@latest
Includes:
- Workspace + per-project AGENTS.md templates
- HANDOVER.md session log (80-line auto-archive)
- ADR template (date-based IDs, no multi-project conflicts)
- 5 Claude Code lifecycle hook scripts
- agents-lint integration + GitHub Actions CI
- validate.sh + pre-commit hook
- CodeGraph / GitNexus / Repomix integration guides
AI agent capability is no longer the bottleneck. The bottleneck is context engineering.
Instead of waiting for bigger models, manage the context you already have.