Agentic Engineering¶
Category: concept | Last updated: 2026-04-03 | Status: draft
Summary¶
Agentic engineering is the practice of professional software development using AI coding agents — distinguished from "vibe coding" (building without reading or understanding the code) by its emphasis on quality, verifiability, and professional engineering standards. The term was coined by entities/simon-willison as a more precise alternative to "vibe coding" for engineers who use agents to produce production-ready software they have reviewed and validated. Key patterns include red/green TDD, starting from thin templates, maintaining a hoard of reusable research and tools, and keeping code costs low to prototype aggressively.
Details¶
Vibe Coding vs. Agentic Engineering¶
Vibe coding (Andrej Karpathy's original definition): you don't look at the code at all; you go on vibes — describe what you want, play with the result, keep iterating without reading the implementation. Originally framed for fun and prototyping. Now colloquially applied to all AI-assisted coding, which dilutes the term's utility.
==Agentic engineering: professional use of coding agents to produce production-quality software, with the engineer reviewing output, maintaining tests, managing agent direction, and applying decades of engineering judgment. Using agents well is mentally demanding — not trivially easy.==
The distinction matters because:
- Vibe coding is irresponsible beyond personal projects, where bugs can harm other people (security flaws, scraping, etc.)
- Agentic engineering is the right frame for deploying software to real users
Product-oriented vibe coding (Priyanka Vergadia's "Vibe Code Framework"): a middle path — prompting AI as a Product Manager rather than a developer. The AI is treated as the implementer; the human is the Product Owner who makes all decisions. Structured as 5 phases: (1) Discovery — AI asks clarifying questions, separates must-haves from nice-to-haves; (2) Planning — blueprint and technical approach in plain language; (3) Building — staged construction with options presented when problems arise; (4) Polish — edge cases, device compatibility, finished feel; (5) Handoff — deployment and documentation so you aren't locked into one chat session. The golden rule: no prototypes — working product only.

Key Patterns¶
Red/Green TDD¶
The single most important agentic engineering practice: always have the agent run the tests.
- Without test execution, you're back to "copy-paste from ChatGPT and hope it works"
- ==Tell agents to write tests first, run them (watch them fail = "red"), implement the code, run them again (watch them pass = "green")==
- ==The phrase `red/green TDD` is programming jargon that agents understand; it replaces a paragraph of instructions==
- Tests accumulate in the repo, giving future agents a safety net against regressions
- Agent-written tests mean even boring/verbose test suites are cheap to maintain; Simon now accepts 100+ tests on small libraries
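A minimal sketch of one red/green cycle, using Python's standard `unittest` module. The `slugify` function and its tests are hypothetical and chosen only for illustration; the source does not prescribe a language or test framework.

```python
import re
import unittest


# Hypothetical function under test; name and behaviour are illustrative only.
def slugify_stub(text):
    """Step 1: the code does not exist yet, so the tests must fail."""
    raise NotImplementedError


def slugify_impl(text):
    """Step 3: the implementation, written after the tests were seen failing."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words)


def make_suite(slugify):
    """Build the same test suite against whichever implementation is passed in."""

    class SlugifyTests(unittest.TestCase):
        def test_lowercases_and_joins_words(self):
            self.assertEqual(slugify("Hello World"), "hello-world")

        def test_drops_punctuation(self):
            self.assertEqual(slugify("Red/Green TDD!"), "red-green-tdd")

    return unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTests)


def run(slugify):
    """Run the suite quietly and report whether every test passed."""
    result = unittest.TestResult()
    make_suite(slugify).run(result)
    return result.wasSuccessful()


red = run(slugify_stub)    # step 2: tests run before the code exists
green = run(slugify_impl)  # step 4: same tests, run again after implementing
print("red phase passed:", red)
print("green phase passed:", green)
```

The point of the red phase is to prove that the tests can actually fail; a suite that passes before any code exists is not checking anything.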
Starting with a Thin Template¶
Rather than writing long CLAUDE.md instructions, start every project from a thin code skeleton:
- A single passing test (e.g., `assert 1 + 1 == 2`) in the preferred structure
- A few lines of boilerplate formatted how you like
- Agents observe this pattern and reproduce it throughout the project
This is more effective than verbose instructions because agents are very good at pattern-matching to existing code style.
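As an illustration, a thin template might contain little more than the file below. The path `tests/test_smoke.py`, the function name, and the pytest-style layout are assumptions; the source only specifies a single passing test such as `assert 1 + 1 == 2`.

```python
# tests/test_smoke.py (hypothetical path; use whatever layout you prefer)
"""Placeholder test showing where tests live and how they are written."""


def test_arithmetic_sanity():
    # Trivially true: its only job is to establish test structure and style.
    assert 1 + 1 == 2
```

The content of the test is irrelevant; what the agent picks up is the directory, the naming scheme, and the formatting.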
Hoarding Techniques and Research¶
Build a personal backlog of solved problems, prototypes, and research outputs — the more you accumulate, the more you can combine to solve new problems.
entities/simon-willison's implementation:
- simonw/tools — 193+ small HTML/JS tools; each captures a thing that is possible to do
- ==simonw/research — 75+ agent-driven research projects in public + ~50 private; each a markdown report where an agent wrote and ran code, not just summarized web content==
Key insight: agent-generated research where code was actually written and run is far more valuable than "deep research" summaries, because it's verified and actionable.
==Usage: tell the agent to check out a copy of simonw/research (for example with `git clone`) and look at relevant examples before solving a new problem in the same domain.==
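The same lookup can be done by hand before writing a prompt. The sketch below searches a local directory of markdown reports for ones that mention every keyword. The helper `find_reports` and the demo directory names are hypothetical; the demo uses a throwaway directory standing in for a local copy of a research repository, not the real contents of any repo.

```python
import tempfile
from pathlib import Path


def find_reports(root, keywords):
    """Return markdown files under root that mention every keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    matches = []
    for path in sorted(Path(root).rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        if all(k in text for k in wanted):
            matches.append(path)
    return matches


# Demo on made-up reports in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sqlite-fts").mkdir()
    (root / "sqlite-fts" / "README.md").write_text(
        "Benchmarking SQLite full-text search", encoding="utf-8"
    )
    (root / "wasm-python").mkdir()
    (root / "wasm-python" / "README.md").write_text(
        "Running Python in WebAssembly", encoding="utf-8"
    )
    hits = [p.parent.name for p in find_reports(root, ["sqlite", "search"])]

print(hits)
```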
Code is Cheap — Prototype in 3 Directions¶
Because coding is cheap, the right response is to generate multiple variants before committing to a direction:
- ==Prototype any feature 3 ways; the low cost of generation means you can compare before committing==
- This was Simon's longtime superpower (fast prototyping), now democratized — anyone can do it
- ==The new scarce resource is knowing when to prototype and how to evaluate the result==
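One way to make "compare before committing" concrete is to score every prototype against the same checks. The sketch below is illustrative only: the feature (de-duplicating a list while keeping first occurrences in order), the three function names, and `CASES` are all hypothetical.

```python
# Three hypothetical prototypes of the same feature.
def dedupe_loop(items):
    """Explicit loop with a set of already-seen items."""
    seen, out = set(), []
    for item in items:
        if item not in seen:
            seen.add(item)
            out.append(item)
    return out


def dedupe_dict(items):
    """Relies on dicts preserving insertion order (guaranteed since Python 3.7)."""
    return list(dict.fromkeys(items))


def dedupe_set(items):
    """Shortest to write, but a plain set does not preserve order."""
    return list(set(items))


# Shared checks: (input, expected output) pairs every prototype is judged on.
CASES = [
    ([], []),
    (["b", "a", "b", "c", "a"], ["b", "a", "c"]),
    ([3, 1, 3, 2, 1], [3, 1, 2]),
]


def evaluate(candidate):
    """Return how many of the shared cases the candidate gets right."""
    return sum(candidate(list(given)) == expected for given, expected in CASES)


scores = {fn.__name__: evaluate(fn) for fn in (dedupe_loop, dedupe_dict, dedupe_set)}
print(scores)
```

Generating the three variants is the cheap part. Writing `CASES`, and deciding which failures matter, is where the engineer's judgment goes.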
Managing Agents¶
- Run agents in "YOLO" / `dangerously-skip-permissions` mode to avoid constant approval interruptions; this lets you run 4 in parallel and check in periodically rather than babysitting
- Claude Code for web (Anthropic-hosted) is safer for YOLO mode: the worst case is wasting Anthropic's compute, not deleting files on your laptop
- Review agent output via GitHub PRs the same way you'd review another engineer's code
- Use your phone for prompting; review on desktop later if needed
Who Benefits Most¶
Amplification is strongest for experienced engineers who can:
- Write one-sentence prompts that precisely scope a problem
- Recognize immediately when an agent is going in the wrong direction
- Evaluate output quality without reading every line
- Know what isn't possible yet for current models
Mid-career engineers are at highest risk — not enough experience to effectively direct agents, but past the onboarding-speed boost that helps beginners.
The Future of Engineering Value¶
The bottleneck has shifted from writing code to:
- Deciding what to build
- Spec-writing and communication
- Evaluating correctness and quality
- Security judgment
- Code review at a higher level of abstraction
- Managing agent state and context effectively
==Code is now so cheap that the cost of building something is negligible; the value lies in knowing what to build and whether what was built is good.==
Key Claims & Data Points¶
- 95% of Simon's code is now AI-written; he finds managing 4 parallel agents mentally exhausting by 11am — [source: wc8FBhQtdsA]
- `red/green TDD` as a two-word prompt replaces a paragraph of instructions and materially improves agent output — [source: wc8FBhQtdsA]
- Starting from a thin template (single test) is more effective than verbose CLAUDE.md instructions for setting code style — [source: wc8FBhQtdsA]
- Mid-career engineers identified as most at risk; beginners and senior engineers each have distinct advantages — [source: wc8FBhQtdsA] (citing Thoughtworks VP survey)
- Prototyping is "essentially free" — build 3 variants of any feature before committing — [source: wc8FBhQtdsA]
Open Questions¶
- What is the best way for mid-career engineers to develop agent-direction skill quickly? (raised by: concepts/agentic-engineering, 2026-04-03)
- How do agentic engineering patterns transfer to non-code knowledge work (legal, medical, editorial)? (raised by: concepts/agentic-engineering, 2026-04-03)
- What does "code review" look like at scale when the reviewer cannot read every line? What signals substitute for line-by-line review? (raised by: concepts/agentic-engineering, 2026-04-03)
Related Articles¶
- concepts/agentic-workflows
- concepts/ai-inflection-point
- concepts/claude-code
- concepts/prompt-injection
- entities/simon-willison
Sources¶
- An AI state of the union: We've passed the inflection point & dark factories are coming — Lenny's Podcast interview with Simon Willison, early 2026