Agentic Engineering¶

Category: concept | Last updated: 2026-04-03 | Status: draft

Summary¶

Agentic engineering is the practice of professional software development using AI coding agents — distinguished from "vibe coding" (building without reading or understanding the code) by its emphasis on quality, verifiability, and professional engineering standards. The term was coined by entities/simon-willison as a more precise alternative to "vibe coding" for engineers who use agents to produce production-ready software they have reviewed and validated. Key patterns include red/green TDD, starting from thin templates, maintaining a hoard of reusable research and tools, and keeping code costs low to prototype aggressively.

Details¶

Vibe Coding vs. Agentic Engineering¶

Vibe coding (Andrej Karpathy's original definition): you don't look at the code at all; you go on vibes — describe what you want, play with the result, keep iterating without reading the implementation. Originally framed for fun and prototyping. Now colloquially applied to all AI-assisted coding, which dilutes the term's utility.

==Agentic engineering: professional use of coding agents to produce production-quality software, with the engineer reviewing output, maintaining tests, managing agent direction, and applying decades of engineering judgment. Using agents well is mentally demanding — not trivially easy.==

The distinction matters because:

- Vibe coding is irresponsible beyond personal projects: bugs can harm other people, and unreviewed code can carry security holes or do things like abusive scraping
- Agentic engineering is the right frame for deploying software to real users

Product-oriented vibe coding (Priyanka Vergadia's "Vibe Code Framework"): a middle path in which the human prompts the AI from a Product Manager's perspective rather than a developer's. The AI is treated as the implementer; the human is the Product Owner who makes all decisions. It is structured as 5 phases:

1. Discovery: the AI asks clarifying questions and separates must-haves from nice-to-haves
2. Planning: blueprint and technical approach in plain language
3. Building: staged construction, with options presented when problems arise
4. Polish: edge cases, device compatibility, finished feel
5. Handoff: deployment and documentation, so you aren't locked into one chat session

The golden rule: no prototypes, working product only.

[Figure: "Vibe Code A Startup" visual summary]

Key Patterns¶

Red/Green TDD¶

The single most important agentic engineering practice: always have the agent run the tests.

- Without test execution, you're back to "copy-paste from ChatGPT and hope it works"
- ==Tell agents to write tests first, run them (watch them fail = "red"), implement the code, then run them again (watch them pass = "green")==
- ==The phrase "red/green TDD" is programming jargon that agents understand, so it replaces a paragraph of instructions==
- Tests accumulate in the repo, giving future agents a safety net against regressions
- Because agents write the tests, even boring, verbose test suites are cheap to maintain; Simon now accepts 100+ tests on small libraries
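
As a rough illustration of the red/green cycle described above, the sketch below runs one test case twice: first against a stub (the "red" run), then against a real implementation (the "green" run). The `slugify` function, its stub, and the `run_tests` helper are hypothetical names chosen for this example; they do not come from the source interview.

```python
import io
import re
import unittest


def slugify_stub(text: str) -> str:
    """Red phase: the function exists but has no behaviour yet."""
    raise NotImplementedError("implemented only after the tests have been seen to fail")


def slugify(text: str) -> str:
    """Green phase: lowercase the text and join alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", text.lower()))


def run_tests(func) -> bool:
    """Run the same test case against `func` and report whether every test passed."""

    class SlugifyTests(unittest.TestCase):
        def test_spaces_become_hyphens(self):
            self.assertEqual(func("Hello World"), "hello-world")

        def test_punctuation_is_dropped(self):
            self.assertEqual(func("Red/Green TDD!"), "red-green-tdd")

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTests)
    # Send the runner's report to a buffer so only the summary lines below are printed.
    result = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    print("red run passed:", run_tests(slugify_stub))  # False: tests fail before the code exists
    print("green run passed:", run_tests(slugify))     # True: same tests pass afterwards
```

In a real project the agent would run the repository's own test command rather than a helper like this; the point is only that the failing run is observed before the implementation is written.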

Starting with a Thin Template¶

Rather than writing long CLAUDE.md instructions, start every project from a thin code skeleton:

- A single passing test (e.g., assert 1 + 1 == 2) in the preferred structure
- A few lines of boilerplate formatted how you like
- Agents observe this pattern and reproduce it throughout the project

This is more effective than verbose instructions because agents are very good at pattern-matching to existing code style.
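
A minimal sketch of what the single passing test in such a template might look like, assuming a pytest-style layout. The file name and function name are hypothetical; the source only specifies a trivial assertion such as 1 + 1 == 2.

```python
# tests/test_placeholder.py  (hypothetical file name and location)


def test_placeholder() -> None:
    """Trivial passing test; its only job is to show where tests live and how they are written."""
    assert 1 + 1 == 2
```

The test checks nothing useful by design. What the agent picks up from it is the directory layout, the naming scheme, and the formatting conventions.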

Hoarding Techniques and Research¶

Build a personal backlog of solved problems, prototypes, and research outputs — the more you accumulate, the more you can combine to solve new problems.

entities/simon-willison's implementation:

- simonw/tools: 193+ small HTML/JS tools; each captures a thing that is possible to do
- ==simonw/research: 75+ agent-driven research projects in public + ~50 private; each a markdown report where an agent wrote and ran code, not just summarized web content==

Key insight: agent-generated research where code was actually written and run is far more valuable than "deep research" summaries, because it's verified and actionable.

==Usage: tell the agent to clone simonw/research (git clone) and look at relevant examples before solving a new problem in the same domain.==
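
One way to picture "look at relevant examples" is a keyword search over a local clone of a hoard of markdown reports. The sketch below is an assumption about how that lookup could work, not a description of how Simon's repositories are organised or how an agent searches them; `find_relevant_reports` and its parameters are hypothetical.

```python
from pathlib import Path


def find_relevant_reports(repo_dir: str, keywords: list[str], limit: int = 5) -> list[tuple[int, str]]:
    """Rank markdown files under `repo_dir` by how often they mention any keyword.

    Returns (score, relative_path) pairs, highest score first. Files with no
    matches are left out.
    """
    scored = []
    for path in Path(repo_dir).rglob("*.md"):
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        score = sum(text.count(keyword.lower()) for keyword in keywords)
        if score:
            scored.append((score, str(path.relative_to(repo_dir))))
    # Highest score first; ties are broken alphabetically so the order is stable.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:limit]
```

In practice an agent would more likely use grep or its own file-search tools; the function only makes the idea concrete.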

Code is Cheap — Prototype in 3 Directions¶

Because coding is cheap, the right response is to generate multiple variants before committing to a direction:

- ==Prototype any feature 3 ways; the low cost of generation means you can compare before committing==
- Fast prototyping was Simon's longtime superpower; it is now democratized, so anyone can do it
- ==The new scarce resource is knowing when to prototype and how to evaluate the result==
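
A small sketch of comparing three prototypes against the same checks, assuming the "feature" is something as simple as order-preserving de-duplication. The three variants, the `CASES` list, and the `evaluate` helper are all hypothetical; real prototypes would usually be compared on far more than correctness and speed.

```python
import time
from typing import Callable


def dedupe_loop(items: list) -> list:
    """Variant 1: explicit loop with a set of already-seen values."""
    seen, out = set(), []
    for item in items:
        if item not in seen:
            seen.add(item)
            out.append(item)
    return out


def dedupe_dict(items: list) -> list:
    """Variant 2: rely on dicts preserving insertion order."""
    return list(dict.fromkeys(items))


def dedupe_scan(items: list) -> list:
    """Variant 3: keep an item only if it has not appeared earlier (quadratic)."""
    return [x for i, x in enumerate(items) if x not in items[:i]]


# Shared (input, expected output) checks that every variant must satisfy.
CASES = [([1, 2, 1, 3, 2], [1, 2, 3]), ([], []), (["a", "a"], ["a"])]


def evaluate(variants: dict[str, Callable[[list], list]]) -> dict[str, dict]:
    """Run every variant against the shared cases and record pass/fail and elapsed time."""
    report = {}
    for name, func in variants.items():
        start = time.perf_counter()
        passed = all(func(list(inp)) == expected for inp, expected in CASES)
        report[name] = {"passed": passed, "seconds": time.perf_counter() - start}
    return report


if __name__ == "__main__":
    variants = {"loop": dedupe_loop, "dict": dedupe_dict, "scan": dedupe_scan}
    for name, row in evaluate(variants).items():
        print(name, row)
```

The evaluation harness is the part that reflects the scarce skill named above: deciding what the variants are judged on matters more than producing them.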

Managing Agents¶

- Run agents in "YOLO" (--dangerously-skip-permissions) mode to avoid constant approval interruptions; this lets you run 4 in parallel and check in periodically rather than babysitting each one
- Claude Code for web (Anthropic-hosted) is safer for YOLO mode: the worst case is wasting Anthropic's compute, not an agent deleting files on your laptop
- Review agent output via GitHub PRs the same way you'd review another engineer's code
- Use your phone for prompting; review on desktop later if needed

Who Benefits Most¶

Amplification is strongest for experienced engineers who can:

- Write one-sentence prompts that precisely scope a problem
- Recognize immediately when an agent is going in the wrong direction
- Evaluate output quality without reading every line
- Know what isn't possible yet for current models

Mid-career engineers are at highest risk — not enough experience to effectively direct agents, but past the onboarding-speed boost that helps beginners.

The Future of Engineering Value¶

The bottleneck has shifted from writing code to:

- Deciding what to build
- Spec-writing and communication
- Evaluating correctness and quality
- Security judgment
- Code review at a higher level of abstraction
- Managing agent state and context effectively

==Code is now so cheap that the cost of building something is negligible; the value lies in knowing what to build and whether what was built is good.==

Key Claims & Data Points¶

- 95% of Simon's code is now AI-written; managing 4 parallel agents leaves him mentally exhausted by 11am [source: wc8FBhQtdsA]
- "red/green TDD" as a two-word prompt replaces a paragraph of instructions and materially improves agent output [source: wc8FBhQtdsA]
- Starting from a thin template (single test) is more effective than verbose CLAUDE.md instructions for setting code style [source: wc8FBhQtdsA]
- Mid-career engineers are identified as most at risk; beginners and senior engineers each have distinct advantages [source: wc8FBhQtdsA] (citing Thoughtworks VP survey)
- Prototyping is "essentially free": build 3 variants of any feature before committing [source: wc8FBhQtdsA]

Open Questions¶

- What is the best way for mid-career engineers to develop agent-direction skill quickly? (raised by: concepts/agentic-engineering, 2026-04-03)
- How do agentic engineering patterns transfer to non-code knowledge work (legal, medical, editorial)? (raised by: concepts/agentic-engineering, 2026-04-03)
- What does "code review" look like at scale when the reviewer cannot read every line? What signals substitute for line-by-line review? (raised by: concepts/agentic-engineering, 2026-04-03)

Sources¶

An AI state of the union: We've passed the inflection point & dark factories are coming — Lenny's Podcast interview with Simon Willison, early 2026