
Agentic Engineering

Category: concept | Last updated: 2026-04-03 | Status: draft

Summary

Agentic engineering is the practice of professional software development using AI coding agents — distinguished from "vibe coding" (building without reading or understanding the code) by its emphasis on quality, verifiability, and professional engineering standards. The term was coined by entities/simon-willison as a more precise alternative to "vibe coding" for engineers who use agents to produce production-ready software they have reviewed and validated. Key patterns include red/green TDD, starting from thin templates, maintaining a hoard of reusable research and tools, and keeping code costs low to prototype aggressively.

Details

Vibe Coding vs. Agentic Engineering

Vibe coding (Andrej Karpathy's original definition): you don't look at the code at all; you go on vibes — describe what you want, play with the result, keep iterating without reading the implementation. Originally framed for fun and prototyping. Now colloquially applied to all AI-assisted coding, which dilutes the term's utility.

==Agentic engineering: professional use of coding agents to produce production-quality software, with the engineer reviewing output, maintaining tests, managing agent direction, and applying decades of engineering judgment. Using agents well is mentally demanding — not trivially easy.==

The distinction matters because:

  • Vibe coding is irresponsible beyond personal projects (bugs can harm others: security flaws, scraping, etc.)
  • Agentic engineering is the right frame for deploying software to real users

Product-oriented vibe coding (Priyanka Vergadia's "Vibe Code Framework"): a middle path — prompting AI as a Product Manager rather than a developer. The AI is treated as the implementer; the human is the Product Owner who makes all decisions. Structured as 5 phases: (1) Discovery — AI asks clarifying questions, separates must-haves from nice-to-haves; (2) Planning — blueprint and technical approach in plain language; (3) Building — staged construction with options presented when problems arise; (4) Polish — edge cases, device compatibility, finished feel; (5) Handoff — deployment and documentation so you aren't locked into one chat session. The golden rule: no prototypes — working product only.

[Figure: Vibe Code A Startup visual summary]

Key Patterns

Red/Green TDD

The single most important agentic engineering practice: always have the agent run the tests.

  • Without test execution, you're back to "copy-paste from ChatGPT and hope it works"
  • ==Tell agents to write tests first, run them (watch them fail = "red"), implement the code, run them again (watch them pass = "green")==
  • ==The phrase red/green TDD is programming jargon agents understand — replaces a paragraph of instructions==
  • Tests accumulate in the repo, giving future agents a safety net against regressions
  • Agent-written tests mean even boring/verbose test suites are cheap to maintain; Simon now accepts 100+ tests on small libraries
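
The red/green cycle above can be sketched in Python. This is a minimal illustration, not anything taken from the source: the feature (`slugify`) and the names `check_slugify` and `slugify_stub` are hypothetical. The same checks run first against a stub (red) and then against the implementation (green).

```python
import re


def check_slugify(slugify_impl):
    """Run the test cases against an implementation; return True if all pass."""
    cases = {
        "Hello World": "hello-world",
        "  Agentic   Engineering ": "agentic-engineering",
        "Red/Green TDD!": "red-green-tdd",
    }
    try:
        return all(slugify_impl(text) == expected for text, expected in cases.items())
    except NotImplementedError:
        return False


def slugify_stub(text):
    """Red phase: the tests exist, the implementation does not (hypothetical stub)."""
    raise NotImplementedError


def slugify(text):
    """Green phase: lowercase, then collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


if __name__ == "__main__":
    print("red:", check_slugify(slugify_stub))
    print("green:", check_slugify(slugify))
```

Seeing the failure first matters: it shows the tests actually exercise the new code rather than passing vacuously.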

Starting with a Thin Template

Rather than writing long CLAUDE.md instructions, start every project from a thin code skeleton:

  • A single passing test (e.g., assert 1 + 1 == 2) in the preferred structure
  • A few lines of boilerplate formatted how you like
  • Agents observe this pattern and reproduce it throughout the project

This is more effective than verbose instructions because agents are very good at pattern-matching to existing code style.
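
As a hedged sketch, the single passing test in such a skeleton might look like the following, assuming a pytest-style layout. The file path and function name are hypothetical; the point is only to show the agent where tests live and how they are named.

```python
# tests/test_placeholder.py  (hypothetical path; adjust to your preferred layout)


def test_placeholder():
    # A deliberately trivial passing test: it demonstrates file location,
    # naming, and assertion style, not any real behavior.
    assert 1 + 1 == 2
```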

Hoarding Techniques and Research

Build a personal backlog of solved problems, prototypes, and research outputs — the more you accumulate, the more you can combine to solve new problems.

entities/simon-willison's implementation:

  • simonw/tools: 193+ small HTML/JS tools; each captures a thing that is possible to do
  • ==simonw/research: 75+ agent-driven research projects in public + ~50 private; each a markdown report where an agent wrote and ran code, not just summarized web content==

Key insight: agent-generated research where code was actually written and run is far more valuable than "deep research" summaries, because it's verified and actionable.

==Usage: tell the agent to clone the simonw/research repository (git clone, not git checkout, which only switches branches in an existing clone) and look at relevant examples before solving a new problem in the same domain.==
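
One way to narrow down "relevant examples" in a local copy of such a repository is a simple keyword scan over its markdown reports. The sketch below is an assumption about how that could be done, not a description of Simon's workflow; the function name `find_relevant_reports` and the directory name in the usage note are hypothetical.

```python
from pathlib import Path


def find_relevant_reports(repo_root, keywords):
    """Return markdown files under repo_root that mention every keyword.

    Matching is a case-insensitive substring check, which is crude but
    enough to shortlist reports for an agent (or a human) to read.
    """
    wanted = [k.lower() for k in keywords]
    matches = []
    for path in sorted(Path(repo_root).rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        if all(k in text for k in wanted):
            matches.append(path)
    return matches
```

For example, `find_relevant_reports("research", ["sqlite", "benchmark"])` would list reports mentioning both terms, assuming the repository was cloned into a directory named `research`.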

Code is Cheap — Prototype in 3 Directions

Because coding is cheap, the right response is to generate multiple variants before committing to a direction:

  • ==Prototype any feature 3 ways; the low cost of generation means you can compare before committing==
  • This was Simon's longtime superpower (fast prototyping), now democratized — anyone can do it
  • ==The new scarce resource is knowing when to prototype and how to evaluate the result==
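
To make "prototype 3 ways, then evaluate" concrete, here is a small sketch under stated assumptions: the feature (order-preserving deduplication) and all function names are hypothetical, chosen only because three variants fit in a few lines. Each variant is checked against the same expected result and timed, so the comparison rests on evidence rather than preference.

```python
import timeit


def dedupe_loop(items):
    """Variant 1: explicit loop with a seen-set."""
    seen, out = set(), []
    for item in items:
        if item not in seen:
            seen.add(item)
            out.append(item)
    return out


def dedupe_dict(items):
    """Variant 2: rely on dict preserving insertion order (Python 3.7+)."""
    return list(dict.fromkeys(items))


def dedupe_comprehension(items):
    """Variant 3: keep an item only if it has not appeared earlier (quadratic)."""
    return [x for i, x in enumerate(items) if x not in items[:i]]


def compare(variants, sample, expected, number=50):
    """Check each variant against the same expectation and time it."""
    results = {}
    for fn in variants:
        correct = fn(sample) == expected
        seconds = timeit.timeit(lambda: fn(sample), number=number)
        results[fn.__name__] = (correct, seconds)
    return results


if __name__ == "__main__":
    sample = [3, 1, 3, 2, 1, 5] * 50
    variants = [dedupe_loop, dedupe_dict, dedupe_comprehension]
    for name, (correct, seconds) in compare(variants, sample, [3, 1, 2, 5]).items():
        print(f"{name}: correct={correct} time={seconds:.4f}s")
```

The shared `compare` harness is the part that reflects the scarce skill named above: generating the variants is cheap, while deciding what counts as correct and good enough is not.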

Managing Agents

  • Run agents in "YOLO" mode (e.g. Claude Code's --dangerously-skip-permissions flag) to avoid constant approval interruptions; this lets you run 4 in parallel and check in periodically rather than babysitting
  • Claude Code for web (Anthropic-hosted) is safer for YOLO mode: the worst case is wasting Anthropic's compute, not deleting your laptop
  • Review agent output via GitHub PRs the same way you'd review another engineer's code
  • Use your phone for prompting; review on desktop later if needed

Who Benefits Most

Amplification is strongest for experienced engineers who can:

  • Write one-sentence prompts that precisely scope a problem
  • Recognize immediately when an agent is going in the wrong direction
  • Evaluate output quality without reading every line
  • Know what isn't possible yet for current models

Mid-career engineers are at highest risk — not enough experience to effectively direct agents, but past the onboarding-speed boost that helps beginners.

The Future of Engineering Value

The bottleneck has shifted from writing code to:

  • Deciding what to build
  • Spec-writing and communication
  • Evaluating correctness and quality
  • Security judgment
  • Code review at a higher level of abstraction
  • Managing agent state and context effectively

==Code is now so cheap that the cost of building something is negligible; the value lies in knowing what to build and whether what was built is good.==

Key Claims & Data Points

  • 95% of Simon's code is now AI-written; he finds managing 4 parallel agents mentally exhausting by 11am — [source: wc8FBhQtdsA]
  • red/green TDD as a two-word prompt replaces a paragraph of instructions and materially improves agent output — [source: wc8FBhQtdsA]
  • Starting from a thin template (single test) is more effective than verbose CLAUDE.md instructions for setting code style — [source: wc8FBhQtdsA]
  • Mid-career engineers identified as most at risk; beginners and senior engineers each have distinct advantages — [source: wc8FBhQtdsA] (citing Thoughtworks VP survey)
  • Prototyping is "essentially free" — build 3 variants of any feature before committing — [source: wc8FBhQtdsA]

Open Questions

  • What is the best way for mid-career engineers to develop agent-direction skill quickly? (raised by: concepts/agentic-engineering, 2026-04-03)
  • How do agentic engineering patterns transfer to non-code knowledge work (legal, medical, editorial)? (raised by: concepts/agentic-engineering, 2026-04-03)
  • What does "code review" look like at scale when the reviewer cannot read every line? What signals substitute for line-by-line review? (raised by: concepts/agentic-engineering, 2026-04-03)

Sources