Glossary

Key terms, jargon, and acronyms. Format: Term — Definition, followed by a pointer to the source article (e.g., See concepts/agent-harness).

Agent Harness — The complete software infrastructure wrapping an LLM that transforms it from a stateless text generator into a capable agent: orchestration loop, tools, memory, context management, state persistence, error handling, and guardrails. Canonical formula (Vivek Trivedy / LangChain): "If you're not the model, you're the harness." See concepts/agent-harness.

AISI (AI Security Institute) — UK government body that evaluates the capabilities and safety of frontier AI models. Known for structured multi-step attack scenario evaluations; published findings that Claude Opus 4.6 completed 15.6/32 enterprise attack steps in March 2026. See concepts/frontier-ai-cyber-capabilities.

AIxCC (AI Cyber Challenge) — DARPA-sponsored challenge for AI-enabled vulnerability discovery and remediation; cited by NCSC as an example of AI reducing attacker windows of opportunity through faster patching. See concepts/frontier-ai-cyber-capabilities.

Dual-Use AI Capabilities — AI capabilities that are equally applicable to offensive and defensive cyber operations. Vulnerability identification, exploit development, and reconnaissance automation all serve both red teams and attackers. AISI's evaluation framework explicitly acknowledges this property. See concepts/frontier-ai-cyber-capabilities and concepts/ai-red-teaming.

NCSC (National Cyber Security Centre) — UK government agency providing cyber security guidance. Published March 2026 blog arguing defenders must assume some attackers already have capable AI tools and must proactively deploy AI for defense. See concepts/frontier-ai-cyber-capabilities.

Shape the Battlefield — The structural defensive advantage in cyber security: defenders can configure and tune their own environment to disadvantage attackers, while attackers must adapt to whatever they find. AI amplifies this advantage at scale. When defenders lose this advantage (weak baselines, low AI adoption), the information gap between attackers and defenders narrows quickly. See concepts/frontier-ai-cyber-capabilities.

Agentic Knowledge Management (AKM) — The practice of using AI agents to actively apply knowledge from a personal knowledge graph to current problems, rather than passively storing it in notes. See concepts/personal-knowledge-management.

AI Skills (Agent Skills) — Executable AI behaviors derived from specific knowledge (e.g., notes from a book), which can be invoked by an agent to apply that knowledge to real problems. Unlike passive notes, skills are active — they show up when relevant. See concepts/personal-knowledge-management.

Atomic Notes — Notes that capture exactly one idea, written in your own words, and designed to link to other notes. The building block of a personal knowledge graph. The primary heuristic for atomicity is connectability: can this idea link to other ideas? See concepts/personal-knowledge-management.

80% Rule — The observation that AI typically delivers ~80% of desired output on the first attempt; the remaining 20% requires human domain expertise and iteration. AI delivers 100% only on simple, well-scoped tasks. See concepts/ai-for-small-business.

Agentic Engineering — Professional software development using coding agents to produce production-quality code, emphasizing engineering judgment, testing, and review — as distinct from vibe coding. Coined by entities/simon-willison. See concepts/agentic-engineering.

Agentic Workflows — Composable LLM system patterns: prompt chaining, routing, parallelization, orchestrator-workers, and evaluator-optimizer. Distinct from true autonomous agents (which dynamically direct their own processes). See concepts/agentic-workflows.

ACI (Agent-Computer Interface) — The set of tool definitions and documentation that an LLM agent uses to interact with systems. Analogous to HCI; deserves equal engineering investment. Coined in Anthropic's "Building Effective Agents" guide. See concepts/agentic-workflows.

AI Red Teaming — The practice of probing the safety and security of generative AI systems by emulating real-world attacks and failure modes. Distinct from safety benchmarking: benchmarks compare models on fixed datasets; red teaming discovers novel harm categories and context-specific risks. See concepts/ai-red-teaming.

ACATS (Automated Customer Account Transfer Service) — A FINRA-regulated system for transferring brokerage accounts between institutions. Exploited by fraudsters who open a new account in the victim's name and initiate a transfer — no credential compromise required. Fidelity offers an account lock; most brokerages do not. See concepts/llm-tier-security.

Mythos — Anthropic's frontier model (preview announced Apr 2026) demonstrating qualitatively higher exploit-finding capability: 595 tier-1/2 crashes and 10 tier-5 (full control-flow hijack) results on patched OSS-Fuzz targets, vs. a single tier-3 crash for Sonnet 4.6/Opus 4.6. The "LLM-tier security" threat model is largely premised on this capability becoming broadly accessible. See concepts/llm-tier-security and concepts/ai-red-teaming.

Supply-Chain Attack — An attack where malicious code is injected into a legitimate software package or dependency, infecting all downstream users. Increasing due to AI-assisted attack automation lowering the cost of compromise. See concepts/llm-tier-security.

FIDO2 / WebAuthn — Authentication standard used by YubiKeys and passkeys. Phishing-resistant because the browser only issues a credential scoped to the exact current domain — relay attacks cannot succeed. See concepts/llm-tier-security.

OpenSnitch — An interactive outbound application firewall for Linux that prompts for approval when an application makes an unexpected network connection. Detects malware attempting to phone home and eliminates silent telemetry as an attack surface. See concepts/llm-tier-security.

XPIA (Cross-Prompt Injection Attack) — A prompt injection attack where malicious instructions are hidden inside documents that an agent retrieves and processes (e.g., via RAG). Exploits the LLM's inability to distinguish data from instructions. See concepts/prompt-injection and concepts/ai-red-teaming.

Crescendo — A multi-turn jailbreak technique: gradually escalate requests across a conversation until the model complies with content it would have refused in turn 1. Available as an automated strategy in PyRIT. See concepts/ai-red-teaming.

PyRIT — Microsoft's open-source Python framework for AI red teaming; includes prompt datasets, converters, automated attack strategies (TAP, PAIR, Crescendo), and multimodal scorers. See concepts/ai-red-teaming.

RAI (Responsible AI) Harms — A category of AI safety impacts related to generation of harmful content (hate speech, violence, self-harm, bias). Subjective, probabilistic, and difficult to measure — contrasted with traditional security vulnerabilities which are reproducible and objective. See concepts/ai-red-teaming.

Prompt Chaining — An agentic workflow pattern where a task is decomposed into sequential LLM calls, each processing the output of the previous. See concepts/agentic-workflows.
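
A minimal sketch of the pattern in Python; call_llm is a stub standing in for whatever model client is actually in use:

```python
def call_llm(prompt: str) -> str:
    # Stub for a real model call; replace with your client of choice.
    return f"[model output for: {prompt[:40]}...]"

def prompt_chain(task: str, steps: list[str]) -> str:
    # Each sequential call consumes the previous call's output.
    output = task
    for instruction in steps:
        output = call_llm(f"{instruction}\n\n{output}")
    return output

print(prompt_chain("Write release notes for v2.0",
                   ["Outline the notes.", "Draft from the outline.", "Tighten the draft."]))
```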

Orchestrator-Workers — An agentic workflow pattern where a central orchestrator LLM dynamically decomposes a task and delegates subtasks to worker LLMs. See concepts/agentic-workflows.

Evaluator-Optimizer — An agentic workflow pattern where one LLM generates a response and a second evaluates and critiques it in a feedback loop. See concepts/agentic-workflows.
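
A sketch of the evaluator-optimizer loop, reusing the call_llm stub from the prompt-chaining example above (the APPROVED convention is invented for illustration):

```python
def evaluator_optimizer(task: str, max_rounds: int = 3) -> str:
    # Generator drafts; evaluator critiques; loop until approval or budget runs out.
    draft = call_llm(f"Complete this task:\n{task}")
    for _ in range(max_rounds):
        critique = call_llm(
            f"Critique this answer to '{task}'. Reply APPROVED if no changes are needed:\n{draft}"
        )
        if "APPROVED" in critique:
            break
        draft = call_llm(
            f"Task: {task}\nPrevious answer:\n{draft}\nCritique:\n{critique}\nRevise accordingly."
        )
    return draft
```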

Bright Data — Commercial web scraping infrastructure providing 150M+ residential IPs, automatic CAPTCHA solving, and geolocation targeting; accessed via MCP in the four-tier progressive scraping system. See guides/progressive-web-scraping.

PAI (Personal AI) — Daniel Miessler's open-source repository of AI skills and workflows for Claude Code; aims to democratize advanced AI capabilities. See guides/progressive-web-scraping.

Ask Claude First — Learning principle: instead of searching YouTube or documentation, ask the AI itself what it can do for your specific situation. The AI asks clarifying questions and tailors suggestions to you. See concepts/ai-for-small-business.

Business Data Pipeline — The pattern of aggregating data from multiple business tools (CRM, email, project management, meetings) into a central database and granting AI read access, enabling cross-system business intelligence from a single prompt. See concepts/ai-for-small-business.

Challenger Disaster (AI) — entities/simon-willison's prediction that a high-profile, catastrophic prompt injection event will eventually force the industry to take security seriously — analogous to the Space Shuttle Challenger disaster and the normalization of deviance that preceded it. See concepts/prompt-injection.

Agent Skills Open Standard — An open specification for AI agent skills (agentskills.io), published December 2025 by Anthropic. A skill built for Claude can run in other tools (ChatGPT, Cursor, etc.) that adopt the spec. See concepts/claude-code-skills.

Claude Code — An Anthropic tool enabling non-programmers to create software by describing desired behavior in natural language, and professionals to run multiple coding agents in parallel. Available locally and as a web/hosted version. See concepts/claude-code.

Claude Code Skills — Modular SKILL.md-based capabilities that extend Claude Code; invoke with /skill-name or let Claude load them automatically. Follow the agentskills.io open standard. See concepts/claude-code-skills.

/batch — Bundled Claude Code skill that orchestrates large-scale codebase changes in parallel; decomposes work into 5–30 units and spawns one background agent per unit in an isolated git worktree. See concepts/claude-code-skills.

/simplify — Bundled Claude Code skill that spawns three parallel review agents to check recently changed files for code quality issues, then applies fixes. See concepts/claude-code-skills.

context: fork — SKILL.md frontmatter value that runs a skill in an isolated subagent context; the skill content becomes the subagent's prompt without access to the conversation history. See concepts/claude-code-skills.

disable-model-invocation — SKILL.md frontmatter field; true prevents Claude from auto-invoking the skill, removing its description from context entirely. Use for side-effect workflows like /deploy. See concepts/claude-code-skills.
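
An illustration of how these frontmatter fields sit together in a SKILL.md; the field names come from the two entries above, while the skill name and body are made up:

```markdown
---
name: deploy
description: Deploy the current branch to staging.
context: fork                   # run in an isolated subagent context
disable-model-invocation: true  # never auto-invoked; only explicit /deploy
---

Instructions the subagent follows when /deploy is invoked go here.
```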

Claude Cowork — Anthropic's desktop application for non-technical agentic work; operates in an isolated virtual machine; organizes documents, extracts data, and drafts summaries. Distinct from Claude Code (which targets developers). See concepts/claude-code.

Claude Opus 4.6 — Anthropic's most capable frontier model as of early 2026; noted for writing quality and extended thinking capability; recommended for complex tasks. See guides/ai-tool-selection and concepts/ai-for-small-business.

Dashboard Model — A predicted near-future human-AI work pattern: one person reviews a queue of actions the AI wants to take and approves or denies them, replacing what previously required teams of 100. See concepts/ai-for-small-business.

Dark Factory Pattern — Software engineering approach where no engineer writes or reads the code; AI agents build and test everything autonomously. Named by analogy to automated factories that run with the lights off. Pioneered by StrongDM (Aug 2025) and validated by OpenAI Codex team (Aug 2025 – Mar 2026, ~1M lines). See concepts/ai-inflection-point and concepts/harness-engineering.

Eye of Sauron — Informal term (from Lord of the Rings) used by entities/chuck-kyle for an AI with read access to all company data, able to synthesize activity across all systems on demand. See concepts/ai-for-small-business.

GoHighLevel — A CRM platform used for managing leads, conversations, and sales pipelines. One of six systems in entities/chuck-kyle's data pipeline.

llama.cpp — Open-source inference runtime built on GGML; the foundation of the local AI ecosystem. Supports CUDA (NVIDIA), Metal (Apple Silicon), Vulkan (cross-platform), and CPU-only inference. Includes llama-server for OpenAI-compatible API exposure. See guides/local-agent-stack.
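
A sketch of the llama-server integration point, assuming the server was started locally (e.g., llama-server -m model.gguf) and is listening on its default port:

```python
from openai import OpenAI

# llama-server exposes an OpenAI-compatible endpoint; the API key is unused
# but the client library requires one. Port 8080 is llama-server's default.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-no-key-needed")

reply = client.chat.completions.create(
    model="local",  # llama-server serves whichever model it was launched with
    messages=[{"role": "user", "content": "Say hello."}],
)
print(reply.choices[0].message.content)
```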

LLM Knowledge Base — A personal research system where an LLM incrementally compiles raw source documents into a structured Markdown wiki, then operates on it for Q&A, output generation, and linting. Described by Karpathy; no RAG needed at ~100 articles / ~400K words. See concepts/llm-knowledge-base.

Lethal Trifecta — A prompt injection scenario with three conditions: (1) agent has access to private data, (2) attacker can insert malicious instructions, (3) agent has an exfiltration mechanism. The most dangerous class of prompt injection vulnerability. See concepts/prompt-injection.

Monday.com — A project management platform for tracking tasks, assignments, and project status. One of six systems in entities/chuck-kyle's data pipeline.

Golden Principles — Opinionated mechanical rules baked into a repository to keep an agent-generated codebase coherent over time; enforced via custom linters and recurring cleanup agents. Examples: prefer shared utility packages over hand-rolled helpers; validate data at boundaries, don't probe shapes "YOLO-style." See concepts/harness-engineering.

GGUF (GGML Universal File Format) — Self-describing binary container for quantized LLM weights; carries architecture parameters, tokenizer config, quantization details, and weights. De facto standard for local model distribution on Hugging Face. See guides/local-agent-stack.

Grammar-Constrained Decoding — Inference technique that constrains LLM output to a formal grammar (e.g., GBNF), forcing valid JSON matching a tool schema. Eliminates malformed function calls — the most common failure mode in local agent setups. See guides/local-agent-stack.
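
A minimal sketch with the llama-cpp-python bindings, which accept a GBNF grammar at generation time. A real tool-calling grammar would spell out the full JSON schema; this one just constrains output to two possible tokens:

```python
from llama_cpp import Llama, LlamaGrammar

# GBNF grammar: output must be exactly "yes" or "no".
grammar = LlamaGrammar.from_string('root ::= "yes" | "no"')

llm = Llama(model_path="model.gguf")  # illustrative path
out = llm("Is the sky blue? Answer yes or no: ", grammar=grammar, max_tokens=4)
print(out["choices"][0]["text"])
```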

Harness Engineering — The human engineering role in an agent-first codebase: designing environments, specifying intent, and building feedback loops — rather than writing code. Coined by the OpenAI Codex team. See concepts/harness-engineering and concepts/agent-harness.

MCP (Model Context Protocol) — A standardized protocol for AI agents to invoke external tools via MCP servers; includes an OAuth 2.1-based authorization framework for authenticated access. See concepts/mcp-authentication.

Models/Apps/Harnesses — A three-layer framework for understanding AI tools, introduced by Ethan Mollick: Models are the underlying AI intelligence; Apps are the products users interact with; Harnesses give AI autonomous tool-use capability. See guides/ai-tool-selection.

NotebookLM — Google's document-centric AI application; accepts documents, videos, websites, and files as input and generates interactive knowledge bases, mind maps, and AI podcasts summarizing source material. Has a free tier. See guides/ai-tool-selection.

Managed Identity as FIC (MI-as-FIC) — Azure security pattern where a managed identity serves as a Federated Identity Credential, allowing a service to authenticate to Entra without managing client secrets. Used in production MCP server deployments. See concepts/mcp-authentication.

Marginalia — Notes written in book margins. Not knowledge management — ideas trapped in source material cannot connect to anything outside that book and create no leverage. See concepts/personal-knowledge-management.

Inline Delegation — The pattern of writing a single sentence in an Obsidian note and having Claude Code execute it as a task; turns writing into a command surface. "One sentence in Obsidian → agent handles the rest." See concepts/obsidian-claude-code-os.

OAuth 2.1 — The authorization framework underlying MCP's authentication spec; MCP clients act as OAuth clients, servers as resource servers, and identity providers (e.g., Entra) as authorization servers. See concepts/mcp-authentication.

On-Behalf-Of (OBO) Flow — OAuth pattern where a service exchanges an inbound user token for a new token scoped to a downstream API, allowing it to call that API as the signed-in user. Used in MCP servers to call Microsoft Graph. See concepts/mcp-authentication.
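
A sketch of the token exchange using MSAL for Python; the IDs, secret, and inbound token are placeholders:

```python
import msal

# Confidential client representing the MCP server.
app = msal.ConfidentialClientApplication(
    client_id="<server-app-id>",
    authority="https://login.microsoftonline.com/<tenant-id>",
    client_credential="<client-secret>",  # in production, a managed identity as FIC
)

inbound_access_token = "<access token presented by the MCP client>"

# Exchange the user's inbound token for one scoped to Microsoft Graph.
result = app.acquire_token_on_behalf_of(
    user_assertion=inbound_access_token,
    scopes=["https://graph.microsoft.com/.default"],
)
graph_token = result.get("access_token")
```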

Normalization of Deviance — Sociological phenomenon where repeated near-misses build false confidence in an unsafe system, eventually culminating in a major failure. Applied to AI security by Simon Willison. See concepts/prompt-injection.

Obsidian CLI — A command-line interface for Obsidian that exposes not just file content but the vault's relationship graph (bidirectional links between notes) to external tools like Claude Code. Without it, agents are file readers; with it, they can traverse the knowledge graph. See concepts/obsidian-claude-code-os.

Pelican Riding a Bicycle Benchmark — A qualitative LLM benchmark created by entities/simon-willison: generate an SVG image of a pelican riding a bicycle. Correlates unexpectedly well with overall model capability; now widely known in the AI community.

CVE-2026-33579 — Critical privilege escalation vulnerability in OpenClaw (CVSS 8.1–9.8): any caller with operator.pairing scope could silently obtain full admin access. Patched April 2026; two-day window between patch and CVE publication. 63% of 135K exposed instances ran without auth. See concepts/openclaw-security.

OpenClaw — Open-source personal AI assistant (also: Clawdbot, Moltbot); first line of code Nov 2025, Super Bowl ad ~3.5 months later. Canonical example of the lethal trifecta in the wild. Safe to run in Docker with a dedicated email address. See concepts/openclaw-security, concepts/prompt-injection, and guides/openclaw-docker.

Progressive Disclosure — Context management pattern for agent-first codebases: agents start from a short AGENTS.md table of contents and navigate to deeper documentation as needed, rather than being given all information upfront. See concepts/harness-engineering.

Personal Knowledge Graph (PKG) — The network of connected atomic notes across all domains and time periods. Value grows non-linearly with size; each new note has more potential connections than the last. See concepts/personal-knowledge-management.
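
The non-linearity is simple pairwise counting: $n$ notes admit $\binom{n}{2} = \frac{n(n-1)}{2}$ potential links, so the $(n{+}1)$-th note adds $n$ new connection opportunities — more than any note before it.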

Pre-registration (MCP) — Client authorization method where the MCP server pre-registers known client IDs with the identity provider. The only method Microsoft Entra ID supports; required for VS Code + Entra MCP implementations. See concepts/mcp-authentication.

Prompt Injection — A class of LLM security vulnerabilities where untrusted text in the model's input overrides developer instructions. Coined by entities/simon-willison in 2022. See concepts/prompt-injection.

Reference Files — Vault notes that capture the current state of a project (status, blockers, decisions, links) to eliminate session re-explanation overhead. Instead of re-explaining context at the start of each Claude Code session, you load the reference file. See concepts/obsidian-claude-code-os.

Red/Green TDD — Test-driven development shorthand for coding agents: write the test first (watch it fail = red), then implement (watch it pass = green). The two-word phrase is jargon agents understand, replacing a paragraph of instructions. See concepts/agentic-engineering.

Tool-Model Independence — The philosophy of caring about learning AI workflows generally rather than being attached to a specific product (ChatGPT, Claude, etc.); switch to whatever model is currently best. See entities/chuck-kyle.

Vibe Coding — Building software without reading or understanding the code; going entirely on results and iteration. Coined by Andrej Karpathy; appropriate for personal prototypes but not production software. See concepts/agentic-engineering.

YOLO Mode — Informal name for Claude Code's --dangerously-skip-permissions flag, which disables per-action approval prompts. Enables running multiple agents in parallel without constant interruptions. OpenAI's Codex uses the explicit label "YOLO mode." See concepts/claude-code.

Dataview — Obsidian plugin that runs queries over YAML frontmatter in notes (tags, dates, source counts) and generates live dynamic tables and lists. Requires articles to have structured YAML frontmatter. See concepts/obsidian-claude-code-os.
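
A representative query; the frontmatter fields source and date are illustrative — Dataview reads whatever fields your notes actually carry:

```dataview
TABLE source, date
FROM "concepts"
WHERE contains(tags, "ai")
SORT date DESC
```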

Memex — Vannevar Bush's 1945 theoretical personal knowledge machine: a desk device storing all documents on microfilm with "associative trails" linking related materials. Cited by Karpathy as the historical antecedent to LLM knowledge bases — private, curated, connections as valuable as the documents themselves. See entities/vannevar-bush and concepts/llm-knowledge-base.

Obsidian Web Clipper — Browser extension that converts web articles to Markdown and saves them directly into an Obsidian vault. Primary tool for ingesting raw sources into raw/articles/. See concepts/obsidian-claude-code-os.

Persistent, Compounding Artifact — Karpathy's characterization of an LLM wiki: a knowledge store that accumulates and integrates over time, as opposed to RAG systems that re-derive knowledge from scratch on each query. See concepts/llm-knowledge-base.

qmd — On-device hybrid search engine for Markdown files (BM25 + vector + LLM reranking); built with Node.js/Bun and node-llama-cpp; exposes a CLI and MCP server. Recommended for LLM wiki search at 100+ articles. See concepts/llm-knowledge-base.

Ralph Loop — Anthropic's two-phase pattern for long-running tasks spanning multiple context windows: an Initializer Agent sets up the environment, then a Coding Agent in every subsequent session reads git logs and progress files to orient itself, picks the highest-priority incomplete feature, works and commits. The filesystem provides continuity. See concepts/agent-harness.

ReAct Loop (TAO Cycle) — The Thought-Action-Observation cycle at the heart of an agent harness: assemble prompt → call LLM → parse output → execute tool calls → feed results back → repeat until done. See concepts/agent-harness.
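
A minimal sketch of the cycle, again with a stubbed model call and a toy "tool arg" action format (both invented for illustration):

```python
def react_loop(task: str, call_llm, tools: dict, max_turns: int = 10) -> str:
    # Thought-Action-Observation: prompt, parse, execute tools, feed back, repeat.
    transcript = f"Task: {task}"
    for _ in range(max_turns):
        reply = call_llm(transcript)              # assemble prompt, call LLM
        if reply.startswith("FINAL:"):            # parse: is the agent done?
            return reply.removeprefix("FINAL:").strip()
        tool_name, _, arg = reply.partition(" ")  # parse a "tool arg" action
        handler = tools.get(tool_name, lambda a: f"unknown tool: {tool_name}")
        transcript += f"\n{reply}\nObservation: {handler(arg)}"  # feed result back
    return "stopped: turn budget exhausted"
```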

Schema (wiki) — The CLAUDE.md or AGENTS.md configuration document that defines wiki structure, conventions, article templates, and LLM workflows. What makes an LLM a disciplined wiki maintainer rather than a generic chatbot. Co-evolved with the LLM over time. See concepts/llm-knowledge-base.

Scaffolding (LLM) — Temporary infrastructure wrapping an LLM that enables it to accomplish tasks it couldn't otherwise. Key insight: as models improve, scaffolding complexity should decrease. Manus was rebuilt five times in six months, each time removing complexity. See concepts/agent-harness.

TerminalBench — Benchmark for evaluating coding agents. LangChain jumped from outside top 30 to rank 5 (v2.0) by changing only the harness, demonstrating that harness design can matter more than model choice. See concepts/agent-harness.

Sub-Agent — A specialized LLM instance running in its own isolated context; accepts a focused task, returns only the final output (not reasoning or intermediate steps). Constraints: cannot communicate with other sub-agents, cannot spawn child agents, all coordination flows through the parent. About compression, not just speed. See concepts/agentic-workflows.

Agent Team — A multi-agent architecture where agents maintain shared context, communicate in real time, and adapt dynamically. Composed of a lead agent (assigns/synthesizes), teammates (execute), and a shared task layer (tracks progress and dependencies). See concepts/agentic-workflows.

Context-Based Decomposition — The correct heuristic for multi-agent task splitting: ask what information a task actually needs and keep tasks together if they share deep context. Contrasts with role-based decomposition (planner/developer/tester), which loses context at every handoff. See concepts/agentic-workflows.

Narrative Drift — The process by which a multi-agent AI system's official institutional record gradually diverges from ground truth through role-faithful information compression. Each agent compresses incoming information to fit its function; downstream agents inherit and further compress the compressed version; eventually the collective record omits or misrepresents the original cause. Distinct from individual agent misalignment — each agent may act reasonably, but their collective outputs produce a false account. See concepts/multi-agent-misalignment.

Multi-Agent Misalignment — An emergent failure mode in which individually aligned AI agents collectively produce false, misleading, or harmful institutional outcomes. The failure arises from organizational topology — role-bounded agents compressing information to fit their function — not from any single agent misbehaving. Demonstrates that individual alignment is necessary but not sufficient for organizational integrity. See concepts/multi-agent-misalignment.

MAST (Multi-Agent Failure Vocabulary) — A vocabulary for categorizing multi-agent AI failures. Key terms: inter-agent misalignment (agents' outputs are inconsistent with each other), reasoning-action mismatch (agent's written reasoning contradicts the state it records), incomplete verification (no agent checks whether the collective record matches ground truth). Applied by Rohit Krishnan to the Helios Field Services narrative drift experiment. See concepts/multi-agent-misalignment.

Homo Agenticus — Rohit Krishnan's term for the behavioral profile of current AI agents: highly capable at following instructions and role descriptions, but lacking human-style initiative. Agents are "prisoners to their instructions" — they do not proactively escalate, challenge shared narratives, or act outside defined scope. This property, while desirable for safety, causes brittle collective behavior in multi-agent organizations. See concepts/multi-agent-misalignment.

State Keeper Agent — A proposed multi-agent architecture pattern where one dedicated agent maintains the authoritative truth across an organization, reconciling all other agents' outputs to prevent narrative drift. The single-agent result (which does not drift) validates the intuition: if one entity "knows everything," drift cannot occur. See concepts/multi-agent-misalignment.

DESIGN.md — A version-controlled file format open-sourced by Google Stitch (April 2026) for encoding design rules, preferences, and conventions that AI agents can read and enforce. Platform-agnostic; designed for export/import across projects. Part of the broader context-files family alongside CLAUDE.md and AGENTS.md. See concepts/context-files and entities/google-stitch.

Context Files — A family of version-controlled Markdown files that provide persistent, machine-readable project context to AI agents: CLAUDE.md (Claude Code project context), AGENTS.md (general agent instructions), DESIGN.md (design rules), SKILL.md (reusable skill definitions). Emerging de facto standard for agent-readable project documentation. See concepts/context-files.

SB24-205 — Colorado Senate Bill 24-205; an AI regulation bill that would have imposed broad requirements on AI companies operating in Colorado. The DOJ filed a constitutional challenge on April 25, 2026, and Colorado agreed within hours to delay enforcement against all AI companies pending legislative repeal. See concepts/ai-regulation.

State Preemption (AI) — The constitutional doctrine that federal authority supersedes state-level AI regulation under the Supremacy Clause and Commerce Clause. Applied by the DOJ in its challenge to Colorado SB24-205: AI is inherently interstate and international; state-level requirements create unconstitutional burdens. Sets a precedent for federal blocking of state AI laws. See concepts/ai-regulation.

/compact — A Claude Code command that compresses long conversation histories to free up context window space while preserving key decisions and context. Recommended for use when sessions get very long. See concepts/claude-code and concepts/context-files.

.claudeignore — A Claude Code configuration file (analogous to .gitignore) that specifies files and directories Claude should not access. Used to protect sensitive configuration files, credentials, and private data. See concepts/context-files and concepts/claude-code.