Topic hub · Last updated May 9, 2026
Agent Ops
Most agents look great in a demo and break in week two. Agent Ops is the operating discipline that prevents that: evals you trust, logs you read, retries you tune, and a way of knowing when an upgrade silently regressed a workflow that has been quietly running for months.
Below: everything we have published on Agent Ops.
Start here
Coverage of evals, observability, and the day-two operating discipline is on the roadmap. The first workflow article in this topic lands once we publish our internal eval harness.
Workflows & explainers
- When to use /goal vs. /loop in Claude Code: Two Claude Code commands that both keep it running without you re-prompting. `/goal` drives one job forward, turn after turn, until a finish line you define is …
- What is the Compound Engineering plugin for Claude Code? The Compound Engineering plugin is a free add-on for [Claude Code](/learn/claude-code) that bolts a plan, build, review, learn loop plus a library of 37 skills …
- When to use a goal vs. a dynamic workflow in Claude Code: Two new Claude Code features that both let it keep working without you. A dynamic workflow goes wide. A goal keeps going. One question tells you which to reach …
- How do I keep my OpenCLAW agent from posting things on the internet? Short version: your agent posts and sends with your credentials, on the channels you connected, so you keep it from going rogue with three things. A human-appro…
- What are dynamic workflows in Claude Code? Claude Code's new ability to plan a big job, run tens to hundreds of agents on its pieces in parallel, check the work, and hand you one finished result. Anthrop…
- What is Microsoft Scout and how can you use it? Microsoft's first always-on "Autopilot" agent. It runs on the exact open [harness](/articles/what-is-a-harness) operators here already build with, [OpenCLAW](/a…
- AI agents vs AI assistants: which one for which job? Assistant: you ask, it responds, you steer. Agent: you hand over a goal, it plans and acts on its own. The only question that matters is which one fits the job …
- A competitor-monitoring routine that pings you on real moves: A scheduled job that watches your competitors for you and alerts you only when something genuinely changes, so you stop missing the moves and stop drowning in n…
- Your daily executive brief, assembled before you wake up: A scheduled AI job that runs before you wake, reads your inbox and calendar, pulls the one metric you watch, and leaves a single-page brief waiting for you, so …
- What are connectors in Claude Code? The plug that lets Claude reach into the tools you already use, your email, Slack, your numbers, and do the work there instead of just talking about it.
- What is a context window? The model's working memory: everything it can see right now, measured in tokens. If the thing it needs isn't in the window, the model doesn't ask for it. It gue…
- What is an AI agent? An AI model put to work in a loop: it decides, takes an action, looks at the result, and repeats until the job is done. A chatbot talks and stops. An agent acts…
- What is gbrain? A memory layer for your AI agent, open-sourced by the CEO of Y Combinator. It is the fix for the thing that quietly breaks every agent: it forgets.
- What is Google Antigravity? Google's agent harness. The engine room underneath Gemini Spark, and Google's answer to the open harness layer that OpenCLAW and Hermes occupy.
- What is Google Spark and how can you use it? Google's 24/7 personal AI agent. The mainstream, no-terminal version of the "agent that works while you sleep" that operators here build with Claude Code and Op…
- What is MCP? The Model Context Protocol. The universal plug that lets any AI app connect to any tool or data source. Think USB-C for agents: one port, and everything that sp…
- What is RAG? Retrieval-augmented generation. The trick that turns an AI that knows the internet into an AI that knows your company. Every "chat with your docs" tool you have…
- Codex vs Claude Code: I run Claude Code every day. Codex is the coding agent your team is arguing about, and there is a good chance you already pay for it without knowing. Here is ho…
- What is agentskills.io? The open standard that makes the skills you build portable. Write a capability once, and run it on whatever tool you use next.
- What is Codex? OpenAI's coding agent. The same idea as Claude Code, wearing an OpenAI badge, and probably already bundled in the ChatGPT plan you pay for every month.
- What is a cloud VM? A computer you rent that lives in someone else's data center. Always on. Accessible from anywhere. Where you run an agent when you do not want to keep your lapt…
- What is a SQLite database? A whole database that lives in a single file on your computer. No server. No admin. No login. The thing most agent [harnesses](/articles/what-is-a-harness) use …
- What are skills in Claude Code? The folder that turns a workflow you ran once into a capability Claude executes on command, forever.
- Run autonomous workflows 24/7 with Claude Code Routines: Set one recurring task to run on Anthropic's servers, with your laptop closed.
- What is a CLAUDE.md file? The markdown file Claude reads on startup so it stops asking "what business are we in?" every time you open it.
- What is a cron? The thing that makes your agent wake up on its own and do things for you without you having to be there to ask.
All articles & long reads
- 10 ways a CEO can put AI to work this week: Three of every four professionals already let an AI notetaker sit in their meetings. You probably pay for ChatGPT or Claude and use it the way you'd use a smart…
- How to get your team to actually use AI: You gave the speech. You bought the seats. Three weeks later nobody's using it. The gap isn't access. It's adoption, and adoption follows the person at the top.
- Forget who's winning the harness wars. Pick one and start. Hermes had a loud week and the timelines lit up. For a CEO, the winner of that fight matters far less than the fact that you still have not picked a side.
- Hand your agent a half-day of work and walk away: Most CEOs hand an AI a five-minute task and then hover over it. The operators getting real leverage hand it half a day of work, set one checkpoint, and go to a …
- How I replaced a $10K/month agency with an AI stack: The agency sent beautiful reports every month. Charts that looked great and meant nothing in the bank account. The stack that replaced them costs a few hundred …
- Build the second brain before you build the agent: Everyone wants the agent. The ones that actually earn their keep are standing on a pile of your decisions, meetings, and notes. Build the pile first, and the ag…
- Why most AI agents fall apart in real work (and how to fix it): The agent that nailed your demo is not getting dumber. It's running out of context. The fix is not a smarter model; it's setting the agent up to succeed.
- Why AI sounds confident when it's wrong (and how to catch it): Lawyers keep getting sanctioned for filing briefs full of court cases that don't exist. The AI didn't flag a single one. It wrote them the way it writes everyth…