Topic hub · Last updated May 9, 2026

Frontier Watch

A new model ships every six weeks. Most of the coverage is benchmarks and vibes. The CEO-useful question is whether anything changed about what you should be running, what you should be paying for, or what your team should be learning. Frontier Watch is that filter.

Below: everything we have published on the frontier.

Start here

The News stream is the daily feed; this hub aggregates the frontier-tagged subset. As more news posts land tagged with frontier-watch, they appear here automatically.

Workflows & explainers

Explainer · Intermediate · June 8, 2026
What is Claude Mythos?
Claude Mythos is Anthropic's vulnerability-finding model: a general-purpose AI that hunts security holes in real software faster than any human team. Anthropic …

Explainer · Intermediate · June 6, 2026
What is the Compound Engineering plugin for Claude Code?
The Compound Engineering plugin is a free add-on for [Claude Code](/learn/claude-code) that bolts a plan, build, review, learn loop plus a library of 37 skills …

Explainer · Intermediate · June 4, 2026
What is Microsoft Scout and how can you use it?
Microsoft's first always-on "Autopilot" agent. It runs on the exact open [harness](/articles/what-is-a-harness) operators here already build with, [OpenCLAW](/a…

Explainer · Beginner · June 3, 2026
What AI can and cannot do for a CEO right now
In a Harvard and BCG experiment with 758 consultants, AI made them about 40% better at some tasks and about 19% worse at others. The tasks looked identical. Tha…

Explainer · Beginner · June 3, 2026
How AI actually works (plain English)
Under the hood of every chat tab: a pattern machine that predicts the next word, one token at a time. Beautiful fluency, no built-in truth filter, and a working…

Comparison · Beginner · June 2, 2026
ChatGPT vs Claude vs Gemini: which should your company standardize on?
Three excellent models, one practical question: which one do you make the company default? The honest answer depends less on benchmarks than on what you already…

Explainer · Beginner · June 2, 2026
What is a frontier model?
The handful of most capable AI models at the leading edge, the ones that cost hundreds of millions of dollars to train and reset the bar every few months. When …

Explainer · Beginner · June 2, 2026
What is a large language model (LLM)?
The engine inside ChatGPT, Claude, and Gemini. It predicts the next word from patterns it read across the internet. Brilliant at fluency, indifferent to truth, …

Explainer · Advanced · June 2, 2026
What is gbrain?
A memory layer for your AI agent, open-sourced by the CEO of Y Combinator. It is the fix for the thing that quietly breaks every agent: it forgets.

Explainer · Intermediate · June 2, 2026
What is Google Antigravity?
Google's agent harness. The engine room underneath Gemini Spark, and Google's answer to the open harness layer that OpenCLAW and Hermes occupy.

Explainer · Intermediate · June 2, 2026
What is Google Spark and how can you use it?
Google's 24/7 personal AI agent. The mainstream, no-terminal version of the "agent that works while you sleep" that operators here build with Claude Code and Op…

Comparison · Intermediate · June 1, 2026
Codex vs Claude Code
I run Claude Code every day. Codex is the coding agent your team is arguing about, and there is a good chance you already pay for it without knowing. Here is ho…

Explainer · Intermediate · June 1, 2026
What is a coding agent?
The category that Claude Code, Codex, and Cursor all belong to. Not an AI that talks about code. One that reads your files, edits them, runs them, and checks it…

Explainer · Beginner · June 1, 2026
What is Codex?
OpenAI's coding agent. The same idea as Claude Code, wearing an OpenAI badge, and probably already bundled in the ChatGPT plan you pay for every month.

All articles & long reads

Q&A · Advanced · June 2, 2026
Forget who's winning the harness wars. Pick one and start.
Hermes had a loud week and the timelines lit up. For a CEO, the winner of that fight matters far less than the fact that you still have not picked a side.

Q&A · Beginner · June 2, 2026
Which AI releases actually matter, and which can I ignore?
The majors ship one or two notable releases a week. A public tracker has already logged more than 300 model releases in 2026, across dozens of organizations. Yo…

Q&A · Beginner · June 2, 2026
Why AI sounds confident when it's wrong (and how to catch it)
Lawyers keep getting sanctioned for filing briefs full of court cases that don't exist. The AI didn't flag a single one. It wrote them the way it writes everyth…
