Devin's 80% Moment: Background Agents and 7x PRs

"To actually test that change we have to reason through how do you first run these applications to orchestrate with each other with the right version of the code."

Recap

  • 00:00-03:00: Background agents move software work from IDE babysitting to async cloud sessions.
  • 03:00-06:00: Cognition shares Devin usage claims while Cole Murray explains OpenInspect as a response to client pain.
  • 06:00-12:00: OpenInspect offers an open-source path; Cognition sells deployment, compute, integrations, and adoption help.
  • 12:00-18:00: Secrets, control planes, repo setup, and full VMs matter for real codebases.
  • 21:00-27:00: Testing and video verification become the practical center of reviewable agent work.
  • 27:00-36:00: Slack, GitHub, logs, databases, and project memory often matter more than generic connector checklists.
  • 36:00-43:00: Useful multi-agent work looks like controlled parallelism, not unmanaged swarms.
  • 43:00-57:00: Unchecked auto-merge can create code smells and codebase decay without review, linting, and module boundaries.
  • 57:00-72:00: Agent-ready codebases need safe local dev paths, mocks, observability, and realistic test environments.

Context

The source is a Latent.Space episode with Cognition's Walden Yan and OpenInspect creator Cole Murray. Cognition makes Devin, an autonomous software-engineering agent. OpenInspect is an open-source background-agent system that companies can adapt for their own workflows. The episode discusses background agents, cloud agents, Devin usage, OpenInspect, testing, repo setup, secrets, virtual machines, GitHub and Slack integrations, memory, review loops, and enterprise adoption. Walden describes Devin usage inside Cognition, and Cole describes OpenInspect deployments and background-agent infrastructure.

Technical Need To Know

  • Background agent: An AI coding system that runs asynchronously in a remote (typically cloud) session rather than on the user's local machine.
  • Pull request: A proposed code change that can be reviewed and merged.
  • Sandbox: An isolated environment for safe command execution and code edits.
  • Control plane: The service coordinating sessions, tools, permissions, and infrastructure.
  • Scoped secrets: Credentials limited to what a specific task needs.
  • Repo setup: Making a codebase install, run, and test in a fresh environment.
  • Full VM: A full virtual machine for realistic app and tool execution.
  • Video verification: A recording that shows what the agent tested.
  • MCP: Model Context Protocol, useful but not a replacement for deep first-party integrations.
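
To make the "scoped secrets" entry above concrete, here is a minimal sketch of how a control plane might hand a session only the credentials its task needs. This is not how Devin or OpenInspect implements it; the function name `scope_secrets`, the secret names, and the placeholder values are all hypothetical.

```python
from typing import Mapping


def scope_secrets(vault: Mapping[str, str], allowed: set[str]) -> dict[str, str]:
    """Return only the credentials a task is allowed to see (hypothetical helper)."""
    # Fail loudly if the task asks for a secret that does not exist,
    # rather than silently starting a session with a partial environment.
    missing = allowed - set(vault)
    if missing:
        raise KeyError(f"task requests unknown secrets: {sorted(missing)}")
    return {name: vault[name] for name in sorted(allowed)}


# Placeholder values for illustration only.
vault = {
    "GITHUB_TOKEN": "example-github-token",
    "STAGING_DB_URL": "example-staging-url",
    "PROD_DB_URL": "example-prod-url",
}

# A task that only needs to open a PR and run tests against staging.
session_env = scope_secrets(vault, {"GITHUB_TOKEN", "STAGING_DB_URL"})
```

The sandbox would then be started with `session_env` as its environment, so a misbehaving session never has the production credential in reach.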

Nuanced Take

Background agents raise the value of specs, module boundaries, lint rules, tests, screenshots, video evidence, and human review. They become useful when they return evidence-rich work packets rather than simply creating more code.
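
One way to picture an "evidence-rich work packet" is as a small record that a merge gate can check before anything lands. The sketch below is an illustration under stated assumptions, not a format from the episode: the `WorkPacket` fields and the `merge_blockers` function are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class WorkPacket:
    """What an agent hands back alongside its pull request (hypothetical shape)."""
    summary: str
    tests_passed: bool = False
    lint_passed: bool = False
    # Paths to screenshots or video recordings showing what was tested.
    evidence: list[str] = field(default_factory=list)
    human_approved: bool = False


def merge_blockers(packet: WorkPacket) -> list[str]:
    """List the reasons a packet should not be merged; empty means ready."""
    blockers = []
    if not packet.tests_passed:
        blockers.append("tests have not passed")
    if not packet.lint_passed:
        blockers.append("lint has not passed")
    if not packet.evidence:
        blockers.append("no screenshot or video evidence attached")
    if not packet.human_approved:
        blockers.append("no human review")
    return blockers


# An agent result with green checks but no evidence and no reviewer yet.
packet = WorkPacket(summary="Fix checkout rounding", tests_passed=True, lint_passed=True)
blockers = merge_blockers(packet)
```

Returning the list of reasons instead of a single boolean keeps the gate reviewable: the reviewer sees exactly what is still missing.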