The logical conclusion of AI-driven development is that “It’s dangerous to ship code to production that you haven’t read” will be replaced with “It’s dangerous to ship code to production that you had to read.”

Agentic development is a paradigm shift.

My anxiety about teams shipping unreviewed and untested code to production knows no bounds. And I’m not just worried about the behavior that is happening right now, I’m worried about the long-term incentives, about people forgetting that change control is a discipline. If it’s working, it’s working, but when it’s not working? Shipping even more code has never been the answer, even in the before days when code wasn’t “free.”

What the fuck happened to common sense?

There is an ocean of articles that just seem to accept the fact that we’re going to talk to our code, never read it, maybe never document it. Ship it, and if something’s fucked up, you just ask the LLM to rewrite it.

  • If you’re not reading your code, how will you know whether your tests are passing because the change didn’t break them, or because the LLM changed your tests?

What tools do we need?

  • We need tools that can quickly tell us which tests changed. I don’t know if Cucumber is having a resurgence, but maybe it should be, and it should include this feature.
  • Maybe we need to be able to automatically refactor one big change into multiple little changes.
  • We don’t need fewer libraries, we need more libraries. Maybe we need microlibraries, or something like MCP for code snippets. Maybe instead of recreating the Node.js left-pad problem with literal microlibraries, we need version-controlled, tested code patterns the AI can keep up to date, with a comment in the code referencing the index of the code pattern. I’m not sure exactly, but the idea of abandoning libraries because all the code can just be written by the LLM from scratch is bananas. Change control is good. Battle-tested code is good.
  • We probably still need specs. A shared language around what the code was intended to do. Schrödinger’s coding spec. If person A wrote the code based on input A, then I don’t think it’s right to say that the spec is the code. The spec is the prompt person A wrote. But we need a disciplined way to make changes to a code base with predictable results. Tweaking a value and knowing exactly what will happen.
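
To make the "which tests changed" idea concrete, here is a rough sketch, not a finished tool. It assumes a git repository and some naming conventions I made up for illustration (the `TEST_DIR_NAMES` set, the `test_` prefix, the `.spec` and `.feature` suffixes); the helper names `is_test_path`, `split_changes`, and `changed_paths` are hypothetical too. All it does is split a change set into test files and everything else, and flag the case where both moved in the same change.

```python
import subprocess
from pathlib import PurePosixPath

# Assumed conventions for what counts as a test; adjust for your repository.
TEST_DIR_NAMES = {"test", "tests", "spec", "features"}


def is_test_path(path: str) -> bool:
    """Guess whether a repo-relative path is a test file, by name only."""
    p = PurePosixPath(path)
    # Any parent directory with a test-like name marks the file as a test.
    if any(part in TEST_DIR_NAMES for part in p.parts[:-1]):
        return True
    return (
        p.name.startswith("test_")            # test_billing.py
        or p.stem.endswith("_test")           # billing_test.go
        or p.stem.endswith((".test", ".spec"))  # cart.test.ts, cart.spec.ts
        or p.suffix == ".feature"             # Cucumber/Gherkin files
    )


def split_changes(paths):
    """Split changed paths into tests and source, and flag mixed changes."""
    tests = sorted(p for p in paths if is_test_path(p))
    source = sorted(p for p in paths if not is_test_path(p))
    return {
        "tests": tests,
        "source": source,
        # Tests and code moving together is the case a human should look at.
        "needs_human_review": bool(tests and source),
    }


def changed_paths(base: str = "main"):
    """List files changed on the current branch relative to `base`."""
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in out.splitlines() if line]


if __name__ == "__main__":
    report = split_changes(changed_paths())
    for key, value in report.items():
        print(f"{key}: {value}")
```

Run in CI, something like this could fail or at least annotate any change where `needs_human_review` is true. It only looks at file names, so it says nothing about whether an assertion was weakened; that would need a real diff of the test bodies.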

I think the biggest observation I would make is that a lot of discipline has been developed over the past ten years: about automated checks and linting in CI, about atomic changes. Rather than abandoning those ideas in the face of a new paradigm where you just talk to your code, we need to double down on the harness. We need to be reviewing specs and reports, not code.