Where we let AI drive — and where we don't
AI-accelerated delivery isn't about letting a model write your product. It's knowing precisely which 20% of the work it should absorb — and which 80% a human must stay in charge of.
Every agency now says they “use AI.” It’s become as meaningless as saying you use a computer. The interesting question isn’t whether — it’s where. Because the line between work AI should do and work it absolutely shouldn’t is where good software is won or lost.
We’ve drawn that line carefully, over a lot of projects. Here’s exactly where it falls.
What AI is genuinely great at
There’s a large category of engineering work that is necessary, repetitive, and — frankly — not where craft lives. This is where we let AI move fast:
- Boilerplate and scaffolding. Wiring up a new route, generating a typed client from a schema, stamping out the tenth CRUD resource that looks like the previous nine. AI does this in seconds and rarely gets it wrong, because there’s nothing novel to get wrong.
- First-pass tests. Generating the obvious unit tests — the happy path, the null check, the boundary case — gives our engineers a running start. They then add the tests that actually matter: the weird ones that come from understanding the domain.
- Translation and transformation. Porting a function between languages, converting a data shape, refactoring a pattern across fifty files. Mechanical work with a clear right answer.
- The blank page. A first draft of a component, a config, a migration. Not because we ship it as-is, but because editing is faster than starting from zero.
The common thread: these are tasks where the answer is known, just tedious to type. AI removes the typing. That’s the 20% it absorbs — and it’s real time, handed straight back to you as a lower invoice.
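To make the first-pass tests point concrete, here is a minimal sketch in Python. The `page_slice` helper and its behavior are hypothetical, invented only for this illustration. The first three tests are the obvious ones a model drafts well; the last one stands in for the kind of test an engineer adds once they know how the product is actually used.

```python
from typing import List, Optional, Sequence


def page_slice(items: Optional[Sequence[str]], page: int, per_page: int = 10) -> List[str]:
    """Return one page of items. Hypothetical helper, used only for illustration."""
    if items is None:
        return []
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be >= 1")
    start = (page - 1) * per_page
    return list(items[start:start + per_page])


# The obvious first-pass tests: happy path, null check, boundary case.
def test_happy_path() -> None:
    assert page_slice(["a", "b", "c"], page=1, per_page=2) == ["a", "b"]


def test_null_input() -> None:
    assert page_slice(None, page=1) == []


def test_boundary_last_partial_page() -> None:
    assert page_slice(["a", "b", "c"], page=2, per_page=2) == ["c"]


# A domain-driven test (assumed requirement): a stale link to a page past
# the end should render an empty list, not raise an error.
def test_page_past_the_end_is_empty() -> None:
    assert page_slice(["a", "b", "c"], page=5, per_page=2) == []
```

The first three tests follow from the function signature alone. The fourth only exists because someone decided what a user with an outdated link should see, and that decision is not in the code for a model to find.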
Where a human must stay in charge
Then there’s the work where handing the wheel to a model is how projects quietly go wrong. AI assists here, but it never decides.
Architecture. The shape of a system — what’s a service and what’s a function, where state lives, how data flows, what happens when a dependency fails — is a series of judgment calls with long consequences. A model will happily generate an architecture. Whether it’s the right one for your scale, your team, and your budget is exactly the question it can’t answer, because it doesn’t carry the context or the accountability.
Security and trust boundaries. Auth, permissions, how untrusted input is handled, what’s exposed at the edge. This is the code where “looks plausible” and “is correct” are dangerously far apart. A subtly wrong access check passes every demo and fails the one time it matters. A human owns this, reviews every line, and reasons about the attacker — not just the user.
The danger of AI-generated code isn’t that it looks wrong. It’s that it looks right — confidently, fluently right — while being subtly, expensively wrong.
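Here is a sketch of what a subtly wrong access check can look like. The `User` and `Document` types, the role names, and the multi-tenant setup are all assumptions made up for this example, not code from a real project. The first function reads as perfectly reasonable and passes any demo run inside a single organization. The second adds the one check a reviewer thinking about the attacker would ask for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    id: int
    org_id: int
    role: str  # assumed roles: "admin" or "member"


@dataclass(frozen=True)
class Document:
    id: int
    org_id: int
    owner_id: int


def can_read_plausible(user: User, doc: Document) -> bool:
    # Looks right: owners and admins can read.
    # Missing question: admin of WHICH organization?
    return doc.owner_id == user.id or user.role == "admin"


def can_read_reviewed(user: User, doc: Document) -> bool:
    # Enforce the tenant boundary first, then ownership or role inside it.
    if user.org_id != doc.org_id:
        return False
    return doc.owner_id == user.id or user.role == "admin"


owner = User(id=1, org_id=10, role="member")
outside_admin = User(id=3, org_id=20, role="admin")
doc = Document(id=100, org_id=10, owner_id=1)

# Both versions agree in the demo scenario.
demo_ok = can_read_plausible(owner, doc) and can_read_reviewed(owner, doc)

# They disagree only when an admin from another organization asks.
leak = can_read_plausible(outside_admin, doc)       # True: cross-tenant read
blocked = not can_read_reviewed(outside_admin, doc)  # True: request denied
```

Nothing about the first function looks broken, and every test written from the user's point of view passes. The gap only shows up when someone asks what an admin from a different organization can reach.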
Product and UX judgment. What to build, what to cut, what a confused user actually needs at a specific moment. This is empathy and taste applied to a real human’s problem. A model can generate a hundred variations; it can’t tell you which one respects your user’s time.
The final read. Every line that ships gets a human review. Not a skim — a read by someone who understands why it’s there and what it touches. AI can draft. It cannot be accountable.
Why “human-led” is a feature, not a hedge
It would be cheaper, in the short term, to let AI run further and review less. Some shops do exactly that, and you can usually tell — the product works in the demo and gets strange at the edges, where nobody was really thinking.
We made the opposite bet. Our engineers and designers lead the work; AI never does. It clears the repetitive 20% off their plate so they can spend more of their hours on the architecture, the security, and the polish: the parts that decide whether your product is good.
That’s the whole model in one sentence: AI handles the busywork so senior humans can pour their time into the craft. The savings are real because the time saved is real. The quality holds because a human never stopped being in charge of the parts that decide it.
The agencies that will struggle are the ones that got the line wrong: the ones that let the model drive the 80% it shouldn’t and dressed it up as innovation. We’d rather be boringly disciplined about it. Your product is not the place to find out where the line actually was.