Published 2026-05-04

6 min read

Why architecture-first delivery controls AI behavior

Part 3 of 3. Most of the risk does not come from the model itself. It comes from how loosely structured the surrounding system is. Generators and scaffolding narrow what AI can produce before any review takes place.

TL;DR

  • Open-ended prompting on a loosely structured codebase is the highest-risk configuration. Architecture-first delivery with narrow AI involvement is the lowest.
  • Generators and scaffolds include observability, compliance boundaries, and naming conventions in every file they produce, so new work follows the structure by default.
  • Corsair generates code internally first. We use vendor models only for small, reviewable changes where the trade-off between data exposure and payoff is clear.

Corsair Media Group

Most of the risk is in the surrounding system

Most discussion of AI risk focuses on review discipline: who reads the diff, how large the change is, and whether the engineer who merges the code understands what is going into production. Those controls matter, but the constraint upstream of review is what the model is writing against in the first place.

When a team applies AI to an unstructured codebase, the model has wide freedom to make architectural decisions. It will choose patterns, introduce abstractions, and set up integrations based on the prompt and on what its training data suggests. Those choices can look reasonable in isolation while conflicting with the structure that the team plans to operate over the long term.

Architectural drift is harder to spot than a missing null check

Reviewers then have to catch architectural drift in addition to bugs. Architectural drift is harder to see in a diff than a missing null check. A single file can look fine on its own and still move the system toward an integration pattern, a naming convention, or an abstraction that the team did not choose.

That kind of drift compounds. By the time it appears in operations, the code is already in production, the tests already accept the new pattern, and the cost of reversing the direction is much higher than the cost of preventing it at the point of generation.
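A concrete sketch of the drift described above. Both functions below are locally correct and would pass a diff-level review; only one follows the team's chosen integration pattern. All names (`UserRepository`, `fetch_user_drifted`) are illustrative, not code from any real system.

```python
# Illustration of architectural drift: both functions "work", but one bypasses
# the integration pattern the team chose. All names are hypothetical.

class UserRepository:
    """Team convention: all data access goes through a repository abstraction."""
    def get(self, user_id: str) -> dict:
        return {"id": user_id}          # stand-in for the real data layer

def fetch_user_via_repo(repo: UserRepository, user_id: str) -> dict:
    return repo.get(user_id)            # follows the architecture

def fetch_user_drifted(user_id: str) -> dict:
    # Looks fine in isolation and passes tests -- but it hard-wires a direct
    # query path the team never chose, and the pattern spreads from here.
    fake_db = {"u1": {"id": "u1"}}      # stand-in for a direct connection
    return fake_db.get(user_id, {})
```

Nothing in the drifted function is a bug, which is exactly why it slips past review focused on local correctness.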

What architecture-first delivery actually constrains

The countermeasure is to have the architecture written down before AI is involved. Templates, scaffolding, and generators encode the rules of the system inside the files they produce. Observability hooks, compliance boundaries, naming conventions, and integration points are present in the output before any prompt is written.

When AI then fills in details inside that scaffolding, the range of decisions it can make is narrow. It is no longer choosing the shape of the system. It is completing pre-defined sections inside a system that has already documented its rules.
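The idea of a generator that encodes the rules of the system can be sketched in a few lines. This is a minimal illustration, not Corsair's actual tooling: the module names (`observability`, `emit_otel_span`) and the `AI_FILL` marker are assumptions made for the example.

```python
# Minimal sketch of an architecture-first file generator. The observability
# hook and naming convention are baked into the template, so every generated
# file carries them before any prompt is written. All names are illustrative.

TEMPLATE = '''\
"""{service}_handler: generated module -- follows the team naming convention."""
from observability import emit_otel_span   # observability hook, present by default

def handle_{service}(payload: dict) -> dict:
    with emit_otel_span("{service}.handle"):
        # AI_FILL: business logic only -- the integration points above are fixed
        raise NotImplementedError
'''

def generate_module(service: str) -> str:
    """Produce a new module whose structure is decided before any prompt is written."""
    if not service.isidentifier():
        raise ValueError(f"service name must follow the naming convention: {service!r}")
    return TEMPLATE.format(service=service)

print(generate_module("billing"))
```

A model filling in the `AI_FILL` section cannot omit the span or rename the handler without visibly breaking the template's shape, which is the narrowing the article describes.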

Architecture-first delivery reduces AI risk for teams that plan to use the tools deliberately. Strong generators do not replace human review. They reduce the number of decisions a reviewer has to evaluate, so that the review focuses on local correctness rather than on whether the change is moving the architecture in a direction the team did not choose.

How Corsair works: generators first, models for narrow tasks

We build internal systems that produce code, including templates, scaffolds, and generators. The architecture is built into every new file rather than re-decided with each prompt. Once a generator already encodes a pattern, a general-purpose model usually adds little beyond what the generator already produces.

For new prototypes and large refactors, models can still shorten discovery when the diff is small enough to review carefully. Our usual sequence is to scaffold internally, write a handcrafted patch, and only then optionally send two to ten lines per file through a vendor model. At that point the decision is an economic one. The data exposure is weighed against the payoff, and both are visible to the engineer who owns the merge.

Model output is most useful when the task is reshaping data structures from end to end. Rename fields. Pass a new structure through the types and serializers. Touch the obvious call sites so that the compiler and the tests reveal what was missed. We still treat that work as an aggressive refactor. The changes are reviewable, the test suite runs, and someone on our team owns the merge.
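The end-to-end reshaping described above can be made concrete with a small sketch. The types and field names (`Order`, `customer_id`) are hypothetical; the point is that a rename flows through the type and the serializer, so stale call sites fail loudly.

```python
# Sketch of an end-to-end field rename -- the kind of mechanical refactor the
# article describes delegating to a model. Names are illustrative only.
from dataclasses import dataclass, asdict

@dataclass
class Order:
    customer_id: str    # renamed field: the rename flows through type and serializer
    amount_cents: int

def serialize(order: Order) -> dict:
    # asdict follows the dataclass fields, so the serializer picks up the rename
    return asdict(order)

def deserialize(raw: dict) -> Order:
    # call sites still using the old field name now raise KeyError / TypeError,
    # which is how "the compiler and the tests reveal what was missed"
    return Order(customer_id=raw["customer_id"], amount_cents=raw["amount_cents"])

order = Order(customer_id="c-42", amount_cents=1999)
assert deserialize(serialize(order)) == order
```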

Is exposing the codebase to a vendor model worth polishing the last handful of lines when local generators already handled the boilerplate?

When we use a vendor model, we are usually adding two handwritten lines on top of generated base code. The default is generators and architecture-first scaffolds rather than open-ended prompting. Fewer one-off integration points end up in the repository. Observability such as OpenTelemetry appears alongside compliance hooks and internal conventions before any large amount of context is sent to the cloud. Vendor models touch only small, reviewable changes where the risk and the payoff are both clear.
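The "small, reviewable changes" rule could be enforced mechanically. The guard below is a hypothetical sketch, not a description of Corsair's pipeline; the ten-line threshold is taken from the per-file figure mentioned earlier in this article, and the function name is an assumption.

```python
# Hypothetical guard for the "small, reviewable changes" rule: refuse to route
# a patch to a vendor model if any single file changes more lines than the
# budget. Threshold and names are assumptions made for this sketch.
MAX_LINES_PER_FILE = 10

def vendor_model_allowed(changed_lines_by_file: dict[str, int]) -> bool:
    """Return True only when every touched file stays within the per-file budget."""
    return all(n <= MAX_LINES_PER_FILE for n in changed_lines_by_file.values())

assert vendor_model_allowed({"billing.py": 4, "types.py": 9})
assert not vendor_model_allowed({"billing.py": 25})
```

A check like this keeps the data-exposure decision visible and bounded rather than implicit in how a prompt happens to be written.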

Do you have to use AI?

No. Deterministic generators already cover most routine delivery. Vendor models can be an optional accelerator inside a disciplined workflow. Our recommendation is to reach a point where you do not depend on third-party AI tooling for routine engineering, then introduce models only where the evidence shows they help more than your existing generators and review practice.

Architecture and deterministic generation can already keep up with much of what teams ask AI to do. Introduce AI in narrow, well-defined sections of the work, where the payoff is clear and the accountability does not change hands. That sequence keeps the failure modes from Part 2 contained, and it keeps the workflow patterns from Part 1 inside boundaries that the team can actually defend.

With rules in place, AI speeds up delivery; without them, it makes the codebase less consistent

Open-ended prompting on a loosely structured codebase is the highest-risk configuration we have encountered. Architecture-first work with narrowly scoped AI is the lowest. What you allow the tool to decide before anyone reads the diff matters more than which vendor release ships next, and it matters more than how careful any later cleanup becomes.

Architectural discipline determines how much benefit you get from models. With clear rules in place, AI can speed up delivery inside those rules. Without them, generated work tends to make the codebase less consistent rather than more consistent.

If this matches your situation, reach out through our contact page and we can discuss where architecture-first delivery would start for your work. If you want generators ahead of open-ended prompting on your next build, talk with Corsair.

Contact Corsair