Human Approval Layers in Enterprise AI

"Human in the loop" is the most popular phrase in enterprise AI. It is also the most underspecified. Two workflows can both claim it and look completely different — one is cheap and scalable, the other has the reviewer redoing the AI's job every time. The phrase is a description, not a design.

There are at least four distinct approval layer designs, each appropriate for different workflows. Picking the right one is one of the highest-leverage decisions in the rollout.

Layer 1: Approve-before-ship

Every AI output goes to a human, who approves or rejects it before it leaves the workflow. Highest assurance, highest cost, lowest scale. Right for high-stakes, low-volume work: legal opinions, regulatory submissions, executive communications.

The trap: teams default to this layer because it feels safe. For most workflows it is too heavy: the throughput gained from the AI is spent again in review time.
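A minimal sketch of the gate described above, assuming a simple in-memory queue. The class name `ApprovalGate` and its fields are hypothetical, chosen for illustration only. The point it shows is that nothing reaches `shipped` without a named reviewer's decision.

```python
from dataclasses import dataclass, field


@dataclass
class ApprovalGate:
    """Holds every output until a named reviewer approves or rejects it."""
    pending: dict = field(default_factory=dict)
    shipped: list = field(default_factory=list)
    rejected: list = field(default_factory=list)

    def submit(self, output_id: str, text: str) -> None:
        # Every AI output enters the pending queue; none ships directly.
        self.pending[output_id] = text

    def decide(self, output_id: str, reviewer: str, approve: bool) -> None:
        # Raises KeyError if the output is not pending (already decided or unknown).
        text = self.pending.pop(output_id)
        record = {"id": output_id, "text": text, "reviewer": reviewer}
        (self.shipped if approve else self.rejected).append(record)
```

Because each output waits on a person, the length of `pending` is a direct measure of the review cost this layer carries.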

Layer 2: Exception-only review

The AI ships routine outputs directly. The workflow flags exceptions — by confidence score, by missing inputs, by policy match — and only exceptions go to a human. Right for medium-stakes, high-volume work: support response drafting, expense triage, first-pass contract review.

The trap: the exception rules have to be designed and maintained. A workflow with weak exception rules sends too much to humans (defeats the point) or too little (lets bad output ship).
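The three flag types named above (confidence score, missing inputs, policy match) can be sketched as one routing function. This is an illustration under stated assumptions: the threshold, the required input names, and the policy terms are all hypothetical and would be set per workflow.

```python
from dataclasses import dataclass, field

# Hypothetical values; tune them per workflow.
CONFIDENCE_FLOOR = 0.85
REQUIRED_INPUTS = ("customer_id", "ticket_text")
POLICY_TERMS = ("refund", "legal", "chargeback")


@dataclass
class Draft:
    text: str
    confidence: float
    inputs: dict = field(default_factory=dict)


def exception_reasons(draft: Draft) -> list:
    """Return every reason this draft should go to a human (empty list = ship)."""
    reasons = []
    if draft.confidence < CONFIDENCE_FLOOR:
        reasons.append("low_confidence")
    missing = [k for k in REQUIRED_INPUTS if not draft.inputs.get(k)]
    if missing:
        reasons.append("missing_inputs:" + ",".join(missing))
    lowered = draft.text.lower()
    if any(term in lowered for term in POLICY_TERMS):
        reasons.append("policy_match")
    return reasons


def route(draft: Draft) -> str:
    """Ship routine drafts directly; send exceptions to human review."""
    return "human_review" if exception_reasons(draft) else "ship"
```

Returning the reasons rather than a bare yes/no makes the rules easier to maintain: counting reasons over time shows which rule is sending too much to humans and which never fires.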

Layer 3: Sampling review

Every output ships immediately. A random or weighted sample is reviewed after the fact. Drift is detected by review trends rather than per-output gates. Right for low-stakes, high-volume work where the consequences of a single bad output are small: classification, summarization, internal note-taking.

The trap: this layer works only when individual failures are recoverable. It is not appropriate for outputs that are hard to retract once shipped.
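A rough sketch of weighted sampling plus a trend check, to make the mechanics above concrete. The rates, the `category` and `ok` field names, and the tolerance are assumptions for illustration, not recommended values.

```python
import random

# Hypothetical rates: fraction of shipped outputs pulled for after-the-fact review.
BASE_RATE = 0.05
BOOSTED_RATE = 0.25  # heavier sampling for categories under watch


def sample_for_review(outputs, watch_categories=(), rng=None):
    """Pick a weighted random sample of already-shipped outputs."""
    rng = rng or random.Random()
    picked = []
    for out in outputs:
        rate = BOOSTED_RATE if out["category"] in watch_categories else BASE_RATE
        if rng.random() < rate:
            picked.append(out)
    return picked


def failure_rate(reviews):
    """Share of reviewed outputs that the reviewer marked as bad."""
    if not reviews:
        return 0.0
    return sum(1 for r in reviews if not r["ok"]) / len(reviews)


def drifting(recent_reviews, baseline_rate, tolerance=0.05):
    """Flag drift when the recent failure rate exceeds baseline plus tolerance."""
    return failure_rate(recent_reviews) > baseline_rate + tolerance
```

The drift check looks at the trend across reviewed samples, not at any single output, which is why this layer only fits work where one bad output can be corrected after the fact.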

Layer 4: Tiered review

A combination. The output is classified into a tier on arrival, and each tier gets a different review treatment: approve-before-ship for high-stakes, exception-only for medium, sampling for low. Right for departments with heterogeneous work where the same workflow handles many output kinds.

The trap: more complexity to operate. Worth it when the workflow volume is high and the work mix is broad.
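The classify-then-treat flow can be sketched as a small dispatcher. The output kinds in the mapping are hypothetical examples, and sending unknown kinds to the strictest tier is an assumption of this sketch, not something the layer itself prescribes.

```python
from enum import Enum


class Tier(Enum):
    HIGH = "approve_before_ship"
    MEDIUM = "exception_only"
    LOW = "sampling"


# Hypothetical mapping from output kind to tier.
TIER_BY_KIND = {
    "regulatory_submission": Tier.HIGH,
    "support_reply": Tier.MEDIUM,
    "meeting_summary": Tier.LOW,
}


def classify(kind: str) -> Tier:
    """Assign a tier on arrival; unknown kinds default to the strictest tier."""
    return TIER_BY_KIND.get(kind, Tier.HIGH)


def review_treatment(kind: str, is_exception: bool = False) -> str:
    """Return what happens to an output of this kind."""
    tier = classify(kind)
    if tier is Tier.HIGH:
        return "hold_for_approval"
    if tier is Tier.MEDIUM:
        return "hold_for_review" if is_exception else "ship"
    return "ship_and_sample"
```

The operating complexity shows up in the mapping: every new output kind needs a tier decision, and a wrong entry quietly puts work under the wrong treatment.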

How to pick

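The choice is structural: what are the stakes per output, and what is the volume? Plot the workflow on those two axes and the right layer follows: high stakes at low volume calls for approve-before-ship, medium stakes at high volume for exception-only review, low stakes at high volume for sampling, and a broad mix of work at high volume for tiered review. Once you've picked the layer, the rest of the workflow design (intake, citations, exception rules, sampling rate) falls out of it.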
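The pairings given in the four layer descriptions can be written as a lookup on the two axes. This is a sketch only: the function name and the "low"/"medium"/"high" labels are assumptions, and combinations the layer descriptions do not cover are deliberately left unresolved rather than guessed.

```python
# Stakes/volume pairings taken from the layer descriptions.
LAYER_BY_PROFILE = {
    ("high", "low"): "approve-before-ship",
    ("medium", "high"): "exception-only",
    ("low", "high"): "sampling",
}


def pick_layer(stakes: str, volume: str, mixed_work: bool = False) -> str:
    """Map stakes and volume ("low", "medium", "high") to a review layer."""
    if mixed_work and volume == "high":
        # Broad work mix at high volume is the case tiered review is built for.
        return "tiered"
    # Profiles not named in the layer descriptions need a human decision.
    return LAYER_BY_PROFILE.get((stakes, volume), "needs a design discussion")
```

For example, `pick_layer("medium", "high")` returns `"exception-only"`, while a high-stakes, high-volume workflow falls through to the fallback string because none of the four layers is described as fitting it directly.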

"Human in the loop" is the start of a design conversation, not the end of one.
