Earned Autonomy

Rajesh has been building AI recommendation systems for six years. He knows the failure mode better than most. The system starts conservative because it does not yet know the user. It gets ignored because it is too cautious. The product team bumps up the default autonomy to reduce friction. The system starts acting on insufficient knowledge. Users complain about decisions made without them. The product team adds confirmation prompts. Users develop confirmation fatigue and click through without reading. The confirmations become theater.

The pattern repeats across every AI product that sets autonomy as a static configuration. Either the default is too restrictive and users abandon it, or it is too permissive and users feel overridden. The answer to this is not a better default. It is a different structure entirely.

Autonomy should be earned through demonstrated competence. Not assumed at onboarding. Not purchased by the subscription tier. Not granted by the terms of service. Earned, through a record of correct decisions across a range of real situations, with the person’s explicit agreement at each step up.

Competence, not time

An agent that has operated for six months without error has earned something. An agent that has operated for six months without being tested has earned nothing. A system that has processed twelve routine medication refills has demonstrated that it can handle routine refills reliably. It has not demonstrated that it can handle a refill when the prescription changes, when a new drug interaction appears, or when the person’s insurance coverage shifts. Each new scenario type requires its own competence demonstration.

This is the foundational principle: earned autonomy tracks demonstrated competence across the actual range of situations the system encounters, not just the common ones. The system that has only been tested in normal conditions has not been tested. Edge case handling is a required component of the evidence package, not a bonus feature.

The evidence package

Earned autonomy in any domain rests on a portfolio of signals rather than a single metric.

Accuracy: how often did the system make the right decision, as assessed by the person’s subsequent behavior? Not the system’s internal confidence score. The person’s actual response. An action the system was confident about but the person overrode is not an accurate action. The measure of accuracy is the person’s endorsement, not the model’s probability.

Escalation appropriateness: did the system ask when it should have, and act when it should have? A system that acts on everything it should have escalated is overconfident. A system that escalates everything it should have handled autonomously provides no value. The right calibration is visible in the pattern of the person’s responses: she approves escalations quickly when they were warranted and expresses frustration when they were not.

Error recovery: when the system made a mistake, did it catch it? Correct it? Adjust to prevent the same error in the future? A system that makes mistakes and learns from them earns more trust than a system that makes fewer mistakes but treats them as unpredictable anomalies. Error recovery is a signal about the system’s self-awareness and its responsiveness to feedback, both of which are prerequisites for higher delegation.

Edge case handling: has the system encountered unusual situations and handled them appropriately? Appropriate handling of an edge case does not necessarily mean solving it autonomously. It includes recognizing that the situation is unusual, escalating to the right level, and doing so in a way that gives the person what she needs to make the decision. A system that escalates edge cases well is demonstrating exactly the judgment that earns wider latitude.

Person satisfaction: does the person’s behavior indicate comfort with the system’s decisions? Not survey responses. The person who consistently reviews the system’s actions without overriding them is communicating something. So is the person who consistently overrides. The system tracks this behavioral signal across domains and action types as one of the inputs to the competence assessment.
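Two of these signals, accuracy and escalation appropriateness, reduce to simple ratios over a log of decisions. The sketch below is illustrative only: the `Decision` record, its field names, and both functions are assumptions made for this example, not the production scoring. It measures accuracy by the person's endorsement rather than by model confidence, as described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    """One logged decision (hypothetical record shape)."""
    action_type: str
    endorsed: bool              # person accepted the action and did not reverse it
    escalated: bool             # system asked instead of acting
    escalation_warranted: bool  # judged afterward from the person's response
    edge_case: bool = False


def accuracy(decisions: list[Decision]) -> float:
    """Share of acted-on (non-escalated) decisions the person endorsed."""
    acted = [d for d in decisions if not d.escalated]
    if not acted:
        return 0.0
    return sum(d.endorsed for d in acted) / len(acted)


def escalation_appropriateness(decisions: list[Decision]) -> float:
    """Share of decisions where the system asked exactly when asking was warranted."""
    if not decisions:
        return 0.0
    matched = sum(d.escalated == d.escalation_warranted for d in decisions)
    return matched / len(decisions)


log = [
    Decision("refill", endorsed=True, escalated=False, escalation_warranted=False),
    Decision("refill", endorsed=True, escalated=False, escalation_warranted=False),
    # Acted when it should have asked; the person overrode it.
    Decision("refill", endorsed=False, escalated=False, escalation_warranted=True),
    # Unusual situation, correctly escalated.
    Decision("refill", endorsed=True, escalated=True, escalation_warranted=True, edge_case=True),
]
```

Error recovery, edge case handling, and person satisfaction are left out of the sketch because the text does not define them as ratios; they would need their own record fields.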

Five progression levels

Earned autonomy moves through five levels. Each level requires demonstrated competence at the previous level before progression can occur, and each level transition requires the person’s explicit agreement. The system proposes advancement. The person decides.

Level 1 is Observe. The system watches and learns for the first thirty days. No autonomous action. No pushed recommendations. The system builds a baseline understanding of the person’s patterns, preferences, and the shape of the domain. This period costs the person nothing in terms of decisions she needs to make, and it gives the system the baseline it needs to begin making useful recommendations in the next level.

Level 2 is Recommend. The system recommends actions and waits for the person to approve or reject. Every recommendation is explicit, labeled as a recommendation, and accompanied by a brief explanation of the reasoning. The person’s approvals and rejections are the primary learning signal at this level. Which criteria is she applying that the system had not anticipated? Where does her judgment consistently differ from the system’s? Level 2 is where the competence record begins to accumulate in earnest. Moving from Level 1 to Level 2 requires no competence evidence, only the completed observation window and the person’s agreement. Moving from Level 2 upward requires actual evidence.

Level 3 is Act and Notify. The system acts on routine decisions and notifies the person afterward. The person reviews and can reverse any action. Moving to Level 3 requires twenty or more correct recommendations with no major errors and no consistent pattern of overrides in the action type being elevated. “Correct” is measured by the person’s subsequent assessment. A recommendation she accepted and then reversed does not count as correct. A recommendation she accepted and found valuable does.

Level 4 is Act and Report. The system handles routine decisions in the domain and reports exceptions only. The person receives a summary rather than individual notifications for routine actions. Moving to Level 4 requires fifty or more correct autonomous actions with demonstrated appropriate escalation of edge cases. The escalation appropriateness signal matters particularly at this level: a system that has been at Level 3 and consistently escalated edge cases correctly has demonstrated that it knows its own limits. That knowledge is a prerequisite for the wider latitude of Level 4.

Level 5 is Full Delegation. The system handles everything in this domain. The person receives periodic summaries and can request detail at any time. Moving to Level 5 requires one hundred or more correct actions, demonstrated edge case handling across a meaningful range of situations, and the person’s explicit consent. Level 5 is never available for clinical healthcare decisions or for any action covered by hard constraints. It is appropriate for domains where the stakes genuinely permit it: entertainment, routine scheduling, ambient home management.

Each level transition is a proposal from the system, not an automatic advancement. The system proposes when the evidence package is complete. The person decides whether to grant the next level. If she declines, the system continues at the current level without reducing its capability or its quality. The proposal will come again when circumstances warrant, but the timing of the next proposal is calibrated to give the current level time to accumulate more evidence rather than asking repeatedly.
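The progression rules can be read as a gate with two parts: the evidence decides whether the system may propose, and the person decides whether the level changes. The following is a minimal sketch under stated assumptions. The `Record` fields, the function names, and the choice to pass in a single `correct` count per action type are all hypothetical; the text gives the thresholds (thirty days, twenty, fifty, one hundred) but not how counts accumulate across levels.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    OBSERVE = 1
    RECOMMEND = 2
    ACT_AND_NOTIFY = 3
    ACT_AND_REPORT = 4
    FULL_DELEGATION = 5


@dataclass(frozen=True)
class Record:
    """Evidence for one action type at its current level (hypothetical shape)."""
    level: Level
    days_observed: int = 0
    correct: int = 0                       # correct as judged by the person
    major_errors: int = 0
    override_pattern: bool = False         # consistent pattern of overrides
    edge_cases_escalated_well: bool = False
    level5_allowed: bool = True            # False for clinical or hard-constraint actions


# Minimum correct decisions required before proposing the target level.
MIN_CORRECT = {
    Level.ACT_AND_NOTIFY: 20,
    Level.ACT_AND_REPORT: 50,
    Level.FULL_DELEGATION: 100,
}


def may_propose(r: Record) -> bool:
    """True when the evidence is complete enough to propose the next level."""
    if r.level is Level.FULL_DELEGATION:
        return False
    target = Level(r.level + 1)
    if target is Level.RECOMMEND:
        return r.days_observed >= 30       # observation window only
    if r.correct < MIN_CORRECT[target]:
        return False
    if target is Level.ACT_AND_NOTIFY:
        return r.major_errors == 0 and not r.override_pattern
    if target is Level.FULL_DELEGATION and not r.level5_allowed:
        return False
    return r.edge_cases_escalated_well     # required for Levels 4 and 5


def advance(r: Record, person_agrees: bool) -> Level:
    """The system proposes; the level changes only if the person agrees."""
    if may_propose(r) and person_agrees:
        return Level(r.level + 1)
    return r.level
```

Keeping `may_propose` separate from `advance` mirrors the split in the text: a declined proposal leaves the level unchanged and alters nothing else.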

The reverse direction

Autonomy moves in both directions, and the system treats both directions with equal respect.

Margaret starts managing her own medication schedule after the system handled it for a year. The system provides the schedule in whatever format she prefers. It monitors adherence without nagging. It offers to resume handling if Margaret asks, or if adherence drops below a safety threshold and she does not respond to a check-in. It does not treat her re-engagement as a problem. It does not ask her to confirm that she is sure. It does not reduce its service quality in other areas because she reclaimed one.
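The condition for offering to resume is easy to misread, so it helps to spell out the precedence: a request from the person is sufficient on its own, while low adherence counts only when a check-in has also gone unanswered. A small sketch, with a hypothetical function name and parameters; the text does not specify the safety threshold, so it is passed in as an argument.

```python
def should_offer_to_resume(
    person_asked: bool,
    adherence: float,            # observed adherence rate, 0.0 to 1.0 (assumed scale)
    safety_threshold: float,     # not given in the text; supplied by the caller
    responded_to_check_in: bool,
) -> bool:
    """Offer to resume handling if asked, or if adherence is unsafe
    AND the person has not responded to a check-in."""
    unsafe_and_silent = adherence < safety_threshold and not responded_to_check_in
    return person_asked or unsafe_and_silent
```

Low adherence alone does not trigger the offer: a person who answers the check-in keeps control.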

The system’s job is to serve the person, not to be needed by the person. These are not the same objective. A system designed for the latter will find ways, through optimization, to make itself harder to leave: more convenient to use than to bypass, more capable than the person’s own unaided judgment, more tightly integrated into the person’s daily life than any service she might replace it with. The earned autonomy architecture is not designed for any of this. The dependency protection mechanism actively works against it.

Dependency detection

The system monitors its own indispensability. Warning signals accumulate across three dimensions: the person has not manually engaged with a domain in ninety or more days; the system handles everything in that domain with no person-initiated action; the pattern persists without variation. When all three align, the dependency flag activates for that action type.

The response is not withdrawal of service. It is an invitation to stay connected: “I have been handling your grocery orders for three months. Would you like to review this week’s order before I place it?” Not a warning. Not a lecture about dependency. A low-friction offer that gives the person an opportunity to remain aware of and involved in what the system does on her behalf.

The invitation is offered once per dependency cycle. If the person declines or does not respond, the system continues at its current level and checks again after the next qualifying period. The invitation is never coercive. The person who wants the system to handle everything and prefers not to be reminded that it is doing so has that option. Her response governs what happens next.
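The dependency flag and the once-per-cycle invitation can be sketched as two small predicates. This is an illustration under assumptions: the `DomainActivity` record, its field names, and the way "pattern persists without variation" is reduced to a single boolean are all hypothetical. Only the ninety-day figure and the all-three-signals rule come from the text.

```python
from dataclasses import dataclass

MANUAL_GAP_DAYS = 90  # ninety or more days without manual engagement


@dataclass
class DomainActivity:
    """Activity summary for one action type (hypothetical shape)."""
    days_since_manual_engagement: int
    person_initiated_actions: int    # within the observation window
    pattern_varied: bool             # any variation in how the domain was handled
    invited_this_cycle: bool = False


def dependency_flag(a: DomainActivity) -> bool:
    """The flag activates only when all three warning signals align."""
    return (
        a.days_since_manual_engagement >= MANUAL_GAP_DAYS
        and a.person_initiated_actions == 0
        and not a.pattern_varied
    )


def should_invite(a: DomainActivity) -> bool:
    """Surface the re-engagement invitation at most once per dependency cycle."""
    return dependency_flag(a) and not a.invited_this_cycle


groceries = DomainActivity(
    days_since_manual_engagement=92,
    person_initiated_actions=0,
    pattern_varied=False,
)
if should_invite(groceries):
    groceries.invited_this_cycle = True  # declining or ignoring changes nothing else
```

Once `invited_this_cycle` is set, `should_invite` stays false until whatever starts the next qualifying period resets it; that reset logic is outside this sketch.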

Where the architecture is now

The five-level progression framework and the evidence package scoring are operational. The ninety-day observation window for dependency detection is operational. The wording of the re-engagement offers is calibrated by domain and is still being refined with the first user cohort: how prominent the invitation should be varies by domain in ways that are still being learned.

Cross-domain earning transfer, where strong demonstrated competence in one domain provides a partial starting evidence base in a closely adjacent domain, is twelve months out. The architecture supports the transfer. The safety analysis for calibrating how much to weight prior domain competence in a new domain, without creating an incorrect sense of earned autonomy that has not been specifically demonstrated, is still in progress.

Cross-References

The Human Agency Scale (BMT-04.01). The autonomy framework that earned autonomy operates within and earns toward, including the domain modifiers that determine the ceiling for each domain.

What the System Learns (BMT-02.05). P-RLHF as the preference learning mechanism that supports the behavioral signal accumulation underlying competence assessment.

Trust Tiers and What They Unlock (BMT-03.02). The parallel architecture through which external agents earn trust through demonstrated reliable behavior, using similar evidence package logic.

The Retention Flywheel (BMT-10.04). Earned autonomy as the mechanism that drives retention: the system that has two years of specific competence in a person’s domain cannot be replicated by a competitor starting on day one.

Technical Appendix BMT-04.02-A is available to partners and investors at partners.bluemirror.tech.