Bounded Exploration

Soo-Jin leads enterprise architecture for a large pharmacy benefits manager. She is comfortable with API contracts and data schemas, but what she wanted to understand about BlueMirror was different: not what fields would be in the response, but what the system would refuse to tell her systems even if they asked politely. Most architecture documentation describes what a system does. She wanted to know what it would not do and whether those limits were real or merely stated.

The answer is the exploration bounds framework. It is the mechanism that makes the membrane operational rather than conceptual. The membrane defines the principle: external agents see only what the person permits. The exploration bounds define the implementation: for this agent, at this trust tier, in this domain, for this interaction type, here is exactly what can happen and here is exactly what cannot.

The five dimensions of constraint

Every agent-to-agent interaction operates within five constraint dimensions simultaneously. They are not independent. They interact, and their combined effect is what makes the bounds meaningful.
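One way to picture the framework is as a lookup: a key made of agent, trust tier, domain, and interaction type resolves to a single record holding all five dimensions. The sketch below is illustrative only. `BoundsKey`, `ExplorationBounds`, `REGISTRY`, the field names, and the sample values (including the hospital scheduler's tier) are assumptions made for this example, not BlueMirror's actual schema.

```python
from dataclasses import dataclass
from typing import NamedTuple


class BoundsKey(NamedTuple):
    """Hypothetical lookup key: who is asking, at what tier, about what."""
    agent_id: str
    trust_tier: str
    domain: str
    interaction_type: str


@dataclass(frozen=True)
class ExplorationBounds:
    """Hypothetical record carrying all five dimensions together."""
    context_permissions: frozenset = frozenset()   # fields that may be disclosed
    commitment_authority: frozenset = frozenset()  # actions allowed without escalation
    max_financial_exposure: float = 0.0            # risk envelope ceiling
    timeout_seconds: int = 0                       # temporal bound
    invariants: tuple = ()                         # names of hard constraints


# Default when no entry exists: nothing disclosed, nothing committed.
CLOSED = ExplorationBounds()

# Sample entry; the tier and values are placeholders for illustration.
REGISTRY = {
    BoundsKey("hospital-scheduler", "TIER_3C", "healthcare", "routine_scheduling"):
        ExplorationBounds(
            context_permissions=frozenset(
                {"time_of_day_preference", "accessibility_accommodation", "preferred_days"}
            ),
            commitment_authority=frozenset({"book_appointment"}),
            max_financial_exposure=0.0,
            timeout_seconds=30,
            invariants=("notify_daughter", "24h_cancellation"),
        ),
}


def bounds_for(key: BoundsKey) -> ExplorationBounds:
    """Resolve bounds for an interaction, denying by default."""
    return REGISTRY.get(key, CLOSED)
```

The default-deny lookup is the design choice worth noting: in this sketch, an agent, tier, or interaction type that has no explicit entry gets the closed bounds rather than a permissive fallback.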

Context permissions define what the external agent can learn from the interaction, both explicitly through direct response and implicitly through the pattern of what is disclosed. A pharmacy agent requesting a prescription refill may explicitly receive the medication name, dosage, and delivery preference. Implicitly, it learns that the person uses that medication. Context permissions cover both layers. The Gate Controller does not just filter what is said. It tracks what can be inferred.

Commitment authority defines what the internal agent can agree to on the person’s behalf without triggering an escalation for review. At TIER_3C, a buying agent might have authority to commit to routine grocery deliveries under a defined financial threshold and to switch pharmacies if savings exceed a defined threshold. It does not have authority to commit to a new insurance plan or to authorize a medical procedure. Commitment authority is set per trust tier and per interaction type, and the external agent cannot extend it through persuasion or incremental negotiation.
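A minimal sketch of that check follows, assuming authority is stored as a read-only table keyed by trust tier and interaction type. The `decide` function, the interaction type names, and the 150.00 ceiling are hypothetical; the source says only that the threshold is "defined".

```python
from enum import Enum
from types import MappingProxyType


class Decision(Enum):
    COMMIT = "commit"
    ESCALATE = "escalate"


# Read-only view: nothing received during a negotiation can add entries
# or raise a ceiling. The entry and its amount are placeholders.
_AUTHORITY = MappingProxyType({
    ("TIER_3C", "grocery_delivery"): 150.00,
})


def decide(trust_tier: str, interaction_type: str, amount: float) -> Decision:
    """Commit only when an entry exists and the amount is under its ceiling."""
    ceiling = _AUTHORITY.get((trust_tier, interaction_type))
    if ceiling is None or amount >= ceiling:
        return Decision.ESCALATE
    return Decision.COMMIT
```

Because the function takes no argument that originates with the external agent other than the amount itself, there is no path in this sketch by which persuasion could widen the table.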

Risk envelope bounds the maximum downside exposure regardless of what the agent negotiates. Even if the commitment authority permits a transaction, the risk envelope sets the ceiling. A single interaction cannot expose the person to more financial commitment than the envelope permits, and a single interaction cannot disclose more sensitive information than the envelope allows, even if the context permissions technically permit individual pieces that in combination exceed the threshold.
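The combination effect can be sketched as a running total kept per interaction. In the example below, each disclosure carries a sensitivity weight and each commitment an amount; both are checked against the envelope's ceilings before being allowed. The `RiskEnvelope` class, the idea of integer sensitivity weights, and the numbers used are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class RiskEnvelope:
    """Hypothetical per-interaction ceiling on money and sensitive disclosure."""
    max_financial: float
    max_sensitivity: int
    financial_used: float = 0.0
    sensitivity_used: int = 0

    def try_commit(self, amount: float) -> bool:
        """Allow a commitment only if the running total stays within the ceiling."""
        if self.financial_used + amount > self.max_financial:
            return False
        self.financial_used += amount
        return True

    def try_disclose(self, sensitivity: int) -> bool:
        """Allow a disclosure only if cumulative sensitivity stays within the ceiling."""
        if self.sensitivity_used + sensitivity > self.max_sensitivity:
            return False
        self.sensitivity_used += sensitivity
        return True
```

With a sensitivity ceiling of 5 and three fields weighted 2 each, the first two disclosures pass and the third is refused, even though each field is individually permitted.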

Temporal bounds answer the question of how long the interaction can continue before the sandbox closes with no agreement. The default timeout varies by interaction type: 30 seconds for routine scheduling, five minutes for procurement negotiation, 24 hours for complex care coordination involving multiple parties. An adversarial agent that stalls a negotiation does not gain time to probe or escalate. The sandbox closes. The interaction is logged. The person is notified if appropriate.
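The timeout behavior can be sketched with a deadline fixed at the moment the sandbox opens. The three durations come from the defaults above; the `Sandbox` class, the interaction type keys, and the return strings are hypothetical names chosen for this example.

```python
import time

# Default timeouts from the text, in seconds.
TIMEOUT_SECONDS = {
    "routine_scheduling": 30,
    "procurement_negotiation": 5 * 60,
    "complex_care_coordination": 24 * 60 * 60,
}


class Sandbox:
    """Hypothetical sandbox whose deadline is set once and never extended."""

    def __init__(self, interaction_type, clock=time.monotonic):
        self._clock = clock
        self._deadline = clock() + TIMEOUT_SECONDS[interaction_type]

    def is_open(self):
        return self._clock() < self._deadline

    def handle(self, message):
        """Process a message, or report closure once the deadline has passed."""
        if not self.is_open():
            return "closed_no_agreement"
        return "processed"
```

The deadline is computed once in the constructor and nothing in `handle` moves it, so in this sketch a stalling agent's messages cannot buy more time. The clock is injectable so the behavior can be checked without waiting.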

Invariants are hard constraints that the internal agent cannot agree away regardless of what the negotiation produces. Margaret’s invariants might include that her daughter must be notified of any healthcare commitment, that she retains a 24-hour cancellation option on any scheduled appointment, and that no financial commitment can be made while a lower-cost alternative has not been reviewed. These are not defaults. They are preserved regardless of what the external agent proposes or what the internal agent might otherwise accept.
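One way to express invariants is as predicates that every proposed agreement must satisfy before it can be accepted. The sketch below encodes two of Margaret's example invariants. The `Proposal` fields and the `violated` helper are assumptions for illustration, not a description of the actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """Hypothetical shape of an agreement the negotiation has produced."""
    domain: str
    notifies_daughter: bool
    cancellation_window_hours: int


# Each invariant is a named predicate; all must hold for acceptance.
INVARIANTS = {
    "daughter notified of healthcare commitments":
        lambda p: p.domain != "healthcare" or p.notifies_daughter,
    "24-hour cancellation option retained":
        lambda p: p.cancellation_window_hours >= 24,
}


def violated(proposal: Proposal) -> list:
    """Return the names of every invariant the proposal breaks."""
    return [name for name, holds in INVARIANTS.items() if not holds(proposal)]
```

A proposal is acceptable in this sketch only when `violated` returns an empty list. The check runs on the final terms, so it applies no matter how the negotiation arrived at them.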

Healthcare scheduling: how the dimensions work together

A hospital scheduling example shows how the five dimensions work together rather than independently.

Margaret’s internal context, available to her health concierge agent, includes that she is 78 years old, has mobility limitations that make certain routes difficult, prefers morning appointments because her energy is highest before noon, knows that her daughter can drive only on Tuesdays and Thursdays, has recent fall-risk documentation in her care record, and missed her last appointment because weather anxiety made a January morning appointment non-viable.

The hospital scheduling agent needs to find an appointment time and confirm accessibility requirements. The exploration bounds for this interaction type, at the scheduling agent’s trust tier, permit the health concierge to disclose general scheduling constraints: morning works well, an accessibility accommodation is needed, Tuesday or Thursday preferred. The bounds block the specific diagnoses that explain why morning, the fall-risk documentation that explains the accessibility need, the daughter’s schedule (which is her own information, not Margaret’s to share without consent), and the weather anxiety history that explains the missed appointment.
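The explicit layer of this filtering can be sketched as an allowlist applied to Margaret's internal context. The field names and the `disclose` function are hypothetical, and the values are taken from the example or left as placeholders. The sketch covers only explicit disclosure; tracking what can be inferred is a separate concern.

```python
# Margaret's internal context, using only details stated in the example.
INTERNAL_CONTEXT = {
    "time_of_day_preference": "morning",
    "accessibility_accommodation": True,
    "preferred_days": ["Tuesday", "Thursday"],
    "diagnoses": "(held internally)",
    "fall_risk_documentation": True,
    "daughter_driving_days": ["Tuesday", "Thursday"],
    "missed_appointment_reason": "weather anxiety",
}

# Fields the scheduling bounds permit; everything else is blocked by omission.
SCHEDULING_PERMISSIONS = frozenset({
    "time_of_day_preference",
    "accessibility_accommodation",
    "preferred_days",
})


def disclose(context, permissions, requested):
    """Return only the requested fields that the bounds permit."""
    return {
        field: context[field]
        for field in requested
        if field in permissions and field in context
    }
```

Even if the scheduling agent requests every field, `disclose(INTERNAL_CONTEXT, SCHEDULING_PERMISSIONS, INTERNAL_CONTEXT.keys())` returns only the three scheduling constraints. Blocked fields are absent from the response rather than marked as withheld, so the response does not reveal which fields exist.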

The interaction proceeds. The hospital scheduling agent offers Thursday at 9am with accessibility accommodation. The health concierge accepts. The hospital scheduling agent’s record of the interaction shows: “Patient prefers morning. Accessibility accommodation confirmed. Thursday 9am agreed.” It does not show why any of those things are true. The commitment is made. The appointment is scheduled. The hospital got what it needed. The bounds held.

Pharmacy procurement: the financial dimension

The buying agent holds Margaret’s full context: she takes metformin 500mg twice daily, her current pharmacy charges $47 per month for the generic, a competing pharmacy listed on GoodRx offers the same generic for $12, and Margaret qualifies for a patient assistance program based on her income level.

The exploration bounds for a pharmacy procurement interaction permit the buying agent to disclose the medication name and dosage, the delivery format preference, and the pharmacy preference if the person has one. They block the income level (financial context, not health context), the other medications on her list (disclosing them would build a health profile from a purchasing interaction), and the insurance details, which travel through the insurance navigator at a higher trust tier with its own context permissions.

Commitment authority permits switching pharmacies if monthly savings exceed a defined threshold. The risk envelope caps the financial exposure of the switch itself. The invariants require that Margaret be notified before any pharmacy change takes effect, even if the commitment authority technically permits the switch without her direct involvement. The pharmacy agent receives: medication name, dosage, generic preferred, delivery. The reason for the cost sensitivity, the insurance situation, and the income data that makes the patient assistance program relevant do not transfer. The switch happens if the terms meet the bounds. The reason behind the terms does not.
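The order of those checks can be sketched as a small planning function. The $47 and $12 prices come from the example; the savings threshold, switch cost, exposure cap, and all function and action names are hypothetical values chosen for illustration.

```python
def plan_switch(
    current_monthly: float,
    offered_monthly: float,
    savings_threshold: float,
    switch_cost: float,
    exposure_cap: float,
) -> list:
    """Return the ordered actions for a proposed pharmacy switch."""
    savings = current_monthly - offered_monthly
    # Commitment authority: savings must exceed the defined threshold.
    if savings <= savings_threshold:
        return []
    # Risk envelope: the switch itself must not exceed the exposure cap.
    if switch_cost > exposure_cap:
        return ["escalate_for_review"]
    # Invariant: notification always precedes the change taking effect.
    return ["notify_person", "switch_pharmacy"]
```

With the example's prices, an assumed threshold of $20, and no switching cost, `plan_switch(47.0, 12.0, 20.0, 0.0, 25.0)` yields the notification step followed by the switch. The function takes prices and limits only; the income level and insurance details that explain the cost sensitivity are not inputs, so they have no route into the exchange.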

Insurance: why some exploration bounds are deliberately tight

Medicare’s annual enrollment period is a known high-pressure environment where insurance agents have strong incentives to push unnecessary plan changes. Margaret’s financial concierge agent receives an annual review probe from an insurance plan agent framed as a routine check to see if Margaret’s current plan still meets her needs.

The exploration bounds for this interaction type are deliberately tight. Context permissions allow only: current plan identifier and any specific coverage concerns Margaret has chosen to raise. Commitment authority: none. The insurance agent cannot advance to a sales interaction through the membrane without Margaret’s explicit participation. The internal agent can receive information and pass it to Margaret for review. It cannot make decisions.
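A sketch of what "commitment authority: none" means in practice: the handler below can disclose the two permitted fields and queue a proposal for Margaret, but it has no code path that accepts anything. The field names, the `respond` function, and the sample values are assumptions made for this example.

```python
# The only fields the insurance review bounds permit.
PERMITTED_FIELDS = frozenset({"current_plan_id", "raised_coverage_concerns"})


def respond(context: dict, requested_fields: list, proposal=None) -> dict:
    """Answer an insurance agent's probe without ever accepting a proposal."""
    disclosed = {
        field: context[field]
        for field in requested_fields
        if field in PERMITTED_FIELDS and field in context
    }
    # Any proposal is passed to the person; the agent itself decides nothing.
    for_review = [proposal] if proposal is not None else []
    return {"disclosed": disclosed, "accepted": [], "for_person_review": for_review}
```

The `accepted` list is always empty by construction. A proposal can only move forward through the review queue, which corresponds to Margaret's explicit participation.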

This is not a limitation imposed on the insurance industry as a policy choice. It is the architectural recognition that insurance agent interactions during enrollment periods have a documented pattern of optimizing against the person’s interest. The bounds reflect that pattern. A legitimate insurance agent that wants to serve Margaret’s actual needs can do so. It simply requires Margaret’s active involvement in the decision, which is precisely the appropriate structure for a decision with multi-year financial consequences.

The implicit leakage problem

The hardest problem in exploration bounds is implicit leakage.

Context permissions gate explicit disclosure. They do not, by themselves, prevent an agent from learning more than any single permitted response reveals. An agent that asks whether morning appointments work is permitted to receive that information. An agent that observes over twelve interactions that morning appointments are always preferred, that deliveries are refused after 5pm, and that certain medication names appear in every refill request has reconstructed a meaningful profile from individually permitted disclosures.

The Manipulation Detector tracks cumulative information release for each external agent. It maintains a running account of what the agent could infer from the totality of what it has received across all interactions. When the cumulative inference score crosses a threshold, the system begins randomizing responses in the affected dimensions: the exact time preference becomes a range rather than a specific time, the delivery window broadens, the medication refill pattern becomes less precise. The agent’s profile of Margaret degrades as noise is introduced. Individual responses remain accurate enough to enable the interaction. The pattern becomes too noisy to be useful for profiling.
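The mechanism can be sketched as a per-agent running score with a threshold that switches responses from exact to coarsened. Everything here is an assumption for illustration: the `LeakageTracker` class, the additive scoring, the threshold value, and the choice of a two-hour window. The source does not specify how the inference score is computed.

```python
import random
from collections import defaultdict


class LeakageTracker:
    """Hypothetical tracker of cumulative disclosure per external agent."""

    def __init__(self, threshold: float, rng=None):
        self.threshold = threshold
        self.scores = defaultdict(float)
        self.rng = rng or random.Random()

    def record(self, agent_id: str, weight: float) -> None:
        """Add the inference weight of one disclosure to the agent's total."""
        self.scores[agent_id] += weight

    def over_threshold(self, agent_id: str) -> bool:
        return self.scores[agent_id] >= self.threshold

    def time_preference(self, agent_id: str, preferred_hour: int) -> tuple:
        """Return (start, end) hours: exact below threshold, a range above it."""
        if not self.over_threshold(agent_id):
            return (preferred_hour, preferred_hour)
        # Shift the window start randomly so repeated answers do not
        # converge on the true hour, while still containing it.
        start = preferred_hour - self.rng.randint(0, 2)
        return (start, start + 2)
```

The coarsened window always contains the true preferred hour, so an appointment offered inside it still works for the person. What the agent loses is the ability to pin down the exact preference from repeated answers.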

Margaret’s appointments still get scheduled. Her medications still arrive. The implicit extraction attempt fails without her needing to manage it.

Soo-Jin read through the exploration bounds specification twice, then called her counterpart at BlueMirror’s partner integration team. Her first question was whether the commitment authority thresholds were configurable per partner type. They are. Her second question was whether the implicit leakage detection could be audited after the fact. It can, through the audit trail. Her third question was whether she could see the evidence for a specific agent’s cumulative inference score. She could, if the agent was her own system. She could not see that data for another partner’s agent. Which, she noted, was exactly the right answer.

Cross-References

The Membrane (BMT-03.01). The membrane model that exploration bounds operationalize.

Trust Tiers and What They Unlock (BMT-03.02). Trust tiers that determine default bound settings.

The Buying Agent (BMT-01.03). The concierge agent that uses exploration bounds most actively in commercial negotiations.

Domain-Tiered Privacy (BMT-04.03). Privacy tiers that inform default bound settings across domains.

Technical Appendix BMT-03.03-A is available to partners and investors at partners.bluemirror.tech.
