The Three-Zone Architecture

Carmen Delgado manages technology procurement for a network of twelve Area Agencies on Aging across the Southwest. She has evaluated nine AI-for-seniors platforms in the past two years. Every one of them assumed the subscriber owned a specific device: a tablet, a smart speaker, a dedicated terminal. Every one of them failed the same test. She asked what happens when the seventy-eight-year-old widow on $1,847 a month does not own the device, does not want the device, or cannot operate the device. The answer was always some version of “she needs the device.”

BlueMirror’s answer was different. The architecture does not require any specific hardware in the subscriber’s home. It serves the subscriber who has a purpose-built edge device, the subscriber who has a smartphone, and the subscriber who has neither. The product is the same product. The deployment path differs. What changes across paths is where inference runs, how privacy is enforced, and what happens when the internet goes down. What does not change is the concierge architecture, the thirteen agents, the Memory of Context, or the depth of reasoning available to the subscriber.

Why not a GB10 in every home

The original architecture assumed a GB10 pair in every subscriber’s home. Two NVIDIA GB10 units, 256 gigabytes of unified memory, NVLink-C2C interconnect. Approximately $10,000 per pair at current pricing. For a population whose median monthly income is $1,847, a $10,000 device is not a consumer product. It is not even a realistic institutional subsidy for most channels. The GB10 pair is excellent compute. It is wrong compute for this deployment.

A purpose-built edge device at $150 to $300 is closer to reachable. Some institutional channels, particularly PACE programs (BMT-09.03), will fund it. Some subscribers or their adult children will purchase it. But it is not universal either. Some subscribers will own a smartphone capable of hosting Zone 1 inference. Some will not own a smartphone that supports it. Some will have neither a capable phone nor any interest in a dedicated device.

The architecture cannot assume hardware uniformity in a population where hardware ownership varies by income, geography, disability status, and personal preference. The three-zone compute model handles the variation.

Zone 1: the local processing tier

Zone 1 is a capability, not a product. It runs the privacy-critical Tiny LMs on whichever device can host them: the Safety Filter, Privacy Filter, Cognitive State Estimator, Emotion Detector, Orientation Assessor, Confusion Detector, Speech-to-Intent, and Voice Tone Analyzer. Approximately 850 million parameters, quantized to roughly 425 megabytes after AWQ 4-bit compression. These models handle the data that is most sensitive: cognitive state, emotional state, raw voice, raw sensor signals. When Zone 1 is present, this data is processed locally and never transmitted.
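The footprint quoted above follows from simple arithmetic. The sketch below is a back-of-envelope check, not a measurement: it counts only the packed 4-bit weights and ignores the per-group scales and zero points that AWQ stores alongside them, as well as any layers left unquantized, so a real on-disk size would be somewhat larger.

```python
# Back-of-envelope check of the Zone 1 weight footprint.
PARAMS = 850_000_000   # approximate total parameters across the eight Tiny LMs
BITS_PER_WEIGHT = 4    # AWQ 4-bit quantization

weight_bytes = PARAMS * BITS_PER_WEIGHT // 8
weight_megabytes = weight_bytes / 1_000_000  # decimal megabytes

print(f"{weight_megabytes:.0f} MB")  # 425 MB
```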

Three variants exist.

Zone 1-Dedicated is a purpose-built edge device in the subscriber’s home. Hardware class: Qualcomm QCS8550-class SoC or equivalent edge AI platform, 8 to 16 gigabytes of memory, NPU accelerator. Target price: $150 to $300 at volume. The device is always on, always local, and functions as more than an inference engine. It is the sensor hub (BLE receiver for health wearables, Zigbee/Z-Wave/Thread for home sensors), the display output hub (HDMI or wireless to a tablet, smart display, or TV), and the wearable data receiver. It runs Zone 1 inference continuously, processes safety monitoring in the background, and delivers medication reminders and safety alerts even during network outages.

Zone 1-Phone is the subscriber’s existing smartphone running the BlueMirror app with Tiny LMs downloaded locally. The same eight models, the same inference, the same privacy boundary. The phone must meet minimum hardware requirements: sufficient RAM to hold the quantized model portfolio alongside the operating system, and an NPU or GPU capable of running the inference at acceptable latency. Not every phone qualifies. The phones that do qualify provide architectural privacy enforcement identical to Zone 1-Dedicated for the data those models process. What Zone 1-Phone lacks is the always-on sensor hub, the dedicated display output, and the resilience of a device that has no other job. The phone’s battery, connectivity, and competing workload all constrain Zone 1-Phone in ways that do not apply to Zone 1-Dedicated.

No Zone 1 describes the subscriber whose phone is too low-spec to host the Tiny LMs, who has no smartphone at all, or who accesses the platform through a basic web interface or interactive voice response (IVR) system. Privacy-critical inference runs upstream in Zone 2 or Zone 3. The privacy posture for this subscriber is contractually enforced through the healthcare data processing agreement rather than architecturally enforced through local processing. The subscriber still receives the same concierge architecture, the same agents, the same reasoning depth. The privacy mechanism differs.

The Zone 1 capability is the same across the two variants that have it. The Cognitive State Estimator running on a dedicated device and the same estimator running on a smartphone produce the same output from the same model weights. The resilience and sensor-hub functions differ because the hardware platforms differ.

Zone 2: the Community Pane

Zone 2 is a shared regional compute node. A GB10 pair supplemented by AMD 64-gigabyte mini PCs, deployed at a PACE facility, care agency office, health system data center, or edge co-location facility. A single Community Pane node serves 150 to 500 subscribers depending on concurrent usage patterns and the hardware configuration at the deployment site.

Zone 2 runs the heavy inference: Response Generator, Intent Classifier, Empathy Responder, all Domain Expert models, MoC Router, Escalation Classifier, Trust Evaluator, and the remaining Specialized Function models. It holds the full Memory of Context for each subscriber it serves, including all five layers, the P-RLHF individual preference model, and session history. For subscribers with Zone 1 (Dedicated or Phone), Zone 2 receives only privacy-filtered data from the subscriber’s local device. The Privacy Filter in Zone 1 validates every outbound transmission before it leaves. For subscribers without Zone 1, Zone 2 receives data directly from the subscriber’s client devices under her consent grants, and Zone 2 runs the privacy-critical models itself.

Zone 2 coverage depends on regional deployment. Not every region has a Community Pane at launch. Not every region will have one at maturity, though the deployment plan targets coverage for 80 to 90 percent of the subscriber base by month 36 (BMT-09.03). A subscriber in a region without Zone 2 coverage operates without it. Her queries route to Zone 3 for all inference that Zone 1 does not handle locally, or for all inference if she has no Zone 1.
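The fallback behavior described above can be sketched as a small routing function. This is a minimal illustration under stated assumptions, not the platform's router: the three workload categories, the `Coverage` type, and the `route` function are hypothetical names introduced here, and real routing would also depend on capacity and consent.

```python
from dataclasses import dataclass
from enum import Enum


class Zone(Enum):
    Z1 = "zone-1-local"
    Z2 = "zone-2-community-pane"
    Z3 = "zone-3-cloud"


class Workload(Enum):
    PRIVACY_CRITICAL = "privacy-critical"  # the Tiny LM portfolio
    HEAVY = "heavy"                        # response generation, domain experts
    DEEP_REASONING = "deep-reasoning"      # multi-domain or novel queries


@dataclass(frozen=True)
class Coverage:
    has_zone1: bool  # Zone 1-Dedicated or Zone 1-Phone
    has_zone2: bool  # a Community Pane serves the subscriber's region


def route(workload: Workload, coverage: Coverage) -> Zone:
    """Pick the zone that runs a workload, falling back upstream when a zone is absent."""
    if workload is Workload.DEEP_REASONING:
        return Zone.Z3  # Zone 3 is always present
    if workload is Workload.PRIVACY_CRITICAL and coverage.has_zone1:
        return Zone.Z1
    # Heavy inference, or privacy-critical inference for a subscriber without Zone 1.
    return Zone.Z2 if coverage.has_zone2 else Zone.Z3
```

For example, `route(Workload.HEAVY, Coverage(has_zone1=True, has_zone2=False))` returns `Zone.Z3`, which matches the case of a subscriber with a local device in a region without a Community Pane.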

Per-subscriber compute cost when Zone 2 is present: approximately $5 to $7 per month. The shared infrastructure amortizes the GB10 pair across the subscriber population the node serves, which is what makes the per-subscriber cost viable where a per-home GB10 pair was not.
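To see why sharing changes the picture, the sketch below divides the approximate GB10 pair price quoted earlier across the subscriber range a node serves. It is illustrative arithmetic only: it excludes the AMD mini PCs, hosting, power, and operations, and it is not how the $5 to $7 monthly figure was derived. The function name is hypothetical.

```python
GB10_PAIR_COST_USD = 10_000        # approximate pair price quoted earlier
SUBSCRIBERS_PER_NODE = (150, 500)  # range a single Community Pane node serves


def capital_per_subscriber(pair_cost_usd: float, subscribers: int) -> float:
    """One-time GB10 capital attributable to each subscriber on a shared node."""
    return pair_cost_usd / subscribers


for subscribers in SUBSCRIBERS_PER_NODE:
    share = capital_per_subscriber(GB10_PAIR_COST_USD, subscribers)
    print(f"{subscribers} subscribers: ${share:.2f} each")
# 150 subscribers: $66.67 each
# 500 subscribers: $20.00 each
```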

Zone 3: the cloud reasoning layer

Zone 3 is always present. Every subscriber has Zone 3. It performs deep multi-domain reasoning: queries that require simultaneous context from many domains, novel question types that the Zone 2 SLM portfolio cannot yet handle, and long-form analysis that exceeds Zone 2 capacity. Zone 3 is not coordination overhead. It is the reasoning ceiling of the system.

At launch (Phase 1), Zone 3 handles every query for every subscriber. Zone 1 has not deployed. Zone 2 has not deployed. The commercial API operating under a healthcare data processing agreement (BMT-07.01) fulfills Zone 3 inference for the entire subscriber base. This is the starting state.

As the architecture matures, Zone 3’s share of total inference decreases for subscribers who gain Zone 1 and Zone 2 coverage. In Phase 2 (months 12 to 18), Zone 1 deploys for subscribers who acquire a Local Pane (the Zone 1-Dedicated device) or whose smartphone supports the Tiny LMs. In Phase 3 (months 18 to 36), the full SLM portfolio runs across Zone 1 and Zone 2 where deployed. But Zone 3 never disappears. It continues to handle the deep reasoning that exceeds regional capacity and to serve subscribers whose deployment path includes only Zone 3.

Per-subscriber Zone 3 inference cost varies by path. For a subscriber with Zone 1 and Zone 2 coverage, Zone 3 handles approximately 10 to 15 percent of queries at maturity: the complex ones. Her Zone 3 cost is $2 to $5 per month. For a Zone 3-only subscriber whose entire workload runs on Zone 3 throughout all phases, the cost is $8 to $14 per month.
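The monthly ranges quoted in this article can be combined for the two cases where the text gives enough information. The sketch below assumes the Zone 2 and Zone 3 figures are additive and covers compute only (no device cost); both are assumptions made here, and the per-path unit economics in BMT-10.01 remain the reference.

```python
Range = tuple[float, float]  # (low, high) in USD per subscriber per month

ZONE2_MONTHLY: Range = (5.0, 7.0)     # Zone 2 compute when a Community Pane is present
ZONE3_WITH_Z1_Z2: Range = (2.0, 5.0)  # Zone 3 cost for a Zone 1 + Zone 2 subscriber
ZONE3_ONLY: Range = (8.0, 14.0)       # Zone 3 carrying the entire workload


def add_ranges(a: Range, b: Range) -> Range:
    """Add two (low, high) ranges endpoint by endpoint."""
    return (a[0] + b[0], a[1] + b[1])


full_coverage = add_ranges(ZONE2_MONTHLY, ZONE3_WITH_Z1_Z2)
print(full_coverage)  # (7.0, 12.0)
print(ZONE3_ONLY)     # (8.0, 14.0)
```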

The six deployment paths

Three Zone 1 variants multiplied by two Zone 2 states produce six deployment paths. Zone 3 is always present. Each path is a first-class deployment.

Path A: Z1-Dedicated + Z2 + Z3. The full-stack subscriber. A Local Pane device in her home, a Community Pane node in her region, and Zone 3 for deep reasoning. Maximum privacy (architectural enforcement at Zone 1, regional processing at Zone 2, cloud only for complex queries). Maximum offline resilience (the Local Pane continues operating during network outages). This is the path most PACE-enrolled subscribers will have, because PACE programs typically both fund the hardware and host the Community Pane node (BMT-09.03).

Path B: Z1-Dedicated + Z3. The subscriber bought or received a Local Pane device but lives outside a Zone 2 region. Privacy-critical inference runs locally. Everything else routes to Zone 3. Offline resilience for Zone 1 functions. No regional node available.

Path C: Z1-Phone + Z2 + Z3. The subscriber uses her smartphone for Zone 1 processing, and a Community Pane node exists in her region. Privacy-critical inference runs on her phone. Heavy inference runs at Zone 2. Deep reasoning at Zone 3. Privacy posture is architecturally enforced on a device she controls. Offline resilience is bounded by the phone’s battery and connectivity.

Path D: Z1-Phone + Z3. Smartphone Zone 1, no regional coverage. Privacy-critical inference runs on her phone. Everything else routes to Zone 3.

Path E: No Z1 + Z2 + Z3. The subscriber has neither a dedicated device nor a phone capable of hosting the Tiny LMs, but lives in a Zone 2 region. Zone 2 handles her full inference workload including the privacy-critical models. Zone 3 handles deep reasoning. Privacy posture is contractually enforced through the data processing agreement (DPA) governing Zone 2 operations.

Path F: No Z1 + Z3. The cloud reasoning layer serves the subscriber end-to-end. No local processing. No regional node. Every query routes to Zone 3. This is the subscriber who has no smartphone, no Local Pane device, and accesses the platform through a basic web interface, IVR, or text message. Her privacy posture is contractually enforced through the Zone 3 DPA.
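The six paths above are the Cartesian product of the Zone 1 variants and the Zone 2 states, which can be enumerated directly. The sketch below is illustrative; the variable names are hypothetical, and the enforcement label refers only to how the privacy-critical data categories are protected on each path.

```python
from itertools import product

ZONE1_VARIANTS = ("Z1-Dedicated", "Z1-Phone", "No Z1")
ZONE2_STATES = (True, False)  # Community Pane present in the region, or not

paths = {}
for label, (zone1, has_zone2) in zip("ABCDEF", product(ZONE1_VARIANTS, ZONE2_STATES)):
    zones = [zone1] + (["Z2"] if has_zone2 else []) + ["Z3"]  # Zone 3 is always present
    enforcement = "contractual" if zone1 == "No Z1" else "architectural"
    paths[label] = (" + ".join(zones), enforcement)

for label, (zones, enforcement) in paths.items():
    print(f"Path {label}: {zones} ({enforcement} privacy enforcement)")
```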

The architecture does not degrade product capability for the paths with less local hardware. Path F receives the same concierge architecture, the same thirteen agents, the same MoC, the same deep-reasoning ceiling as Path A. What differs is privacy posture and offline resilience, not product availability.

The privacy hierarchy across paths

Privacy posture is path-dependent. The architecture is specific about which subscriber gets which kind.

For Paths A and B, the most sensitive data (cognitive state, emotional state, raw voice, raw sensor signals) is processed in Zone 1 on the dedicated device and never transmitted upstream. The privacy posture is architecturally enforced by the physical fact that the data does not leave the device. For Paths C and D, the same data is processed on the subscriber’s phone. The enforcement mechanism is the same: local processing on hardware the subscriber controls.

For Paths E and F, the same data categories are processed by Zone 2 or Zone 3 under the healthcare DPA. The contractual protections are real: no retention beyond the inference request lifecycle, no use for the provider’s own model training, HIPAA technical safeguard compliance, audit rights, 72-hour breach notification. These are not nominal protections. But they are contractual protections, not architectural ones. The threat model is different. A breach of the DPA by the provider exposes this data. A breach of a Zone 1 device requires physical access to the subscriber’s home.

None of these is a degraded posture. They are different postures with different threat models, and the architecture describes them as such (BMT-04.07).

At Phase 3 maturity, the approximate processing distribution for a Path A subscriber is 15 to 20 percent on Zone 1, 55 to 60 percent on Zone 2, and the balance (roughly 20 to 30 percent) on Zone 3 for queries that exceed Zone 2 capacity. These are shares of processing rather than of query count. A Path F subscriber runs 100 percent on Zone 3. The other paths fall between.

The consent boundary

Every cross-zone transmission requires consent. The consent architecture (BMT-04.03, BMT-05.05) governs both directions, but the physical location of the consent boundary differs by path.

For subscribers with Zone 1 (Dedicated or Phone), the consent boundary is at the Zone 1 to upstream transition. The Privacy Filter validates outbound transmissions before they leave the device or phone. Raw cognitive data, raw emotional data, raw voice, and raw sensor signals do not cross this boundary. What crosses is processed output: “cognitive state: normal,” not the behavioral observations that produced the assessment.
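One way to picture the Zone 1 boundary is as an allow-list applied to every outbound payload. The sketch below is a simplification under assumed names: the field names, the `ALLOWED_OUTBOUND` set, and `filter_outbound` are hypothetical, and the actual Privacy Filter is a model rather than a key filter.

```python
# Hypothetical processed-output fields that may cross the Zone 1 boundary.
ALLOWED_OUTBOUND = frozenset({"cognitive_state", "emotional_state", "intent"})


def filter_outbound(payload: dict) -> dict:
    """Keep only allow-listed processed outputs; everything else stays on the device."""
    return {key: value for key, value in payload.items() if key in ALLOWED_OUTBOUND}


local_result = {
    "cognitive_state": "normal",               # processed output: may cross
    "raw_voice": b"\x00\x01",                  # raw audio: stays local
    "behavioral_observations": ["example"],    # inputs to the assessment: stay local
}
print(filter_outbound(local_result))  # {'cognitive_state': 'normal'}
```

An allow-list is used here rather than a deny-list so that a field nobody anticipated is dropped by default instead of transmitted.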

For subscribers without Zone 1, the consent boundary is at the client to upstream transition. The equivalent validation runs at the platform’s coordinator layer before data enters Zone 2 or Zone 3. The consent architecture governs what the subscriber has authorized for processing, what data categories require per-interaction consent, and what the system refuses to transmit regardless of consent (BMT-04.06).
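The three outcomes named above (authorized, per-interaction consent required, refused regardless of consent) can be sketched as a single decision function. Everything in this example is an assumption for illustration: the category names are placeholders, the policy sets are hypothetical, and treating an ungranted category as "ask" is a choice made here, not one taken from BMT-04.03 or BMT-04.06.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"        # per-interaction consent required
    REFUSE = "refuse"  # never transmitted, regardless of consent


# Placeholder category names, for illustration only.
NEVER_TRANSMIT = frozenset({"prohibited_example"})
PER_INTERACTION = frozenset({"sensitive_example"})


def consent_decision(category: str, standing_grants: set[str],
                     confirmed_now: bool = False) -> Decision:
    """Decide whether a data category may cross the consent boundary."""
    if category in NEVER_TRANSMIT:
        return Decision.REFUSE  # checked first so no grant can override it
    if category in PER_INTERACTION:
        return Decision.ALLOW if confirmed_now else Decision.ASK
    if category in standing_grants:
        return Decision.ALLOW
    return Decision.ASK  # ungranted category: ask rather than assume (assumption)
```

The same function could run on a Zone 1 device or at the coordinator layer, which mirrors the point that the consent semantics stay fixed while the enforcement location moves.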

The boundary is not a new concept introduced by the three-zone model. It is the same architectural element from Series 04, applied at different physical locations depending on the subscriber’s deployment path. The consent semantics are identical. The enforcement location moves.

Cross-References

BMT-06.03 Edge Intelligence. The canonical three-zone compute architecture that Series 09 operationalizes into deployment paths.

BMT-07.01 Where Your Data Lives. Data residency as the storage complement to edge compute, showing how data physically resides in the zone where it is processed.

BMT-02.03 The Thirty Models. The SLM portfolio whose distribution across zones defines what runs where for each deployment path.

BMT-04.07 Privacy as Architecture. The ethical framework that treats privacy as a structural property, applied here to path-dependent privacy postures.

BMT-10.01 The Unit Economics. Per-path unit economics that follow from the three-zone architecture’s cost structure.

Technical Appendix BMT-09.01-A is available to partners and investors at partners.bluemirror.tech.