StellaOps Architecture Overview (Sprint 19)

Ownership: Architecture Guild • Docs Guild
Audience: Service owners, platform engineers, solution architects
Related: High-Level Architecture, Concelier Architecture, Policy Engine Architecture, Aggregation-Only Contract

This dossier summarises the end-to-end runtime topology after the Aggregation-Only Contract (AOC) rollout. It highlights where raw facts live, how ingest services enforce guardrails, and how downstream components consume those facts to derive policy decisions and user-facing experiences.


1 · System landscape

graph TD
    subgraph Edge["Clients & Automation"]
        CLI[stella CLI]
        UI[Console SPA]
        APIClients[CI / API Clients]
    end
    Gateway["API Gateway<br/>(JWT + DPoP scopes)"]
    subgraph Scanner["Fact Collection"]
        ScannerWeb[Scanner.WebService]
        ScannerWorkers[Scanner.Workers]
        Agent[Agent Runtime]
    end
    subgraph Ingestion["Aggregation-Only Ingestion (AOC)"]
        Concelier[Concelier.WebService]
        Excititor[Excititor.WebService]
        RawStore[("MongoDB<br/>advisory_raw / vex_raw")]
    end
    subgraph Derivation["Policy & Overlay"]
        Policy[Policy Engine]
        Scheduler[Scheduler Services]
        Notify[Notifier]
    end
    subgraph Experience["UX & Export"]
        UIService[Console Backend]
        Exporters[Export / Offline Kit]
    end
    Observability[Telemetry Stack]

    CLI --> Gateway
    UI --> Gateway
    APIClients --> Gateway
    Gateway --> ScannerWeb
    ScannerWeb --> ScannerWorkers
    ScannerWorkers --> Concelier
    ScannerWorkers --> Excititor
    Concelier --> RawStore
    Excititor --> RawStore
    RawStore --> Policy
    Policy --> Scheduler
    Policy --> Notify
    Policy --> UIService
    Scheduler --> UIService
    UIService --> Exporters
    Exporters --> CLI
    Exporters --> Offline[Offline Kit]

    Observability -.-> ScannerWeb
    Observability -.-> Concelier
    Observability -.-> Excititor
    Observability -.-> Policy
    Observability -.-> Scheduler
    Observability -.-> Notify

Key boundaries:

  • AOC border. Everything inside the Ingestion subgraph writes only immutable raw facts plus link hints. Derived severity, consensus, and risk remain outside the border.
  • Policy-only derivation. Policy Engine materialises effective_finding_* collections and emits overlays; other services consume but never mutate them.
  • Tenant enforcement. Authority-issued DPoP scopes flow through Gateway to every service; every document in the raw stores and overlays carries a mandatory tenant field.
  • Hybrid reachability attestations. Scanner/Attestor always publish graph-level DSSE for reachability graphs; optional edge-bundle DSSEs capture high-risk/runtime/init edges. Policy/Signals consume both, with graph DSSE as the minimum bar and edge bundles used for quarantine/dispute flows.

2 · Aggregation-Only Contract focus

2.1 Responsibilities at the boundary

| Area | Services | Responsibilities under AOC | Forbidden under AOC |
|------|----------|----------------------------|---------------------|
| Ingestion (Concelier / Excititor) | StellaOps.Concelier.WebService, StellaOps.Excititor.WebService | Fetch upstream advisories/VEX, verify signatures, compute linksets, append immutable documents to advisory_raw / vex_raw, emit observability signals, expose raw read APIs. | Computing severity, consensus, suppressions, or policy hints; merging upstream sources into a single derived record; mutating existing documents. |
| Policy & Overlay | StellaOps.Policy.Engine, Scheduler | Join SBOM inventory with raw advisories/VEX, evaluate policies, issue effective_finding_* overlays, drive remediation workflows. | Writing to raw collections; bypassing guard scopes; running without recorded provenance. |
| Experience layers | Console, CLI, Exporters | Surface raw facts + policy overlays; run stella aoc verify; render AOC dashboards and reports. | Accepting ingestion payloads that lack provenance or violate guard results. |

2.2 Raw stores

| Collection | Purpose | Key fields | Notes |
|------------|---------|------------|-------|
| advisory_raw | Immutable vendor/ecosystem advisory documents. | _id, tenant, source.*, upstream.*, content.raw, linkset, supersedes. | Idempotent by (source.vendor, upstream.upstream_id, upstream.content_hash). |
| vex_raw | Immutable vendor VEX statements. | Mirrors advisory_raw; identifiers.statements summarises affected components. | Maintains supersedes chain identical to advisory flow. |
| Change streams (advisory_raw_stream, vex_raw_stream) | Feed Policy Engine and Scheduler. | operationType, documentKey, fullDocument, tenant, traceId. | Scope filtered per tenant before delivery. |
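
The idempotency key and supersedes chain described for advisory_raw can be illustrated with a small in-memory sketch. This is not the Concelier implementation: the RawDoc and RawStore names, the _id format, and the one-store-per-tenant layout are assumptions made for illustration only. Only the key triple (source.vendor, upstream.upstream_id, upstream.content_hash) and the append-only behaviour come from the table above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class RawDoc:
    """Hypothetical stand-in for an advisory_raw document (frozen = immutable)."""
    doc_id: str
    tenant: str
    vendor: str                       # source.vendor
    upstream_id: str                  # upstream.upstream_id
    content_hash: str                 # upstream.content_hash
    supersedes: Optional[str] = None  # _id of the revision this one replaces


class RawStore:
    """Append-only store for one tenant: documents are added, never changed."""

    def __init__(self, tenant: str) -> None:
        self.tenant = tenant
        self._docs: Dict[str, RawDoc] = {}
        self._by_key: Dict[Tuple[str, str, str], str] = {}
        self._head: Dict[Tuple[str, str], str] = {}  # newest revision per advisory

    def append(self, vendor: str, upstream_id: str, content_hash: str) -> RawDoc:
        key = (vendor, upstream_id, content_hash)
        if key in self._by_key:
            # Same content fetched again: return the stored document, write nothing.
            return self._docs[self._by_key[key]]
        # The _id format below is an assumption, not the real scheme.
        doc_id = f"{vendor}:{upstream_id}:{content_hash[:12]}"
        doc = RawDoc(doc_id, self.tenant, vendor, upstream_id, content_hash,
                     supersedes=self._head.get((vendor, upstream_id)))
        self._docs[doc_id] = doc
        self._by_key[key] = doc_id
        self._head[(vendor, upstream_id)] = doc_id
        return doc

    def history(self, vendor: str, upstream_id: str) -> List[RawDoc]:
        """Walk the supersedes chain from the newest revision to the oldest."""
        chain: List[RawDoc] = []
        cursor = self._head.get((vendor, upstream_id))
        while cursor is not None:
            doc = self._docs[cursor]
            chain.append(doc)
            cursor = doc.supersedes
        return chain
```

Because a new revision points at its predecessor instead of overwriting it, earlier revisions stay readable through history() without any document being mutated.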

2.3 Guarded ingestion sequence

sequenceDiagram
    participant Upstream as Upstream Source
    participant Connector as Concelier/Excititor Connector
    participant Guard as AOCWriteGuard
    participant Mongo as MongoDB (advisory_raw / vex_raw)
    participant Stream as Change Stream
    participant Policy as Policy Engine

    Upstream-->>Connector: CSAF / OSV / VEX document
    Connector->>Connector: Normalize transport, compute content_hash
    Connector->>Guard: Candidate raw doc (source + upstream + content + linkset)
    Guard-->>Connector: ERR_AOC_00x on violation
    Guard->>Mongo: Append immutable document (with tenant & supersedes)
    Mongo-->>Stream: Change event (tenant scoped)
    Stream->>Policy: Raw delta payload
    Policy->>Policy: Evaluate policies, compute effective findings
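
The guard step in the sequence above could look roughly like the following sketch. It is illustrative only: the field names in FORBIDDEN_FIELDS, the rule labels, and the check_candidate function are hypothetical, and the mapping from these rules to the real ERR_AOC_00x codes is not given in this dossier, so none is attempted here.

```python
from dataclasses import dataclass
from typing import List, Set

# Hypothetical names for derived data that must not appear in a raw document.
FORBIDDEN_FIELDS = {"severity", "consensus", "suppressions", "policy_hints"}
# Top-level sections the candidate document is expected to carry.
REQUIRED_FIELDS = {"tenant", "source", "upstream", "content", "linkset"}


@dataclass(frozen=True)
class Violation:
    rule: str    # illustrative label, not a real ERR_AOC_00x code
    detail: str


def check_candidate(doc: dict, existing_ids: Set[str]) -> List[Violation]:
    """Return every rule the candidate breaks; an empty list means it may be appended."""
    found: List[Violation] = []
    for name in sorted(REQUIRED_FIELDS - set(doc)):
        found.append(Violation("missing_field", name))
    for name in sorted(FORBIDDEN_FIELDS & set(doc)):
        found.append(Violation("forbidden_field", name))
    if "content_hash" not in doc.get("upstream", {}):
        found.append(Violation("missing_provenance", "upstream.content_hash"))
    if doc.get("_id") in existing_ids:
        # Append-only: an existing _id may never be written again.
        found.append(Violation("append_only", str(doc.get("_id"))))
    supersedes = doc.get("supersedes")
    if supersedes is not None and supersedes not in existing_ids:
        found.append(Violation("dangling_supersedes", str(supersedes)))
    return found
```

Collecting all violations rather than stopping at the first is a design choice of this sketch; it lets a connector log every problem with a payload in one pass.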

2.4 Authority scopes & tenancy

| Scope | Holder | Purpose | Notes |
|-------|--------|---------|-------|
| advisory:ingest / vex:ingest | Concelier / Excititor collectors | Append raw documents through ingestion endpoints. | Paired with tenant claims; requests without tenant are rejected. |
| advisory:read / vex:read | DevOps verify identity, CLI | Run stella aoc verify or call /aoc/verify. | Read-only; cannot mutate raw docs. |
| effective:write | Policy Engine | Materialise effective_finding_* overlays. | Only Policy Engine identity may hold; ingestion contexts receive ERR_AOC_006 if they attempt. |
| findings:read | Console, CLI, exports | Consume derived findings. | Enforced by Gateway and downstream services. |
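
A minimal sketch of how the scope rules in this table might be checked at a service boundary follows. The authorize function, the ScopeError type, and the tenant_required and scope_missing labels are hypothetical; treating "holds an ingest scope" as the marker of an ingestion context is also an assumption. Only the scope names and ERR_AOC_006 come from the table.

```python
from typing import AbstractSet, Optional

INGEST_SCOPES = frozenset({"advisory:ingest", "vex:ingest"})


class ScopeError(Exception):
    """Raised when a request must be rejected; `code` identifies the reason."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code


def authorize(scopes: AbstractSet[str], tenant: Optional[str], required: str) -> None:
    """Raise ScopeError unless the caller may perform the `required` action."""
    if not tenant:
        # Requests without a tenant claim are rejected outright.
        raise ScopeError("tenant_required", "request carries no tenant claim")
    if required == "effective:write" and scopes & INGEST_SCOPES:
        # Assumption: an identity holding ingest scopes is an ingestion context.
        raise ScopeError("ERR_AOC_006", "ingestion context may not write overlays")
    if required not in scopes:
        raise ScopeError("scope_missing", f"token lacks {required}")
```

For example, authorize({"advisory:ingest"}, "tenant-a", "advisory:ingest") returns without error, while the same identity asking for effective:write is rejected with ERR_AOC_006.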

3 · Data & control flow highlights

  1. Ingestion: Concelier / Excititor connectors fetch upstream documents, compute linksets, and hand payloads to AOCWriteGuard. Guards validate schema, provenance, forbidden fields, supersedes pointers, and append-only rules before writing to Mongo.
  2. Verification: stella aoc verify (CLI/CI) and /aoc/verify endpoints replay guard checks against stored documents, mapping ERR_AOC_00x codes to exit codes for automation.
  3. Policy evaluation: Mongo change streams deliver tenant-scoped raw deltas. Policy Engine joins SBOM inventory (via BOM Index), executes deterministic policies, writes overlays, and emits events to Scheduler/Notify.
  4. Experience surfaces: Console renders an AOC dashboard showing ingestion latency, guard violations, and supersedes depth. CLI exposes raw-document fetch helpers for auditing. Offline Kit bundles raw collections alongside guard configs to keep air-gapped installs verifiable.
  5. Observability: All services emit ingestion_write_total, aoc_violation_total{code}, ingestion_latency_seconds, and trace spans ingest.fetch, ingest.transform, ingest.write, aoc.guard. Logs correlate via traceId, tenant, source.vendor, and content_hash.
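
The verification step (replaying guard checks over stored documents and turning the result into an exit code) can be sketched as below. This is an assumption-laden illustration: the verify function, the toy_check rule, and the exit-code convention (0 for clean, 1 for any violation) are hypothetical, since the actual mapping from ERR_AOC_00x codes to exit codes is not stated here. The per-code tally mirrors the shape of the aoc_violation_total{code} metric.

```python
from collections import Counter
from typing import Callable, Iterable, List, Tuple


def verify(docs: Iterable[dict],
           check: Callable[[dict], List[str]]) -> Tuple[Counter, int]:
    """Replay `check` over stored documents and tally violations per code."""
    counts: Counter = Counter()
    for doc in docs:
        counts.update(check(doc))
    # Hypothetical convention: any violation at all fails the run.
    exit_code = 1 if counts else 0
    return counts, exit_code


def toy_check(doc: dict) -> List[str]:
    """Stand-in for the real guard: flags only a missing tenant."""
    return [] if doc.get("tenant") else ["missing_tenant"]


if __name__ == "__main__":
    stored = [{"tenant": "tenant-a"}, {}, {"tenant": ""}]
    counts, exit_code = verify(stored, toy_check)
    for code, total in sorted(counts.items()):
        print(f"{code}: {total}")
    raise SystemExit(exit_code)
```

Passing the check in as a callable keeps the replay loop independent of the guard rules, so the same loop could serve both the CLI and an HTTP endpoint.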

4 · Offline & disaster readiness

  • Offline Kit: Packages raw Mongo snapshots (advisory_raw, vex_raw) plus guard configuration and CLI verifier binaries so air-gapped sites can re-run AOC checks before promotion.
  • Recovery: Supersedes chains allow rollback to prior revisions without mutating documents. Disaster exercises must rehearse restoring from snapshot, replaying change streams into Policy Engine, and re-validating guard compliance.
  • Migration: Legacy normalised fields are moved to temporary views during cutover; the ingestion runtime stops writing them once the guard-enforced path is live (see Migration playbook).

5 · Replay CAS & deterministic bundles

  • Replay CAS: Content-addressed storage lives under cas://replay/<sha256-prefix>/<digest>.tar.zst. Writers must use StellaOps.Replay.Core helpers to ensure lexicographic file ordering, POSIX mode normalisation (0644/0755), LF newlines, and zstd level 19 compression. Bundle metadata (size, hash, created) feeds the platform-wide replay_bundles collection defined in docs/data/replay_schema.md.
  • Artifacts: Each recorded scan stores three artifacts:
    1. manifest.json (canonical JSON, hashed and signed via DSSE).
    2. inputbundle.tar.zst (feeds, policies, tools, environment snapshot).
    3. outputbundle.tar.zst (SBOM, findings, VEX, logs, Merkle proofs).
    Every artifact is signed with multi-profile keys (FIPS, GOST, SM, etc.) managed by Authority. See docs/replay/DETERMINISTIC_REPLAY.md §2–§5 for the full schema.
  • Reachability subtree: When reachability recording is enabled, Scanner uploads graphs & runtime traces under cas://replay/<scan-id>/reachability/graphs/ and cas://replay/<scan-id>/reachability/traces/. Manifest references (StellaOps.Replay.Core) bind these URIs along with analyzer hashes so Replay + Signals can rehydrate explainability evidence deterministically.
  • Storage tiers: Primary storage is Mongo (replay_runs, replay_subjects) plus the CAS bucket. Evidence Locker mirrors bundles for long-term retention and legal hold workflows (docs/modules/evidence-locker/architecture.md). Offline kits package bundles under offline/replay/<scan-id> with detached DSSE envelopes for air-gapped verification.
  • APIs & ownership: Scanner WebService produces the bundles via record mode, Scanner Worker emits Merkle metadata, Signer/Authority provide DSSE signatures, Attestor anchors manifests to Rekor, CLI/Evidence Locker handle retrieval, and Docs Guild maintains runbooks. Responsibilities are tracked in docs/implplan/SPRINT_185_shared_replay_primitives.md through SPRINT_187_evidence_locker_cli_integration.md.
  • Operational policies: Retention defaults to 180 days for hot CAS storage and 2 years for cold Evidence Locker copies. Rotation and pruning follow the checklist in docs/runbooks/replay_ops.md.
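
The determinism rules for replay bundles (lexicographic file ordering, 0644/0755 modes, LF newlines) can be sketched with the Python standard library. This is not the StellaOps.Replay.Core helper: build_bundle and cas_uri are hypothetical names, zstd compression is left out because it is not assumed to be available in the standard library, zeroed timestamps and owners are an extra assumption needed for byte-identical output, and the two-character sha256 prefix is a guess at the `<sha256-prefix>` segment.

```python
import hashlib
import io
import tarfile
from typing import Dict, FrozenSet


def build_bundle(files: Dict[str, bytes],
                 executable: FrozenSet[str] = frozenset()) -> bytes:
    """Build an uncompressed tar whose bytes depend only on names and contents."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):                         # lexicographic ordering
            data = files[name].replace(b"\r\n", b"\n")     # LF newlines (text payloads assumed)
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mode = 0o755 if name in executable else 0o644
            info.mtime = 0                                 # assumption: zeroed timestamp
            info.uid = info.gid = 0                        # assumption: zeroed ownership
            info.uname = info.gname = ""
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def cas_uri(blob: bytes, prefix_len: int = 2) -> str:
    """Derive a content address for the final artifact bytes (prefix length assumed)."""
    digest = hashlib.sha256(blob).hexdigest()
    return f"cas://replay/{digest[:prefix_len]}/{digest}.tar.zst"
```

Two calls with the same files in a different insertion order, or with CRLF instead of LF line endings, yield identical bytes and therefore the same content address. In a real pipeline the digest would presumably be taken over the compressed artifact rather than the raw tar.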

6 · References


7 · Compliance checklist

  • [ ] AOC guard enabled for all Concelier and Excititor write paths in production.
  • [ ] Mongo schema validators deployed for advisory_raw and vex_raw; change streams scoped per tenant.
  • [ ] Authority scopes (advisory:*, vex:*, effective:*) configured in Gateway and validated via integration tests.
  • [ ] stella aoc verify wired into CI/CD pipelines with seeded violation fixtures.
  • [ ] Console AOC dashboard and CLI documentation reference the new ingestion contract.
  • [ ] Offline Kit bundles include guard configs, verifier tooling, and documentation updates.
  • [ ] Observability dashboards include violation, latency, and supersedes depth metrics with alert thresholds.

Last updated: 2025-11-03 (Replay planning refresh).