Surface.Env Design (Epic: SURFACE-SHARING)
Status: Draft v1.0 (aligns with tasks SURFACE-ENV-01..05, SCANNER-ENV-01..03, ZASTAVA-ENV-01..02, OPS-ENV-01).
Audience: Scanner Worker/WebService engineers, Zastava engineers, DevOps/Ops teams.
1. Goals
Surface.Env centralises configuration discovery for every component that touches the shared Scanner “surface” (cache, manifests, secrets). The library replaces ad-hoc environment lookups with a deterministic, validated contract that:
- Works identically across Scanner Worker, Scanner WebService, BuildX plug-ins, Zastava Observer/Webhook, and future consumers (Scheduler planners, CLI runners).
- Supports both connected and air-gapped deployments with clear defaults.
- Records configuration intent (tenant isolation, cache limits, TLS, feature flags) so Surface.Validation can enforce preconditions before any work executes.
2. Architecture Overview
+-----------------------+
| Host (Worker/WebSvc)  |
|  - IConfiguration     |
|  - ILogger            |
|                       |
|  +-----------------+  |
|  | SurfaceEnv      |  |  loads env vars / config file
|  |  - Provider     |--+------------------------------+
|  |  - Validators   |                                 |
|  +-----------------+                                 |
|           |                                          |
|           | IResolvedSurfaceConfiguration            |
|           v                                          v
|  Surface.FS / Surface.Secrets / Surface.Validation consumers
+-------------------------------------------------------------
Surface.Env exposes ISurfaceEnvironment, which returns an immutable SurfaceEnvironmentSettings record. Hosts call SurfaceEnvironmentBuilder.Build() during startup, passing optional configuration overrides (for example, Helm chart values). The builder resolves environment variables, applies defaults, and executes Surface.Validation rules before handing the settings to downstream services.
3. Configuration Schema
3.1 Common keys
| Variable | Description | Default | Notes |
|---|---|---|---|
| SCANNER_SURFACE_FS_ENDPOINT | Base URI for Surface.FS / RustFS / S3-compatible store. | required | Throws SurfaceEnvironmentException if unset while RequireSurfaceEndpoint = true. When the requirement is disabled (tests), the builder falls back to https://surface.invalid so validation can fail fast. Also binds Surface:Fs:Endpoint from IConfiguration. |
| SCANNER_SURFACE_FS_BUCKET | Bucket/container used for manifests and artefacts. | surface-cache | Must be unique per tenant; validators enforce a non-empty value. |
| SCANNER_SURFACE_FS_REGION | Optional region for S3-compatible stores. | null | Needed only when the backing store requires it (AWS/GCS). |
| SCANNER_SURFACE_CACHE_ROOT | Local directory for warm caches. | <temp>/stellaops/surface | Directory is created if missing. Override to /var/lib/stellaops/surface (or another fast SSD) in production. |
| SCANNER_SURFACE_CACHE_QUOTA_MB | Soft limit for on-disk cache usage. | 4096 | Enforced range 64–262144 MB; validation emits SURFACE_ENV_CACHE_QUOTA_INVALID outside the range. |
| SCANNER_SURFACE_PREFETCH_ENABLED | Enables manifest prefetch threads. | false | Workers honour this before analyzer execution. |
| SCANNER_SURFACE_TENANT | Tenant namespace used by cache + secret resolvers. | TenantResolver(...) or "default" | Default resolver may pull from Authority claims; you can override via env for multi-tenant pools. |
| SCANNER_SURFACE_FEATURES | Comma-separated feature switches. | "" | Compared against SurfaceEnvironmentOptions.KnownFeatureFlags; unknown flags raise warnings. |
| SCANNER_SURFACE_TLS_CERT_PATH | Path to PEM/PKCS#12 file for client auth. | null | When present, SurfaceEnvironmentBuilder loads the certificate into SurfaceTlsConfiguration. |
| SCANNER_SURFACE_TLS_KEY_PATH | Optional private-key path when cert/key are stored separately. | null | Stored in SurfaceTlsConfiguration for hosts that need to hydrate the key themselves. |
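The quota row combines two failure modes described in this document: a non-integer value is malformed input, while a well-formed value outside 64–262144 MB is a validation issue. The following is a minimal sketch of that split, assuming a hypothetical helper named ResolveCacheQuota; it is not the library's implementation, and the real builder and validator live in separate types.

```csharp
#nullable enable
using System;
using System.Globalization;

// Hypothetical helper (not part of Surface.Env): applies the documented
// default (4096 MB) and range (64 to 262144 MB) to a raw variable value.
static (int QuotaMb, string? IssueCode) ResolveCacheQuota(string? raw)
{
    const int DefaultMb = 4096;
    const int MinMb = 64;
    const int MaxMb = 262144;

    // Unset variable: fall back to the documented default.
    if (string.IsNullOrWhiteSpace(raw))
    {
        return (DefaultMb, null);
    }

    // Non-integer input is treated as malformed and fails immediately.
    if (!int.TryParse(raw, NumberStyles.Integer, CultureInfo.InvariantCulture, out var quota))
    {
        throw new FormatException($"SCANNER_SURFACE_CACHE_QUOTA_MB is not an integer: '{raw}'");
    }

    // Well-formed but out-of-range input is reported as a validation issue code.
    var inRange = quota >= MinMb && quota <= MaxMb;
    return (quota, inRange ? null : "SURFACE_ENV_CACHE_QUOTA_INVALID");
}

var (quotaMb, issueCode) = ResolveCacheQuota("32");
var issueLabel = issueCode ?? "none";
Console.WriteLine($"{quotaMb} MB, issue: {issueLabel}");
```

Returning the issue code instead of throwing for out-of-range values keeps the sketch in line with the document's description of range problems as validation output rather than builder exceptions.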
3.2 Secrets provider keys
| Variable | Description | Notes |
|---|---|---|
| SCANNER_SURFACE_SECRETS_PROVIDER | Provider ID (kubernetes, file, inline, future back-ends). | Defaults to kubernetes; validators reject unknown values via SURFACE_SECRET_PROVIDER_UNKNOWN. |
| SCANNER_SURFACE_SECRETS_ROOT | Path or base namespace for the provider. | Required for the file provider (e.g., /etc/stellaops/secrets). |
| SCANNER_SURFACE_SECRETS_NAMESPACE | Kubernetes namespace used by the secrets provider. | Mandatory when provider = kubernetes. |
| SCANNER_SURFACE_SECRETS_FALLBACK_PROVIDER | Optional secondary provider ID. | Enables tiered lookups (e.g., kubernetes → inline) without changing code. |
| SCANNER_SURFACE_SECRETS_ALLOW_INLINE | Allows returning inline secrets (useful for tests). | Defaults to false; production deployments should keep this disabled. |
| SCANNER_SURFACE_SECRETS_TENANT | Tenant override for secret lookups. | Defaults to SCANNER_SURFACE_TENANT or the tenant resolver result. |
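The provider-specific requirements in the table (namespace for kubernetes, root for file, rejection of unknown provider IDs) can be illustrated with a small check. This is a sketch under assumptions: CheckSecretsConfiguration is a hypothetical function, not the SurfaceSecretsValidator itself, and it reuses only the issue codes named in this document.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical check (not the real SurfaceSecretsValidator): returns the
// issue codes that the documented rules would produce for a given setup.
static List<string> CheckSecretsConfiguration(string? provider, string? root, string? kubeNamespace)
{
    var found = new List<string>();

    // The provider defaults to "kubernetes" when the variable is unset.
    var effective = string.IsNullOrWhiteSpace(provider) ? "kubernetes" : provider.Trim();

    switch (effective)
    {
        case "kubernetes":
            // The namespace is mandatory for the kubernetes provider.
            if (string.IsNullOrWhiteSpace(kubeNamespace))
            {
                found.Add("SURFACE_SECRET_CONFIGURATION_MISSING");
            }
            break;
        case "file":
            // The root path is required for the file provider.
            if (string.IsNullOrWhiteSpace(root))
            {
                found.Add("SURFACE_SECRET_CONFIGURATION_MISSING");
            }
            break;
        case "inline":
            // No extra fields are listed for inline in the table above.
            break;
        default:
            found.Add("SURFACE_SECRET_PROVIDER_UNKNOWN");
            break;
    }

    return found;
}

var fileWithoutRoot = CheckSecretsConfiguration("file", null, null);
Console.WriteLine(string.Join(", ", fileWithoutRoot));
```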
3.3 Component-specific prefixes
SurfaceEnvironmentOptions.Prefixes controls the order in which suffixes are probed. Every suffix listed above is combined with each prefix (e.g., SCANNER_SURFACE_FS_ENDPOINT, ZASTAVA_SURFACE_FS_ENDPOINT) and finally the bare suffix (SURFACE_FS_ENDPOINT). Configure prefixes per host so local overrides win but global scanner defaults remain available:
| Component | Suggested prefixes (first match wins) | Notes |
|---|---|---|
| Scanner.Worker / WebService | SCANNER | Default – already added by AddSurfaceEnvironment. |
| Zastava Observer/Webhook (planned) | ZASTAVA, SCANNER | Call options.AddPrefix("ZASTAVA") before relying on ZASTAVA_* overrides. |
| Future CLI / BuildX plug-ins | CLI, SCANNER | Allows per-user overrides without breaking shared env files. |
This approach means operators can define a single env file (SCANNER_*) and only override the handful of settings that diverge for a specific component by introducing an additional prefix.
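To make the probing order concrete, the sketch below expands one suffix into the variable names that would be tried, in order. CandidateNames is a hypothetical helper written for illustration; the real lookup is internal to Surface.Env and may differ in detail.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: each prefix is tried in the order given,
// and the bare suffix is tried last.
static IEnumerable<string> CandidateNames(IEnumerable<string> orderedPrefixes, string suffix)
{
    foreach (var prefix in orderedPrefixes)
    {
        yield return $"{prefix}_{suffix}";
    }

    yield return suffix;
}

var zastavaOrder = new[] { "ZASTAVA", "SCANNER" };
Console.WriteLine(string.Join(", ", CandidateNames(zastavaOrder, "SURFACE_FS_ENDPOINT")));
// prints "ZASTAVA_SURFACE_FS_ENDPOINT, SCANNER_SURFACE_FS_ENDPOINT, SURFACE_FS_ENDPOINT"
```

With this ordering, a shared SCANNER_* env file still supplies every value that a component does not override under its own prefix.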
3.4 Configuration precedence
The builder resolves every suffix using the following precedence:
- Environment variables using the configured prefixes (e.g., ZASTAVA_SURFACE_FS_ENDPOINT, then SCANNER_SURFACE_FS_ENDPOINT, then the bare SURFACE_FS_ENDPOINT).
- Configuration values under the Surface:* section (for example Surface:Fs:Endpoint, Surface:Cache:Root in appsettings.json or Helm values).
- Hard-coded defaults baked into SurfaceEnvironmentBuilder (temporary directory, surface-cache bucket, etc.).
SurfaceEnvironmentOptions.RequireSurfaceEndpoint controls whether a missing endpoint results in an exception (default: true). Other values fall back to the default listed in § 3.1/3.2 and are further validated by the Surface.Validation pipeline.
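The three precedence tiers can be sketched as a single lookup that also reports where the value came from. This is an illustration only: Resolve is a hypothetical function, plain dictionaries stand in for the process environment and IConfiguration, the example endpoints are placeholders, and treating whitespace-only values as unset is an assumption.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical resolver: environment (prefixes, then bare suffix),
// then configuration, then the built-in default.
static (string? Value, string Source) Resolve(
    string suffix,
    IEnumerable<string> prefixes,
    IReadOnlyDictionary<string, string> env,
    IReadOnlyDictionary<string, string> config,
    string configKey,
    string? fallback)
{
    // Tier 1: environment variables, first match wins.
    var names = prefixes.Select(p => $"{p}_{suffix}").Append(suffix);
    foreach (var name in names)
    {
        if (env.TryGetValue(name, out var fromEnv) && !string.IsNullOrWhiteSpace(fromEnv))
        {
            return (fromEnv, $"env:{name}");
        }
    }

    // Tier 2: configuration values under the Surface:* section.
    if (config.TryGetValue(configKey, out var fromConfig) && !string.IsNullOrWhiteSpace(fromConfig))
    {
        return (fromConfig, $"config:{configKey}");
    }

    // Tier 3: hard-coded default (may be null for optional settings).
    return (fallback, "default");
}

var envVars = new Dictionary<string, string>
{
    ["SCANNER_SURFACE_FS_ENDPOINT"] = "https://shared.example.internal",
    ["ZASTAVA_SURFACE_FS_ENDPOINT"] = "https://zastava.example.internal",
};
var configValues = new Dictionary<string, string>
{
    ["Surface:Cache:Root"] = "/var/lib/stellaops/surface",
};
var prefixOrder = new[] { "ZASTAVA", "SCANNER" };

Console.WriteLine(Resolve("SURFACE_FS_ENDPOINT", prefixOrder, envVars, configValues, "Surface:Fs:Endpoint", null));
Console.WriteLine(Resolve("SURFACE_CACHE_ROOT", prefixOrder, envVars, configValues, "Surface:Cache:Root", "/tmp/stellaops/surface"));
```

Recording the source next to the value is one way to support a diagnostics view similar in spirit to RawVariables, without logging the value itself.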
4. API Surface
public interface ISurfaceEnvironment
{
    SurfaceEnvironmentSettings Settings { get; }
    IReadOnlyDictionary<string, string> RawVariables { get; }
}

public sealed record SurfaceEnvironmentSettings(
    Uri SurfaceFsEndpoint,
    string SurfaceFsBucket,
    string? SurfaceFsRegion,
    DirectoryInfo CacheRoot,
    int CacheQuotaMegabytes,
    bool PrefetchEnabled,
    IReadOnlyCollection<string> FeatureFlags,
    SurfaceSecretsConfiguration Secrets,
    string Tenant,
    SurfaceTlsConfiguration Tls)
{
    public DateTimeOffset CreatedAtUtc { get; init; }
}

public sealed record SurfaceSecretsConfiguration(
    string Provider,
    string Tenant,
    string? Root,
    string? Namespace,
    string? FallbackProvider,
    bool AllowInline);

public sealed record SurfaceTlsConfiguration(
    string? CertificatePath,
    string? PrivateKeyPath,
    X509Certificate2Collection? ClientCertificates);
ISurfaceEnvironment.RawVariables captures the exact env/config keys that produced the snapshot so operators can export them in diagnostics bundles.
SurfaceEnvironmentOptions configures how the snapshot is built:
- ComponentName – used in logs/validation output.
- Prefixes – ordered list of env prefixes (see § 3.3). Defaults to ["SCANNER"].
- RequireSurfaceEndpoint – throw when no endpoint is provided (default true).
- TenantResolver – delegate invoked when SCANNER_SURFACE_TENANT is absent.
- KnownFeatureFlags – recognised feature switches; unexpected values raise warnings.
Example registration:
builder.Services.AddSurfaceEnvironment(options =>
{
    options.ComponentName = "Scanner.Worker";
    options.AddPrefix("ZASTAVA"); // optional future override
    options.KnownFeatureFlags.Add("validation");
    options.TenantResolver = sp => sp.GetRequiredService<ITenantContext>().TenantId;
});
Consumers access ISurfaceEnvironment.Settings and pass the record into Surface.FS, Surface.Secrets, cache, and validation helpers. The interface memoises results so repeated access is cheap.
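The FeatureFlags collection in the settings record is derived from the comma-separated SCANNER_SURFACE_FEATURES value and compared against KnownFeatureFlags. The sketch below shows one plausible way to do that comparison. ParseFeatureFlags is a hypothetical helper, the flag name "prefetch-v2" is invented for the example, and case-insensitive matching is an assumption made by the caller's choice of comparer.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical parser: splits the raw value, keeps every flag, and lists
// separately the ones that are not in the known set (warning candidates).
static (List<string> Flags, List<string> Unknown) ParseFeatureFlags(string? raw, ISet<string> known)
{
    var flags = new List<string>();
    var unknown = new List<string>();

    var parts = (raw ?? string.Empty).Split(
        ',',
        StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);

    foreach (var part in parts)
    {
        flags.Add(part);
        if (!known.Contains(part))
        {
            unknown.Add(part);
        }
    }

    return (flags, unknown);
}

var knownFlags = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "validation" };
var (enabledFlags, unrecognised) = ParseFeatureFlags("validation, prefetch-v2", knownFlags);
Console.WriteLine($"enabled: {string.Join(", ", enabledFlags)}");
Console.WriteLine($"unknown (warn): {string.Join(", ", unrecognised)}");
```

Unknown flags are kept rather than dropped here because the document only states that they raise warnings.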
5. Validation
SurfaceEnvironmentBuilder only throws SurfaceEnvironmentException for malformed inputs (non-integer quota, invalid URI, missing required variable when RequireSurfaceEndpoint = true). The richer validation pipeline lives in StellaOps.Scanner.Surface.Validation and runs via services.AddSurfaceValidation():
- SurfaceEndpointValidator – checks for a non-placeholder endpoint and bucket (SURFACE_ENV_MISSING_ENDPOINT, SURFACE_FS_BUCKET_MISSING).
- SurfaceCacheValidator – verifies the cache directory exists/is writable and that the quota is positive (SURFACE_ENV_CACHE_DIR_UNWRITABLE, SURFACE_ENV_CACHE_QUOTA_INVALID).
- SurfaceSecretsValidator – validates provider names, required namespace/root fields, and tenant presence (SURFACE_SECRET_PROVIDER_UNKNOWN, SURFACE_SECRET_CONFIGURATION_MISSING, SURFACE_ENV_TENANT_MISSING).
Validators emit SurfaceValidationIssue instances with codes defined in SurfaceValidationIssueCodes. LoggingSurfaceValidationReporter writes structured log entries (Info/Warning/Error) using the component name, issue code, and remediation hint. Hosts fail startup if any issue has Error severity; warnings allow startup but surface actionable hints.
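The startup rule described above (any Error blocks startup, warnings are logged and allowed) reduces to a small gate. The sketch below models issues as plain tuples rather than the real SurfaceValidationIssue type, whose shape is not shown in this document. CanStart is a hypothetical name, and the severities paired with each code are illustrative only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical gate: logs every issue, then allows startup only when
// no issue carries Error severity.
static bool CanStart(IEnumerable<(string Code, string Severity)> found, Action<string> log)
{
    var hasError = false;

    foreach (var (code, severity) in found)
    {
        log($"{severity}: {code}");
        if (severity == "Error")
        {
            hasError = true;
        }
    }

    return !hasError;
}

var reported = new List<(string Code, string Severity)>
{
    ("SURFACE_ENV_CACHE_QUOTA_INVALID", "Warning"),
    ("SURFACE_ENV_MISSING_ENDPOINT", "Error"),
};

Console.WriteLine(CanStart(reported, Console.WriteLine) ? "startup allowed" : "startup blocked");
```

The loop does not stop at the first Error, so every issue is reported before the host decides to stop.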
6. Integration Guidance
- Scanner Worker: register AddSurfaceEnvironment, AddSurfaceValidation, AddSurfaceFileCache, and AddSurfaceSecrets before analyzer/services (see src/Scanner/StellaOps.Scanner.Worker/Program.cs). SurfaceCacheOptionsConfigurator already binds the cache root from ISurfaceEnvironment.
- Scanner WebService: identical wiring, plus SurfacePointerService/ScannerSurfaceSecretConfigurator reuse the resolved settings (Program.cs demonstrates the pattern).
- Zastava Observer/Webhook: will reuse the same helper once the service adds AddSurfaceEnvironment(options => options.AddPrefix("ZASTAVA")) so per-component overrides function without diverging defaults.
- Scheduler / CLI / BuildX (future): treat ISurfaceEnvironment as read-only input; secret lookup, cache plumbing, and validation happen before any queue/enqueue work.
Readiness probes should invoke ISurfaceValidatorRunner (registered by AddSurfaceValidation) and fail the endpoint when any issue with Error severity is returned. The Scanner Worker/WebService hosted services already run the validators on startup; other consumers should follow the same pattern.
6.1 Validation output
LoggingSurfaceValidationReporter produces log entries that include:
Surface validation issue for component Scanner.Worker: SURFACE_ENV_MISSING_ENDPOINT - Surface FS endpoint is missing or invalid. Hint: Set SCANNER_SURFACE_FS_ENDPOINT to the RustFS/S3 endpoint.
Treat SurfaceValidationIssueCodes.* with severity Error as hard blockers (readiness must fail). Warning entries flag configuration drift (for example, missing namespaces) but allow startup so staging/offline runs can proceed. The codes appear in both the structured log state and the reporter payload, making it easy to alert on them.
7. Security & Observability
- Surface.Env never logs raw values; only suffix names and issue codes appear in logs. RawVariables is intended for diagnostics bundles and should be treated as sensitive metadata.
- TLS certificates are loaded into memory and not re-serialised; only the configured paths are exposed to downstream services.
- To emit metrics, register a custom ISurfaceValidationReporter (e.g., wrapping Prometheus counters) in addition to the logging reporter.
8. Offline & Air-Gap Support
- Defaults assume no public network access; point SCANNER_SURFACE_FS_ENDPOINT at an internal RustFS/S3 mirror.
- Offline bundles must capture an env file (Ops track this under the Offline Kit tasks) so operators can seed SCANNER_* values before first boot.
- Keep docs/modules/devops/runbooks/zastava-deployment.md in sync so Zastava deployments reuse the same env contract.
9. Testing Strategy
- Unit tests for each resolver/validator.
- Integration tests for Worker & Observer verifying that missing configuration causes deterministic failures.
- Golden tests for configuration precedence (component overrides, defaults).
10. Open Questions / Future Work
- Dynamic refresh of environment (watch ConfigMap) is out of scope for v1.
- Evaluate adding support for environment discovery via IConfiguration only (no env vars) for Windows service deployments.
11. References
- Surface.FS Design (docs/modules/scanner/design/surface-fs.md)
- Surface.Secrets Design (docs/modules/scanner/design/surface-secrets.md)
- Surface.Validation Design (docs/modules/scanner/design/surface-validation.md)
- AirGap mode overview (docs/airgap/airgap-mode.md)