Air-Gapped Deployment (Phase 11)

Dual-server architecture for managing hosts on isolated networks that have no internet access. One server (Collector) lives on the public side, mirrors upstream package and CVE feeds, and burns the result to optical media. A second server (Repository) lives on the air-gapped side, ingests the media, serves it as a local mirror, and lets managed agents update normally without ever touching the internet.

Architecture

Two servers, both running the same SysManage binary. The server role (see "Setting the server role" below) picks which half of the air-gap pair a server runs as. Licensing is the same Enterprise SKU on both: one license includes both engines.

Collector — public-side. Loads airgap_collector_engine. Captures package mirrors (apt-mirror, dnf reposync, pkg fetch), CVE / NVD snapshots, and CIS / STIG compliance feeds. Bundles the result into one or more ISO images, signs the manifest with ed25519, optionally burns to physical media.
Repository — private-side. Loads airgap_repository_engine. Ingests optical media, verifies the signature against the configured public key, verifies per-file SHA-256 hashes, copies the payload into a local mirror tree, and generates native repo metadata so each managed agent's package manager can consume it. Also rewrites each agent's APT / DNF / pkg config to point at the local mirror.
Standard — the default. No air-gap; manages hosts directly online. This is what every pre-Phase-11 deployment was, and what new deployments default to when no role has been set.

The two halves never share a database. Collector and Repository communicate exclusively through the disc; by design, there is no network connection between them. The Repository records the manifest's signer_fingerprint and collector_iso_label so that audit trails can correlate runs across the gap by inspecting both halves' databases after the fact.

Setting the server role

Set the role from the web UI — no config-file editing required. Go to Settings → Server Role and pick one of three options, each shown with an explanation of what it means and how it's used:

Standard (no air gap) — the default; manages hosts directly over the network. Leave every ordinary server here.
Air-Gap Collector (online side) — the internet-connected half that mirrors upstream repos and builds the signed ISO media.
Air-Gap Repository (disconnected side) — the isolated half that ingests the media and serves packages to hosts on the private network.

The choice is stored in the server's database (the server_configuration singleton), not in sysmanage.yaml. Earlier releases used a top-level role: key in the YAML; that key is now ignored, so you can remove it. Restart the SysManage server after changing the role for it to take full effect.

The role is also exposed via the unauthenticated GET /api/v1/server-info endpoint, which the web UI uses to render a chip in the header bar so operators can see at a glance which half of an air-gap pair they're looking at. The authenticated GET/PUT /api/v1/server-role endpoints back the Settings screen.
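As a rough illustration of how a monitoring script might read the role from the unauthenticated endpoint, here is a minimal sketch. The response field names (role, engine_loaded) and the base URL are assumptions; the text above only names the endpoint path and the fact that it needs no credentials.

```python
import json
from urllib.request import urlopen


def parse_server_info(body: bytes) -> tuple[str, bool]:
    """Extract (role, degraded) from a /api/v1/server-info response body.

    The "role" and "engine_loaded" field names are hypothetical.
    """
    info = json.loads(body)
    role = info.get("role", "standard")
    # Standard needs no air-gap engine, so it can never be degraded.
    degraded = role != "standard" and not info.get("engine_loaded", False)
    return role, degraded


def fetch_server_info(base_url: str) -> tuple[str, bool]:
    """GET the endpoint; no credentials are sent because it is unauthenticated."""
    with urlopen(f"{base_url}/api/v1/server-info", timeout=10) as resp:
        return parse_server_info(resp.read())
```

A degraded result here would correspond to the red chip case described in the next paragraph.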

If the role is set to Collector or Repository but the corresponding Pro+ engine isn't loaded (typically a license problem), the chip turns red and a tooltip explains what's missing. The server keeps running in degraded mode so you can fix the license without losing access to the rest of the UI.

Collection cycle (Collector side)

Triggered by an operator (UI, API, or cron) on the public-side server. The Phase 12 "Option B" rework sources the bundle from snapshots of configured mirror_repository trees rather than running a second upstream fetch per run — one download from upstream (via the repository_mirroring engine's normal sync) is reused for both the LAN mirror and the air-gap bundle. The engine emits a deployment plan that runs locally on the collector itself; it does not dispatch to managed agents.

Request — POST /api/v1/airgap/collector/runs with a body listing each target as a {mirror_id} picked from /api/mirror-repositories, plus iso_label, media_size_bytes, include_cve, include_compliance, and optional burn_device. Distro / version / repos are derived server-side from each mirror's known_version catalog entry.
Validation — each picked mirror must exist, be enabled, have a known_version_id, and share a host_id with the other picks (the collection plan dispatches to one agent, so cross-host targets aren't supported in v1). Version strings are constrained to a tight regex so they cannot smuggle shell metacharacters into the dispatch templates; iso_label is constrained the same way.
Auto-snapshot — at run creation, one snapshot is dispatched per target mirror. Each AirgapCollectionTarget row pins its source_snapshot_id to the placeholder snapshot row so the orchestrator knows which snapshot to bundle once the agent reports completion. Snapshots run in parallel; the run sits at QUEUED until every target's last_snapshot_message_id clears.
Mirror (snapshot-sourced) — once all snapshots are ready, the orchestrator builds build_snapshot_collection_run_plan with a source_snapshots map of "<distro>:<version>" → snapshot directory path. The plan rsyncs each snapshot tree into per-target staging directories. No upstream apt-mirror / reposync / pkg fetch runs at collection time — the bundle is provably byte-for-byte identical to the snapshot. Each rsync has a 4-hour timeout ceiling; the agent's in-flight journal (Phase 11.6) covers the multi-hour run across WebSocket reconnects.
CVE / compliance snapshot — if requested, NVD and vendor-advisory feeds are captured at the same instant so the resulting media set has a coherent point-in-time view of public vulnerability data.
ISO build — xorriso -as mkisofs bundles the staging tree into one ISO at /var/lib/sysmanage/airgap-iso/<run_id>.iso with the manifest embedded at /manifest.json. The post-build step computes the ISO's SHA-256 for the transfer log.
Sign — the manifest is canonicalized (sorted keys, no whitespace) and signed with the operator's ed25519 private key. The envelope carries the signature, the SHA-256 of the public key as a fingerprint, the algorithm identifier, and the manifest format version.
Burn (optional) — if burn_device is set on the run row, the orchestrator advances ISO_BUILT → BURNING by dispatching build_burn_plan at that device path (typically /dev/sr0). The burn-step timeout ceiling is 4 hours so slow drives don't trip a watchdog mid-write. When burn_device is NULL the run goes ISO_BUILT → COMPLETE directly — the "build a downloadable ISO and stop" flow that doesn't touch optical media.
Download — GET /api/v1/airgap/collector/runs/{id}/iso streams the ISO file. For multi-disc runs, GET /runs/{id}/discs lists the disc files and /iso?disc=N picks a specific one. The Air-Gap Collections UI auto-detects multi-disc runs and pops a picker dialog when the operator clicks Download.
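The Request and Validation steps above can be sketched as a small helper that applies the documented checks before building the POST body. This is an illustration only: the real validation runs server-side, the allow-list pattern is a guess (the source says only that the regex is "tight"), and the "targets" key and the mirror dict keys are hypothetical names.

```python
import re

# Hypothetical allow-list; the actual pattern is not given in the source.
SAFE_LABEL = re.compile(r"[A-Za-z0-9._-]{1,32}")


def build_run_request(mirrors, iso_label, media_size_bytes,
                      include_cve=True, include_compliance=True,
                      burn_device=None):
    """Check the picked mirrors and return a JSON-serializable request body.

    `mirrors` is a list of dicts with hypothetical keys:
    id, enabled, known_version_id, host_id.
    """
    if not mirrors:
        raise ValueError("at least one target mirror is required")
    if not SAFE_LABEL.fullmatch(iso_label):
        raise ValueError(f"iso_label {iso_label!r} contains unsafe characters")
    for m in mirrors:
        if not m["enabled"]:
            raise ValueError(f"mirror {m['id']} is disabled")
        if m["known_version_id"] is None:
            raise ValueError(f"mirror {m['id']} has no known_version_id")
    # The collection plan targets a single host, so every pick must share one.
    if len({m["host_id"] for m in mirrors}) != 1:
        raise ValueError("all target mirrors must share one host_id")

    body = {
        "targets": [{"mirror_id": m["id"]} for m in mirrors],
        "iso_label": iso_label,
        "media_size_bytes": media_size_bytes,
        "include_cve": include_cve,
        "include_compliance": include_compliance,
    }
    if burn_device is not None:
        body["burn_device"] = burn_device
    return body
```

Rejecting anything outside a fixed character set, rather than trying to escape it, is the usual way to keep user-supplied strings out of shell templates.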

Lifecycle states the airgap_run_tick orchestrator drives: QUEUED (waiting on per-target snapshots) → MIRRORING (rsync from snapshot trees into staging) → STAGING_COMPLETE (single-disc only; multi-disc skips this) → BUILDING_ISO (xorriso) → ISO_BUILT → BURNING (only when burn_device set) → COMPLETE. Any failure transitions the run to FAILED with the surfaced error message visible in the UI status column.
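The success-path transitions listed above can be modeled as a small state function. This is a sketch of the documented ordering, not the orchestrator's actual code; the function and parameter names are hypothetical, and FAILED is treated as reachable from any state rather than modeled explicitly.

```python
from enum import Enum


class RunState(Enum):
    QUEUED = "QUEUED"
    MIRRORING = "MIRRORING"
    STAGING_COMPLETE = "STAGING_COMPLETE"
    BUILDING_ISO = "BUILDING_ISO"
    ISO_BUILT = "ISO_BUILT"
    BURNING = "BURNING"
    COMPLETE = "COMPLETE"
    FAILED = "FAILED"


def next_state(state, multi_disc=False, burn_device=None):
    """Return the state that follows `state` on success, or None if terminal."""
    if state is RunState.QUEUED:
        return RunState.MIRRORING
    if state is RunState.MIRRORING:
        # Multi-disc runs skip STAGING_COMPLETE.
        return RunState.BUILDING_ISO if multi_disc else RunState.STAGING_COMPLETE
    if state is RunState.STAGING_COMPLETE:
        return RunState.BUILDING_ISO
    if state is RunState.BUILDING_ISO:
        return RunState.ISO_BUILT
    if state is RunState.ISO_BUILT:
        # Without a burn device the run finishes with a downloadable ISO.
        return RunState.BURNING if burn_device else RunState.COMPLETE
    if state is RunState.BURNING:
        return RunState.COMPLETE
    return None  # COMPLETE and FAILED are terminal


def happy_path(multi_disc=False, burn_device=None):
    """List the state names a successful run passes through."""
    path = [RunState.QUEUED]
    while (nxt := next_state(path[-1], multi_disc, burn_device)) is not None:
        path.append(nxt)
    return [s.value for s in path]
```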

Multi-disc spanning (Phase 12). When sum(target.source_snapshot.size_bytes) > run.media_size_bytes the orchestrator routes to build_snapshot_multidisc_collection_plan, which first-fit-decreasing bin-packs whole target trees onto discs. Each disc gets its own staging dir + rsync + xorriso + sha256 in one inline plan; output is <run_id>-disc-<N>.iso. The orchestrator surfaces a clear error if any single target's tree exceeds the disc size — file-level splitting (chopping one mirror tree across discs) is deferred; the operator picks larger media (BD-25 → BD-50) or splits into separate runs.
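First-fit-decreasing packing of whole target trees can be sketched as follows. This is a generic illustration of the technique, not the body of build_snapshot_multidisc_collection_plan; the function name is hypothetical, and the sketch ignores ISO filesystem overhead, which a real plan would need to budget for.

```python
def pack_targets(sizes, disc_size):
    """First-fit-decreasing: place whole targets onto the first disc with room.

    `sizes` maps a target name to its snapshot size in bytes.
    Returns a list of discs, each a list of target names.
    """
    oversized = [name for name, size in sizes.items() if size > disc_size]
    if oversized:
        # Whole trees are never split, so an oversized target cannot be packed.
        raise ValueError(f"targets exceed disc size: {sorted(oversized)}")

    discs = []  # target names per disc
    free = []   # remaining bytes per disc
    # Largest first; ties broken by name so the layout is deterministic.
    for name, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        for i, room in enumerate(free):
            if size <= room:
                discs[i].append(name)
                free[i] -= size
                break
        else:
            # No existing disc has room, so open a new one.
            discs.append([name])
            free.append(disc_size - size)
    return discs
```

For example, with a disc size of 25 units and targets of 18, 12, 9, and 5 units, the 18 and 5 share the first disc and the 12 and 9 share the second.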

Ingestion cycle (Repository side)

Operator inserts the optical media into the air-gapped server, then triggers ingestion via the UI or API. The repository engine refuses to serve any byte until the signature and file hashes have verified.

Mount — the engine emits a deployment plan that mounts the ISO read-only at /mnt/sysmanage-airgap-ingest.
Verify signature — the manifest envelope is decoded; the engine refuses to ingest a manifest whose format_version is newer than the highest format version it supports (forward-compat fail-safe). The signature is verified against the public key configured in the repository's sysmanage.yaml. Strict mode (the default) rejects the HMAC-SHA-256 fallback envelope that the collector produces only when cryptography isn't installed.
Verify file hashes — every file listed in the manifest is streamed through SHA-256 and compared against the manifest's expected hash. The first mismatch aborts the entire ingestion; the partial copy is rolled back.
Copy — once verification passes, rsync -a --delete-after copies the payload from the mount point to the local repository root at /var/lib/sysmanage/airgap-repo.
Generate native repo metadata — per distro family: apt-ftparchive packages | gzip > Packages.gz for APT, createrepo_c --update for DNF / zypper, pkg repo for FreeBSD, apk index for Alpine.
Unmount — the ISO is unmounted; the operator can eject the disc.
Repoint agents — the engine emits a per-host plan that rewrites each managed agent's package-manager configuration: /etc/apt/sources.list.d/sysmanage-airgap.list for Debian / Ubuntu, /etc/yum.repos.d/sysmanage-airgap.repo for RHEL family / openSUSE, /usr/local/etc/pkg/repos/sysmanage-airgap.conf for FreeBSD. Existing agents pick up the change on their next package operation.
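The "Verify file hashes" step above amounts to streaming each file through SHA-256 and stopping at the first mismatch. A minimal sketch, assuming the manifest has already been reduced to a mapping of relative path to lowercase hex digest (the real manifest layout is not described in the source, and the function names are hypothetical):

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large packages never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_payload(mount_root, expected):
    """Raise ValueError on the first missing or mismatched file.

    `expected` maps a path relative to the mount point to its hex SHA-256.
    Returns the number of files verified when everything matches.
    """
    root = Path(mount_root)
    for rel_path, want in expected.items():
        target = root / rel_path
        if not target.is_file():
            raise ValueError(f"missing file: {rel_path}")
        if sha256_file(target) != want:
            raise ValueError(f"hash mismatch: {rel_path}")
    return len(expected)
```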

Per-distro install channels

For new child hosts created on the air-gapped network, sysmanage-agent itself needs to install from a substitutable channel — direct GitHub-release downloads can't be substituted with a private mirror. Phase 11.8 switched the per-distro agent_install_commands to consume the published upstream channels instead:

Ubuntu / Debian — Launchpad PPA (ppa:bceverly/sysmanage-agent). Repository's build_agent_repoint_plan rewrites /etc/apt/sources.list.d/ to point at the local mirror before the install runs.
Fedora / RHEL / Rocky / Alma / Oracle Linux / CentOS Stream — Fedora Copr (bceverly/sysmanage-agent). Substitutable via /etc/yum.repos.d/sysmanage-airgap.repo.
openSUSE / SLES — Open Build Service (home:bceverly/sysmanage-agent). Substitutable via the same yum/zypper repo file pattern.
FreeBSD / OpenBSD / NetBSD — currently direct downloads. The repository engine can still rewrite pkg.conf to point at the local mirror, so the agent comes up via a different path than its first-boot script expects but still without internet access. A formal upstream ports / pkgsrc submission is tracked as a follow-up.

Compliance context across the gap

Phase 11.3 wires the repository's freshness state into compliance reports so operators can distinguish two failure modes that look identical without context:

not_applied — a newer version is on the local mirror but hasn't been installed on this host yet. Fix locally; cheap.
not_transferred — the public CVE feed (captured by the collector at last media transfer) lists a fix that isn't on the local mirror at all. Fix requires a new collection / burn / ingest cycle; expensive.
current — the host's installed version matches the mirror's latest, and no public-side CVE has surfaced since last transfer.
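The three states can be read as a simple decision over three versions. The sketch below is one interpretation, not the engine's logic: versions are modeled as tuples of integers, the function name is hypothetical, and reporting not_transferred first when both conditions hold is an assumption (it is the more expensive fix, so it seems the more useful one to surface).

```python
def patch_status(installed, mirror_latest, public_fixed=None):
    """Classify one package on one host.

    installed      version on the host, e.g. (1, 2, 3)
    mirror_latest  newest version on the local mirror
    public_fixed   fixed version named by the captured CVE feed, or None
    """
    if public_fixed is not None and public_fixed > mirror_latest:
        # The fix never crossed the gap; a new transfer cycle is needed.
        return "not_transferred"
    if mirror_latest > installed:
        # The fix is already local; installing it resolves the finding.
        return "not_applied"
    return "current"
```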

Each repository tracks the number of days since its last ingest. The compliance UI surfaces this as a freshness label: current (≤ 7 days), stale (8–30 days), very_stale (> 30 days), or never. Use this to size the air-gap transfer cadence against your organization's tolerance for delayed patch availability.
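The freshness thresholds map directly onto a small function. In this sketch, counting elapsed whole days and the function name are assumptions; only the label names and day boundaries come from the text above.

```python
from datetime import datetime, timezone
from typing import Optional


def freshness_label(last_ingest: Optional[datetime],
                    now: Optional[datetime] = None) -> str:
    """Map the time since the last ingest onto the documented labels."""
    if last_ingest is None:
        return "never"
    now = now or datetime.now(timezone.utc)
    days = (now - last_ingest).days  # whole days elapsed
    if days <= 7:
        return "current"
    if days <= 30:
        return "stale"
    return "very_stale"
```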

Troubleshooting

Ingestion fails with "signature does not match manifest payload"

The repository's configured public key doesn't match the private key that signed this disc. Either the disc was tampered with in transit, or the operator regenerated the keypair on the collector without distributing the new public key to the repository. Re-export the public key from the collector and replace the value in the repository's sysmanage.yaml.
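To tell a stale public key apart from a modified disc, it can help to compare the fingerprint of the configured key with the signer fingerprint in the envelope. The sketch below assumes the fingerprint is the hex SHA-256 of the raw public-key bytes and that canonical form is UTF-8 JSON; the source states only "SHA-256 of the public key" and "sorted keys, no whitespace", so treat the encodings and function names as hypothetical.

```python
import hashlib
import json


def canonical_manifest_bytes(manifest: dict) -> bytes:
    """Serialize with sorted keys and no insignificant whitespace."""
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")


def key_fingerprint(public_key: bytes) -> str:
    """Hex SHA-256 of the public key bytes (encoding is an assumption)."""
    return hashlib.sha256(public_key).hexdigest()


def diagnose(envelope_fingerprint: str, configured_public_key: bytes) -> str:
    """Suggest which of the two documented causes is more likely."""
    if key_fingerprint(configured_public_key) != envelope_fingerprint:
        return "key mismatch: the disc was signed with a different keypair"
    return "keys match: the manifest payload differs from what was signed"
```

If the fingerprints agree but verification still fails, the bytes being verified are not the canonical bytes that were signed, which points at the disc contents rather than the key.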

Ingestion fails with "manifest format_version exceeds repository's max"

The collector has been upgraded to a new SysManage version that changed the manifest format, but the repository hasn't been upgraded yet. Upgrade the repository server to the same version, then re-ingest. The format-version check is a deliberate forward-compat fail-safe — older repositories cannot be tricked into ingesting media with new fields they don't understand.

Role chip in the header bar is red

The server role is set to Collector or Repository, but the corresponding Pro+ engine isn't loaded. The most common cause is an Enterprise license that isn't valid (expired, missing, or lacking the air-gap engines). Check the License page in Settings; if the license is fine, check /var/log/sysmanage/server.log for module-loader errors.

Multi-hour collection cycle gets killed mid-run

This should not happen as of Phase 11.6. The agent now writes a per-plan in-flight journal at ~/.sysmanage-agent/inflight/<message_id>.json before each subprocess starts, and refreshes it with a 30-second heartbeat. If the agent process restarts (or the WebSocket bounces), the journal carries enough state for the agent to either re-attach to the still-live subprocess or emit a synthetic command_result so the server's mirror row clears. If you see a hung mirror anyway, file an issue and include the contents of the in-flight journal.
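The journal-and-recover idea can be illustrated with a short POSIX-only sketch. The journal fields (message_id, pid, heartbeat) and all function names are hypothetical; the source does not describe the file's schema, only that it holds enough state to choose between re-attaching and reporting a synthetic result.

```python
import json
import os
import time
from pathlib import Path


def write_heartbeat(journal_dir, message_id, pid):
    """Atomically (re)write the journal entry for one in-flight plan."""
    journal_dir = Path(journal_dir)
    journal_dir.mkdir(parents=True, exist_ok=True)
    path = journal_dir / f"{message_id}.json"
    tmp = journal_dir / f"{message_id}.json.tmp"
    entry = {"message_id": message_id, "pid": pid, "heartbeat": time.time()}
    tmp.write_text(json.dumps(entry))
    os.replace(tmp, path)  # atomic rename: readers never see a partial file
    return path


def pid_alive(pid):
    """POSIX liveness probe: signal 0 checks existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def recover(path, alive=pid_alive):
    """Decide what a restarted agent should do with one journal entry."""
    entry = json.loads(Path(path).read_text())
    if alive(entry["pid"]):
        return ("reattach", entry["message_id"])
    return ("synthetic_command_result", entry["message_id"])
```

Do not use the signal-0 probe on Windows, where os.kill with an ordinary signal number terminates the target process.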