Fleet Engine
Server-side fleet management: bulk operations across selectable host sets, host groups with parent/child hierarchies and dynamic-criteria membership, rolling deployments with batched rollout windows and failure thresholds, scheduled fleet operations, and per-operation progress tracking.
Overview
The Fleet Engine adds operation-at-scale primitives on top of the open-source single-host APIs: a HostSelector filter DSL that resolves to a list of host IDs, persisted BulkOperation records with per-host result tracking, RollingDeployment with configurable batch size and failure-threshold halt, and HostGroup with optional dynamic-criteria membership that re-evaluates on every selector resolution.
Open Source vs Professional+
- Community Edition: simple expand-into-per-host plans via backend/services/bulk_op_planner.py (no persistence, no batching, explicit host_ids only)
- Professional+: HostSelector filter DSL (by tag, platform, group, approval status, or arbitrary criteria with equals / contains / in / matches ops)
- Professional+: persistent host groups with parent/child hierarchy and dynamic-criteria membership
- Professional+: rolling deployments with batched rollout windows + automatic halt on excess failures
- Professional+: persisted BulkOperation records, per-host result tracking, aggregated OperationProgress snapshots
- Professional+: cron-triggered scheduled fleet operations
Host Selector DSL
A HostSelector is a compound filter combining explicit host IDs with optional criteria (AND-joined). Convenience shortcuts (platforms, tags, groups, approval_status) compile into equivalent criteria before evaluation:
POST /api/v1/fleet/select
{
  "explicit_host_ids": ["..uuid-special.."],
  "criteria": [
    {"field": "platform_release", "op": "equals", "value": "ubuntu"},
    {"field": "fqdn", "op": "contains", "value": ".prod."}
  ],
  "platforms": ["linux"],
  "tags": ["webserver"],
  "approval_status": "approved"
}
→ {"host_ids": [...], "count": 47}
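The criteria path of the selector can be sketched in a few lines of Python. This is a minimal illustration, not the engine's implementation: the shortcut-to-field mapping (SHORTCUT_FIELDS), the function names, the use of regular expressions for matches, and the any-element rule for list-valued fields are all assumptions. How explicit_host_ids combine with the criteria is not specified above, so the sketch leaves them out.

```python
import re
from typing import Any

# Hypothetical mapping from shortcut keys to host fields; real field names may differ.
SHORTCUT_FIELDS = {"platforms": "platform", "tags": "tags", "groups": "groups"}


def compile_selector(selector: dict[str, Any]) -> list[dict[str, Any]]:
    """Expand convenience shortcuts into plain criteria (assumed compilation rule)."""
    criteria = list(selector.get("criteria", []))
    for key, field in SHORTCUT_FIELDS.items():
        if selector.get(key):
            criteria.append({"field": field, "op": "in", "value": selector[key]})
    if selector.get("approval_status"):
        criteria.append({"field": "approval_status", "op": "equals",
                         "value": selector["approval_status"]})
    return criteria


def matches_criterion(host: dict[str, Any], criterion: dict[str, Any]) -> bool:
    """Evaluate one criterion against one host record."""
    actual = host.get(criterion["field"])
    op, expected = criterion["op"], criterion["value"]
    if actual is None:
        return False
    if op == "equals":
        return actual == expected
    if op == "contains":
        return expected in actual
    if op == "in":
        # Assumption: a list-valued field (such as tags) matches if any element is listed.
        values = actual if isinstance(actual, list) else [actual]
        return any(v in expected for v in values)
    if op == "matches":
        # Assumption: "matches" means a regular-expression search.
        return re.search(expected, str(actual)) is not None
    raise ValueError(f"unknown op: {op}")


def resolve(selector: dict[str, Any], inventory: list[dict[str, Any]]) -> list[str]:
    """Return the IDs of hosts that satisfy every criterion (AND-joined)."""
    criteria = compile_selector(selector)
    return [h["id"] for h in inventory
            if all(matches_criterion(h, c) for c in criteria)]
```

Because shortcuts are compiled into ordinary criteria first, the evaluator only needs to understand the four ops, and a host must pass every criterion to be selected.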
Host Groups
A HostGroup bundles an explicit host list, dynamic criteria, and an optional parent group. Membership re-evaluates on every resolve_group_hosts call, so newly onboarded hosts that match a group's criteria appear automatically. The engine rejects parent assignments that would form a cycle. When a group is deleted, its children are reparented to the deleted group's parent, so each child moves up one level in the hierarchy.
- GET /api/v1/fleet/groups: list groups; ?parent_id= filters children
- POST /api/v1/fleet/groups: create or update; cycle-checked
- GET /api/v1/fleet/groups/{id}/hosts: resolve effective membership against the live inventory
- DELETE /api/v1/fleet/groups/{id}: reparents children
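The cycle check and the reparent-on-delete rule can be illustrated with a plain parent map. This is a sketch under stated assumptions: the dictionary stands in for the persisted group table, and would_form_cycle, set_parent, and delete_group are hypothetical names, not the engine's API.

```python
from typing import Optional

# group_id -> parent_id (None for a root group); a stand-in for the persisted table.
Parents = dict[str, Optional[str]]


def would_form_cycle(parents: Parents, group_id: str, new_parent: Optional[str]) -> bool:
    """Walk up from new_parent; reaching group_id means the assignment closes a loop."""
    current = new_parent
    while current is not None:
        if current == group_id:
            return True
        current = parents.get(current)
    return False


def set_parent(parents: Parents, group_id: str, new_parent: Optional[str]) -> None:
    """Assign a parent, rejecting assignments that would create a cycle."""
    if would_form_cycle(parents, group_id, new_parent):
        raise ValueError(f"{new_parent!r} is {group_id!r} or one of its descendants")
    parents[group_id] = new_parent


def delete_group(parents: Parents, group_id: str) -> None:
    """Remove a group and attach its children to the removed group's parent."""
    grandparent = parents.pop(group_id)
    for child, parent in list(parents.items()):
        if parent == group_id:
            parents[child] = grandparent
```

For example, with root → web → web-eu, making web-eu the parent of root is rejected, and deleting web leaves web-eu directly under root.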
Bulk Operations
Eight operation types: run_script, deploy_file, service_control, install_package, remove_package, reboot, shutdown, and apply_deployment_plan (the meta-op that ships an entire pre-built deploy plan). Each BulkOperation record tracks per-host status (queued / running / succeeded / failed / skipped) and rolls up to an aggregate (queued / running / succeeded / partial / failed / cancelled).
POST /api/v1/fleet/bulk
{
  "operation_type": "install_package",
  "selector": { "platforms": ["linux"], "tags": ["webserver"] },
  "parameters": { "packages": ["nginx"] }
}
GET /api/v1/fleet/bulk/{id}/progress
{"total": 47, "succeeded": 30, "failed": 2, "running": 15, "queued": 0, "percent_complete": 68.1}
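The progress counters and the aggregate status can both be derived from the per-host statuses. The sketch below is illustrative only: treating succeeded, failed, and skipped as "complete" for percent_complete is an assumption (it does reproduce the 68.1 in the example above), and the roll-up rules in aggregate_status are a guess at reasonable semantics, not the engine's documented behaviour.

```python
from collections import Counter

# Assumption: these per-host statuses count as finished work.
TERMINAL = ("succeeded", "failed", "skipped")


def progress_snapshot(host_statuses: dict[str, str]) -> dict:
    """Aggregate per-host statuses into counters plus a completion percentage."""
    counts = Counter(host_statuses.values())
    total = len(host_statuses)
    done = sum(counts[s] for s in TERMINAL)
    percent = round(100.0 * done / total, 1) if total else 0.0
    return {
        "total": total,
        "succeeded": counts["succeeded"],
        "failed": counts["failed"],
        "running": counts["running"],
        "queued": counts["queued"],
        "percent_complete": percent,
    }


def aggregate_status(host_statuses: dict[str, str], cancelled: bool = False) -> str:
    """Roll per-host statuses up to one operation status (assumed rules)."""
    if cancelled:
        return "cancelled"
    counts = Counter(host_statuses.values())
    if counts["queued"] == len(host_statuses):
        return "queued"
    if counts["queued"] or counts["running"]:
        return "running"
    if counts["failed"] == 0:
        return "succeeded"
    if counts["succeeded"] == 0:
        return "failed"
    return "partial"
```

With 30 succeeded, 2 failed, and 15 running hosts, the snapshot reports 32 of 47 hosts finished, which rounds to 68.1 percent.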
Rolling Deployments
A RollingDeployment partitions the resolved host set into batches of batch_size, dispatching one batch at a time with a configurable batch_delay_seconds between them. The engine automatically halts when failures exceed failure_threshold_pct: queued hosts are cancelled and the rollout flips to failed. Operators can pause / resume / cancel mid-flight.
- POST /api/v1/fleet/rolling: create with batch_size + batch_delay + failure_threshold_pct
- GET /api/v1/fleet/rolling/{id}/progress: aggregated counters snapshot
- POST /api/v1/fleet/rolling/{id}/pause, /resume, /cancel
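The batching and halt behaviour can be sketched as a simple loop. This is a simplified, synchronous illustration with several assumptions: run_rolling and the dispatch callback are hypothetical, the failure percentage is measured against the whole resolved host set, and the threshold is checked once per completed batch. Pause, resume, and cancel are omitted.

```python
import time
from typing import Callable


def run_rolling(
    host_ids: list[str],
    dispatch: Callable[[str], bool],  # hypothetical: returns True on success
    batch_size: int,
    batch_delay_seconds: float = 0.0,
    failure_threshold_pct: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[bool, dict[str, str]]:
    """Dispatch hosts batch by batch; return (halted, per-host status)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    results = {host: "queued" for host in host_ids}
    batches = [host_ids[i:i + batch_size] for i in range(0, len(host_ids), batch_size)]
    failed = 0
    for index, batch in enumerate(batches):
        for host in batch:
            ok = dispatch(host)
            results[host] = "succeeded" if ok else "failed"
            failed += 0 if ok else 1
        # Assumption: failures are compared against the full host set after each batch.
        if 100.0 * failed / len(host_ids) > failure_threshold_pct:
            for host, status in results.items():
                if status == "queued":
                    results[host] = "cancelled"
            return True, results
        if index < len(batches) - 1:
            sleep(batch_delay_seconds)
    return False, results
```

Passing sleep as a parameter keeps the sketch testable without real delays; a caller would normally leave the default in place.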
Scheduled Fleet Operations
A ScheduledFleetOperation binds an operation type + selector + cron expression. The selector resolves on each fire, so a schedule that targets tags=[prod] automatically picks up newly-tagged hosts. The engine validates the 5-field cron form on insert and tracks last_run + last_status.
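As an illustration of what validating the 5-field cron form can look like, the sketch below checks field count and numeric ranges. It is not the engine's validator: is_valid_cron is a hypothetical name, only numeric values, ranges, lists, `*`, and `/step` are accepted, and names such as MON or JAN are deliberately not handled. Allowing 7 for day-of-week (as an alias for Sunday) is also an assumption.

```python
import re

# (field name, lowest value, highest value) in standard 5-field order.
FIELD_RANGES = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day_of_month", 1, 31),
    ("month", 1, 12),
    ("day_of_week", 0, 7),  # assumption: 7 is accepted as Sunday
]

# One list element: "*" or "N" or "N-M", optionally followed by "/step".
_PART = re.compile(r"(\*|(\d+)(?:-(\d+))?)(?:/(\d+))?", re.ASCII)


def _valid_part(part: str, low: int, high: int) -> bool:
    match = _PART.fullmatch(part)
    if match is None:
        return False
    base, start, end, step = match.groups()
    if step is not None and int(step) == 0:
        return False
    if base == "*":
        return True
    first = int(start)
    last = int(end) if end is not None else first
    return low <= first <= last <= high


def is_valid_cron(expression: str) -> bool:
    """Return True if the expression has five fields with in-range values."""
    fields = expression.split()
    if len(fields) != len(FIELD_RANGES):
        return False
    return all(
        all(_valid_part(part, low, high) for part in field.split(","))
        for field, (_, low, high) in zip(fields, FIELD_RANGES)
    )
```

Under these rules, `0 3 * * 1-5` (03:00 on weekdays) and `*/15 * * * *` pass, while a four-field expression or a minute value of 60 is rejected.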
Feature Codes
Endpoints in this engine are gated by the following feature codes. When the module isn't loaded, the open-source server answers these endpoints with HTTP 402:
- fleet_groups: groups CRUD + selector resolution
- fleet_bulk_operations: bulk op endpoints
- fleet_rolling_deployments: rolling deployments
- fleet_scheduled_operations: scheduled fleet ops
- fleet_config_deployment: reserved for fleet-wide config deploy
Architecture
The engine is a Cython-compiled binary (fleet_engine.so) loaded by the open-source server's module_loader at startup. It exports the same get_module_info() + get_fleet_router() contract as the other Pro+ modules, so mount_fleet_routes() wires it under /api/v1/fleet alongside the 402 stub routes.