⭐ PRO+

Fleet Engine

Server-side fleet management: bulk operations across selectable host sets, host groups with parent/child hierarchies and dynamic-criteria membership, rolling deployments with batched rollout windows and failure thresholds, scheduled fleet operations, and per-operation progress tracking.

Overview

The Fleet Engine adds operation-at-scale primitives on top of the open-source single-host APIs: a HostSelector filter DSL that resolves to a list of host IDs, persisted BulkOperation records with per-host result tracking, RollingDeployment with configurable batch size and failure-threshold halt, and HostGroup with optional dynamic-criteria membership that re-evaluates on every selector resolution.

Open Source vs Professional+

Community Edition: simple expand-into-per-host plans via backend/services/bulk_op_planner.py (no persistence, no batching, explicit host_ids only)
Professional+: HostSelector filter DSL — by tag, platform, group, approval status, or arbitrary criteria with equals / contains / in / matches ops
Professional+: persistent host groups with parent/child hierarchy and dynamic-criteria membership
Professional+: rolling deployments with batched rollout windows + automatic halt on excess failures
Professional+: persisted BulkOperation records, per-host result tracking, aggregated OperationProgress snapshots
Professional+: cron-triggered scheduled fleet operations

Host Selector DSL

A HostSelector is a compound filter combining explicit host IDs with optional criteria (AND-joined). Convenience shortcuts (platforms, tags, groups, approval_status) compile into equivalent criteria before evaluation:

POST /api/v1/fleet/select
{
  "explicit_host_ids": ["..uuid-special.."],
  "criteria": [
    {"field": "platform_release", "op": "equals", "value": "ubuntu"},
    {"field": "fqdn",             "op": "contains", "value": ".prod."}
  ],
  "platforms": ["linux"],
  "tags": ["webserver"],
  "approval_status": "approved"
}

→ {"host_ids": [...], "count": 47}
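The compile-then-evaluate flow described above can be sketched in a few lines. This is a minimal illustration, not the engine's implementation: the host field names (platform, tags, approval_status), the reading of matches as a regular-expression search, and the choice to union explicit_host_ids with the criteria matches are all assumptions, and the groups shortcut is left out.

```python
import re
from typing import Any

# Op names come from the text above; their exact semantics are assumed here.
OPS = {
    "equals": lambda actual, expected: actual == expected,
    "contains": lambda actual, expected: actual is not None and expected in actual,
    "in": lambda actual, expected: actual in expected,
    "matches": lambda actual, expected: isinstance(actual, str)
    and re.search(expected, actual) is not None,
}


def compile_shortcuts(selector: dict[str, Any]) -> list[dict[str, Any]]:
    """Expand convenience shortcuts into plain criteria (assumed field names)."""
    criteria = list(selector.get("criteria", []))
    if selector.get("platforms"):
        criteria.append({"field": "platform", "op": "in", "value": selector["platforms"]})
    for tag in selector.get("tags", []):
        criteria.append({"field": "tags", "op": "contains", "value": tag})
    if selector.get("approval_status"):
        criteria.append(
            {"field": "approval_status", "op": "equals", "value": selector["approval_status"]}
        )
    return criteria


def resolve(selector: dict[str, Any], hosts: list[dict[str, Any]]) -> dict[str, Any]:
    """AND-join all criteria, then add explicit IDs (union is an assumption)."""
    criteria = compile_shortcuts(selector)
    matched = []
    if criteria:  # an empty criteria list selects nothing rather than everything
        matched = [
            h["id"]
            for h in hosts
            if all(OPS[c["op"]](h.get(c["field"]), c["value"]) for c in criteria)
        ]
    explicit = list(selector.get("explicit_host_ids", []))
    host_ids = list(dict.fromkeys(explicit + matched))  # de-duplicate, keep order
    return {"host_ids": host_ids, "count": len(host_ids)}
```

Treating an empty criteria list as "match nothing" keeps a selector that carries only explicit IDs from silently expanding to the whole inventory; whether the engine makes the same choice is not stated here.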

Host Groups

A HostGroup bundles an explicit host list, dynamic criteria, and an optional parent group. Membership re-evaluates on every resolve_group_hosts call, so newly onboarded hosts that match a group's criteria appear automatically. The engine rejects parent assignments that would form a cycle. When a group is deleted, its children are reparented to the deleted group's parent, so each child moves up exactly one level and the rest of the hierarchy is preserved.

GET /api/v1/fleet/groups — list groups; ?parent_id= filters children
POST /api/v1/fleet/groups — create or update; cycle-checked
GET /api/v1/fleet/groups/{id}/hosts — resolve effective membership against the live inventory
DELETE /api/v1/fleet/groups/{id} — reparents children
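The cycle check and the reparent-on-delete behaviour can be illustrated with a plain parent map. This is a sketch under stated assumptions: the in-memory dict of group id to parent id and the function names would_form_cycle and delete_group are hypothetical, not the engine's storage model or API.

```python
from typing import Optional

# Hypothetical in-memory store: group id -> parent id (None for root groups).
Parents = dict[str, Optional[str]]


def would_form_cycle(parents: Parents, group_id: str, new_parent_id: Optional[str]) -> bool:
    """Walk up from the proposed parent; reaching group_id means a cycle."""
    current = new_parent_id
    seen = set()
    while current is not None and current not in seen:
        if current == group_id:
            return True
        seen.add(current)
        current = parents.get(current)
    return False


def delete_group(parents: Parents, group_id: str) -> None:
    """Remove a group and hand its children to the deleted group's parent."""
    grandparent = parents.pop(group_id)
    for child, parent in parents.items():
        if parent == group_id:
            parents[child] = grandparent  # child moves up one level
```

For example, with root → web → {web-eu, web-us}, assigning web-eu as the parent of root is rejected, and deleting web leaves web-eu and web-us directly under root.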

Bulk Operations

Eight operation types: run_script, deploy_file, service_control, install_package, remove_package, reboot, shutdown, and apply_deployment_plan (the meta-op that ships an entire pre-built deploy plan). Each BulkOperation record tracks per-host status (queued / running / succeeded / failed / skipped) and rolls up to an aggregate (queued / running / succeeded / partial / failed / cancelled).

POST /api/v1/fleet/bulk
{
  "operation_type": "install_package",
  "selector": { "platforms": ["linux"], "tags": ["webserver"] },
  "parameters": { "packages": ["nginx"] }
}

GET /api/v1/fleet/bulk/{id}/progress
{"total": 47, "succeeded": 30, "failed": 2, "running": 15, "queued": 0, "percent_complete": 68.1}
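One way to derive such a snapshot from per-host statuses is sketched below. The rules are assumptions chosen to reproduce the sample numbers: percent_complete counts every host in a terminal state (succeeded, failed, or skipped), and the rollup treats any failure alongside a non-failure as partial. The cancelled aggregate is assumed to be set by an operator action rather than derived, so it does not appear here.

```python
from collections import Counter

# Assumed terminal per-host states.
TERMINAL = ("succeeded", "failed", "skipped")


def progress(host_statuses: dict[str, str]) -> dict:
    """Build a counters snapshot from a host id -> status map."""
    counts = Counter(host_statuses.values())
    total = len(host_statuses)
    done = sum(counts[s] for s in TERMINAL)
    return {
        "total": total,
        "succeeded": counts["succeeded"],
        "failed": counts["failed"],
        "running": counts["running"],
        "queued": counts["queued"],
        "percent_complete": round(100 * done / total, 1) if total else 0.0,
    }


def aggregate_status(host_statuses: dict[str, str]) -> str:
    """Roll per-host statuses up to one aggregate (assumed rules)."""
    statuses = set(host_statuses.values())
    if not statuses or statuses == {"queued"}:
        return "queued"
    if statuses & {"queued", "running"}:
        return "running"
    if "failed" not in statuses:
        return "succeeded"
    if statuses == {"failed"}:
        return "failed"
    return "partial"
```

With 30 succeeded, 2 failed, and 15 running hosts, progress yields 32 of 47 complete, which rounds to the 68.1 shown in the sample response.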

Rolling Deployments

A RollingDeployment partitions the resolved host set into batches of batch_size, dispatching one batch at a time with a configurable batch_delay_seconds between them. The engine automatically halts when failures exceed failure_threshold_pct: queued hosts are cancelled and the rollout flips to failed. Operators can pause / resume / cancel mid-flight.

POST /api/v1/fleet/rolling — create with batch_size + batch_delay_seconds + failure_threshold_pct
GET /api/v1/fleet/rolling/{id}/progress — aggregated counters snapshot
POST /api/v1/fleet/rolling/{id}/pause (likewise /resume and /cancel)
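The batching and halt logic might look roughly like the following. It is a simplified, synchronous sketch: run_rolling, the dispatch callback, and the injectable sleep are hypothetical names, the failure percentage is assumed to be measured against the whole resolved host set and checked after each batch, and pause / resume handling is omitted.

```python
import time
from typing import Callable


def run_rolling(
    host_ids: list[str],
    batch_size: int,
    failure_threshold_pct: float,
    dispatch: Callable[[str], bool],  # hypothetical: True when the host succeeds
    batch_delay_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[bool, dict[str, str]]:
    """Dispatch batch by batch; return (halted, per-host results)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    results = {h: "queued" for h in host_ids}
    failed = 0
    for start in range(0, len(host_ids), batch_size):
        for host in host_ids[start:start + batch_size]:
            ok = dispatch(host)
            results[host] = "succeeded" if ok else "failed"
            failed += 0 if ok else 1
        # Assumption: threshold is a percentage of the full host set.
        if 100 * failed / len(host_ids) > failure_threshold_pct:
            for host, status in results.items():
                if status == "queued":
                    results[host] = "cancelled"
            return True, results
        if start + batch_size < len(host_ids):
            sleep(batch_delay_seconds)  # wait only between batches
    return False, results
```

Passing sleep as a parameter keeps the sketch testable without real delays; a production scheduler would more likely persist the next batch time than block a worker.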

Scheduled Fleet Operations

A ScheduledFleetOperation binds an operation type + selector + cron expression. The selector resolves on each fire, so a schedule that targets tags=[prod] automatically picks up newly-tagged hosts. The engine validates the 5-field cron form on insert and tracks last_run + last_status.
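A check of the 5-field form could be sketched as below. This is an assumption-laden illustration, not the engine's validator: it accepts only numeric values, *, ranges, comma lists, and /step suffixes, it allows 0 through 7 for day-of-week, and it rejects names such as MON or JAN that some cron dialects accept.

```python
import re

# (min, max) for minute, hour, day-of-month, month, day-of-week.
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 7)]
# One comma-separated part: "*", "N", or "N-M", each with an optional "/step".
_PART = re.compile(r"^(\*|\d+(?:-\d+)?)(?:/(\d+))?$")


def is_valid_cron(expr: str) -> bool:
    """Return True when expr is a well-formed numeric 5-field cron expression."""
    fields = expr.split()
    if len(fields) != 5:
        return False
    for field, (lo, hi) in zip(fields, FIELD_RANGES):
        for part in field.split(","):
            match = _PART.match(part)
            if not match:
                return False
            base, step = match.groups()
            if step is not None and int(step) == 0:
                return False  # a zero step would never advance
            if base != "*":
                bounds = [int(n) for n in base.split("-")]
                if any(not lo <= n <= hi for n in bounds):
                    return False
                if len(bounds) == 2 and bounds[0] > bounds[1]:
                    return False  # reversed range such as 5-1
    return True
```

Under these rules "0 3 * * 1" and "*/15 * * * *" pass, while a four-field or six-field expression, or a minute value of 60, is rejected.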

Feature Codes

Endpoints in this engine are gated by these feature codes; the open-source server returns 402 for a gated endpoint when the module isn't loaded:

fleet_groups — groups CRUD + selector resolution
fleet_bulk_operations — bulk op endpoints
fleet_rolling_deployments — rolling deployments
fleet_scheduled_operations — scheduled fleet ops
fleet_config_deployment — reserved for fleet-wide config deploy
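The gating can be pictured as a lookup from route prefix to feature code. The sketch below is illustrative only: the ROUTE_FEATURES table and the required_feature and gate helpers are hypothetical, the mapping covers only the paths shown in this document, and how the real server tracks loaded features is not described here.

```python
from http import HTTPStatus
from typing import Optional

# Hypothetical prefix -> feature code table, limited to paths in this document.
ROUTE_FEATURES = {
    "/api/v1/fleet/groups": "fleet_groups",
    "/api/v1/fleet/select": "fleet_groups",
    "/api/v1/fleet/bulk": "fleet_bulk_operations",
    "/api/v1/fleet/rolling": "fleet_rolling_deployments",
}


def required_feature(path: str) -> Optional[str]:
    """Return the feature code guarding a path, or None if it is ungated."""
    for prefix, feature in ROUTE_FEATURES.items():
        if path == prefix or path.startswith(prefix + "/"):
            return feature
    return None


def gate(path: str, loaded_features: set[str]) -> Optional[HTTPStatus]:
    """Return 402 when the guarding feature is not loaded, else None."""
    feature = required_feature(path)
    if feature is not None and feature not in loaded_features:
        return HTTPStatus.PAYMENT_REQUIRED
    return None
```

A None result means the request may proceed to its handler; a stub route on the open-source server would instead always take the 402 branch.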

Architecture

The engine is a Cython-compiled binary (fleet_engine.so) loaded by the open-source server's module_loader at startup. It exports the same get_module_info() + get_fleet_router() contract as the other Pro+ modules, so mount_fleet_routes() wires it under /api/v1/fleet alongside the 402 stub routes.