Reboot Orchestration Module
Safe parent host reboot with automatic child host orchestration. Cleanly stops running containers before reboot, then automatically restarts them after the host comes back online.
Overview
The Reboot Orchestration module provides a safe, automated workflow for rebooting parent hosts that have running child hosts (WSL or LXD containers). Instead of abruptly rebooting and risking data loss in containers, the orchestrator follows a controlled sequence to ensure all children are cleanly stopped before the reboot and automatically restarted afterward.
Open Source vs Professional+
The open-source Community Edition includes a pre-check endpoint that warns about running child hosts before a reboot. The Professional+ Reboot Orchestration module adds full automated orchestration:
- Community Edition: Pre-check warning shows running child hosts before reboot
- Professional+: Automated stop, reboot, and restart orchestration
- Professional+: Real-time orchestration status tracking with timestamps
- Professional+: Automatic child host recovery after parent reboot
Orchestration State Machine
Each orchestrated reboot progresses through the following states, in order:
shutting_down → rebooting → pending_restart → restarting → completed | failed
- shutting_down — Stop commands sent to all running child hosts; waiting for confirmation
- rebooting — All children stopped; reboot command issued to parent host
- pending_restart — Parent host is rebooting; waiting for agent to reconnect
- restarting — Agent reconnected; start commands sent for previously-running children
- completed — All child hosts successfully restarted
- failed — One or more child hosts failed to restart; error details available
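The state sequence above can be sketched as a small transition table. This is an illustrative Python model, not the module's implementation: the class and function names are hypothetical, and the table assumes that "failed" is reached only from "restarting", as the diagram shows. The real service may also fail from earlier states.

```python
from enum import Enum


class OrchestrationState(str, Enum):
    SHUTTING_DOWN = "shutting_down"
    REBOOTING = "rebooting"
    PENDING_RESTART = "pending_restart"
    RESTARTING = "restarting"
    COMPLETED = "completed"
    FAILED = "failed"


S = OrchestrationState

# Forward transitions read off the arrow diagram.
# Assumption: "failed" is only an outcome of "restarting".
ALLOWED_TRANSITIONS = {
    S.SHUTTING_DOWN: {S.REBOOTING},
    S.REBOOTING: {S.PENDING_RESTART},
    S.PENDING_RESTART: {S.RESTARTING},
    S.RESTARTING: {S.COMPLETED, S.FAILED},
    S.COMPLETED: set(),
    S.FAILED: set(),
}


def can_transition(current: OrchestrationState, target: OrchestrationState) -> bool:
    """Return True if the diagram allows moving from current to target."""
    return target in ALLOWED_TRANSITIONS[current]


def is_terminal(state: OrchestrationState) -> bool:
    """A state with no outgoing transitions ends the orchestration."""
    return not ALLOWED_TRANSITIONS[state]
```

A client can use is_terminal(OrchestrationState(status)) to decide when to stop tracking an orchestration.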
API Reference
The Reboot Orchestration module provides three API endpoints:
Reboot Pre-Check (Open Source)
/api/host/{host_id}/reboot/pre-check
Returns information about running child hosts so the user can make an informed decision before rebooting. Available in the open-source Community Edition.
Path Parameters
host_id (string): UUID of the parent host
Response (200 OK)
{
"has_running_children": true,
"running_children": [
{
"id": "uuid",
"child_name": "web-container-01",
"child_type": "lxd",
"status": "running"
}
],
"running_count": 1,
"total_children": 3,
"has_container_engine": true
}
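As a rough illustration of how a client might turn this response into a warning message, the Python sketch below parses the sample payload shown above. The function name summarize_pre_check and the wording of the message are hypothetical; only the field names come from the response.

```python
import json

# The sample pre-check response from this section.
SAMPLE_PRE_CHECK = """
{
  "has_running_children": true,
  "running_children": [
    {"id": "uuid", "child_name": "web-container-01",
     "child_type": "lxd", "status": "running"}
  ],
  "running_count": 1,
  "total_children": 3,
  "has_container_engine": true
}
"""


def summarize_pre_check(payload: dict) -> str:
    """Build a one-line summary of running child hosts (hypothetical helper)."""
    if not payload.get("has_running_children"):
        return "No running child hosts."
    names = ", ".join(
        f"{child['child_name']} ({child['child_type']})"
        for child in payload["running_children"]
    )
    return (
        f"{payload['running_count']} of {payload['total_children']} "
        f"child hosts running: {names}"
    )


if __name__ == "__main__":
    print(summarize_pre_check(json.loads(SAMPLE_PRE_CHECK)))
```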
Orchestrated Reboot (Pro+)
/api/host/{host_id}/reboot/orchestrated
Initiates an orchestrated reboot sequence. Stops all running child hosts, reboots the parent, and automatically restarts children after the agent reconnects. Requires a Professional+ license with the container_engine module.
Path Parameters
host_id (string): UUID of the parent host
Response (200 OK)
{
"orchestration_id": "uuid",
"status": "shutting_down",
"child_count": 2
}
Error Responses
- 402: Professional+ license required. Upgrade to access orchestrated reboot.
- 400: Host is not active, or no running child hosts found.
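A caller has to distinguish the success response from the two documented error codes. The Python sketch below shows one way to do that once the HTTP status code and decoded JSON body are in hand. The exception classes and the function name are hypothetical, and because the shape of the error body is not documented here, the sketch does not read it.

```python
class LicenseRequiredError(Exception):
    """Raised for HTTP 402: a Professional+ license is required."""


class OrchestrationNotPossibleError(Exception):
    """Raised for HTTP 400: host not active, or no running child hosts."""


def interpret_orchestrated_reboot(status_code: int, body: dict) -> dict:
    """Map a response from the orchestrated reboot endpoint to a result or error.

    Hypothetical helper: only the status codes and the success fields
    come from the documentation above.
    """
    if status_code == 200:
        return {
            "orchestration_id": body["orchestration_id"],
            "status": body["status"],
            "child_count": body["child_count"],
        }
    if status_code == 402:
        raise LicenseRequiredError("Professional+ license required")
    if status_code == 400:
        raise OrchestrationNotPossibleError(
            "Host is not active, or no running child hosts found"
        )
    # Any other code is undocumented in this section.
    raise RuntimeError(f"Unexpected status code: {status_code}")
```

On a 400 response, a client could fall back to the pre-check endpoint to show the user which condition was not met.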
Orchestration Status
/api/host/{host_id}/reboot/orchestration/{orchestration_id}
Returns the current status of a reboot orchestration, including timestamps for each phase and per-child restart status.
Path Parameters
host_id (string): UUID of the parent host
orchestration_id (string): UUID of the orchestration record
Response (200 OK)
{
"orchestration_id": "uuid",
"parent_host_id": "uuid",
"status": "completed",
"child_hosts_snapshot": [
{
"id": "uuid",
"child_name": "web-container-01",
"child_type": "lxd",
"pre_reboot_status": "running"
}
],
"child_hosts_restart_status": [
{
"id": "uuid",
"child_name": "web-container-01",
"restart_status": "running",
"error": null
}
],
"shutdown_timeout_seconds": 120,
"initiated_by": "admin",
"initiated_at": "2025-01-15T10:30:00",
"shutdown_completed_at": "2025-01-15T10:31:15",
"reboot_issued_at": "2025-01-15T10:31:15",
"agent_reconnected_at": "2025-01-15T10:33:45",
"restart_completed_at": "2025-01-15T10:34:20",
"error_message": null
}
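The per-phase timestamps make it straightforward to compute how long each stage took. The Python sketch below does this for the field names in the response above. It assumes the timestamps are ISO 8601 strings without a time zone, as in the sample, and that a phase not yet reached has a null timestamp; both are assumptions, and phase_durations is a hypothetical helper.

```python
from datetime import datetime

# (phase name, start field, end field), using the response field names.
PHASES = [
    ("shutdown", "initiated_at", "shutdown_completed_at"),
    ("reboot", "reboot_issued_at", "agent_reconnected_at"),
    ("restart", "agent_reconnected_at", "restart_completed_at"),
]


def phase_durations(status: dict) -> dict:
    """Return seconds spent in each phase, or None if a phase is incomplete."""
    durations = {}
    for name, start_key, end_key in PHASES:
        start, end = status.get(start_key), status.get(end_key)
        if start is None or end is None:
            # Assumption: unreached phases carry null timestamps.
            durations[name] = None
            continue
        delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
        durations[name] = delta.total_seconds()
    return durations


if __name__ == "__main__":
    sample = {
        "initiated_at": "2025-01-15T10:30:00",
        "shutdown_completed_at": "2025-01-15T10:31:15",
        "reboot_issued_at": "2025-01-15T10:31:15",
        "agent_reconnected_at": "2025-01-15T10:33:45",
        "restart_completed_at": "2025-01-15T10:34:20",
    }
    print(phase_durations(sample))
```

For the sample response, this gives 75 seconds for shutdown, 150 seconds for the reboot, and 35 seconds for the restart.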
Frontend Behavior
When a user reboots a parent host that has running child hosts, the frontend behaves as follows:
- The reboot dialog shows a pre-check summary of running child hosts
- With Pro+, an "Orchestrated Reboot" button replaces the standard reboot for hosts with children
- A progress indicator polls the orchestration status endpoint to show real-time state transitions
- A notification is shown when the orchestration completes or if any children fail to restart
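The polling behavior described above can be sketched as a loop that fetches the status until it reaches "completed" or "failed". This Python version illustrates the logic only and is not the frontend's code. The fetch_status callable stands in for a request to the orchestration status endpoint, and the interval and poll limit are arbitrary assumptions.

```python
import time
from typing import Callable, Optional

TERMINAL_STATES = {"completed", "failed"}


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 2.0,          # assumed polling interval
    max_polls: int = 300,                   # assumed upper bound
    on_change: Optional[Callable[[str], None]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the orchestration reaches a terminal state.

    fetch_status is a hypothetical callable returning the decoded
    status response; on_change is called once per state transition.
    """
    last_state = None
    for _ in range(max_polls):
        status = fetch_status()
        state = status["status"]
        if state != last_state:
            if on_change is not None:
                on_change(state)
            last_state = state
        if state in TERMINAL_STATES:
            return status
        sleep(interval_seconds)
    raise TimeoutError("orchestration did not finish within the poll limit")
```

Passing sleep as a parameter keeps the loop testable without real delays. The on_change callback is where a progress indicator or notification would be updated.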
Timeout and Error Handling
The orchestration service includes built-in safeguards to handle failures at any stage:
- Shutdown timeout: If child hosts do not stop within the configured timeout (default 120 seconds), the reboot proceeds anyway
- Agent reconnect: The orchestration moves from "pending_restart" to "restarting" when the parent's agent sends its first heartbeat after the reboot
- Partial failure: If some children fail to restart, the orchestration ends in the "failed" state with an error message indicating which children failed
- Audit logging: All orchestration events are recorded in the audit log for compliance and troubleshooting
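Two of the safeguards above lend themselves to a short sketch: the shutdown wait that gives up after a timeout, and the identification of children that failed to restart. The Python below is illustrative only. The function names are hypothetical, count_running stands in for whatever the service uses to check child state, and treating a non-null "error" field as a failed restart is an assumption.

```python
import time
from typing import Callable, List


def wait_for_children_to_stop(
    count_running: Callable[[], int],
    timeout_seconds: float = 120.0,   # documented default timeout
    poll_seconds: float = 1.0,        # assumed check interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Wait for all children to stop.

    Returns True if every child stopped in time, False on timeout.
    Either way the caller goes on to issue the reboot.
    """
    deadline = clock() + timeout_seconds
    while True:
        if count_running() == 0:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)


def failed_children(status: dict) -> List[str]:
    """Names of children whose restart reported an error.

    Assumption: a non-null "error" marks a failed restart.
    """
    entries = status.get("child_hosts_restart_status") or []
    return [entry["child_name"] for entry in entries if entry.get("error")]
```

The return value of wait_for_children_to_stop is useful for logging whether the shutdown was clean, since the reboot proceeds in both cases.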
Requirements
- Professional or Enterprise license with container_engine module enabled
- Parent host must have at least one running child host (WSL or LXD)
- User must have the REBOOT_HOST security role
- SysManage agent running on the parent host with active connectivity