Observability Engine
Fleet-wide telemetry: OpenTelemetry collector deployment + lifecycle, Graylog log-forwarder sidecar attachment, and Grafana provisioning. Generates per-platform deployment plans (systemd / launchd / sysvinit / Windows services / FreeBSD rc.d) so a single server-side configuration rolls out consistently across mixed OS fleets.
Overview
The Observability Engine owns the deployment, configuration, and routing of three telemetry surfaces: the OpenTelemetry collector running on each agent (metrics + traces), the Graylog log-forwarder sidecar (syslog / file / journald shipping), and Grafana (dashboards + datasources auto-provisioned for SysManage's Postgres data). Per-host plans are generated from server-side templates so the agent never has to know how to configure these services.
Tier & Licensing
- Enterprise tier: full observability deployment.
- Community Edition: read-only telemetry status (is OTEL running? is Graylog attached?). Deploy and configure actions are not available.
Telemetry Surfaces
OpenTelemetry Collector
A per-host OTEL collector that scrapes node-level metrics (CPU, memory, disk, network), forwards traces from instrumented apps, and ships them to a configurable backend. The engine generates platform-specific install bundles — binary + config + service unit — for Linux (systemd), FreeBSD (rc.d), OpenBSD (rc.d), NetBSD (rc.d), macOS (launchd), and Windows (Windows service).
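The platform-to-service-manager mapping described above can be sketched as a simple lookup. This is an illustrative sketch only: the names `SERVICE_MANAGERS`, `InstallBundle`, and `build_bundle` are hypothetical and do not come from the engine's code; only the platform list and the bundle contents (binary, config, service unit) are taken from the text.

```python
from dataclasses import dataclass

# Service manager per platform, as listed in the paragraph above.
# The dictionary keys and value spellings are illustrative assumptions.
SERVICE_MANAGERS = {
    "linux": "systemd",
    "freebsd": "rc.d",
    "openbsd": "rc.d",
    "netbsd": "rc.d",
    "macos": "launchd",
    "windows": "windows-service",
}


@dataclass(frozen=True)
class InstallBundle:
    """Hypothetical description of one per-platform install bundle."""
    platform: str
    service_manager: str
    components: tuple


def build_bundle(platform: str) -> InstallBundle:
    """Look up the service manager for a platform and describe its bundle."""
    key = platform.lower()
    if key not in SERVICE_MANAGERS:
        raise ValueError(f"unsupported platform: {platform}")
    # Every bundle carries the same three parts named in the text.
    return InstallBundle(key, SERVICE_MANAGERS[key], ("binary", "config", "service unit"))
```

Calling `build_bundle("FreeBSD")` yields a bundle whose `service_manager` is `rc.d`; an unlisted platform raises `ValueError` rather than guessing.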
Graylog Log Forwarder
A lightweight log-shipper sidecar that attaches to a Graylog input. Sources are configurable per host: syslog files, journald, application log files, or Windows Event Log. A host can be detached and re-attached without restarting the SysManage agent.
Grafana Provisioning
Connect a Grafana instance and the engine auto-provisions its Postgres datasource (read-only credentials), an admin folder, and a starter dashboard pack: host inventory, package compliance, vulnerability status, agent connectivity, and update / OS-upgrade activity. Custom dashboards added through Grafana's UI are not touched by re-provisioning.
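One way to read the guarantee that custom dashboards survive re-provisioning is that the engine only ever writes dashboards it owns. The sketch below assumes ownership is marked by a `sysmanage-` uid prefix; that prefix, the function name `reprovision`, and the dict-of-dashboards shape are all hypothetical and chosen only to illustrate the idea.

```python
MANAGED_PREFIX = "sysmanage-"  # hypothetical marker for engine-owned dashboards


def reprovision(existing: dict, pack: dict) -> dict:
    """Return the dashboard set after re-provisioning.

    `existing` and `pack` map dashboard uid -> definition. Only uids
    carrying the managed prefix may be written; everything else in
    `existing` is passed through unchanged.
    """
    for uid in pack:
        if not uid.startswith(MANAGED_PREFIX):
            raise ValueError(f"pack dashboard {uid!r} lacks the managed prefix")
    merged = dict(existing)   # copy, so custom dashboards are never mutated
    merged.update(pack)       # overwrite or add engine-owned dashboards only
    return merged
```

With this shape, a dashboard created through Grafana's UI under any other uid is left exactly as it was, while the starter pack entries are refreshed in place.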
Open Source vs Professional+
- Community Edition: agent reports OTEL service status and Graylog attach state via existing status messages so the host details page shows the current connectivity.
- Professional+: deploy, start, stop, restart, and uninstall the OTEL collector across the fleet from one server-side configuration.
- Professional+: attach hosts to a Graylog input with one click; configure source paths and TLS per host.
- Professional+: connect Grafana, auto-provision the SysManage datasource + dashboards.
OTEL Deploy Flow
- User selects target hosts (singly or via fleet selector).
- User picks the backend (e.g. OTLP HTTP endpoint, OTLP gRPC, Prometheus remote-write).
- Engine generates the per-platform deployment plan: download the otelcol binary, drop the rendered config.yml in /etc/otelcol/ (or platform equivalent), install the service unit, enable + start.
- Agent runs the plan via apply_deployment_plan; per-step result rolls up into deployment status.
- Status sweep runs every minute and refreshes the per-host OTEL state; the host details page renders it.
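The "per-step result rolls up into deployment status" behaviour can be sketched as follows. This is not the agent's actual apply_deployment_plan; `Step`, `StepResult`, and `run_plan_sketch` are hypothetical names, and stopping at the first failed step is an assumption the text does not confirm.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    run: Callable[[], bool]  # returns True on success


@dataclass
class StepResult:
    name: str
    ok: bool
    error: str = ""


def run_plan_sketch(steps: List[Step]) -> dict:
    """Run steps in order and roll the per-step results into one status."""
    results: List[StepResult] = []
    for step in steps:
        try:
            ok, error = bool(step.run()), ""
        except Exception as exc:  # a raising step counts as a failed step
            ok, error = False, str(exc)
        results.append(StepResult(step.name, ok, error))
        if not ok:
            break  # assumption: later steps are skipped after a failure
    all_ran = len(results) == len(steps)
    status = "succeeded" if all_ran and all(r.ok for r in results) else "failed"
    return {"status": status, "steps": results}
```

A plan whose second step fails reports `failed` and lists only the two steps that actually ran, which is the kind of detail a deployment status view would need to show where a rollout stopped.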
Using the UI
From a host's Host Details page:
- Deploy OpenTelemetry action triggers the per-host install plan; Start / Restart / Stop OpenTelemetry Service drive lifecycle.
- Connect Host to Graylog opens the Graylog attach dialog; the engine plan installs and configures the log forwarder.
- Enable Grafana Integration on Settings configures the datasource + provisions dashboards.
API Endpoints
- POST /api/v1/observability/otel/deploy: deploy the collector to one or more hosts
- POST /api/v1/observability/otel/{action}: lifecycle (start / stop / restart / uninstall)
- GET /api/v1/observability/otel/status: per-host OTEL service status
- POST /api/v1/observability/graylog/attach: attach a host (or set of hosts) to a Graylog input
- POST /api/v1/observability/graylog/detach: detach
- GET /api/v1/observability/graylog/status: per-host attach state + source list
- POST /api/v1/observability/grafana/connect: register a Grafana instance and provision dashboards
- GET /api/v1/observability/grafana/dashboards: list provisioned dashboards
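As a rough illustration of calling the deploy endpoint, the sketch below builds (but does not send) a request with Python's standard library. Only the path comes from the list above. The request body fields `host_ids` and `backend`, the bearer-token header, and the base URL are assumptions made for the example; check the actual API schema before relying on any of them.

```python
import json
from urllib.request import Request


def build_deploy_request(base_url: str, token: str, host_ids, backend: dict) -> Request:
    """Build a POST request for the OTEL deploy endpoint (not sent here)."""
    # Hypothetical payload shape: the real field names are not documented above.
    body = json.dumps({"host_ids": list(host_ids), "backend": backend}).encode("utf-8")
    return Request(
        url=base_url.rstrip("/") + "/api/v1/observability/otel/deploy",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; keeping construction separate makes the request easy to inspect or log first.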
Required Permissions
- Deploy OpenTelemetry, Start OpenTelemetry Service, Stop OpenTelemetry Service, Restart OpenTelemetry Service
- Connect Host to Graylog, Enable Graylog Integration
- Enable Grafana Integration
Troubleshooting
- If the host shows OTEL as not running after a successful deploy, the displayed state may simply be stale: the engine's status sweep takes up to ~60s to refresh. Force a re-check via the host details page's refresh action.
- Graylog over TLS requires the agent host's CA bundle to trust the Graylog certificate; the attach dialog accepts a custom CA file path which the engine deploys to the agent.
- Grafana provisioning uses the read-only Postgres credentials configured under observability.grafana.datasource in /etc/sysmanage.yaml; rotate via the Grafana datasource UI rather than re-running the connect flow.
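For scripted checks, the stale-status case in the first item above amounts to polling for slightly longer than one sweep interval before treating a deploy as failed. The sketch below assumes a caller-supplied `fetch_status` function and a status string of `"running"`; both are hypothetical, as are the default timeout and poll interval.

```python
import time
from typing import Callable

SWEEP_INTERVAL_S = 60  # the status sweep runs every minute


def wait_for_running(
    fetch_status: Callable[[], str],
    timeout_s: float = SWEEP_INTERVAL_S + 10,  # one sweep plus some slack
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the status reads "running" or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        if fetch_status() == "running":  # hypothetical status value
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

The `sleep` and `clock` parameters exist so the helper can be exercised in tests without real waiting.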