⭐ PRO+

Observability Engine

Fleet-wide telemetry: OpenTelemetry collector deployment + lifecycle, Graylog log-forwarder sidecar attachment, and Grafana provisioning. Generates per-platform deployment plans (systemd / launchd / sysvinit / Windows services / FreeBSD rc.d) so a single server-side configuration rolls out consistently across mixed OS fleets.

Overview

The Observability Engine owns the deployment, configuration, and routing of three telemetry surfaces: the OpenTelemetry collector running on each agent (metrics + traces), the Graylog log-forwarder sidecar (syslog / file / journald shipping), and Grafana (dashboards + datasources auto-provisioned for SysManage's Postgres data). Per-host plans are generated from server-side templates so the agent never has to know how to configure these services.

Tier & Licensing

  • Enterprise tier: full observability deployment.
  • Community Edition: read-only telemetry status (is OTEL running? is Graylog attached?). Deployment and configuration are not available.

Telemetry Surfaces

OpenTelemetry Collector

A per-host OTEL collector that scrapes node-level metrics (CPU, memory, disk, network), forwards traces from instrumented apps, and ships them to a configurable backend. The engine generates platform-specific install bundles — binary + config + service unit — for Linux (systemd), FreeBSD (rc.d), OpenBSD (rc.d), NetBSD (rc.d), macOS (launchd), and Windows (Windows service).
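As a rough illustration of what a rendered collector configuration might contain, the sketch below builds one as a plain Python dict: node-level metrics scraped by the `hostmetrics` receiver, traces accepted over OTLP, and both shipped to one backend. The function name `render_collector_config`, the example endpoint, and the choice of the `otlphttp` exporter are assumptions made for illustration, not the engine's actual template. The `hostmetrics` receiver ships with the collector's contrib distribution.

```python
import json


def render_collector_config(backend_endpoint: str, interval: str = "60s") -> dict:
    """Build a minimal collector config as a dict (hypothetical helper)."""
    return {
        "receivers": {
            # Node-level metrics named in the text: CPU, memory, disk, network.
            "hostmetrics": {
                "collection_interval": interval,
                "scrapers": {"cpu": {}, "memory": {}, "disk": {}, "network": {}},
            },
            # Traces pushed by instrumented apps on the same host.
            "otlp": {"protocols": {"grpc": {}, "http": {}}},
        },
        # Assumed backend type: an OTLP HTTP endpoint.
        "exporters": {"otlphttp": {"endpoint": backend_endpoint}},
        "service": {
            "pipelines": {
                "metrics": {"receivers": ["hostmetrics"], "exporters": ["otlphttp"]},
                "traces": {"receivers": ["otlp"], "exporters": ["otlphttp"]},
            }
        },
    }


# Placeholder endpoint, not a real backend.
config = render_collector_config("https://otel.example.internal:4318")
print(json.dumps(config, indent=2))
```

Keeping the config as data until the last moment makes it straightforward to swap the exporter when a different backend is selected.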

Graylog Log Forwarder

A lightweight log-shipper sidecar that attaches to a Graylog input. Sources are configurable per host: syslog files, journald, application log files, or Windows Event Log. Hosts can be detached and re-attached without restarting the SysManage agent.

Grafana Provisioning

Connect a Grafana instance and the engine auto-provisions its Postgres datasource (read-only credentials), an admin folder, and a starter dashboard pack: host inventory, package compliance, vulnerability status, agent connectivity, and update / OS-upgrade activity. Custom dashboards added through Grafana's UI are not touched by re-provisioning.
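For orientation, a read-only Postgres datasource expressed in the shape of Grafana's datasource provisioning format could look like the sketch below. Whether the engine provisions through files or through Grafana's HTTP API is not stated here, so this is only an assumption about the resulting definition. The helper name `build_postgres_datasource`, the datasource name, and every connection value are placeholders.

```python
import json


def build_postgres_datasource(host: str, database: str, user: str, password: str) -> dict:
    """Return a datasource definition shaped like Grafana's provisioning format.

    Hypothetical helper: the engine's real output is not documented here.
    """
    return {
        "apiVersion": 1,
        "datasources": [
            {
                "name": "SysManage",      # assumed display name
                "type": "postgres",
                "access": "proxy",
                "url": host,              # host:port of the Postgres server
                "user": user,             # should be a read-only role
                "jsonData": {"database": database, "sslmode": "require"},
                "secureJsonData": {"password": password},
                "editable": False,        # discourage edits through the UI
            }
        ],
    }


# Placeholder credentials for illustration only.
provisioning = build_postgres_datasource(
    "db.example.internal:5432", "sysmanage", "grafana_ro", "change-me"
)
print(json.dumps(provisioning, indent=2))
```

The password sits under `secureJsonData` so that Grafana stores it encrypted rather than alongside the plain settings.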

Open Source vs Professional+

  • Community Edition: agent reports OTEL service status and Graylog attach state via existing status messages so the host details page shows the current connectivity.
  • Professional+: deploy, start, stop, restart, and uninstall the OTEL collector across the fleet from one server-side configuration.
  • Professional+: attach hosts to a Graylog input with one click; configure source paths and TLS per host.
  • Professional+: connect Grafana, auto-provision the SysManage datasource + dashboards.

OTEL Deploy Flow

  1. User selects target hosts (singly or via fleet selector).
  2. User picks the backend (e.g. OTLP HTTP endpoint, OTLP gRPC, Prometheus remote-write).
  3. Engine generates the per-platform deployment plan: download the otelcol binary, drop the rendered config.yml in /etc/otelcol/ (or platform equivalent), install the service unit, enable + start.
  4. Agent runs the plan via apply_deployment_plan; per-step results roll up into the deployment status.
  5. Status sweep runs every minute and refreshes the per-host OTEL state; the host details page renders it.
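The plan generation and result roll-up in steps 3 and 4 can be sketched roughly as follows. The platform-to-service-manager mapping comes from the platform list earlier on this page; the `Step` type, the step names, the status strings, and the `build_plan` and `rollup` helpers are hypothetical and do not reflect the engine's real plan schema.

```python
from dataclasses import dataclass

# Service manager per platform, as listed for the OTEL install bundles.
SERVICE_MANAGERS = {
    "linux": "systemd",
    "freebsd": "rc.d",
    "openbsd": "rc.d",
    "netbsd": "rc.d",
    "macos": "launchd",
    "windows": "windows-service",
}


@dataclass
class Step:
    action: str  # hypothetical step name
    detail: str  # what the step operates on


def build_plan(platform: str, config_dir: str = "/etc/otelcol") -> list[Step]:
    """Build the ordered install steps for one host (illustrative only)."""
    manager = SERVICE_MANAGERS[platform]  # KeyError for unsupported platforms
    return [
        Step("download_binary", "otelcol"),
        Step("write_config", f"{config_dir}/config.yml"),
        Step("install_service", manager),
        Step("enable_and_start", manager),
    ]


def rollup(step_results: list[bool]) -> str:
    """Collapse per-step results into one status (assumed status names)."""
    return "succeeded" if all(step_results) else "failed"


plan = build_plan("freebsd")
for step in plan:
    print(step.action, "->", step.detail)
print(rollup([True, True, True, True]))
```

The config directory is a parameter because the platform equivalent of /etc/otelcol/ differs per OS; callers would pass the appropriate path.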

Using the UI

From a host's Host Details page:

  • The Deploy OpenTelemetry action triggers the per-host install plan; the Start / Restart / Stop OpenTelemetry Service actions drive the service lifecycle.
  • Connect Host to Graylog opens the Graylog attach dialog; the resulting engine plan installs and configures the log forwarder.
  • Enable Grafana Integration (under Settings) configures the datasource and provisions dashboards.

API Endpoints

  • POST /api/v1/observability/otel/deploy — deploy the collector to one or more hosts
  • POST /api/v1/observability/otel/{action} — lifecycle: start / stop / restart / uninstall
  • GET /api/v1/observability/otel/status — per-host OTEL service status
  • POST /api/v1/observability/graylog/attach — attach a host (or set of hosts) to a Graylog input
  • POST /api/v1/observability/graylog/detach — detach a host (or set of hosts) from a Graylog input
  • GET /api/v1/observability/graylog/status — per-host attach state + source list
  • POST /api/v1/observability/grafana/connect — register a Grafana instance and provision dashboards
  • GET /api/v1/observability/grafana/dashboards — list provisioned dashboards
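A call to the deploy endpoint might be assembled as in the sketch below, using only the Python standard library. The endpoint path is taken from the list above; the request body fields (`host_ids`, `backend`), the bearer-token header, and the base URL are assumptions, since the request schema is not documented here. The request is built but not sent.

```python
import json
import urllib.request


def build_deploy_request(base_url: str, token: str, host_ids: list, backend: dict):
    """Assemble a POST to the OTEL deploy endpoint (body fields are assumed)."""
    body = json.dumps({"host_ids": host_ids, "backend": backend}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/observability/otel/deploy",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
    )


# Placeholder server, token, and host identifiers.
req = build_deploy_request(
    "https://sysmanage.example.internal/",
    "example-token",
    ["host-1", "host-2"],
    {"type": "otlp_http", "endpoint": "https://otel.example.internal:4318"},
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Separating request construction from sending keeps the payload easy to inspect before it reaches a live server.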

Required Permissions

  • Deploy OpenTelemetry, Start OpenTelemetry Service, Stop OpenTelemetry Service, Restart OpenTelemetry Service
  • Connect Host to Graylog, Enable Graylog Integration
  • Enable Grafana Integration

Troubleshooting

  • If the host shows OTEL as not running after a successful deploy, the displayed state may simply be stale: the engine's status sweep runs every minute, so a refresh can take up to ~60s. To force a re-check, use the refresh action on the host details page.
  • Graylog over TLS requires the agent host's CA bundle to trust the Graylog certificate; the attach dialog accepts a custom CA file path which the engine deploys to the agent.
  • Grafana provisioning uses the read-only Postgres credentials configured under observability.grafana.datasource in /etc/sysmanage.yaml; to rotate them, use the Grafana datasource UI rather than re-running the connect flow.
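When scripting against the status endpoint, the roughly one-minute sweep interval suggests polling with a timeout somewhat longer than 60s before treating a deploy as failed. The sketch below shows one way to do that. The `wait_for_running` helper, the "running" status string, and the injected `fetch_status` callable are all hypothetical; a real caller would wrap a request to GET /api/v1/observability/otel/status.

```python
import time
from typing import Callable


def wait_for_running(
    fetch_status: Callable[[], str],
    timeout: float = 90.0,   # a little longer than the ~60s sweep
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the status is "running" or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        if fetch_status() == "running":  # assumed status value
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Demo with canned statuses and a no-op sleep, so nothing blocks.
statuses = iter(["stopped", "stopped", "running"])
became_running = wait_for_running(lambda: next(statuses), sleep=lambda _s: None)
print(became_running)
```

Injecting `sleep` and `clock` keeps the helper testable without waiting on real time.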