
# Telemetry Model

Rumi Management implements a telemetry pipeline that collects runtime metrics from running containers and stores them in a time-series database for dashboarding and alerting.

## Heartbeats

Rumi containers emit **heartbeats** at a configurable interval (default: 5 seconds). Each heartbeat contains a snapshot of the container's runtime state, including:

* **Memory metrics** — Heap and non-heap usage, physical memory, entity counts, pool allocation statistics.
* **Latency metrics** — Messaging latency waypoints, per-transaction latency samples.
* **Thread metrics** — Thread counts and per-thread statistics.
* **GC metrics** — Garbage collection counts and pause times.
* **Application metrics** — Store operations (hit/miss/eviction rates), message counts, disruptor ring buffer statistics, transaction latencies.
* **Custom metrics** — User-defined gauges and counters exposed by application code.

Heartbeats are delivered over the Rumi messaging bus to the Agent.

## Collection Pipeline

The Agent's telemetry pipeline processes heartbeats through the following stages:

```
Container Heartbeat (Rumi Messaging)
       │
       ▼
 XVMHeartbeatCollector
       │
       ▼
 RogNodeToPointTransform (converts ROG nodes to PointData)
       │
       ▼
 PointStore (writes to storage backend)
       │
       ▼
 InfluxDB (time-series database)
```
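The flow above can be sketched in a few lines. This is only an illustration of the collect, transform, and store stages, not the Agent's implementation: the `Point` class, the dictionary-shaped heartbeat, and the function names are all hypothetical stand-ins, and an in-memory list takes the place of InfluxDB.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Point:
    """Hypothetical stand-in for PointData."""
    measurement: str
    tags: Dict[str, str]
    fields: Dict[str, float]
    timestamp_ns: int


class InMemoryStore:
    """Stand-in for a PointStore; keeps points in a list instead of a database."""

    def __init__(self) -> None:
        self.points: List[Point] = []

    def write(self, points: Iterable[Point]) -> None:
        self.points.extend(points)


def transform(heartbeat: dict) -> List[Point]:
    """Turn one heartbeat (assumed dict layout) into one point per metric group."""
    tags = {"system_name": heartbeat["system"], "vm_name": heartbeat["vm"]}
    return [
        Point(name, dict(tags), dict(values), heartbeat["timestamp_ns"])
        for name, values in heartbeat["metrics"].items()
    ]


def collect(heartbeat: dict, store: InMemoryStore) -> int:
    """Run one heartbeat through transform and store; return the point count."""
    points = transform(heartbeat)
    store.write(points)
    return len(points)
```

Each stage only depends on the output of the previous one, which is what lets the storage stage be swapped without touching collection or transformation.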

### XVMHeartbeatCollector

The collector receives heartbeat messages from containers and extracts metrics. It processes:

* Container-level system metrics (CPU, memory, GC, threads).
* Application-level metrics (store operations, message throughput, latency).
* Custom application metrics (user gauges and counters).

### RogNodeToPointTransform

The transform converts Rumi Object Graph (ROG) nodes from the heartbeat data into `PointData` objects suitable for time-series storage. Each point consists of:

* **Measurement name** — Identifies the metric category (e.g., `application`, `system`, `application.store`).
* **Tags** — Indexed dimensions for filtering (e.g., `system_name`, `vm_name`, `app_name`).
* **Fields** — Metric values (e.g., `process_cpu_load`, `memory_heap_used`).
* **Timestamp** — When the measurement was taken.
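To make the four parts concrete, the sketch below renders a point in InfluxDB line protocol (`measurement,tags fields timestamp`). It is an illustration only: the function names are hypothetical, the page does not say how the Agent serializes points, and pairing the `system` measurement with these particular tags and fields is an assumption based on the examples above.

```python
from typing import Dict, Union

FieldValue = Union[float, int, bool, str]


def _escape_key(text: str) -> str:
    # Tag keys, tag values, and field keys escape commas, equals signs, and spaces.
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def _format_field(value: FieldValue) -> str:
    if isinstance(value, bool):  # bool is a subclass of int, so check it first
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integer fields carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # string fields are double-quoted


def to_line_protocol(
    measurement: str,
    tags: Dict[str, str],
    fields: Dict[str, FieldValue],
    timestamp_ns: int,
) -> str:
    if not fields:
        raise ValueError("a point needs at least one field")
    name = measurement.replace(",", "\\,").replace(" ", "\\ ")
    # Tags are sorted by key, which InfluxDB recommends for write performance.
    tag_part = "".join(
        f",{_escape_key(k)}={_escape_key(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f"{_escape_key(k)}={_format_field(v)}" for k, v in fields.items()
    )
    return f"{name}{tag_part} {field_part} {timestamp_ns}"


line = to_line_protocol(
    "system",
    {"vm_name": "vm1", "system_name": "orders"},
    {"process_cpu_load": 0.25, "memory_heap_used": 1024},
    1700000000000000000,
)
```

Tags end up in the indexed part of the line and fields in the value part, which is why dimensions used for filtering belong in tags.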

### PointStore

The `PointStore` is a pluggable storage abstraction with the following implementations:

| Implementation                 | Description                                       |
| ------------------------------ | ------------------------------------------------- |
| **InfluxBatchAwarePointStore** | Writes to InfluxDB with batching for efficiency   |
| **BasicPointStore**            | Generic implementation for other storage backends |
| **StubBatchAwarePointStore**   | Stub for testing                                  |

Select the storage backend with the `nv.agent.point.store` property, which accepts `influx`, `fake_influx`, or `none`.
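The idea of a pluggable store with a batching implementation can be sketched as follows. This is a minimal illustration under stated assumptions: the `PointStore` protocol, its `write`/`flush` methods, the `batch_size` parameter, and the `sink` callback are hypothetical and are not the Agent's actual interface.

```python
from typing import Callable, List, Protocol


class PointStore(Protocol):
    """Hypothetical store contract: accept points one at a time, flush on demand."""

    def write(self, point: str) -> None: ...
    def flush(self) -> None: ...


class BatchingPointStore:
    """Buffers points and hands them to a sink in batches."""

    def __init__(self, sink: Callable[[List[str]], None], batch_size: int = 100) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._sink = sink
        self._batch_size = batch_size
        self._pending: List[str] = []

    def write(self, point: str) -> None:
        self._pending.append(point)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        # Swap the buffer out before calling the sink so a failing sink
        # cannot cause the same batch to be sent twice.
        if self._pending:
            batch, self._pending = self._pending, []
            self._sink(batch)


class NullPointStore:
    """Discards everything; useful in tests or when storage is disabled."""

    def write(self, point: str) -> None:
        pass

    def flush(self) -> None:
        pass
```

Batching trades a small delay for fewer round trips to the database; a caller would typically also call `flush()` on shutdown so the last partial batch is not lost.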

## InfluxDB Storage

When using InfluxDB, metrics are stored under the `rumi_heartbeat_rp` retention policy. The Agent creates this retention policy and manages the database connection.

For a detailed reference of all InfluxDB measurements, tags, and fields, see the [InfluxDB Measurements Reference](/rumi-management/reference/agent/influxdb-measurements.md).

## Custom Collectors

The telemetry pipeline supports custom collectors via the `XVMHeartbeatCollectorPlugin` interface. Custom collectors can:

* Extract additional metrics from heartbeat data.
* Transform and enrich metrics before storage.
* Write to additional storage backends.

See the [Custom Collectors Guide](/rumi-management/guides/agent/custom-collectors.md) for details on implementing collector plugins.

## Dashboard Consumption

The telemetry data stored in InfluxDB is consumed by Rumi Monitor (Grafana) through pre-built dashboards. Users can also create custom dashboards by querying the InfluxDB measurements directly. The dashboards use template variables for filtering by system, container, application, and host.
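As a sketch of what a custom panel query might look like, the helper below assembles an InfluxQL statement that uses Grafana's `$timeFilter` and `$__interval` macros and filters tags by dashboard template variables. The helper itself, the variable name `system`, and the pairing of the `system` measurement with the `process_cpu_load` field are assumptions for illustration; check the measurements reference for the actual schema.

```python
from typing import Dict


def build_panel_query(
    measurement: str,
    field: str,
    tag_variables: Dict[str, str],
    retention_policy: str = "rumi_heartbeat_rp",
) -> str:
    """Build an InfluxQL query string for a Grafana panel.

    tag_variables maps a tag key to the name of a Grafana template variable.
    """
    # Each tag is matched against its template variable with an anchored regex.
    filters = [f'"{tag}" =~ /^${var}$/' for tag, var in tag_variables.items()]
    filters.append("$timeFilter")  # Grafana replaces this with the dashboard time range
    where = " AND ".join(filters)
    return (
        f'SELECT mean("{field}") FROM "{retention_policy}"."{measurement}" '
        f"WHERE {where} GROUP BY time($__interval) fill(null)"
    )


query = build_panel_query("system", "process_cpu_load", {"system_name": "system"})
```

The resulting string can be pasted into a panel's raw query editor; Grafana substitutes the macros and variables before sending the query to InfluxDB.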

