
# Overview

This section covers Rumi performance: benchmark results, test methodology, and performance analysis.

## Performance Documentation

Rumi performance documentation is organized into two main sections:

### 1. Canonical Benchmark

The [Canonical Benchmark](/performance/canonical-benchmark.md) section presents results from the official end-to-end performance benchmark used to measure Rumi's core capabilities.

**What it measures**: The complete **Receive-Process-Send flow** of a clustered microservice, including messaging, state management, persistence, cluster replication, and the consensus protocol.

**Key metrics**:

* **Wire-to-Wire Latency**: Time from inbound message arrival to outbound message transmission (50th, 99th, 99.9th percentiles)
* **Maximum Throughput**: Messages processed per second under saturated load
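
To make the percentile metrics above concrete, here is a minimal sketch of how the 50th, 99th, and 99.9th percentiles could be derived from raw latency samples using the nearest-rank method. The `percentile` function and the sample values are hypothetical and only illustrate the arithmetic; this is not necessarily how the Rumi benchmark computes or reports its statistics.

```bash
#!/usr/bin/env bash
# Nearest-rank percentile over numeric samples read from stdin (one per line).
# Usage: percentile P < samples   (P is a percentile such as 50, 99, or 99.9)
percentile() {
  sort -n | awk -v p="$1" '
    { v[NR] = $1 }                 # store the sorted samples
    END {
      if (NR == 0) exit 1          # no samples, nothing to report
      rank = p * NR / 100          # fractional rank
      idx = int(rank)
      if (idx < rank) idx++        # round up to the next whole rank
      if (idx < 1) idx = 1
      print v[idx]
    }'
}

# Hypothetical data: ten wire-to-wire latencies in microseconds.
samples="27 28 27 29 30 28 27 31 29 42"
for p in 50 99 99.9; do
  # $samples is left unquoted on purpose so each value lands on its own line.
  printf 'p%s = %s us\n' "$p" "$(printf '%s\n' $samples | percentile "$p")"
done
```

With so few samples the 99th and 99.9th percentiles both collapse onto the largest value, which is why meaningful tail percentiles need a large sample count.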

**Test program**: ESProcessor from the Rumi Performance Benchmark Suite

**Results available**: Performance metrics for Rumi 4.0 releases

➡️ [View Canonical Benchmark Results](/performance/canonical-benchmark.md)

### 2. Performance Benchmark Suite

The [Performance Benchmark Suite](/performance/benchmark-suite.md) section documents the collection of benchmarking tools that measure individual Rumi runtime components.

**What it includes**: 7 modules, each focusing on a specific component:

* **Time Module**: Time API overhead
* **Serialization Module**: Message encoding/decoding performance
* **Link Module**: Cluster replication transport
* **Messaging Module**: Pub/sub messaging layer
* **Persistence Module**: Message and data persistence
* **Storage Module**: Object store operations
* **AEP Module**: End-to-end canonical benchmark (described above)

**Purpose**: Isolate and measure individual component performance, understand performance characteristics, and validate configurations

**Source code**: [github.com/neeveresearch/nvx-rumi-perf](https://github.com/neeveresearch/nvx-rumi-perf)

➡️ [Explore Benchmark Suite](/performance/benchmark-suite.md)

## Quick Start

### View Latest Results

See the latest canonical benchmark results:

* [Rumi 4.0.4 Performance](https://github.com/rumidata/nvx-rumi-docs/blob/claude-refactor/performance/canonical-benchmark/test-results/4.0.4.md) - Latest tested release

### Run Your Own Tests

Download and run the benchmark suite:

1. **Download distribution** from the Neeve artifact repository:

   ```bash
   wget https://nexus.n5corp.com/repository/maven-public/com/neeve/nvx-rumi-perf-dist/4.0.4/nvx-rumi-perf-dist-4.0.4-linux-x86-64.tar.gz
   ```
2. **Extract and run**:

   ```bash
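   # Unpack the distribution and enter its directory
   tar xvf nvx-rumi-perf-dist-4.0.4-linux-x86-64.tar.gz
   cd nvx-rumi-perf-4.0.4
   # Run the canonical benchmark program with the bundled libraries on the classpath
   "$JAVA_HOME/bin/java" -cp "libs/*" com.neeve.perf.aep.engine.ESProcessor --count 10000 --rate 5000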
   ```
3. **See documentation** for detailed configuration and parameters:
   * [Test Description](/performance/canonical-benchmark/test-description.md) - Complete methodology
   * [Benchmark Suite Documentation](/performance/benchmark-suite.md) - All modules and tools
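
The run command above depends on `JAVA_HOME` pointing at a usable JVM and on the distribution's `libs/` directory being present. A small pre-flight check along the following lines can catch both problems before launching. The `check_prereqs` function is a hypothetical helper written for this page, not part of the benchmark distribution.

```bash
#!/usr/bin/env bash
# Pre-flight check before launching the benchmark: confirms that JAVA_HOME
# points at an executable bin/java and that the extracted distribution
# contains a libs/ directory.
# Argument: path to the extracted distribution directory.
check_prereqs() {
  local dist_dir="$1"
  if [ -z "${JAVA_HOME:-}" ] || [ ! -x "$JAVA_HOME/bin/java" ]; then
    echo "JAVA_HOME is unset or has no executable bin/java" >&2
    return 1
  fi
  if [ ! -d "$dist_dir/libs" ]; then
    echo "no libs/ directory under $dist_dir" >&2
    return 1
  fi
  echo "ok"
}

# Example: check the directory created by the extract step.
check_prereqs "nvx-rumi-perf-4.0.4" || echo "fix the issues above before running ESProcessor"
```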

## Understanding Performance

### Latency Characteristics

Rumi is optimized for ultra-low latency:

* **Typical latency**: 27-30µs median (wire-to-wire, including network)
* **Tail latency**: 99.9th percentile within 1.5x of median
* **Configuration**: Performance varies by CPU configuration and optimization mode
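
Taking the figures above at face value, a 27-30µs median and a 1.5x bound put the 99.9th percentile at roughly 40.5-45µs or below. The sketch below shows one way to check your own measurements against that bound. The `tail_within_bound` function and the example values (28µs median, 41µs tail) are hypothetical.

```bash
#!/usr/bin/env bash
# Check whether a measured 99.9th percentile stays within 1.5x of the median.
# Arguments: median latency, 99.9th percentile latency (same unit, e.g. microseconds).
tail_within_bound() {
  awk -v median="$1" -v p999="$2" 'BEGIN {
    ratio = p999 / median
    printf "ratio=%.2f\n", ratio
    exit (ratio <= 1.5 ? 0 : 1)    # exit status 0 means the bound holds
  }'
}

# Hypothetical measurements: 28 us median, 41 us at the 99.9th percentile.
tail_within_bound 28 41 && echo "within 1.5x of median"
```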

### Throughput Characteristics

Rumi supports high-volume scenarios:

* **Typical throughput**: 280K+ messages/second per microservice instance
* **Scaling**: Linear scaling with message complexity and CPU resources
* **Configuration**: Best throughput with minimal CPU configuration for lightweight handlers
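
As a rough sanity check on the throughput figure above, the average time budget per message is the reciprocal of the message rate. The helper below is plain arithmetic rather than a Rumi tool, and its name is hypothetical.

```bash
#!/usr/bin/env bash
# Average time budget per message, in microseconds, at a given throughput.
# Argument: throughput in messages per second.
per_message_budget_us() {
  awk -v rate="$1" 'BEGIN { printf "%.2f\n", 1e6 / rate }'
}

per_message_budget_us 280000   # prints 3.57
```

At 280K messages/second a single instance therefore has about 3.57µs per message on average.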

### Performance Factors

Key factors affecting Rumi performance:

1. **Message Access Method**: Direct access (serializer/deserializer) vs. indirect access (POJO) - \~10% difference in latency, 2.4x difference in throughput
2. **CPU Configuration**: MinCPU, Default, or MaxCPU - affects parallelization vs coordination overhead
3. **Optimization Mode**: Latency or Throughput - different JVM and runtime tuning
4. **Hardware**: CPU, memory, storage (NVMe vs SSD), network (InfiniBand vs Ethernet)
5. **Network Tuning**: VMA, RDMA enablement (not enabled in baseline tests)

## Next Steps

* [**Canonical Benchmark**](/performance/canonical-benchmark.md) - View official end-to-end performance results
* [**Benchmark Suite**](/performance/benchmark-suite.md) - Explore component-level benchmarks
* [**Test Description**](/performance/canonical-benchmark/test-description.md) - Understand test methodology in detail
* [**Test Results**](/performance/canonical-benchmark/test-results.md) - Browse results by Rumi release

