
# Serialization Module

The Serialization module benchmarks message encoding and decoding performance using Rumi's Xbuf2 binary serialization format.

## Overview

Message serialization/deserialization is a critical operation in messaging systems. This benchmark measures the overhead of:

* **Encoding**: Converting POJO messages to wire format
* **Decoding**: Converting wire format back to POJOs

The benchmark uses the same `Car` message model as the [AEP Module](/performance/benchmark-suite/modules/aep-module.md) canonical benchmark.

## Test Program

**Class**: `com.neeve.perf.serialization.Driver`

The benchmark can be invoked through the Rumi Interactive CLI or directly.

## Message Formats

### xbuf2 / xbuf2.serial / rumi.xbuf2.serial

Tests serialization with sequential/predictable data:

```bash
java -cp "libs/*" com.neeve.perf.serialization.Driver --provider xbuf2.serial
```

**Characteristics**:

* Predictable data patterns
* Consistent serialized size
* Best-case performance

### xbuf2.random / rumi.xbuf2.random

Tests serialization with random data:

```bash
java -cp "libs/*" com.neeve.perf.serialization.Driver --provider xbuf2.random
```

**Characteristics**:

* Random data in all fields
* Variable serialized size
* More realistic performance

## Test Message

The `Car` message contains:

**Simple Fields**:

* timestamp (long)
* serialNumber (int)
* modelYear (short)
* available (boolean)
* code (enum)
* vehicleCode (string)

**Complex Fields**:

* engine (nested object)
* extras (bit set)
* someNumbers (int array)

**Repeated Fields**:

* performanceFigures (array of objects)
* fuelFigures (array of objects)

**Typical Size**: \~200 bytes serialized (178 bytes in the sequential-data example output under Interpreting Results)

## Command-Line Parameters

| Parameter    | Short | Default | Description                                                                                               |
| ------------ | ----- | ------- | --------------------------------------------------------------------------------------------------------- |
| `--provider` | `-p`  | xbuf2   | Serialization provider: `xbuf2`, `xbuf2.serial`, `xbuf2.random`, `rumi.xbuf2.serial`, `rumi.xbuf2.random` |

## Running the Benchmark

### Basic Usage

```bash
# Extract distribution
tar xvf nvx-perf-serialization-{version}-dist-linux-x86-64.tar.gz
cd nvx-perf-serialization-{version}

# Run with sequential data
$JAVA_HOME/bin/java -cp "libs/*" com.neeve.perf.serialization.Driver --provider xbuf2.serial
```

### Test with Random Data

```bash
$JAVA_HOME/bin/java -cp "libs/*" com.neeve.perf.serialization.Driver --provider xbuf2.random
```

## Interpreting Results

The benchmark first measures the overhead of `System.nanoTime()`, then reports the median and mean latency, in nanoseconds, of each encode and decode run.

**Example Output**:

```
Calculating nanoTime() overhead...
...22 nanos
PROV                      RUN TYPE  SIZE  MED   MEAN
rumi.xbuf2.serial         1   ENC   178   245   247
rumi.xbuf2.serial         1   DEC   178   238   240
rumi.xbuf2.serial         2   ENC   178   243   245
rumi.xbuf2.serial         2   DEC   178   236   238
rumi.xbuf2.serial         3   ENC   178   244   246
rumi.xbuf2.serial         3   DEC   178   237   239
```
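The first two lines of the output report the cost of the timer call itself. The sketch below shows one common way such an overhead can be estimated, by timing back-to-back `System.nanoTime()` calls. It illustrates the general technique only and is not the benchmark's actual code; `NanoTimeOverhead`, `estimate`, and `adjust` are hypothetical names.

```java
public final class NanoTimeOverhead {
    /** Estimates the average cost of one System.nanoTime() call, in nanoseconds. */
    public static long estimate(int iterations) {
        // Warm-up pass so the loop is likely JIT-compiled before it is measured.
        for (int i = 0; i < iterations; i++) {
            System.nanoTime();
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            System.nanoTime();
        }
        long elapsed = System.nanoTime() - start;
        return elapsed / iterations;
    }

    /** Subtracts the timer overhead from a raw sample, never going below zero. */
    public static long adjust(long rawNanos, long overheadNanos) {
        return Math.max(0L, rawNanos - overheadNanos);
    }

    public static void main(String[] args) {
        long overhead = estimate(1_000_000);
        System.out.println("nanoTime() overhead: " + overhead + " nanos");
        // Hypothetical raw sample of 267ns, corrected by the measured overhead.
        System.out.println("adjusted sample: " + adjust(267, overhead) + " nanos");
    }
}
```

The measured value depends on the JVM, operating system, and clock source, so it will differ from machine to machine.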

### Result Columns

* **PROV**: Serialization provider
* **RUN**: Run number (multiple runs for consistency)
* **TYPE**: Operation type (ENC=encode, DEC=decode)
* **SIZE**: Serialized size in bytes
* **MED**: Median latency in nanoseconds
* **MEAN**: Mean latency in nanoseconds
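For readers who want to reproduce the MED and MEAN columns from their own samples, the following minimal sketch computes both statistics. It assumes the conventional definitions (middle value of the sorted samples, and arithmetic average); `LatencyStats` is a hypothetical helper, and the sample values are made up for illustration.

```java
import java.util.Arrays;

public final class LatencyStats {
    /** Median of the samples; averages the two middle values for even counts. */
    public static double median(long[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return (sorted.length % 2 == 1)
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /** Arithmetic mean of the samples. */
    public static double mean(long[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        return Arrays.stream(samples).average().getAsDouble();
    }

    public static void main(String[] args) {
        // Hypothetical latencies in nanoseconds; the single outlier pulls the mean up.
        long[] samples = {240, 242, 245, 247, 900};
        System.out.println("median = " + median(samples)); // median = 245.0
        System.out.println("mean   = " + mean(samples));   // mean   = 374.8
    }
}
```

A mean that sits slightly above the median, as in the example output, is what a small number of slower samples would produce.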

### Typical Results (Linux x86-64)

| Operation | Sequential Data | Random Data | Size        |
| --------- | --------------- | ----------- | ----------- |
| Encode    | \~240-250ns     | \~250-280ns | \~178 bytes |
| Decode    | \~235-245ns     | \~245-275ns | \~178 bytes |

### Performance Characteristics

1. **Encode vs Decode**:
   * Encoding and decoding have similar overhead
   * Both operations are highly optimized
2. **Sequential vs Random**:
   * Random data is \~5-10% slower, attributed to less predictable access patterns
   * Sequential data represents best-case performance
3. **Message Size**:
   * Overhead scales roughly linearly with message complexity
   * The Car message is moderately complex

## Access Patterns

The benchmark demonstrates two message access patterns:

### Indirect Access (POJO)

Standard object-oriented access via getters/setters:

```java
// Encoding: populate the POJO through its setters, then serialize it
Car car = Car.create();
car.setTimestamp(System.currentTimeMillis());
car.setSerialNumber(12345);
car.setVehicleCode("ABC123"); // string field listed under Test Message
// ... set other fields
byte[] encoded = car.encode();

// Decoding: deserialize into a fresh POJO, then read fields through getters
Car decoded = Car.create();
decoded.decode(encoded);
long timestamp = decoded.getTimestamp();
int serialNumber = decoded.getSerialNumber();
String vehicleCode = decoded.getVehicleCode();
```

### Direct Access (Serializer/Deserializer)

Zero-copy access via serializers (shown in benchmark code):

```java
// Encoding
Car.Serializer serializer = new Car.Serializer();
serializer.handleTimestamp(System.currentTimeMillis());
serializer.handleSerialNumber(12345);
// ... handle other fields
byte[] encoded = serializer.getEncodedBytes();

// Decoding
Car.Deserializer deserializer = new Car.Deserializer();
deserializer.run(new MyCallback(), encoded);
```

**Direct access is the faster of the two patterns and is intended for high-performance scenarios.**

## Performance Tuning

### For Lowest Latency

1. Use direct serialization (serializer/deserializer)
2. Reuse serializer/deserializer instances
3. Pre-allocate buffers
4. Minimize nested object depth
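Items 2 and 3 both amount to moving allocation out of the hot path. The following generic sketch shows that idea with `java.nio.ByteBuffer`. It deliberately does not use the Xbuf2 API; the `ReusableEncoder` class and its two-field layout are hypothetical and exist only to illustrate buffer reuse.

```java
import java.nio.ByteBuffer;

public final class ReusableEncoder {
    // Allocated once and reused for every message, so encode() allocates nothing.
    private final ByteBuffer buffer;

    public ReusableEncoder(int capacity) {
        this.buffer = ByteBuffer.allocate(capacity);
    }

    /** Writes the two fields into the shared buffer and returns the encoded length. */
    public int encode(long timestamp, int serialNumber) {
        buffer.clear();            // reset position and limit; old bytes get overwritten
        buffer.putLong(timestamp);
        buffer.putInt(serialNumber);
        buffer.flip();             // make the written bytes readable
        return buffer.remaining();
    }

    /** Exposes the shared buffer; its contents are valid only until the next encode(). */
    public ByteBuffer buffer() {
        return buffer;
    }

    public static void main(String[] args) {
        ReusableEncoder encoder = new ReusableEncoder(64);
        for (int i = 0; i < 3; i++) {
            int length = encoder.encode(System.nanoTime(), 12345 + i);
            System.out.println("message " + i + " encoded in " + length + " bytes");
        }
    }
}
```

The trade-off is that callers must consume or copy the encoded bytes before the next call, because the buffer is shared across messages.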

### For Ease of Use

1. Use indirect access (POJO getters/setters)
2. Accept \~10-15% overhead for better code readability
3. Good for most business applications

## Comparison with AEP Module

The [AEP Module](/performance/benchmark-suite/modules/aep-module.md) canonical benchmark includes serialization overhead as part of end-to-end latency:

* **Serialization Benchmark**: \~480ns (encode + decode)
* **AEP Benchmark**: \~27µs (includes serialization + all other operations)

**Serialization represents \~1.8% of end-to-end latency** (\~480ns out of \~27,000ns)

## Best Practices

### Message Design

1. **Keep messages compact**: Fewer fields = faster serialization
2. **Use primitives where possible**: Avoid excessive nesting
3. **Size arrays appropriately**: Large arrays increase overhead
4. **Consider field ordering**: Group frequently accessed fields together

### Code Patterns

```java
// GOOD: Reuse serializer instance
Car.Serializer serializer = new Car.Serializer();
for (Car car : cars) {
    serializer.reset();
    // populate serializer
    byte[] encoded = serializer.getEncodedBytes();
}

// BAD: Create new serializer each time
for (Car car : cars) {
    Car.Serializer serializer = new Car.Serializer(); // Allocation overhead!
    byte[] encoded = serializer.getEncodedBytes();
}
```

## Next Steps

* Review [AEP Module](/performance/benchmark-suite/modules/aep-module.md) to see serialization in end-to-end context
* Explore [Link Module](/performance/benchmark-suite/modules/link-module.md) for messaging transport benchmarks
* Return to [Benchmark Suite](/performance/benchmark-suite.md) for other modules

