The Link module benchmarks the cluster replication link, measuring the throughput and latency of the low-level transport used to replicate state between cluster members.
The Link layer provides the foundation for cluster replication in Rumi, offering:
Direct point-to-point communication between cluster members
Minimal overhead transport for state replication
Support for various network protocols (TCP, UDP, InfiniBand, etc.)
This benchmark measures the raw performance capabilities of the replication link layer.
The Link module provides multiple test program variants:
Streaming Tests
Measure unidirectional throughput:
Blocking Variant:
Sender: com.neeve.perf.link.BlockingStreamingSender
Receiver: com.neeve.perf.link.BlockingStreamingReceiver
Non-Blocking Variant:
Receiver: com.neeve.perf.link.NonBlockingStreamingReceiver (no sender variant)
RDMA Variant:
Sender: com.neeve.perf.link.RdmaStreamingSender
Receiver: com.neeve.perf.link.RdmaStreamingReceiver
The sender continuously sends messages to the receiver as fast as possible, measuring maximum sustained throughput.
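The streaming pattern described above can be illustrated with a minimal, self-contained sketch. It does not use the Rumi Link API or the `com.neeve.perf.link` classes: it assumes plain `java.net` TCP sockets over loopback, and the class name `StreamingSketch`, the method `measureMessagesPerSecond`, and the message count and payload size are all hypothetical choices made for illustration.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration only: plain TCP over loopback, not the Rumi link layer.
final class StreamingSketch {

    /** Streams `count` fixed-size messages one way and returns the receive rate in messages/second. */
    static double measureMessagesPerSecond(int count, int payloadSize) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            int port = server.getLocalPort();

            // Sender role: write messages back to back without waiting for any reply.
            Thread sender = new Thread(() -> {
                try (Socket socket = new Socket(loopback, port);
                     OutputStream out = new BufferedOutputStream(socket.getOutputStream())) {
                    byte[] message = new byte[payloadSize];
                    for (int i = 0; i < count; i++) {
                        out.write(message);
                    }
                    out.flush();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            sender.setDaemon(true);
            sender.start();

            // Receiver role: read every message and time the whole run.
            try (Socket peer = server.accept()) {
                DataInputStream in =
                        new DataInputStream(new BufferedInputStream(peer.getInputStream()));
                byte[] buffer = new byte[payloadSize];
                long start = System.nanoTime();
                for (int i = 0; i < count; i++) {
                    in.readFully(buffer);
                }
                long elapsedNanos = Math.max(1, System.nanoTime() - start);
                sender.join();
                return count * 1e9 / elapsedNanos;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.printf("%.0f messages/second%n", measureMessagesPerSecond(1_000_000, 64));
    }
}
```

Because the sender never waits for the receiver, the measured rate reflects sustained one-way capacity rather than per-message delay. Numbers from this loopback sketch are not comparable to results from the actual benchmark programs.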
Ping-Pong Tests
Measure round-trip latency:
Blocking Variant:
Sender: com.neeve.perf.link.BlockingPingPongSender
Receiver: com.neeve.perf.link.BlockingPingPongReceiver
Non-Blocking Variant:
Sender: com.neeve.perf.link.NonBlockingPingPongSender
Receiver: com.neeve.perf.link.NonBlockingPingPongReceiver
The sender sends a message and waits for a response from the receiver, measuring round-trip time.
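As a rough illustration of the ping-pong pattern, the sketch below times each request/response exchange individually. It is not the Rumi implementation: it assumes plain `java.net` TCP sockets over loopback with `TCP_NODELAY` enabled, and the class name `PingPongSketch`, the method `measureRoundTrips`, and the chosen message count and payload size are hypothetical.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Arrays;

// Hypothetical illustration only: plain TCP over loopback, not the Rumi link layer.
final class PingPongSketch {

    /** Performs `count` ping-pong exchanges and returns each round-trip time in nanoseconds. */
    static long[] measureRoundTrips(int count, int payloadSize) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            // Receiver role: echo every message straight back to the sender.
            Thread receiver = new Thread(() -> {
                try (Socket peer = server.accept()) {
                    peer.setTcpNoDelay(true);
                    DataInputStream in = new DataInputStream(peer.getInputStream());
                    DataOutputStream out = new DataOutputStream(peer.getOutputStream());
                    byte[] buffer = new byte[payloadSize];
                    for (int i = 0; i < count; i++) {
                        in.readFully(buffer);
                        out.write(buffer);
                        out.flush();
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            receiver.setDaemon(true);
            receiver.start();

            // Sender role: send one message, block until the echo arrives, record the elapsed time.
            long[] roundTrips = new long[count];
            try (Socket socket = new Socket(loopback, server.getLocalPort())) {
                socket.setTcpNoDelay(true);
                DataInputStream in = new DataInputStream(socket.getInputStream());
                DataOutputStream out = new DataOutputStream(socket.getOutputStream());
                byte[] message = new byte[payloadSize];
                for (int i = 0; i < count; i++) {
                    long start = System.nanoTime();
                    out.write(message);
                    out.flush();
                    in.readFully(message);
                    roundTrips[i] = System.nanoTime() - start;
                }
            }
            receiver.join();
            return roundTrips;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        long[] roundTrips = measureRoundTrips(10_000, 32);
        Arrays.sort(roundTrips);
        System.out.printf("median round trip: %.1f us%n", roundTrips[roundTrips.length / 2] / 1000.0);
    }
}
```

Only one message is in flight at a time, so each sample isolates the delay of a single exchange. That is why ping-pong tests report latency while streaming tests report throughput.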
Command-Line Parameters
Blocking Streaming Sender
| Short | Long | Default | Description |
|---|---|---|---|
| | | | Connection descriptor (e.g., tcp://192.168.1.7:12000&tcpnodelay=true) |
| | | | Number of messages to send |
| | | | Output periodic interval stats |
| | --dontWriteLatenciesToFile | | Suppress latency file output |
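The example connection descriptor above has the shape `protocol://host:port` followed by `&key=value` properties. The sketch below shows one way such a string could be split into its parts. It is an assumption-based illustration inferred from that single example, not Rumi's actual descriptor parser or its full grammar; the class `DescriptorSketch` and its fields are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: inferred from one example descriptor, not Rumi's parser.
final class DescriptorSketch {
    final String protocol;
    final String host;
    final int port;
    final Map<String, String> properties;

    private DescriptorSketch(String protocol, String host, int port, Map<String, String> properties) {
        this.protocol = protocol;
        this.host = host;
        this.port = port;
        this.properties = properties;
    }

    /** Splits a descriptor of the assumed form protocol://host:port&key=value&... into its parts. */
    static DescriptorSketch parse(String descriptor) {
        int schemeEnd = descriptor.indexOf("://");
        if (schemeEnd < 0) {
            throw new IllegalArgumentException("missing '://' in descriptor: " + descriptor);
        }
        String protocol = descriptor.substring(0, schemeEnd);

        // The first '&'-separated part is the address; the rest are key=value properties.
        String[] parts = descriptor.substring(schemeEnd + 3).split("&");
        String address = parts[0];
        int colon = address.lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("missing port in descriptor: " + descriptor);
        }
        String host = address.substring(0, colon);
        int port = Integer.parseInt(address.substring(colon + 1));

        Map<String, String> properties = new LinkedHashMap<>();
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("property without '=': " + parts[i]);
            }
            properties.put(parts[i].substring(0, eq), parts[i].substring(eq + 1));
        }
        return new DescriptorSketch(protocol, host, port, properties);
    }

    public static void main(String[] args) {
        DescriptorSketch d = parse("tcp://192.168.1.7:12000&tcpnodelay=true");
        System.out.println(d.protocol + " " + d.host + " " + d.port + " " + d.properties);
    }
}
```

In the example, `tcpnodelay=true` presumably disables Nagle's algorithm on the TCP connection, which matters for latency measurements because small messages are then sent immediately instead of being coalesced.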
Blocking Streaming Receiver
| Short | Long | Default | Description |
|---|---|---|---|
| | | | Output incremental throughput stats |
Blocking Ping-Pong Sender
| Short | Long | Default | Description |
|---|---|---|---|
| | | | Number of messages to send |
| | | | Calculate one-way latency |
| | | | Output periodic interval stats |
| | --dontWriteLatenciesToFile | | Suppress latency file output |
Blocking Ping-Pong Receiver
| Short | Long | Default | Description |
|---|---|---|---|
| | | | Output incremental throughput stats |
Streaming Tests
Purpose: Measure maximum throughput
Setup: Two machines connected via a high-speed network
Use Case: Validate network configuration and capacity
Ping-Pong Tests
Purpose: Measure minimum latency
Setup: Two machines connected via a low-latency network
Use Case: Validate network tuning and baseline latency
Running Benchmarks
Streaming Throughput Test
On Receiver Machine:
On Sender Machine:
Ping-Pong Latency Test
On Receiver Machine:
On Sender Machine:
Interpreting Results
Throughput Results
Latency Results
Network Configurations
Typical Results:
Throughput: 1-2M messages/second
Latency: 15-25µs round-trip
TCP over InfiniBand
Typical Results:
Throughput: 2-4M messages/second
Latency: 8-15µs round-trip
RDMA over InfiniBand
Typical Results:
Throughput: 5-10M messages/second
Latency: 2-5µs round-trip
Comparison with Higher Layers
The Link layer provides the foundation for cluster replication, and each higher layer adds latency on top of it:
Link Layer: ~10µs (raw replication transport)
Messaging Layer: ~15µs (adds SMA abstractions)
AEP Engine: ~27µs (adds transactions, persistence, full clustering with consensus)