Performance Benchmarks
Relevant source files
This document covers the performance benchmark suite for the allocator crate, which provides standardized testing methodology to compare the performance characteristics of different memory allocation implementations. The benchmarks evaluate allocation patterns commonly found in real-world applications and measure relative performance against the system allocator baseline.
For information about the individual allocator implementations being tested, see Allocator Implementations. For details about the integration test suite, see Integration Tests.
Benchmark Architecture
The benchmark infrastructure is built using the Criterion benchmarking framework and provides a controlled testing environment for evaluating allocator performance across different workload patterns.
Benchmark Framework Overview
flowchart TD
subgraph subGraph3["Allocator Instances"]
SYS_ALLOC["std::alloc::System"]
TLSF_ALLOC["AllocatorRc"]
SLAB_ALLOC["AllocatorRc"]
BUDDY_ALLOC["AllocatorRc"]
end
subgraph subGraph2["Memory Pool Management"]
POOL_NEW["MemoryPool::new()"]
POOL_SLICE["MemoryPool::as_slice()"]
ALLOC_ZEROED["std::alloc::alloc_zeroed()"]
LAYOUT["Layout::from_size_align(size, 4096)"]
end
subgraph subGraph1["Test Infrastructure"]
POOL["MemoryPool::new(POOL_SIZE)"]
BENCH_FN["bench(c, alloc_name, alloc)"]
CRITERION["Criterion::benchmark_group()"]
end
subgraph subGraph0["Benchmark Entry Point"]
MAIN["criterion_main!(benches)"]
GROUP["criterion_group!(benches, criterion_benchmark)"]
end
BENCH_FN --> BUDDY_ALLOC
BENCH_FN --> CRITERION
BENCH_FN --> POOL
BENCH_FN --> SLAB_ALLOC
BENCH_FN --> SYS_ALLOC
BENCH_FN --> TLSF_ALLOC
GROUP --> BENCH_FN
MAIN --> GROUP
POOL --> POOL_NEW
POOL --> POOL_SLICE
POOL_NEW --> ALLOC_ZEROED
POOL_NEW --> LAYOUT
POOL_SLICE --> BUDDY_ALLOC
POOL_SLICE --> SLAB_ALLOC
POOL_SLICE --> TLSF_ALLOC
The benchmark system creates a 128MB memory pool (POOL_SIZE = 1024 * 1024 * 128) for testing custom allocators, while the system allocator operates directly on system memory. Each allocator is wrapped in AllocatorRc to provide the standard library Allocator trait interface.
Sources: benches/collections.rs(L16) benches/collections.rs(L80 - L98) benches/utils/mod.rs(L9 - L18)
Test Scenario Implementation
flowchart TD
subgraph subGraph3["btree_map Implementation"]
BT_NEW["BTreeMap::new_in(alloc.clone())"]
BT_RNG["SmallRng::seed_from_u64(0xdead_beef)"]
BT_INSERT["m.insert(key, value)"]
BT_REMOVE["m.pop_first()"]
BT_CLEAR["m.clear()"]
end
subgraph subGraph2["vec_rand_free Implementation"]
VR_ALLOC["Vec::with_capacity_in(blk_size, alloc)"]
VR_RNG["SmallRng::seed_from_u64(0xdead_beef)"]
VR_SHUFFLE["index.shuffle(&mut rng)"]
VR_FREE["v[i] = Vec::new_in(alloc)"]
end
subgraph subGraph1["vec_push Implementation"]
VP_NEW["Vec::new_in(alloc.clone())"]
VP_LOOP["for _ in 0..n"]
VP_PUSH["v.push(0xdead_beef)"]
VP_DROP["drop(v)"]
end
subgraph subGraph0["Benchmark Functions"]
VEC_PUSH["vec_push(n, alloc)"]
VEC_RAND["vec_rand_free(n, blk_size, alloc)"]
BTREE_MAP["btree_map(n, alloc)"]
end
BTREE_MAP --> BT_NEW
BTREE_MAP --> BT_RNG
BT_INSERT --> BT_CLEAR
BT_REMOVE --> BT_CLEAR
BT_RNG --> BT_INSERT
BT_RNG --> BT_REMOVE
VEC_PUSH --> VP_NEW
VEC_RAND --> VR_ALLOC
VEC_RAND --> VR_RNG
VP_LOOP --> VP_PUSH
VP_NEW --> VP_LOOP
VP_PUSH --> VP_DROP
VR_RNG --> VR_SHUFFLE
VR_SHUFFLE --> VR_FREE
Each benchmark function implements a specific allocation pattern designed to stress different aspects of allocator performance. The functions use deterministic random seeds to ensure reproducible results across benchmark runs.
Sources: benches/collections.rs(L18 - L24) benches/collections.rs(L26 - L44) benches/collections.rs(L46 - L61)
Test Scenarios
The benchmark suite includes four distinct test scenarios that evaluate different aspects of allocator performance:
| Benchmark | Function | Parameters | Purpose |
|---|---|---|---|
| vec_push_3M | vec_push | n=3,000,000 | Sequential allocation stress test |
| vec_rand_free_25K_64 | vec_rand_free | n=25,000, blk_size=64 | Small block fragmentation test |
| vec_rand_free_7500_520 | vec_rand_free | n=7,500, blk_size=520 | Large block fragmentation test |
| btree_map_50K | btree_map | n=50,000 | Mixed allocation/deallocation pattern |
Sequential Allocation Test
The vec_push_3M benchmark evaluates pure allocation performance by pushing 3 million u32 values into a vector. This test measures:
- Allocation throughput for growing data structures
- Memory reallocation efficiency during vector growth
- Allocator overhead for sequential memory requests
Fragmentation Tests
The vec_rand_free tests evaluate allocator behavior under fragmentation stress:
Small Block Test (vec_rand_free_25K_64):
- Allocates 25,000 blocks of 64 bytes each
- Randomly deallocates blocks using shuffled indices
- Tests small allocation efficiency and fragmentation handling
Large Block Test (vec_rand_free_7500_520):
- Allocates 7,500 blocks of 520 bytes each
- Randomly deallocates blocks using shuffled indices
- Tests allocator behavior with larger allocation sizes
Mixed Workload Test
The btree_map_50K benchmark simulates realistic application behavior:
- Performs 50,000 operations on a
BTreeMap - 20% probability of removal operations (
rng.next_u32() % 5 == 0) - 80% probability of insertion operations
- Uses string keys with dynamic allocation
- Tests allocator performance under mixed allocation patterns
Sources: benches/collections.rs(L65 - L77) benches/collections.rs(L26 - L44) benches/collections.rs(L46 - L61)
Allocator Testing Matrix
The benchmark system tests four different allocators to provide comprehensive performance comparison:
Allocator Configuration
flowchart TD
subgraph subGraph3["Allocator Initialization"]
TLSF_NEW["TlsfByteAllocator::new()"]
SLAB_NEW["SlabByteAllocator::new()"]
BUDDY_NEW["BuddyByteAllocator::new()"]
end
subgraph subGraph2["Memory Pool"]
SYS["std::alloc::System"]
POOL_MEM["MemoryPool (128MB)"]
POOL_SLICE["&mut [u8] slice"]
end
subgraph subGraph1["Custom Allocators"]
TLSF_RC["AllocatorRc"]
SLAB_RC["AllocatorRc"]
BUDDY_RC["AllocatorRc"]
end
subgraph subGraph0["System Allocator"]
SYS["std::alloc::System"]
SYS_DESC["Direct system memoryBaseline reference"]
POOL_MEM["MemoryPool (128MB)"]
end
BUDDY_NEW --> BUDDY_RC
POOL_MEM --> POOL_SLICE
POOL_SLICE --> BUDDY_RC
POOL_SLICE --> SLAB_RC
POOL_SLICE --> TLSF_RC
SLAB_NEW --> SLAB_RC
SYS --> SYS_DESC
TLSF_NEW --> TLSF_RC
Each custom allocator is initialized with the same 128MB memory pool to ensure fair comparison. The AllocatorRc wrapper provides reference counting and the standard library Allocator trait implementation.
Sources: benches/collections.rs(L82 - L97) benches/utils/mod.rs(L10 - L14)
Memory Pool Utility
The MemoryPool utility provides controlled memory management for benchmark testing:
MemoryPool Implementation
The MemoryPool struct manages a fixed-size memory region for allocator testing:
// Memory pool allocation with 4KB alignment
let layout = Layout::from_size_align(size, 4096).unwrap();
let ptr = NonNull::new(unsafe { std::alloc::alloc_zeroed(layout) }).unwrap();
Key characteristics:
- Size: 128MB (
POOL_SIZE = 1024 * 1024 * 128) - Alignment: 4KB page alignment for optimal performance
- Initialization: Zero-filled memory using
alloc_zeroed - Lifetime: Automatic cleanup through
Dropimplementation
The pool provides a mutable slice interface (as_slice()) that allocators use for their internal memory management, ensuring all custom allocators operate within the same controlled memory environment.
Sources: benches/utils/mod.rs(L9 - L18) benches/utils/mod.rs(L21 - L25) benches/collections.rs(L16)
Running Benchmarks
The benchmark suite uses the Criterion framework for statistical analysis and result reporting. To execute the benchmarks:
cargo bench --features full
The --features full flag enables all allocator implementations for comprehensive testing. The benchmark configuration includes:
- Sample size: 10 iterations per test for statistical significance
- Measurement: Wall clock time for complete test scenarios
- Output: Statistical analysis including mean, standard deviation, and confidence intervals
Individual allocator benchmarks can be run using Criterion's filtering capability:
cargo bench -- tlsf # Run only TLSF allocator benchmarks
cargo bench -- vec_push # Run only vector push benchmarks
Sources: benches/collections.rs(L68) benches/collections.rs(L100 - L101)
Benchmark Results Interpretation
The benchmark results provide comparative performance metrics across allocators:
Performance Metrics
Throughput Comparison: Results show relative performance against the system allocator baseline, helping identify which allocators perform best for specific workload patterns.
Fragmentation Behavior: The random free tests reveal how effectively each allocator handles memory fragmentation and reuses freed blocks.
Mixed Workload Performance: The BTreeMap benchmark demonstrates allocator behavior under realistic application usage patterns with mixed allocation and deallocation operations.
Expected Performance Characteristics
- TLSF: Consistent O(1) allocation/deallocation with good fragmentation resistance
- Slab: Optimal for fixed-size allocations, may show overhead for variable sizes
- Buddy: Good general-purpose performance with power-of-two block sizes
- System: Reference baseline representing operating system allocator performance
Sources: benches/collections.rs(L63 - L78)