Naive Implementation
Purpose and Scope
This document covers the `sp-naive` feature implementation of the percpu crate, which provides a simplified fallback for single-CPU systems. This implementation treats all per-CPU variables as regular global variables, eliminating the need for architecture-specific per-CPU registers and per-CPU memory area management.
For information about the full multi-CPU implementation details, see Memory Management Internals. For architecture-specific code generation in multi-CPU scenarios, see Architecture-Specific Code Generation.
Overview
The naive implementation is activated when the `sp-naive` cargo feature is enabled. It provides a drastically simplified approach where:
- Per-CPU variables become standard global variables
- No per-CPU memory areas are allocated
- Architecture-specific registers (`GS_BASE`, `TPIDR_ELx`, `gp`, `$r21`) are not used
- All runtime functions become no-ops or return constant values
This approach is suitable for single-core embedded systems, testing environments, or scenarios where the overhead of true per-CPU data management is unnecessary.
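The user-facing API does not change with the feature. As a minimal sketch (the `def_percpu` attribute matches the macro described below; the `read_current`/`write_current` accessor names follow the percpu crate's public API and should be treated as assumptions for your exact version):

```rust
// The same declaration works with or without `sp-naive`; with the feature
// enabled, the variable is backed by an ordinary global instead of a
// per-CPU memory area.
#[percpu::def_percpu]
static COUNTER: usize = 0;

fn bump_counter() -> usize {
    // Under `sp-naive`, these accessors read and write the global directly,
    // with no per-CPU register or offset arithmetic involved.
    COUNTER.write_current(COUNTER.read_current() + 1);
    COUNTER.read_current()
}
```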
Runtime Implementation
Naive Runtime Functions
The naive implementation provides stub versions of all runtime functions that would normally manage per-CPU memory areas and registers:
```mermaid
flowchart TD
    subgraph subGraph1["Return Values"]
        ZERO1["0"]
        ONE1["1"]
        ZERO2["0"]
        ZERO3["0"]
        NOOP1["no-op"]
        NOOP2["no-op"]
        ONE2["1"]
    end
    subgraph subGraph0["Naive Runtime Functions"]
        PSZ["percpu_area_size"]
        PNUM["percpu_area_num"]
        PBASE["percpu_area_base"]
        READ["read_percpu_reg"]
        WRITE["write_percpu_reg"]
        INIT_REG["init_percpu_reg"]
        INIT["init"]
    end
    INIT --> ONE2
    INIT_REG --> NOOP2
    PBASE --> ZERO2
    PNUM --> ONE1
    PSZ --> ZERO1
    READ --> ZERO3
    WRITE --> NOOP1
```
Function Stub Implementations
| Function | Purpose | Naive Return Value |
|---|---|---|
| `percpu_area_size()` | Get per-CPU area size | Always `0` |
| `percpu_area_num()` | Get number of per-CPU areas | Always `1` |
| `percpu_area_base(cpu_id)` | Get base address for CPU | Always `0` |
| `read_percpu_reg()` | Read per-CPU register | Always `0` |
| `write_percpu_reg(tp)` | Write per-CPU register | No effect |
| `init_percpu_reg(cpu_id)` | Initialize CPU register | No effect |
| `init()` | Initialize per-CPU system | Returns `1` |
Sources: percpu/src/naive.rs(L3 - L54)
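For orientation, a condensed sketch of what these stubs amount to, based on the return values above (illustrative stand-ins only; the real definitions live in percpu/src/naive.rs and exact signatures may differ):

```rust
// Illustrative stand-ins for the naive runtime functions, not the literal source.
pub fn percpu_area_size() -> usize { 0 }                // no per-CPU area is reserved
pub fn percpu_area_num() -> usize { 1 }                 // the system behaves as one CPU
pub fn percpu_area_base(_cpu_id: usize) -> usize { 0 }  // no base address to compute
pub fn read_percpu_reg() -> usize { 0 }                 // no per-CPU register to read
pub fn write_percpu_reg(_tp: usize) {}                  // writing the register is a no-op
pub fn init_percpu_reg(_cpu_id: usize) {}               // nothing to set up per CPU
pub fn init() -> usize { 1 }                            // reports a single per-CPU area
```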
Code Generation Differences
Macro Implementation Comparison
The naive implementation generates fundamentally different code compared to the standard multi-CPU implementation:
```mermaid
flowchart TD
    subgraph subGraph2["Naive Implementation"]
        NAV_OFFSET["gen_offset<br>addr_of!(symbol)"]
        NAV_PTR["gen_current_ptr<br>addr_of!(symbol)"]
        NAV_READ["gen_read_current_raw<br>*self.current_ptr()"]
        NAV_WRITE["gen_write_current_raw<br>direct assignment"]
    end
    subgraph subGraph1["Standard Implementation"]
        STD_OFFSET["gen_offset<br>register + offset"]
        STD_PTR["gen_current_ptr<br>register-based access"]
        STD_READ["gen_read_current_raw<br>register + offset"]
        STD_WRITE["gen_write_current_raw<br>register + offset"]
    end
    subgraph subGraph0["Source Code"]
        DEF["#[def_percpu]<br>static VAR: T = init;"]
    end
    DEF --> NAV_OFFSET
    DEF --> STD_OFFSET
    NAV_OFFSET --> NAV_PTR
    NAV_PTR --> NAV_READ
    NAV_READ --> NAV_WRITE
    STD_OFFSET --> STD_PTR
    STD_PTR --> STD_READ
    STD_READ --> STD_WRITE
```
Key Code Generation Functions
The naive macro implementation in percpu_macros/src/naive.rs(L6 - L28) provides these core functions:
- `gen_offset()`: Returns `::core::ptr::addr_of!(symbol) as usize` instead of calculating register-relative offsets
- `gen_current_ptr()`: Returns `::core::ptr::addr_of!(symbol)` for direct global access
- `gen_read_current_raw()`: Uses `*self.current_ptr()` for a simple dereference
- `gen_write_current_raw()`: Uses direct assignment `*(self.current_ptr() as *mut T) = val`
Sources: percpu_macros/src/naive.rs(L6 - L28)
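Conceptually, the generated accessors collapse to address-of operations on a single global symbol. A hedged sketch of what this amounts to for a `u64` variable (the `var_*` function names and overall layout are illustrative, not the literal macro output):

```rust
// Illustrative expansion for `#[def_percpu] static VAR: u64 = 0;` under
// `sp-naive`; the real macro output differs in structure, but the access
// pattern is the same.
static mut __PERCPU_VAR: u64 = 0;

// gen_offset(): the "offset" is simply the symbol's address.
unsafe fn var_offset() -> usize {
    ::core::ptr::addr_of!(__PERCPU_VAR) as usize
}

// gen_current_ptr(): a direct pointer to the global; no per-CPU register involved.
unsafe fn var_current_ptr() -> *const u64 {
    ::core::ptr::addr_of!(__PERCPU_VAR)
}

// gen_read_current_raw(): a plain dereference of the global.
unsafe fn var_read_current_raw() -> u64 {
    *var_current_ptr()
}

// gen_write_current_raw(): direct assignment through a mutable pointer cast.
unsafe fn var_write_current_raw(val: u64) {
    *(var_current_ptr() as *mut u64) = val;
}
```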
Memory Model Comparison
Standard vs Naive Memory Layout
```mermaid
flowchart TD
    subgraph subGraph0["Standard Multi-CPU Model"]
        TEMPLATE["Template .percpu Section"]
        CPU0_AREA["CPU 0 Data Area"]
        CPU1_AREA["CPU 1 Data Area"]
        CPUN_AREA["CPU N Data Area"]
        CPU0_REG["CPU 0 Register"]
        CPU1_REG["CPU 1 Register"]
        CPUN_REG["CPU N Register"]
    end
    subgraph subGraph1["Naive Single-CPU Model"]
        GLOBAL_VAR["Global Variable Storage"]
        DIRECT_ACCESS["Direct Memory Access"]
    end
    CPU0_REG --> CPU0_AREA
    CPU1_REG --> CPU1_AREA
    CPUN_REG --> CPUN_AREA
    GLOBAL_VAR --> DIRECT_ACCESS
    TEMPLATE --> CPU0_AREA
    TEMPLATE --> CPU1_AREA
    TEMPLATE --> CPUN_AREA
```
Memory Allocation Differences
| Aspect | Standard Implementation | Naive Implementation |
|---|---|---|
| Memory Areas | Multiple per-CPU areas | Single global variables |
| Initialization | Copy template to each area | No copying required |
| Register Usage | Per-CPU register per core | No registers used |
| Address Calculation | Base + offset | Direct symbol address |
| Memory Overhead | `area_size * num_cpus` | Size of global variables only |
Sources: percpu/src/naive.rs(L1 - L54) README.md(L71 - L73)
Feature Integration
Build Configuration
The naive implementation is enabled through the `sp-naive` cargo feature:

```toml
[dependencies]
percpu = { version = "0.1", features = ["sp-naive"] }
```
When this feature is active:
- All architecture-specific code paths are bypassed
- No linker script modifications are required for per-CPU sections
- The system operates as if there is only one CPU
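A minimal boot-time sketch under these rules (the `init`/`init_percpu_reg` calls follow the stub table above, and the per-CPU variable and accessors are the same hypothetical ones used earlier; treat exact signatures as assumptions for your version):

```rust
#[percpu::def_percpu]
static CPU_ID: usize = 0;

fn boot_cpu_setup(cpu_id: usize) {
    // With the standard implementation these calls set up the per-CPU areas
    // and point the per-CPU register at this CPU's area; with `sp-naive`
    // they are the stubs listed earlier and effectively do nothing.
    percpu::init();
    percpu::init_percpu_reg(cpu_id);

    // Access code is identical either way; under `sp-naive` it is a plain
    // global read and write.
    CPU_ID.write_current(cpu_id);
    assert_eq!(CPU_ID.read_current(), cpu_id);
}
```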
Compatibility with Other Features
| Feature Combination | Behavior |
|---|---|
| `sp-naive` alone | Pure global variable mode |
| `sp-naive` + `preempt` | Global variables with preemption guards |
| `sp-naive` + `arm-el2` | Feature ignored, global variables used |
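Per the table, `sp-naive` can be combined with other features without changing call sites. A Cargo.toml sketch for the preemption-guarded combination (feature names as listed above):

```toml
[dependencies]
# Global-variable mode plus preemption guards around accessors.
percpu = { version = "0.1", features = ["sp-naive", "preempt"] }
```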
Sources: README.md(L69 - L79)
Use Cases and Limitations
Appropriate Use Cases
The naive implementation is suitable for:
- Single-core embedded systems: Where true per-CPU isolation is unnecessary
- Testing and development: Simplified debugging without architecture concerns
- Prototype development: Quick implementation without per-CPU complexity
- Resource-constrained environments: Minimal memory overhead requirements
Limitations
- No CPU isolation: All "per-CPU" variables are shared globally
- No scalability: Cannot be extended to multi-CPU systems without feature changes
- Limited performance benefits: No per-CPU cache locality optimizations
- Testing coverage gaps: May not expose multi-CPU race conditions during development