Naive Implementation
Purpose and Scope
This document covers the sp-naive feature implementation of the percpu crate, which provides a simplified fallback for single-CPU systems. This implementation treats all per-CPU variables as regular global variables, eliminating the need for architecture-specific per-CPU registers and memory area management.
For information about the full multi-CPU implementation details, see Memory Management Internals. For architecture-specific code generation in multi-CPU scenarios, see Architecture-Specific Code Generation.
Overview
The naive implementation is activated when the sp-naive cargo feature is enabled. It provides a drastically simplified approach where:
- Per-CPU variables become standard global variables
- No per-CPU memory areas are allocated
- Architecture-specific registers (GS_BASE, TPIDR_ELx, gp, $r21) are not used
- All runtime functions become no-ops or return constant values
This approach is suitable for single-core embedded systems, testing environments, or scenarios where the overhead of true per-CPU data management is unnecessary.
Runtime Implementation
Naive Runtime Functions
The naive implementation provides stub versions of all runtime functions that would normally manage per-CPU memory areas and registers:
flowchart TD
subgraph subGraph1["Return Values"]
ZERO1["0"]
ONE1["1"]
ZERO2["0"]
ZERO3["0"]
NOOP1["no-op"]
NOOP2["no-op"]
ONE2["1"]
end
subgraph subGraph0["Naive Runtime Functions"]
PSZ["percpu_area_size"]
PNUM["percpu_area_num"]
PBASE["percpu_area_base"]
READ["read_percpu_reg"]
WRITE["write_percpu_reg"]
INIT_REG["init_percpu_reg"]
INIT["init"]
end
INIT --> ONE2
INIT_REG --> NOOP2
PBASE --> ZERO2
PNUM --> ONE1
PSZ --> ZERO1
READ --> ZERO3
WRITE --> NOOP1
Function Stub Implementations
| Function | Purpose | Naive Return Value |
|---|---|---|
| percpu_area_size() | Get per-CPU area size | Always 0 |
| percpu_area_num() | Get number of per-CPU areas | Always 1 |
| percpu_area_base(cpu_id) | Get base address for CPU | Always 0 |
| read_percpu_reg() | Read per-CPU register | Always 0 |
| write_percpu_reg(tp) | Write per-CPU register | No effect |
| init_percpu_reg(cpu_id) | Initialize CPU register | No effect |
| init() | Initialize per-CPU system | Returns 1 |
Sources: percpu/src/naive.rs(L3 - L54)
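The stub table can be illustrated with a minimal sketch. The function names and return values come from the table; the parameter types, return types, and the choice to make every function safe are assumptions of this sketch and are not copied from percpu/src/naive.rs.

```rust
// Hypothetical stand-ins for the naive runtime stubs.
// Signatures are assumed for illustration; only the names and the
// constant results are taken from the table in this document.

/// Size of one per-CPU area. No area exists in naive mode, so this is 0.
pub fn percpu_area_size() -> usize {
    0
}

/// Number of per-CPU areas. The system is treated as having one CPU.
pub fn percpu_area_num() -> usize {
    1
}

/// Base address of the area for `cpu_id`. There is no area, so this is 0.
pub fn percpu_area_base(_cpu_id: usize) -> usize {
    0
}

/// Reads the per-CPU register. No register is used, so this is 0.
pub fn read_percpu_reg() -> usize {
    0
}

/// Writes the per-CPU register. The value is discarded.
pub fn write_percpu_reg(_tp: usize) {}

/// Initializes the per-CPU register for `cpu_id`. Nothing to do.
pub fn init_percpu_reg(_cpu_id: usize) {}

/// Initializes the per-CPU system and returns the constant 1.
pub fn init() -> usize {
    1
}
```

Because every function is constant or empty, callers written against the multi-CPU API can keep the same call sequence (init, then init_percpu_reg on each CPU) without any conditional code.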
Code Generation Differences
Macro Implementation Comparison
The naive implementation generates fundamentally different code compared to the standard multi-CPU implementation:
flowchart TD
subgraph subGraph2["Naive Implementation"]
NAV_OFFSET["gen_offset: addr_of!(symbol)"]
NAV_PTR["gen_current_ptr: addr_of!(symbol)"]
NAV_READ["gen_read_current_raw: *self.current_ptr()"]
NAV_WRITE["gen_write_current_raw: direct assignment"]
end
subgraph subGraph1["Standard Implementation"]
STD_OFFSET["gen_offset: register + offset"]
STD_PTR["gen_current_ptr: register-based access"]
STD_READ["gen_read_current_raw: register + offset"]
STD_WRITE["gen_write_current_raw: register + offset"]
end
subgraph subGraph0["Source Code"]
DEF["#[def_percpu]static VAR: T = init;"]
end
DEF --> NAV_OFFSET
DEF --> STD_OFFSET
NAV_OFFSET --> NAV_PTR
NAV_PTR --> NAV_READ
NAV_READ --> NAV_WRITE
STD_OFFSET --> STD_PTR
STD_PTR --> STD_READ
STD_READ --> STD_WRITE
Key Code Generation Functions
The naive macro implementation in percpu_macros/src/naive.rs(L6 - L28) provides these core functions:
- gen_offset(): Returns ::core::ptr::addr_of!(symbol) as usize instead of calculating register-relative offsets
- gen_current_ptr(): Returns ::core::ptr::addr_of!(symbol) for direct global access
- gen_read_current_raw(): Uses *self.current_ptr() for simple dereference
- gen_write_current_raw(): Uses direct assignment *(self.current_ptr() as *mut T) = val
Sources: percpu_macros/src/naive.rs(L6 - L28)
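To make the four expressions above concrete, here is a hand-written sketch of what the accessors could look like for a single u64 variable. The names NaiveCounter and COUNTER are hypothetical, and the sketch stores the value in an UnsafeCell rather than using addr_of! on a raw static as the macro does; it is an illustration of the access pattern, not the macro's actual expansion.

```rust
use core::cell::UnsafeCell;

/// Hypothetical wrapper standing in for a macro-generated accessor type.
pub struct NaiveCounter {
    value: UnsafeCell<u64>,
}

// SAFETY (assumption of this sketch): only one CPU ever touches the
// variable, and callers rule out concurrent access themselves.
unsafe impl Sync for NaiveCounter {}

/// The "per-CPU" variable is just an ordinary global.
pub static COUNTER: NaiveCounter = NaiveCounter {
    value: UnsafeCell::new(0),
};

impl NaiveCounter {
    /// In naive mode the "offset" is simply the variable's address.
    pub fn offset(&self) -> usize {
        self.value.get() as usize
    }

    /// Pointer to the variable itself; no register is consulted.
    pub fn current_ptr(&self) -> *const u64 {
        self.value.get() as *const u64
    }

    /// Plain dereference of `current_ptr()`.
    ///
    /// # Safety
    /// The caller must ensure no other access happens concurrently.
    pub unsafe fn read_current_raw(&self) -> u64 {
        unsafe { *self.current_ptr() }
    }

    /// Direct assignment through `current_ptr()` cast to `*mut`.
    ///
    /// # Safety
    /// The caller must ensure no other access happens concurrently.
    pub unsafe fn write_current_raw(&self, val: u64) {
        unsafe { *(self.current_ptr() as *mut u64) = val }
    }
}
```

Note that offset() and current_ptr() yield the same address here, which matches the description above: both are derived from the symbol's address rather than from a register plus an offset.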
Memory Model Comparison
Standard vs Naive Memory Layout
flowchart TD
subgraph subGraph0["Standard Multi-CPU Model"]
TEMPLATE["Template .percpu Section"]
CPU0_AREA["CPU 0 Data Area"]
CPU1_AREA["CPU 1 Data Area"]
CPUN_AREA["CPU N Data Area"]
CPU0_REG["CPU 0 Register"]
CPU1_REG["CPU 1 Register"]
CPUN_REG["CPU N Register"]
end
subgraph subGraph1["Naive Single-CPU Model"]
GLOBAL_VAR["Global Variable Storage"]
DIRECT_ACCESS["Direct Memory Access"]
end
CPU0_REG --> CPU0_AREA
CPU1_REG --> CPU1_AREA
CPUN_REG --> CPUN_AREA
GLOBAL_VAR --> DIRECT_ACCESS
TEMPLATE --> CPU0_AREA
TEMPLATE --> CPU1_AREA
TEMPLATE --> CPUN_AREA
Memory Allocation Differences
| Aspect | Standard Implementation | Naive Implementation |
|---|---|---|
| Memory Areas | Multiple per-CPU areas | Single global variables |
| Initialization | Copy template to each area | No copying required |
| Register Usage | Per-CPU register per core | No registers used |
| Address Calculation | Base + offset | Direct symbol address |
| Memory Overhead | area_size * num_cpus | Size of global variables only |
Sources: percpu/src/naive.rs(L1 - L54) README.md(L71 - L73)
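The "Address Calculation" and "Memory Overhead" rows can be expressed as simple arithmetic. The sketch below uses hypothetical helper names and assumes the standard model places per-CPU areas contiguously, one after another, starting at a common base; the real layout may add alignment padding between areas.

```rust
/// Standard model (assumed contiguous layout): the variable's address is
/// the base of the CPU's area plus the variable's offset within the area.
pub fn standard_addr(first_area_base: usize, area_size: usize, cpu_id: usize, offset: usize) -> usize {
    first_area_base + cpu_id * area_size + offset
}

/// Standard model: total memory reserved for per-CPU areas.
pub fn standard_overhead(area_size: usize, num_cpus: usize) -> usize {
    area_size * num_cpus
}

/// Naive model: the variable's address is the symbol address itself,
/// with no base, CPU index, or offset involved.
pub fn naive_addr(symbol_addr: usize) -> usize {
    symbol_addr
}
```

For example, with a first area at 0x1000, an area size of 0x40, and a variable at offset 8, CPU 2 would access address 0x1088 in the standard model, and four CPUs would reserve 256 bytes in total. In the naive model the same variable occupies only its own storage at a single address.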
Feature Integration
Build Configuration
The naive implementation is enabled through the sp-naive cargo feature:
[dependencies]
percpu = { version = "0.1", features = ["sp-naive"] }
When this feature is active:
- All architecture-specific code paths are bypassed
- No linker script modifications are required for per-CPU sections
- The system operates as if there is only one CPU
Compatibility with Other Features
| Feature Combination | Behavior |
|---|---|
| sp-naive alone | Pure global variable mode |
| sp-naive + preempt | Global variables with preemption guards |
| sp-naive + arm-el2 | Feature ignored, global variables used |
Sources: README.md(L69 - L79)
Use Cases and Limitations
Appropriate Use Cases
The naive implementation is suitable for:
- Single-core embedded systems: Where true per-CPU isolation is unnecessary
- Testing and development: Simplified debugging without architecture concerns
- Prototype development: Quick implementation without per-CPU complexity
- Resource-constrained environments: Minimal memory overhead requirements
Limitations
- No CPU isolation: All "per-CPU" variables are shared globally
- No scalability: Cannot be extended to multi-CPU systems without feature changes
- Limited performance benefits: No per-CPU cache locality optimizations
- Testing coverage gaps: May not expose multi-CPU race conditions during development