Naive Implementation

Relevant source files

Purpose and Scope

This document covers the sp-naive feature implementation of the percpu crate, which provides a simplified fallback for single-CPU systems. This implementation treats all per-CPU variables as regular global variables, eliminating the need for architecture-specific per-CPU registers and memory area management.

For information about the full multi-CPU implementation details, see Memory Management Internals. For architecture-specific code generation in multi-CPU scenarios, see Architecture-Specific Code Generation.

Overview

The naive implementation is activated when the sp-naive cargo feature is enabled. It provides a drastically simplified approach where:

  • Per-CPU variables become standard global variables
  • No per-CPU memory areas are allocated
  • Architecture-specific registers (GS_BASE, TPIDR_ELx, gp, $r21) are not used
  • All runtime functions become no-ops or return constant values

This approach is suitable for single-core embedded systems, testing environments, or scenarios where the overhead of true per-CPU data management is unnecessary.

Runtime Implementation

Naive Runtime Functions

The naive implementation provides stub versions of all runtime functions that would normally manage per-CPU memory areas and registers:

flowchart TD
subgraph subGraph1["Return Values"]
    ZERO1["0"]
    ONE1["1"]
    ZERO2["0"]
    ZERO3["0"]
    NOOP1["no-op"]
    NOOP2["no-op"]
    ONE2["1"]
end
subgraph subGraph0["Naive Runtime Functions"]
    PSZ["percpu_area_size"]
    PNUM["percpu_area_num"]
    PBASE["percpu_area_base"]
    READ["read_percpu_reg"]
    WRITE["write_percpu_reg"]
    INIT_REG["init_percpu_reg"]
    INIT["init"]
end

INIT --> ONE2
INIT_REG --> NOOP2
PBASE --> ZERO2
PNUM --> ONE1
PSZ --> ZERO1
READ --> ZERO3
WRITE --> NOOP1

Function Stub Implementations

FunctionPurposeNaive Return Value
percpu_area_size()Get per-CPU area sizeAlways0
percpu_area_num()Get number of per-CPU areasAlways1
percpu_area_base(cpu_id)Get base address for CPUAlways0
read_percpu_reg()Read per-CPU registerAlways0
write_percpu_reg(tp)Write per-CPU registerNo effect
init_percpu_reg(cpu_id)Initialize CPU registerNo effect
init()Initialize per-CPU systemReturns1

Sources: percpu/src/naive.rs(L3 - L54) 

Code Generation Differences

Macro Implementation Comparison

The naive implementation generates fundamentally different code compared to the standard multi-CPU implementation:

flowchart TD
subgraph subGraph2["Naive Implementation"]
    NAV_OFFSET["gen_offsetaddr_of!(symbol)"]
    NAV_PTR["gen_current_ptraddr_of!(symbol)"]
    NAV_READ["gen_read_current_raw*self.current_ptr()"]
    NAV_WRITE["gen_write_current_rawdirect assignment"]
end
subgraph subGraph1["Standard Implementation"]
    STD_OFFSET["gen_offsetregister + offset"]
    STD_PTR["gen_current_ptrregister-based access"]
    STD_READ["gen_read_current_rawregister + offset"]
    STD_WRITE["gen_write_current_rawregister + offset"]
end
subgraph subGraph0["Source Code"]
    DEF["#[def_percpu]static VAR: T = init;"]
end

DEF --> NAV_OFFSET
DEF --> STD_OFFSET
NAV_OFFSET --> NAV_PTR
NAV_PTR --> NAV_READ
NAV_READ --> NAV_WRITE
STD_OFFSET --> STD_PTR
STD_PTR --> STD_READ
STD_READ --> STD_WRITE

Key Code Generation Functions

The naive macro implementation in percpu_macros/src/naive.rs(L6 - L28)  provides these core functions:

  • gen_offset(): Returns ::core::ptr::addr_of!(symbol) as usize instead of calculating register-relative offsets
  • gen_current_ptr(): Returns ::core::ptr::addr_of!(symbol) for direct global access
  • gen_read_current_raw(): Uses *self.current_ptr() for simple dereference
  • gen_write_current_raw(): Uses direct assignment *(self.current_ptr() as *mut T) = val

Sources: percpu_macros/src/naive.rs(L6 - L28) 

Memory Model Comparison

Standard vs Naive Memory Layout

flowchart TD
subgraph subGraph0["Standard Multi-CPU Model"]
    TEMPLATE["Template .percpu Section"]
    CPU1_AREA["CPU 1 Data Area"]
    CPUN_AREA["CPU N Data Area"]
    CPU1_REG["CPU 1 Register"]
    CPUN_REG["CPU N Register"]
    subgraph subGraph1["Naive Single-CPU Model"]
        GLOBAL_VAR["Global Variable Storage"]
        DIRECT_ACCESS["Direct Memory Access"]
        CPU0_AREA["CPU 0 Data Area"]
        CPU0_REG["CPU 0 Register"]
    end
end

CPU0_REG --> CPU0_AREA
CPU1_REG --> CPU1_AREA
CPUN_REG --> CPUN_AREA
GLOBAL_VAR --> DIRECT_ACCESS
TEMPLATE --> CPU0_AREA
TEMPLATE --> CPU1_AREA
TEMPLATE --> CPUN_AREA

Memory Allocation Differences

AspectStandard ImplementationNaive Implementation
Memory AreasMultiple per-CPU areasSingle global variables
InitializationCopy template to each areaNo copying required
Register UsagePer-CPU register per coreNo registers used
Address CalculationBase + offsetDirect symbol address
Memory Overheadarea_size * num_cpusSize of global variables only

Sources: percpu/src/naive.rs(L1 - L54)  README.md(L71 - L73) 

Feature Integration

Build Configuration

The naive implementation is enabled through the sp-naive cargo feature:

[dependencies]
percpu = { version = "0.1", features = ["sp-naive"] }

When this feature is active:

  • All architecture-specific code paths are bypassed
  • No linker script modifications are required for per-CPU sections
  • The system operates as if there is only one CPU

Compatibility with Other Features

Feature CombinationBehavior
sp-naivealonePure global variable mode
sp-naive+preemptGlobal variables with preemption guards
sp-naive+arm-el2Feature ignored, global variables used

Sources: README.md(L69 - L79) 

Use Cases and Limitations

Appropriate Use Cases

The naive implementation is suitable for:

  • Single-core embedded systems: Where true per-CPU isolation is unnecessary
  • Testing and development: Simplified debugging without architecture concerns
  • Prototype development: Quick implementation without per-CPU complexity
  • Resource-constrained environments: Minimal memory overhead requirements

Limitations

  • No CPU isolation: All "per-CPU" variables are shared globally
  • No scalability: Cannot be extended to multi-CPU systems without feature changes
  • Limited performance benefits: No per-CPU cache locality optimizations
  • Testing coverage gaps: May not expose multi-CPU race conditions during development

Sources: README.md(L71 - L73)  percpu/src/naive.rs(L1 - L2)