Supported Platforms
This document describes the CPU architectures and platforms supported by the percpu crate ecosystem: which register each architecture uses for per-CPU data, how accesses are implemented, and what to consider when integrating per-CPU data management on each platform.
For information about setting up these platforms in your build environment, see Installation and Setup. For details about the architecture-specific code generation mechanisms, see Architecture-Specific Code Generation.
Supported Architecture Overview
The percpu crate supports four CPU architectures, each using a different register for per-CPU data access:
| Architecture | per-CPU Register | Register Type | Supported Variants |
|---|---|---|---|
| x86_64 | GS_BASE | Segment base register | Standard x86_64 |
| AArch64 | TPIDR_ELx | Thread pointer register | EL1, EL2 modes |
| RISC-V | gp | Global pointer register | 32-bit, 64-bit |
| LoongArch | $r21 | General purpose register | 64-bit |
Platform Register Usage
flowchart TD
subgraph subGraph2["Access Methods"]
X86_ASM["mov gs:[offset]"]
ARM_ASM["mrs/msr instructions"]
RISCV_ASM["lui/addi with gp"]
LOONG_ASM["lu12i.w/ori with $r21"]
end
subgraph subGraph1["Per-CPU Registers"]
GS["GS_BASE<br/>Segment Register"]
TPIDR["TPIDR_ELx<br/>Thread Pointer"]
GP["gp<br/>Global Pointer"]
R21["$r21<br/>General Purpose"]
end
subgraph subGraph0["Architecture Support"]
X86["x86_64"]
ARM["AArch64"]
RISCV["RISC-V"]
LOONG["LoongArch"]
end
ARM --> TPIDR
GP --> RISCV_ASM
GS --> X86_ASM
LOONG --> R21
R21 --> LOONG_ASM
RISCV --> GP
TPIDR --> ARM_ASM
X86 --> GS
Sources: README.md(L19 - L31) percpu_macros/src/arch.rs(L21 - L46)
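The table and diagram above share one pattern: a per-CPU register holds the base address of the current CPU's data area, and a variable is located by adding its fixed offset to that base. The sketch below simulates this with ordinary values. `SimCpu`, `percpu_addr`, `example_cpus`, and the addresses are hypothetical names chosen for illustration; they are not part of the percpu API.

```rust
/// Simulated CPU. `percpu_base` stands in for the architecture's per-CPU
/// register (GS_BASE, TPIDR_ELx, gp, or $r21). Hypothetical type.
struct SimCpu {
    percpu_base: usize,
}

/// Address of a per-CPU variable as seen from `cpu`:
/// the register value plus the variable's fixed offset.
fn percpu_addr(cpu: &SimCpu, var_offset: usize) -> usize {
    cpu.percpu_base + var_offset
}

/// Two simulated CPUs whose data areas start at made-up addresses.
fn example_cpus() -> [SimCpu; 2] {
    [
        SimCpu { percpu_base: 0x1000 },
        SimCpu { percpu_base: 0x2000 },
    ]
}
```

The same offset resolves to a different address on each CPU, which is why the generated code never needs to know which CPU it is running on.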
Platform-Specific Implementation Details
x86_64 Architecture
The x86_64 implementation uses the GS_BASE model-specific register to store the base address of the per-CPU data area. This approach leverages the x86_64 segment architecture for efficient per-CPU data access.
Key characteristics:
- Uses GS_BASE MSR for per-CPU base address storage
- Accesses data via gs:[offset] addressing mode
- Requires offset values ≤ 0xffff_ffff for 32-bit displacement
- Supports optimized single-instruction memory operations
flowchart TD
subgraph subGraph1["Generated Assembly"]
MOV_READ["mov reg, gs:[offset VAR]<br/>Direct read"]
MOV_WRITE["mov gs:[offset VAR], reg<br/>Direct write"]
ADDR_CALC["mov reg, offset VAR<br/>add reg, gs:[__PERCPU_SELF_PTR]"]
end
subgraph subGraph0["x86_64 Implementation"]
GS_BASE["GS_BASE MSR<br/>Base Address"]
OFFSET["Symbol Offset<br/>Calculated at compile-time"]
ACCESS["gs:[offset] Access<br/>Single instruction"]
end
ACCESS --> ADDR_CALC
ACCESS --> MOV_READ
ACCESS --> MOV_WRITE
GS_BASE --> ACCESS
OFFSET --> ACCESS
Sources: percpu_macros/src/arch.rs(L66 - L76) percpu_macros/src/arch.rs(L131 - L150) percpu_macros/src/arch.rs(L232 - L251)
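The address-calculation path in the diagram (mov reg, offset VAR followed by add reg, gs:[__PERCPU_SELF_PTR]) suggests that each per-CPU area stores its own base address in a slot that can be read through GS. The following is a minimal simulation of that idea, not the crate's code: `SimArea`, `SELF_PTR_INDEX`, and the word-indexed layout are assumptions made for illustration.

```rust
/// Hypothetical slot (in words) where the area stores its own base address,
/// standing in for the `__PERCPU_SELF_PTR` symbol from the diagram.
const SELF_PTR_INDEX: usize = 0;

/// Simulated per-CPU data area addressed through GS.
struct SimArea {
    words: Vec<usize>,
}

impl SimArea {
    /// Build an area located at `base` whose self-pointer slot holds `base`.
    /// `len_words` must be at least 1 so the self-pointer slot exists.
    fn new(base: usize, len_words: usize) -> Self {
        let mut words = vec![0; len_words];
        words[SELF_PTR_INDEX] = base;
        SimArea { words }
    }

    /// Models `mov reg, gs:[index]`: a single load relative to the area.
    fn gs_load(&self, index: usize) -> usize {
        self.words[index]
    }

    /// Models `mov reg, offset VAR` + `add reg, gs:[__PERCPU_SELF_PTR]`:
    /// the variable's absolute address is its offset plus the stored base.
    fn symbol_addr(&self, var_offset: usize) -> usize {
        var_offset + self.gs_load(SELF_PTR_INDEX)
    }
}
```

Under this reading, plain reads and writes need only the gs-relative form, while code that needs a pointer to the variable takes the two-instruction path.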
AArch64 Architecture
The AArch64 implementation uses thread pointer registers that vary based on the exception level. The system supports both EL1 (kernel) and EL2 (hypervisor) execution environments.
Key characteristics:
- Uses TPIDR_EL1 by default, TPIDR_EL2 when the arm-el2 feature is enabled
- Requires mrs/msr instructions for register access
- Offset calculations limited to 16-bit immediate values (≤ 0xffff)
- Two-instruction sequence for per-CPU data access
The register selection is controlled at compile time:
flowchart TD
subgraph subGraph2["Access Pattern"]
MRS["mrs reg, TPIDR_ELx<br/>Get base address"]
CALC["add reg, offset<br/>Calculate final address"]
ACCESS["ldr/str operations<br/>Memory access"]
end
subgraph subGraph1["Register Usage"]
EL1["TPIDR_EL1<br/>Kernel Mode"]
EL2["TPIDR_EL2<br/>Hypervisor Mode"]
end
subgraph subGraph0["AArch64 Register Selection"]
FEATURE["arm-el2 feature"]
DEFAULT["Default Mode"]
end
CALC --> ACCESS
DEFAULT --> EL1
EL1 --> MRS
EL2 --> MRS
FEATURE --> EL2
MRS --> CALC
Sources: percpu_macros/src/arch.rs(L55 - L62) percpu_macros/src/arch.rs(L79 - L80) README.md(L33 - L35)
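The register selection and the offset limit described above can be sketched as ordinary functions. This is an illustration only: the `arm_el2` parameter is a plain bool standing in for the crate's arm-el2 Cargo feature, the `x0` destination register is arbitrary, and the function names are hypothetical.

```rust
/// Pick the thread pointer register name. `arm_el2` stands in for the
/// `arm-el2` Cargo feature (an assumption of this sketch).
fn tpidr_name(arm_el2: bool) -> &'static str {
    if arm_el2 {
        "TPIDR_EL2"
    } else {
        "TPIDR_EL1"
    }
}

/// Assembly text for the first step of the access pattern: read the base.
/// The destination register `x0` is chosen arbitrarily for the example.
fn read_base_asm(arm_el2: bool) -> String {
    format!("mrs x0, {}", tpidr_name(arm_el2))
}

/// Whether an offset fits the 16-bit immediate limit stated above.
fn offset_fits(offset: usize) -> bool {
    offset <= 0xffff
}
```

Because the choice is made at compile time, a single build targets exactly one exception level.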
RISC-V Architecture
The RISC-V implementation uses the global pointer (gp) register for per-CPU data base addressing. This is a deviation from typical RISC-V conventions where gp is used for global data access.
Key characteristics:
- Uses gp register instead of standard thread pointer (tp)
- Supports both 32-bit and 64-bit variants
- Uses lui/addi instruction sequence for address calculation
- Offset values limited to 32-bit signed immediate range
Important note: The tp register remains available for thread-local storage, while gp is repurposed for per-CPU data.
flowchart TD
subgraph subGraph2["Memory Operations"]
LOAD["ld/lw/lh/lb instructions<br/>Based on data type"]
STORE["sd/sw/sh/sb instructions<br/>Based on data type"]
end
subgraph subGraph1["Address Calculation"]
LUI["lui reg, %hi(VAR)<br/>Load upper immediate"]
ADDI["addi reg, reg, %lo(VAR)<br/>Add lower immediate"]
FINAL["add reg, reg, gp<br/>Add base address"]
end
subgraph subGraph0["RISC-V Register Usage"]
GP["gp register<br/>Per-CPU base"]
TP["tp register<br/>Thread-local storage"]
OFFSET["Symbol offset<br/>%hi/%lo split"]
end
ADDI --> FINAL
FINAL --> LOAD
FINAL --> STORE
GP --> FINAL
LUI --> ADDI
OFFSET --> LUI
Sources: percpu_macros/src/arch.rs(L81 - L82) percpu_macros/src/arch.rs(L33 - L39) README.md(L28 - L31)
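The %hi/%lo split in the diagram has one subtlety worth spelling out: addi takes a signed 12-bit immediate, so the upper part must be rounded up whenever the lower 12 bits would be read as negative. The sketch below models the three-instruction sequence with 32-bit wrapping arithmetic. The function names are hypothetical, and the model ignores the sign extension that lui performs on 64-bit targets.

```rust
/// Split a 32-bit offset into the parts selected by %hi and %lo.
/// %lo is a signed 12-bit immediate, so %hi is rounded up by 0x800
/// to compensate when the low part is negative.
fn split_hi_lo(offset: u32) -> (u32, i32) {
    let hi = offset.wrapping_add(0x800) >> 12;
    // Sign-extend the low 12 bits.
    let lo = (((offset & 0xfff) as i32) << 20) >> 20;
    (hi, lo)
}

/// Models `lui reg, %hi(VAR)` followed by `addi reg, reg, %lo(VAR)`
/// using 32-bit wrapping arithmetic.
fn lui_addi(hi: u32, lo: i32) -> u32 {
    (hi << 12).wrapping_add(lo as u32)
}

/// Models the final `add reg, reg, gp`.
fn percpu_addr(gp: u64, offset: u32) -> u64 {
    let (hi, lo) = split_hi_lo(offset);
    gp.wrapping_add(u64::from(lui_addi(hi, lo)))
}
```

For example, the offset 0x1800 splits into an upper part of 2 and a lower part of -0x800, and the two instructions recombine them into 0x1800.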
LoongArch Architecture
The LoongArch implementation uses the $r21 general-purpose register for per-CPU data base addressing. Register-indexed load and store instructions (ldx/stx) let an access combine the $r21 base with a computed offset in a single instruction.
Key characteristics:
- Uses $r21 general-purpose register for base addressing
- Supports 64-bit architecture (LoongArch64)
- Uses lu12i.w/ori instruction sequence for address calculation
- Provides specialized load/store indexed instructions
flowchart TD
subgraph subGraph1["Instruction Sequences"]
LU12I["lu12i.w reg, %abs_hi20(VAR)<br/>Load upper 20 bits"]
ORI["ori reg, reg, %abs_lo12(VAR)<br/>OR lower 12 bits"]
LDX["ldx.d/ldx.w/ldx.h/ldx.bu<br/>Load indexed"]
STX["stx.d/stx.w/stx.h/stx.b<br/>Store indexed"]
end
subgraph subGraph0["LoongArch Implementation"]
R21["$r21 register<br/>Per-CPU base"]
CALC["Address calculation<br/>lu12i.w + ori"]
INDEXED["Indexed operations<br/>ldx/stx instructions"]
end
CALC --> LU12I
LU12I --> ORI
ORI --> LDX
ORI --> STX
R21 --> INDEXED
Sources: percpu_macros/src/arch.rs(L83 - L84) percpu_macros/src/arch.rs(L40 - L46) percpu_macros/src/arch.rs(L114 - L129)
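The lu12i.w/ori pair can be modeled in a few lines. Because ori zero-extends its 12-bit immediate, the split is a plain bit partition with no rounding step. The sketch below uses hypothetical function names and 32-bit arithmetic; on LoongArch64, lu12i.w sign-extends its result, so the model is only meant for offsets below 0x8000_0000.

```rust
/// Split an offset into the 20-bit and 12-bit parts selected by
/// %abs_hi20 and %abs_lo12.
fn split_abs(offset: u32) -> (u32, u32) {
    (offset >> 12, offset & 0xfff)
}

/// Models `lu12i.w reg, %abs_hi20(VAR)` followed by
/// `ori reg, reg, %abs_lo12(VAR)`.
fn lu12i_w_ori(hi20: u32, lo12: u32) -> u32 {
    (hi20 << 12) | lo12
}

/// Models the effective address of an indexed load or store such as
/// `ldx.d dst, $r21, reg`: the per-CPU base plus the computed offset.
fn indexed_addr(r21: u64, offset: u32) -> u64 {
    let (hi20, lo12) = split_abs(offset);
    r21.wrapping_add(u64::from(lu12i_w_ori(hi20, lo12)))
}
```

The indexed form means no separate add instruction is needed before the memory access.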
Continuous Integration and Testing Coverage
The percpu crate is checked on all supported platforms through an automated CI pipeline. Every target in the matrix is built and linted; unit tests run only on the x86_64 Linux userspace target, since the bare-metal targets cannot execute tests on the CI host.
CI Target Matrix:
| Target | Architecture | Environment | Test Coverage |
|---|---|---|---|
| x86_64-unknown-linux-gnu | x86_64 | Linux userspace | Full tests + unit tests |
| x86_64-unknown-none | x86_64 | Bare metal/no_std | Build + clippy only |
| riscv64gc-unknown-none-elf | RISC-V 64 | Bare metal/no_std | Build + clippy only |
| aarch64-unknown-none-softfloat | AArch64 | Bare metal/no_std | Build + clippy only |
| loongarch64-unknown-none-softfloat | LoongArch64 | Bare metal/no_std | Build + clippy only |
Testing Strategy:
flowchart TD
subgraph subGraph2["Feature Combinations"]
SP_NAIVE["sp-naive feature<br/>Single-core testing"]
PREEMPT["preempt feature<br/>Preemption safety"]
ARM_EL2["arm-el2 feature<br/>Hypervisor mode"]
end
subgraph subGraph1["Test Types"]
BUILD["Build Tests<br/>All targets"]
UNIT["Unit Tests<br/>x86_64 Linux only"]
LINT["Code Quality<br/>All targets"]
end
subgraph subGraph0["CI Pipeline"]
MATRIX["Target Matrix<br/>5 platforms tested"]
FEATURES["Feature Testing<br/>preempt, arm-el2"]
TOOLS["Quality Tools<br/>clippy, rustfmt"]
end
BUILD --> ARM_EL2
BUILD --> PREEMPT
BUILD --> SP_NAIVE
FEATURES --> ARM_EL2
FEATURES --> PREEMPT
FEATURES --> SP_NAIVE
MATRIX --> BUILD
MATRIX --> LINT
MATRIX --> UNIT
Sources: .github/workflows/ci.yml(L8 - L32)
Platform-Specific Limitations and Considerations
macOS Development Limitation
On macOS hosts, the crate disables its architecture-specific assembly and replaces it with unimplemented stubs. This limitation affects local development on macOS only; code built for the supported targets is unaffected.
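One common way to express such a host-specific stub in Rust is a pair of `cfg`-gated definitions. The sketch below shows the general pattern only; `read_percpu_word` is a hypothetical helper, and the slice lookup is a stand-in for the position where architecture-specific assembly would appear.

```rust
/// Hypothetical accessor used on every host except macOS.
#[cfg(not(target_os = "macos"))]
fn read_percpu_word(area: &[usize], index: usize) -> usize {
    // Stand-in for architecture-specific inline assembly.
    area[index]
}

/// On macOS hosts the same function exists but panics if it is called,
/// so code that merely references it still compiles.
#[cfg(target_os = "macos")]
fn read_percpu_word(_area: &[usize], _index: usize) -> usize {
    unimplemented!("per-CPU assembly is stubbed out on macOS hosts")
}
```

Callers see one signature on every host, so the rest of the code does not need its own conditional compilation.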
Offset Size Constraints
Each architecture imposes different constraints on the maximum offset values for per-CPU variables:
- x86_64: 32-bit displacement (≤ 0xffff_ffff)
- AArch64: 16-bit immediate value (≤ 0xffff)
- RISC-V: 32-bit signed immediate (split hi/lo)
- LoongArch: 32-bit absolute address (split hi20/lo12)
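These limits can be captured in a small validation helper. The sketch below is one reading of the list above, not values taken from the crate's source: the `Arch` enum, `max_offset`, and `check_offset` are hypothetical, and the RISC-V bound assumes that "32-bit signed" means the largest positive 32-bit signed value.

```rust
/// Architectures covered by this page. Hypothetical enum for illustration.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Arch {
    X86_64,
    AArch64,
    RiscV,
    LoongArch,
}

/// Largest per-CPU offset for each architecture, as read from the list above.
fn max_offset(arch: Arch) -> u64 {
    match arch {
        Arch::X86_64 => 0xffff_ffff,    // 32-bit displacement
        Arch::AArch64 => 0xffff,        // 16-bit immediate
        Arch::RiscV => 0x7fff_ffff,     // assumed: positive signed 32-bit range
        Arch::LoongArch => 0xffff_ffff, // 32-bit absolute, hi20/lo12 split
    }
}

/// Return the offset if it is usable on `arch`, or a description of the limit.
fn check_offset(arch: Arch, offset: u64) -> Result<u64, String> {
    let limit = max_offset(arch);
    if offset <= limit {
        Ok(offset)
    } else {
        Err(format!("offset {offset:#x} exceeds {limit:#x} on {arch:?}"))
    }
}
```

AArch64 has by far the tightest bound, so it is the constraint most likely to matter when sizing a per-CPU data area.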
Register Conflicts and Conventions
- RISC-V: Repurposes gp register, deviating from standard ABI conventions
- AArch64: Requires careful exception level management for register selection
- x86_64: Depends on GS segment register configuration by system initialization
Sources: percpu_macros/src/arch.rs(L4 - L13) percpu_macros/src/arch.rs(L23 - L46) README.md(L28 - L35)