SMP vs Single-Core Implementation

Purpose and Scope

This document explains how the kspin crate uses the smp feature flag to select between two implementations at compile time: one for multi-core (SMP) targets and one for single-core targets. When smp is disabled, the atomic lock field and every atomic operation on it are compiled out, because a single core does not need them.

For details about the underlying BaseSpinLock structure, see BaseSpinLock and BaseSpinLockGuard. For information about how different guard types interact with both implementations, see BaseGuard Trait System. For technical details about atomic operations used in SMP mode, see Memory Ordering and Atomic Operations.

Feature Flag Architecture

The kspin crate uses conditional compilation through the smp feature flag to generate entirely different code paths for multi-core and single-core environments. This approach enables zero-cost abstractions where single-core systems pay no performance penalty for multi-core synchronization primitives.
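
As a rough illustration of the mechanism, a field and the methods that touch it can both be gated on the feature. This is a minimal sketch rather than the crate's source: SketchLock is a hypothetical name, the guard type parameter is left out for brevity, and std is used in place of core so the snippet builds on a host. It assumes a feature named smp is declared in Cargo.toml.

```rust
use std::cell::UnsafeCell;
#[cfg(feature = "smp")]
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical stand-in for a feature-gated lock; not the crate's real type.
pub struct SketchLock<T> {
    // This field exists only in builds where the `smp` feature is enabled.
    #[cfg(feature = "smp")]
    lock: AtomicBool,
    data: UnsafeCell<T>,
}

impl<T> SketchLock<T> {
    pub const fn new(data: T) -> Self {
        SketchLock {
            // The initializer is gated together with the field.
            #[cfg(feature = "smp")]
            lock: AtomicBool::new(false),
            data: UnsafeCell::new(data),
        }
    }

    /// SMP build: read the atomic flag.
    #[cfg(feature = "smp")]
    pub fn is_locked(&self) -> bool {
        self.lock.load(Ordering::Relaxed)
    }

    /// Single-core build: there is no flag, so the answer is a constant.
    #[cfg(not(feature = "smp"))]
    pub fn is_locked(&self) -> bool {
        false
    }

    /// Consumes the lock and returns the protected value.
    pub fn into_inner(self) -> T {
        self.data.into_inner()
    }
}
```

Only one of the two is_locked definitions survives in any given build, so callers see a single signature regardless of the configuration.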

Conditional Compilation Flow

flowchart TD
subgraph subGraph1["Single-Core Implementation"]
    SingleStruct["BaseSpinLock { _phantom: PhantomData, data: UnsafeCell }"]
    SingleGuard["BaseSpinLockGuard { _phantom: &PhantomData, irq_state, data pointer }"]
    SingleOps["No Atomic Operations: • lock() always succeeds • try_lock() always succeeds • is_locked() always false"]
end
subgraph subGraph0["SMP Implementation"]
    SMPStruct["BaseSpinLock { _phantom: PhantomData, lock: AtomicBool, data: UnsafeCell }"]
    SMPGuard["BaseSpinLockGuard { _phantom: &PhantomData, irq_state, data pointer, lock reference }"]
    SMPOps["Atomic Operations: • compare_exchange_weak • load/store with ordering • spin_loop hints"]
end
FeatureFlag["smp feature flag"]
SMPBuild["SMP Build Target"]
SingleBuild["Single-Core Build Target"]

FeatureFlag --> SMPBuild
FeatureFlag --> SingleBuild
SMPBuild --> SMPGuard
SMPBuild --> SMPOps
SMPBuild --> SMPStruct
SingleBuild --> SingleGuard
SingleBuild --> SingleOps
SingleBuild --> SingleStruct

Sources: Cargo.toml(L14 - L16)  src/base.rs(L13 - L14)  src/base.rs(L29 - L31)  src/base.rs(L41 - L43) 

SMP Implementation Details

In SMP environments, the BaseSpinLock maintains actual lock state using atomic operations to coordinate between multiple CPU cores. The implementation provides true mutual exclusion through hardware-level atomic instructions.

SMP Lock Structure and Operations

flowchart TD
subgraph subGraph2["Memory Ordering"]
    AcquireOrder["Ordering::Acquire on success"]
    RelaxedOrder["Ordering::Relaxed on failure"]
    ReleaseOrder["Ordering::Release on unlock"]
end
subgraph subGraph1["Lock Acquisition Process"]
    AcquireGuard["G::acquire()"]
    CompareExchange["compare_exchange_weak(false, true)"]
    SpinLoop["spin_loop() while locked"]
    CreateGuard["BaseSpinLockGuard creation"]
end
subgraph subGraph0["BaseSpinLock SMP Structure"]
    SMPLock["BaseSpinLock"]
    Phantom["_phantom: PhantomData"]
    AtomicLock["lock: AtomicBool"]
    Data["data: UnsafeCell"]
end

AcquireGuard --> CompareExchange
AtomicLock --> CompareExchange
CompareExchange --> AcquireOrder
CompareExchange --> CreateGuard
CompareExchange --> RelaxedOrder
CompareExchange --> SpinLoop
CreateGuard --> ReleaseOrder
SMPLock --> AtomicLock
SMPLock --> Data
SMPLock --> Phantom
SpinLoop --> CompareExchange

The SMP implementation uses a two-phase locking strategy:

  1. Guard Acquisition: First acquires the protection guard (disabling preemption/IRQs)
  2. Atomic Lock: Then attempts to acquire the atomic lock using compare-and-swap operations
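
The two phases above can be sketched as follows. This is an illustrative approximation, not the crate's code: Guard, NoOp, SmpLock and SmpLockGuard are hypothetical names, the no-op guard stands in for real preemption/IRQ control, and std is used instead of core so the example runs in user space. Releasing the guard when try_lock fails is a choice made for this sketch.

```rust
use std::cell::UnsafeCell;
use std::hint::spin_loop;
use std::marker::PhantomData;
use std::ops::{Deref, DerefMut};
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical stand-in for the guard trait. A kernel guard would disable
/// preemption and/or IRQs in `acquire` and restore them in `release`.
pub trait Guard {
    type State: Copy;
    fn acquire() -> Self::State;
    fn release(state: Self::State);
}

/// No-op guard so the sketch can run in user space.
pub struct NoOp;

impl Guard for NoOp {
    type State = ();
    fn acquire() -> Self::State {}
    fn release(_state: Self::State) {}
}

pub struct SmpLock<G: Guard, T> {
    _phantom: PhantomData<G>,
    lock: AtomicBool,
    data: UnsafeCell<T>,
}

pub struct SmpLockGuard<'a, G: Guard, T> {
    _phantom: &'a PhantomData<G>,
    irq_state: G::State,
    data: *mut T,
    lock: &'a AtomicBool,
}

impl<G: Guard, T> SmpLock<G, T> {
    pub fn new(data: T) -> Self {
        SmpLock { _phantom: PhantomData, lock: AtomicBool::new(false), data: UnsafeCell::new(data) }
    }

    pub fn is_locked(&self) -> bool {
        self.lock.load(Ordering::Relaxed)
    }

    fn guard<'a>(&'a self, irq_state: G::State) -> SmpLockGuard<'a, G, T> {
        SmpLockGuard { _phantom: &self._phantom, irq_state: irq_state, data: self.data.get(), lock: &self.lock }
    }

    pub fn lock<'a>(&'a self) -> SmpLockGuard<'a, G, T> {
        // Phase 1: acquire the protection guard.
        let irq_state = G::acquire();
        // Phase 2: CAS false -> true; Acquire on success, Relaxed on failure.
        while self
            .lock
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            // Wait with plain loads until the lock looks free, then retry the CAS.
            while self.is_locked() {
                spin_loop();
            }
        }
        self.guard(irq_state)
    }

    pub fn try_lock<'a>(&'a self) -> Option<SmpLockGuard<'a, G, T>> {
        let irq_state = G::acquire();
        // One strong CAS, no retry loop.
        if self.lock.compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed).is_ok() {
            Some(self.guard(irq_state))
        } else {
            // Sketch-specific choice: hand the guard back when the lock is busy.
            G::release(irq_state);
            None
        }
    }
}

impl<'a, G: Guard, T> Deref for SmpLockGuard<'a, G, T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { &*self.data }
    }
}

impl<'a, G: Guard, T> DerefMut for SmpLockGuard<'a, G, T> {
    fn deref_mut(&mut self) -> &mut T {
        unsafe { &mut *self.data }
    }
}

impl<'a, G: Guard, T> Drop for SmpLockGuard<'a, G, T> {
    fn drop(&mut self) {
        // Unlock with Release ordering, then release the protection guard.
        self.lock.store(false, Ordering::Release);
        G::release(self.irq_state);
    }
}
```

The inner loop spins on a relaxed load rather than on the CAS itself, which keeps waiting cores from repeatedly requesting exclusive access to the cache line.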

Sources: src/base.rs(L27 - L32)  src/base.rs(L77 - L101)  src/base.rs(L122 - L149)  src/base.rs(L159 - L162) 

Single-Core Implementation Details

In single-core environments, the BaseSpinLock completely eliminates the atomic lock state. Since only one CPU core exists, proper guard acquisition (disabling preemption/IRQs) provides sufficient mutual exclusion without any atomic operations.

Single-Core Optimization Strategy

flowchart TD
subgraph subGraph2["Eliminated Operations"]
    NoAtomic["❌ No AtomicBool field"]
    NoCompareExchange["❌ No compare_exchange"]
    NoSpinning["❌ No spinning loops"]
    NoMemoryOrdering["❌ No memory ordering"]
end
subgraph subGraph1["Simplified Lock Process"]
    AcquireGuard2["G::acquire()"]
    DirectAccess["Direct data access"]
end
subgraph subGraph0["BaseSpinLock Single-Core Structure"]
    SingleLock["BaseSpinLock"]
    Phantom2["_phantom: PhantomData"]
    Data2["data: UnsafeCell"]
end

AcquireGuard2 --> DirectAccess
SingleLock --> Data2
SingleLock --> Phantom2

Single-Core Behavior:

  • lock() always succeeds immediately after guard acquisition
  • try_lock() always returns Some(guard)
  • is_locked() always returns false
  • force_unlock() performs no atomic operations
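
The behavior listed above can be sketched as follows. All names here (Guard, CountingGuard, HELD, SingleCoreLock, SingleCoreGuard) are hypothetical and not taken from the crate. The counter is test instrumentation standing in for "preemption/IRQs disabled"; it is not part of the lock, which holds no state beyond the data.

```rust
use std::cell::UnsafeCell;
use std::marker::PhantomData;
use std::ops::{Deref, DerefMut};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for the guard trait.
pub trait Guard {
    type State: Copy;
    fn acquire() -> Self::State;
    fn release(state: Self::State);
}

/// Instrumentation only: number of guards currently held.
pub static HELD: AtomicUsize = AtomicUsize::new(0);

/// Guard that records acquire/release calls instead of touching IRQ state.
pub struct CountingGuard;

impl Guard for CountingGuard {
    type State = ();
    fn acquire() -> Self::State {
        HELD.fetch_add(1, Ordering::Relaxed);
    }
    fn release(_state: Self::State) {
        HELD.fetch_sub(1, Ordering::Relaxed);
    }
}

/// No lock flag: only the phantom guard type and the data.
pub struct SingleCoreLock<G: Guard, T> {
    _phantom: PhantomData<G>,
    data: UnsafeCell<T>,
}

pub struct SingleCoreGuard<'a, G: Guard, T> {
    _phantom: &'a PhantomData<G>,
    irq_state: G::State,
    data: *mut T,
}

impl<G: Guard, T> SingleCoreLock<G, T> {
    pub fn new(data: T) -> Self {
        SingleCoreLock { _phantom: PhantomData, data: UnsafeCell::new(data) }
    }

    /// Acquiring the guard is the whole locking step.
    pub fn lock<'a>(&'a self) -> SingleCoreGuard<'a, G, T> {
        SingleCoreGuard { _phantom: &self._phantom, irq_state: G::acquire(), data: self.data.get() }
    }

    /// Cannot fail: there is no flag that could already be set.
    pub fn try_lock<'a>(&'a self) -> Option<SingleCoreGuard<'a, G, T>> {
        Some(self.lock())
    }

    /// No state to inspect, so this is a constant.
    pub fn is_locked(&self) -> bool {
        false
    }
}

impl<'a, G: Guard, T> Deref for SingleCoreGuard<'a, G, T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { &*self.data }
    }
}

impl<'a, G: Guard, T> DerefMut for SingleCoreGuard<'a, G, T> {
    fn deref_mut(&mut self) -> &mut T {
        unsafe { &mut *self.data }
    }
}

impl<'a, G: Guard, T> Drop for SingleCoreGuard<'a, G, T> {
    fn drop(&mut self) {
        // Nothing to unlock; only the protection guard is released.
        G::release(self.irq_state);
    }
}
```

Because try_lock() always succeeds here, exclusivity rests entirely on the guard: a guard type that does not actually prevent the current holder from being interrupted or preempted would allow two live guards to alias the same data.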

Sources: src/base.rs(L25 - L26)  src/base.rs(L133 - L135)  src/base.rs(L114 - L116) 

Compile-Time Optimization Benefits

The feature flag removes work from single-core builds at compile time: the lock carries no AtomicBool, and lock, try-lock, and unlock reduce to guard acquisition and release.

Performance Comparison

| Operation | SMP Implementation | Single-Core Implementation |
| --- | --- | --- |
| Lock Acquisition | Guard + Atomic CAS loop | Guard only |
| Try Lock | Guard + Atomic CAS | Guard only |
| Lock Check | Atomic load | Constant false |
| Unlock | Atomic store + Guard release | Guard release only |
| Memory Usage | +1 AtomicBool per lock | No additional fields |
| Code Size | Full atomic operation codegen | Optimized away |

Sources: src/base.rs(L111 - L117)  src/base.rs(L125 - L136)  Cargo.toml(L14 - L16) 

Code Generation Differences

Conditional compilation gives the two configurations different struct layouts and different function bodies, and therefore different generated machine code.

Structure Layout Differences

flowchart TD
subgraph subGraph0["SMP Memory Layout"]
    SMPStruct2["BaseSpinLock"]
    SMPPhantom["_phantom: PhantomData"]
    SMPAtomic["lock: AtomicBool(1 byte + padding)"]
    SMPData["data: UnsafeCell(sizeof T bytes)"]
end
subgraph subGraph1["Single-Core Memory Layout"]
    SingleStruct2["BaseSpinLock"]
    SinglePhantom["_phantom: PhantomData"]
    SingleData["data: UnsafeCell(sizeof T bytes)"]
end
subgraph subGraph2["Guard Layout Differences"]
    SMPGuardLayout["SMP Guard: • phantom reference • irq_state • data pointer • lock reference"]
    SingleGuardLayout["Single-Core Guard: • phantom reference • irq_state • data pointer"]
end

SMPStruct2 --> SMPAtomic
SMPStruct2 --> SMPData
SMPStruct2 --> SMPPhantom
SingleStruct2 --> SingleData
SingleStruct2 --> SinglePhantom
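
The layout difference can be checked with size_of. The sketch below uses two hypothetical structs, SmpLayout and SingleCoreLayout, that mirror the fields in the diagram; they are not the crate's types, and the exact sizes depend on the target and on the compiler's layout choices.

```rust
use std::cell::UnsafeCell;
use std::marker::PhantomData;
use std::mem::size_of;
use std::sync::atomic::AtomicBool;

/// Hypothetical mirror of the SMP field set.
#[allow(dead_code)]
pub struct SmpLayout<G, T> {
    _phantom: PhantomData<G>, // zero-sized
    lock: AtomicBool,         // 1 byte, plus any padding needed to align `data`
    data: UnsafeCell<T>,      // same size as T
}

/// Hypothetical mirror of the single-core field set.
#[allow(dead_code)]
pub struct SingleCoreLayout<G, T> {
    _phantom: PhantomData<G>, // zero-sized
    data: UnsafeCell<T>,      // same size as T
}

/// Returns (SMP size, single-core size) in bytes for a payload of type `T`.
pub fn lock_sizes<T>() -> (usize, usize) {
    (
        size_of::<SmpLayout<(), T>>(),
        size_of::<SingleCoreLayout<(), T>>(),
    )
}
```

Calling lock_sizes::<u64>() returns a pair whose first element is larger than the second, since the SMP variant must hold the flag in addition to the payload.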

Function Implementation Differences

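The same function signatures compile to different bodies in each configuration: the SMP build performs the atomic operations described above, while the single-core build reduces them to guard acquisition and constant results.

This design gives single-core embedded systems code without atomic-lock overhead, while SMP systems keep full mutual exclusion across cores.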

Sources: src/base.rs(L27 - L32)  src/base.rs(L37 - L43)  src/base.rs(L52 - L59)  src/base.rs(L77 - L149)  src/base.rs(L218 - L226)