Core Concepts
A deep dive into the architecture of USM and its key components.
To effectively use USM, it's essential to understand its microkernel-inspired architecture. USM is not a single program, but a symbiotic system of components that work together across the user-kernel boundary.
Let's break down each major component shown in the diagram.
The USM Stack
The USM framework is composed of three primary layers that separate the mechanism (how memory is mapped) from the policy (the logic of when and where to map it).
USMKernel: The Mechanism Executor
The USMKernel is a minimal, hardened Linux kernel module (usm_lkmRM.ko). It acts as the trusted mediator between your policy and the hardware.
- CMA Reservation: At boot time, it uses the Contiguous Memory Allocator (CMA) to reserve a physically contiguous memory pool. Because the pool is contiguous, translating a pool offset to a physical address is an O(1) operation (see the sketch after this list).
- Event Interception: It intercepts page faults for "tagged" processes and redirects them to userspace.
- PG_USM Isolation: Pages managed by USM are tagged with the PG_USM flag. This explicitly hides them from the standard Linux kswapd (swap daemon) and the OOM killer, ensuring your policy has 100% control.
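Because the pool is physically contiguous, translating a pool offset to a page frame is a single addition. The minimal sketch below illustrates the idea; pool_base_pfn and usm_offset_to_pfn are hypothetical names for illustration, not symbols from the USMKernel source.

```c
#include <stdint.h>

#define PAGE_SHIFT 12  /* assume 4 KiB pages */

/* Hypothetical: PFN of the first page in the contiguous CMA pool,
 * recorded once when the pool is reserved at boot. */
static uint64_t pool_base_pfn;

/* O(1) translation: in a physically contiguous pool, a byte offset
 * maps directly to a page frame number. */
static inline uint64_t usm_offset_to_pfn(uint64_t pool_offset)
{
    return pool_base_pfn + (pool_offset >> PAGE_SHIFT);
}
```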
InstanceUSM: The Policy Engine
An InstanceUSM (your compiled project-2 binary) is a userspace daemon where all the intelligence lives. It is the policy layer.
- It receives memory events from USMKernel.
- It executes your custom C code (your allocation, eviction, and OOM policies) to decide how to handle the event.
- It sends a simple, declarative command back to the USMKernel to be executed.
- You can run multiple InstanceUSM processes on the same machine to isolate different policies from each other.
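Conceptually, an InstanceUSM runs a loop of "receive event, apply policy, reply with a command". The sketch below shows only that shape; the event and command structures and the usm_next_event / usm_send_cmd helpers are hypothetical placeholders, not the actual USM API.

```c
/* Hypothetical event/command shapes -- the real USM structures differ. */
struct usm_event { int pid; unsigned long fault_addr; };
struct usm_cmd   { unsigned long pfn_to_map; };

int  usm_next_event(struct usm_event *ev);        /* placeholder */
void usm_send_cmd(const struct usm_cmd *cmd);     /* placeholder */
unsigned long my_allocation_policy(const struct usm_event *ev);

static void policy_loop(void)
{
    struct usm_event ev;

    while (usm_next_event(&ev) == 0) {
        /* All the intelligence lives here: your C policy decides which
         * physical page should back the faulting address. */
        struct usm_cmd cmd = { .pfn_to_map = my_allocation_policy(&ev) };

        /* Reply with a simple, declarative command for the kernel to execute. */
        usm_send_cmd(&cmd);
    }
}
```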
Manager (Optional): The Scheduler
The Manager (usmWaker binary) is an optional but powerful scheduler. Its primary role is to wake up the correct InstanceUSM worker thread when a new memory event occurs.
- It monitors the shared memory communication channels for new events.
- When an event is detected, it sends a high-speed signal (like Intel's UIPI) to the appropriate sleeping worker thread.
- This design allows InstanceUSM worker threads to sleep without consuming CPU when there is no work to be done, making the entire system highly efficient.
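The wake-up behaviour can be pictured with an ordinary condition variable: the worker sleeps until work is flagged, then drains it. This is only a conceptual stand-in; USM's real path uses high-speed signals such as UIPI rather than pthread primitives, and every name below is invented.

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  wake = PTHREAD_COND_INITIALIZER;
static bool pending_events;

void handle_pending_events(void);    /* placeholder for the real dispatch */

static void *worker_main(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (!pending_events)          /* sleep: no CPU consumed */
            pthread_cond_wait(&wake, &lock);
        pending_events = false;
        pthread_mutex_unlock(&lock);

        handle_pending_events();         /* drain the event channels */
    }
    return NULL;
}

/* Called in the Manager's role when it spots a new event. */
void wake_worker(void)
{
    pthread_mutex_lock(&lock);
    pending_events = true;
    pthread_cond_signal(&wake);
    pthread_mutex_unlock(&lock);
}
```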
Inside an InstanceUSM
Architecture efficiency is driven by a hierarchy of workers and optimized memory pools.
Workers & Workees (uThreads)
USM uses a "fiber" or user-level threading model to avoid the overhead of heavy kernel context switches.
- Workers (kThreads): These are standard POSIX threads pinned to specific CPU cores. They run a LocalManager scheduler.
- Workees (uThreads): These are lightweight contexts dedicated to specific application tasks. Switching between workees happens entirely in userspace, which is significantly faster than a kernel context switch.
- Bitfield Scheduling: The LocalManager uses high-speed bitfields (the Page Fault Queue (PFQ) and ID Queue (IDQ)) to track which workees need to handle new faults. This allows O(1) dispatching of events, as sketched below.
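With a bitfield per queue, finding the next workee with a pending fault is a single count-trailing-zeros instruction. The sketch below shows the O(1) dispatch idea for a 64-entry queue; the layout and names are simplifying assumptions, not the actual LocalManager code.

```c
#include <stdint.h>

/* Hypothetical 64-entry Page Fault Queue bitfield: bit i set means
 * workee i has a pending fault. */
static uint64_t pfq_bits;

/* O(1) dispatch: find the lowest set bit, clear it, return the workee id.
 * Returns -1 when no workee has pending work. */
static int pfq_pop_next(void)
{
    if (pfq_bits == 0)
        return -1;

    int id = __builtin_ctzll(pfq_bits);  /* index of the lowest set bit */
    pfq_bits &= pfq_bits - 1;            /* clear that bit */
    return id;
}
```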
Two-Tiered Arenas
To ensure that the InstanceUSM itself never becomes a bottleneck, it uses a lock-free allocation system:
- lArena (Local): Each worker has a private pool for instantaneous, lock-free page picking.
- sArena (Shared): A global pool that refills lArenas when they run low.
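The two tiers follow a familiar pattern: pick from the private lArena without synchronization, and only touch the shared sArena (shown under a plain mutex here for brevity) when the local pool runs dry. The structures and names below are illustrative assumptions, not the real arena layouts.

```c
#include <pthread.h>
#include <stddef.h>

#define REFILL_BATCH 64

/* Hypothetical arena shapes. */
struct l_arena { unsigned long pfns[REFILL_BATCH]; size_t count; };
struct s_arena { pthread_mutex_t lock; unsigned long *pfns; size_t count; };

/* Slow path: pull a batch of pages from the shared pool into the local pool. */
static void refill_from_sarena(struct l_arena *la, struct s_arena *sa)
{
    pthread_mutex_lock(&sa->lock);
    while (la->count < REFILL_BATCH && sa->count > 0)
        la->pfns[la->count++] = sa->pfns[--sa->count];
    pthread_mutex_unlock(&sa->lock);
}

/* Fast path: a lock-free pick from the worker-private lArena. */
static unsigned long pick_page(struct l_arena *la, struct s_arena *sa)
{
    if (la->count == 0)
        refill_from_sarena(la, sa);
    return la->count ? la->pfns[--la->count] : 0;   /* 0 = nothing available */
}
```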
The Communication Channels
Communication between these components is critical for performance. USM uses a combination of kernel facilities to make this fast.
- /dev/USMMcd: A character device created by the kernel module that exposes the reserved CMA memory pool to userspace. The InstanceUSM mmaps this device to gain control over its physical memory sandbox.
- /proc/usm: A procfs file used by the kernel to announce that a new application has been tagged for USM management. The InstanceUSM polls this file to discover new processes to manage.
- /proc/usmPgs: A reverse channel. When the kernel needs to free a managed page (e.g., on process exit), it reports the freed Page Frame Number (PFN) here so the InstanceUSM can reclaim it for its own free list.
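The minimal sketch below shows how an InstanceUSM might open these channels: mmap the character device to get a window onto the CMA pool and read the procfs file to learn about newly tagged processes. The map size and the record format of /proc/usm are assumptions made for illustration.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define POOL_BYTES (512UL << 20)   /* assumed pool size: 512 MiB */

int main(void)
{
    /* Map the reserved CMA pool exposed by the kernel module. */
    int fd = open("/dev/USMMcd", O_RDWR);
    if (fd < 0) { perror("open /dev/USMMcd"); return 1; }

    void *pool = mmap(NULL, POOL_BYTES, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (pool == MAP_FAILED) { perror("mmap"); return 1; }

    /* Poll procfs for newly tagged processes (record format is assumed). */
    FILE *tagged = fopen("/proc/usm", "r");
    if (tagged) {
        int pid;
        while (fscanf(tagged, "%d", &pid) == 1)
            printf("now managing pid %d\n", pid);
        fclose(tagged);
    }

    munmap(pool, POOL_BYTES);
    close(fd);
    return 0;
}
```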