Coordinative Scheduling with Christine
How USM enables asynchronous page fault handling by coordinating with user-space schedulers like Christine to achieve sub-µs tail latency.
USM goes beyond simple memory allocation by integrating deeply with user-space threading models. This section details the Coordinative Scheduling policy, specifically designed for User-Level Thread (uThread) schedulers like Christine.
By running application threads as uThreads multiplexed on top of kernel-level threads (kThreads), a system can switch between tasks entirely in user space. However, legacy synchronous page fault handling undermines this design: a single fault blocks the kThread and, with it, every uThread waiting to run on it. USM solves this by introducing an asynchronous page fault handling policy that coordinates directly with the scheduler.
The Challenge: USM-sync
In a typical User-Space Scheduler (UUS) setup (such as Christine), when a uThread generates a page fault, the underlying kThread is blocked by the OS kernel while the fault is resolved.
- Behavior: The uThread occupies the kThread for the entire duration of the fault.
- Impact: Even if other uThreads are ready to run, the kThread cannot switch to them because it is blocked in a synchronous kernel call.
- Result: High tail latency, particularly for CPU-intensive tasks colocated with memory-intensive ones.
The Solution: USM-async
USM introduces a coordinative approach called USM-async. Instead of blocking, the system coordinates with the scheduler to "schedule out" the faulting uThread immediately.
- Fault Detection: A uThread triggers a page fault.
- Context Switch: Instead of waiting, the scheduler swaps the context of the faulting uThread with a ready uThread.
- Background Resolution: USM handles the page fault in the background (potentially retrieving data from slower memory tiers like disk or far memory).
- Resumption: Once the page is ready, the uThread is marked as runnable and scheduled back in.
This allows CPU-intensive uThreads to continue processing on the kThread while memory-intensive uThreads wait for I/O, significantly improving performance.
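The four steps above can be sketched with POSIX `ucontext`. This is a toy model, not USM's code: the "fault" is simulated by a state change, the background resolution is simulated by the scheduler marking the uThread runnable after one pass, and every name (`run_async_demo`, `mem_uthread`, `cpu_uthread`, `trace`) is hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

enum ut_state { UT_RUNNABLE, UT_WAITING_FAULT, UT_DONE };

static ucontext_t sched_ctx;          /* scheduler loop running on the kThread */
static ucontext_t ut_ctx[2];          /* 0: memory-bound, 1: CPU-bound */
static enum ut_state state[2];
static char ut_stack[2][STACK_SIZE];
static char trace[16];                /* order of events: F, C, R */

static void log_event(const char *ev) {
    strncat(trace, ev, sizeof(trace) - strlen(trace) - 1);
}

/* Memory-bound uThread: "faults", is scheduled out, and resumes later. */
static void mem_uthread(void) {
    log_event("F");                        /* 1. fault detected */
    state[0] = UT_WAITING_FAULT;
    swapcontext(&ut_ctx[0], &sched_ctx);   /* 2. scheduled out, kThread stays free */
    log_event("R");                        /* 4. resumed once the page is ready */
    state[0] = UT_DONE;
}

/* CPU-bound uThread: runs while the fault is outstanding. */
static void cpu_uthread(void) {
    log_event("C");
    state[1] = UT_DONE;
}

const char *run_async_demo(void) {
    void (*entry[2])(void) = { mem_uthread, cpu_uthread };
    trace[0] = '\0';
    for (int i = 0; i < 2; i++) {
        getcontext(&ut_ctx[i]);
        ut_ctx[i].uc_stack.ss_sp = ut_stack[i];
        ut_ctx[i].uc_stack.ss_size = STACK_SIZE;
        ut_ctx[i].uc_link = &sched_ctx;    /* return to the scheduler on exit */
        makecontext(&ut_ctx[i], entry[i], 0);
        state[i] = UT_RUNNABLE;
    }
    while (state[0] != UT_DONE || state[1] != UT_DONE) {
        for (int i = 0; i < 2; i++)
            if (state[i] == UT_RUNNABLE)
                swapcontext(&sched_ctx, &ut_ctx[i]);
        /* 3. stand-in for background resolution completing */
        if (state[0] == UT_WAITING_FAULT)
            state[0] = UT_RUNNABLE;
    }
    return trace;
}
```

Calling `run_async_demo()` yields the event order fault, CPU work, resume: the CPU-bound uThread runs on the same kThread while the faulting uThread waits.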
Note:
Performance Impact: Using this uThread-aware policy, CPU-intensive applications colocated with memory-intensive workloads saw performance improvements of up to 2.26× compared to the synchronous default.
Implementation Details
The implementation relies on efficient context swapping and signal handling to preempt faulting threads. Below are key components from the scheduler implementation (sched.c).
1. Worker Metamorphosis
When USM detects that it is running alongside a coordinative scheduler (via the ioctl_fd check), the worker thread "mutates" from a simple fault handler into a User Thread Scheduling (UTS) Manager.
/* From sched.c: Transitioning to Christine's Manager mode */
#ifdef IUTHREAD
if (unlikely(cur_uthread->alt.manager.ioctl_fd != -1)) {
    // The worker transitions to User Thread Scheduling (UTS) mode
    usm_mem_worker_handler_uts(chn);
}
#endif
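The check above appears to treat `-1` as a sentinel meaning that no coordinative scheduler is attached. A minimal, self-contained sketch of that decision follows; `struct manager_state`, `enum worker_mode`, and `select_worker_mode` are hypothetical names, and `unlikely` is assumed to be the usual GCC/Clang branch hint.

```c
#include <assert.h>
#include <stddef.h>

/* Usual GCC/Clang branch-prediction hint; assumed to match the macro in sched.c. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in for the manager fields of a worker. */
struct manager_state {
    int ioctl_fd;   /* assumed: -1 means no coordinative scheduler is attached */
};

enum worker_mode { WORKER_PLAIN_HANDLER, WORKER_UTS_MANAGER };

/* Decide which role the worker should take, based only on the sentinel. */
enum worker_mode select_worker_mode(const struct manager_state *m) {
    assert(m != NULL);
    if (unlikely(m->ioctl_fd != -1))
        return WORKER_UTS_MANAGER;
    return WORKER_PLAIN_HANDLER;
}
```

The branch hint tells the compiler that the plain-handler path is the common one, so the UTS transition stays off the hot path.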
2. The UTS Dispatcher (Christine's Core)
The usm_mem_worker_handler_uts function manages a dedicated table of Internal User Threads (iUthreads). Instead of a single execution flow, it loops through events signaled by Christine and swaps contexts immediately.
- Shared Memory Communication: To avoid system call overhead, Christine and USM communicate via a shared bitmask (iUTcom) located at the end of the communication page.
- Zero-Blocking: If an iUthread encounters a new page fault, it is "scheduled out" back to the main scheduler loop (uctx_main), allowing other ready micro-threads to proceed.
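One way to picture the shared-bitmask channel is with C11 atomics. The sketch below is an assumption-laden illustration, not the layout or protocol of iUTcom: the 32-bit width, the function names, and the lowest-bit-first selection policy are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

#define IUT_MASK_BITS 32   /* assumed: one bit per iUthread in a 32-bit word */

/* Scheduler side: mark iUthread `id` as ready without taking a lock. */
void iut_signal(_Atomic unsigned int *mask, unsigned int id) {
    assert(id < IUT_MASK_BITS);
    atomic_fetch_or_explicit(mask, 1u << id, memory_order_release);
}

/* Manager side: take every pending event in one step, leaving the mask empty. */
unsigned int iut_drain(_Atomic unsigned int *mask) {
    return atomic_exchange_explicit(mask, 0u, memory_order_acquire);
}

/* Hypothetical selection policy: serve the lowest set bit first. */
int iut_choose_id(unsigned int events) {
    assert(events != 0);
    return __builtin_ctz(events);   /* GCC/Clang builtin: count trailing zeros */
}
```

Because both sides touch the word only through atomic read-modify-write operations, no system call or mutex is needed, which is the property the text attributes to the shared page.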
/* Simplified logic of the UTS Manager in sched.c */
void usm_mem_worker_handler_uts(int wk_args) {
    unsigned int tempIUTEvents;   // local copy of the shared event mask
    uthread_t *iuthreads;

    // 1. Allocate and initialize iUthread contexts
    cur_uthread->alt.manager.iuthreads = malloc(sizeof(uthread_t) * MAX_IUTHREADS);
    iuthreads = cur_uthread->alt.manager.iuthreads;
    while (1) {
        // 2. Poll the shared bitmask for ready threads
        tempIUTEvents = *(unsigned int *)cur_uthread->alt.manager.iUTcom;
        while (tempIUTEvents) {
            int id = usmChooseID(tempIUTEvents);
            // 3. Context swap to the selected iUthread
            cur_vcpu->current_uthread = &iuthreads[id];
            swapcontext(&cur_uthread->task, &iuthreads[id].task);
            // 4. Clear the handled event from the local copy
            tempIUTEvents &= ~(1u << id);
        }
        // 5. Yield back to the main loop once all pending events are handled
        swapcontext(&cur_uthread->task, &cur_vcpu->uctx_main);
    }
}
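The dispatch loop above depends on USM-internal types, so it cannot be compiled on its own. The following miniature reproduces only its shape (pick an id from the event mask, swap to that context, clear the bit) with plain `ucontext`. All names here (`iut_init`, `iut_dispatch`, `iut_handled`, `N_IUT`) are hypothetical, and lowest-bit-first stands in for whatever `usmChooseID` actually does.

```c
#include <assert.h>
#include <ucontext.h>

#define N_IUT 4
#define IUT_STACK (64 * 1024)

static ucontext_t manager_ctx;             /* the dispatcher's own context */
static ucontext_t iut_ctx[N_IUT];          /* one context per iUthread */
static char iut_stack[N_IUT][IUT_STACK];
static int current_id;                     /* id of the iUthread being run */
static int handled[N_IUT];                 /* events handled per iUthread */

/* Body shared by every iUthread: handle one event, then yield to the manager. */
static void iut_body(void) {
    for (;;) {
        int id = current_id;
        handled[id]++;
        swapcontext(&iut_ctx[id], &manager_ctx);
    }
}

void iut_init(void) {
    for (int i = 0; i < N_IUT; i++) {
        handled[i] = 0;
        getcontext(&iut_ctx[i]);
        iut_ctx[i].uc_stack.ss_sp = iut_stack[i];
        iut_ctx[i].uc_stack.ss_size = IUT_STACK;
        iut_ctx[i].uc_link = &manager_ctx;
        makecontext(&iut_ctx[i], iut_body, 0);
    }
}

int iut_handled(int id) { return handled[id]; }

/* One dispatcher pass: run every iUthread whose bit is set in `events`.
 * Returns the number of iUthreads dispatched. */
int iut_dispatch(unsigned int events) {
    int dispatched = 0;
    while (events) {
        int id = __builtin_ctz(events);    /* lowest set bit first (assumed policy) */
        assert(id < N_IUT);
        current_id = id;
        swapcontext(&manager_ctx, &iut_ctx[id]);
        events &= ~(1u << id);             /* clear the handled event */
        dispatched++;
    }
    return dispatched;
}
```

Each iUthread keeps its own stack, so an iUthread that yields in the middle of its work resumes exactly where it left off the next time its bit is set.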
3. Component Summary
| Feature | Implementation in sched.c |
|---|---|
| Micro-thread Table | alt.manager.iuthreads stores one uthread_t per iUthread, including its ucontext_t (the task field). |
| Control Channel | iUTcom provides a lock-less communication path between USM and the scheduler. |
| Preemption | Signal handlers (SIGUSR1/SIGUSR2) ensure that the scheduler can regain control if a thread exceeds its time slice. |
| Efficiency | pthread_setaffinity_np pins the manager to the same physical core as the application threads to maintain cache locality. |
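The preemption row can be illustrated with a flag-based handler. This is a generic POSIX pattern shown under assumptions, not the mechanism in sched.c: here the handler only sets a flag that the running code checks at loop boundaries, `raise()` stands in for a signal sent by a timer or by the scheduler, and the names `install_preempt_handler` and `run_slice` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set from the signal handler, read by the running thread. */
static volatile sig_atomic_t preempt_requested = 0;

static void on_preempt(int signo) {
    (void)signo;
    preempt_requested = 1;   /* keep the handler async-signal-safe */
}

/* Install the SIGUSR1 handler; returns 0 on success, -1 on failure. */
int install_preempt_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_preempt;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Run up to max_iters units of work, stopping early once preemption is
 * requested. `signal_at` injects the signal after that many units
 * (0 means never). Returns the number of units completed. */
long run_slice(long max_iters, long signal_at) {
    long done = 0;
    preempt_requested = 0;
    while (done < max_iters && !preempt_requested) {
        done++;                        /* one unit of work */
        if (done == signal_at)
            raise(SIGUSR1);            /* stand-in for an external signal */
    }
    return done;
}
```

A real scheduler would typically switch contexts at the point where this sketch returns, handing the kThread to the next ready uThread.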
Summary of Benefits
| Metric | USM-sync | USM-async |
|---|---|---|
| kThread state | Blocked during page fault | Active (running another uThread) |
| Concurrency | Low (faults halt execution) | High (faults are overlapped with other work) |
| Tail Latency | High (waiting for I/O) | Sub-µs (CPU-bound tasks proceed) |
| Colocation | Poor isolation | Excellent isolation |