Coordinative Scheduling with Christine

How USM enables asynchronous page fault handling by coordinating with user-space schedulers like Christine to achieve sub-µs tail latency.

USM goes beyond simple memory allocation by integrating deeply with user-space threading models. This section details the Coordinative Scheduling policy, specifically designed for User-Level Thread (uThread) schedulers like Christine.

Scheduling application threads as uThreads on top of kernel-level threads (kThreads) makes switching between them cheap, but legacy page fault handling remains a bottleneck: a fault taken by one uThread stalls the kThread underneath it. USM solves this by introducing an asynchronous page fault handling policy that coordinates directly with the scheduler.

The Challenge: USM-sync

In a typical User-Space Scheduler (UUS) setup (such as Christine), when a uThread generates a page fault, the underlying kThread is blocked by the OS kernel while the fault is resolved.

  • Behavior: The uThread occupies the kThread for the entire duration of the fault.
  • Impact: Even if other uThreads are ready to run, the kThread cannot switch to them because it is blocked in a synchronous kernel call.
  • Result: High tail latency, particularly for CPU-intensive tasks colocated with memory-intensive ones.

The Solution: USM-async

USM introduces a coordinative approach called USM-async. Instead of blocking, the system coordinates with the scheduler to "schedule out" the faulting uThread immediately.

  1. Fault Detection: A uThread triggers a page fault.
  2. Context Switch: Instead of waiting, the scheduler swaps the context of the faulting uThread with a ready uThread.
  3. Background Resolution: USM handles the page fault in the background (potentially retrieving data from slower memory tiers like disk or far memory).
  4. Resumption: Once the page is ready, the uThread is marked as runnable and scheduled back in.

This allows CPU-intensive uThreads to continue processing on the kThread while memory-intensive uThreads wait for I/O, significantly improving performance.
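
The four steps above can be sketched with POSIX ucontext calls. This is a minimal illustration under stated assumptions, not USM's or Christine's code: the demo_uthread type, the two tasks, and the countdown that stands in for background fault resolution are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

/* Hypothetical, simplified uThread; not the real uthread_t. */
typedef struct {
    ucontext_t ctx;
    char stack[STACK_SIZE];
    bool blocked_on_fault;  /* true while a simulated fault is pending */
    bool done;
} demo_uthread;

static ucontext_t sched_ctx;        /* context of the scheduler loop */
static demo_uthread mem_ut, cpu_ut;
static int cpu_steps_during_fault;  /* CPU work overlapped with the fault */

/* Memory-intensive uThread: "faults", is scheduled out, later resumes. */
static void mem_task(void) {
    mem_ut.blocked_on_fault = true;        /* 1. fault detected */
    swapcontext(&mem_ut.ctx, &sched_ctx);  /* 2. scheduled out immediately */
    mem_ut.done = true;                    /* 4. resumed once the page is ready */
}

/* CPU-intensive uThread: one unit of work per dispatch, then a yield. */
static void cpu_task(void) {
    for (int i = 0; i < 3; i++) {
        if (mem_ut.blocked_on_fault)
            cpu_steps_during_fault++;
        swapcontext(&cpu_ut.ctx, &sched_ctx);
    }
    cpu_ut.done = true;
}

static void init_uthread(demo_uthread *ut, void (*fn)(void)) {
    int rc = getcontext(&ut->ctx);
    assert(rc == 0);
    (void)rc;
    ut->ctx.uc_stack.ss_sp = ut->stack;
    ut->ctx.uc_stack.ss_size = sizeof ut->stack;
    ut->ctx.uc_link = &sched_ctx;  /* return to the scheduler when fn ends */
    ut->blocked_on_fault = false;
    ut->done = false;
    makecontext(&ut->ctx, fn, 0);
}

/* Runs both uThreads on one kThread; returns the number of CPU steps
 * that ran while the simulated fault was still outstanding. */
int run_async_demo(void) {
    int fault_rounds_left = 2;  /* assumed fault duration, in scheduler rounds */
    cpu_steps_during_fault = 0;
    init_uthread(&mem_ut, mem_task);
    init_uthread(&cpu_ut, cpu_task);

    while (!mem_ut.done || !cpu_ut.done) {
        if (!mem_ut.done && !mem_ut.blocked_on_fault)
            swapcontext(&sched_ctx, &mem_ut.ctx);
        if (!cpu_ut.done)
            swapcontext(&sched_ctx, &cpu_ut.ctx);
        /* 3. background resolution, simulated here by a countdown */
        if (mem_ut.blocked_on_fault && --fault_rounds_left == 0)
            mem_ut.blocked_on_fault = false;  /* page ready: runnable again */
    }
    return cpu_steps_during_fault;
}
```

In this sketch the CPU-bound task completes two steps while the fault is pending. Under a synchronous policy the same kThread would have been blocked for that whole interval.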

Note (performance impact): Using this uThread-aware policy, CPU-intensive applications colocated with memory-intensive workloads saw performance improvements of up to 2.26× compared to the synchronous default.

Implementation Details

The implementation relies on efficient context swapping and signal handling to preempt faulting threads. Below are key components from the scheduler implementation (sched.c).

1. Worker Metamorphosis

When USM detects that it is running alongside a coordinative scheduler (via the ioctl_fd check), the worker thread "mutates" from a simple fault handler into a User Thread Scheduling (UTS) manager.

/* From sched.c: Transitioning to Christine's Manager mode */
#ifdef IUTHREAD
    if(unlikely(cur_uthread->alt.manager.ioctl_fd != -1)) {
        // The worker transitions to User Thread Scheduling (UTS) mode
        usm_mem_worker_handler_uts(chn); 
    }
#endif
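
The mode check can be read in isolation as follows. This is a sketch under assumptions: demo_manager and select_worker_mode are hypothetical names, the meaning of -1 ("no coordinative scheduler attached") is inferred from the comparison in the excerpt, and the unlikely macro is shown with its common GCC/Clang definition, which may differ from the one sched.c uses.

```c
#include <assert.h>
#include <stddef.h>

/* Common GCC/Clang branch hint; assumed, not taken from sched.c. */
#define unlikely(x) __builtin_expect(!!(x), 0)

typedef enum { WORKER_PLAIN_HANDLER, WORKER_UTS_MANAGER } worker_mode;

/* Hypothetical stand-in for the state behind cur_uthread->alt.manager. */
typedef struct {
    int ioctl_fd;  /* assumed: -1 when no coordinative scheduler is attached */
} demo_manager;

/* Decide which role the worker should take. */
worker_mode select_worker_mode(const demo_manager *m) {
    assert(m != NULL);
    if (unlikely(m->ioctl_fd != -1))
        return WORKER_UTS_MANAGER;   /* a scheduler registered a descriptor */
    return WORKER_PLAIN_HANDLER;     /* default: plain fault handling */
}
```

The branch hint only tells the compiler that the plain-handler path is the expected one; it does not change the result of the check.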

2. The UTS Dispatcher (Christine's Core)

The usm_mem_worker_handler_uts function manages a dedicated table of Internal User Threads (iUthreads). Instead of a single execution flow, it loops through events signaled by Christine and swaps contexts immediately.

  • Shared Memory Communication: To avoid system call overhead, Christine and USM communicate via a shared bitmask (iUTcom) located at the end of the communication page.
  • Zero-Blocking: If an iUthread encounters a new page fault, it is "scheduled out" back to the main scheduler loop (uctx_main), allowing other ready micro-threads to proceed.

/* Simplified logic of the UTS Manager in sched.c */
void usm_mem_worker_handler_uts(int wk_args) {
    unsigned int tempIUTEvents;

    // 1. Allocate the iUthread contexts (initialization elided in this simplified view)
    uthread_t *iuthreads = malloc(sizeof(uthread_t) * MAX_IUTHREADS);
    if (iuthreads == NULL)
        return; // allocation failed: cannot enter UTS mode
    cur_uthread->alt.manager.iuthreads = iuthreads;

    while (1) {
        // 2. Snapshot the shared bitmask of ready threads
        tempIUTEvents = *(unsigned int *)cur_uthread->alt.manager.iUTcom;

        while (tempIUTEvents) {
            int id = usmChooseID(tempIUTEvents);

            // 3. Context swap to the specific micro-thread
            cur_vcpu->current_uthread = &iuthreads[id];
            swapcontext(&cur_uthread->task, &iuthreads[id].task);

            // 4. Clear the handled event from the local snapshot
            tempIUTEvents &= ~(1u << id);
        }
        // 5. Yield back to main loop if idle
        swapcontext(&cur_uthread->task, &cur_vcpu->uctx_main);
    }
}
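
The bitmask handling in steps 2 to 4 can be exercised on its own. The sketch below is an assumption-laden illustration rather than the sched.c logic: choose_id is a hypothetical stand-in for usmChooseID (here it simply picks the lowest set bit using the GCC/Clang builtin __builtin_ctz), and the shared word is accessed with C11 atomics, which the excerpt above does not show.

```c
#include <assert.h>
#include <stdatomic.h>

#define MAX_IDS 32  /* one bit per thread in a 32-bit mask */

/* Hypothetical stand-in for usmChooseID: index of the lowest set bit. */
static int choose_id(unsigned int events) {
    assert(events != 0);  /* __builtin_ctz(0) is undefined */
    return __builtin_ctz(events);
}

/* Producer side (the scheduler in this sketch): mark thread `id` ready. */
void signal_ready(_Atomic unsigned int *mask, int id) {
    atomic_fetch_or(mask, 1u << id);
}

/* Consumer side: atomically take all pending events, then visit each
 * exactly once. Writes the visit order to out[] and returns the count. */
int drain_events(_Atomic unsigned int *mask, int out[MAX_IDS]) {
    unsigned int events = atomic_exchange(mask, 0u);
    int n = 0;
    while (events) {
        int id = choose_id(events);
        out[n++] = id;          /* a real manager would swapcontext here */
        events &= ~(1u << id);  /* clear the handled event */
    }
    return n;
}
```

Taking the snapshot with atomic_exchange means an event posted while the loop is running is not lost; it stays in the shared word and is picked up by the next drain. No system call is involved on either side, which is the point of the shared-memory channel.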

3. Component Summary

| Feature | Implementation in sched.c |
|---|---|
| Micro-thread Table | alt.manager.iuthreads stores the ucontext_t for each Christine thread. |
| Control Channel | iUTcom provides a lock-less communication path between USM and the scheduler. |
| Preemption | Signal handlers (SIGUSR1/2) ensure that the scheduler can regain control if a thread exceeds its slice. |
| Efficiency | pthread_setaffinity_np pins the manager to the same physical core as the application threads to maintain cache locality. |
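
The preemption row can be illustrated with a small signal-driven sketch. It rests on assumptions: the real handlers may switch contexts directly, whereas this version swaps in a simpler technique in which the handler only sets a flag that the running task checks at safe points. The names install_preempt_handler and run_slice are hypothetical, and raise() stands in for whatever timer or scheduler actually sends the signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t preempt_requested = 0;

/* Async-signal-safe handler: only records that the slice has expired. */
static void on_preempt(int signo) {
    (void)signo;
    preempt_requested = 1;
}

/* Install the handler for SIGUSR1; returns 0 on success, -1 on error. */
int install_preempt_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_preempt;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Run up to max_steps units of work, stopping early once preemption is
 * requested. signal_at_step is a demo hook: when the step counter reaches
 * it, the task signals itself (pass 0 to never signal).
 * Returns the number of steps completed. */
int run_slice(int max_steps, int signal_at_step) {
    int steps = 0;
    preempt_requested = 0;
    while (steps < max_steps && !preempt_requested) {
        steps++;  /* one unit of work */
        if (steps == signal_at_step)
            raise(SIGUSR1);  /* stands in for the external slice-expiry signal */
    }
    return steps;
}
```

Keeping the handler down to a single flag write avoids calling non-async-signal-safe functions from signal context; the cost is that preemption only takes effect at the next check.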

Summary of Benefits

| Metric | USM-sync | USM-async |
|---|---|---|
| kThread state | Blocked during page fault | Active (running other uThreads) |
| Concurrency | Low (faults halt execution) | High (faults are overlapped) |
| Tail latency | High (waiting for I/O) | Sub-µs (CPU-bound tasks proceed) |
| Colocation | Poor isolation | Excellent isolation |