PTEMagnet as a USM Policy
Leveraging USM to reduce TLB misses in virtualized environments by enforcing physical address contiguity.
When applications run within a Virtual Machine (VM), they often experience performance degradation due to increased Translation Lookaside Buffer (TLB) misses. USM addresses this by implementing PTEMagnet logic as a userspace allocation policy.
The Problem: Non-Contiguity in VMs
In a virtualized environment, the guest Operating System allocates Guest Physical Addresses (GPAs) to back the application's Guest Virtual Addresses (GVAs).
- The Issue: Even if the application requests contiguous virtual memory, the underlying GPAs allocated by the OS are often non-contiguous, especially when multiple processes are collocated and memory is fragmented.
- The Consequence: The hardware cannot effectively use huge pages or contiguous block entries in the TLB, leading to frequent TLB misses and reduced performance.
The Solution: PTEMagnet Policy
We implemented the logic of PTEMagnet as a specific USM allocation policy (usm_alloc_policy_ops).
By trapping memory allocation events in userspace, USM allows the implementation of intelligent allocators that actively search for or reserve contiguous blocks of physical memory (GPAs) to back contiguous virtual address ranges.
Implementation Context
The PTEMagnet implementation relies on the allocator tracking the last physical page it served, so that the next request can be satisfied with the immediately adjacent physical page whenever it is free.
```c
/* PTEMagnet Policy - Core Logic */

/* Base of the page array, kept for pointer arithmetic on physical adjacency. */
struct page *pagesArrayBase;

/* Last page handed out; the "magnet" tries to serve its physical neighbor next. */
static struct page *last_allocated_page = NULL;

void pte_magnet_pick_pages(struct p_args_p *args)
{
    struct page *chosen_page = NULL;

    pthread_mutex_lock(&policiesSet1Flock);

    /* 1. Magnet strategy: attempt physical contiguity. */
    if (last_allocated_page != NULL) {
        struct page *next_physical_target = last_allocated_page + 1;

        /* Check that the neighbor is within bounds and still free. */
        if (next_physical_target < (pagesArrayBase + totalPages) &&
            is_page_free(next_physical_target))
            chosen_page = next_physical_target;
    }

    /* 2. Fallback: if contiguity fails, take the first available free page
     * (BasicAlloc style). */
    if (!chosen_page) {
        if (list_empty(&freeList)) {
            /* No free pages left: report failure to the engine. */
            args->ret = 0;
            pthread_mutex_unlock(&policiesSet1Flock);
            return;
        }
        struct optEludeList *node =
            list_first_entry(&freeList, struct optEludeList, iulist);
        chosen_page = node->page_ptr; /* use the pointer to the struct page */
    }

    /* Update state: unlink the page's free-list node (stashed in page->private). */
    struct optEludeList *node_to_del = (struct optEludeList *)chosen_page->private;
    list_del(&node_to_del->iulist);
    last_allocated_page = chosen_page;

    pthread_mutex_unlock(&policiesSet1Flock);

    /* Return the selected page to the USM engine. */
    args->ret = 1;
    args->l_ps = &(chosen_page->usedListPositionPointer->iulist);
}
```
Note:
Evaluation Methodology: To validate this approach, the PTEMagnet policy was compared against a BasicAlloc policy. The BasicAlloc environment used a shuffled list of free pages to simulate the fragmentation typical of workload collocation.
Performance Validation
The implementation was tested against TLB-sensitive applications, including the NAS Parallel Benchmarks (CG.C, BT.C) and data stores like Redis. The results demonstrate significant performance gains, confirming that userspace management can effectively mitigate virtualization overhead.
| Application | Workload / Type | Improvement over BasicAlloc |
|---|---|---|
| memflt | Micro-benchmark | +30.21% |
| MATMUL | CPU/Memory Intensive | +20.93% |
| Redis | YCSB B (Read mostly) | +23.35% |
| Redis | YCSB A (Update heavy) | +4.07% |
| BT.C | NAS Parallel Benchmark | +4.66% |
| CG.C | NAS Parallel Benchmark | +1.65% |
These results indicate that applications with high memory access frequency (like memflt and MATMUL) or specific access patterns (like Redis Read-mostly) benefit most substantially from the enforced contiguity provided by the PTEMagnet policy.