Sprint 4 — Processes & Scheduler

Run multiple threads of execution and share the CPU fairly across cores.

✅ Complete

Overview

Sprint 4 implements processes, threads, and a preemptive round-robin scheduler — the foundation for running multiple programs on x86_64 SMP hardware. The LAPIC one-shot timer fires every 10 ms to enforce fair CPU sharing. Application Processors (APs) are initialized via INIT/SIPI for symmetric multiprocessing.


Process & Thread Model

✅ Implemented — kernel/src/sched/thread.rs

Process

A process owns an address space (PML4) and a CNode (64-slot capability table).
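
The actual `Process` definition is not reproduced in this section; the sketch below is a hypothetical layout consistent with that description (a PML4 root plus a 64-slot capability table). All field names and the `u64` slot type are illustrative stand-ins, not the kernel's real fields.

```rust
// Hypothetical sketch of Process, inferred from the description above.
// The real definition lives in kernel/src/sched/thread.rs; capability
// slots are simplified to u64 stand-ins here.
#[repr(C)]
pub struct Process {
    pub pid: u64,           // process ID
    pub pml4_phys: u64,     // physical address of the PML4 (loaded into CR3)
    pub cnode: [u64; 64],   // 64-slot capability table
    pub live_threads: u64,  // living threads (checked when reaping)
}
```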

Thread (Actual Implementation)

#[repr(C)]
pub struct Thread {
    pub id: u64,                      // Unique TID from AtomicU64
    pub state: ThreadState,           // Ready | Running | BlockedSend | BlockedRecv | Dead
    pub rsp: u64,                     // Saved kernel RSP
    pub kernel_stack_base: u64,       // HHDM-mapped stack base
    pub kernel_stack_size: usize,     // 4 pages = 16 KiB
    pub name: [u8; 32],               // Display name
    pub process: *mut Process,        // Parent process (PML4 + CNode)
    pub ipc_buffer: IpcMessage,       // Per-thread IPC message buffer
    pub user_rip: u64,                // Ring 3 entry point (0 = kernel thread)
    pub user_rsp: u64,                // Ring 3 stack pointer
}

Thread States (5-state model)

    stateDiagram-v2
      [*] --> Ready
      Ready --> Running : schedule()
      Running --> BlockedSend : sys_send() waiting for receiver
      Running --> BlockedRecv : sys_recv() waiting for sender
      BlockedSend --> Ready : partner calls recv
      BlockedRecv --> Ready : partner calls send / IRQ fires
      Running --> Dead : sys_exit() / thread_exit()
      Dead --> [*] : Reaper reclaims
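
The state machine above can be encoded as a plain Rust enum. `may_transition` is a hypothetical helper (not part of the kernel) that enumerates the legal edges from the diagram, plus the timer-preemption edge (Running to Ready) implied by the round-robin scheduler described below.

```rust
// Userland sketch of the 5-state model; mirrors the diagram above.
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum ThreadState {
    Ready,
    Running,
    BlockedSend,
    BlockedRecv,
    Dead,
}

// Hypothetical helper: which transitions are legal?
pub fn may_transition(from: ThreadState, to: ThreadState) -> bool {
    use ThreadState::*;
    matches!(
        (from, to),
        (Ready, Running)             // schedule()
            | (Running, Ready)       // timer preemption (round-robin)
            | (Running, BlockedSend) // sys_send() waiting for receiver
            | (Running, BlockedRecv) // sys_recv() waiting for sender
            | (BlockedSend, Ready)   // partner calls recv
            | (BlockedRecv, Ready)   // partner calls send / IRQ fires
            | (Running, Dead)        // sys_exit() / thread_exit()
    )
}
```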
    

Context Switching

✅ Implemented — kernel/src/sched/context.rs (naked assembly)

When the scheduler switches from thread A to thread B:

  1. Save thread A's callee-saved registers (RBX, RBP, R12–R15) via switch_context()
  2. Write RSP to A's thread.rsp field
  3. Load RSP from B's thread.rsp
  4. Switch page tables if processes differ (mov cr3, new_pml4)
  5. Update TSS RSP0 to B's kernel stack top (for Ring 3 → Ring 0 transitions)
  6. Restore B's callee-saved registers
  7. RET — pops B's saved RIP and resumes execution
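The bookkeeping half of this sequence (steps 2 through 5) can be modeled in plain Rust. The real switch_context() is naked assembly, so register state collapses here to ordinary fields, and all names are illustrative:

```rust
// Data-level model of the switch sequence. The actual register
// save/restore (steps 1, 6, 7) happens in naked assembly and is
// represented only by the saved rsp values.
struct Tcb { rsp: u64, kstack_top: u64, pml4: u64 }
struct Cpu { tss_rsp0: u64, cr3: u64 }

fn switch_bookkeeping(prev: &mut Tcb, next: &Tcb, cpu: &mut Cpu, live_rsp: u64) -> u64 {
    prev.rsp = live_rsp;            // step 2: save A's stack pointer
    if cpu.cr3 != next.pml4 {       // step 4: CR3 swap only across processes
        cpu.cr3 = next.pml4;        //         (avoids a needless TLB flush)
    }
    cpu.tss_rsp0 = next.kstack_top; // step 5: Ring 3 -> Ring 0 entry stack
    next.rsp                        // step 3: resume from B's saved RSP
}
```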

Thread Entry Trampoline

New threads are born via thread_entry_trampoline() (naked function):

  1. sti — re-enable interrupts (disabled during context switch)
  2. mov rdi, r14 — load argument (placed during Thread::new)
  3. call r13 — call the thread's payload function
  4. If payload returns, fall through to thread_exit() → mark Dead → schedule()
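A safe-Rust model of the trampoline: the real version is a naked function where the payload pointer arrives in r13 and its argument in r14, so here they become ordinary parameters, and thread_exit() is a stub standing in for the kernel's.

```rust
// Stub for the kernel's thread_exit(): mark the thread Dead, schedule().
fn thread_exit() {
    println!("thread_exit: marking thread Dead and rescheduling");
}

// Model of thread_entry_trampoline(). In the real (naked) version:
//   sti            ; interrupts were disabled across the switch
//   mov rdi, r14   ; argument
//   call r13       ; payload
fn thread_entry_trampoline(payload: extern "C" fn(u64), arg: u64) {
    payload(arg);  // run the thread's payload function
    thread_exit(); // payload returned: fall through to cleanup
}
```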

Preemptive Scheduler

✅ Implemented — kernel/src/sched/scheduler.rs

Design

MinimalOS uses a preemptive round-robin scheduler with a 10 ms LAPIC one-shot timer quantum:

  1. When scheduling a thread, arm the LAPIC timer for 10 ms
  2. When the timer fires, the timer ISR calls schedule()
  3. If the thread blocks (IPC) or exits before the timer, schedule() is called immediately
  4. The timer is re-armed on every schedule() call

Per-Core Run Queues

Each CPU core has its own RunQueue (a VecDeque<Box<Thread>>) accessed via CpuLocal:

    graph TD
      RQ["Core 0 RunQueue"] --> T0["init (PID 1)"] 
      RQ --> T1["reaper daemon"]
      RQ --> T2["idle thread"]
    

A global BOOT_QUEUE holds threads spawned before per-core queues are initialized. The BSP drains it during scheduler::init().
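The handoff can be sketched in userland Rust, with std's Mutex and Vec standing in for the kernel's spinlock-protected queue and a reduced Thread type:

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

// Reduced Thread for illustration only.
struct Thread { name: &'static str }

// Threads spawned before per-core queues exist land here.
static BOOT_QUEUE: Mutex<Vec<Thread>> = Mutex::new(Vec::new());

// Model of the BSP draining BOOT_QUEUE into its run queue
// during scheduler::init(); FIFO order is preserved.
fn drain_boot_queue(run_queue: &mut VecDeque<Thread>) {
    for t in BOOT_QUEUE.lock().unwrap().drain(..) {
        run_queue.push_back(t);
    }
}
```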

Idle Thread

Each core has a dedicated idle thread that executes sti; hlt in a loop. This puts the core into a low-power C-state until the next interrupt.

schedule() Algorithm

  1. Pop next thread from the per-core run queue
  2. Handle current thread based on state:
    • Running → re-enqueue to run queue (round-robin)
    • BlockedSend/BlockedRecv → skip (Box owned by IPC endpoint)
    • Dead → push to DEAD_QUEUE for reaper
  3. Update TSS.rsp[0] and CpuLocal.kernel_stack_top
  4. CR3 swap if process PML4 differs
  5. switch_context(prev_rsp, next_rsp)
  6. Re-arm LAPIC timer (10 ms)
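
Steps 1 and 2 can be modeled as a minimal round-robin pick in plain Rust. The hardware steps (TSS update, CR3 swap, switch_context, LAPIC re-arm) are out of scope for a userland sketch, and the types are illustrative:

```rust
use std::collections::VecDeque;

#[derive(Clone, Copy, PartialEq)]
enum State { Ready, Running, BlockedSend, BlockedRecv, Dead }

struct Thread { id: u64, state: State }

// Model of schedule(): pick the next runnable thread, dispose of the
// outgoing one according to its state. Returns None if the queue is empty
// (the real scheduler falls back to the idle thread instead).
fn schedule(
    current: Thread,
    run_queue: &mut VecDeque<Thread>,
    dead_queue: &mut Vec<Thread>,
) -> Option<Thread> {
    let mut next = run_queue.pop_front()?; // step 1
    match current.state {                  // step 2
        State::Running => {
            let mut prev = current;
            prev.state = State::Ready;
            run_queue.push_back(prev);     // round-robin re-enqueue
        }
        State::Dead => dead_queue.push(current), // reaper reclaims later
        _ => {} // Blocked*: ownership has moved to the IPC endpoint
    }
    next.state = State::Running;
    Some(next)
}
```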

SMP Initialization

✅ Implemented — kernel/src/arch/x86_64/smp.rs

On boot, only the BSP (Bootstrap Processor, core 0) is running. The other cores are halted, waiting for a startup sequence.

AP Boot Sequence (Actual)

  1. Parse Limine MP response — iterate MpResponse.cpus(), find each AP's LAPIC ID
  2. Write goto_address — each AP's goto_address is set to ap_trampoline
  3. ap_trampoline (naked) — loads kernel CR3 (KERNEL_PML4), calls ap_rust_entry
  4. ap_rust_entry — per-AP initialization:
    • GDT + TSS load
    • IDT load
    • CpuLocal install (GS base via IA32_GS_BASE MSR)
    • LAPIC init (timer, spurious vector)
    • SYSCALL MSR configuration
  5. Idle loop — sti; hlt until work arrives

Per-CPU Local Storage (CpuLocal)

Each core maintains its own CpuLocal struct, accessed via the GS segment register:

#[repr(C)]
pub struct CpuLocal {
    pub self_ptr: *const CpuLocal,     // offset 0  — gs:[0]
    pub lapic_id: u32,                 // offset 8
    pub core_index: u32,               // offset 12
    pub current_thread: *mut Thread,   // offset 16
    pub idle_thread: *mut Thread,      // offset 24
    pub run_queue: *mut RunQueue,      // offset 32
    pub online: bool,                  // offset 40
    pub user_rsp_scratch: u64,         // offset 48 — syscall entry saves user RSP here
    pub kernel_stack_top: u64,         // offset 56 — syscall entry loads kernel RSP from here
}
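
Because the syscall entry path addresses these fields as fixed gs:[N] offsets, the #[repr(C)] layout can be pinned down with compile-time assertions. The sketch below reproduces the struct with stand-in pointer types (*mut u8 instead of the kernel's Thread and RunQueue pointers) and checks the stated offsets (requires Rust 1.77+ for offset_of!):

```rust
use std::mem::offset_of;

// Stand-in copy of CpuLocal: pointer field types are simplified,
// but the layout (and therefore every offset) is unchanged.
#[repr(C)]
pub struct CpuLocal {
    pub self_ptr: *const CpuLocal, // offset 0  — gs:[0]
    pub lapic_id: u32,             // offset 8
    pub core_index: u32,           // offset 12
    pub current_thread: *mut u8,   // offset 16
    pub idle_thread: *mut u8,      // offset 24
    pub run_queue: *mut u8,        // offset 32
    pub online: bool,              // offset 40
    pub user_rsp_scratch: u64,     // offset 48 (padding fills 41..48)
    pub kernel_stack_top: u64,     // offset 56
}

// Compile-time layout checks: any drift would silently break the
// hand-written gs:[N] accesses in the syscall entry stub.
const _: () = {
    assert!(offset_of!(CpuLocal, lapic_id) == 8);
    assert!(offset_of!(CpuLocal, core_index) == 12);
    assert!(offset_of!(CpuLocal, current_thread) == 16);
    assert!(offset_of!(CpuLocal, online) == 40);
    assert!(offset_of!(CpuLocal, user_rsp_scratch) == 48);
    assert!(offset_of!(CpuLocal, kernel_stack_top) == 56);
};
```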

Reaper Daemon (Sprint 9.5)

✅ Implemented — kernel/src/sched/scheduler.rs (reaper_entry)

The reaper is a kernel thread that reclaims resources from dead threads:

  1. schedule() pushes Dead threads into DEAD_QUEUE (a SpinLock<Vec<Box<Thread>>>)
  2. Reaper daemon drains the queue each time it's scheduled
  3. For each dead thread: convert HHDM stack addresses → physical, call free_frame() per stack page
  4. If the thread's process has no more living threads: walk PML4[0..256] to free all user pages, then unregister_process()
  5. Drop Box<Thread> to reclaim the TCB
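
The drain-and-drop core of the reaper can be modeled in userland Rust, with std's Mutex standing in for the kernel SpinLock and the per-resource reclamation (stack frames, user pages) collapsed into Drop:

```rust
use std::sync::Mutex;

// Reduced Thread: Drop stands in for freeing the kernel stack and,
// for a process's last thread, its user pages.
struct Thread { id: u64 }

impl Drop for Thread {
    fn drop(&mut self) {
        println!("reclaiming resources of thread {}", self.id);
    }
}

// schedule() pushes Dead threads here (SpinLock<Vec<Box<Thread>>> in the kernel).
static DEAD_QUEUE: Mutex<Vec<Box<Thread>>> = Mutex::new(Vec::new());

// Model of one reaper pass: take the whole queue under one lock
// acquisition, then run the destructors outside the lock.
fn reap() -> usize {
    let dead = std::mem::take(&mut *DEAD_QUEUE.lock().unwrap());
    let n = dead.len();
    drop(dead); // Box<Thread> destructors run here, reclaiming each TCB
    n
}
```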

Dependencies