feat: add hardware interrupt support (PIC, KVM IRQ chip, MSHV SynIC, WHP software timer) #1272
danbugs wants to merge 13 commits into hyperlight-dev:main
268cf83 to ddd016c
syntactically left a comment
Beyond this immediate use case, hardware interrupt support is a prerequisite for the ring buffer notifier mechanism planned for the tandr/ring branch — the upstream Notifier trait needs a way to signal the guest that new work is available in a virtqueue/ring buffer, and hardware interrupts are the natural delivery mechanism for that (matching the virtio model of interrupt-driven I/O notification).
I'm not sure we've made this decision yet. I believe the intention was to benchmark whether that made sense, or if a custom ABI (say some register flag set the next time that the guest was reentered through one of the existing entrypoint stubs) ended up being faster (since it would avoid some extra trips up and down through the hv).
I'm actually a bit curious if we have similar data here as well---maybe the complexity of emulating a good fraction of an interrupt controller is worth it for the performance in the KVM case where there's extra support for it, but especially in the other cases, are we sure this is actually any better than just having a custom interface for "jump to this address every so often"? It seems like we don't really need all the complex interrupt routing, priority, etc parts of the interrupt controller---we just need the timer pulse?
Where the guest expects a legacy PIC but interrupts are actually delivered via LAPIC
Since we don't intend to actually have a PIC at any point, can we just modify the guest to get rid of this assumption when it's being built for the Hyperlight platform?
use hyperlight_common::outb::OutBAction;
use tracing::instrument;

/// Halt the execution of the guest and returns control to the host.
Comment (and maybe name) (and commit message) should clarify that this is meant to be used for wfi rather than actually ending execution as we've been using hlt for in the past?
hl_exception_handler = sym super::handle::hl_exception_handler,
);

// Default no-op IRQ handler for hardware interrupts (vectors 0x20-0x2F).
AFAIUI this is code that should be running in target_arch = i686 only right now, so perhaps ought not be in amd64?
// Default no-op IRQ handler for hardware interrupts (vectors 0x20-0x2F).
// Sends a non-specific EOI to the master PIC and returns.
// This prevents unhandled-interrupt faults when the in-kernel PIT fires
This race condition is still present since this is only being installed in init_idt after the guest has already been running instructions for some time?
If this is a serious concern, surely we need the host to control the vm state a bit better---either only enabling interrupts after initialize() has finished and the guest kernel is up, or presetting the idt state before entering the guest for the first time (although I'm unsure if there is API for that)
call {internal_dispatch_function}\n
mov dx, 108\n
out dx, al\n
cli\n
Is there any use for this wfi fallback at the end here? If the guest does resume execution on interrupt delivery from a hlt here, something has gone very wrong.
@@ -0,0 +1,228 @@
/*
This file should be in some arch (i686?) specific directory, I think.
.set_lapic(&lapic)
.map_err(|e| CreateVmError::InitializeVm(e.into()))?;

// Install MSR intercept for IA32_APIC_BASE (MSR 0x1B) to prevent
This seems like something that should just be fixed by having the guest kernel not look for an I/O APIC when it is being built for hyperlight, rather than something that should be hacked around in Hyperlight.
if let Ok(mut lapic) = self.vcpu_fd.get_lapic() {
    let svr = read_lapic_u32(&lapic.regs, 0xF0);
    if svr & 0x100 == 0 {
        write_lapic_u32(&mut lapic.regs, 0xF0, 0x1FF);
Where are these values coming from?
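For reference, the constants in the snippet above match the standard x86 local APIC layout: the Spurious Interrupt Vector Register (SVR) lives at offset 0xF0, bit 8 (0x100) is the APIC software-enable bit, and the low byte is the spurious vector, so 0x1FF reads as "enable, spurious vector 0xFF". A minimal sketch with named constants follows. The helper names and the `&[u8]` register page are assumptions for illustration, not the PR's actual `read_lapic_u32`/`write_lapic_u32` signatures.

```rust
/// Offset of the Spurious Interrupt Vector Register in the LAPIC register page.
const LAPIC_SVR_OFFSET: usize = 0xF0;
/// SVR bit 8: APIC software enable.
const SVR_APIC_ENABLE: u32 = 1 << 8;
/// SVR bits 0-7: vector delivered for spurious interrupts.
const SPURIOUS_VECTOR: u32 = 0xFF;

/// Hypothetical helper: read a little-endian u32 from the register page.
fn read_reg_u32(regs: &[u8], offset: usize) -> u32 {
    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(&regs[offset..offset + 4]);
    u32::from_le_bytes(bytes)
}

/// Hypothetical helper: write a little-endian u32 into the register page.
fn write_reg_u32(regs: &mut [u8], offset: usize, value: u32) {
    regs[offset..offset + 4].copy_from_slice(&value.to_le_bytes());
}

/// Software-enable the LAPIC if it is currently disabled.
/// Returns true when the register page was modified.
fn ensure_lapic_enabled(regs: &mut [u8]) -> bool {
    let svr = read_reg_u32(regs, LAPIC_SVR_OFFSET);
    if svr & SVR_APIC_ENABLE != 0 {
        return false; // already enabled, leave the page untouched
    }
    // 0x100 | 0xFF == 0x1FF, the value written in the snippet under review.
    write_reg_u32(regs, LAPIC_SVR_OFFSET, SVR_APIC_ENABLE | SPURIOUS_VECTOR);
    true
}
```

Naming the bits this way would let the code answer the question without a comment.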
const LAPIC_EMULATION_BIT: u64 = 1 << 1;

#[cfg(feature = "hw-interrupts")]
let mut lapic_emulation = {
Do we need to support hosts without LAPIC emulation, or could we just error here if it was missing?
sregs.into();

// When LAPIC emulation is active, skip writing APIC_BASE.
// The generic CommonSpecialRegisters defaults APIC_BASE to 0
Should this be changed in the CommonSpecialRegisters code? Is it also wrong on other HVs?
#[cfg(not(feature = "init-paging"))]
pub(crate) fn standard_real_mode_defaults() -> Self {
    // In real mode, all data/code segment registers must have valid
d0daeee to ba7dce0
Major rework based on the feedback. Addressing all review comments:
Custom ABI approach (syntactically's suggestion)
Adopted the "custom ABI" approach. The changes:
Backend simplification
Specific review comments addressed
Testing
All existing tests pass on KVM (default features, kvm-only features, and hw-interrupts features). MSHV and WHP testing is pending.
e481d13 to 5d77fbd
- Add PvTimerConfig (port 107) and Halt (port 108) to OutBAction enum
- Add userspace 8259A PIC emulation for MSHV/WHP (KVM uses in-kernel PIC)
- Add halt() function in guest exit module using Halt port instead of HLT
- Add default no-op IRQ handler at IDT vector 0x20 for PIC-remapped IRQ0
- Update dispatch epilogue to use Halt port before cli+hlt fallback
- Add hw-interrupts feature flag to hyperlight-host
- Create IRQ chip (PIC + IOAPIC + LAPIC) and PIT before vCPU creation
- Add hw-interrupts run loop that handles HLT re-entry, Halt port, and PvTimerConfig port (ignored since in-kernel PIT handles scheduling)
- Non-hw-interrupts path also recognizes Halt port for compatibility
- Enable LAPIC in partition flags for SynIC direct-mode timer delivery
- Configure LAPIC (SVR, TPR, LINT0/1, LVT Timer) during VM creation
- Install MSR intercept on IA32_APIC_BASE to prevent guest from disabling the LAPIC globally
- Add SynIC STIMER0 configuration via PvTimerConfig IO port
- Add userspace PIC emulation integration for MSHV
- Restructure run_vcpu into a loop for HLT re-entry and hw IO handling
- Bridge PIC EOI to LAPIC EOI for SynIC timer interrupt acknowledgment
- Handle PIT/speaker/debug IO ports in userspace
Add WHP hardware interrupt support using a host-side software timer thread that periodically injects interrupts via WHvRequestInterrupt. Key changes:
- Detect LAPIC emulation support via WHvGetCapability
- Initialize LAPIC via bulk interrupt-controller state API (WHvGet/SetVirtualProcessorInterruptControllerState2) since individual APIC register writes fail with ACCESS_DENIED
- Software timer thread for periodic interrupt injection
- LAPIC EOI handling for PIC-only guest acknowledgment
- PIC emulation integration for MSHV/WHP shared 8259A
- Filter APIC_BASE from set_sregs when LAPIC emulation active
- HLT re-entry when timer is active
Add Halt outb (port 108) before cli/hlt in guest init and dummyguest so KVM's in-kernel LAPIC does not absorb the HLT exit. Also restore the hw_timer_interrupts integration test that was inadvertently dropped. Signed-off-by: danbugs <danilochiarlone@gmail.com>
Signed-off-by: danbugs <danilochiarlone@gmail.com>
The access_gpa and SharedMemoryPageTableBuffer::new functions now take a scratch_base_gpa (u64) instead of scratch_size (usize). Update the read_guest_memory_by_gva caller to pass the correct value from layout.get_scratch_base_gpa(). Signed-off-by: danbugs <danilochiarlone@gmail.com>
…mpatibility MSHV and WHP reject vCPU state with zeroed segment registers (ES, SS, FS, GS, LDT) and uninitialized XSAVE areas. Properly initialize all segment registers in standard_real_mode_defaults() and add reset_xsave() call after set_sregs() to ensure FPU state (FCW, MXCSR) is valid. Signed-off-by: danbugs <danilochiarlone@gmail.com>
Replace create_pit2() with a host-side timer thread that fires an EventFd registered via register_irqfd for GSI 0 (IRQ0). The in-kernel IRQ chip (PIC + IOAPIC + LAPIC) is kept for proper interrupt routing. When the guest writes to PvTimerConfig (port 107), the host parses the requested period and spawns a timer thread that periodically writes to the EventFd. Guest PIT port writes (0x40-0x43) exit to userspace and are silently ignored. Tested with nanvix hello-c.elf: works correctly.
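To make the "host parses the requested period" step concrete, here is a minimal sketch. The wire format is an assumption: the commit message does not say how the guest encodes the period, so this example treats the PvTimerConfig payload as a little-endian u32 count of microseconds and rejects a zero period. The function name `parse_timer_period` is hypothetical.

```rust
use std::time::Duration;

/// Hypothetical parser for the PvTimerConfig (port 107) payload.
/// ASSUMPTION: the payload is a little-endian u32 period in microseconds.
/// Returns None for short payloads or a zero period.
fn parse_timer_period(payload: &[u8]) -> Option<Duration> {
    if payload.len() < 4 {
        return None;
    }
    let period_us = u32::from_le_bytes([payload[0], payload[1], payload[2], payload[3]]);
    if period_us == 0 {
        // A zero period would make the timer thread spin, so treat it as invalid.
        return None;
    }
    Some(Duration::from_micros(u64::from(period_us)))
}
```

Returning `Option` keeps the decision about malformed requests (ignore, log, or fail the vCPU run) with the caller.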
… + host timer thread Replace the SynIC-based timer (STIMER0_CONFIG/COUNT + SCONTROL) with a host timer thread that periodically calls request_virtual_interrupt() (HVCALL_ASSERT_VIRTUAL_INTERRUPT, hypercall 148). This makes MSHV consistent with the KVM irqfd and WHP software timer patterns; all three platforms now use: (1) start timer on PvTimerConfig, (2) host thread periodically injects interrupt, (3) stop on Halt/Drop.
Changes:
- Remove SynIC dependency: make_default_synthetic_features_mask, SCONTROL, STIMER0_CONFIG, STIMER0_COUNT imports and usage
- Set synthetic_proc_features to 0 (no SynIC features needed)
- Wrap vm_fd in Arc<VmFd> for thread-safe sharing
- Add timer_stop + timer_thread fields (same pattern as KVM)
- PvTimerConfig handler spawns timer thread with request_virtual_interrupt
- Add Drop impl for timer cleanup
- Remove build_stimer_config() and period_us_to_100ns() helpers
- Remove SynIC-related unit tests
Kept unchanged: LAPIC init, MSR intercept, PIC emulation, EOI bridging (all still required for interrupt delivery).
NOT RUNTIME TESTED: MSHV not available on this system.
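The shared three-step pattern (start on PvTimerConfig, inject periodically from a host thread, stop on Halt/Drop) can be sketched with the standard library alone. This is an illustration under assumptions, not the PR's code: `PeriodicTimer` and the `inject` callback are hypothetical names, and the callback stands in for whatever each backend does to raise the interrupt (an EventFd write, `request_virtual_interrupt()`, or WHvRequestInterrupt).

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical wrapper around a stop flag plus a timer thread handle.
struct PeriodicTimer {
    stop: Arc<AtomicBool>,
    thread: Option<JoinHandle<()>>,
}

impl PeriodicTimer {
    /// Spawn a thread that calls `inject` once per `period` until stopped.
    fn start<F>(period: Duration, mut inject: F) -> Self
    where
        F: FnMut() + Send + 'static,
    {
        let stop = Arc::new(AtomicBool::new(false));
        let stop_flag = Arc::clone(&stop);
        let thread = thread::spawn(move || loop {
            thread::sleep(period);
            // Check the flag after sleeping so no injection follows a stop request.
            if stop_flag.load(Ordering::Acquire) {
                break;
            }
            inject();
        });
        Self { stop, thread: Some(thread) }
    }

    /// Signal the thread to exit and wait for it; safe to call more than once.
    fn stop(&mut self) {
        self.stop.store(true, Ordering::Release);
        if let Some(handle) = self.thread.take() {
            let _ = handle.join();
        }
    }
}

impl Drop for PeriodicTimer {
    fn drop(&mut self) {
        self.stop();
    }
}
```

One trade-off of this shape: `stop()` can block for up to one period while the thread finishes its sleep, which may matter if the Halt path is latency-sensitive.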
… PIC ports Remove the Pic struct from both MSHV and WHP. The nanvix guest always remaps IRQ0 to vector 0x20 via PIC ICW2, so the host can hardcode the timer vector instead of tracking the ICW initialization sequence. PIC ports (0x20/0x21/0xA0/0xA1) are now accepted as no-ops. The only PIC-related logic retained is LAPIC EOI bridging: when the guest sends a non-specific EOI on port 0x20, the host clears the LAPIC ISR bit. This is 3 lines of logic vs the ~130-line pic.rs state machine. This addresses the PR 1272 reviewer concern about emulating "a good fraction of the IC controller" — the PIC state machine was the unnecessary complexity, not the timer delivery mechanism.
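A rough sketch of the retained logic, for readers who want to see how small it is. The byte 0x20 written to the master command port is the 8259A OCW2 non-specific EOI command; everything else on the four PIC ports is accepted and dropped. The enum and function names are hypothetical, and the actual LAPIC ISR clearing is left to the caller because it is backend-specific.

```rust
const PIC_MASTER_CMD: u16 = 0x20;
const PIC_MASTER_DATA: u16 = 0x21;
const PIC_SLAVE_CMD: u16 = 0xA0;
const PIC_SLAVE_DATA: u16 = 0xA1;
/// 8259A OCW2 command byte for a non-specific EOI.
const PIC_NON_SPECIFIC_EOI: u8 = 0x20;

/// Hypothetical classification of a guest OUT to an I/O port.
#[derive(Debug, PartialEq, Eq)]
enum PicWrite {
    /// Non-specific EOI on the master PIC: caller should clear the LAPIC ISR bit.
    BridgeEoi,
    /// Any other PIC port write: accepted as a no-op.
    Ignored,
    /// Not a PIC port: handled elsewhere.
    NotPic,
}

fn classify_pic_write(port: u16, value: u8) -> PicWrite {
    match port {
        PIC_MASTER_CMD if value == PIC_NON_SPECIFIC_EOI => PicWrite::BridgeEoi,
        PIC_MASTER_CMD | PIC_MASTER_DATA | PIC_SLAVE_CMD | PIC_SLAVE_DATA => PicWrite::Ignored,
        _ => PicWrite::NotPic,
    }
}
```

Because the ICW sequence is never tracked, this only works for guests that remap IRQ0 to vector 0x20, which is the assumption the commit message states for nanvix.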
5d77fbd to e84c56d
Summary
Adds hardware interrupt support to Hyperlight, enabling guest OS kernels to receive timer interrupts for preemptive scheduling. Each hypervisor backend uses its native interrupt delivery mechanism:
- … WHvRequestInterrupt for periodic interrupt injection, using the bulk LAPIC state API for initialization
All implementations are gated behind the hw-interrupts cargo feature and have no effect on existing behavior when the feature is disabled.
Motivation
Nanvix is a microkernel that requires preemptive scheduling via timer interrupts. Beyond this immediate use case, hardware interrupt support is a prerequisite for the ring buffer notifier mechanism planned for the tandr/ring branch: the upstream Notifier trait needs a way to signal the guest that new work is available in a virtqueue/ring buffer, and hardware interrupts are the natural delivery mechanism for that (matching the virtio model of interrupt-driven I/O notification).
Key design decisions
No PvTimer abstraction
Each platform goes directly to its native mechanism rather than using a common timer abstraction. This avoids lowest-common-denominator limitations: KVM's in-kernel PIT is zero-overhead, MSHV's SynIC timer is hypervisor-native, and WHP's software timer uses async interrupt injection to work with WHP's blocking WHvRunVirtualProcessor.
Guest halt mechanism
The guest signals "I'm done" by writing to OutBAction::Halt (port 108) instead of using the HLT instruction. With an in-kernel LAPIC (KVM) or SynIC (MSHV), HLT is absorbed by the hypervisor to wait for the next interrupt, so it never reaches userspace as a VM exit. The Halt port write always triggers a VM exit, giving Hyperlight a clean signal to stop the vCPU run loop.
PIC emulation (MSHV/WHP)
A minimal userspace 8259A PIC emulation handles the interrupt acknowledge cycle for MSHV and WHP, where the guest expects a legacy PIC but interrupts are actually delivered via LAPIC. KVM doesn't need this because its in-kernel IRQ chip handles the full PIC/APIC routing natively.
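The guest halt mechanism described above can be illustrated with a simplified host-side run loop. This is a sketch under assumptions: `VmExit`, `RunOutcome`, and `run_loop` are hypothetical names, exits are fed in as an iterator instead of coming from a real vCPU, and only the port numbers (107 and 108) come from the PR.

```rust
const PORT_PV_TIMER_CONFIG: u16 = 107;
const PORT_HALT: u16 = 108;

/// Hypothetical, simplified view of the VM exits the run loop cares about.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VmExit {
    Hlt,
    IoOut { port: u16 },
}

#[derive(Debug, PartialEq, Eq)]
enum RunOutcome {
    GuestHalted,
    UnhandledPort(u16),
    ExitsExhausted,
}

/// Process exits until the guest signals completion.
fn run_loop(exits: impl IntoIterator<Item = VmExit>, timer_active: &mut bool) -> RunOutcome {
    for exit in exits {
        match exit {
            // The Halt port always exits to the host: stop the timer and return.
            VmExit::IoOut { port: PORT_HALT } => {
                *timer_active = false;
                return RunOutcome::GuestHalted;
            }
            // The guest asked for periodic timer interrupts.
            VmExit::IoOut { port: PORT_PV_TIMER_CONFIG } => *timer_active = true,
            // With a timer running, HLT means "wait for the next tick": re-enter.
            VmExit::Hlt if *timer_active => continue,
            // Without a timer, HLT keeps its old meaning of "guest is done".
            VmExit::Hlt => return RunOutcome::GuestHalted,
            VmExit::IoOut { port } => return RunOutcome::UnhandledPort(port),
        }
    }
    RunOutcome::ExitsExhausted
}
```

For example, feeding the exits "write to port 107, HLT, write to port 108" returns `RunOutcome::GuestHalted` only at the third exit, since the HLT in the middle is treated as a wait for the next interrupt.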
Changes
Commit 1: Foundation - PIC, OutBAction variants, guest halt
- outb.rs: OutBAction::PvTimerConfig (107) and OutBAction::Halt (108) enum variants
- pic.rs: Userspace 8259A PIC emulation for MSHV/WHP
- mod.rs: Register pic module with cfg gate
- exit.rs: Guest halt() function using Halt port + cli; hlt safety fallback
- entry.rs: Default no-op IRQ handler at vector 0x20
- dispatch.rs: Use halt port in dispatch epilogue
- outb.rs (host): handle_outb match arms for PvTimerConfig and Halt
Commit 2: KVM - in-kernel IRQ chip + PIT
- Irqchip and Pit2
- set_pit2(), Halt port handling
Commit 3: MSHV - SynIC direct-mode timer
- … IA32_APIC_BASE to prevent guest APIC disable
Commit 4: WHP - software timer thread
- … WHvGet/SetVirtualProcessorInterruptControllerState2)
- WHvRequestInterrupt for periodic interrupt injection
- set_sregs APIC_BASE filtering to prevent accidental LAPIC disable
Commit 5: Tests
- #[cfg_attr(feature = "hw-interrupts", ignore)] on tests that conflict with hw-interrupts mode
Commit 6: CI
- hw-interrupts test step in dep_build_test.yml
- hw-interrupts recipe in Justfile
- test-like-ci and build-test-like-ci
Test plan
- cargo clippy -p hyperlight-host --no-default-features -F "kvm,hw-interrupts,init-paging" passes
- cargo clippy -p hyperlight-host --no-default-features -F "kvm,init-paging" passes (without hw-interrupts)
- cargo test -p hyperlight-host --no-default-features -F "kvm,hw-interrupts,init-paging" --lib: 77 passed, 12 ignored
- cargo test -p hyperlight-host --no-default-features -F "kvm,init-paging" --lib: 83 passed, 5 ignored
- cargo build -p hyperlight-host --no-default-features -F "kvm,hw-interrupts,executable_heap" --lib succeeds
- cargo test -p hyperlight-host --no-default-features -F "init-paging,hw-interrupts" -- hw_timer_interrupts --nocapture