feat: add hardware interrupt support (PIC, KVM IRQ chip, MSHV SynIC, WHP software timer) #1272

Open
danbugs wants to merge 13 commits into hyperlight-dev:main from nanvix:danbugs/hw-interrupts

@danbugs danbugs commented Mar 3, 2026

Summary

Adds hardware interrupt support to Hyperlight, enabling guest OS kernels to receive timer interrupts for preemptive scheduling. Each hypervisor backend uses its native interrupt delivery mechanism:

  • KVM: In-kernel IRQ chip (PIC + IOAPIC + LAPIC) + PIT for native timer interrupts
  • MSHV: SynIC STIMER direct-mode timer with LAPIC, including MSR intercepts to prevent the guest from disabling the APIC
  • WHP: Software timer thread with WHvRequestInterrupt for periodic interrupt injection, using the bulk LAPIC state API for initialization

All implementations are gated behind the hw-interrupts cargo feature and have no effect on existing behavior when the feature is disabled.

Motivation

Nanvix is a microkernel that requires preemptive scheduling via timer interrupts. Beyond this immediate use case, hardware interrupt support is a prerequisite for the ring buffer notifier mechanism planned for the tandr/ring branch — the upstream Notifier trait needs a way to signal the guest that new work is available in a virtqueue/ring buffer, and hardware interrupts are the natural delivery mechanism for that (matching the virtio model of interrupt-driven I/O notification).

Key design decisions

No PvTimer abstraction

Each platform goes directly to its native mechanism rather than using a common timer abstraction. This avoids lowest-common-denominator limitations — KVM's in-kernel PIT is zero-overhead, MSHV's SynIC timer is hypervisor-native, and WHP's software timer uses async interrupt injection to work with WHP's blocking WHvRunVirtualProcessor.

Guest halt mechanism

The guest signals "I'm done" by writing to OutBAction::Halt (port 108) instead of using the HLT instruction. With an in-kernel LAPIC (KVM) or SynIC (MSHV), HLT is absorbed by the hypervisor to wait for the next interrupt — it never reaches userspace as a VM exit. The Halt port write always triggers a VM exit, giving Hyperlight a clean signal to stop the vCPU run loop.
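
The distinction between the two exit paths can be sketched as a small state machine. This is a minimal illustration, not the PR's run loop: `VmExit`, `RunOutcome`, and `run_loop` are hypothetical names, and the only values taken from the description are ports 107 and 108. It assumes a HLT that does reach userspace is re-entered while a timer is active and ends the run otherwise.

```rust
/// Hypothetical, simplified view of the exits a backend can report.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VmExit {
    /// HLT reached userspace (not guaranteed with an in-kernel LAPIC).
    Hlt,
    /// The guest executed `out` on the given port.
    IoOut { port: u16 },
}

/// Port numbers taken from the PR description.
pub const PV_TIMER_CONFIG_PORT: u16 = 107;
pub const HALT_PORT: u16 = 108;

#[derive(Debug, PartialEq, Eq)]
pub enum RunOutcome {
    /// The guest signalled completion; stop the vCPU run loop.
    Halted,
    /// The exit source ran dry before the guest halted.
    ExitsExhausted,
}

/// Consumes exits until the guest halts. Returns the outcome and the
/// number of times the vCPU was re-entered after a HLT.
pub fn run_loop(exits: impl IntoIterator<Item = VmExit>) -> (RunOutcome, usize) {
    let mut timer_active = false;
    let mut reentries = 0usize;
    for exit in exits {
        match exit {
            // The Halt port always exits to the host: clean stop signal.
            VmExit::IoOut { port: HALT_PORT } => return (RunOutcome::Halted, reentries),
            // The guest asked for a timer; later HLTs mean "wait for a tick".
            VmExit::IoOut { port: PV_TIMER_CONFIG_PORT } => timer_active = true,
            // HLT with a timer running: re-enter and wait for the interrupt.
            VmExit::Hlt if timer_active => reentries += 1,
            // HLT without a timer keeps its old meaning: the guest is done.
            VmExit::Hlt => return (RunOutcome::Halted, reentries),
            // Any other port is irrelevant to this sketch.
            VmExit::IoOut { .. } => {}
        }
    }
    (RunOutcome::ExitsExhausted, reentries)
}
```

Feeding it the sequence "port 107, HLT, HLT, port 108" yields `Halted` after two re-entries, which is the behaviour the paragraph above describes.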

PIC emulation (MSHV/WHP)

A minimal userspace 8259A PIC emulation handles the interrupt acknowledge cycle for MSHV and WHP, where the guest expects a legacy PIC but interrupts are actually delivered via LAPIC. KVM doesn't need this because its in-kernel IRQ chip handles the full PIC/APIC routing natively.

Changes

Commit 1: Foundation — PIC, OutBAction variants, guest halt

  • outb.rs: OutBAction::PvTimerConfig (107) and OutBAction::Halt (108) enum variants
  • pic.rs: Userspace 8259A PIC emulation for MSHV/WHP
  • mod.rs: Register pic module with cfg gate
  • exit.rs: Guest halt() function using Halt port + cli; hlt safety fallback
  • entry.rs: Default no-op IRQ handler at vector 0x20
  • dispatch.rs: Use halt port in dispatch epilogue
  • outb.rs (host): handle_outb match arms for PvTimerConfig and Halt
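
As a rough illustration of the two new port actions, the sketch below decodes a port number into an action. It shows only the two variants named in this PR; the existing variants of the real `OutBAction` enum are omitted, and the `TryFrom` impl is an assumption about how decoding could look, not a copy of `outb.rs`.

```rust
use std::convert::TryFrom;

/// Illustrative subset of the port-action enum. Only the variants added
/// by this PR are listed; the values 107 and 108 come from the description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u16)]
pub enum OutBAction {
    PvTimerConfig = 107,
    Halt = 108,
}

impl TryFrom<u16> for OutBAction {
    /// The unrecognised port number is handed back to the caller.
    type Error = u16;

    fn try_from(port: u16) -> Result<Self, Self::Error> {
        match port {
            107 => Ok(OutBAction::PvTimerConfig),
            108 => Ok(OutBAction::Halt),
            other => Err(other),
        }
    }
}
```

A host-side handler could then match on `OutBAction::try_from(port)` and fall through to its existing port handling on `Err`.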

Commit 2: KVM — in-kernel IRQ chip + PIT

  • Capability checks for Irqchip and Pit2
  • In-kernel IRQ chip + PIT creation during VM setup
  • hw-interrupts run loop: HLT re-entry, PvTimerConfig → set_pit2(), Halt port handling

Commit 3: MSHV — SynIC direct-mode timer

  • LAPIC initialization (SVR, TPR, DFR, LVT entries)
  • SynIC enable + STIMER0 configuration (direct-mode, periodic)
  • MSR intercept for IA32_APIC_BASE to prevent guest APIC disable
  • PIC emulation integration with LAPIC EOI bridging
  • IO port handling for PIC, PIT data, diagnostic ports

Commit 4: WHP — software timer thread

  • LAPIC emulation mode detection and partition setup
  • Bulk LAPIC state API (WHvGet/SetVirtualProcessorInterruptControllerState2)
  • Software timer thread with WHvRequestInterrupt for periodic interrupt injection
  • set_sregs APIC_BASE filtering to prevent accidental LAPIC disable
  • LAPIC EOI via bulk state API

Commit 5: Tests

  • Unit tests for LAPIC register helpers, SynIC timer config, PIT values
  • #[cfg_attr(feature = "hw-interrupts", ignore)] on tests that conflict with hw-interrupts mode
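
For readers unfamiliar with the attribute, this is roughly what such a gated test looks like. The helper `expects_hlt_exit` is a hypothetical stand-in for logic that assumes HLT exits reach the host; only the `cfg_attr` line mirrors what the commit describes.

```rust
/// Hypothetical helper: without hardware interrupts the host sees HLT as a
/// VM exit; with them, the hypervisor may absorb it.
pub fn expects_hlt_exit(hw_interrupts_enabled: bool) -> bool {
    !hw_interrupts_enabled
}

#[cfg(test)]
mod tests {
    use super::*;

    // Reported as "ignored" when built with `-F hw-interrupts`;
    // runs normally in every other configuration.
    #[test]
    #[cfg_attr(feature = "hw-interrupts", ignore)]
    fn hlt_exit_is_visible_without_hw_interrupts() {
        assert!(expects_hlt_exit(false));
    }
}
```

Because the test is ignored rather than compiled out, it still has to build under the feature, and it shows up in the "ignored" count of the test summary.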

Commit 6: CI

  • hw-interrupts test step in dep_build_test.yml
  • hw-interrupts recipe in Justfile test-like-ci and build-test-like-ci

Test plan

  • cargo clippy -p hyperlight-host --no-default-features -F "kvm,hw-interrupts,init-paging" passes
  • cargo clippy -p hyperlight-host --no-default-features -F "kvm,init-paging" passes (without hw-interrupts)
  • cargo test -p hyperlight-host --no-default-features -F "kvm,hw-interrupts,init-paging" --lib — 77 passed, 12 ignored
  • cargo test -p hyperlight-host --no-default-features -F "kvm,init-paging" --lib — 83 passed, 5 ignored
  • Non-init-paging build: cargo build -p hyperlight-host --no-default-features -F "kvm,hw-interrupts,executable_heap" --lib succeeds
  • Windows/WHP testing (manual): cargo test -p hyperlight-host --no-default-features -F "init-paging,hw-interrupts" -- hw_timer_interrupts --nocapture
  • MSHV testing via CI

@danbugs danbugs force-pushed the danbugs/hw-interrupts branch from 268cf83 to ddd016c on March 3, 2026 23:04
@danbugs danbugs added the kind/enhancement label (for PRs adding features, improving functionality, docs, tests, etc.) on Mar 3, 2026
@danbugs danbugs mentioned this pull request Mar 4, 2026

@syntactically syntactically left a comment

Beyond this immediate use case, hardware interrupt support is a prerequisite for the ring buffer notifier mechanism planned for the tandr/ring branch — the upstream Notifier trait needs a way to signal the guest that new work is available in a virtqueue/ring buffer, and hardware interrupts are the natural delivery mechanism for that (matching the virtio model of interrupt-driven I/O notification).

I'm not sure we've made this decision yet. I believe the intention was to benchmark whether that made sense, or whether a custom ABI (say, a register flag set the next time the guest is reentered through one of the existing entrypoint stubs) ended up being faster (since it would avoid some extra trips up and down through the hypervisor).

I'm also curious whether we have similar data here. Maybe the complexity of emulating a good fraction of an interrupt controller is worth it for the performance in the KVM case, where there's extra support for it. But especially in the other cases, are we sure this is actually any better than a custom interface for "jump to this address every so often"? It seems like we don't really need the complex interrupt routing, priority, and similar parts of the interrupt controller; we just need the timer pulse.

Where the guest expects a legacy PIC but interrupts are actually delivered via LAPIC

Since we don't intend to actually have a PIC at any point, can we just modify the guest to get rid of this assumption when it's being built for the Hyperlight platform?

use hyperlight_common::outb::OutBAction;
use tracing::instrument;

/// Halt the execution of the guest and returns control to the host.

The comment (and maybe the name and the commit message) should clarify that this is meant to be used for wfi rather than for actually ending execution, which is what we've been using hlt for in the past.

hl_exception_handler = sym super::handle::hl_exception_handler,
);

// Default no-op IRQ handler for hardware interrupts (vectors 0x20-0x2F).

AFAIUI this is code that should be running in target_arch = i686 only right now, so perhaps ought not be in amd64?


// Default no-op IRQ handler for hardware interrupts (vectors 0x20-0x2F).
// Sends a non-specific EOI to the master PIC and returns.
// This prevents unhandled-interrupt faults when the in-kernel PIT fires

This race condition is still present since this is only being installed in init_idt after the guest has already been running instructions for some time?

If this is a serious concern, surely we need the host to control the VM state a bit better: either only enabling interrupts after initialize() has finished and the guest kernel is up, or presetting the IDT state before entering the guest for the first time (although I'm unsure whether there is an API for that).

call {internal_dispatch_function}\n
mov dx, 108\n
out dx, al\n
cli\n

Is there any use for this wfi fallback at the end here? If the guest does resume execution on interrupt delivery from a hlt here, something has gone very wrong.

@@ -0,0 +1,228 @@
/*

This file should be in some arch (i686?) specific directory, I think.

.set_lapic(&lapic)
.map_err(|e| CreateVmError::InitializeVm(e.into()))?;

// Install MSR intercept for IA32_APIC_BASE (MSR 0x1B) to prevent

This seems like something that should just be fixed by having the guest kernel not look for an I/O APIC when it is being built for hyperlight, rather than something that should be hacked around in Hyperlight.

if let Ok(mut lapic) = self.vcpu_fd.get_lapic() {
let svr = read_lapic_u32(&lapic.regs, 0xF0);
if svr & 0x100 == 0 {
write_lapic_u32(&mut lapic.regs, 0xF0, 0x1FF);

Where are these values coming from?
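
For context on the constants in the quoted snippet: in the x86 local APIC register layout, offset 0xF0 is the Spurious Interrupt Vector Register (SVR), bit 8 (0x100) is the APIC software-enable bit, and the low byte holds the spurious vector, so 0x1FF means "enabled, spurious vector 0xFF". The sketch below names those values. The bodies of `read_lapic_u32` and `write_lapic_u32` are assumptions (little-endian access into a flat register page), and `ensure_lapic_enabled` is a hypothetical wrapper, not the PR's code.

```rust
/// Offset of the Spurious Interrupt Vector Register in the LAPIC page.
pub const LAPIC_SVR_OFFSET: usize = 0xF0;
/// SVR bit 8: APIC software enable.
pub const SVR_APIC_SW_ENABLE: u32 = 1 << 8;
/// SVR bits 0-7: spurious interrupt vector (0xFF here).
pub const SVR_SPURIOUS_VECTOR: u32 = 0xFF;

/// Assumed helper: read a little-endian u32 register from the page.
pub fn read_lapic_u32(regs: &[u8], offset: usize) -> u32 {
    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(&regs[offset..offset + 4]);
    u32::from_le_bytes(bytes)
}

/// Assumed helper: write a little-endian u32 register into the page.
pub fn write_lapic_u32(regs: &mut [u8], offset: usize, value: u32) {
    regs[offset..offset + 4].copy_from_slice(&value.to_le_bytes());
}

/// Re-enables the LAPIC if the software-enable bit is clear.
/// Returns true when the register had to be rewritten.
pub fn ensure_lapic_enabled(regs: &mut [u8]) -> bool {
    let svr = read_lapic_u32(regs, LAPIC_SVR_OFFSET);
    if svr & SVR_APIC_SW_ENABLE == 0 {
        // 0x100 | 0xFF == 0x1FF, the literal used in the quoted snippet.
        write_lapic_u32(regs, LAPIC_SVR_OFFSET, SVR_APIC_SW_ENABLE | SVR_SPURIOUS_VECTOR);
        true
    } else {
        false
    }
}
```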

const LAPIC_EMULATION_BIT: u64 = 1 << 1;

#[cfg(feature = "hw-interrupts")]
let mut lapic_emulation = {

Do we need to support hosts without LAPIC emulation, or could we just error here if it was missing?

sregs.into();

// When LAPIC emulation is active, skip writing APIC_BASE.
// The generic CommonSpecialRegisters defaults APIC_BASE to 0

Should this be changed in the CommonSpecialRegisters code? Is it also wrong on other HVs?


#[cfg(not(feature = "init-paging"))]
pub(crate) fn standard_real_mode_defaults() -> Self {
// In real mode, all data/code segment registers must have valid

Is this the same commit as the one in #1271?

@danbugs danbugs force-pushed the danbugs/hw-interrupts branch from d0daeee to ba7dce0 on March 5, 2026 20:18

danbugs commented Mar 5, 2026

Major rework based on the feedback. Addressing all review comments:

Custom ABI approach (syntactically's suggestion)

Adopted the "custom ABI" approach. The changes:

  1. PIC state machine eliminated (~130 lines of pic.rs deleted). Timer vector 0x20 is hardcoded — Nanvix always remaps to 0x20 via ICW2, so there's no need to emulate the PIC initialization sequence. PIC I/O ports (0x20-0x21, 0xA0-0xA1) are accepted as no-ops. The only retained PIC logic: a 3-line EOI bridge on port 0x20 write that signals end-of-interrupt to the backend.

  2. Guest requests timer via paravirtual port (PvTimerConfig, port 107). Guest writes the period in nanoseconds. Host spawns a timer thread that fires at that rate. No PIT countdown emulation, no channel registers.

  3. Guest signals readiness via Halt port (port 108) before cli; hlt. This ensures KVM's in-kernel LAPIC doesn't absorb the HLT exit — the outb triggers an IO exit first.
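
Item 2 can be illustrated with a small decoder for the value written to the PvTimerConfig port. This is a sketch under stated assumptions: the payload is taken to be a 4-byte little-endian count of nanoseconds, and a zero period is rejected; neither detail is confirmed by the PR. `parse_timer_period` is a hypothetical name.

```rust
use std::time::Duration;

/// Decodes the data of an `out` to the PvTimerConfig port.
/// Assumption: the guest writes a 4-byte little-endian period in nanoseconds.
/// Returns None for a malformed payload or a zero period.
pub fn parse_timer_period(data: &[u8]) -> Option<Duration> {
    if data.len() != 4 {
        return None;
    }
    let mut raw = [0u8; 4];
    raw.copy_from_slice(data);
    let nanos = u32::from_le_bytes(raw);
    if nanos == 0 {
        // A zero period would make the timer thread spin; treat it as invalid.
        None
    } else {
        Some(Duration::from_nanos(u64::from(nanos)))
    }
}
```

Since nothing fires until this port is written, the host never has to guess when the guest's interrupt handlers are ready.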

Backend simplification

  • KVM: irqfd replaces in-kernel PIT. Host timer thread writes to an EventFd → kernel injects vector via in-kernel PIC. No userspace signal handling races — irqfd is kernel-mediated.
  • MSHV: request_virtual_interrupt replaces SynIC. Direct interrupt injection per timer tick — no SynIC message pages, no auto-EOI configuration, no synthetic timer setup. Addresses the "why not just opt into specific features" concern.
  • WHP: WHvRequestInterrupt (same pattern as MSHV).
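
All three backends share the same shape: a host thread that calls some injection primitive once per period until told to stop. A minimal sketch of that shape follows. `PeriodicTimer` and its methods are hypothetical names, and the `inject` closure stands in for the backend-specific call (an EventFd write on KVM, `request_virtual_interrupt` on MSHV, `WHvRequestInterrupt` on WHP).

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical host-side periodic timer.
pub struct PeriodicTimer {
    stop: Arc<AtomicBool>,
    thread: Option<JoinHandle<()>>,
}

impl PeriodicTimer {
    /// Spawns a thread that calls `inject` once per `period`.
    pub fn start<F>(period: Duration, mut inject: F) -> Self
    where
        F: FnMut() + Send + 'static,
    {
        let stop = Arc::new(AtomicBool::new(false));
        let stop_in_thread = Arc::clone(&stop);
        let thread = thread::spawn(move || {
            while !stop_in_thread.load(Ordering::Relaxed) {
                thread::sleep(period);
                // Re-check after sleeping so no tick fires after stop().
                if stop_in_thread.load(Ordering::Relaxed) {
                    break;
                }
                inject();
            }
        });
        PeriodicTimer { stop, thread: Some(thread) }
    }

    /// Signals the thread to exit and waits for it to finish.
    pub fn stop(&mut self) {
        self.stop.store(true, Ordering::Relaxed);
        if let Some(handle) = self.thread.take() {
            let _ = handle.join();
        }
    }
}

impl Drop for PeriodicTimer {
    /// Dropping the timer stops injection, so a torn-down VM gets no ticks.
    fn drop(&mut self) {
        self.stop();
    }
}
```

Joining in `stop()` means that once it returns, no further injection call can be in flight, which matters if the injection target is about to be destroyed.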

Specific review comments addressed

  • "PIC file should be in arch-specific directory" → File deleted entirely, no longer needed.
  • "Guest IOAPIC workaround should be fixed in guest" → Agreed. Guest no longer looks for IOAPIC. The PIC initialization in the guest kernel writes to PIC ports which are accepted as no-ops by the host.
  • "Where are these [SynIC] values coming from?" → SynIC code removed entirely.
  • "Do we need to support hosts without LAPIC emulation?" → WHP LAPIC emulation check retained as it's part of the setup path, but could be simplified to an error.
  • "Race condition between interrupt delivery and IDT setup" → Still present in the current code (IDT installed after guest starts). Host-side mitigation: timer doesn't start until guest explicitly requests it via PvTimerConfig port. This means no interrupts fire during init before the IDT is installed.
  • "Is xsave the same commit as in #1271 (feat: add configurable scratch region base GPA)?" → Yes, it was duplicated. Dropped from both this PR and #1271.

Testing

All existing tests pass on KVM (default features, kvm-only features, and hw-interrupts features). The hw_timer_interrupts integration test validates the full flow: guest configures timer → timer fires → guest increments counter → host verifies count > 0.

MSHV and WHP testing pending.

@danbugs danbugs force-pushed the danbugs/hw-interrupts branch 2 times, most recently from e481d13 to 5d77fbd, on March 5, 2026 21:27
danbugs added 13 commits March 6, 2026 05:11
- Add PvTimerConfig (port 107) and Halt (port 108) to OutBAction enum
- Add userspace 8259A PIC emulation for MSHV/WHP (KVM uses in-kernel PIC)
- Add halt() function in guest exit module using Halt port instead of HLT
- Add default no-op IRQ handler at IDT vector 0x20 for PIC-remapped IRQ0
- Update dispatch epilogue to use Halt port before cli+hlt fallback
- Add hw-interrupts feature flag to hyperlight-host
- Create IRQ chip (PIC + IOAPIC + LAPIC) and PIT before vCPU creation
- Add hw-interrupts run loop that handles HLT re-entry, Halt port, and
  PvTimerConfig port (ignored since in-kernel PIT handles scheduling)
- Non-hw-interrupts path also recognizes Halt port for compatibility
- Enable LAPIC in partition flags for SynIC direct-mode timer delivery
- Configure LAPIC (SVR, TPR, LINT0/1, LVT Timer) during VM creation
- Install MSR intercept on IA32_APIC_BASE to prevent guest from
  disabling the LAPIC globally
- Add SynIC STIMER0 configuration via PvTimerConfig IO port
- Add userspace PIC emulation integration for MSHV
- Restructure run_vcpu into a loop for HLT re-entry and hw IO handling
- Bridge PIC EOI to LAPIC EOI for SynIC timer interrupt acknowledgment
- Handle PIT/speaker/debug IO ports in userspace
Add WHP hardware interrupt support using a host-side software timer
thread that periodically injects interrupts via WHvRequestInterrupt.

Key changes:
- Detect LAPIC emulation support via WHvGetCapability
- Initialize LAPIC via bulk interrupt-controller state API
  (WHvGet/SetVirtualProcessorInterruptControllerState2) since
  individual APIC register writes fail with ACCESS_DENIED
- Software timer thread for periodic interrupt injection
- LAPIC EOI handling for PIC-only guest acknowledgment
- PIC emulation integration for MSHV/WHP shared 8259A
- Filter APIC_BASE from set_sregs when LAPIC emulation active
- HLT re-entry when timer is active
Add Halt outb (port 108) before cli/hlt in guest init and dummyguest so
KVM's in-kernel LAPIC does not absorb the HLT exit.  Also restore the
hw_timer_interrupts integration test that was inadvertently dropped.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
Signed-off-by: danbugs <danilochiarlone@gmail.com>
The access_gpa and SharedMemoryPageTableBuffer::new functions now take
a scratch_base_gpa (u64) instead of scratch_size (usize). Update the
read_guest_memory_by_gva caller to pass the correct value from
layout.get_scratch_base_gpa().

Signed-off-by: danbugs <danilochiarlone@gmail.com>
…mpatibility

MSHV and WHP reject vCPU state with zeroed segment registers (ES, SS,
FS, GS, LDT) and uninitialized XSAVE areas.  Properly initialize all
segment registers in standard_real_mode_defaults() and add reset_xsave()
call after set_sregs() to ensure FPU state (FCW, MXCSR) is valid.
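
To make "properly initialize" concrete, the sketch below shows the kind of non-zero defaults usually meant here. It is an illustration only: `Segment` is a hypothetical struct rather than the PR's register type, and the values are the conventional x86 ones (64 KiB limit, present, accessed read/write data type 3, LDT system type 2, x87 control word 0x037F, MXCSR 0x1F80), not values read from the commit.

```rust
/// Hypothetical, trimmed-down segment register description.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct Segment {
    pub base: u64,
    pub limit: u32,
    pub selector: u16,
    pub type_: u8,
    pub present: u8,
    /// 1 for code/data descriptors, 0 for system descriptors such as the LDT.
    pub s: u8,
}

/// Descriptor type 3: read/write data segment, accessed.
const TYPE_DATA_RW_ACCESSED: u8 = 0x3;
/// System descriptor type 2: LDT.
const TYPE_LDT: u8 = 0x2;

/// Architectural default x87 control word (value after FNINIT).
pub const FCW_DEFAULT: u16 = 0x037F;
/// Architectural default MXCSR (all SIMD exceptions masked).
pub const MXCSR_DEFAULT: u32 = 0x1F80;

/// Real-mode style data segment for ES, SS, FS, GS: base 0, 64 KiB limit.
pub fn real_mode_data_segment() -> Segment {
    Segment {
        base: 0,
        limit: 0xFFFF,
        selector: 0,
        type_: TYPE_DATA_RW_ACCESSED,
        present: 1,
        s: 1,
    }
}

/// LDT register default: present system segment of type LDT.
pub fn real_mode_ldt() -> Segment {
    Segment {
        base: 0,
        limit: 0xFFFF,
        selector: 0,
        type_: TYPE_LDT,
        present: 1,
        s: 0,
    }
}
```

An all-zero `Segment::default()` has `present == 0` and `type_ == 0`, which is the kind of state the commit message says MSHV and WHP reject.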

Signed-off-by: danbugs <danilochiarlone@gmail.com>
Replace create_pit2() with a host-side timer thread that fires an
EventFd registered via register_irqfd for GSI 0 (IRQ0). The in-kernel
IRQ chip (PIC + IOAPIC + LAPIC) is kept for proper interrupt routing.

When the guest writes to PvTimerConfig (port 107), the host parses
the requested period and spawns a timer thread that periodically
writes to the EventFd. Guest PIT port writes (0x40-0x43) exit to
userspace and are silently ignored.

Tested with nanvix hello-c.elf: works correctly.
… + host timer thread

Replace the SynIC-based timer (STIMER0_CONFIG/COUNT + SCONTROL) with a
host timer thread that periodically calls request_virtual_interrupt()
(HVCALL_ASSERT_VIRTUAL_INTERRUPT, hypercall 148).

This makes MSHV consistent with the KVM irqfd and WHP software timer
patterns — all three platforms now use: (1) start timer on PvTimerConfig,
(2) host thread periodically injects interrupt, (3) stop on Halt/Drop.

Changes:
- Remove SynIC dependency: make_default_synthetic_features_mask,
  SCONTROL, STIMER0_CONFIG, STIMER0_COUNT imports and usage
- Set synthetic_proc_features to 0 (no SynIC features needed)
- Wrap vm_fd in Arc<VmFd> for thread-safe sharing
- Add timer_stop + timer_thread fields (same pattern as KVM)
- PvTimerConfig handler spawns timer thread with request_virtual_interrupt
- Add Drop impl for timer cleanup
- Remove build_stimer_config() and period_us_to_100ns() helpers
- Remove SynIC-related unit tests

Kept unchanged: LAPIC init, MSR intercept, PIC emulation, EOI bridging
(all still required for interrupt delivery).

NOT RUNTIME TESTED — MSHV not available on this system.
… PIC ports

Remove the Pic struct from both MSHV and WHP. The nanvix guest always
remaps IRQ0 to vector 0x20 via PIC ICW2, so the host can hardcode the
timer vector instead of tracking the ICW initialization sequence.

PIC ports (0x20/0x21/0xA0/0xA1) are now accepted as no-ops. The only
PIC-related logic retained is LAPIC EOI bridging: when the guest sends
a non-specific EOI on port 0x20, the host clears the LAPIC ISR bit.
This is 3 lines of logic vs the ~130-line pic.rs state machine.

This addresses the PR 1272 reviewer concern about emulating "a good
fraction of the IC controller" — the PIC state machine was the
unnecessary complexity, not the timer delivery mechanism.
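
The retained PIC handling described in this commit message is small enough to sketch in full. The port numbers and the hardcoded vector 0x20 come from the message; 0x20 as the command byte is the 8259A's non-specific EOI (OCW2). `PicPortAction` and `handle_pic_port_write` are hypothetical names, and the actual LAPIC EOI call is left to the caller.

```rust
pub const PIC_MASTER_CMD: u16 = 0x20;
pub const PIC_MASTER_DATA: u16 = 0x21;
pub const PIC_SLAVE_CMD: u16 = 0xA0;
pub const PIC_SLAVE_DATA: u16 = 0xA1;
/// 8259A OCW2 command byte for a non-specific end-of-interrupt.
pub const PIC_NON_SPECIFIC_EOI: u8 = 0x20;
/// Timer vector hardcoded by the host (IRQ0 remapped to 0x20).
pub const TIMER_VECTOR: u8 = 0x20;

/// What the host should do with a guest write to an I/O port.
#[derive(Debug, PartialEq, Eq)]
pub enum PicPortAction {
    /// Not a PIC port; let other handlers look at it.
    NotPic,
    /// PIC port write accepted and dropped (ICWs, masks, and so on).
    Ignored,
    /// Non-specific EOI on the master command port: forward an EOI to the LAPIC.
    LapicEoi,
}

/// Classifies a one-byte guest port write without keeping any PIC state.
pub fn handle_pic_port_write(port: u16, value: u8) -> PicPortAction {
    match port {
        PIC_MASTER_CMD if value == PIC_NON_SPECIFIC_EOI => PicPortAction::LapicEoi,
        PIC_MASTER_CMD | PIC_MASTER_DATA | PIC_SLAVE_CMD | PIC_SLAVE_DATA => {
            PicPortAction::Ignored
        }
        _ => PicPortAction::NotPic,
    }
}
```

The trade-off is that a guest which remaps IRQ0 to any vector other than 0x20 would silently receive the wrong vector, since the ICW2 write is ignored.
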
@danbugs danbugs force-pushed the danbugs/hw-interrupts branch from 5d77fbd to e84c56d on March 6, 2026 05:11
Labels

kind/enhancement For PRs adding features, improving functionality, docs, tests, etc.
