Skip to content

Conversation

@ludfjig
Copy link
Contributor

@ludfjig ludfjig commented Nov 11, 2025

Gives Hypervisor::run a default implementation and removes each specialized implementation from kvm/mshv/whp. Adds run_vcpu to just run vm for 1 single exit. The rest of the changes are just made to enable the above. Also refactors guest function cancellation slightly (simpler). Certain logic is introduced in this PR that will be removed in my next pr.

@ludfjig ludfjig added the kind/refactor For PRs that restructure or remove code without adding new functionality. label Nov 11, 2025
@ludfjig ludfjig force-pushed the intermediate_pr branch 3 times, most recently from 8602907 to 027fa17 Compare November 12, 2025 01:08
Comment on lines +274 to +279
// TODO in next PR: make sure to remove hw breakpoint here
// In case the hw breakpoint is the entry point, remove it to
// avoid hanging here as gdb does not remove breakpoints it
// has not set.
// Gdb expects the target to be stopped when connected.
// self.remove_hw_breakpoint(vcpu_fd, entrypoint)?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note this self.remove_hw_breakpoint has been moved to another place (handle_debug in each of the 3 drivers), it will be brought back here in next PR.

@ludfjig ludfjig marked this pull request as ready for review November 12, 2025 01:53
Copy link
Contributor

@dblnz dblnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, I appreciate the effort to split the Vm trait PR into multiple smaller PRs
I left some comments, the only "concerning" one is related to the tracing spans reordering.

Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
Copy link
Contributor

@simongdavies simongdavies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a few comments, two general comments:

I feel like this PR is doing 2 things and could be split up to make it easier to review:

  1. Refactor Run - which is what the title says
  2. Changing the way cancellation is implemented

The second part could (should?) be done in another PR .

I also feel like we need more information about how cancellation works, this really needs to be well documented , maybe we need a diagram or two somewhere

/// The operation should be retried
/// On Linux this can happen where a call to run the CPU can return EAGAIN
/// On Windows the platform could cause a cancelation of the VM run
/// The operation should be retried, for example this can happen on Linux where a call to run the CPU can return EAGAIN
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dont understand why we need to do this? Cant we exclude this from whatever match is using it on Windows? Also I dont understand why we are not seeing this on Windows since IIRC the platform itself can choose to cancel an execution which we then captured and retried, what happened to this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I know what you mean. We only construct VmExit::Retry() on linux by this mapping libc::EAGAIN => VmExit::Retry(). There is no equivalent exit on windows that needs to be retried as far as I know. Regardless, if windows exits with a VmExit::Cancelled(), but there is no cancellation requested for the sandbox, then we retry it (but we still don't construct any VmExit::Retry() which is why the warning is supressed on windows).

// ===== KILL() TIMING POINT 3: Before calling run_vcpu() =====
// If kill() is called and ran to completion BEFORE this line executes:
// - CANCEL_BIT will be set, but it's too late to prevent entering the guest this iteration
// - Signals will interrupt the guest (RUNNING_BIT=true), causing VmExit::Cancelled()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isnt quite true since Windows does not use signals?

// - CANCEL_BIT will be set, but it's too late to prevent entering the guest this iteration
// - Signals will interrupt the guest (RUNNING_BIT=true), causing VmExit::Cancelled()
// - If the guest completes before any signals arrive, kill() may have no effect
// - If there are more iterations to do (IO/host func, etc.), the next iteration will be cancelled
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if there are not more iterations to do , is it possible to lie about completion or cancel the next use of this Sandbox?

// ===== KILL() TIMING POINT 4: Before capturing cancel_requested =====
// If kill() is called and ran to completion BEFORE this line executes:
// - CANCEL_BIT will be set
// - Signals may still be sent (RUNNING_BIT=true) but are harmless no-ops
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about windows?

}
}
Ok(VmExit::Cancelled()) => {
// If cancellation was not requested for this specific guest function call,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the scope of this problem, is it possible that we try to kill a long running execution and that we end up delivering the signal to the wrong sandbox? If so does this happen in the current implementation? Either way if this does happen we should really think about a way to prevent this from happening as we don't really want one sandboxes behavior to influence the execution of another sandbox.

Copy link
Contributor Author

@ludfjig ludfjig Nov 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does happen in the current implementation and as far as I am aware it's impossible to avoid. Signals are delivered async so there's no guaranteeing what happens in between sending the signal and the signal being delivered. A totally new vm could have started on the same thread or a new guest call could have been made. I think the only thing we can do here is ignore it if signal was not intended for this guest-call or sandbox.

/// This is acceptable because the generation tracking provides an additional
/// safety layer. Even if a stale kill somehow stamped cancel_requested, the
/// generation mismatch would cause it to be ignored.
call_active: AtomicBool,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you delete this? Seemed like a reasonable optimization to me , it avoided a lot of useless processing

Copy link
Contributor Author

@ludfjig ludfjig Nov 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This flag provided no optimization as far as I can tell. We always already exit early if running flag is not true.

/// - CANCEL_BIT remains unchanged
///
/// # Memory Ordering
/// Uses `Release` ordering to ensure that the `tid` store (which uses `Relaxed`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is extremely confusing, I don't understand the relationship between store of this running atomic and the tid store or the relevance of the relaxed orderings, I think generally where we are use atomics and ordering we should document why the orderings are the way they are , these things are hard to understand when looking at the code later.

I think it will also make reviewing the code easier.

Copy link
Contributor Author

@ludfjig ludfjig Nov 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, typo in the docs. And agree. I'll create a new document for this

Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
@ludfjig ludfjig marked this pull request as draft November 19, 2025 19:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/refactor For PRs that restructure or remove code without adding new functionality.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants