Skip to content

[RFC] Uniform task spawning (structural-typed version)#727

Open
williampMSFT wants to merge 1 commit intoOpenDevicePartnership:v0.2.0from
williampMSFT:user/williamp/es-uniform-spawn
Open

[RFC] Uniform task spawning (structural-typed version)#727
williampMSFT wants to merge 1 commit intoOpenDevicePartnership:v0.2.0from
williampMSFT:user/williamp/es-uniform-spawn

Conversation

@williampMSFT
Copy link
Contributor

This is an option for uniform spawning of tasks. It attempts to make it more difficult to incorrectly use tasks by doing the following three things:

  1. Restructure tasks such that only one long-running task is required, and that task does a 'select!' over all the tasks that were previously required. This makes it impossible for users to only launch some of the tasks they need, or to launch them in an order that doesn't work
  2. Make each service's init() function return a Result<&'hw Self, ServiceRunner> rather than just a &'hw Self. The user is required to call .run() on the ServiceRunner for the service to function, which they will typically do from a task that they've spawned on their executor. This makes it difficult (but not impossible) for a user to miss spawning the task, because to do that, they have to ignore the ServiceRunner that was returned at init() time, which requires them to explicitly ignore it by binding it to a name with a leading _.
  3. Create a spawn_service!() macro that does all the boilerplate of declaring storage and spawning tasks on Embassy for a given service. This doesn't work on Tokio, but Tokio is mostly intended for test usage, where it's generally desirable to select!() over the spawn token and some 'test arm' because that lets the lifetime of the service not be 'static, which is probably what you want for tests.

This variant does not add a trait bound to RunnableService that requires an init() function with exactly two arguments - rather, it forwards all extraneous arguments to the init() function. This approach has the upside of requiring very little ceremony - you can just directly call spawn_service!() with the args you want to pass through to init(), but comes at the cost of not catching issues with the type signature of the init() function that might make it not work well

This approach has the upside of catching things that won't work with the spawn_service!() macro at service compile time rather than spawn_service!() instantiation time, but comes at the cost of a bit more ceremony - each service has to declare an 'init params' struct, and usages outside of spawn_service!() require the RunnableService trait to be in scope.

Copilot AI review requested due to automatic review settings February 20, 2026 18:16
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new uniform task spawning pattern for services to prevent incorrect usage. It restructures service initialization to make it harder to forget spawning required tasks or spawning them incorrectly.

Changes:

  • Adds new RunnableService trait and ServiceRunner infrastructure to embedded-service that enforces proper service lifecycle management through the type system
  • Migrates time-alarm-service and espi-service to the new pattern, consolidating multiple tasks into a single select!-based event loop and removing standalone task modules
  • Provides spawn_service! macro for Embassy executors to reduce boilerplate and ensure correct initialization

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
embedded-service/src/service.rs New module defining RunnableService trait, ServiceRunner handle, impl_runner_creation_token! macro, and spawn_service! macro
embedded-service/src/lib.rs Adds public service module export
time-alarm-service/src/lib.rs Migrates init() to return (service, runner) tuple and implements RunnableService with consolidated select! loop
time-alarm-service/src/task.rs Removes standalone task module (replaced by RunnableService impl)
time-alarm-service/tests/tad_test.rs Updates tests to destructure init() result and call runner.run() directly
espi-service/src/espi_service.rs Migrates init() to return (service, runner) tuple and implements RunnableService with consolidated event loop
espi-service/src/task.rs Removes standalone task module (replaced by RunnableService impl)
espi-service/src/lib.rs Removes task module export
examples/rt685s-evk/src/bin/time_alarm.rs Demonstrates spawn_service! macro usage, replacing manual task spawning boilerplate

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@RobertZ2011 RobertZ2011 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this is a pattern we could adopt, but I think the implementation should be left up to each service over attempting to standardize with traits. Each service can have its new function return (Self, RunnerToken) and then pass the token to a bespoke ServiceRunner.

/// A trait for a service that can be run on the EC.
/// Implementations of RunnableService should have an init() function to construct the service that
/// returns a Runner, which the user is expected to spawn a task for.
pub trait RunnableService<'hw> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if a trait is the right design for this. If the service is in a mutex then this would require hold a mutex lock. I think the run code can end up on ServiceRunner, with each service implementing their own code on their own struct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could definitely convert ServiceRunner into a trait, move the run() method to that, and make it an associated type or something if we wanted to do that. I'd originally started with something along those lines, but ran into some snags writing a macro that could generate an implementation against a type that may or may not be generic so I bailed on that approach. Might be fine to just have everyone implement their own, though?

I think we do need a uniformly-shaped run() method on the runner, though, and I think that means the runner needs an internal reference to the service that it's the runner for, which kind of locks us out of having the mutex be external, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been doing a bit of research on patterns for this, and stumbled across a few examples of this 'run token' pattern in embassy that I think are instructive - they all do things a little differently, though, and I'm interested in your thoughts on which is better for our use case:

(1) embassy-net is pretty similar to what I'm doing here. It yields a (Stack, Runner) tuple, and all methods on Stack take &self. This seems to me like a good fit for something that is maintaining its internal state on a separate 'thread' that it owns and is expected to be shared between multiple things (time-alarm comes to mind here - it's the way you get wall-clock time).

(2) embassy-usb doesn't have a separate Runner object - once you have the thing set up, you call run(&mut self) -> ! and then you can't call stuff on it ever again. I think this is a nonstarter if we want to continue with this direct async call approach, but @felipebalbi had some interesting thoughts on using ergot for communicating between services rather than direct async calls, and in that world, this might be viable. I took a quick glance at the ergot stuff and it looks like it uses a lot of 'static, though, so I'd want to make sure there's a path that doesn't require 'static for senders/receivers/topics before committing to that; more research required.

(3) cyw43 (wifi driver) takes a third approach - they return a (NetDriver, Control, Runner) triple. The Runner is run(self) -> !, and as far as I can tell, the NetDriver and Control objects both mostly require &mut self. This seems to me like a good fit for a driver for a hardware peripheral that is expected to be only used by a single higher-level abstraction - perhaps the USB-C stuff falls into that bucket?

Notably, in the two of these that don't just tie up the entire object forever (embassy-net and cyw43), both use interior mutability to share state between the runner and the control handle, even in cases where the control handle requires &mut references to do stuff to the object.

I think if we go with (1) it buys us a free-ish implementation of ServiceRunner (which I have currently in this PR), but perhaps isn't as good of a fit for driver stuff like (3), which might be closer to what the USB-C service is doing, and we may want to support both - I missed this nuance during my initial stab at this. The 'free' implementation is also only like 5 lines of code, so not a huge deal if we need to replicate something like it in each service. I'm leaning towards something like this:

  • Make ServiceRunner a trait rather than a concrete type
  • Remove the run() method from the RunnableService trait
  • Have an associated type on RunnableService that is the ServiceRunner type
  • Leave the decision of methods taking &mut self vs &self to the services - 'shared' services that are expecting to be talked to by a bunch of other components should take &self and use interior mutability to achieve that, and 'driver' services like the USB-C service should take &mut self and leave higher-level code to decide if they want to put a mutex around the handle or not
  • In all cases, the expectation is that the ServiceRunner implementation will use interior mutability / mutexes to share with the 'control handle' for the service without requiring an external mutex, but that's an implementation detail that's transparent to the users of these services.

I think this enables both (1) and (3), since the decision about mutability isn't really part of the RunnableService trait at all - it's just a matter of what you can do with the associated 'control handle'.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed an attempt at this to #726

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Different embassy crates having different patterns around this makes me more skeptical of a trait-based approach, especially at a global level. I think there's more room on a per-service basis, i.e. each service defines its own ServiceRunner trait. But I think ServiceRunner implementations will be different enough in terms of arguments and return values that there won't be enough commonality for a trait. Application code will want a runner that has fn run(_) -> ! while tests will want fn run(_). Likewise, test and application runners will require different arguments. I think we should implement our service runners first and then create a trait later if we find there's enough commonality.

Copy link
Contributor Author

@williampMSFT williampMSFT Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests should also be using runners with fn run(_) -> !, I think - they just need to select!() over a call to run() and a closure that runs their test code; when the test closure completes, then the select! drops the -> ! future and returns. We have examples of this approach in tad_test.rs.

I think ServiceRunner implementations should always have zero arguments (other than self)? The only thing instantiating those should be the service, when it returns from init(), and instantiation of the runner is out of scope for the ServiceRunner trait (unlike previously, when ServiceRunner was a type provided by this lib). Any arguments that the runner needs should be passed into the service at init time, and the fact that it gets handed off to the returned ServiceRunner impl is an implementation detail of the service that the user shouldn't have to care about. Runners taking no additional arguments is the same across all of the above patterns.

To be a bit more explicit about goals here, one of the things I'm trying to achieve is to not require the user to have service-specific knowledge of an arbitrary number of 'worker tasks' that each have their own quirks around what parameters they need - I view that as implementation details that we're currently leaking, and that leakage makes it difficult for end users to reason about what the service actually needs be used and how to uptake breaking changes. That's the thing I'm trying to make 'uniform'.

macro_rules! spawn_service {
($spawner:expr, [ $($service_ty:tt)* ], $($init_args:expr),* $(,)?) => {
{
static SERVICE: embassy_sync::once_lock::OnceLock<$($service_ty)*> = embassy_sync::once_lock::OnceLock::new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's a lot of benefit to a macro like this. It forces the use of 'static and OnceLock even though those might not be required in a given situation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The theory is that we repeat more-or-less exactly this thing like 10 times in soc-embedded-controller and also in just about every example (excluding integration tests), and I figure if we have to do it that often, then OEMs will also run into this problem and there's probably some value in a 'default setup helper' to make it easier to adopt / harder to screw up.

I don't think we'd want to (or be able to, for that matter) make this the only way to instantiate services, but I think it (or something sort-of like it, probably need to nail down the must_spawn() thing a bit) is likely what they'd want in the 95% case? Notably, integration tests fall into that 5% bucket, and I'd expect to continue doing nonstatic allocation in those. On hardware when you're using embassy, though, I think you're already priced into 'static/OnceLock because of how embassy tasks work, right?

@williampMSFT
Copy link
Contributor Author

williampMSFT commented Feb 28, 2026

This is probably obsoleted by #740 ; leaving it around until we figure out if we want that for comparison's sake

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

3 participants