Skip to content

fix(cli): handle sync/async serve functions in m serve#784

Open
markstur wants to merge 6 commits intogenerative-computing:mainfrom
markstur:fix_async
Open

fix(cli): handle sync/async serve functions in m serve#784
markstur wants to merge 6 commits intogenerative-computing:mainfrom
markstur:fix_async

Conversation

@markstur
Copy link
Copy Markdown
Contributor

@markstur markstur commented Apr 2, 2026

Misc PR

Type of PR

  • Bug Fix
  • New Feature
  • Documentation
  • Other

Description

fix(cli): handle sync/async serve functions in m serve

Fixes sync/async mismatch in `m serve` by detecting function type and
handling appropriately:
- Async serve functions are awaited directly
- Sync serve functions are wrapped in asyncio.to_thread() to prevent
  blocking FastAPI's event loop

This ensures the server can handle concurrent requests efficiently
regardless of whether user-defined serve functions are sync or async.

Changes:
- cli/serve/app.py: Add asyncio/inspect imports, update make_chat_endpoint()
  to detect coroutine functions and wrap sync functions in to_thread()
- test/cli/test_serve_sync_async.py: Add comprehensive test suite (9 tests)
  including empirical timing test that proves non-blocking behavior

feat: Map OpenAI parameters to ModelOption sentinels:

   - temperature → ModelOption.TEMPERATURE
   - max_tokens → ModelOption.MAX_NEW_TOKENS
   - seed → ModelOption.SEED


Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code as added
  • Ensure existing tests and github automation passes (a maintainer will kick off the github automation when the rest of the PR is populated)

@markstur markstur requested a review from a team as a code owner April 2, 2026 23:18
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 2, 2026

The PR description has been updated. Please fill out the template for your PR to be reviewed.

@markstur markstur changed the title Fix async fix(cli): handle sync/async serve functions in m serve Apr 2, 2026
@github-actions github-actions bot added the bug Something isn't working label Apr 2, 2026
@markstur markstur marked this pull request as draft April 2, 2026 23:41
@markstur markstur marked this pull request as ready for review April 3, 2026 15:32
markstur added 4 commits April 3, 2026 10:29
Fixes sync/async mismatch in `m serve` by detecting function type and
handling appropriately:
- Async serve functions are awaited directly
- Sync serve functions are wrapped in asyncio.to_thread() to prevent
  blocking FastAPI's event loop

This ensures the server can handle concurrent requests efficiently
regardless of whether user-defined serve functions are sync or async.

Changes:
- cli/serve/app.py: Add asyncio/inspect imports, update make_chat_endpoint()
  to detect coroutine functions and wrap sync functions in to_thread()
- test/cli/test_serve_sync_async.py: Add comprehensive test suite (9 tests)
  including empirical timing test that proves non-blocking behavior

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
   - temperature → ModelOption.TEMPERATURE
   - max_tokens → ModelOption.MAX_NEW_TOKENS
   - seed → ModelOption.SEED

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
With async added, need to fix the catch

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
Fix for -- Error: function raises but has no Raises section

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
Copy link
Copy Markdown
Contributor

@planetf1 planetf1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comments

* filter out model, n, user, and extra
* comment the filtering
* use a map and ModelOption.replace_keys() for mapping
* fix replace_keys to no-op with from == to

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
@markstur markstur requested a review from a team as a code owner April 8, 2026 00:06
* Use whole seconds for timing test so it should be less flakey
* Return a real ModelOutputThunk instead of a misleading Mock
* Use AsyncMock with side_effect for consistency

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
@markstur
Copy link
Copy Markdown
Contributor Author

markstur commented Apr 8, 2026

Thanks @planetf1 for the review! I fixed each. Please see the comment/question about whether we really want an allowlist. I interpreted that as not the main objective there (but I have a feeling I read it with bias). Happy to address it if you think we should pursue that right now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants