
fix: fix simple_chat Responses tool schema + model discovery fallback #216

Open

gyliu513 wants to merge 2 commits into llamastack:main from gyliu513:simple-chat

Conversation

gyliu513 commented Jan 20, 2026

What does this PR do?

  • Root cause: examples/agents/simple_chat.py sent tools without the required type discriminator for the Responses API, causing a 400. Also, examples/agents/utils.py assumed the client model schema always exposes model_type, which is not true for some deployments, leading to AttributeError during model selection.
  • Fix: Send web search tool using Responses-compliant schema ({"type": "web_search"}) and adjust logging to match AgentEventLogger output. Add resilient model discovery helpers to resolve model id/type across client schema variants and default to LLM when type is missing.
  • Notes: Changes are limited to examples; no runtime API behavior changes.
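The resilient model discovery described above can be sketched roughly as follows. This is illustrative only: the helper name and exact fallback order are assumptions, not the PR's verbatim code.

```python
from typing import Any


def resolve_model_type(model: Any) -> str:
    """Resolve a model's type across client schema variants.

    Some deployments expose `model_type` as a top-level attribute, others
    nest it under `custom_metadata`, and some omit it entirely; default to
    "llm" when it is missing.
    """
    model_type = getattr(model, "model_type", None)
    if model_type is None:
        metadata = getattr(model, "custom_metadata", None) or {}
        model_type = metadata.get("model_type")
    return model_type or "llm"
```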
(stack) gualiu@gualiu-mac llama-stack-apps % python -m examples.agents.simple_chat --host localhost --port 8321
INFO:httpx:HTTP Request: GET http://localhost:8321/v1/shields "HTTP/1.1 200 OK"
No available shields. Disabling safety.
INFO:httpx:HTTP Request: GET http://localhost:8321/v1/models "HTTP/1.1 200 OK"
Using model: ollama/llama3.2:3b
INFO:httpx:HTTP Request: POST http://localhost:8321/v1/conversations "HTTP/1.1 200 OK"
User> Hello
INFO:httpx:HTTP Request: POST http://localhost:8321/v1/responses "HTTP/1.1 200 OK"
🤔 {"name": "search", "parameters": {"query": "hello world"}}
User> Search web for which players played in the winning team of the NBA western conference semifinals of 2024
INFO:httpx:HTTP Request: POST http://localhost:8321/v1/responses "HTTP/1.1 200 OK"
🤔

🔧 Executing web_search (server-side)...
🤔 I was unable to find any specific information on the 2024 NBA Western Conference Semifinals playoff winners. However, I can suggest some options to help you find the answer:

You can try searching for the official NBA website or social media channels for the latest updates and news.

Another option is to check online sports websites such as ESPN, Sports Illustrated, or CBS Sports, which provide comprehensive coverage of the NBA playoffs.

Feature/Issue validation/testing/test plan

Please describe the tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration or test plan.

  • Test A
    Logs for Test A

  • Test B
    Logs for Test B

Sources

Please link relevant resources if necessary.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Thanks for contributing 🎉!

meta-cla bot added the CLA Signed label Jan 20, 2026
gyliu513 (Author):

@cdoern @leseb ^^

gyliu513 mentioned this pull request Jan 22, 2026
Comment on lines 48 to 58
agent_kwargs = {
    "model": model_id,
    "instructions": "",
    # OpenAI Responses tool schema requires a type discriminator.
    "tools": [{"type": "web_search"}],
    "input_shields": available_shields,
    "output_shields": available_shields,
    "enable_session_persistence": False,
}
allowed_params = set(inspect.signature(Agent.__init__).parameters)
filtered_kwargs = {k: v for k, v in agent_kwargs.items() if k in allowed_params}
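The signature-filtering trick in this snippet can be isolated as a small standalone helper; a minimal sketch (the helper name and example argument names are illustrative, not part of the PR):

```python
import inspect


def filter_kwargs_for(func, kwargs: dict) -> dict:
    """Keep only the keyword arguments that `func` actually accepts.

    Mirrors the signature-filtering pattern in the snippet above, which
    lets one call site work across client versions whose constructors
    accept different parameters.
    """
    allowed = set(inspect.signature(func).parameters)
    return {k: v for k, v in kwargs.items() if k in allowed}
```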
Member:
It is not clear that any developer will write code like this when creating agents with the llama stack client. Can you make it so the code here is something a new developer can just copy? We don't need any backward compatibility here either; we could just use the latest version, and keep copies of the examples for older versions if needed.

gyliu513 (Author):
Done

gyliu513 (Author):

@raghotham can you help check if this can be merged? Thanks!

return available_models[0]


def can_model_chat(client: LlamaStackClient, model_id: str) -> bool:
Member:

do we have to run a chat completion to see if the model supports chat? we already have model type: https://github.com/llamastack/llama-stack/blob/ffa98595e696c7ab3e0e933d0ed75375ee1d7b84/src/llama_stack_api/models/models.py#L23

gyliu513 (Author):

@raghotham I can see there are some models with the llm type that still do not support chat, like ollama/all-minilm:latest.

(llama-stack) (base) gualiu@gualiu-mac llama-stack % curl -s http://localhost:8321/v1/models \
  | jq '.data[] | select(.id=="ollama/all-minilm:latest")'
{
  "id": "ollama/all-minilm:latest",
  "object": "model",
  "created": 1769569923,
  "owned_by": "llama_stack",
  "custom_metadata": {
    "model_type": "llm",
    "provider_id": "ollama",
    "provider_resource_id": "all-minilm:latest"
  }
}

But this model does not support chat.

(stack) gualiu@gualiu-mac llama-stack-apps % python -m examples.agents.simple_chat --host localhost --port 8321 --model_id ollama/all-minilm:latest
INFO:httpx:HTTP Request: GET http://localhost:8321/v1/models "HTTP/1.1 200 OK"
Using model: ollama/all-minilm:latest
INFO:httpx:HTTP Request: POST http://localhost:8321/v1/conversations "HTTP/1.1 200 OK"
User> Hello
INFO:httpx:HTTP Request: POST http://localhost:8321/v1/responses "HTTP/1.1 200 OK"
🤔
❌ Turn failed: Error code: 400 - {'error': {'message': '"all-minilm:latest" does not support chat', 'type': 'api_error', 'param': None, 'code': None}}
User> Search web for which players played in the winning team of the NBA western conference semifinals of 2024
INFO:httpx:HTTP Request: POST http://localhost:8321/v1/responses "HTTP/1.1 200 OK"
🤔
❌ Turn failed: Error code: 400 - {'error': {'message': '"all-minilm:latest" does not support chat', 'type': 'api_error', 'param': None, 'code': None}}

I think that besides model_type, we may need to add a new field named capability for the model; the capability could be chat, completion, tool_calling, etc. Comments?
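Until such a capability field exists, a runtime probe along the lines of the `can_model_chat` helper discussed above can distinguish chat-capable models. A minimal sketch, assuming an OpenAI-compatible `client.chat.completions` surface; the error-text match is illustrative and mirrors the 400 in the log above:

```python
def can_model_chat(client, model_id: str) -> bool:
    """Probe whether `model_id` actually supports chat with a minimal request.

    model_type alone is not enough: some deployments register embedding
    models (e.g. all-minilm) with model_type "llm", and the server only
    rejects them at request time with a 400.
    """
    try:
        client.chat.completions.create(
            model=model_id,
            messages=[{"role": "user", "content": "ping"}],
            max_tokens=1,
        )
        return True
    except Exception as exc:
        if "does not support chat" in str(exc):
            return False
        raise  # unrelated failures should surface, not be swallowed
```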
