fix: fix simple_chat Responses tool schema + model discovery fallback #216
gyliu513 wants to merge 2 commits into llamastack:main
Conversation
examples/agents/simple_chat.py
Outdated
```python
agent_kwargs = {
    "model": model_id,
    "instructions": "",
    # OpenAI Responses tool schema requires a type discriminator.
    "tools": [{"type": "web_search"}],
    "input_shields": available_shields,
    "output_shields": available_shields,
    "enable_session_persistence": False,
}
allowed_params = set(inspect.signature(Agent.__init__).parameters)
filtered_kwargs = {k: v for k, v in agent_kwargs.items() if k in allowed_params}
```
It is not clear that any developer will write code like this when creating agents with the llama stack client. Can you make the code here something a new developer can just copy? We don't need any backward compatibility here either; we can just use the latest version and keep copies of the examples for older versions if needed.
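For example, something like this — a rough sketch assuming the current llama-stack-client `Agent` constructor kwargs; the base URL and model id are illustrative:

```python
from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent

client = LlamaStackClient(base_url="http://localhost:8321")

# One straightforward construction: no kwargs filtering, no version shims.
agent = Agent(
    client,
    model="ollama/llama3.2:3b",  # illustrative model id
    instructions="You are a helpful assistant.",
    tools=[{"type": "web_search"}],
)
```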
@raghotham can you help check if this can be merged? Thanks!
```python
    return available_models[0]


def can_model_chat(client: LlamaStackClient, model_id: str) -> bool:
```
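(For context: a probe like this presumably issues a throwaway chat completion and treats an API error as "no chat support" — a sketch assuming the OpenAI-compatible `chat.completions` surface, not the PR's exact code:)

```python
def can_model_chat(client: LlamaStackClient, model_id: str) -> bool:
    """Return True if the model accepts chat completions.

    Sketch only: send a one-token throwaway request and treat any API
    error (e.g. the 400 '"..." does not support chat') as False.
    """
    try:
        client.chat.completions.create(
            model=model_id,
            messages=[{"role": "user", "content": "ping"}],
            max_tokens=1,
        )
        return True
    except Exception:
        return False
```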
Do we have to run a chat completion to see if the model supports chat? We already have a model type: https://github.com/llamastack/llama-stack/blob/ffa98595e696c7ab3e0e933d0ed75375ee1d7b84/src/llama_stack_api/models/models.py#L23
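i.e., filter on the advertised type instead of probing each model — a sketch, assuming entries from `models.list()` expose a `model_type` attribute:

```python
# Sketch: pick chat candidates from the advertised model type rather
# than issuing a probe request per model.
llm_models = [m for m in client.models.list() if m.model_type == "llm"]
```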
@raghotham I can see there are some models with `llm` type that still do not support chat, like ollama/all-minilm:latest:
```
(llama-stack) (base) gualiu@gualiu-mac llama-stack % curl -s http://localhost:8321/v1/models \
  | jq '.data[] | select(.id=="ollama/all-minilm:latest")'
{
  "id": "ollama/all-minilm:latest",
  "object": "model",
  "created": 1769569923,
  "owned_by": "llama_stack",
  "custom_metadata": {
    "model_type": "llm",
    "provider_id": "ollama",
    "provider_resource_id": "all-minilm:latest"
  }
}
```

But this model does not support chat:
```
(stack) gualiu@gualiu-mac llama-stack-apps % python -m examples.agents.simple_chat --host localhost --port 8321 --model_id ollama/all-minilm:latest
INFO:httpx:HTTP Request: GET http://localhost:8321/v1/models "HTTP/1.1 200 OK"
Using model: ollama/all-minilm:latest
INFO:httpx:HTTP Request: POST http://localhost:8321/v1/conversations "HTTP/1.1 200 OK"
User> Hello
INFO:httpx:HTTP Request: POST http://localhost:8321/v1/responses "HTTP/1.1 200 OK"
🤔
❌ Turn failed: Error code: 400 - {'error': {'message': '"all-minilm:latest" does not support chat', 'type': 'api_error', 'param': None, 'code': None}}
User> Search web for which players played in the winning team of the NBA western conference semifinals of 2024
INFO:httpx:HTTP Request: POST http://localhost:8321/v1/responses "HTTP/1.1 200 OK"
🤔
❌ Turn failed: Error code: 400 - {'error': {'message': '"all-minilm:latest" does not support chat', 'type': 'api_error', 'param': None, 'code': None}}
```
I think that besides model_type, we may need to add a new capability field to the model; the capability could be chat, completion, tool_calling, etc. Comments?
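Roughly, a hypothetical shape for it (none of these fields exist today; this only illustrates the proposal):

```python
from enum import Enum

from pydantic import BaseModel


class ModelCapability(str, Enum):
    chat = "chat"
    completion = "completion"
    tool_calling = "tool_calling"


class Model(BaseModel):
    identifier: str
    model_type: str  # existing field: "llm" | "embedding"
    # Proposed addition: lets clients filter without a probe request.
    capabilities: list[ModelCapability] = []
```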