
Conversation

@dtrawins
Collaborator

🛠 Summary

CVS-179106

🧪 Checklist

  • Unit tests added.
  • Documentation updated.
  • Change follows security best practices.

@@ -0,0 +1,86 @@
#!/bin/bash -x
#
# Copyright (c) 2024 Intel Corporation
Collaborator


The copyright year should be 2026.

@@ -0,0 +1,22 @@
export MODEL=$1
Collaborator


Missing copyright header.
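For context, a header consistent with the other new scripts in this PR might look like the following. This is a sketch based on the 2024 header shown earlier in the diff, with the year adjusted per the other review comment; the exact notice wording used in this repository is an assumption:

```shell
#!/bin/bash -x
#
# Copyright (c) 2026 Intel Corporation
#
# (first existing line of the script follows)
export MODEL=$1
```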

@atobiszei atobiszei requested a review from Copilot January 13, 2026 15:11
Contributor

Copilot AI left a comment


Pull request overview

This PR updates the Berkeley Function Call Leaderboard (BFCL) integration to version 4, refactoring test infrastructure and adding support for new models and chat templates.

Changes:

  • Refactored testing scripts by extracting model test logic from export_all_models.sh into dedicated test scripts
  • Updated gorilla benchmark integration to a newer version with modified patch configurations
  • Added support for additional models (GPT-OSS-20B, Qwen3-Coder-30B, Devstral) with corresponding chat templates

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Summary per file:

  • tests/accuracy/test_small_models.sh: New script containing the extracted model testing logic with configurable tool-guided generation
  • tests/accuracy/test_single_model.sh: New script for testing individual models with specific configurations
  • tests/accuracy/test_case_ids_to_generate.json: Configuration file defining test case IDs for generation
  • tests/accuracy/install_gorilla.sh: New installation script for the gorilla benchmark with an updated commit hash
  • tests/accuracy/export_all_models.sh: Removed model testing logic and added new model export commands
  • extras/chat_template_examples/chat_template_devstral.jinja: New Devstral chat template with comprehensive system prompts and tool call formatting
  • demos/continuous_batching/accuracy/gorilla.patch: Updated patch for the newer gorilla version with modified configuration handling
  • demos/continuous_batching/accuracy/README.md: Updated documentation with the new gorilla version, installation steps, and test categories


Comment on lines 7 to 12
--rest_port 8000 --model_repository_path /models --source_model ${model_name}-${precision} \
--tool_parser ${tool_parser} --model_name ovms-model \
--cache_size 0 --task text_generation

echo wait for model server to be ready
while [ "$(curl -s http://localhost:8000/v3/models | jq -r '.data[0].id')" != "${model_name}-${precision}" ] ; do echo waiting for LLM model; sleep 1; done

Copilot AI Jan 13, 2026


Continuing from the previous issue, these lines also use the undefined lowercase variables ${model_name} and ${precision} instead of ${MODEL} and ${PRECISION}.

Suggested change, before:

    --rest_port 8000 --model_repository_path /models --source_model ${model_name}-${precision} \
      --tool_parser ${tool_parser} --model_name ovms-model \
      --cache_size 0 --task text_generation
    echo wait for model server to be ready
    while [ "$(curl -s http://localhost:8000/v3/models | jq -r '.data[0].id')" != "${model_name}-${precision}" ] ; do echo waiting for LLM model; sleep 1; done

After:

    --rest_port 8000 --model_repository_path /models --source_model ${MODEL}-${PRECISION} \
      --tool_parser ${tool_parser} --model_name ovms-model \
      --cache_size 0 --task text_generation
    echo wait for model server to be ready
    while [ "$(curl -s http://localhost:8000/v3/models | jq -r '.data[0].id')" != "${MODEL}-${PRECISION}" ] ; do echo waiting for LLM model; sleep 1; done



docker stop ovms 2>/dev/null
docker run -d --name ovms --user $(id -u):$(id -g) --rm -p 8000:8000 -v $(pwd)/models:/models openvino/model_server:latest \
Collaborator


We would most probably need to accept the image label/tag as an argument: openvino/model_server:latest could remain the default, but a user-supplied value should be used here when provided, shouldn't it?
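One way to implement the suggestion, as a sketch only (the positional-argument convention and variable name here are assumptions, not part of the PR):

```shell
#!/bin/bash
# Use the second positional argument as the server image when provided,
# otherwise fall back to the current default tag (convention assumed).
OVMS_IMAGE="${2:-openvino/model_server:latest}"
echo "using image: ${OVMS_IMAGE}"
```

The `docker run` line would then reference "${OVMS_IMAGE}" instead of the hard-coded openvino/model_server:latest.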



3 participants