add Devstral demo as code assistant #3911

base: main
Conversation
Pull request overview
This PR adds support for the Devstral model as a code assistant option in the demo documentation. The changes enable users to deploy and configure the unsloth/Devstral-Small-2507 model for local code assistance tasks.
Changes:
- Added `devstral` as a supported tool parser in the LLM reference documentation
- Introduced deployment instructions for the unsloth/Devstral-Small-2507 model on both Linux and Windows platforms
- Updated existing model deployment commands to include missing parameters and use the `weekly` Docker image tag
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| docs/llm/reference.md | Added devstral to the list of supported tool parsers and updated documentation to reference it |
| demos/code_local_assistant/README.md | Added complete deployment and configuration instructions for Devstral model across Linux and Windows platforms, plus fixes to existing model deployment commands |
demos/code_local_assistant/README.md (outdated)
```
--model_name nsloth/Devstral-Small-2507 \
--model_path nsloth/Devstral-Small-2507
```

**Copilot AI** (Jan 15, 2026)

The model_name and model_path use 'nsloth' instead of 'unsloth', which is inconsistent with the source_model parameter on line 75. This appears to be a typo and should be 'unsloth' to match the actual model identifier.

Suggested change:

```diff
---model_name nsloth/Devstral-Small-2507 \
---model_path nsloth/Devstral-Small-2507
+--model_name unsloth/Devstral-Small-2507 \
+--model_path unsloth/Devstral-Small-2507
```
```
openvino/model_server:weekly \
--pull \
```

**Copilot AI** (Jan 15, 2026)

The command structure is incorrect. In `docker run`, container configuration flags (like -d, --rm, -v) must come before the image name, while everything after the image name is passed to the container's entrypoint. The --pull flag appears to be intended for the ovms executable inside the container, but the command doesn't properly separate Docker flags from OVMS command arguments.
```
openvino/model_server:weekly \
--pull \
```

**Copilot AI** (Jan 15, 2026)

The command structure is incorrect. As with the previous issue, container configuration flags belong before the Docker image name; the --pull flag and the subsequent arguments should be passed to the ovms executable inside the container.
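The ordering rule behind both comments is `docker run [OPTIONS] IMAGE [ARGS...]`: Docker consumes everything before the image name, and everything after it goes to the container's entrypoint (here, the ovms binary). A minimal sketch of the corrected shape, reusing the flags from this PR's commands; the volume mount, port, and model directory are illustrative assumptions, not taken from the README:

```shell
# Container configuration flags (consumed by Docker) go BEFORE the image name.
# Everything AFTER the image name is passed to the ovms entrypoint inside
# the container.
docker run -d --rm \
  -v "$(pwd)/models:/models" \
  -p 8000:8000 \
  openvino/model_server:weekly \
  --pull \
  --model_name unsloth/Devstral-Small-2507 \
  --model_path /models/Devstral-Small-2507
```

If `--pull` were placed before the image name, Docker itself would try to interpret it as a `docker run` option and the command would fail rather than forwarding it to ovms.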
> **Note:** For deployment, the model requires ~16GB disk space and recommended 16GB+ of VRAM on the GPU. For conversion, the original model will be pulled and quantization will require the amount of RAM of the model size.

Suggested change:

```diff
-> **Note:** For deployment, the model requires ~16GB disk space and recommended 16GB+ of VRAM on the GPU. For conversion, the original model will be pulled and quantization will require the amount of RAM of the model size.
+> **Note:** For deployment, the model requires ~16GB disk space and recommended 16GB+ of VRAM on the GPU. For conversion, the original model will be pulled and quantization will be applied. It requires the amount of RAM equal to the model size <how much?>
```

Please fill in "how much". Is it 150 GB?
```
--model_name unsloth/Devstral-Small-2507 \
--model_path unsloth/Devstral-Small-2507
```

> **Note:** This model requires ~13GB disk space and recommended 16GB+ of VRAM on the GPU for deployment. For conversion, the original model will be pulled and quantization will require the amount of RAM of the model size.

Suggested change:

```diff
-> **Note:** This model requires ~13GB disk space and recommended 16GB+ of VRAM on the GPU for deployment. For conversion, the original model will be pulled and quantization will require the amount of RAM of the model size.
+> **Note:** This model requires ~13GB disk space and recommended 16GB+ of VRAM on the GPU for deployment. For conversion, the original model will be pulled and quantization will be applied. It requires the amount of RAM equal to the model size <how much?>
```

Please fill in "how much".
Also, what about the VRAM information for the other models? We are missing it for qwen3 and qwen2.5.
```
ovms.exe --add_to_config --config_path models/config_all.json --model_name openai/gpt-oss-20b --model_path openai/gpt-oss-20b
```

> **Note:** This model requires ~13GB disk space and same amount of VRAM on the GPU for deployment. For conversion, the original model will be pulled and quantization will require the amount of RAM of the model size.

> **Note:** This model requires ~12GB disk space and recommended 16GB+ of VRAM on the GPU for deployment. For conversion, the original model will be pulled and quantization will require the amount of RAM of the model size.

Suggested change:

```diff
-> **Note:** This model requires ~12GB disk space and recommended 16GB+ of VRAM on the GPU for deployment. For conversion, the original model will be pulled and quantization will require the amount of RAM of the model size.
+> **Note:** This model requires ~12GB disk space and recommended 16GB+ of VRAM on the GPU for deployment. For conversion, the original model will be pulled and quantization will be applied. It requires the amount of RAM equal to the model size <how much?>
```
```
ovms.exe --add_to_config --config_path models/config_all.json --model_name unsloth/Devstral-Small-2507 --model_path unsloth/Devstral-Small-2507
```

> **Note:** This model requires ~13GB disk space and recommended 16GB+ of VRAM on the GPU for deployment. For conversion, the original model will be pulled and quantization will require the amount of RAM of the model size.

Suggested change:

```diff
-> **Note:** This model requires ~13GB disk space and recommended 16GB+ of VRAM on the GPU for deployment. For conversion, the original model will be pulled and quantization will require the amount of RAM of the model size.
+> **Note:** This model requires ~13GB disk space and recommended 16GB+ of VRAM on the GPU for deployment. For conversion, the original model will be pulled and quantization will be applied. It requires the amount of RAM equal to the model size <how much?>
```