Closed
13 changes: 4 additions & 9 deletions .github/workflows/pre-commit.yml
@@ -14,16 +14,11 @@ jobs:
with:
fetch-depth: 0

- name: Set up Python
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.12"
- name: Install uv
uses: astral-sh/setup-uv@e58605a9b6da7c637471fab8847a5e5a6b8df081 # v5
Copilot AI Apr 3, 2026

This workflow now relies on whatever Python happens to be on the runner. To avoid non-reproducible pre-commit results, explicitly set up the intended Python version (3.12) or configure the uv setup step accordingly.

Suggested change
uses: astral-sh/setup-uv@e58605a9b6da7c637471fab8847a5e5a6b8df081 # v5
uses: astral-sh/setup-uv@e58605a9b6da7c637471fab8847a5e5a6b8df081 # v5
with:
python-version: "3.12"

- name: Install dependencies
run: |
python -m pip install pip==26.0.1
pip install -e ".[dev]"
run: uv sync --frozen --extra dev

- name: Run pre-commit
run: |
pre-commit run --all-files --show-diff-on-failure
run: uv run pre-commit run --all-files --show-diff-on-failure
29 changes: 6 additions & 23 deletions .github/workflows/test.yml
@@ -9,26 +9,14 @@ on:
jobs:
test:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["3.12"]

steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2

- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: ${{ matrix.python-version }}

- name: Install dependencies
run: |
python -m pip install pip==26.0.1
pip install -e ".[test]"
- name: Install uv
uses: astral-sh/setup-uv@e58605a9b6da7c637471fab8847a5e5a6b8df081 # v5

- name: Run tests
run: |
pytest -xv -m "not slow and not performance" --cov=src --cov-report=xml --cov-report=html
run: uv run --frozen --extra test pytest -xv -m "not slow and not performance" --cov=src --cov-report=xml --cov-report=html
Comment on lines +15 to +19
Copilot AI Apr 3, 2026

CI no longer pins/sets up a specific Python version. Since this project requires Python >=3.12 and previously enforced 3.12, consider explicitly setting up Python (or configuring the uv setup step to install/use 3.12) to keep CI reproducible and avoid breakage if the runner’s default Python changes.
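
One way to catch the interpreter drift described here is a small guard that compares the running interpreter against the pin. This is a minimal sketch, assuming the `.python-version` file added in this PR holds a `major.minor` string such as `3.12`; the helper name `matches_pin` is hypothetical and not part of the PR.

```python
import sys


def matches_pin(pin: str, version_info=sys.version_info) -> bool:
    """Return True if the interpreter's major.minor equals the pinned one.

    `pin` is the raw text of a version file, e.g. "3.12\n" (assumed format).
    Only the first two components are compared, so "3.12.11" also works.
    """
    parts = pin.strip().split(".")
    major, minor = int(parts[0]), int(parts[1])
    return (version_info[0], version_info[1]) == (major, minor)
```

A CI step could read `.python-version`, pass its contents to `matches_pin`, and fail the job on `False`. That makes a changed runner default visible immediately instead of surfacing as an unrelated test failure.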

- name: Upload coverage to Codecov
uses: codecov/codecov-action@57e3a136b779b570ffcdbf80b3bdc90e7fab3de2 # v6.0.0
@@ -43,13 +31,8 @@ jobs:
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2

- name: Set up Python
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.12"
- name: Install uv
uses: astral-sh/setup-uv@e58605a9b6da7c637471fab8847a5e5a6b8df081 # v5

- name: Audit dependencies for known vulnerabilities
run: |
python -m pip install pip==26.0.1
pip install -e ".[dev,test,performance]"
pip-audit
run: uv run --frozen --extra dev --extra test --extra performance pip-audit
7 changes: 7 additions & 0 deletions .pre-commit-config.yaml
@@ -62,3 +62,10 @@ repos:
types: [python]
pass_filenames: true
exclude: ^(src/inference_endpoint/openai/openai_types_gen.py)$

- id: uv-lock-check
name: Check uv.lock is up-to-date
entry: uv lock --check
language: system
Copilot AI Apr 3, 2026

This hook runs uv lock --check with language: system, so pre-commit will fail on machines that don’t have the uv binary installed on PATH (including pip-only setups). Consider switching to a hook that provisions uv automatically or ensure uv is installed as part of the documented/dev dependency setup so non-uv contributors aren’t blocked.

Suggested change
language: system
language: python
additional_dependencies:
- uv
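
If the hook stays on `language: system`, another option is to point `entry` at a thin wrapper that detects a missing `uv` binary and reports it clearly. The sketch below is an assumption, not something in this PR: the function name `check_uv_lock` and the idea of a wrapper script are hypothetical, and only the `uv lock --check` command comes from the hook itself.

```python
import shutil
import subprocess
import sys


def check_uv_lock(which=shutil.which, run=subprocess.run) -> int:
    """Run `uv lock --check` when uv is on PATH; otherwise explain why it cannot run.

    `which` and `run` are injectable so the logic can be exercised without uv.
    """
    if which("uv") is None:
        # Fail with an actionable message rather than an opaque "command not found".
        print("uv not found on PATH; install uv to verify uv.lock", file=sys.stderr)
        return 1
    return run(["uv", "lock", "--check"]).returncode
```

Returning `0` in the missing-binary branch would make the hook advisory for pip-only contributors, at the cost of letting a stale `uv.lock` through locally. Which behaviour is right is a project decision.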

pass_filenames: false
files: ^pyproject\.toml$
1 change: 1 addition & 0 deletions .python-version
@@ -0,0 +1 @@
3.12
11 changes: 9 additions & 2 deletions AGENTS.md
@@ -9,7 +9,11 @@ High-performance benchmarking tool for LLM inference endpoints targeting 50k+ QP
## Common Commands

```bash
# Development setup
# Development setup (uv — recommended)
uv sync --extra dev --extra test
uv run pre-commit install

# Development setup (pip — still works)
python3.12 -m venv venv && source venv/bin/activate
pip install -e ".[dev,test]"
pre-commit install
@@ -32,6 +36,9 @@ inference-endpoint probe --endpoints http://localhost:8765 --model test-model
inference-endpoint benchmark offline --endpoints URL --model NAME --dataset PATH
inference-endpoint benchmark online --endpoints URL --model NAME --dataset PATH --load-pattern poisson --target-qps 100
inference-endpoint benchmark from-config --config config.yaml

# Or with uv (auto-creates venv, uses lockfile)
uv run inference-endpoint benchmark offline --endpoints URL --model NAME --dataset PATH
```

## Architecture
@@ -347,5 +354,5 @@ Known failure modes when AI tools generate code for this project. Reference thes

### Dependency & Environment

- **Adding new dependencies without justification**: AI may `pip install` or add imports for packages not in `pyproject.toml`. Any new dependency must be justified, added to the correct optional group, and pinned to an exact version (`==`). After adding a dependency, run `pip-audit` (included in `dev` extras) to verify it has no known vulnerabilities.
- **Adding new dependencies without justification**: AI may `pip install` or add imports for packages not in `pyproject.toml`. Any new dependency must be justified, added to the correct optional group, and pinned to an exact version (`==`). After adding a dependency, run `pip-audit` (included in `dev` extras) to verify it has no known vulnerabilities. When adding dependencies, use `uv add <package>==<version>` to update both `pyproject.toml` and `uv.lock` atomically, then run `uv run pip-audit` to check for vulnerabilities.
- **Using `requests`/`aiohttp` for HTTP**: This project has its own HTTP client (`endpoint_client/http.py`) using `httptools`. AI defaults to `requests` or `aiohttp` — these should not appear in production code (test dependencies are fine).
27 changes: 24 additions & 3 deletions README.md
@@ -13,7 +13,27 @@ A high-performance benchmarking tool for LLM endpoints.
# Note: This repo will be migrated to https://github.com/mlcommons/endpoints
git clone https://github.com/mlcommons/endpoints.git
cd endpoints
```

**Option A: uv (recommended)** — faster, lockfile-verified dependencies

```bash
# As a user
uv sync

# As a developer (with development and test extras)
uv sync --extra dev --extra test
uv run pre-commit install

# Activate the venv to use python/pytest/etc. directly (optional)
source .venv/bin/activate
pytest -m "not performance and not run_explicitly"
inference-endpoint --help
```

**Option B: pip + venv**

```bash
# Create virtual environment
python3.12 -m venv venv
source venv/bin/activate
@@ -83,10 +103,11 @@ See [Local Testing Guide](docs/LOCAL_TESTING.md) for detailed instructions.
### Running Tests and Examples

```bash
# Install test dependencies
pip install ".[test]"
# With uv (after uv sync --extra test)
uv run pytest -m "not performance and not run_explicitly"

# Run tests (excluding performance and explicit-run tests)
# With pip (activate venv first)
pip install ".[test]"
pytest -m "not performance and not run_explicitly"

# Run examples: follow instructions in examples/*/README.md
33 changes: 16 additions & 17 deletions pyproject.toml
@@ -1,6 +1,20 @@
[build-system]
requires = ["setuptools==78.1.1", "wheel==0.46.3"]
build-backend = "setuptools.build_meta"
requires = ["uv_build>=0.7.6,<0.8"]
high

The build system requirement should be uv-build-backend, not uv_build. While the backend entry point is named uv_build, the package name on PyPI is uv-build-backend. Using the incorrect name will cause pip and other PEP 517 compatible tools to fail when attempting to build the project from source.

Suggested change
requires = ["uv_build>=0.7.6,<0.8"]
requires = ["uv-build-backend>=0.1.0"]

build-backend = "uv_build"

[tool.uv]
index-url = "https://pypi.org/simple"
environments = [
"sys_platform == 'linux' and platform_machine == 'x86_64'",
"sys_platform == 'linux' and platform_machine == 'aarch64'",
"sys_platform == 'darwin' and platform_machine == 'x86_64'",
"sys_platform == 'darwin' and platform_machine == 'arm64'",
]

[tool.uv.build]
module-root = "src"
data = {"inference_endpoint" = ["config/templates/*.yaml"]}
exclude = ["evaluation/livecodebench/_server.py"]
medium

The exclude path should be relative to the project root to ensure the file is correctly identified and excluded from the build. Since module-root is set to src, the path should include the src/ prefix to match the actual file location.

Suggested change
exclude = ["evaluation/livecodebench/_server.py"]
exclude = ["src/inference_endpoint/evaluation/livecodebench/_server.py"]

Copilot AI Apr 3, 2026

tool.uv.build.exclude path looks incorrect given the actual file location is src/inference_endpoint/evaluation/livecodebench/_server.py. With module-root = "src", the exclude entry likely needs to include the inference_endpoint/ prefix; otherwise _server.py may be unintentionally packaged.

Suggested change
exclude = ["evaluation/livecodebench/_server.py"]
exclude = ["inference_endpoint/evaluation/livecodebench/_server.py"]

[project]
name = "inference-endpoint"
@@ -112,21 +126,6 @@ Documentation = "https://github.com/mlperf/inference-endpoint#readme"
Repository = "https://github.com/mlperf/inference-endpoint.git"
Issues = "https://github.com/mlperf/inference-endpoint/issues"

[tool.setuptools.packages.find]
where = ["src"]

[tool.setuptools.package-dir]
"" = "src"

[tool.setuptools.package-data]
inference_endpoint = ["config/templates/*.yaml"]

[tool.setuptools.exclude-package-data]
"inference_endpoint.evaluation.livecodebench" = ["_server.py"]

[tool.autopep8]
max_line_length = 88

[tool.ruff]
target-version = "py312"
line-length = 88
15 changes: 10 additions & 5 deletions scripts/Dockerfile.dev
@@ -5,20 +5,25 @@

FROM python:3.12.11-slim

# Copy uv binary from official image
COPY --from=ghcr.io/astral-sh/uv:0.7.6 /uv /uvx /bin/
Comment on lines +8 to +9
Copilot AI Apr 3, 2026

For supply-chain reproducibility, note that an image tag such as ghcr.io/astral-sh/uv:0.7.6 is still mutable. Consider pinning the --from image to an immutable digest so builds can’t change if the tag is retargeted.

Suggested change
# Copy uv binary from official image
COPY --from=ghcr.io/astral-sh/uv:0.7.6 /uv /uvx /bin/
# Copy uv binary from an immutable official image reference
COPY --from=ghcr.io/astral-sh/uv:0.7.6@sha256:<verified-digest-for-0.7.6> /uv /uvx /bin/
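
A lint along these lines could flag tag-only references automatically. This is a minimal sketch under stated assumptions: the function name `unpinned_images` is hypothetical, the parsing is a simple regex pass rather than a real Dockerfile parser, and every `--from=` value is treated as an external image, so a named build stage would be reported as well.

```python
import re

# An immutable reference ends in a sha256 digest of 64 lowercase hex characters.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")
# FROM may carry flags such as --platform=... before the image reference.
FROM_RE = re.compile(r"FROM\s+(?:--\S+\s+)*(\S+)", re.IGNORECASE)
COPY_FROM_RE = re.compile(r"COPY\s+--from=(\S+)", re.IGNORECASE)


def unpinned_images(dockerfile_text: str) -> list[str]:
    """Return image references from FROM / COPY --from= lines that lack a digest."""
    refs = []
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        match = FROM_RE.match(line) or COPY_FROM_RE.match(line)
        if match:
            refs.append(match.group(1))
    return [ref for ref in refs if not DIGEST_RE.search(ref)]
```

Run against the Dockerfile in this PR, such a check would presumably report both the `python:3.12.11-slim` base image and the `uv` image, since both are referenced by tag only.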

# Set environment variables
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
PIP_NO_CACHE_DIR=1 \
PIP_DISABLE_PIP_VERSION_CHECK=1
UV_COMPILE_BYTECODE=1 \
UV_LINK_MODE=copy

# Install system dependencies
RUN apt-get update && \
apt-get install -y --no-install-recommends build-essential procps \
apt-get install -y --no-install-recommends build-essential procps \
&& rm -rf /var/lib/apt/lists/*

RUN mkdir /mnt/inference-endpoint
WORKDIR /mnt/inference-endpoint
COPY pyproject.toml .

# Copy lockfile + project metadata first for Docker layer caching
COPY pyproject.toml uv.lock .python-version ./
COPY src/ ./src/
Comment on lines +26 to 27
medium

To optimize Docker layer caching, it is recommended to install dependencies before copying the full source code. This prevents the uv sync step from re-running every time a source file is modified. You can use uv sync --no-install-project to install only the dependencies first.

COPY pyproject.toml uv.lock .python-version ./
RUN uv sync --frozen --no-install-project --extra dev --extra test
COPY src/ ./src/


# Create a non-root user for security
@@ -33,4 +38,4 @@ RUN if ! getent group ${GROUP_ID}; then \
USER appuser
ENV PATH="/home/appuser/.local/bin:$PATH"

RUN pip install -e .[dev,test]
RUN uv sync --frozen --extra dev --extra test
high

When running the container with a volume mount (as suggested in the file header: -v $(pwd):/mnt/inference-endpoint), the .venv directory created here will be obscured by the host directory. This will result in the installed dependencies being unavailable in the running container. To resolve this, consider setting UV_PROJECT_ENVIRONMENT to a location outside the project root (e.g., /opt/venv) and adding that location to the PATH.
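
The shadowing condition can be stated precisely: the environment is hidden whenever its path sits at or below the bind-mount target. A small sketch of that check follows; the function name `venv_shadowed` is hypothetical, and `/opt/venv` is the example location from the comment above, not a path used by the PR.

```python
from pathlib import PurePosixPath


def venv_shadowed(venv_path: str, mount_point: str) -> bool:
    """Return True if venv_path is at or under mount_point.

    A bind mount replaces everything below its target, so an environment
    created there at build time is not visible in the running container.
    Comparison is by path component, not by string prefix.
    """
    return PurePosixPath(venv_path).is_relative_to(PurePosixPath(mount_point))
```

With the working directory `/mnt/inference-endpoint` and uv's default project environment at `.venv`, the check returns `True` for the current Dockerfile layout and `False` once the environment is moved to something like `/opt/venv`.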
