Model server - Wrap any Python predictor class into a production-ready FastAPI inference API using a simple YAML configuration file.
You have a Python class with `predict()` and/or `predict_proba()` methods. This tool:
- Wraps it in a FastAPI server with `/predict` and `/predict_proba` endpoints
- Adds Prometheus metrics, structured logging, and health checks
- Handles input validation and format conversion automatically
- Provides Docker containerization with version tracking
- Generates GitHub Actions workflows for automated CI/CD builds
```bash
pip install git+https://github.com/core64-lab/merve.git
```

For development:
```bash
git clone https://github.com/core64-lab/merve.git
cd merve
pip install -e ".[dev]"
```

Define a predictor class:

```python
# mlserver_predictor.py
import joblib

class MyPredictor:
    def __init__(self, model_path: str):
        self.model = joblib.load(model_path)

    def predict(self, data):
        # data: list of dicts or numpy array
        return self.model.predict(data)

    def predict_proba(self, data):
        return self.model.predict_proba(data)
```

Describe it in a YAML config (e.g. `mlserver.yaml`):

```yaml
predictor:
  module: mlserver_predictor
  class_name: MyPredictor
  init_kwargs:
    model_path: ./model.pkl

classifier:
  name: my-classifier
  version: 1.0.0
```

That's the minimal configuration. The server defaults to `0.0.0.0:8000`.
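Before starting the server, you can sanity-check the predictor directly. This is a minimal sketch: it assumes `./model.pkl` exists and that your underlying model accepts the record format used in the example above.

```python
# Quick local check of the predictor before serving it.
# Assumes ./model.pkl exists and the model accepts list-of-dict input,
# as in the MyPredictor example above.
from mlserver_predictor import MyPredictor

predictor = MyPredictor(model_path="./model.pkl")
sample = [{"feature1": 1.0, "feature2": 2.0}]

print(predictor.predict(sample))
print(predictor.predict_proba(sample))
```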
Start the server:

```bash
mlserver serve
```

```bash
# Single prediction
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"instances": [{"feature1": 1.0, "feature2": 2.0}]}'

# Batch prediction (same endpoint)
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"instances": [
        {"feature1": 1.0, "feature2": 2.0},
        {"feature1": 3.0, "feature2": 4.0}
      ]}'
```

| Endpoint | Method | Description |
|---|---|---|
| `/predict` | POST | Model predictions (single or batch) |
| `/predict_proba` | POST | Probability predictions |
| `/healthz` | GET | Health check |
| `/info` | GET | Server and model metadata |
| `/status` | GET | Detailed status information |
| `/metrics` | GET | Prometheus metrics |
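The same calls can be made from Python with `requests`. This is a minimal client sketch: it assumes the server from the quick start is running on `localhost:8000` and simply prints the JSON body, since the exact response schema depends on your predictor.

```python
import requests

BASE = "http://localhost:8000"
payload = {"instances": [{"feature1": 1.0, "feature2": 2.0},
                         {"feature1": 3.0, "feature2": 4.0}]}

# Predictions (single or batch)
resp = requests.post(f"{BASE}/predict", json=payload, timeout=10)
resp.raise_for_status()
print(resp.json())

# Class probabilities
resp = requests.post(f"{BASE}/predict_proba", json=payload, timeout=10)
resp.raise_for_status()
print(resp.json())
```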
Minimal configuration:

```yaml
predictor:
  module: mlserver_predictor
  class_name: MyPredictor
  init_kwargs:
    model_path: ./model.pkl

classifier:
  name: my-classifier
  version: 1.0.0
```

Full configuration with all options:

```yaml
server:
  host: 0.0.0.0
  port: 8000
  workers: 1
  log_level: INFO
  cors:
    allow_origins: []

predictor:
  module: mlserver_predictor
  class_name: MyPredictor
  init_kwargs:
    model_path: ./model.pkl

classifier:
  name: my-classifier
  version: 1.0.0
  description: My ML classifier

api:
  adapter: auto                  # auto | records | ndarray
  feature_order: [col1, col2]    # or path to JSON file
  thread_safe_predict: false
  max_concurrent_predictions: 1
  warmup_on_start: true

endpoints:
  predict: true
  predict_proba: true

observability:
  metrics: true
  structured_logging: true
  correlation_ids: true
  log_payloads: false
```

Serve multiple models from one repository:
```yaml
server:
  host: 0.0.0.0
  port: 8000

classifiers:
  sentiment:
    predictor:
      module: sentiment_predictor
      class_name: SentimentPredictor
      init_kwargs:
        model_path: ./models/sentiment.pkl
    classifier:
      name: sentiment
      version: 1.0.0
  fraud:
    predictor:
      module: fraud_predictor
      class_name: FraudPredictor
      init_kwargs:
        model_path: ./models/fraud.pkl
    classifier:
      name: fraud
      version: 2.0.0
```

Run a specific classifier:

```bash
mlserver serve --classifier sentiment
mlserver build --classifier sentiment
```

| Command | Description |
|---|---|
| `mlserver serve [config.yaml]` | Start the server |
| `mlserver build --classifier <name>` | Build Docker container |
| `mlserver tag <patch\|minor\|major> -c <name>` | Create version tag |
| `mlserver push --classifier <name>` | Push to container registry |
| `mlserver run --classifier <name>` | Run container locally |
| `mlserver images` | List built images |
| `mlserver clean --classifier <name>` | Remove built images |
| `mlserver list-classifiers` | List classifiers in config |
| `mlserver version [--json]` | Show version info |
| `mlserver status` | Show system status |
| `mlserver validate` | Validate configuration |
| `mlserver doctor` | Diagnose common issues |
| `mlserver test` | Test against running server |
| `mlserver init` | Initialize new project |
| `mlserver init-github` | Generate GitHub Actions workflow |
```bash
# Build container
mlserver build --classifier my-classifier

# Run locally
mlserver run --classifier my-classifier

# Or manually
docker run -p 8000:8000 my-repo/my-classifier:latest
```

Create hierarchical tags that track both the classifier and mlserver versions:

```bash
# Create patch version bump (1.0.0 -> 1.0.1)
mlserver tag patch --classifier my-classifier

# Push to trigger GitHub Actions
git push --tags
```

Tag format: `<classifier>-v<version>-mlserver-<commit>`
Example: `my-classifier-v1.0.1-mlserver-abc123d`
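If you need to take a tag apart in your own scripts or CI steps (for example to recover the classifier name, version, and commit), a hedged sketch follows: the regex mirrors the format above and assumes semantic versions and short hex commit hashes.

```python
import re

# <classifier>-v<version>-mlserver-<commit>
TAG_RE = re.compile(
    r"^(?P<classifier>.+)-v(?P<version>\d+\.\d+\.\d+)-mlserver-(?P<commit>[0-9a-f]+)$"
)

m = TAG_RE.match("my-classifier-v1.0.1-mlserver-abc123d")
if m:
    print(m.group("classifier"), m.group("version"), m.group("commit"))
    # -> my-classifier 1.0.1 abc123d
```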
Initialize the workflow:
```bash
mlserver init-github
```

This creates `.github/workflows/ml-classifier-container-build.yml`, which:
- Triggers on hierarchical tags
- Installs the exact mlserver version from the tag
- Builds and tests the container
- Pushes to GHCR or ECR
Configure the registry in `mlserver.yaml`:

```yaml
deployment:
  registry:
    type: ghcr        # or ecr
    namespace: your-org
```

The API auto-detects the input format.

Records (list of dicts):

```json
{"instances": [{"age": 25, "income": 50000}]}
```

ndarray (nested lists):

```json
{"instances": [[25, 50000]]}
```

Force a specific format with `api.adapter: records` or `api.adapter: ndarray`.
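To make the two formats concrete, here is a conceptual sketch of what a records-style adapter has to do before handing data to an array-based model. This is illustrative only, not the library's internal code; the `feature_order` argument plays the same role as the `api.feature_order` setting above.

```python
import numpy as np

def records_to_ndarray(instances, feature_order):
    """Convert a list of dicts into a 2D array with columns in a fixed order."""
    return np.array([[row[name] for name in feature_order] for row in instances])

records = [{"age": 25, "income": 50000}, {"age": 40, "income": 72000}]
X = records_to_ndarray(records, feature_order=["age", "income"])
print(X)
# [[   25 50000]
#  [   40 72000]]
```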
Built-in observability with no extra configuration:

- Prometheus metrics at `/metrics`
- Structured JSON logging with correlation IDs
- Health checks at `/healthz`
- Request tracing via the `X-Correlation-ID` header

An example Prometheus + Grafana setup is included in the `monitoring/` directory.
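For example, you can attach your own correlation ID to a request so the matching structured log lines can be tied back to it. This is a sketch only: it assumes the defaults above (`correlation_ids: true`), a server on `localhost:8000`, and that a caller-supplied `X-Correlation-ID` is honored.

```python
import uuid
import requests

# Supply a correlation ID so this request can be traced in the logs.
cid = str(uuid.uuid4())
resp = requests.post(
    "http://localhost:8000/predict",
    json={"instances": [{"feature1": 1.0, "feature2": 2.0}]},
    headers={"X-Correlation-ID": cid},
    timeout=10,
)
print(cid, resp.status_code)

# Health and metrics endpoints can be probed the same way.
print(requests.get("http://localhost:8000/healthz", timeout=5).status_code)
```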
- Python 3.9+
- Docker (for containerization)
- Git (for version tagging)
```
Request -> FastAPI -> InputAdapter -> Predictor.predict() -> Response
                          |
                  Metrics + Logging
```
See the `examples/` directory for complete working examples:
- Single classifier setup
- Multi-classifier repository
- Custom preprocessing
- Model ensembles
Run the test suite:
```bash
# Run all tests
pytest tests/

# Run with coverage
pytest tests/ --cov=mlserver --cov-report=term-missing

# Run specific test categories
pytest tests/unit/          # Unit tests
pytest tests/integration/   # Integration tests
```

Current status: 860 tests passing, 63% coverage.
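For your own project, a minimal smoke test against a running server can look like the sketch below. It assumes the quick-start server is up on `localhost:8000`; it is not part of the bundled test suite.

```python
# test_smoke.py -- run with: pytest test_smoke.py
import requests

BASE = "http://localhost:8000"

def test_healthz():
    # The server should report healthy.
    assert requests.get(f"{BASE}/healthz", timeout=5).status_code == 200

def test_predict_batch():
    # A batch request to /predict should succeed.
    payload = {"instances": [{"feature1": 1.0, "feature2": 2.0},
                             {"feature1": 3.0, "feature2": 4.0}]}
    resp = requests.post(f"{BASE}/predict", json=payload, timeout=10)
    assert resp.status_code == 200
```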
```bash
# Validate configuration
mlserver validate

# Diagnose environment issues
mlserver doctor

# Check server status
mlserver status
```

Common issues:
- Import errors: ensure the predictor module is on the Python path
- Memory issues: reduce `server.workers` (each worker loads the full model)
- Slow first request: enable `api.warmup_on_start: true`
Apache License 2.0 - see LICENSE file.
- MLflow Models - Full ML lifecycle platform
- BentoML - Feature-rich model serving
- TorchServe / TensorFlow Serving - Framework-specific
This tool focuses on simplicity: wrap any Python predictor with minimal configuration, no framework lock-in.