slaclab/inference-service
Kubernetes Deployment Template for Inference Service

This repo contains a FastAPI service for serving LUME models from MLflow. With a single Docker image, multiple model deployments can be configured via environment variables. Any change to the client code automatically rebuilds the base image through GitHub Actions CI.

Environment Variables

  1. MLFLOW_TRACKING_URI: set in the shared mlflow-config ConfigMap.
  2. MODEL_NAME: the registered MLflow model name, e.g. lcls_cu_inj_model or lcls-fel-surrogate.
  3. MODEL_VERSION: the model version to serve, e.g. 1.
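As an illustrative sketch (not the actual generated manifest), these variables land on the serving container roughly like this; the ConfigMap key name is an assumption based on the mlflow-config note above:

```
containers:
  - name: inference-service
    image: ghcr.io/slaclab/inference-service:latest
    env:
      - name: MODEL_NAME
        value: lcls-fel-surrogate
      - name: MODEL_VERSION
        value: "1"
      - name: MLFLOW_TRACKING_URI
        valueFrom:
          configMapKeyRef:
            name: mlflow-config        # shared ConfigMap noted above
            key: MLFLOW_TRACKING_URI   # key name is an assumption
```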

Testing the inference service image

To test the functionality of the image, you can create a temporary pod from the test-client image (also built by the CI) and run the checks inside it.

kubectl run test -n inference-service --image=ghcr.io/slaclab/inference-service/test-client:latest --rm -it --restart=Never --env="INFERENCE_SERVICE_URL=http://inference-service:8000" -- python test_validation.py

kubectl run test -n inference-service --image=ghcr.io/slaclab/inference-service/test-client:latest --rm -it --restart=Never --env="INFERENCE_SERVICE_URL=http://inference-service:8000" -- python test_client.py
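The test scripts themselves live in the test-client image. As a rough, hypothetical sketch of the kind of check such a script performs (the /predict endpoint, payload shape, and field names here are assumptions for illustration, not the service's actual API):

```python
import json
import os
import urllib.request


def build_request(base_url: str, inputs: dict) -> tuple[str, bytes]:
    """Build the URL and JSON body for a prediction call (endpoint name is assumed)."""
    url = base_url.rstrip("/") + "/predict"
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return url, body


def check_service(base_url: str) -> dict:
    """POST a sample payload to the service and return the decoded JSON response."""
    url, body = build_request(base_url, {"x": 1.0})  # "x" is a placeholder input
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


if __name__ == "__main__" and "INFERENCE_SERVICE_URL" in os.environ:
    # INFERENCE_SERVICE_URL is injected via --env in the kubectl commands above.
    print(check_service(os.environ["INFERENCE_SERVICE_URL"]))
```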

Template for inference service deployment

The Copier template generates Kubernetes manifests for deploying ML inference services. You can either supply answers in a YAML data file, as shown below, or run Copier interactively.

  1. Supply a simple data file with the values Copier needs to generate the deployment YAML, for example:
service_name: iris-service
namespace: inference-service
model_name: iris-model
model_version: "1"
# mlflow_uri removed - it's in the shared mlflow-config ConfigMap
container_registry: ghcr.io/slaclab/inference-service
replicas: 2
memory_request: "4Gi"
memory_limit: "8Gi"
cpu_request: "1000m"
cpu_limit: "4000m"

An example of this data file is model-configs/iris-model.yaml. You can generate the manifests in the deployments/iris-model directory using this command from the repo root:

copier copy --data-file model-configs/iris-model.yaml copier-template-k8s deployments/iris-model

  2. Alternatively, run the Copier command below and answer its prompts to generate the deployment YAML:
copier copy copier-template-k8s deployments/iris-model

Either command creates the deployment YAML in the folder given as the last argument. In interactive mode, these are the questions the template asks:

(test-bed) bash-5.3$ copier copy copier-template-k8s deployments/fel-model
🎤 What is the service name?
   inference-service-fel
🎤 Which Kubernetes namespace?
   lume-online-ml
🎤 What is the MLflow model name?
   lcls-fel-surrogate
🎤 What model version to deploy?
   1
🎤 Container registry (e.g., ghcr.io/username/repo)?
   ghcr.io/slaclab/inference-service
🎤 Number of replicas?
   1
🎤 Memory request (e.g., 2Gi)?
   2Gi
🎤 Memory limit (e.g., 4Gi)?
   4Gi
🎤 CPU request (e.g., 500m)?
   500m
🎤 CPU limit (e.g., 2000m)?
   2000m

Copying from template version None
    create  deployment.yaml

You can then deploy the generated YAML to the lume-online-ml namespace either manually, using kubectl apply -f deployment.yaml -n lume-online-ml, or via ArgoCD, which we have configured to automate the deployments.
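For the ArgoCD path, a minimal Application manifest would look roughly like the sketch below. The repo URL, path, project, and sync policy are illustrative placeholders based on the workflow above, not the actual ArgoCD configuration:

```
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: inference-service-fel
  namespace: argocd
spec:
  project: default                      # placeholder
  source:
    repoURL: https://github.com/slaclab/inference-service
    path: deployments/fel-model         # folder produced by Copier above
    targetRevision: main
  destination:
    server: https://kubernetes.default.svc
    namespace: lume-online-ml
  syncPolicy:
    automated: {}                       # sync automatically on new commits
```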
