Anyscale on AKS Demo Guide

Audience: Microsoft Field Engineers
Goal: Provide guidance on how to demo Anyscale on AKS functionality


Table of Contents

  • Prerequisites
  • Demo 1: Multi-modal Batch Inference
  • Demo 2: Deploy LLMs
  • Tips for a Successful Demo
  • Support


Prerequisites

Step 1: Request Access

Contact [email protected] for access credentials to the demo Anyscale organization.

Step 2: Login

Navigate to console.anyscale.com and sign in with your credentials.

Anyscale Console Login


Demo 1: Multi-modal Batch Inference

Preparation

  1. Launch a workspace with Multi-Modal AI template

    • From the Anyscale console, create a new workspace (Create from template)
    • Select the "Multi-Modal AI" template

    Multi-Modal AI Template Selection

    • Launch the template
  2. Modify compute configuration (a scripted equivalent is sketched after this list)

    • Terminate the workspace (if already running)
    • Navigate to compute configuration settings

    Compute Configuration Settings

    • Change the head node:
      • From: 2CPU-8GB
      • To: 8CPU-32GB
    • Change the worker nodes:
      • From: Auto-select workers
      • To: 4 x T4 GPUs

    Head and Worker Node Configuration

  3. Modify the container image

    • Select image "anyscale/ray:2.49.1-py312-cu128"
  4. Re-launch the workspace

    • Start the Workspace with the new configuration
    • Observe the workspace logs for a successful launch. Below is an example:

    Example Workspace Logs
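
If you prefer to script the compute changes from step 2 rather than clicking through the console, the Anyscale Python SDK can register an equivalent compute config. A minimal sketch: the T4 worker instance-type name below is an assumption, so substitute whatever T4 type your demo cloud exposes.

import anyscale
from anyscale.compute_config.models import (
    ComputeConfig,
    HeadNodeConfig,
    WorkerNodeGroupConfig,
)

# Sketch only: instance-type names vary by cloud; check the types your
# demo cloud actually offers before creating the config.
config = ComputeConfig(
    head_node=HeadNodeConfig(
        instance_type="8CPU-32GB",  # upgraded from the template's 2CPU-8GB
    ),
    worker_nodes=[
        WorkerNodeGroupConfig(
            instance_type="4CPU-16GB-1xT4",  # hypothetical T4 instance-type name
            min_nodes=4,  # fixed pool of 4 T4 workers
            max_nodes=4,
        ),
    ],
)
anyscale.compute_config.create(config, name="multimodal-batch-inference-demo")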

Demo Execution

  • Navigate to the VSCode interface (not VSCode Desktop)
  • Access notebooks/01-Batch-Inference.ipynb
  • [Optional] Modify the Batch Inference notebook to read from the cluster-wide shared storage mount instead of S3. At the start of the "Data ingestion" section, replace the S3-based read with:
# Load data.
ds = ray.data.read_images(
    "/mnt/shared_storage/doggos-dataset/train", 
    include_paths=True, 
    shuffle="files",
)
ds.take(1)
  • Run through the notebook up to the "Monitoring and Debugging" section; the core batch-inference pattern it builds up to is sketched below
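
The heart of the notebook is Ray Data's map_batches applied to a GPU-backed callable class. This is an illustrative sketch, not the notebook's actual code: EmbedImages is a hypothetical stand-in for the notebook's model class.

import numpy as np
import ray

class EmbedImages:
    def __init__(self):
        # The real notebook loads the model onto the GPU once per actor here.
        pass

    def __call__(self, batch: dict) -> dict:
        # The real notebook runs model inference; this stand-in just records
        # each image's shape so the sketch stays self-contained.
        batch["shape"] = np.array([img.shape for img in batch["image"]])
        return batch

ds = ray.data.read_images("/mnt/shared_storage/doggos-dataset/train")
preds = ds.map_batches(
    EmbedImages,
    concurrency=4,  # one actor per T4 worker node
    num_gpus=1,     # pin each actor to one GPU
    batch_size=32,
)
preds.take(1)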

Demo 2: Deploy LLMs

Preparation

  1. Launch a workspace with Deploy LLMs template

    • From the Anyscale console, create a new workspace
    • Select the "Deploy LLMs" template

    Deploy LLMs Template Selection

  2. Modify compute configuration

    • Terminate the workspace (if already running)
    • Navigate to compute configuration settings
    • Change the head node:
      • From: 2CPU-8GB
      • To: 8CPU-32GB
    • Change the worker nodes:
      • From: Auto-select workers
      • To: 2 x A100 nodes

    A100 Node Configuration

  3. Set up HuggingFace token

    • Sign in to HuggingFace (create an account if required)
    • Navigate to Profile → Access Tokens
    • Create a new token with read permissions
    • Copy the token for the next step

    HuggingFace Token Creation

  4. Configure environment variables

    • In the Anyscale workspace settings, navigate to Dependencies → Environment Variables
    • Add the following environment variable:
      HF_TOKEN=<YOUR_HF_TOKEN>

    • Replace <YOUR_HF_TOKEN> with your actual HuggingFace token (a quick verification sketch follows this list)

    Environment Variables Configuration

  5. Launch the workspace

    • Start the workspace with the new configuration
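
Before running the demo, it's worth confirming inside the workspace that the token from steps 3 and 4 is visible and valid. A small sanity-check sketch, assuming the template image ships with huggingface_hub (install it with pip if not):

import os

from huggingface_hub import whoami

# Confirm the HF_TOKEN environment variable set in step 4 reached the
# workspace and is accepted by HuggingFace.
token = os.environ["HF_TOKEN"]
print(whoami(token=token)["name"])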

Demo Execution

  • Modify small-size-llm/notebook.ipynb as follows:
  1. Use accelerator_type="A100" instead of accelerator_type="L4"
  2. Add your HuggingFace token in the right locations (two locations)
  • Modify small-size-llm/serve_llama_3_1_8b.py as follows:
  1. Use accelerator_type="A100" instead of accelerator_type="L4"
  • Modify small-size-llm/service.yaml as follows:
# service.yaml
name: deploy-llama-3-8b
image_uri: anyscale/ray-llm:2.50.1-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.
compute_config:
  auto_select_worker_config: true 
  head_node:
    instance_type: 8CPU-32GB
working_dir: .
cloud:
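# cloud is left blank above; fill in the demo environment's Anyscale cloud
# name if the default cloud is not the right one (an assumption about the setup)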
applications:
  # Point to your app in your Python module
  - import_path: serve_llama_3_1_8b:app
  • Follow the instructions in small-size-llm/notebook.ipynb; a sketch for querying the deployed service follows below
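
Once the service is deployed (the notebook covers the deploy step), it exposes an OpenAI-compatible endpoint. A query sketch: the base URL and bearer token below are placeholders for the values printed at deploy time, and the model id must match the one configured in serve_llama_3_1_8b.py.

from openai import OpenAI

# Placeholders: substitute the service URL and bearer token printed when
# the service is deployed.
client = OpenAI(
    base_url="https://<YOUR_SERVICE_URL>/v1",
    api_key="<YOUR_SERVICE_BEARER_TOKEN>",
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # must match the served model id
    messages=[{"role": "user", "content": "Give me a one-line hello from AKS."}],
)
print(response.choices[0].message.content)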

Tips for a Successful Demo

  • Ensure nodes are provisioned before the demo to avoid wait times
  • Test the workflows in advance to familiarize yourself with the UI
  • Prepare talking points about AKS integration benefits
  • Have backup examples ready in case of any technical issues
  • Emphasize scalability and Azure-native features

Support

For questions or issues, contact [email protected]
