Ray Serve backend for Arabic Speech Recognition, Text Correction, and Text-to-Speech (TTS). This implementation uses Ray Serve to deploy our models — DeepAr for Arabic speech-to-text and AraFix for text correction — as scalable microservices. The models are included in this project as Git submodules and are also available on our CUAIStudents HuggingFace organization.
## Features

- **Dynamic Batching**: Ray Serve automatically batches incoming requests to maximize GPU utilization and throughput
- **Scalability**: Easily scale to multiple replicas to handle increased load
- **Fault Tolerance**: Automatic recovery from worker failures
- **Resource Management**: Fine-grained control over CPU/GPU resources
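Replica count and per-replica resources are typically set in the Ray Serve config file. The fragment below is a hypothetical sketch of what such a config might look like — the application and deployment names, import path, and resource values are illustrative assumptions, not taken from this repo's `src/config.yaml`:

```yaml
# Illustrative Ray Serve config fragment (names/values are assumptions)
applications:
  - name: transcriber
    route_prefix: /api/v1/audio
    import_path: src.apps.transcriber:app
    deployments:
      - name: Transcriber
        num_replicas: 2          # scale out to handle increased load
        ray_actor_options:
          num_gpus: 0.5          # fractional GPU per replica
```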
## Prerequisites

Make sure `ffmpeg` and Python are installed on your system.

**macOS:**

```bash
# Install Homebrew (if not installed)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install ffmpeg
brew install ffmpeg

# Install Python (if needed)
brew install python@3.11
```

**Ubuntu/Debian:**
```bash
# Update package list
sudo apt update

# Install ffmpeg
sudo apt install ffmpeg

# Install Python
sudo apt install python3.11 python3.11-venv
```

## Installation

Clone with submodules (includes DeepAr + AraFix):
```bash
git clone --recurse-submodules https://github.com/AbdoAlshoki2/Cairo-Dictionary-AI-Ray-Backend.git <any_dir>
cd <any_dir>/ray-api
```

Create a virtual environment and install dependencies:
```bash
python -m venv venv
.\venv\Scripts\activate     # Windows
# source venv/bin/activate  # Linux/macOS
pip install -r src/requirements.txt
```

Start Ray and deploy the services:
```bash
# Start Ray in the background
ray start --head

# Deploy the services
serve run src/config.yaml
```

## API Endpoints

### POST /api/v1/audio
- Upload an audio file in the request body
- Returns the transcribed text
### POST /api/v1/text

- Input: `{"text": "your arabic text"}`
- Returns corrected Arabic text
### POST /api/v1/voice_generator

- Input: `{"text": "your arabic text"}`
- Returns an audio stream
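A client might call these endpoints roughly as follows. This is a sketch using only the standard library; the base URL (Ray Serve's default HTTP port is 8000) and the assumption that the endpoints return plain UTF-8 text are illustrative — check `src/config.yaml` for the actual routes:

```python
import json
import urllib.request

# Assumed Serve HTTP address (8000 is Ray Serve's default port)
BASE = "http://localhost:8000"

def correction_payload(text: str) -> bytes:
    """Encode the JSON body used by /api/v1/text and /api/v1/voice_generator."""
    return json.dumps({"text": text}, ensure_ascii=False).encode("utf-8")

def post(url: str, body: bytes, content_type: str) -> bytes:
    """POST raw bytes and return the raw response body."""
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

def transcribe(path: str) -> str:
    """Upload an audio file and return the transcribed text."""
    with open(path, "rb") as f:
        audio = f.read()
    return post(f"{BASE}/api/v1/audio", audio, "application/octet-stream").decode("utf-8")

def correct(text: str) -> str:
    """Send Arabic text for correction and return the corrected text."""
    return post(f"{BASE}/api/v1/text", correction_payload(text), "application/json").decode("utf-8")
```

The response handling (treating the body as plain text) is an assumption; adapt it if the services return structured JSON.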
## Dynamic Batching

This implementation leverages Ray Serve's built-in dynamic batching to maximize throughput:
- Requests are automatically batched based on model requirements
- Batch size is dynamically adjusted for optimal performance
- Reduces latency by processing multiple requests simultaneously
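Conceptually, dynamic batching collects requests that arrive within a short window and runs them through the model in a single call. The sketch below illustrates the idea with plain `asyncio` — it is a simplified illustration of the technique, not this project's code; in practice Ray Serve's `@serve.batch` decorator handles all of this for you:

```python
import asyncio

class DynamicBatcher:
    """Collect concurrent requests and process them in one batched call."""

    def __init__(self, max_batch_size: int = 8, wait_timeout_s: float = 0.01):
        self.max_batch_size = max_batch_size
        self.wait_timeout_s = wait_timeout_s
        self.queue: asyncio.Queue = asyncio.Queue()
        self._worker = None

    async def submit(self, item):
        """Enqueue one request and wait for its individual result."""
        if self._worker is None:
            self._worker = asyncio.create_task(self._run())
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((item, fut))
        return await fut

    async def _run(self):
        while True:
            # Block for the first request, then keep pulling until the
            # batch is full or the wait window expires.
            batch = [await self.queue.get()]
            while len(batch) < self.max_batch_size:
                try:
                    batch.append(await asyncio.wait_for(
                        self.queue.get(), timeout=self.wait_timeout_s))
                except asyncio.TimeoutError:
                    break
            items = [item for item, _ in batch]
            results = self.model_forward(items)  # one call for the whole batch
            for (_, fut), result in zip(batch, results):
                fut.set_result(result)

    @staticmethod
    def model_forward(texts):
        # Stand-in for a batched model call (e.g. one corrected text per input).
        return [t.upper() for t in texts]

async def main():
    b = DynamicBatcher()
    # Three concurrent requests end up in a single model_forward call.
    return await asyncio.gather(*(b.submit(t) for t in ["a", "b", "c"]))
```

With `@serve.batch`, the queueing, timeout, and result fan-out above collapse into a decorator on an async method that receives a list of inputs and returns a list of outputs.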
## Project Structure

```
ray-api/
├── src/
│   ├── apps/                # Ray Serve applications
│   │   ├── __init__.py
│   │   ├── text_corrector.py
│   │   ├── transcriber.py
│   │   └── voice_generator.py
│   │
│   ├── models/              # Model implementations
│   │   ├── araFix/          # Text correction model
│   │   └── whisper/         # Speech recognition model
│   │
│   ├── schemas/             # Pydantic models
│   │   ├── correction.py
│   │   └── tts.py
│   │
│   ├── config.yaml          # Ray Serve configuration
│   └── main.py              # Entry point for Ray Serve
│
├── .gitmodules              # Git submodules configuration
└── README.md                # This file
```