Cairo Dictionary is an Arabic dictionary project that augments traditional lookup with modern AI tools. The aim is to let users interact with Arabic language content through intelligent assistance instead of being limited to static dictionary entries.
Our team focused on building a speech correction pipeline, which combines:
- Speech-to-Text: Converting spoken Arabic into text with diacritics.
- Text Correction: Automatically correcting transcription errors and refining the output.
- Text-to-Speech: Generating natural-sounding Arabic speech from text.
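The three stages compose into a single call. The following is a minimal sketch of that composition, not the project's actual code: `SpeechPipeline`, the stage type aliases, and the stub lambdas are hypothetical placeholders standing in for the real models.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical stage signatures; real models would be wrapped to match them.
Transcriber = Callable[[bytes], str]   # audio bytes -> diacritized text
Corrector = Callable[[str], str]       # raw transcript -> corrected text
Synthesizer = Callable[[str], bytes]   # text -> audio bytes


@dataclass
class SpeechPipeline:
    transcribe: Transcriber
    correct: Corrector
    synthesize: Synthesizer

    def run(self, audio: bytes) -> Tuple[str, str, bytes]:
        """Return (raw transcript, corrected text, synthesized audio)."""
        raw = self.transcribe(audio)
        fixed = self.correct(raw)
        speech = self.synthesize(fixed)
        return raw, fixed, speech


# Stub stages for illustration only (assumptions, not the real models).
pipeline = SpeechPipeline(
    transcribe=lambda audio: "raw transcript",
    correct=lambda text: text.replace("raw", "corrected"),
    synthesize=lambda text: text.encode("utf-8"),
)
```

Keeping each stage behind a plain callable means a model can be swapped without touching the pipeline itself.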
We also prepared datasets and trained models, which are hosted under our CUAIStudents organization on Hugging Face. There you can find:
- The fine-tuned DeepAr model for Arabic speech recognition.
- The AraFix model for text correction.
- Datasets and checkpoints used in our experiments.
For a more detailed explanation of our methods, pipeline design, and evaluation, please refer to the academic PDF report.
We implemented two backend APIs that serve as proofs of concept for deploying the models:
- Cairo Dictionary AI – Ray Backend: Uses Ray Serve to scale the services, with dynamic batching and resource management.
- Cairo Dictionary AI – FastAPI Backend: Provides a lightweight FastAPI implementation exposing REST endpoints for transcription, correction, and TTS.
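Dynamic batching groups requests that arrive close together so the model runs once per batch instead of once per request. Ray Serve offers this through its `serve.batch` decorator; the stdlib-only sketch below only illustrates the idea and does not reflect the backend's real code. `DynamicBatcher`, `fake_model`, and the parameter values are hypothetical.

```python
import asyncio
from typing import Callable, List


class DynamicBatcher:
    """Collects concurrent requests and runs the model once per batch."""

    def __init__(self, batch_fn: Callable[[List[str]], List[str]],
                 max_batch_size: int = 4, wait_s: float = 0.01) -> None:
        self.batch_fn = batch_fn
        self.max_batch_size = max_batch_size
        self.wait_s = wait_s
        self.queue: asyncio.Queue = asyncio.Queue()

    async def submit(self, item: str) -> str:
        """Enqueue one request and wait for its individual result."""
        future = asyncio.get_running_loop().create_future()
        await self.queue.put((item, future))
        return await future

    async def worker(self) -> None:
        """Flush a batch when it is full or when the wait window expires."""
        loop = asyncio.get_running_loop()
        while True:
            batch = [await self.queue.get()]  # block until the first request
            deadline = loop.time() + self.wait_s
            while len(batch) < self.max_batch_size:
                remaining = deadline - loop.time()
                if remaining <= 0:
                    break
                try:
                    batch.append(
                        await asyncio.wait_for(self.queue.get(), remaining))
                except asyncio.TimeoutError:
                    break
            results = self.batch_fn([item for item, _ in batch])
            for (_, future), result in zip(batch, results):
                future.set_result(result)


async def demo():
    batch_sizes = []

    def fake_model(texts: List[str]) -> List[str]:
        # Stand-in for a real model call; records how many items it received.
        batch_sizes.append(len(texts))
        return [t.upper() for t in texts]

    batcher = DynamicBatcher(fake_model, max_batch_size=4, wait_s=0.05)
    task = asyncio.create_task(batcher.worker())
    results = await asyncio.gather(*(batcher.submit(t) for t in ["a", "b", "c"]))
    task.cancel()
    return results, batch_sizes
```

Running `asyncio.run(demo())` returns the per-request results together with the batch sizes the stand-in model saw. A production batcher would also need to propagate exceptions from `batch_fn` to the waiting futures, which this sketch omits.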
We have recently added new models for both ASR and GEC:
- Adapt-Ar: Built on the same architecture as DeepAr, this model was trained for an additional half epoch on augmented data to better handle noisy, silent, and speed-varied voices. Its overall performance is close to DeepAr (±1%), but it consistently performs better in noisy conditions.
- Qwen-Ar-GEC: Based on Qwen 2.5-7B-Instruct, fine-tuned using the QLoRA method with LLaMA Factory. It outperforms our earlier AraFix model by leveraging Qwen’s extensive knowledge and strong instruction-following capabilities.
The following results are for our Qwen-Ar-GEC (QLoRA fine-tuned) model:
| Metric | Score |
|---|---|
| CER | 0.01 |
| WER | 0.08 |
| DER | 0.07 |
| AVG | 0.053 |
AVG is the arithmetic mean of the three metrics above: (0.01 + 0.08 + 0.07) / 3 ≈ 0.053.
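For readers who want to reproduce this kind of score, the sketch below computes character and word error rates with the standard Levenshtein edit distance and recomputes AVG from the reported values. It is an illustration under assumptions, not the evaluation script used for the table: the function names are hypothetical, and DER (assumed here to be a diacritic error rate) is not implemented because its exact definition is not given in this section.

```python
from statistics import mean
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,              # delete r from the reference
                curr[j - 1] + 1,          # insert h into the reference
                prev[j - 1] + (r != h),   # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence, hyp: Sequence) -> float:
    """Edit distance normalised by the reference length."""
    if len(ref) == 0:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance over characters."""
    return error_rate(ref, hyp)


def wer(ref: str, hyp: str) -> float:
    """Word error rate: edit distance over whitespace-separated tokens."""
    return error_rate(ref.split(), hyp.split())


# Values copied from the table above; AVG is their arithmetic mean.
reported = {"CER": 0.01, "WER": 0.08, "DER": 0.07}
avg = round(mean(list(reported.values())), 3)
```

For example, `wer("a b c d", "a x c d")` counts one substituted word out of four reference words.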