Overview of our graduation project “Cairo Dictionary AI” – an Arabic dictionary enriched with AI. Includes our speech correction pipeline, HuggingFace models/datasets, backend prototypes (Ray & FastAPI), and academic report.

AbdoAlshoki2/Cairo-Dictionary-AI-Overview

Cairo-Dictionary-AI-Overview

Cairo Dictionary is an Arabic dictionary project that extends traditional lookup with modern AI tools. The aim is to let users interact with Arabic language content through intelligent assistance rather than static dictionary entries alone.

Our Contribution

Our team focused on building a speech correction pipeline, which combines:

  • Speech-to-Text: Converting spoken Arabic into text with diacritics.

  • Text Correction: Automatically correcting transcription errors and refining the output.

  • Text-to-Speech: Generating natural-sounding Arabic speech from text.
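The three stages above can be sketched as a simple chain of callables. This is only an illustrative outline: the class name `SpeechCorrectionPipeline`, the stage signatures, and the stand-in functions are hypothetical and do not reflect the project's actual code or model interfaces.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures (assumptions, not the project's real interfaces).
SpeechToText = Callable[[bytes], str]    # audio -> diacritized text
TextCorrector = Callable[[str], str]     # raw transcript -> corrected text
TextToSpeech = Callable[[str], bytes]    # text -> audio


@dataclass
class SpeechCorrectionPipeline:
    stt: SpeechToText
    corrector: TextCorrector
    tts: TextToSpeech

    def run(self, audio: bytes) -> tuple[str, str, bytes]:
        """Run the three stages in order and return every intermediate result."""
        raw_text = self.stt(audio)
        corrected = self.corrector(raw_text)
        speech = self.tts(corrected)
        return raw_text, corrected, speech


# Stand-in stages so the sketch runs without any model; they only mimic the data flow.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_corrector(text: str) -> str:
    # Collapses repeated whitespace as a placeholder for real correction.
    return " ".join(text.split())


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = SpeechCorrectionPipeline(fake_stt, fake_corrector, fake_tts)
raw, fixed, audio_out = pipeline.run("hello   world".encode("utf-8"))
```

In a real deployment each stand-in would be replaced by a wrapper around the corresponding model; keeping the stages as plain callables makes it easy to swap one model for another without touching the rest of the chain.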

We also prepared datasets and trained models, which are hosted under our CUAIStudents organization on HuggingFace. There you can find:

  • The fine-tuned DeepAr model for Arabic speech recognition.

  • The AraFix model for text correction.

  • Datasets and checkpoints used in our experiments.

For a more detailed explanation of our methods, pipeline design, and evaluation, please refer to the academic PDF report.

Backend Implementations

We implemented two backend APIs, one using Ray and one using FastAPI, as proofs of concept for deploying the models.

⚠️ Note: These backends are not production-ready. They were developed as part of our academic project and demonstrate integration of the models rather than being optimized for deployment.

Recent Work

We have recently added new models for both ASR and GEC:

  • Adapt-Ar
    Built on the same architecture as DeepAr, this model was trained for an additional half epoch on augmented data to better handle noisy, silent, and speed-varied voices.
    Its overall performance is close to DeepAr (±1%), but it consistently performs better in noisy conditions.

  • Qwen-Ar-GEC
    Based on Qwen 2.5-7B-Instruct, fine-tuned using the QLoRA method with LLaMA Factory.
    It outperforms our earlier AraFix model by leveraging Qwen’s extensive knowledge and strong instruction-following capabilities.
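The kinds of augmentation mentioned for Adapt-Ar (noise, silence, and speed variation) can be illustrated on a raw list of samples. The sketch below is a generic, standard-library-only example; the function names and parameters are hypothetical, and it is not the augmentation code or settings used to train Adapt-Ar.

```python
import random


def add_noise(samples, noise_std, rng):
    """Add zero-mean Gaussian noise to each sample."""
    return [s + rng.gauss(0.0, noise_std) for s in samples]


def pad_silence(samples, n_leading, n_trailing):
    """Surround the signal with zero-valued (silent) samples."""
    return [0.0] * n_leading + list(samples) + [0.0] * n_trailing


def change_speed(samples, factor):
    """Resample by linear interpolation; factor > 1 shortens (speeds up) the signal."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    last = len(samples) - 1
    n_out = max(1, int(len(samples) / factor))
    out = []
    for i in range(n_out):
        pos = i * factor
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


signal = [0.0, 1.0, 2.0, 3.0]
rng = random.Random(0)  # fixed seed so the example is repeatable
noisy = add_noise(signal, noise_std=0.05, rng=rng)
padded = pad_silence(signal, n_leading=2, n_trailing=1)
faster = change_speed(signal, 2.0)
slower = change_speed(signal, 0.5)
```

Note that plain resampling like this changes pitch along with speed; production audio pipelines usually rely on a dedicated audio library for these transforms.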


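The following results are for Qwen-Ar-GEC, our Qwen QLoRA model. All metrics are error rates, so lower is better:

| Metric | Score |
|--------|-------|
| CER    | 0.01  |
| WER    | 0.08  |
| DER    | 0.07  |
| AVG    | 0.053 |

AVG is the arithmetic mean of CER, WER, and DER.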
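As a worked check of the AVG row, and as a reminder of how character and word error rates are conventionally defined, here is a minimal sketch. It assumes the standard edit-distance definitions of CER and WER; the README does not state which tool or normalization produced the reported scores, and DER is omitted because its computation depends on a diacritic alignment convention not described here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: character edits divided by reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def wer(ref: str, hyp: str) -> float:
    """Word error rate: word edits divided by number of reference words."""
    ref_words = ref.split()
    if not ref_words:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


# Scores copied from the table above.
reported = {"CER": 0.01, "WER": 0.08, "DER": 0.07}
avg = sum(reported.values()) / len(reported)
print(round(avg, 3))  # 0.053
```

The mean works out to 0.16 / 3 ≈ 0.0533, which matches the reported AVG of 0.053 after rounding.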
