This repository implements ABGD, an open Arabic Decimal Encoding System.
ABGD converts Arabic text into a structured list of decimal numbers based on traditional Abjad gematrical values.
It also supports encoding diacritics (e.g. shadda + fatha → 0.61) using a smart fractional system,
making it suitable for:
- Linguistic & Quranic text analysis
- Root extraction and semantic processing
- Text compression and secure Arabic encoding
- AI/NLP training with language-specific precision
| File | Description |
|---|---|
abgd.py |
Core encoding/decoding logic based on decimal Abjad values |
main.py |
Example that encodes & decodes full Arabic phrase |
README.md |
This documentation |
To run the example:
python main.pyExpected output:
[
2.3, 60.4, 40.3, # بِسْمِ
1.0, 30.4, 30.6, 5.3, # اللَّهِ
1.0, 30.4, 200.61, 8.4, 40.1, 1.0, 50.3, # الرَّحْمَٰنِ
1.0, 30.4, 200.63, 8.3, 10.3, 40.3 # الرَّحِيمِ
]You can import the abgd.py module into your own scripts or apps:
from abgd import encode, decode
text = "الرَّحْمَٰنِ"
code = encode(text)
print("Encoded:", code)
# Output: [1.0, 30.4, 200.61, 8.4, 40.1, 1.0, 50.3]
decoded = decode(code)
print("Decoded:", decoded)
# Output: الرَّحْمَٰنِencode(text)→ takes Arabic string, returns list of decimal values.decode(code_list)→ reverses encoded list back to Arabic text with diacritics.- Encoding handles:
- حرف + تشكيلتين (مثل شدة + فتحة)
- تمييز بين أنواع الألف (أ، آ، إ...)
- علامات سكون وتنوين
- Values like
0.61mean: شدة (6) + فتحة (1)
- ✅ Fully reversible: decode returns full word with diacritics
- ✅ Language-aware: differentiates between أ، إ، آ, ء, etc.
- ✅ Extendable: supports pause marks, root analysis, and contextual forms
- ✅ Free of Unicode dependency: mathematically encoded
This project is licensed under the MIT License. Open for contributions & extensions.