Releases: addok/addok-fr
v1.1.0
This release is a major infrastructure upgrade: it integrates long-standing contributions from @cquest, improves the caching strategy, and modernizes the packaging.
✨ What's New
🎯 Integration of @cquest's Work (#9)
After several years, we've finally merged the valuable contributions from @cquest's cquest-11rc1 branch (originally from 2016-2018):
Enhanced Phonemicization Rules:
- Better handling of complex vowel combinations (ae, ei, oeu)
- Improved nasal sound processing (m → n before labial/dental consonants)
- Special cases like "oeufs" → "eu"
- Enhanced support for "y" at word beginning
- More accurate silent consonant handling (including "gn": seigneur → senieur)
- Better duplicate letter removal
- Compiled regex patterns for better performance
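As a rough illustration of how ordered, precompiled substitution rules of this kind can work, here is a minimal sketch. The rule list and the `phonemicize_sketch` function are hypothetical and much smaller than the real rule set in addok-fr; only the "seigneur → senieur" and "oeufs → eu" outcomes are taken from the notes above.

```python
import re

# Hypothetical, heavily reduced rule set (NOT the rules shipped in addok-fr).
# Patterns are compiled once at import time, so each call only runs the
# substitutions instead of re-parsing the pattern strings.
RULES_SKETCH = [
    (re.compile(r"^oeufs$"), "eu"),   # special case quoted in the notes
    (re.compile(r"oeu"), "eu"),       # vowel combination
    (re.compile(r"ei"), "e"),         # vowel combination
    (re.compile(r"gn"), "ni"),        # "gn" handling: seigneur -> senieur
    (re.compile(r"(.)\1+"), r"\1"),   # collapse repeated letters
]


def phonemicize_sketch(word):
    """Apply each rule in order; later rules see the output of earlier ones."""
    for pattern, replacement in RULES_SKETCH:
        word = pattern.sub(replacement, word)
    return word


print(phonemicize_sketch("seigneur"))  # senieur
print(phonemicize_sketch("oeufs"))     # eu
print(phonemicize_sketch("ville"))     # vile
```

Rule order matters in this style: the special case for "oeufs" has to run before the general "oeu" rule, otherwise the general rule would rewrite the word first.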
Extended Synonym Coverage (#6):
- Added ~48 new synonym mappings (175 → 223 entries)
- Additional address abbreviations and variations
- Includes mappings like "clef/clefs → cle/cles", "gir → giratoire"
- Fixed duplicate entries
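Conceptually, a synonym mapping rewrites a token to a canonical form so that both spellings match the same indexed value. The sketch below uses a plain dictionary holding only the pairs quoted above; `SYNONYMS_SKETCH` and `apply_synonyms` are hypothetical names, and the package's actual synonym file and loading mechanism are not shown here.

```python
# Hypothetical in-memory mapping; only these three pairs come from the notes.
SYNONYMS_SKETCH = {
    "clef": "cle",
    "clefs": "cles",
    "gir": "giratoire",
}


def apply_synonyms(tokens):
    """Replace each token by its canonical form, leaving unknown tokens as-is."""
    return [SYNONYMS_SKETCH.get(token, token) for token in tokens]


print(apply_synonyms(["gir", "des", "clefs"]))  # ['giratoire', 'des', 'cles']
```

Because a dictionary keeps one value per key, loading the mapping this way would also silently absorb duplicate entries; whether the fix mentioned above works like that is not stated in the notes.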
🔧 LRU Cache Implementation (#7)
Replaced the simple dictionary cache with a memory-efficient LRU (Least Recently Used) cache:
- Configurable cache size via the PHONEMICIZE_CACHE_SIZE setting
- Default: 500,000 entries (~86 MB), suitable for most French address datasets
- Recommendations:
  - 500K entries (~86 MB): Default, suitable for most datasets
  - 1M entries (~172 MB): For larger datasets with more unique words
  - 250K entries (~43 MB): For memory-constrained environments
- Fixed race condition in cache initialization
- Prevents unbounded memory growth while maintaining performance
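The following is a minimal sketch of a bounded LRU cache, not the implementation merged in #7. The `LRUCacheSketch` class and its method names are assumptions made for illustration; only the `PHONEMICIZE_CACHE_SIZE` setting name and its default come from the notes above. A lock guards the shared state, which is one common way to avoid races on a cache shared between threads.

```python
from collections import OrderedDict
from threading import Lock

PHONEMICIZE_CACHE_SIZE = 500_000  # setting name and default from the notes


class LRUCacheSketch:
    """Bounded mapping that evicts the least recently used key when full."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._data = OrderedDict()
        self._lock = Lock()

    def get_or_compute(self, key, compute):
        with self._lock:
            if key in self._data:
                self._data.move_to_end(key)  # mark as most recently used
                return self._data[key]
        value = compute(key)  # run the expensive call outside the lock
        with self._lock:
            self._data[key] = value
            self._data.move_to_end(key)
            while len(self._data) > self.maxsize:
                self._data.popitem(last=False)  # drop least recently used
        return value

    def __contains__(self, key):
        return key in self._data

    def __len__(self):
        return len(self._data)


cache = LRUCacheSketch(maxsize=PHONEMICIZE_CACHE_SIZE)
print(cache.get_or_compute("rue", str.upper))  # RUE
```

Unlike a plain dictionary, the number of entries here can never exceed `maxsize`, so memory use stays bounded while frequently seen words remain cached.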
🔧 Infrastructure & Development
- Modernized Packaging (#5)
  - Migrated from setup.py to pyproject.toml (PEP 517/518 compliant)
  - Modern build system with setuptools>=65.0
  - Support for Python 3.9 through 3.14
- CI/CD Setup (#8)
  - Added GitHub Actions workflow for automated testing
  - Redis integration in CI pipeline
  - Automated pytest and coverage reporting
- Development Environment
  - Added dev dependencies: pytest, pytest-cov, build, twine
📝 Documentation
- Added cache configuration guidance in README
- Memory usage recommendations for different cache sizes
- Improved wording and examples
🔄 Upgrading from 1.0.1
This release is fully backward compatible:

    pip install --upgrade addok-fr

Optional: Configure cache size in your Addok config if needed:

    PHONEMICIZE_CACHE_SIZE = 500_000  # Adjust based on your needs

No other configuration changes required.
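To pick a value other than the default, the memory figures quoted in this release (~86 MB for 500,000 entries) work out to roughly 172 bytes per entry, assuming 1 MB means 10^6 bytes. The helpers below are a hypothetical back-of-the-envelope sketch built on that ratio; they are not part of addok-fr, and real usage will vary with word length and Python version.

```python
# Derived from the release notes: 86 MB / 500,000 entries, with 1 MB = 10**6 bytes.
BYTES_PER_ENTRY = 172


def estimated_memory_mb(entries):
    """Approximate cache memory in MB for a given number of entries."""
    return entries * BYTES_PER_ENTRY / 1_000_000


def entries_for_budget(budget_mb):
    """Largest entry count that fits in the given memory budget (MB)."""
    return int(budget_mb * 1_000_000 // BYTES_PER_ENTRY)


print(estimated_memory_mb(1_000_000))  # 172.0
print(entries_for_budget(43))          # 250000
```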
Full Changelog: 1.0.1...v1.1.0