A real-time American Sign Language (ASL) recognition system using computer vision and deep learning to translate hand gestures into text.
Watch the demo video here: Sign Language Recognition Demo
- Real-time Recognition: Live ASL gesture recognition through webcam
- Hand Detection: Accurate hand landmark detection using MediaPipe
- Deep Learning Model: Custom CNN model trained on ASL alphabet dataset
- High Accuracy: Optimized model for reliable gesture classification
- User-friendly Interface: Simple and intuitive real-time display
- Confidence Scoring: Shows prediction confidence for each gesture
- Multi-platform: Works on Windows, macOS, and Linux
- TensorFlow/Keras - Deep learning framework for model training and inference
- MediaPipe - Hand landmark detection and tracking
- NumPy - Numerical computations and array operations
- OpenCV - Real-time computer vision and image processing
- PIL/Pillow - Image manipulation and preprocessing
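For reference, these dependencies correspond to a requirements.txt along the following lines (package names only; the repository's own requirements.txt is authoritative and may pin specific versions):

```text
tensorflow
mediapipe
opencv-python
numpy
pillow
```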
- MobileNetV2 - Efficient CNN architecture for real-time inference
- Transfer Learning - Pre-trained weights fine-tuned for ASL recognition
- Data Augmentation - Enhanced training with image transformations
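Conceptually, these three pieces fit together in Keras roughly as shown below. This is a minimal sketch for orientation only: the frozen-base strategy, augmentation values, and classifier head are assumptions, and train.py remains the authoritative implementation.

```python
# Sketch: MobileNetV2 transfer learning with light data augmentation (Keras).
import tensorflow as tf
from tensorflow.keras import layers, models

IMG_SIZE = (160, 160)
NUM_CLASSES = 26  # ASL alphabet A-Z

# Augmentation is active only during training; horizontal flips are avoided
# because ASL letters depend on hand orientation.
augment = tf.keras.Sequential([
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
    layers.RandomContrast(0.1),
])

base = tf.keras.applications.MobileNetV2(
    input_shape=IMG_SIZE + (3,), include_top=False, weights="imagenet")
base.trainable = False  # freeze the pre-trained backbone for initial training

inputs = tf.keras.Input(shape=IMG_SIZE + (3,))
x = augment(inputs)
x = tf.keras.applications.mobilenet_v2.preprocess_input(x)
x = base(x, training=False)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dropout(0.2)(x)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = models.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```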
- Python 3.11+ - Primary programming language
- Jupyter Notebooks - Model development and experimentation
- Git - Version control
- Clone the repository
git clone https://github.com/sudiptasarkar011/sign-language.git
cd sign-language
- Create a virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
- Install dependencies
pip install -r requirements.txt

To run real-time recognition:
python realtime.py
- Point your webcam at ASL hand gestures
- The system will detect and classify gestures in real-time
- Press 'q' to quit the application
To train your own model:
python train.py
- Ensure your dataset is organized in the correct folder structure (see the example layout below)
- The script will train a new model and save it as asl_mobilenet_full.keras
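The exact layout depends on how the dataset was exported, but a structure compatible with Keras's image_dataset_from_directory uses one folder per letter (file names below are hypothetical):

```text
dataset/
├── A/
│   ├── img_001.jpg
│   └── ...
├── B/
│   └── ...
└── Z/            # 26 folders in total, one per ASL letter
```

The repository itself is laid out as follows: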
sign-language/
├── realtime.py # Real-time recognition script
├── train.py # Model training script
├── asl_mobilenet_full.keras # Pre-trained model
├── classes.txt # Class labels
├── requirements.txt # Python dependencies
├── Demo.mov # Demo video
├── main.ipynb # Jupyter notebook for experimentation
└── README.md # This file
- Hand Detection: MediaPipe detects hand landmarks in real-time
- Region Extraction: Extracts hand region with bounding box
- Preprocessing: Resizes and normalizes the image for model input
- Prediction: MobileNetV2 model classifies the gesture
- Display: Shows the predicted letter with its confidence score (a minimal sketch of this pipeline follows)
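A stripped-down version of this loop is sketched below, assuming a single tracked hand and a model saved as asl_mobilenet_full.keras with labels in classes.txt; the exact preprocessing must match train.py, and realtime.py is the full implementation.

```python
# Minimal sketch of the recognition loop: MediaPipe hand detection ->
# bounding-box crop -> resize/normalize -> classification -> overlay.
import cv2
import mediapipe as mp
import numpy as np
import tensorflow as tf

IMG_SIZE = (160, 160)
model = tf.keras.models.load_model("asl_mobilenet_full.keras")
classes = open("classes.txt").read().splitlines()

hands = mp.solutions.hands.Hands(max_num_hands=1,
                                 min_detection_confidence=0.6,
                                 min_tracking_confidence=0.5)
cap = cv2.VideoCapture(0)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        h, w, _ = frame.shape
        lm = results.multi_hand_landmarks[0].landmark
        xs = [int(p.x * w) for p in lm]
        ys = [int(p.y * h) for p in lm]
        pad = 20  # margin in pixels around the detected hand
        x1, y1 = max(min(xs) - pad, 0), max(min(ys) - pad, 0)
        x2, y2 = min(max(xs) + pad, w), min(max(ys) + pad, h)
        roi = frame[y1:y2, x1:x2]
        if roi.size:
            rgb = cv2.cvtColor(cv2.resize(roi, IMG_SIZE), cv2.COLOR_BGR2RGB)
            # NOTE: normalize exactly as during training (e.g. /255.0 or
            # mobilenet_v2.preprocess_input) -- this must match train.py.
            inp = rgb.astype("float32")[np.newaxis]
            probs = model.predict(inp, verbose=0)[0]
            idx = int(np.argmax(probs))
            if probs[idx] >= 0.5:  # prediction_threshold
                cv2.putText(frame, f"{classes[idx]} ({probs[idx]:.2f})",
                            (x1, max(y1 - 10, 20)),
                            cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
    cv2.imshow("ASL Recognition", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press 'q' to quit
        break

cap.release()
cv2.destroyAllWindows()
```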
- Architecture: MobileNetV2-based CNN
- Input Size: 160x160 RGB images
- Classes: 26 ASL alphabet letters
- Accuracy: approximately 95% on the test dataset
- Inference Speed: Real-time (30+ FPS)
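Throughput depends on hardware; a quick way to sanity-check the classifier's raw latency on your own machine (ignoring camera capture and MediaPipe overhead) is to time predictions on a dummy 160x160 input:

```python
# Rough latency check for the classifier alone.
import time
import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("asl_mobilenet_full.keras")
dummy = np.random.rand(1, 160, 160, 3).astype("float32")

model.predict(dummy, verbose=0)  # warm-up (graph tracing, memory allocation)
runs = 100
t0 = time.perf_counter()
for _ in range(runs):
    model.predict(dummy, verbose=0)
dt = (time.perf_counter() - t0) / runs
print(f"~{dt * 1000:.1f} ms per frame (~{1 / dt:.0f} FPS upper bound)")
```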
You can modify the following parameters in realtime.py:
IMG_SIZE = (160, 160) # Input image size for model
min_detection_confidence=0.6 # Hand detection threshold
min_tracking_confidence=0.5 # Hand tracking threshold
prediction_threshold=0.5 # Minimum confidence for predictions

Contributions are welcome:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- MediaPipe team for the excellent hand detection framework
- TensorFlow team for the deep learning framework
- ASL alphabet dataset contributors
- Open source community for various tools and libraries
Sudipta Sarkar
- GitHub: @sudiptasarkar011
- Email: sudiptasarkar.1108@gmail.com
⭐ If you found this project helpful, please give it a star!