A Machine Learning–powered solution to classify SMS messages as Spam or Ham using TF-IDF & SVM
- SMSGuard is an SMS spam detection system built with Machine Learning and Natural Language Processing (NLP). It analyzes text messages and predicts whether they are Spam or Ham (genuine) using TF-IDF vectorization and three ML models:
- 🔹 Naive Bayes
- 🔹 Logistic Regression
- 🔹 Support Vector Machine (SVM)

The project includes custom SMS prediction, confusion matrix visualization, and real-world testing, making it suitable for cybersecurity, telecom filtering, and academic ML demonstrations.
✔️ Data Preprocessing
- Label encoding (spam = 1, ham = 0)
- Train-test split
- TF-IDF vectorization
✔️ Machine Learning Models
- Multinomial Naive Bayes
- Logistic Regression
- Support Vector Machine (Linear Kernel)
✔️ Evaluation
- Classification report
- Accuracy comparison
- Confusion matrix heatmap
✔️ Real-World Testing
- 10 real SMS samples tested with model predictions
✔️ Custom Prediction
- Users can input any SMS and have it classified instantly
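
The preprocessing, training, and accuracy-comparison steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy messages, the column names (`label`, `message`, `target`), and the split settings are assumptions made here so the example runs on its own.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC

# Hypothetical toy data; the real project would load its SMS dataset instead.
data = pd.DataFrame({
    "label": ["spam", "ham"] * 6,
    "message": [
        "You won a free lottery prize, claim now",
        "Are we still meeting for lunch today",
        "Urgent: update your bank KYC at this link",
        "I will call you after the meeting",
        "Free cashback reward, click the link now",
        "Can you pick up milk on the way home",
        "Congratulations, you won a cash prize",
        "Happy birthday, see you this evening",
        "Claim your free reward before it expires",
        "The report is attached, please review it",
        "Your account is blocked, verify KYC now",
        "Dinner at home tonight with the family",
    ],
})

# Label encoding: spam = 1, ham = 0
data["target"] = data["label"].map({"spam": 1, "ham": 0})

# Train-test split (stratified so both classes appear in each part)
X_train, X_test, y_train, y_test = train_test_split(
    data["message"], data["target"],
    test_size=0.25, random_state=42, stratify=data["target"],
)

# TF-IDF vectorization: fit on the training text only, then reuse on test text
vectorizer = TfidfVectorizer()
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

# The three model families named in this README
models = {
    "Naive Bayes": MultinomialNB(),
    "Logistic Regression": LogisticRegression(),
    "SVM (Linear)": SVC(kernel="linear"),
}

# Accuracy comparison on the held-out messages
results = {}
for name, model in models.items():
    model.fit(X_train_vec, y_train)
    results[name] = accuracy_score(y_test, model.predict(X_test_vec))

for name, acc in results.items():
    print(f"{name}: {acc:.2f}")
```

With only a dozen toy messages the printed accuracies say nothing about real performance; the point is the order of the steps, in particular fitting the vectorizer on the training split only.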
A clean, organized, and scalable folder architecture for your ML project
- Python 🐍
- NumPy
- Pandas
- Scikit-learn
- Matplotlib
- Seaborn
- TF-IDF Vectorizer
- Jupyter Notebook
Three models were trained and evaluated:

| Model | Description |
| --- | --- |
| Naive Bayes | Fast, good for text classification |
| Logistic Regression | Strong baseline ML model |
| SVM (Linear) | Best accuracy & performance |

The SVM model performed best and was selected for final deployment.
The project tests the classifier using real-world styled examples including:
- Lottery fraud messages
- Bank/KYC phishing URLs
- Safe messages from family/friends
- Fake reward & cashback links

These examples check how the model behaves on realistic messages, beyond the held-out test split.
A heatmap visualization helps observe:
- True Positives
- True Negatives
- False Positives
- False Negatives
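
As a rough sketch of how these four counts can be read off a confusion matrix, the example below uses scikit-learn's `confusion_matrix` on a small set of made-up labels. The label lists and the `plot_heatmap` helper are hypothetical and exist only for illustration; in the project, the true and predicted labels would come from the test split.

```python
from sklearn.metrics import confusion_matrix

# Made-up labels for illustration (spam = 1, ham = 0)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are actual classes, columns are predicted classes, ordered [ham, spam]
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])

# For a binary problem, ravel() yields TN, FP, FN, TP in that order
tn, fp, fn, tp = cm.ravel()
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")


def plot_heatmap(matrix):
    """Hypothetical helper: draw the matrix as an annotated heatmap."""
    # Imported lazily so the counts above work without the plotting libraries
    import matplotlib.pyplot as plt
    import seaborn as sns

    sns.heatmap(
        matrix, annot=True, fmt="d", cmap="Blues",
        xticklabels=["Ham", "Spam"], yticklabels=["Ham", "Spam"],
    )
    plt.xlabel("Predicted")
    plt.ylabel("Actual")
    plt.show()
```

Calling `plot_heatmap(cm)` in a notebook cell would render the heatmap. For a spam filter, false positives (genuine messages flagged as spam) are usually the costlier error, so that cell deserves the closest look.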
1️⃣ Install Required Libraries: `pip install -r requirements.txt`
2️⃣ Run Jupyter Notebook: `jupyter notebook`
3️⃣ Open the Project File: `SMSGuard – Intelligent Spam SMS Detection System.ipynb`
Example: `predict_sms("Congratulations! You won a free lottery. Call now!")`
Output: `Spam`
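
One way a `predict_sms` helper like this could be put together is sketched below. The notebook's own implementation is not shown in this README, so the pipeline, the toy training messages, and the function body are assumptions for illustration only.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical toy training data (spam = 1, ham = 0)
train_messages = [
    "You won a free lottery prize, call now",
    "Free cashback reward, click the link now",
    "Urgent: verify your bank KYC at this link",
    "Congratulations, claim your free prize",
    "Are we still meeting for lunch today",
    "Can you pick up milk on the way home",
    "Happy birthday, see you this evening",
    "The report is attached, please review it",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Bundling the vectorizer with the model ensures new text is transformed
# with the same TF-IDF vocabulary that was learned during training.
classifier = make_pipeline(TfidfVectorizer(), SVC(kernel="linear"))
classifier.fit(train_messages, train_labels)


def predict_sms(message: str) -> str:
    """Return "Spam" or "Ham" for a single SMS string."""
    label = classifier.predict([message])[0]
    return "Spam" if label == 1 else "Ham"


print(predict_sms("Congratulations! You won a free lottery. Call now!"))
```

Keeping the vectorizer and classifier in one pipeline object also makes it possible to save and reload them together, for example with `joblib`.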
- 🔐 Anti-spam filters
- 📡 Telecom SMS filtering
- 📱 Mobile security apps
- 📊 Academic ML learning projects
- 🛡️ Anti-phishing awareness systems
- 📌 High accuracy
- 📌 Fast and lightweight
- 📌 Real-world applicability
- 📌 Clean modular code
- 📌 Professional ML pipeline
~ AYUSH
- Pull requests are welcome!
- If you find bugs or want new features, feel free to open an issue.
⭐ Support This Project
- If you like this repository, don’t forget to star ⭐ it on GitHub!
Aspiring Data Scientist & Analyst
- 📫 Email: [email protected]
- 🌐 GitHub: https://github.com/ayush13-0
- ℹ️ LinkedIn: www.linkedin.com/in/ayush130