PhishShield is a phishing website detector using two ML models (feature-based and text-based).
The project utilizes two separate datasets, each tailored for training a specific machine learning model.
- Feature-based: Makes predictions based on 29 features extracted from the URL.
- Text-based: Makes predictions by analyzing the URL text, i.e. the words used in the URL.
- Both models use pipelines, transformers, and hyperparameter tuning with grid search.
- Backend: Flask app serving prediction endpoints, with disk caching to improve speed.
- Frontend: Simple HTML/CSS/Bootstrap UI. Enter a URL, get a prediction.
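To make the two kinds of model input concrete, here is a minimal sketch of how a URL might be turned into numeric features (for a feature-based model) and into word tokens (for a text-based model). The function names and the seven features shown are illustrative assumptions; they are not the project's actual 29 features or its preprocessing code.

```python
import ipaddress
import re
from urllib.parse import urlsplit


def extract_url_features(url: str) -> dict:
    """Compute a few simple, illustrative features from a URL string.

    These are hypothetical examples, not PhishShield's real feature set.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""

    # A raw IP address in place of a domain name is a common phishing signal.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "url_length": len(url),
        "num_dots": url.count("."),
        "num_hyphens": url.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "uses_https": parts.scheme == "https",
        "host_is_ip": host_is_ip,
    }


def tokenize_url(url: str) -> list:
    """Split a URL into lowercase alphanumeric tokens for text-based analysis."""
    return [tok for tok in re.split(r"[^a-z0-9]+", url.lower()) if tok]


if __name__ == "__main__":
    example = "http://192.168.0.1/login-secure.php"
    print(extract_url_features(example))
    print(tokenize_url(example))
```

In a real pipeline, the feature dictionary would typically be converted to a numeric vector and the tokens passed to a text vectorizer before reaching the classifier.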
To use PhishShield, follow these steps:
- Clone the repository:
    git clone --depth=1 https://github.com/praneeth-katuri/PhishShield.git
- Install the required dependencies (Python version: 3.12.3):
    pip install -r requirements.txt
- Run the NLTK setup script:
    python utils/setup_nltk.py
- Edit the .env file and enter your reCAPTCHA keys and Flask secret key. To generate a Flask secret key, run the command below in a terminal and copy the output into the .env file:
    python -c 'import secrets; print(secrets.token_hex(16))'
- To start the Flask application, run the following command in your terminal:
    python run.py
- To access the web interface, open http://127.0.0.1:5000 in your web browser.
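The secret-key step can also be done from a short Python script instead of a one-liner. The sketch below assumes the .env file uses plain KEY=value lines; the variable name `FLASK_SECRET_KEY` is a hypothetical placeholder, so use whichever name the project's .env file already contains.

```python
import secrets


def generate_secret_key(num_bytes: int = 16) -> str:
    """Return a random hex string; 16 bytes yields 32 hex characters."""
    return secrets.token_hex(num_bytes)


def format_env_line(name: str, value: str) -> str:
    """Format a single KEY=value line suitable for a .env file."""
    return f"{name}={value}"


if __name__ == "__main__":
    key = generate_secret_key()
    # FLASK_SECRET_KEY is an assumed name, not taken from the project.
    print(format_env_line("FLASK_SECRET_KEY", key))
```

Generate a fresh key for each deployment and keep the .env file out of version control.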
Metrics Evaluated: accuracy, precision, recall, F1-score.
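For reference, the four metrics can be computed from the counts of true and false positives and negatives. The sketch below is a plain-Python illustration, not the project's evaluation code; it assumes binary labels where 1 marks a phishing URL, which is an assumption about the label encoding.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    `positive` is the label treated as the phishing class (assumed to be 1).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; these are not project results.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(classification_metrics(y_true, y_pred))
```

For a phishing detector, recall measures how many phishing URLs are caught, while precision measures how many flagged URLs are actually phishing.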
Contributions to this project are welcome! If you have ideas for improvements or new features, feel free to open an issue or submit a pull request.
This project is licensed under the MIT License - see the LICENSE file for details.



