PhishShield

PhishShield is a phishing website detector using two ML models (feature-based and text-based).

Datasets

The project utilizes two separate datasets, each tailored for training a specific machine learning model.

Dataset for Feature-based Model

The dataset used to train the feature-based model

Dataset for Text-based Model

The dataset used to train the text-based model

Models

Feature-based: Makes predictions based on 29 features extracted from the URL.
Text-based: Makes predictions by analyzing the URL text, i.e. the words used in the URL.
Both models use pipelines, transformers, and hyperparameter tuning with grid search.
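
To make the two input styles concrete, here is a minimal sketch of what URL-derived features and URL text tokens might look like. The function names and the handful of features shown are illustrative assumptions, not the project's actual 29 features or its real preprocessing code.

```python
import re
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Compute a few illustrative lexical features (hypothetical, not the real 29)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots": url.count("."),
        "num_hyphens": url.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
        # 1 if the host looks like a dotted IPv4 address rather than a domain name
        "host_is_ip": int(bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host))),
    }


def tokenize_url(url: str) -> list:
    """Split a URL into lowercase word-like tokens, as a text-based model might."""
    return [tok for tok in re.split(r"[^A-Za-z0-9]+", url.lower()) if tok]


if __name__ == "__main__":
    sample = "https://secure-login.example.com/verify?id=123"
    print(extract_url_features(sample))
    print(tokenize_url(sample))
```

A numeric feature dictionary like this would feed the feature-based model, while the token list would typically be vectorized before reaching the text-based model.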

Deployment

Backend: Flask app serving prediction endpoints, with disk caching to improve speed.
Frontend: Simple HTML/CSS/Bootstrap UI. Enter a URL and get a prediction.
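
The idea behind disk caching can be sketched with the standard library alone. This is an assumption-laden illustration: `DiskCache` and `fake_predict` are hypothetical names, and the real app may well use a dedicated caching library and a different key scheme.

```python
import hashlib
import json
import tempfile
from pathlib import Path


class DiskCache:
    """Tiny file-per-key JSON cache; a stand-in for whatever cache the app uses."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, key: str) -> Path:
        # Hash the key so arbitrary URLs map to safe file names.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.directory / f"{digest}.json"

    def get_or_compute(self, key, compute):
        """Return the cached value for key, computing and storing it on a miss."""
        path = self._path(key)
        if path.exists():
            return json.loads(path.read_text(encoding="utf-8"))
        value = compute(key)
        path.write_text(json.dumps(value), encoding="utf-8")
        return value


def fake_predict(url: str) -> dict:
    """Hypothetical placeholder for the real model call."""
    return {"url": url, "label": "phishing" if "@" in url else "legitimate"}


if __name__ == "__main__":
    cache = DiskCache(tempfile.mkdtemp())
    # The second call is served from disk without invoking fake_predict again.
    print(cache.get_or_compute("http://example.com", fake_predict))
    print(cache.get_or_compute("http://example.com", fake_predict))
```

Because results are stored on disk rather than in memory, cached predictions in a setup like this survive application restarts.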

Webpage Interface

Usage

To use PhishShield, follow these steps:

Clone the repository:

```
git clone --depth=1 https://github.com/praneeth-katuri/PhishShield.git
```

Install the required dependencies:

Python Version: 3.12.3
```
pip install -r requirements.txt
```
Run the NLTK setup script:
```
python utils/setup_nltk.py
```
Edit the .env file and enter your reCAPTCHA keys and Flask secret key.

To generate a Flask secret key, run the command below in a terminal and copy the output into the .env file:
```
python -c 'import secrets; print(secrets.token_hex(16))'
```
To start the Flask application, run the following command in your terminal:
```
python run.py
```
To access the webpage interface, open http://127.0.0.1:5000 in your web browser.

Results

Metrics Evaluated: accuracy, precision, recall, F1-score.
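
For reference, the four metrics can be computed from prediction counts as in the sketch below. The function name and the toy labels are assumptions made for illustration; they are not the project's evaluation code or its results.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary classifier."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy labels: 1 = phishing, 0 = legitimate (made-up data for illustration).
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(classification_metrics(y_true, y_pred))
```

In phishing detection, recall reflects how many phishing URLs are caught, while precision reflects how often a phishing verdict is correct.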

Feature-based Model

Text-based Model

Contributing

Contributions to this project are welcome! If you have ideas for improvements or new features, feel free to open an issue or submit a pull request.

License

This project is licensed under the MIT License - see the LICENSE file for details.
