Benchmarking Sentiment Classifiers Against Adversarial and Stress Attacks

Objective

This project benchmarks the robustness of a DistilBERT-based sentiment classifier against three kinds of adversarial and stress input: random typographical errors, flood attacks built from long junk text, and simulated DDoS traffic. The goal is to evaluate model stability, response time, and security under conditions that approximate real-world attacks.

Datasets

IMDB Movie Reviews: Long-form, natural-language movie reviews, used to evaluate typo robustness.
Amazon Mobile Apps Reviews: Short, bursty reviews, used to test behavior under stress and system load.

Why These Datasets?

IMDB reviews provide long-form natural-language test inputs, while Amazon Mobile Apps reviews resemble the short bursts of text traffic that are common in production ML deployments.

Model Used

distilbert-base-uncased-finetuned-sst-2-english
Chosen for its inference speed, small size, and strong performance on the SST-2 sentiment classification benchmark.
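
For readers who want to reproduce the setup, the checkpoint can be loaded through the Hugging Face `transformers` pipeline API. The sketch below is an illustration, not the notebook's code: the helper names `load_classifier` and `label_to_int` are hypothetical, and passing `truncation=True` is an assumption about how over-length inputs are handled (DistilBERT accepts at most 512 tokens per input).

```python
from typing import Any, Callable, Dict, List

# Hugging Face Hub ID of the checkpoint named above.
MODEL_ID = "distilbert-base-uncased-finetuned-sst-2-english"


def load_classifier() -> Callable[[List[str]], List[Dict[str, Any]]]:
    """Build a sentiment classifier (downloads the checkpoint on first use)."""
    # Imported lazily so label_to_int below works without transformers installed.
    from transformers import pipeline

    pipe = pipeline("sentiment-analysis", model=MODEL_ID)

    def classify(texts: List[str]) -> List[Dict[str, Any]]:
        # Assumption: clip inputs to the model's 512-token limit, which matters
        # for long IMDB reviews and for flood-style inputs.
        return pipe(texts, truncation=True)

    return classify


def label_to_int(prediction: Dict[str, Any]) -> int:
    """Map a result such as {"label": "POSITIVE", "score": 0.99} to 1 or 0."""
    return 1 if prediction["label"].upper() == "POSITIVE" else 0
```

Calling `load_classifier()(["A wonderful film."])` returns one dictionary per input with a `label` and a `score`; `label_to_int` turns that into a 0/1 prediction that can be compared with dataset labels.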

Attack Methods

Typo Adversarial Attack: Introduces random character-level errors into the input text.
Flood Attack: Submits very long junk text to simulate inputs that overflow the model's maximum sequence length.
DDoS Attack Simulation: Sends many small random junk inputs in rapid succession to simulate system overload.
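
The three kinds of attack input listed above could be generated along the following lines. This is a minimal sketch under stated assumptions, not the notebook's implementation: the function names (`add_typos`, `make_flood_text`, `make_junk_inputs`), the choice of typo operations, and the default `rate`, `n_words`, `n`, and `length` values are all hypothetical.

```python
import random
import string
from typing import List


def add_typos(text: str, rate: float = 0.05, seed: int = 0) -> str:
    """Perturb each letter with probability `rate` (substitute, delete, or duplicate)."""
    rng = random.Random(seed)
    out = []
    for ch in text:
        if ch.isalpha() and rng.random() < rate:
            op = rng.choice(("substitute", "delete", "duplicate"))
            if op == "substitute":
                out.append(rng.choice(string.ascii_lowercase))
            elif op == "duplicate":
                out.append(ch + ch)
            # "delete": append nothing, which drops the character
        else:
            out.append(ch)  # spaces, digits, and punctuation pass through unchanged
    return "".join(out)


def make_flood_text(n_words: int = 5000, seed: int = 0) -> str:
    """Build one long junk document of `n_words` random lowercase words."""
    rng = random.Random(seed)
    words = (
        "".join(rng.choices(string.ascii_lowercase, k=rng.randint(3, 10)))
        for _ in range(n_words)
    )
    return " ".join(words)


def make_junk_inputs(n: int = 1000, length: int = 8, seed: int = 0) -> List[str]:
    """Build `n` short random strings for a DDoS-style burst of requests."""
    rng = random.Random(seed)
    alphabet = string.ascii_letters + string.digits
    return ["".join(rng.choices(alphabet, k=length)) for _ in range(n)]
```

Seeding each generator keeps a benchmark run repeatable, so clean and attacked accuracy can be compared on exactly the same perturbed inputs.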

Key Results

Dataset	Clean Accuracy	Typo Attack Accuracy
IMDB	49.4%	52.8%
Amazon Mobile Apps	88.0%	71.3%
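
To make the table easier to read, the change in accuracy under the typo attack can be computed directly from the reported figures. The snippet below only does arithmetic on the numbers in the table; the helper name `accuracy_change` is hypothetical.

```python
# Accuracies (percent) copied from the Key Results table.
RESULTS = {
    "IMDB": {"clean": 49.4, "typo": 52.8},
    "Amazon Mobile Apps": {"clean": 88.0, "typo": 71.3},
}


def accuracy_change(clean: float, typo: float):
    """Return (absolute change in percentage points, relative change in percent)."""
    absolute = typo - clean
    relative = 100.0 * absolute / clean
    return round(absolute, 1), round(relative, 1)


for name, row in RESULTS.items():
    absolute, relative = accuracy_change(row["clean"], row["typo"])
    print(f"{name}: {absolute:+.1f} points ({relative:+.1f}% relative)")
```

This prints a change of -16.7 points (-19.0% relative) for Amazon Mobile Apps and +3.4 points (+6.9% relative) for IMDB. Both IMDB figures sit near the 50% level that random guessing would reach on a balanced two-class sample, so that row is best read with caution when judging typo robustness.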

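Response time increased significantly under flood and DDoS conditions; see flood_attack_results.csv, ddos_attack_results.csv, and response_time_comparison.png in the repository.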
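
One way to measure response time under load is to time each request and issue requests from several threads at once. The sketch below is an assumption about how such a measurement might be set up, not the notebook's harness: `measure_latency`, `timed_call`, and `stub_classify` are hypothetical names, and the stub stands in for the real model so the example runs without downloading it.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Sequence


def timed_call(classify: Callable[[str], object], text: str) -> float:
    """Return the wall-clock seconds taken by one classify(text) call."""
    start = time.perf_counter()
    classify(text)
    return time.perf_counter() - start


def measure_latency(
    classify: Callable[[str], object], texts: Sequence[str], workers: int = 1
) -> Dict[str, float]:
    """Send every text through `classify` using `workers` threads and summarize latency."""
    if not texts:
        raise ValueError("texts must not be empty")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        latencies = list(pool.map(lambda t: timed_call(classify, t), texts))
    ordered = sorted(latencies)
    return {
        "requests": len(latencies),
        "mean_s": statistics.mean(latencies),
        # Approximate 95th percentile by index into the sorted latencies.
        "p95_s": ordered[int(0.95 * (len(ordered) - 1))],
        "max_s": ordered[-1],
    }


def stub_classify(text: str) -> str:
    """Placeholder for the real model: cost grows with input length."""
    checksum = sum(ord(c) for c in text)
    return "POSITIVE" if checksum % 2 == 0 else "NEGATIVE"
```

Running `measure_latency` once on normal reviews with `workers=1` and again on junk inputs with a higher `workers` value gives a baseline and a stressed measurement that can be compared side by side.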

Key Takeaways

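The model tolerated typos reasonably well (accuracy on Amazon Mobile Apps reviews fell from 88.0% to 71.3%), but NLP models like DistilBERT degrade significantly under flood and DDoS-style attacks, where response time rises sharply. This exposes vulnerabilities that matter for real-world deployments.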

Outputs

Full benchmarking notebook (Benchmarking.ipynb)
Adversarial Results (CSV, HTML)
Graphs (Accuracy Comparison, Response Time)
Full HTML Final Report

Repository: iamdevnd/Transformer-Robustness-Benchmark
