➡️ Short Description: A Streamlit application that uses a BERT-based model to detect and classify toxic comments across multiple categories.
This application analyzes text for six types of toxicity:
- Detects general toxicity in text
- Identifies severe toxic content
- Recognizes obscene language
- Detects threats in text
- Identifies insulting content
- Recognizes identity-based hate speech
The app provides visual gauge charts showing the probability of each toxicity type, making it easy to interpret the results.
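Below is a minimal sketch of how one such gauge could be rendered with Plotly inside Streamlit; the helper name `toxicity_gauge` is illustrative and not taken from the app's source.

```python
import plotly.graph_objects as go
import streamlit as st

def toxicity_gauge(label: str, probability: float) -> go.Figure:
    # One gauge per toxicity category; probability is expected in [0, 1].
    return go.Figure(
        go.Indicator(
            mode="gauge+number",
            value=probability,
            title={"text": label},
            gauge={"axis": {"range": [0, 1]}},
        )
    )

# Example: render a gauge for a hypothetical "Toxic" score of 0.87.
st.plotly_chart(toxicity_gauge("Toxic", 0.87))
```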
- Log in with your Supabase credentials (or use demo mode if authentication is disabled)
- Enter the text you want to analyze in the text area
- Click the "Analyze" button
- View the results showing different toxicity scores as gauge charts
- See an overall assessment of the text's toxicity
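As a rough sketch of this flow (the `analyze_text` helper and its hard-coded scores are placeholders, not the app's actual code):

```python
import streamlit as st

def analyze_text(text: str) -> dict[str, float]:
    # Placeholder scorer; the real app runs the BERT model and returns
    # one probability per toxicity category.
    return {"toxic": 0.10, "severe_toxic": 0.01, "obscene": 0.05,
            "threat": 0.00, "insult": 0.02, "identity_hate": 0.00}

st.title("Toxic Comment Classifier")
text = st.text_area("Enter the text you want to analyze")

if st.button("Analyze") and text.strip():
    scores = analyze_text(text)
    for label, prob in scores.items():
        st.metric(label, f"{prob:.2f}")
    # Overall assessment: flag the text if any category crosses 0.5.
    verdict = "Toxic" if max(scores.values()) >= 0.5 else "Non-toxic"
    st.write(f"Overall assessment: {verdict}")
```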
This application requires the following main libraries:
- Streamlit for the user interface
- PyTorch and Transformers for the BERT model
- Supabase for authentication
- Plotly for visualization
All dependencies are listed in the requirements.txt file and are automatically installed during the build process.
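An unpinned sketch of what that requirements.txt might contain, based on the libraries listed above (the actual file may pin specific versions and include additional packages):

```text
streamlit
torch
transformers
supabase
plotly
```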
- `app.py`: The main Streamlit application
- `auth.py`: Authentication logic using Supabase
- `requirements.txt`: Lists all Python dependencies
- `Dockerfile`: Container configuration for Hugging Face Spaces
- `saved/`: Directory for storing the trained model
- `src/`: Source code for the model and preprocessing
The model is a fine-tuned BERT classifier trained on the Toxic Comment Classification Dataset. It predicts six different types of toxicity:
- Toxic: General category for unpleasant content
- Severe Toxic: Extreme cases of toxicity
- Obscene: Explicit or vulgar content
- Threat: Expressions of intent to harm
- Insult: Disrespectful or demeaning language
- Identity Hate: Prejudiced language against protected characteristics
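For reference, here is a hedged sketch of multi-label inference with a fine-tuned BERT checkpoint. The checkpoint path follows the `saved/` directory in the project layout above, but the exact loading code, label order, and label names are assumptions, not the app's verbatim source:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Label names assumed from the six categories described above.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Assumed checkpoint location; the project stores the trained model under saved/.
tokenizer = AutoTokenizer.from_pretrained("saved")
model = AutoModelForSequenceClassification.from_pretrained("saved")
model.eval()

def score_toxicity(text: str) -> dict[str, float]:
    # Multi-label classification: each of the six logits gets an
    # independent sigmoid, so the probabilities need not sum to 1.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.sigmoid(logits).squeeze(0).tolist()
    return dict(zip(LABELS, probs))

print(score_toxicity("You are a wonderful person."))
```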
Created by Ralph