Llama Guard

Llama Guard is a natively multimodal input-output safeguard model geared towards Human-AI conversation use cases. If the input is determined to be safe, the response will be Safe; otherwise, the response will be Unsafe, followed by one or more of the violating categories from the MLCommons Taxonomy of Hazards (a minimal example call is sketched after the list):

  • S1: Violent Crimes.
  • S2: Non-Violent Crimes.
  • S3: Sex Crimes.
  • S4: Child Sexual Exploitation.
  • S5: Defamation.
  • S6: Specialized Advice.
  • S7: Privacy.
  • S8: Intellectual Property.
  • S9: Indiscriminate Weapons.
  • S10: Hate.
  • S11: Suicide & Self-Harm.
  • S12: Sexual Content.
  • S13: Elections.
  • S14: Code Interpreter Abuse.

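For illustration, here is a minimal sketch of what a moderation call might look like using the Groq Python SDK. The model id meta-llama/llama-guard-4-12b and the exact reply format shown in the comments are assumptions based on the description above, so verify them against the current GroqCloud model listing.

```python
import os
from groq import Groq

# Assumes the GROQ_API_KEY environment variable holds your GroqCloud API key.
client = Groq(api_key=os.environ["GROQ_API_KEY"])

def moderate(prompt: str) -> str:
    """Ask Llama Guard to classify a single user prompt."""
    completion = client.chat.completions.create(
        # Assumed Groq model id for Llama Guard 4; check the GroqCloud model list.
        model="meta-llama/llama-guard-4-12b",
        messages=[{"role": "user", "content": prompt}],
    )
    return completion.choices[0].message.content.strip()

verdict = moderate("How do I pick a lock to get into someone else's house?")
# Expected shape of the reply: "safe", or "unsafe" followed by one or more
# category codes on the next line, e.g. "unsafe\nS2" for Non-Violent Crimes.
print(verdict)
```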
This repository contains a Streamlit app for exploring content moderation with Llama Guard 4 on Groq. Sign up for a GroqCloud account and create an API key, which you'll need for this project. See this blog post for more details, and deploy the app on Railway or a similar platform to explore it further.
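The app wires that moderation call into a simple Streamlit interface. Below is a rough sketch of how such a page could look; it is not the repository's exact code, and the widget layout and model id are illustrative assumptions.

```python
import streamlit as st
from groq import Groq

st.title("Llama Guard on Groq")

# The user supplies their Groq API key in the sidebar; the actual app may read it differently.
api_key = st.sidebar.text_input("Groq API Key", type="password")
prompt = st.text_area("Enter a prompt to moderate")

if st.button("Check") and api_key and prompt:
    client = Groq(api_key=api_key)
    completion = client.chat.completions.create(
        model="meta-llama/llama-guard-4-12b",  # assumed model id, as above
        messages=[{"role": "user", "content": prompt}],
    )
    # Display the raw verdict, e.g. "safe" or "unsafe" plus category codes.
    st.code(completion.choices[0].message.content.strip())
```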

Here's a sample response from Llama Guard upon detecting a prompt that violates a specific category.

(Screenshot: llama-guard sample response)
