Skip to content

LLM | Security | Operations in one github repo with good links and pictures.

Notifications You must be signed in to change notification settings

wearetyomsmnv/Awesome-LLMSecOps

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

58 Commits
Β 
Β 
Β 
Β 

Repository files navigation

LLMSecOps

πŸš€ Awesome LLMSecOps

Awesome GitHub stars GitHub forks GitHub last commit

πŸ” A curated list of awesome resources for LLMSecOps (Large Language Model Security Operations) 🧠

by @wearetyomsmnv and people

Architecture | Vulnerabilities | Tools | Defense | Threat Modeling | Jailbreaks | RAG Security | PoC's | Study Resources | Books | Blogs | Datasets for Testing | OPS Security | Frameworks | Best Practices | Research | Tutorials | Companies | Community Resources

LLM safety is a huge body of knowledge that is important and relevant to society today. The purpose of this Awesome list is to provide the community with the necessary knowledge on how to build an LLM development process - safe, as well as what threats may be encountered along the way. Everyone is welcome to contribute.

Important

This repository, unlike many existing repositories, emphasizes the practical implementation of security and does not provide a lot of references to arxiv in the description.


Architecture risks

Overview of fundamental architectural risks and challenges in LLM systems.

Risk Description
Recursive Pollution LLMs can produce incorrect output with high confidence. If such output is used in training data, it can cause future LLMs to be trained on polluted data, creating a feedback loop problem.
Data Debt LLMs rely on massive datasets, often too large to thoroughly vet. This lack of transparency and control over data quality presents a significant risk.
Black Box Opacity Many critical components of LLMs are hidden in a "black box" controlled by foundation model providers, making it difficult for users to manage and mitigate risks effectively.
Prompt Manipulation Manipulating the input prompts can lead to unstable and unpredictable LLM behavior. This risk is similar to adversarial inputs in other ML systems.
Poison in the Data Training data can be contaminated intentionally or unintentionally, leading to compromised model integrity. This is especially problematic given the size and scope of data used in LLMs.
Reproducibility Economics The high cost of training LLMs limits reproducibility and independent verification, leading to a reliance on commercial entities and potentially unreviewed models.
Model Trustworthiness The inherent stochastic nature of LLMs and their lack of true understanding can make their output unreliable. This raises questions about whether they should be trusted in critical applications.
Encoding Integrity Data is often processed and re-represented in ways that can introduce bias and other issues. This is particularly challenging with LLMs due to their unsupervised learning nature.

From Berryville Institute of Machine Learning (BIML) paper

Vulnerabilities description

by Giskard

Common vulnerabilities and security issues found in LLM applications.

Vulnerability Description
Hallucination and Misinformation These vulnerabilities often manifest themselves in the generation of fabricated content or the spread of false information, which can have far-reaching consequences such as disseminating misleading content or malicious narratives.
Harmful Content Generation This vulnerability involves the creation of harmful or malicious content, including violence, hate speech, or misinformation with malicious intent, posing a threat to individuals or communities.
Prompt Injection Users manipulating input prompts to bypass content filters or override model instructions can lead to the generation of inappropriate or biased content, circumventing intended safeguards.
Robustness The lack of robustness in model outputs makes them sensitive to small perturbations, resulting in inconsistent or unpredictable responses that may cause confusion or undesired behavior.
Output Formatting When model outputs do not align with specified format requirements, responses can be poorly structured or misformatted, failing to comply with the desired output format.
Information Disclosure This vulnerability occurs when the model inadvertently reveals sensitive or private data about individuals, organizations, or entities, posing significant privacy risks and ethical concerns.
Stereotypes and Discrimination If model's outputs are perpetuating biases, stereotypes, or discriminatory content, it leads to harmful societal consequences, undermining efforts to promote fairness, diversity, and inclusion.

LLMSecOps Life Cycle

Group 2

πŸ›  Tools for scanning

Security scanning and vulnerability assessment tools for LLM applications.

Tool Description Stars
πŸ”§ Garak LLM vulnerability scanner GitHub stars
πŸ”§ ps-fuzz 2 Make your GenAI Apps Safe & Secure πŸš€ Test & harden your system prompt GitHub stars
πŸ—ΊοΈ LLMmap Tool for mapping LLM vulnerabilities GitHub stars
πŸ›‘οΈ Agentic Security Security toolkit for AI agents GitHub stars
πŸ”’ LLM Confidentiality Tool for ensuring confidentiality in LLMs GitHub stars
πŸ”’ PyRIT The Python Risk Identification Tool for generative AI (PyRIT) is an open access automation framework to empower security professionals and machine learning engineers to proactively find risks in their generative AI systems. GitHub stars
πŸ”§ promptfoo LLM red teaming and evaluation framework. Test for jailbreaks, prompt injection, and other vulnerabilities with adversarial attacks (PAIR, tree-of-attacks, crescendo). CI/CD integration. GitHub stars
πŸ”§ LLaMator Framework for testing vulnerabilities of large language models with support for Russian language GitHub stars
πŸ”§ Spikee Comprehensive testing framework for LLM applications. Tests prompt injection, jailbreaks, and other vulnerabilities. Supports custom targets, attacks, judges, and guardrail evaluation GitHub stars
πŸ›‘οΈ LocalMod Self-hosted content moderation API with prompt injection detection, toxicity filtering, PII detection, and NSFW filtering. Runs 100% offline. GitHub stars

πŸ›‘οΈDefense

Defensive mechanisms, guardrails, and security controls for protecting LLM applications.

Security by Design

Category Method / Technology Principle of Operation (Mechanism) Examples of Use / Developers
1. Fundamental Alignment RLHF (Reinforcement Learning from Human Feedback) Training a model with reinforcement learning based on a reward model, which is trained on human evaluations. It optimizes for "usefulness" and "safety." OpenAI (GPT-4), Yandex (YandexGPT)
DPO (Direct Preference Optimization) Direct optimization of response probabilities based on preference pairs, bypassing the creation of a separate reward model. It is described as more stable and effective. Meta (Llama 3), Mistral, open models
Constitutional AI / RLAIF Using the model itself to criticize and correct its responses according to a set of rules ("Constitution"). AI replaces human labeling (RLAIF). Anthropic (Claude 3)
2. Internal Control (Interpretability) Representation Engineering (RepE) Detection and suppression of neuron activation vectors responsible for undesirable concepts (e.g., falsehood, lust for power) in real-time. Center for AI Safety (CAIS)
Circuit Breakers Redirection ("short-circuiting") of internal representations of malicious queries into orthogonal space, causing failure or nonsense. GraySwan AI, researchers
Machine Unlearning Algorithmic "erasure" of dangerous knowledge or protected data from model weights (e.g., via Gradient Ascent) so that the model physically "forgets" them. Research groups, Microsoft
3. External Filters (Guardrails) Llama Guard A specialized LLM-classifier that checks incoming prompts and outgoing responses for compliance with a risk taxonomy (MLCommons). Meta
NeMo Guardrails A programmable dialogue management system. It uses the Colang language for strict topic adherence and attack blocking. NVIDIA
Prompt Guard / Shields Lightweight models (based on BERT/DeBERTA) for detecting jailbreaks and prompt injections before they reach the LLM. Meta, Azure AI
SmoothLLM A randomized smoothing method: creating copies of a prompt with symbolic perturbations to disrupt the structure of adversarial attacks (e.g., GCG suffixes). Researchers (SmoothLLM authors)
Google Safety Filters Multi-level content filtering with customizable sensitivity thresholds and semantic vector analysis. Google (Gemini API)
4. System Instructions System Prompts / Tags Using special tokens (e.g., </start_header_id>) to separate system and user instructions. OpenAI, Meta, Anthropic
Instruction Hierarchy Prioritizing system instructions over user instructions to protect against prompt injection, especially when the model learns to ignore "forget past instructions" commands. OpenAI (GPT-4o Mini)
5. Testing (Red Teaming) Automated Attacks (GCG, AutoDAN) Using algorithms and other LLMs to generate hundreds of thousands of adversarial prompts to find vulnerabilities. Research groups
Tool Description Stars
πŸ›‘οΈ PurpleLlama Set of tools to assess and improve LLM security. GitHub stars
πŸ›‘οΈ Rebuff API with built-in rules for identifying prompt injection and detecting data leakage through canary words. (ProtectAI is now part of Palo Alto Networks) GitHub stars
πŸ”’ LLM Guard Self-hostable tool with multiple prompt and output scanners for various security issues. GitHub stars
🚧 NeMo Guardrails Tool that protects against jailbreak and hallucinations with customizable rulesets. GitHub stars
πŸ‘οΈ Vigil Offers dockerized and local setup options, using proprietary HuggingFace datasets for security detection. GitHub stars
🧰 LangKit Provides functions for jailbreak detection, prompt injection, and sensitive information detection. GitHub stars
πŸ› οΈ GuardRails AI Focuses on functionality, detects presence of secrets in responses. GitHub stars
🦸 Hyperion Alpha Detects prompt injections and jailbreaks. N/A
πŸ›‘οΈ LLM-Guard Tool for securing LLM interactions. (ProtectAI is now part of Palo Alto Networks) GitHub stars
🚨 Whistleblower Tool for detecting and preventing LLM vulnerabilities. GitHub stars
πŸ” Plexiglass Security tool for LLM applications. GitHub stars
πŸ” Prompt Injection defenses Rules for protected LLM GitHub stars
πŸ” LLM Data Protector Tools for protected LLM in chatbots N/A
πŸ” Gen AI & LLM Security for developers: Prompt attack mitigations on Gemini Security tool for LLM applications. GitHub stars
πŸ” TrustGate Generative Application Firewall that detects and blocks attacks against GenAI Applications. GitHub stars
πŸ›‘οΈ Tenuo Capability tokens for AI agents with task-scoped TTLs, offline verification and proof-of-possession binding. GitHub stars
πŸ›‘οΈ AIDEFEND Practical knowledge base for AI security defenses. Based on MAESTRO framework, MITRE D3FEND, ATLAS, ATT&CK, Google Secure AI Framework, and OWASP Top 10 LLM 2025/ML Security 2023. N/A

Threat Modeling

Frameworks and methodologies for identifying and modeling threats in LLM systems.

Tool Description
Secure LLM Deployment: Navigating and Mitigating Safety Risks Research paper on LLM security [sorry, but is really cool]
ThreatModels Repository for LLM threat models
Threat Modeling LLMs AI Village resource on threat modeling for LLMs
Sberbank AI Cybersecurity Threat Model Sberbank's threat model for AI cybersecurity
Pangea Attack Taxonomy Comprehensive taxonomy of AI/LLM attacks and vulnerabilities

image image

Monitoring

Tools and platforms for monitoring LLM applications, detecting anomalies, and tracking security events.

Tool Description
Langfuse Open Source LLM Engineering Platform with security capabilities.
HiveTrace LLM monitoring and security platform for GenAI applications. Detects prompt injection, jailbreaks, malicious HTML/Markdown elements, and PII. Provides real-time anomaly detection and security alerts.

Watermarking

Tools and techniques for watermarking LLM-generated content to detect AI-generated text.

Tool Description
MarkLLM An Open-Source Toolkit for LLM Watermarking.

Jailbreaks

Resources, databases, and benchmarks for understanding and testing jailbreak techniques against LLMs.

Resource Description Stars
JailbreakBench Website dedicated to evaluating and analyzing jailbreak methods for language models N/A
L1B3RT45 GitHub repository containing information and tools related to AI jailbreaking GitHub stars
llm-hacking-database This repository contains various attacks against Large Language Models GitHub stars
HaizeLabs jailbreak Database This database contains jailbreaks for multimodal language models N/A
Lakera PINT Benchmark A comprehensive benchmark for prompt injection detection systems. Evaluates detection systems across multiple categories (prompt injection, jailbreak, hard negatives, chat, documents) and supports evaluation in 20+ languages. Open-source benchmark with Jupyter notebook for custom evaluations. GitHub stars
EasyJailbreak An easy-to-use Python framework to generate adversarial jailbreak prompts GitHub stars

LLM Interpretability

Resources for understanding and interpreting LLM behavior, decision-making, and internal mechanisms.

Resource Description
Π˜Π½Ρ‚Π΅Ρ€ΠΏΡ€Π΅Ρ‚ΠΈΡ€ΡƒΠ΅ΠΌΠΎΡΡ‚ΡŒ LLM Dmitry Kolodezev's web page, which provides useful resources with LLM interpretation techniques

PINT Benchmark scores (by lakera)

Prompt Injection Test (PINT) benchmark scores comparing different prompt injection detection systems.

Name PINT Score Test Date
Lakera Guard 95.2200% 2025-05-02
Azure AI Prompt Shield for Documents 89.1241% 2025-05-02
protectai/deberta-v3-base-prompt-injection-v2 79.1366% 2025-05-02
Llama Prompt Guard 2 (86M) 78.7578% 2025-05-05
Google Model Armor 70.0664% 2025-08-27
Aporia Guardrails 66.4373% 2025-05-02
Llama Prompt Guard 61.8168% 2025-05-02

Note: ProtectAI is now part of Palo Alto Networks

Hallucinations Leaderboard

Top 25 Hallucination Rates

Note: For the complete and most up-to-date interactive leaderboard, visit the Hugging Face leaderboard or the GitHub repository.

From this repo (last updated: December 18, 2025)

image

This is a Safety Benchmark from Stanford University


RAG Security

Security considerations, attacks, and defenses for Retrieval-Augmented Generation (RAG) systems.

Resource Description
Security Risks in RAG Article on security risks in Retrieval-Augmented Generation (RAG)
How RAG Poisoning Made LLaMA3 Racist Blog post about RAG poisoning and its effects on LLaMA3
Adversarial AI - RAG Attacks and Mitigations GitHub repository on RAG attacks, mitigations, and defense strategies
PoisonedRAG GitHub repository about poisoned RAG systems
ConfusedPilot: Compromising Enterprise Information Integrity and Confidentiality with Copilot for Microsoft 365 Article about RAG vulnerabilities
Awesome Jailbreak on LLMs - RAG Attacks Collection of RAG-based LLM attack techniques

image

Agentic security

Security tools, benchmarks, and research focused on autonomous AI agents and their vulnerabilities.

Tool Description Stars
invariant A trace analysis tool for AI agents. GitHub stars
AgentBench A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24) GitHub stars
Agent Hijacking, the true impact of prompt injection Guide for attack langchain agents Article
Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification Research about typical agent vulnerabilities Article
Model Context Protocol (MCP) at First Glance: Studying the Security and Maintainability of MCP Servers First large-scale empirical study of MCP servers security and maintainability Article
Awesome MCP Security Curated list of MCP security resources GitHub stars
Awesome LLM Agent Security Comprehensive collection of LLM agent security resources, attacks, vulnerabilities GitHub stars
MCP Security Analysis Research paper on MCP security vulnerabilities and analysis Article
Tenuo Capability-based authorization framework for AI agents. Task-scoped warrants with cryptographic attenuation, PoP binding, offline verification. LangChain/LangGraph/MCP integrations. GitHub stars

Agentic Browser Security

Security research and analysis of AI-powered browser agents and their unique attack vectors.

Resource Description Source
From Inbox to Wipeout: Perplexity Comet's AI Browser Quietly Erasing Google Drive Research on zero-click Google Drive wiper attack via Perplexity Comet. Shows how polite, well-structured emails can trigger destructive actions in agentic browsers. Straiker STAR Labs
Agentic Browser Security Analysis Research paper on security vulnerabilities in agentic browsers Article
Browser AI Agents: The New Weakest Link Analysis of security risks in browser-based AI agents Sqrx Labs
Comet Prompt Injection Vulnerability Brave's analysis of prompt injection vulnerabilities in Perplexity Comet browser Brave

PoC

Proof of Concept implementations demonstrating various LLM attacks, vulnerabilities, and security research.

Tool Description Stars
Visual Adversarial Examples Jailbreaking Large Language Models with Visual Adversarial Examples GitHub stars
Weak-to-Strong Generalization Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision GitHub stars
Image Hijacks Repository for image-based hijacks of large language models GitHub stars
CipherChat Secure communication tool for large language models GitHub stars
LLMs Finetuning Safety Safety measures for fine-tuning large language models GitHub stars
Virtual Prompt Injection Tool for virtual prompt injection in language models GitHub stars
FigStep Jailbreaking Large Vision-language Models via Typographic Visual Prompts GitHub stars
stealing-part-lm-supplementary Some code for "Stealing Part of a Production Language Model" GitHub stars
Hallucination-Attack Attack to induce LLMs within hallucinations GitHub stars
llm-hallucination-survey Reading list of hallucination in LLMs. Check out our new survey paper: "Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models" GitHub stars
LMSanitator LMSanitator: Defending Large Language Models Against Stealthy Prompt Injection Attacks GitHub stars
Imperio Imperio: Robust Prompt Engineering for Anchoring Large Language Models GitHub stars
Backdoor Attacks on Fine-tuned LLaMA Backdoor Attacks on Fine-tuned LLaMA Models GitHub stars
CBA Consciousness-Based Authentication for LLM Security GitHub stars
MuScleLoRA A Framework for Multi-scenario Backdoor Fine-tuning of LLMs GitHub stars
BadActs BadActs: Backdoor Attacks against Large Language Models via Activation Steering GitHub stars
TrojText Trojan Attacks on Text Classifiers GitHub stars
AnyDoor Create Arbitrary Backdoor Instances in Language Models GitHub stars
PromptWare A Jailbroken GenAI Model Can Cause Real Harm: GenAI-powered Applications are Vulnerable to PromptWares GitHub stars
BrokenHill Automated attack tool that generates crafted prompts to bypass restrictions in LLMs using greedy coordinate gradient (GCG) attack GitHub stars
OWASP Agentic AI OWASP Top 10 for Agentic AI (AI Agent Security) - Pre-release version GitHub stars

Study resource

Educational platforms, CTF challenges, courses, and training resources for learning LLM security.

Tool Description
Gandalf Interactive LLM security challenge game
Prompt Airlines Platform for learning and practicing prompt engineering
PortSwigger LLM Attacks Educational resource on WEB LLM security vulnerabilities and attacks
Invariant Labs CTF 2024 CTF. You should hack LLM agentic
Invariant Labs CTF Summer 24 Hugging Face Space with CTF challenges
Crucible LLM security training platform
Poll Vault CTF CTF challenge with ML/LLM components
MyLLMDoc LLM security training platform
AI CTF PHDFest2 2025 AI CTF competition from PHDFest2 2025
AI in Security Russian platform for AI security training
DeepLearning.AI Red Teaming Course Short course on red teaming LLM applications
Learn Prompting: Offensive Measures Guide on offensive prompt engineering techniques
Application Security LLM Testing Free LLM security testing
Salt Security Blog: ChatGPT Extensions Vulnerabilities Article on security flaws in ChatGPT browser extensions
safeguarding-llms TMLS 2024 Workshop: A Practitioner's Guide To Safeguarding Your LLM Applications
Damn Vulnerable LLM Agent Intentionally vulnerable LLM agent for security testing and education
GPT Agents Arena Platform for testing and evaluating LLM agents in various scenarios
AI Battle Interactive game focusing on AI security challenges
AI/LLM Exploitation Challenges Challenges to test your knowledge of AI, ML, and LLMs
TryHackMe AI/ML Security Threats Walkthrough and writeup for TryHackMe AI/ML Security Threats room

image

πŸ“Š Community research articles

Research articles, security advisories, and technical papers from the security community.

Title Authors Year
πŸ“„ Bypassing Meta's LLaMA Classifier: A Simple Jailbreak Robust Intelligence 2024
πŸ“„ Vulnerabilities in LangChain Gen AI Unit42 2024
πŸ“„ Detecting Prompt Injection: BERT-based Classifier WithSecure Labs 2024
πŸ“„ Practical LLM Security: Takeaways From a Year in the Trenches NVIDIA 2024
πŸ“„ Security ProbLLMs in xAI's Grok Embrace The Red 2024
πŸ“„ Persistent Pre-Training Poisoning of LLMs SpyLab AI 2024
πŸ“„ Navigating the Risks: A Survey of Security, Privacy, and Ethics Threats in LLM-Based Agents Multiple Authors 2024
πŸ“„ Practical AI Agent Security Meta 2025
πŸ“„ Security Advisory: Anthropic's Slack MCP Server Vulnerable to Data Exfiltration Embrace The Red 2025

πŸŽ“ Tutorials

Step-by-step guides and tutorials for understanding and implementing LLM security practices.

Resource Description
πŸ“š HADESS - Web LLM Attacks Understanding how to carry out web attacks using LLM
πŸ“š Red Teaming with LLMs Practical methods for attacking AI systems
πŸ“š Lakera LLM Security Overview of attacks on LLM

πŸ“š Books

Comprehensive books covering LLM security, adversarial AI, and secure AI development practices.

πŸ“– Title πŸ–‹οΈ Author(s) πŸ” Description
The Developer's Playbook for Large Language Model Security Steve Wilson πŸ›‘οΈ Comprehensive guide for developers on securing LLMs
Generative AI Security: Theories and Practices (Future of Business and Finance) Ken Huang, Yang Wang, Ben Goertzel, Yale Li, Sean Wright, Jyoti Ponnapalli πŸ”¬ In-depth exploration of security theories, laws, terms and practices in Generative AI
Adversarial AI Attacks, Mitigations, and Defense Strategies: A cybersecurity professional's guide to AI attacks, threat modeling, and securing AI with MLSecOps John Sotiropoulos Practical examples of code for your best mlsecops pipeline

BLOGS

Security blogs, Twitter feeds, and Telegram channels focused on AI/LLM security.

Websites & Twitter

Resource Description
Embrace The Red Blog on AI security, red teaming, and LLM vulnerabilities
HiddenLayer AI security company blog
CyberArk Blog on AI agents, identity risks, and security
Straiker AI security research and agentic browser security
Firetail LLM security, prompt injection, and AI vulnerabilities
Palo Alto Networks Unit 42 research on AI security and agentic AI attacks
Trail of Bits Security research including AI/ML pickle file security
NCSC UK National Cyber Security Centre blog on AI safeguards
Knostic AI Security Posture Management (AISPM)
0din Secure LLM and RAG deployment practices
@llm_sec Twitter feed on LLM security
@LLM_Top10 Twitter feed on OWASP LLM Top 10
@aivillage_dc AI Village Twitter
@elder_plinius Twitter feed on AI security

Telegram Channels

Channel Language Description
PWN AI RU Practical AI Security and MLSecOps: LLM security, agents, guardrails, real-world threats
Борис_ь с ml RU Machine Learning + Information Security: ML, data science and cyber/AI security
Π•Π²Π³Π΅Π½ΠΈΠΉ ΠšΠΎΠΊΡƒΠΉΠΊΠΈΠ½ β€” Raft RU Building Raft AI and GPT-based applications: trust & safety, reliability and security
LLM Security RU Focused on LLM security: jailbreaks, prompt injection, adversarial attacks, benchmarks
AISecHub EN Global AI security hub: curated research, articles, reports and tools
AI Security Lab RU Laboratory by Raft x ITMO University: breaking and defending AI systems
ML&Sec Feed RU/EN Aggregated feed for ML & security: news, tools, research links
AISec [x_feed] RU/EN Digest of AI security content from X, blogs and papers
AI SecOps RU AI Security Operations: monitoring, incident response, SIEM/SOC integrations
OK ML RU ML/DS/AI channel with focus on repositories, tools and vulnerabilities
AI Attacks EN Stream of AI attack examples and threat intelligence
AGI Security EN Artificial General Intelligence Security discussions

DATA

Datasets for testing LLM security, prompt injection examples, and safety evaluation data.

Resource Description
Safety and privacy with Large Language Models GitHub repository on LLM safety and privacy
Jailbreak LLMs Data for jailbreaking Large Language Models
ChatGPT System Prompt Repository containing ChatGPT system prompts
Do Not Answer Project related to LLM response control
ToxiGen Microsoft dataset
SafetyPrompts A Living Catalogue of Open Datasets for LLM Safety
llm-security-prompt-injection This project investigates the security of large language models by performing binary classification of a set of input prompts to discover malicious prompts. Several approaches have been analyzed using classical ML algorithms, a trained LLM model, and a fine-tuned LLM model.
Prompt Injections Dataset Dataset containing prompt injection examples for testing and research

OPS

Operational security considerations: supply chain risks, infrastructure vulnerabilities, and production deployment security.

Group 4

Resource Description
https://sysdig.com/blog/llmjacking-stolen-cloud-credentials-used-in-new-ai-attack/ LLMJacking: Stolen Cloud Credentials Used in New AI Attack
https://huggingface.co/docs/hub/security Hugging Face Hub Security Documentation
https://github.com/ShenaoW/awesome-llm-supply-chain-security LLM Supply chain security resources
https://developer.nvidia.com/blog/secure-llm-tokenizers-to-maintain-application-integrity/ Secure LLM Tokenizers to Maintain Application Integrity
https://sightline.protectai.com/ Sightline by ProtectAI (ProtectAI is now part of Palo Alto Networks)

Check vulnerabilities on:
β€’ Nemo by Nvidia
β€’ Deep Lake
β€’ Fine-Tuner AI
β€’ Snorkel AI
β€’ Zen ML
β€’ Lamini AI
β€’ Comet
β€’ Titan ML
β€’ Deepset AI
β€’ Valohai

For finding LLMops tools vulnerabilities
ShadowMQ: How Code Reuse Spread Critical Vulnerabilities Across the AI Ecosystem Research on critical RCE vulnerabilities in AI inference servers (Meta Llama Stack, NVIDIA TensorRT-LLM, vLLM, SGLang, Modular) caused by unsafe ZeroMQ and pickle deserialization

πŸ— Frameworks

Comprehensive security frameworks, standards, and governance models for LLM and AI security.


OWASP Top 10 for LLM Applications 2025 (v2.0)

Updated list including System Prompt Leakage, Vector and Embedding Weaknesses

OWASP Top 10 for Agentic Applications (2026 Edition)

First industry standard for autonomous AI agent risks (released Dec 2025)

OWASP AI Testing Guide v1

Open standard for testing AI system trustworthiness (Nov 2025)

GenAI Security Solutions Reference Guide

Vendor-neutral guide for GenAI security architecture (Q2-Q3 2025)

LLM AI Cybersecurity & Governance Checklist

Security and governance checklist

LLMSecOps Cybersecurity Solution Landscape

Solution landscape overview

All OWASP GenAI Resources: genai.owasp.org/resources/

LLMSECOPS, by OWASP

Group 12

Additional Security Frameworks

Framework Organization Description
MCP Security Governance Cloud Security Alliance Governance framework for the Model Context Protocol ecosystem. Developing policies, standards, and assessment tools for secure MCP server deployment.
Databricks AI Security Framework (DASF) 2.0 Databricks Actionable framework for managing AI security. Includes 62 security risks across three stages and 64 controls applicable to any data and AI platform.
Google Secure AI Framework (SAIF) 2.0 Google Secure AI Framework focused on agents. Practitioner-focused framework for building powerful agents users can trust.
Snowflake AI Security Framework Snowflake Comprehensive framework for securing AI deployments on Snowflake platform.

AI Security Solutions Radar

2025 AI Security Solutions Radar

Source: 2025 AI Security Solutions Radar by RiskInsight-Wavestone


🌐 Community

Community resources, platforms, and collaborative spaces for LLM security practitioners.

Platform Details
OWASP SLACK Channels:
β€’ #project-top10-for-llm
β€’ #ml-risk-top5
β€’ #project-ai-community
β€’ #project-mlsec-top10
β€’ #team-llm_ai-secgov
β€’ #team-llm-redteam
β€’ #team-llm-v2-brainstorm
Awesome LLM Security GitHub repository
Awesome AI Security Telegram Curated list of Telegram channels and chats on AI Security, AI/MLSecOps, LLM Security
LVE_Project Official website
Lakera AI Security resource hub Google Sheets document
llm-testing-findings Templates with recommendations, CWE and other
Arcanum Prompt Injection Taxonomy Structured taxonomy of prompt injection attacks categorizing attack intents, techniques, and evasions. Resource for security researchers, AI developers, and red teamers.

Benchmarks

Security benchmarks, evaluation frameworks, and standardized tests for assessing LLM security capabilities.

Resource Description Stars
Backbone Breaker Benchmark (b3) Human-grounded benchmark for testing AI agent security. Built by Lakera with UK AI Security Institute using 194,000+ human attack attempts from Gandalf: Agent Breaker. Tests backbone LLM resilience across 10 threat snapshots. Article
Backbone Breaker Benchmark Paper Research paper on the Backbone Breaker Benchmark methodology and findings Article
CyberSoCEval Meta's benchmark for evaluating LLM capabilities in malware analysis and threat intelligence reasoning Meta Research
Agent Security Bench (ASB) Benchmark for agent security GitHub stars
AI Safety Benchmark Comprehensive benchmark for AI safety evaluation N/A
AI Safety Benchmark Paper Research paper on AI safety benchmarking methodologies Article
Evaluating Prompt Injection Datasets Analysis and evaluation framework for prompt injection datasets HiddenLayer
LLM Security Guidance Benchmarks Benchmarking lightweight, open-source LLMs for security guidance effectiveness using SECURE dataset GitHub stars
SECURE Benchmark for evaluating LLMs in cybersecurity scenarios, focusing on Industrial Control Systems GitHub stars
NIST AI TEVV AI Test, Evaluation, Validation and Verification framework by NIST N/A
Taming the Beast: Inside the Llama 3 Red Teaming Process DEF CON 32 presentation on Llama 3 red teaming 2024

About

LLM | Security | Operations in one github repo with good links and pictures.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 5

Languages