Pinned

  1. AdaShield Public

    [ECCV 2024] The official code for "AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting."

    Python 69 4

  2. Awesome-T2I-safety-Papers Public

    A list of T2I safety papers, updated daily. Discussion is welcome via Discussions.

    67 1

Repositories

Showing 10 of 18 repositories
  • SaFo-Lab.github.io Public

    The homepage of SaFo Lab

    HTML 2 MIT 0 0 0 Updated Jan 28, 2026
  • DoxBench Public

    [ICLR 2026] The official code for "Doxing via the Lens: Revealing Location-related Privacy Leakage on Multi-modal Large Reasoning Models"

    Jupyter Notebook 22 Apache-2.0 2 0 0 Updated Jan 28, 2026
  • dVLM-AD Public

    Official Repo for “dVLM-AD: Enhance Diffusion Vision-Language-Model for Driving via Controllable Reasoning”

    Python 4 0 0 0 Updated Jan 15, 2026
  • MetaAgent Public

    Official Repository of MetaAgent Program

    Python 37 4 4 0 Updated Dec 2, 2025
  • AutoDAN-Reasoning Public

    A further improvement to AutoDAN-Turbo through test-time scaling.

    Python 11 MIT 3 0 0 Updated Oct 21, 2025
  • AutoDAN-Turbo Public

    [ICLR 2025 Spotlight] The official implementation of our ICLR 2025 paper "AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs".

    Python 344 MIT 58 5 (1 issue needs help) 0 Updated Oct 8, 2025
  • DRIFT Public

    [NeurIPS 2025] The official implementation of the paper "DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents".

    Python 31 2 1 0 Updated Sep 26, 2025
  • PRISM Public

    PRISM: Robust VLM Alignment with Principled Reasoning for Integrated Safety in Multimodality

    Python 5 MIT 1 0 0 Updated Sep 12, 2025
  • AGrail4Agent Public

    [ACL 2025] The official code for "AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection".

    Python 32 1 0 0 Updated Aug 4, 2025
  • llm-armor Public
    JavaScript 0 0 0 0 Updated Jul 23, 2025

People

This organization has no public members. You must be a member to see who’s a part of this organization.
