qcri/in-context-uncertainty
Can LLMs Detect Their Confabulations? Estimating Reliability in Uncertainty-Aware Language Models

The code in this repository can be used to reproduce the results in the paper.

Hugging Face models used in the experiments: Qwen, Fanar, and Gemma.

RQ1: How does in-context information influence model behavior and response uncertainty?

This project supports a study of how LLMs answer questions under different context conditions: with correct context (WCC), with incorrect context (WIC), or without context (WOC). It includes tools for generating responses, extracting token-level uncertainty metrics, and visualizing uncertainty and error-type transitions.


📁 Repository Contents

generate_responses.py

Main pipeline for:

  • Generating responses from an LLM with and without context.
  • Collecting uncertainty metrics (aleatoric, epistemic).
  • Applying automatic correctness labeling and error-type classification.
  • Saving the enriched results into a CSV for analysis.

Usage

python generate_responses.py <model_name> <dataset_path> <output_path> <in_context>
  • <model_name>: HuggingFace model name or path (e.g., Qwen, Fanar, Gemma)
  • <dataset_path>: Path to .csv or .parquet file with questions, answers, and context
  • <output_path>: Destination file path to save output results
  • <in_context>: Either wic (with incorrect context) or wcc (with correct context)
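For example (the model name and file paths below are placeholders, not fixed choices):

python generate_responses.py google/gemma-2-9b-it data/triviaqa.parquet results/gemma_wcc.csv wcc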

visualization.py

Toolkit for:

  • Loading .parquet result files
  • Parsing token-level metrics and logits
  • Computing full-sequence reliability scores
  • Plotting KDE distributions of uncertainty metrics (EU, AU, reliability)

Customize the final call to plot_figure(...) to generate your desired output.
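For orientation, here is a minimal standalone sketch of the kind of KDE plot the toolkit produces; the file path and column names are assumptions for illustration, not the actual interface of plot_figure(...):

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical result file and column names.
df = pd.read_parquet("results/gemma_wcc.parquet")
sns.kdeplot(data=df, x="epistemic_uncertainty", hue="is_correct", fill=True)
plt.xlabel("Epistemic uncertainty (EU)")
plt.tight_layout()
plt.savefig("eu_kde.png")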

🔧 Requirements

Install dependencies with:

pip install torch transformers openai pandas numpy matplotlib seaborn tqdm python-dotenv

🛠️ Sample .env file

Create a .env file in the root of your project with the following keys:

AZURE_OPENAI_ENDPOINT=https://your-resource-name.openai.azure.com/
AZURE_OPENAI_API_KEY=your-azure-openai-api-key
AZURE_DEPLOYMENT_NAME=gpt-4-deployment-name
AZURE_API_VERSION=2024-02-15-preview

HF_TOKEN=your-huggingface-token

Ensure this .env file is not committed to version control. Add it to your .gitignore file.
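For reference, a minimal sketch of how these values can be loaded at runtime with python-dotenv; the client setup is illustrative, and the repository scripts handle this internally:

import os
from dotenv import load_dotenv
from openai import AzureOpenAI

load_dotenv()  # reads the .env file from the project root

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version=os.environ["AZURE_API_VERSION"],
)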

Research Question 2: Can Uncertainty Signals Be Used to Predict Response Reliability?

This directory contains four modular scripts designed to explore whether token-level uncertainty in large language models (LLMs) can be used to predict the factual reliability of generated responses. The full pipeline includes response generation, correctness labeling, hidden state extraction, and classifier-based probing.


📁 Modules Overview

  1. generation.py — Generates responses using greedy decoding.
  2. GPT_labeler.py — Uses GPT (e.g., ChatGPT) to label factual correctness and extract minimal answer spans.
  3. compute_hidden_states.py — Extracts token-level hidden states based on uncertainty-aware token selection strategies.
  4. probe_exp.py — Trains lightweight probing classifiers on extracted hidden state features.

🚀 Usage Instructions

1. Generate Responses

Use generation.py to generate LLM responses for selected datasets with greedy decoding:

python generation.py --module gemma --datasets truthfulqa triviaqa math
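For reference, greedy decoding with transformers amounts to disabling sampling; the checkpoint and prompt below are placeholders, not the configuration used by generation.py:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "google/gemma-2-2b-it"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Q: Who wrote 'Pride and Prejudice'?\nA:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)  # greedy decoding
print(tokenizer.decode(outputs[0], skip_special_tokens=True))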

2. Label Responses with GPT

Label the generated responses for correctness and extract minimal answer spans:

python GPT_labeler.py --models fanar gemma qwen --datasets truthfulqa triviaqa math
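As a rough sketch of what this step does, the snippet below sends one question/answer/response triple to an Azure OpenAI deployment and asks for a correctness verdict; the prompt wording, example inputs, and output handling are assumptions, not the actual implementation in GPT_labeler.py:

import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version=os.environ["AZURE_API_VERSION"],
)

# Example inputs (hypothetical).
question = "Who wrote 'Pride and Prejudice'?"
gold = "Jane Austen"
response = "It was written by Jane Austen."

prompt = (
    f"Question: {question}\nGold answer: {gold}\nModel response: {response}\n"
    "Is the response factually correct? Answer 'correct' or 'incorrect', "
    "then give the minimal answer span from the response."
)
reply = client.chat.completions.create(
    model=os.environ["AZURE_DEPLOYMENT_NAME"],
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
)
print(reply.choices[0].message.content)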

3. Compute Hidden States

Extract hidden states based on token-level uncertainty:

python compute_hidden_states.py \
    --model [model] \
    --datasets [dataset(s)] \
    --uncertainty_type [au | eu | agg] \
    --K [K]
  • --uncertainty_type: Strategy for selecting target tokens (e.g., AU = aleatoric uncertainty, EU = epistemic uncertainty, AGG = aggregated).
  • --K: Number of top tokens to use for feature construction.
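To illustrate the idea behind uncertainty-aware token selection, the sketch below scores each token by predictive entropy (a simple stand-in for the script's AU/EU/aggregated estimators) and keeps the hidden states of the K highest-entropy tokens; the checkpoint, input text, and entropy proxy are assumptions:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "google/gemma-2-2b-it"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("The capital of Australia is Canberra.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# Predictive entropy per token position.
probs = out.logits.softmax(dim=-1)                                # [1, seq, vocab]
entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)[0]  # [seq]

K = 5
top_idx = entropy.topk(min(K, entropy.numel())).indices

# Hidden states of the selected tokens from the last layer.
features = out.hidden_states[-1][0, top_idx]                      # [K, hidden_dim]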

4. Train Probing Classifiers

Train classifiers to predict factual correctness using extracted features:

python probe_exp.py \
    --model [model] \
    --dataset [dataset] \
    --uncertainty_type [au | eu | agg] \
    --K [K]

This will evaluate probing performance across all layers of the selected model and dataset.
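As a rough illustration of what a lightweight probe looks like, the sketch below trains a linear (logistic-regression-style) classifier in PyTorch on placeholder features; probe_exp.py defines the actual features, data splits, and evaluation:

import torch

def train_probe(X, y, epochs=200, lr=1e-2):
    """X: [n, d] float features, y: [n] binary correctness labels."""
    probe = torch.nn.Linear(X.shape[1], 1)
    optimizer = torch.optim.Adam(probe.parameters(), lr=lr)
    loss_fn = torch.nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(probe(X).squeeze(-1), y.float())
        loss.backward()
        optimizer.step()
    return probe

# Toy usage with random features; replace with the extracted hidden states.
X = torch.randn(256, 2048)
y = (torch.rand(256) > 0.5).long()
probe = train_probe(X, y)
accuracy = ((probe(X).squeeze(-1) > 0) == y.bool()).float().mean()
print(f"train accuracy: {accuracy.item():.3f}")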

