aHFR-TokenSHAP

Adaptive Hierarchically Feature-Restricted TokenSHAP for Binary Classification with Large Language Models

This repository provides a reproducible demonstration of feature-level attribution for LLM-based binary phenotype classification prompts rendered from structured (hierarchical) records—together with the accompanying manuscript and technical note.

Manuscript (PDF): paper_aHFR_TokenSHAP.pdf
Technical note (PDF): technical_note_complexity_reduction.pdf
Repository: stvsever/aHFR_TokenSHAP

Motivation

Large language models (LLMs) are increasingly used as inference engines over structured phenotypic records rendered into prompt templates (e.g., clinical risk-factor fields). In this regime, token-level attribution methods can over-attribute prompt scaffolding (headers, separators, boilerplate instructions) that is necessary for instruction-following but not the explanatory object of interest.

This repository focuses on feature-level explanations: which structured features (and which feature domains) drive the model’s binary decision?

Method in one paragraph

We introduce aHFR-TokenSHAP, a task-specific extension of TokenSHAP for binary classification prompts in which (i) the value function is the model’s binary decision score defined as label log-odds (rather than response-similarity between generated texts), and (ii) Shapley “players” are template-aligned leaf features organized by a pre-specified hierarchy rather than all prompt tokens. aHFR-TokenSHAP further incorporates an adaptive, hierarchy-constrained permutation generator: permutations are constructed via mixed-depth hierarchical frontiers, initialized by a short primary-layer calibration and updated across epochs to concentrate sampling on influential subtrees while preserving Shapley–Shubik marginal-contribution semantics. We validate aHFR-TokenSHAP in a controlled random hierarchical feature-injection experiment with 10 parent domains and 30 leaf features across 100 pseudo-profiles, and compare against (a) an internal baseline (Integrated Gradients on the same log-odds score, aggregated over value-only spans) and (b) an external knowledge-prior baseline (LLM-Select-style feature-name scoring).

Pseudocode of the aHFR-TokenSHAP algorithm

Figure: Example prompt overlay

Figure 1 shows a qualitative example prompt with an overlay of weighted feature-importance scores (Integrated Gradients + aHFR-TokenSHAP-style restricted Shapley's), illustrating increased emphasis on clinically relevant features relative to distractor “word-features”.

Quickstart

1) Create an environment & install dependencies

python -m venv .venv
source .venv/bin/activate  # macOS/Linux
# .venv\Scripts\activate   # Windows PowerShell

pip install -r requirements.txt

Name		Name	Last commit message	Last commit date
Latest commit History 29 Commits
demonstration_results/visuals		demonstration_results/visuals
utils		utils
.gitignore		.gitignore
README.md		README.md
paper_aHFR_TokenSHAP.pdf		paper_aHFR_TokenSHAP.pdf
requirements.txt		requirements.txt
technical_note_complexity_reduction.pdf		technical_note_complexity_reduction.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

aHFR-TokenSHAP

Motivation

Method in one paragraph

Pseudocode of the aHFR-TokenSHAP algorithm

Figure: Example prompt overlay

Quickstart

1) Create an environment & install dependencies

About

Uh oh!

Releases

Packages

Languages

stvsever/aHFR_TokenSHAP

Folders and files

Latest commit

History

Repository files navigation

aHFR-TokenSHAP

Motivation

Method in one paragraph

Pseudocode of the aHFR-TokenSHAP algorithm

Figure: Example prompt overlay

Quickstart

1) Create an environment & install dependencies

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages