This repository implements two state-of-the-art local-DP heavy-hitter discovery algorithms:
- SFP (Sequence Frequency Puzzle) – described in Apple’s differential privacy paper
- TrieHH – Algorithm 1 from our AISTATS 2020 submission
Our code provides an end-to-end pipeline to preprocess data, run Monte-Carlo simulations, and reproduce the F1-score comparison between the two algorithms.
sfp/— Implementation of Apple’s SFP algorithmtriehh/— Implementation of TrieHH (interactive prefix-trie)main.py— Runs Monte-Carlo simulations and generates the F1-score plotpreprocess.py— Prepares datasets for both SFP and TrieHHdictionary.txt— (Optional) list of in-vocab words for out-of-vocab (OOV) experiments
To reproduce our results:
bash run.sh
This script will:
- Download the Sentiment140 dataset
- Run
preprocess.py - Run simulations using
main.py - Automatically generate the F1 vs. K plot comparing SFP and TrieHH
All default parameters reproduce Figure 4 from our paper.
dictionary.txt is intentionally empty by default.
To simulate OOV detection (similar to Figure 5 in the paper):
- Add one in-vocab English word per line into
dictionary.txt - Preprocessing will automatically remove these in-vocab words from the dataset
- This produces an OOV-only dataset for evaluation
Based on our written experiment documentation (see written_report.pdf, pp. 1–3):
- Source: March 2025 English Wikipedia clickstream
- Filter: rows with
type == "link" - Sampled to ~2 M records → expanded into 1.78 M synthetic clients
- Each token padded/truncated to length L = 16
$appended for TrieHH- Preprocessing produces:
clients_triehh.txtclients_sfp.txtword_frequencies.txt(ground-truth histogram)
- ϵ = 4
- δ = 2.3 × 10⁻¹²
- R = 5 Monte-Carlo runs
- TrieHH vote threshold θ = 15
- Batch size = 39,153
- Run
SimulateTrieHHandSimulateSFPfor R trials - For K = 10…299, compute:
- precision
- recall
- F1 score
- Produce the final plot with 95% confidence intervals (Student-t)
The F1 vs. K curve (Figure 1 in the report) shows:
- Recovers ~30 titles per run
- Recall = 0 until K ≥ 30
- F1 peak < 0.20
- Under this strict δ and small n, SFP outputs an empty set
- Curve stays at 0 for all K
These results faithfully match theoretical behavior under tight DP constraints.
As explained in the report:
- TrieHH is limited by the vote threshold θ = 15
- SFP’s Bayesian posterior rejects all candidate strings under strong δ
- Increasing n ≥ 5 M or loosening δ to 10⁻⁶ significantly improves utility
- Exploring the privacy–utility tradeoffs in (ϵ, δ, L) is promising future work