Serverless Official Benchmark Visualizer built with React, TypeScript, Ant Design, and D3.
- Config-Driven: Load official benchmark sources from a client-side configuration file
- HuggingFace Integration: Automatically discover and fetch timestamped result files from HuggingFace datasets
- Multi-File Upload: Upload and compare multiple JSON/JSONL result files side-by-side
- Dynamic Schema Inference: Schema is derived from data at runtime using Zod
- Interactive Visualizations: D3-powered charts for category-based analysis, including expandable dashboards with bar charts and radar plots
- CSV Export: Export pivot tables and summaries
- Fully Static: Deployable to GitHub Pages, Netlify, or Vercel without a backend
npm install
npm run dev
npm run build
npm run preview
Edit public/config/benchmarks.config.json to configure official benchmark sources:
{
  "official": [
    {
      "id": "twinkle_eval_v1",
      "label": "Twinkle Eval v1 — Official",
      "provider": "HuggingFace",
      "modelName": "gpt-xyz-128k",
      "variance": "default",
      "openSource": false,
      "hfFolderUrl": "https://huggingface.co/datasets/<owner>/<repo>/resolve/main/results/"
    }
  ],
  "ui": {
    "defaultScale0100": false,
    "pageSizes": [10, 20, 30, 50, 100]
  },
  "security": {
    "allowOrigins": [
      "https://huggingface.co",
      "https://raw.githubusercontent.com"
    ]
  }
}
The app discovers result files from HuggingFace using two strategies:
- Manifest File (Recommended): Place an index.json file in your results folder listing available files: ["results_20250408_1216.json", "results_20250407_0930.json"]
- HuggingFace API Fallback: If no manifest exists, the app attempts to use the HF API to list files (requires CORS to be enabled)
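The two strategies above can be sketched as a manifest-first lookup with a fallback. This is a minimal illustration, not the app's actual code in discover.ts: the names `discoverFiles`, `manifestUrl`, `filterResultFiles`, and the injected `fetchJson` / `listViaApi` callbacks are hypothetical, and the fallback is passed in as a function so that no specific HF API endpoint is assumed.

```typescript
// Hypothetical sketch: only index.json and the file naming pattern come from
// this README; every function name here is an assumption for illustration.
type FetchJson = (url: string) => Promise<unknown>;

const RESULT_FILE = /^results_\d{8}_\d{4}\.json$/;

// Keep only entries that look like results_YYYYMMDD_HHMM.json.
function filterResultFiles(names: unknown): string[] {
  if (!Array.isArray(names)) return [];
  return names.filter(
    (n): n is string => typeof n === "string" && RESULT_FILE.test(n),
  );
}

// Build the manifest URL from the results folder URL.
function manifestUrl(folderUrl: string): string {
  const base = folderUrl.endsWith("/") ? folderUrl : folderUrl + "/";
  return base + "index.json";
}

// Strategy 1: read index.json. Strategy 2: fall back to an API listing.
async function discoverFiles(
  folderUrl: string,
  fetchJson: FetchJson,
  listViaApi: () => Promise<unknown>, // assumed wrapper around the HF API
): Promise<string[]> {
  try {
    const files = filterResultFiles(await fetchJson(manifestUrl(folderUrl)));
    if (files.length > 0) return files;
  } catch {
    // Manifest missing or unreadable: fall through to the API listing.
  }
  return filterResultFiles(await listViaApi());
}
```

Injecting the two fetchers keeps the lookup order easy to test without network access; in a browser, `fetchJson` could simply wrap `fetch(url).then((r) => r.json())`.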
Files must follow the pattern: results_YYYYMMDD_HHMM.json
- Example: results_20250408_1216.json → displayed as "2025-04-08 12:16"
- The latest file (by timestamp) is automatically selected by default
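As a rough illustration of the naming rule, the sketch below parses a file name into its display label and picks the latest file. The names `parseResultFileName` and `latestResultFile` are hypothetical and not taken from the project source; the sketch also assumes the fixed-width timestamp can be compared as a plain string.

```typescript
// Hypothetical helpers; only the file name pattern and the label format
// come from this README.
interface ResultFile {
  name: string;
  label: string;   // e.g. "2025-04-08 12:16"
  sortKey: string; // e.g. "202504081216"
}

const TIMESTAMPED_NAME = /^results_(\d{4})(\d{2})(\d{2})_(\d{2})(\d{2})\.json$/;

// Return the parsed file, or null when the name does not match the pattern.
function parseResultFileName(name: string): ResultFile | null {
  const m = TIMESTAMPED_NAME.exec(name);
  if (m === null) return null;
  const [, y, mo, d, h, mi] = m;
  return {
    name,
    label: `${y}-${mo}-${d} ${h}:${mi}`,
    sortKey: `${y}${mo}${d}${h}${mi}`,
  };
}

// Pick the file with the greatest timestamp; non-matching names are skipped.
function latestResultFile(names: string[]): ResultFile | null {
  let latest: ResultFile | null = null;
  for (const name of names) {
    const parsed = parseResultFileName(name);
    if (parsed !== null && (latest === null || parsed.sortKey > latest.sortKey)) {
      latest = parsed;
    }
  }
  return latest;
}
```

Because every field in the timestamp is zero-padded, string comparison of `sortKey` orders files chronologically without constructing `Date` objects.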
Expected JSON structure:
{
  "timestamp": "2025-10-01T12:34:56Z",
  "config": {
    "model": {
      "name": "gpt-xyz-128k"
    }
  },
  "dataset_results": {
    "datasets/benchmark_name": {
      "average_accuracy": 0.742,
      "results": [
        {
          "file": "category.task.jsonl",
          "accuracy_mean": 0.81,
          "accuracy_std": 0.02
        }
      ]
    }
  }
}
JSONL format is also supported (one JSON object per line).
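One way to handle both input formats is sketched below: try the text as a single JSON document, then fall back to one object per line. The type and function names (`ResultDocument`, `parseResultText`, `toRows`) are hypothetical, the field names mirror the structure shown above, and taking the category from the text before the first dot in `file` is an assumption. The sketch uses type assertions only; the app itself is described as validating with Zod at runtime.

```typescript
// Hypothetical types mirroring the expected JSON structure in this README.
interface TaskResult { file: string; accuracy_mean: number; accuracy_std: number; }
interface DatasetResult { average_accuracy: number; results: TaskResult[]; }
interface ResultDocument {
  timestamp: string;
  config: { model: { name: string } };
  dataset_results: Record<string, DatasetResult>;
}
interface Row {
  model: string;
  dataset: string;
  category: string;
  accuracyMean: number;
  accuracyStd: number;
}

// Parse as one JSON document; if that fails, treat the text as JSONL.
function parseResultText(text: string): ResultDocument[] {
  const trimmed = text.trim();
  try {
    return [JSON.parse(trimmed) as ResultDocument];
  } catch {
    return trimmed
      .split(/\r?\n/)
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as ResultDocument);
  }
}

// Flatten one document into table rows, one row per result file.
function toRows(doc: ResultDocument): Row[] {
  const rows: Row[] = [];
  for (const dataset of Object.keys(doc.dataset_results)) {
    const entry = doc.dataset_results[dataset];
    if (!entry) continue;
    for (const r of entry.results) {
      rows.push({
        model: doc.config.model.name,
        dataset,
        // Assumption: category is the text before the first dot in the file name.
        category: r.file.split(".")[0] ?? r.file,
        accuracyMean: r.accuracy_mean,
        accuracyStd: r.accuracy_std,
      });
    }
  }
  return rows;
}
```

Flat rows like these are a convenient intermediate shape for both pivot tables and CSV export, since each row already carries its model, dataset, and category.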
- React 18 + TypeScript (strict mode)
- Vite for fast development and building
- Ant Design v5 for UI components
- Ant Design Icons (no emojis in UI)
- D3 v7+ for all visualizations
- Tailwind CSS v4 for styling
- Zod for runtime schema validation
/public
  /config/benchmarks.config.json   # Configuration
  /examples/results_example.json   # Example data
/src
  /components                      # UI components
    ControlsPanel.tsx
    DownloadButtons.tsx
    FileUploader.tsx
  /charts                          # D3 visualizations
    CategoryDashboard.tsx
    CompactDashboard.tsx
  /features                        # Core logic (parse, transform, schema, etc.)
    csv.ts
    discover.ts
    parse.ts
    schema.ts
    transform.ts
    types.d.ts
  /pages                           # Page components
    Home.tsx
  /types                           # Type definitions
    ant-design-charts.d.ts
  App.tsx                          # Root component
  main.tsx                         # Entry point
This project is licensed under the MIT License. See the LICENSE file for details.