Eval Analyzer Website

A serverless visualizer for official benchmark results, built with React, TypeScript, Ant Design, and D3.

Features

Config-Driven: Load official benchmark sources from a client-side configuration file
HuggingFace Integration: Automatically discover and fetch timestamped result files from HuggingFace datasets
Multi-File Upload: Upload and compare multiple JSON/JSONL result files side-by-side
Dynamic Schema Inference: Schema is derived from data at runtime using Zod
Interactive Visualizations: D3-powered charts for category-based analysis, including expandable dashboards with bar charts and radar plots
CSV Export: Export pivot tables and summaries
Fully Static: Deployable to GitHub Pages, Netlify, or Vercel without a backend

Quick Start

Development

npm install
npm run dev

Build for Production

npm run build
npm run preview

Configuration

Edit public/config/benchmarks.config.json to configure official benchmark sources:

{
  "official": [
    {
      "id": "twinkle_eval_v1",
      "label": "Twinkle Eval v1 — Official",
      "provider": "HuggingFace",
      "modelName": "gpt-xyz-128k",
      "variance": "default",
      "openSource": false,
      "hfFolderUrl": "https://huggingface.co/datasets/<owner>/<repo>/resolve/main/results/"
    }
  ],
  "ui": {
    "defaultScale0100": false,
    "pageSizes": [10, 20, 30, 50, 100]
  },
  "security": {
    "allowOrigins": [
      "https://huggingface.co",
      "https://raw.githubusercontent.com"
    ]
  }
}
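
The shape of this file can be written down as TypeScript types. The sketch below mirrors the example above; the field meanings are inferred from that example rather than taken from the app's source. `isAllowedOrigin` and `allowedSources` are hypothetical helpers showing one way the `security.allowOrigins` list might be applied before a URL is fetched. How the app actually enforces the list is an assumption here.

```typescript
// Types mirroring the example config (inferred from the example, not from the app's code).
interface OfficialSource {
  id: string;
  label: string;
  provider: string;
  modelName: string;
  variance: string;
  openSource: boolean;
  hfFolderUrl: string;
}

interface BenchmarksConfig {
  official: OfficialSource[];
  ui: { defaultScale0100: boolean; pageSizes: number[] };
  security: { allowOrigins: string[] };
}

// Hypothetical helper: true when the URL's origin is listed in allowOrigins.
function isAllowedOrigin(url: string, allowOrigins: string[]): boolean {
  try {
    return allowOrigins.indexOf(new URL(url).origin) !== -1;
  } catch {
    return false; // not a parseable absolute URL
  }
}

// Hypothetical helper: keep only the sources whose folder URL passes the origin check.
function allowedSources(config: BenchmarksConfig): OfficialSource[] {
  return config.official.filter((source) =>
    isAllowedOrigin(source.hfFolderUrl, config.security.allowOrigins)
  );
}
```

With the example config, a source hosted on `https://huggingface.co` passes the check, while a source on any origin missing from `allowOrigins` is dropped.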

HuggingFace Integration

The app discovers result files from HuggingFace using two strategies:

Manifest File (Recommended): Place an index.json file in your results folder listing available files:
```
["results_20250408_1216.json", "results_20250407_0930.json"]
```
HuggingFace API Fallback: If no manifest exists, the app falls back to listing files through the HuggingFace API, which requires CORS to be enabled.
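
The manifest-first order described above can be sketched as follows. This is a minimal illustration, not the app's `discover.ts`: the function names are hypothetical, the API listing is represented by an injected callback rather than a real HuggingFace call, and dropping names that do not match the result-file pattern is an assumption.

```typescript
// File names the app is documented to accept: results_YYYYMMDD_HHMM.json
const RESULT_FILE = /^results_\d{8}_\d{4}\.json$/;

// Build the manifest URL for a results folder, with or without a trailing slash.
function manifestUrl(folderUrl: string): string {
  return folderUrl.replace(/\/+$/, "") + "/index.json";
}

// Parse manifest text. Returns null when the text is not a JSON array;
// entries that are not matching file names are skipped (an assumption).
function parseManifest(text: string): string[] | null {
  try {
    const data: unknown = JSON.parse(text);
    if (!Array.isArray(data)) return null;
    return data.filter(
      (item): item is string => typeof item === "string" && RESULT_FILE.test(item)
    );
  } catch {
    return null;
  }
}

// Strategy order: use the manifest when it is present and valid,
// otherwise call the fallback listing (hypothetical callback).
function resolveFileList(
  manifestText: string | null,
  listViaApi: () => string[]
): string[] {
  const fromManifest = manifestText === null ? null : parseManifest(manifestText);
  if (fromManifest !== null) return fromManifest;
  return listViaApi().filter((name) => RESULT_FILE.test(name));
}
```

Passing the fallback as a callback means it runs only when the manifest is missing or malformed, which keeps the recommended path free of extra requests.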

Result File Naming Convention

Files must follow the pattern: results_YYYYMMDD_HHMM.json

Example: results_20250408_1216.json → displayed as "2025-04-08 12:16"
The latest file (by timestamp) is automatically selected by default
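
A small sketch of how the naming convention maps to a display label and a default selection. The function names are hypothetical, and the pattern checks only the digit layout, not whether the date is a real calendar date.

```typescript
// Capture groups: year, month, day, hour, minute.
const RESULT_NAME = /^results_(\d{4})(\d{2})(\d{2})_(\d{2})(\d{2})\.json$/;

// Turn a file name into its display label, or null if the name does not match.
function labelFromFileName(name: string): string | null {
  const m = RESULT_NAME.exec(name);
  if (m === null) return null;
  return `${m[1]}-${m[2]}-${m[3]} ${m[4]}:${m[5]}`;
}

// Pick the newest file. Timestamps are zero-padded and fixed-width,
// so plain string sorting is also chronological sorting.
function latestResultFile(names: string[]): string | null {
  const valid = names.filter((name) => RESULT_NAME.test(name)).sort();
  return valid.length > 0 ? valid[valid.length - 1] : null;
}
```

For example, `labelFromFileName("results_20250408_1216.json")` yields `"2025-04-08 12:16"`, and names that do not follow the pattern are ignored when choosing the latest file.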

Data Format

Expected JSON structure:

{
  "timestamp": "2025-10-01T12:34:56Z",
  "config": {
    "model": {
      "name": "gpt-xyz-128k"
    }
  },
  "dataset_results": {
    "datasets/benchmark_name": {
      "average_accuracy": 0.742,
      "results": [
        {
          "file": "category.task.jsonl",
          "accuracy_mean": 0.81,
          "accuracy_std": 0.02
        }
      ]
    }
  }
}
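
The structure above can be described with TypeScript interfaces and flattened into one row per task, which is a convenient shape for tables and charts. This is a sketch under stated assumptions: the interface and function names are hypothetical, and taking the category from the part of `file` before the first dot is a guess based on the `category.task.jsonl` example, not confirmed behavior.

```typescript
interface TaskResult {
  file: string;
  accuracy_mean: number;
  accuracy_std: number;
}

interface DatasetResult {
  average_accuracy: number;
  results: TaskResult[];
}

interface EvalResult {
  timestamp: string;
  config: { model: { name: string } };
  dataset_results: { [dataset: string]: DatasetResult };
}

// One flat row per task (hypothetical shape for display).
interface Row {
  model: string;
  dataset: string;
  category: string;
  file: string;
  mean: number;
  std: number;
}

// Flatten the nested result into rows.
// Assumption: category = text before the first "." in the file name.
function toRows(result: EvalResult): Row[] {
  const rows: Row[] = [];
  for (const dataset of Object.keys(result.dataset_results)) {
    for (const task of result.dataset_results[dataset].results) {
      rows.push({
        model: result.config.model.name,
        dataset,
        category: task.file.split(".")[0],
        file: task.file,
        mean: task.accuracy_mean,
        std: task.accuracy_std,
      });
    }
  }
  return rows;
}
```

Applied to the example document, `toRows` returns a single row for `datasets/benchmark_name` with category `category`.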

JSONL format is also supported (one JSON object per line).
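
For illustration, JSONL text can be split into records as shown below. `parseJsonl` is a hypothetical name; skipping blank lines and reporting the failing line number are choices made for this sketch, not documented behavior of the app's parser.

```typescript
// Parse JSONL text: one JSON value per non-empty line.
// Throws an Error naming the 1-based line number of the first invalid line.
function parseJsonl(text: string): unknown[] {
  const records: unknown[] = [];
  const lines = text.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].trim();
    if (line === "") continue; // tolerate blank lines and a trailing newline
    try {
      records.push(JSON.parse(line));
    } catch (err) {
      throw new Error(`Invalid JSON on line ${i + 1}: ${(err as Error).message}`);
    }
  }
  return records;
}
```

The records come back as `unknown[]` so that each one can be validated (for example with a Zod schema) before use.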

Tech Stack

React 18 + TypeScript (strict mode)
Vite for fast development and building
Ant Design v5 for UI components
Ant Design Icons (no emojis in UI)
D3 v7+ for all visualizations
Tailwind CSS v4 for styling
Zod for runtime schema validation

Project Structure

/public
  /config/benchmarks.config.json    # Configuration
  /examples/results_example.json    # Example data
/src
  /components                        # UI components
    ControlsPanel.tsx
    DownloadButtons.tsx
    FileUploader.tsx
  /charts                           # D3 visualizations
    CategoryDashboard.tsx
    CompactDashboard.tsx
  /features                         # Core logic (parse, transform, schema, etc.)
    csv.ts
    discover.ts
    parse.ts
    schema.ts
    transform.ts
    types.d.ts
  /pages                            # Page components
    Home.tsx
  /types                            # Type definitions
    ant-design-charts.d.ts
  App.tsx                           # Root component
  main.tsx                          # Entry point

License

This project is licensed under the MIT License. See the LICENSE file for details.
