Indexer Migration Validator

A TypeScript CLI tool for validating data correctness when migrating from The Graph subgraphs to Envio HyperIndex indexers.

Overview

This tool queries both a subgraph endpoint and a HyperIndex endpoint, compares the returned data, and reports any differences. It is designed to help verify that a migrated indexer produces the same results as the original subgraph.

Features

- Schema-driven: Automatically generates entity configs from GraphQL schema files
- Sample Mode: Quick comparison using random samples (default)
- Deep Mode: Full comparison with pagination for thorough validation
- Field-level diffing: Shows exact field differences with percentage variance for numeric fields
- Progress tracking: Live progress updates during deep comparisons
- JSON reports: Detailed diff reports saved to files
- Flexible filtering: Compare specific entities or all entities

Installation

```
pnpm install
```

Quick Start

1. Set up your environment - Copy `.env.example` to `.env` and configure your endpoints:
   ```
   cp .env.example .env
   ```
2. Add your schema files - Place your GraphQL schema files in the project root:
   - `subgraph-schema.graphql` - Your subgraph's GraphQL schema
   - `hyperindex-schema.graphql` - Your HyperIndex GraphQL schema
3. Run the comparison:
   ```
   pnpm compare
   ```

Usage

Basic Commands

```
# Compare all entities with default sample size (50)
pnpm compare

# Compare specific entity
pnpm compare --entity Pool

# Increase sample size
pnpm compare --sample 200

# Deep comparison (fetch ALL records)
pnpm compare --deep --entity Pool

# Deep comparison with limit
pnpm compare --deep-limit 1000 --entity Token

# Skip JSON report generation
pnpm compare --no-json

# Show help
pnpm compare --help
```

Options

| Option | Description |
| --- | --- |
| `--entity <name>` | Compare only the specified entity |
| `--sample <n>` | Number of random samples per entity (default: 50) |
| `--deep` | Deep comparison: fetch ALL records using pagination |
| `--deep-limit <n>` | Deep comparison with max N records per entity |
| `--output <path>` | Custom output path for JSON report |
| `--no-json` | Skip JSON report generation |
| `--subgraph-schema <path>` | Path to subgraph schema file |
| `--hyperindex-schema <path>` | Path to HyperIndex schema file |
| `--generate-config` | Generate entity config from schemas and exit |
| `--help`, `-h` | Show help |

Configuration

Environment Variables

Create a .env file with your endpoints:

```
# GraphQL endpoints (required)
SUBGRAPH_URL=https://api.thegraph.com/subgraphs/name/your-subgraph
HYPERINDEX_URL=https://your-indexer.hyperindex.xyz/v1/graphql

# Schema file paths (optional, defaults shown)
SUBGRAPH_SCHEMA=./subgraph-schema.graphql
HYPERINDEX_SCHEMA=./hyperindex-schema.graphql
OVERRIDES_PATH=./overrides.json
```
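
To make the required and optional variables concrete, here is a minimal sketch of how such settings could be resolved. It is not the tool's actual code: the `ValidatorConfig` shape and the `loadConfig` function are hypothetical names introduced for illustration, and only the variable names and default paths come from the listing above.

```typescript
// Hypothetical config shape; field names are illustrative, not the tool's API.
interface ValidatorConfig {
  subgraphUrl: string;
  hyperindexUrl: string;
  subgraphSchema: string;
  hyperindexSchema: string;
  overridesPath: string;
}

type Env = Record<string, string | undefined>;

// Resolve settings from an environment-like object, failing fast on
// missing required endpoints and falling back to the documented defaults.
function loadConfig(env: Env): ValidatorConfig {
  const required = (name: string): string => {
    const value = env[name];
    if (!value) {
      throw new Error(`Missing required environment variable: ${name}`);
    }
    return value;
  };

  return {
    subgraphUrl: required("SUBGRAPH_URL"),
    hyperindexUrl: required("HYPERINDEX_URL"),
    // Empty strings are treated as unset, hence `||` rather than `??`.
    subgraphSchema: env["SUBGRAPH_SCHEMA"] || "./subgraph-schema.graphql",
    hyperindexSchema: env["HYPERINDEX_SCHEMA"] || "./hyperindex-schema.graphql",
    overridesPath: env["OVERRIDES_PATH"] || "./overrides.json",
  };
}
```

In a Node.js script, `process.env` could be passed as the argument once the `.env` file has been loaded.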

Overrides File

Create an overrides.json file to handle field mappings and known issues:

```
{
  "fieldMappings": {
    "EntityName": {
      "subgraphFieldName": "hyperindexFieldName"
    }
  },
  "knownIdMismatch": [
    "EntityWithDifferentIdFormat"
  ],
  "skipEntities": [
    "EntityToSkip"
  ]
}
```

See overrides.template.json for a blank template.
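
The following sketch shows one way the three override keys could be consumed. It is an assumption about how such a file might be applied, not a description of the tool's internals: `Overrides`, `applyFieldMappings`, `shouldSkip`, and `hasKnownIdMismatch` are hypothetical names, and the sketch assumes that subgraph field names are renamed to their HyperIndex equivalents before records are compared.

```typescript
// Mirrors the three keys of overrides.json; all of them are treated as optional here.
interface Overrides {
  fieldMappings?: Record<string, Record<string, string>>;
  knownIdMismatch?: string[];
  skipEntities?: string[];
}

// Rename subgraph fields to their HyperIndex names so that both records
// share the same keys before they are compared.
function applyFieldMappings(
  entity: string,
  subgraphRecord: Record<string, unknown>,
  overrides: Overrides,
): Record<string, unknown> {
  const mapping = overrides.fieldMappings?.[entity] ?? {};
  const renamed: Record<string, unknown> = {};
  for (const key of Object.keys(subgraphRecord)) {
    renamed[mapping[key] ?? key] = subgraphRecord[key];
  }
  return renamed;
}

// Entities listed under skipEntities are excluded from the comparison.
function shouldSkip(entity: string, overrides: Overrides): boolean {
  return overrides.skipEntities?.includes(entity) ?? false;
}

// Entities listed under knownIdMismatch cannot be matched by ID.
function hasKnownIdMismatch(entity: string, overrides: Overrides): boolean {
  return overrides.knownIdMismatch?.includes(entity) ?? false;
}
```

For example, with a mapping of `{ "Pool": { "oldName": "newName" } }` (placeholder field names), a subgraph `Pool` record `{ id: "0x1", oldName: "42" }` becomes `{ id: "0x1", newName: "42" }`.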

Key Differences Handled

The tool automatically handles common differences between subgraph and HyperIndex:

| Subgraph | HyperIndex |
| --- | --- |
| `first: N` | `limit: N` |
| `skip: N` | `offset: N` |
| `orderBy: field` | `order_by: {field: asc}` |
| `where: {id_in: [...]}` | `where: {id: {_in: [...]}}` |
| `entity { id }` (nested) | `entity_id` (flat) |
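
The mappings in the table can be sketched as a pair of argument builders plus a small normalizer for nested references. This is an illustrative sketch only: `ListArgs`, `subgraphArgs`, `hyperindexArgs`, and `flattenRelations` are hypothetical names, and the sketch assumes ascending order and string IDs.

```typescript
// Pagination and filter parameters shared by both sides (hypothetical shape).
interface ListArgs {
  first: number;
  skip: number;
  orderBy: string;
  ids?: string[];
}

// Build the argument list in subgraph syntax.
function subgraphArgs(a: ListArgs): string {
  const parts = [`first: ${a.first}`, `skip: ${a.skip}`, `orderBy: ${a.orderBy}`];
  if (a.ids) parts.push(`where: {id_in: ${JSON.stringify(a.ids)}}`);
  return parts.join(", ");
}

// Build the equivalent argument list in HyperIndex syntax.
function hyperindexArgs(a: ListArgs): string {
  const parts = [`limit: ${a.first}`, `offset: ${a.skip}`, `order_by: {${a.orderBy}: asc}`];
  if (a.ids) parts.push(`where: {id: {_in: ${JSON.stringify(a.ids)}}}`);
  return parts.join(", ");
}

// True for plain objects that carry an `id` property, e.g. { id: "0xabc" }.
function isIdObject(value: unknown): value is { id: unknown } {
  return typeof value === "object" && value !== null && !Array.isArray(value) && "id" in value;
}

// Turn nested references (`token { id }`) into the flat form (`token_id`)
// so that records from both sources can be compared key by key.
function flattenRelations(record: Record<string, unknown>): Record<string, unknown> {
  const flat: Record<string, unknown> = {};
  for (const key of Object.keys(record)) {
    const value = record[key];
    if (isIdObject(value)) {
      flat[`${key}_id`] = value.id;
    } else {
      flat[key] = value;
    }
  }
  return flat;
}
```

Keeping a single `ListArgs` description and rendering it twice means the same page of IDs is requested from both endpoints, so any difference in the results comes from the data rather than from the query.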

Output

Console Output

Shows colored diff output with:

- Entity summary (record counts, match/mismatch stats)
- Field-level differences with values from both sources
- Percentage differences for numeric fields
- Missing IDs in either direction (deep mode)
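
As a rough illustration of field-level diffing with percentage differences, consider the sketch below. It is not the tool's implementation: `diffRecords` and `FieldDiff` are hypothetical names, and measuring the percentage relative to the subgraph value is an assumption made for this example.

```typescript
// Hypothetical types for illustration; the real tool's internals may differ.
type Scalar = string | number | boolean | null;
type FlatRecord = Record<string, Scalar>;

interface FieldDiff {
  field: string;
  subgraph: Scalar | undefined;
  hyperindex: Scalar | undefined;
  percentDiff?: number; // set only when both values parse as finite numbers
}

// Accept numbers and numeric strings (large integers and decimals often
// arrive as strings). Very large values lose precision as JS numbers, so
// the percentage is approximate.
function toNumber(value: Scalar | undefined): number | undefined {
  if (typeof value === "number") return Number.isFinite(value) ? value : undefined;
  if (typeof value === "string" && value.trim() !== "") {
    const n = Number(value);
    return Number.isFinite(n) ? n : undefined;
  }
  return undefined;
}

// Compare two flat records field by field and report every mismatch.
function diffRecords(subgraph: FlatRecord, hyperindex: FlatRecord): FieldDiff[] {
  const fields = Array.from(
    new Set([...Object.keys(subgraph), ...Object.keys(hyperindex)]),
  );
  const diffs: FieldDiff[] = [];
  for (const field of fields) {
    const a = subgraph[field];
    const b = hyperindex[field];
    if (a === b) continue;
    const diff: FieldDiff = { field, subgraph: a, hyperindex: b };
    const x = toNumber(a);
    const y = toNumber(b);
    // Percentage relative to the subgraph value; skipped when that value is 0.
    if (x !== undefined && y !== undefined && x !== 0) {
      diff.percentDiff = (Math.abs(y - x) * 100) / Math.abs(x);
    }
    diffs.push(diff);
  }
  return diffs;
}
```

Given `{ id: "1", tvl: "100" }` from the subgraph and `{ id: "1", tvl: "110" }` from HyperIndex, this sketch reports a single differing field, `tvl`, with a difference of 10%.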

JSON Reports

Reports are saved to `output/comparison-YYYY-MM-DD_HH-MM-SS.json` (use `--output <path>` to choose a different location) and contain the full diff details:

```
{
  "timestamp": "2024-01-15T10:30:00.000Z",
  "summary": {
    "totalEntities": 14,
    "entitiesWithDifferences": 5,
    "totalMatched": 500,
    "totalMismatched": 100
  },
  "entities": [
    {
      "entityName": "Pool",
      "subgraphCount": 34136,
      "hyperindexCount": 34136,
      "matchedCount": 34136,
      "mismatchedCount": 0,
      "fieldMismatches": []
    }
  ]
}
```

Examples

The examples/ directory contains real-world migration examples:

- `examples/flaunch/` - Flaunch protocol migration with 66 entities

Each example includes:

- `subgraph-schema.graphql` - The original subgraph schema
- `hyperindex-schema.graphql` - The migrated HyperIndex schema
- `overrides.json` - Field mappings and known issues
- `README.md` - Migration-specific notes

Known Limitations

- Deep mode fetches at most 100,000 records per entity (a safety limit), so larger entities are not fully compared even with `--deep`
- Nested objects are compared by ID only
- Array fields are not yet fully supported
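
To show how a paginated deep fetch can respect such a cap, here is a minimal sketch. It is an assumption about the general approach rather than the tool's code: `fetchAll`, `FetchPage`, and the default page size of 1000 are hypothetical, and the page fetcher is injected so that the loop stays independent of any particular GraphQL client.

```typescript
// A page fetcher takes a page size and an offset and resolves to one page of records.
type FetchPage<T> = (limit: number, offset: number) => Promise<T[]>;

// Mirrors the documented deep-mode safety limit of 100k records per entity.
const SAFETY_LIMIT = 100_000;

// Fetch pages until the source is exhausted or the cap is reached.
// `maxRecords` plays the role of --deep-limit and can only lower the cap.
async function fetchAll<T>(
  fetchPage: FetchPage<T>,
  pageSize = 1000,
  maxRecords = SAFETY_LIMIT,
): Promise<T[]> {
  if (pageSize < 1) throw new Error("pageSize must be at least 1");
  const cap = Math.min(maxRecords, SAFETY_LIMIT);
  const records: T[] = [];
  while (records.length < cap) {
    // Never request more than the remaining budget.
    const limit = Math.min(pageSize, cap - records.length);
    const page = await fetchPage(limit, records.length);
    records.push(...page);
    // A short page means there is nothing left to fetch.
    if (page.length < limit) break;
  }
  return records;
}
```

Because the loop uses the number of records collected so far as the next offset, results need a stable sort order (for example, by `id`) on both endpoints for pages not to overlap or skip records.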

Troubleshooting

"No common IDs found"

Either:

- The entity has no data in one or both sources
- ID formats differ between systems (add the entity to `knownIdMismatch` in `overrides.json`)
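
When diagnosing this message, it can help to look at the ID sets directly. The sketch below shows one way to split two ID lists into common and missing IDs; `compareIds` and the `IdComparison` field names are hypothetical and not part of the tool or its report format.

```typescript
// Result of comparing the ID lists returned by the two sources (hypothetical shape).
interface IdComparison {
  common: string[];
  missingInHyperindex: string[]; // present in the subgraph only
  missingInSubgraph: string[]; // present in HyperIndex only
}

// Split two ID lists into shared IDs and IDs missing in either direction.
// IDs are compared as exact strings, so differences in casing or prefixes
// show up as missing IDs on both sides.
function compareIds(subgraphIds: string[], hyperindexIds: string[]): IdComparison {
  const sub = new Set(subgraphIds);
  const hyper = new Set(hyperindexIds);
  return {
    common: subgraphIds.filter((id) => hyper.has(id)),
    missingInHyperindex: subgraphIds.filter((id) => !hyper.has(id)),
    missingInSubgraph: hyperindexIds.filter((id) => !sub.has(id)),
  };
}
```

If `common` is empty while both input lists are non-empty, comparing a few entries from each missing list side by side usually reveals whether the ID formats differ.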

Timeout errors

- Reduce the sample size (`--sample`), or use `--deep-limit` to cap deep comparisons
- Check endpoint connectivity

Field not found errors

- Verify that field names match between the schema and the generated entity config
- Check for renamed fields and add them to `fieldMappings` in `overrides.json`

License

MIT
