enviodev/indexer-migration-validator

Indexer Migration Validator

A TypeScript CLI tool for validating data correctness when migrating from The Graph subgraphs to Envio HyperIndex indexers.

Overview

This tool queries both a subgraph and a HyperIndex endpoint, compares the returned data, and reports any differences. It's designed to help verify that a migrated indexer produces identical results to the original subgraph.

Features

  • Schema-driven: Automatically generates entity configs from GraphQL schema files
  • Sample Mode: Quick comparison using random samples (default)
  • Deep Mode: Full comparison with pagination for thorough validation
  • Field-level diffing: Shows exact field differences with percentage variance for numeric fields
  • Progress tracking: Live progress updates during deep comparisons
  • JSON reports: Detailed diff reports saved to files
  • Flexible filtering: Compare specific entities or all entities

Installation

pnpm install

Quick Start

  1. Set up your environment. Copy .env.example to .env and configure your endpoints:

    cp .env.example .env

  2. Add your schema files. Place your GraphQL schema files in the project root:

    • subgraph-schema.graphql - Your subgraph's GraphQL schema
    • hyperindex-schema.graphql - Your HyperIndex GraphQL schema

  3. Run the comparison:

    pnpm compare

Usage

Basic Commands

# Compare all entities with default sample size (50)
pnpm compare

# Compare specific entity
pnpm compare --entity Pool

# Increase sample size
pnpm compare --sample 200

# Deep comparison (fetch ALL records)
pnpm compare --deep --entity Pool

# Deep comparison with limit
pnpm compare --deep-limit 1000 --entity Token

# Skip JSON report generation
pnpm compare --no-json

# Show help
pnpm compare --help

Options

Option                        Description
--entity <name>               Compare only the specified entity
--sample <n>                  Number of random samples per entity (default: 50)
--deep                        Deep comparison: fetch ALL records using pagination
--deep-limit <n>              Deep comparison with max N records per entity
--output <path>               Custom output path for JSON report
--no-json                     Skip JSON report generation
--subgraph-schema <path>      Path to subgraph schema file
--hyperindex-schema <path>    Path to HyperIndex schema file
--generate-config             Generate entity config from schemas and exit
--help, -h                    Show help

Configuration

Environment Variables

Create a .env file with your endpoints:

# GraphQL endpoints (required)
SUBGRAPH_URL=https://api.thegraph.com/subgraphs/name/your-subgraph
HYPERINDEX_URL=https://your-indexer.hyperindex.xyz/v1/graphql

# Schema file paths (optional, defaults shown)
SUBGRAPH_SCHEMA=./subgraph-schema.graphql
HYPERINDEX_SCHEMA=./hyperindex-schema.graphql
OVERRIDES_PATH=./overrides.json

Overrides File

Create an overrides.json file to handle field mappings and known issues:

{
  "fieldMappings": {
    "EntityName": {
      "subgraphFieldName": "hyperindexFieldName"
    }
  },
  "knownIdMismatch": [
    "EntityWithDifferentIdFormat"
  ],
  "skipEntities": [
    "EntityToSkip"
  ]
}

See overrides.template.json for a blank template.
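
As a rough illustration of how these keys could be consumed, the sketch below models the overrides file as a TypeScript type and resolves a field name through fieldMappings. The type and function names (Overrides, mapField, shouldSkip) and the sample entity and field names are hypothetical and are not taken from the tool's source.

```typescript
// Hypothetical types mirroring the overrides.json keys shown above.
interface Overrides {
  fieldMappings?: Record<string, Record<string, string>>;
  knownIdMismatch?: string[];
  skipEntities?: string[];
}

// Resolve the HyperIndex field name for a subgraph field.
// Falls back to the same name when no mapping is configured.
function mapField(overrides: Overrides, entity: string, subgraphField: string): string {
  return overrides.fieldMappings?.[entity]?.[subgraphField] ?? subgraphField;
}

// True when the entity is listed under skipEntities.
function shouldSkip(overrides: Overrides, entity: string): boolean {
  return overrides.skipEntities?.includes(entity) ?? false;
}

// Made-up sample data, for illustration only.
const sampleOverrides: Overrides = {
  fieldMappings: { Pool: { totalValueLockedUSD: "tvlUsd" } },
  skipEntities: ["Bundle"],
};

console.log(mapField(sampleOverrides, "Pool", "totalValueLockedUSD"));
console.log(mapField(sampleOverrides, "Pool", "id"));
console.log(shouldSkip(sampleOverrides, "Bundle"));
```

Falling back to the original field name keeps the overrides file small: only fields that were actually renamed need an entry.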

Key Differences Handled

The tool automatically handles common differences between subgraph and HyperIndex:

Subgraph                    HyperIndex
first: N                    limit: N
skip: N                     offset: N
orderBy: field              order_by: {field: asc}
where: {id_in: [...]}       where: {id: {_in: [...]}}
entity { id } (nested)      entity_id (flat)
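
The mapping in this table can be sketched as a small translation function. This is only an illustration of the argument renames listed above, not the tool's actual implementation; SubgraphArgs, HyperIndexArgs, toHyperIndexArgs and flattenRelation are hypothetical names.

```typescript
// Subgraph-style query arguments, following the left column of the table.
interface SubgraphArgs {
  first?: number;
  skip?: number;
  orderBy?: string;
  idIn?: string[]; // stands in for where: { id_in: [...] }
}

// HyperIndex-style equivalents, following the right column of the table.
interface HyperIndexArgs {
  limit?: number;
  offset?: number;
  order_by?: Record<string, "asc" | "desc">;
  where?: { id: { _in: string[] } };
}

// Translate each subgraph argument that is present into its HyperIndex form.
function toHyperIndexArgs(args: SubgraphArgs): HyperIndexArgs {
  const out: HyperIndexArgs = {};
  if (args.first !== undefined) out.limit = args.first;
  if (args.skip !== undefined) out.offset = args.skip;
  if (args.orderBy !== undefined) out.order_by = { [args.orderBy]: "asc" };
  if (args.idIn !== undefined) out.where = { id: { _in: args.idIn } };
  return out;
}

// A nested relation `entity { id }` on the subgraph side corresponds to a
// flat `entity_id` field on the HyperIndex side.
function flattenRelation(field: string): string {
  return `${field}_id`;
}

console.log(JSON.stringify(toHyperIndexArgs({ first: 50, skip: 100, orderBy: "id" })));
console.log(flattenRelation("pool"));
```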

Output

Console Output

Shows colored diff output with:

  • Entity summary (record counts, match/mismatch stats)
  • Field-level differences with values from both sources
  • Percentage differences for numeric fields
  • Missing IDs in either direction (deep mode)
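
To make the field-level diff and percentage difference concrete, here is a minimal sketch. It assumes field values arrive as strings and that the percentage is computed relative to the subgraph value; both are assumptions, and percentDiff, diffRecord and FieldMismatch are hypothetical names rather than the tool's API. Converting with Number loses precision for integers above 2^53, so very large token amounts would need BigInt or a decimal library.

```typescript
// Hypothetical helpers; the tool's actual diffing code is not shown in this README.
interface FieldMismatch {
  field: string;
  subgraph: string;
  hyperindex: string;
  percentDiff: number | null; // null for non-numeric values or a zero baseline
}

// Percentage difference of the HyperIndex value relative to the subgraph value.
function percentDiff(subgraphValue: string, hyperindexValue: string): number | null {
  if (subgraphValue.trim() === "" || hyperindexValue.trim() === "") return null;
  const a = Number(subgraphValue);
  const b = Number(hyperindexValue);
  if (!Number.isFinite(a) || !Number.isFinite(b) || a === 0) return null;
  return ((b - a) * 100) / Math.abs(a);
}

// Compare the fields present in both records and collect the ones that differ.
function diffRecord(sub: Record<string, string>, hyper: Record<string, string>): FieldMismatch[] {
  const mismatches: FieldMismatch[] = [];
  for (const field of Object.keys(sub)) {
    const s = sub[field];
    const h = hyper[field];
    if (s === undefined || h === undefined || s === h) continue;
    mismatches.push({ field, subgraph: s, hyperindex: h, percentDiff: percentDiff(s, h) });
  }
  return mismatches;
}

console.log(diffRecord({ id: "1", liquidity: "100" }, { id: "1", liquidity: "110" }));
```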

JSON Reports

Saved to output/comparison-YYYY-MM-DD_HH-MM-SS.json with full diff details:

{
  "timestamp": "2024-01-15T10:30:00.000Z",
  "summary": {
    "totalEntities": 14,
    "entitiesWithDifferences": 5,
    "totalMatched": 500,
    "totalMismatched": 100
  },
  "entities": [
    {
      "entityName": "Pool",
      "subgraphCount": 34136,
      "hyperindexCount": 34136,
      "matchedCount": 34136,
      "mismatchedCount": 0,
      "fieldMismatches": []
    }
  ]
}
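
A report in this shape lends itself to automated checks, for example failing a CI job when any entity differs. The sketch below types only the fields visible in the sample above (the element shape of fieldMismatches is not shown, so it is left as unknown); ComparisonReport, EntityReport and failingEntities are hypothetical names.

```typescript
// Types inferred from the sample report above; not taken from the tool's source.
interface EntityReport {
  entityName: string;
  subgraphCount: number;
  hyperindexCount: number;
  matchedCount: number;
  mismatchedCount: number;
  fieldMismatches: unknown[]; // element shape is not shown in the sample
}

interface ComparisonReport {
  timestamp: string;
  summary: {
    totalEntities: number;
    entitiesWithDifferences: number;
    totalMatched: number;
    totalMismatched: number;
  };
  entities: EntityReport[];
}

// Names of entities with mismatched records or differing record counts.
function failingEntities(report: ComparisonReport): string[] {
  return report.entities
    .filter((e) => e.mismatchedCount > 0 || e.subgraphCount !== e.hyperindexCount)
    .map((e) => e.entityName);
}

// In practice the report would come from JSON.parse on the saved file.
const sampleReport: ComparisonReport = JSON.parse(`{
  "timestamp": "2024-01-15T10:30:00.000Z",
  "summary": { "totalEntities": 1, "entitiesWithDifferences": 0, "totalMatched": 34136, "totalMismatched": 0 },
  "entities": [
    { "entityName": "Pool", "subgraphCount": 34136, "hyperindexCount": 34136,
      "matchedCount": 34136, "mismatchedCount": 0, "fieldMismatches": [] }
  ]
}`);

console.log(failingEntities(sampleReport));
```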

Examples

The examples/ directory contains real-world migration examples:

  • examples/flaunch/ - Flaunch protocol migration with 66 entities

Each example includes:

  • subgraph-schema.graphql - The original subgraph schema
  • hyperindex-schema.graphql - The migrated HyperIndex schema
  • overrides.json - Field mappings and known issues
  • README.md - Migration-specific notes

Known Limitations

  • Deep mode fetches at most 100k records per entity (safety limit), so --deep may not cover every record of a larger entity
  • Nested objects are compared by ID only
  • Array fields are not fully supported yet
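
The deep-mode safety limit can be pictured as a capped pagination loop. The sketch below is an assumption about how such a loop might look, not the tool's code: fetchAll and FetchPage are hypothetical names, and the page fetcher is injected so the loop stays independent of any endpoint. It uses first/skip-style offsets for simplicity; some endpoints cap the offset they accept, in which case a cursor on id is a common alternative.

```typescript
// A page fetcher takes a page size and an offset and resolves to the records
// in that window. How it talks to the endpoint is left to the caller.
type FetchPage<T> = (first: number, skip: number) => Promise<T[]>;

// Fetch pages until the source is exhausted or maxRecords is reached.
async function fetchAll<T>(
  fetchPage: FetchPage<T>,
  pageSize = 1000,
  maxRecords = 100_000, // mirrors the 100k safety limit described above
): Promise<T[]> {
  const records: T[] = [];
  while (records.length < maxRecords) {
    const first = Math.min(pageSize, maxRecords - records.length);
    const page = await fetchPage(first, records.length);
    records.push(...page);
    if (page.length < first) break; // a short page means no more data
  }
  return records;
}

// In-memory stand-in for an endpoint holding 25 records.
const fakeSource = Array.from({ length: 25 }, (_, i) => ({ id: String(i) }));
const fakeFetch: FetchPage<{ id: string }> = async (first, skip) =>
  fakeSource.slice(skip, skip + first);

fetchAll(fakeFetch, 10).then((all) => console.log(all.length));
```

Passing a smaller maxRecords gives the same behavior as a per-entity record limit: the loop shrinks the last page so the cap is never exceeded.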

Troubleshooting

"No common IDs found"

Either:

  • The entity has no data in one or both sources
  • ID formats differ between systems (add the entity to knownIdMismatch in overrides.json)
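
When diagnosing this error, it can help to compare the two ID sets directly. The sketch below computes the common IDs and the IDs missing in each direction, with an optional normalizer. Lowercasing is shown purely as one illustrative way formats might differ, not as a known difference between the two systems; compareIds and IdComparison are hypothetical names.

```typescript
interface IdComparison {
  common: string[];
  missingInHyperindex: string[];
  missingInSubgraph: string[];
}

// Compare two ID lists after applying an optional normalizer to every ID.
function compareIds(
  subgraphIds: string[],
  hyperindexIds: string[],
  normalize: (id: string) => string = (id) => id,
): IdComparison {
  const sub = new Set(subgraphIds.map(normalize));
  const hyper = new Set(hyperindexIds.map(normalize));
  return {
    common: Array.from(sub).filter((id) => hyper.has(id)),
    missingInHyperindex: Array.from(sub).filter((id) => !hyper.has(id)),
    missingInSubgraph: Array.from(hyper).filter((id) => !sub.has(id)),
  };
}

// Made-up IDs that differ only by letter case.
const raw = compareIds(["0xAB", "0xcd"], ["0xab", "0xcd"]);
const lowered = compareIds(["0xAB", "0xcd"], ["0xab", "0xcd"], (id) => id.toLowerCase());
console.log(raw.common.length, lowered.common.length);
```

If the common set grows once IDs are normalized, the mismatch is a formatting issue rather than missing data.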

Timeout errors

  • Reduce sample size or use --deep-limit
  • Check endpoint connectivity

Field not found errors

  • Verify that field names match between the schema and the generated entity config
  • Check for renamed fields and add them to fieldMappings in overrides.json

License

MIT
