File: .claude/skills/add-indexers/SKILL.md (new file, +145 lines)

---
name: add-indexers
description: "Add extra indexers to the local Graph protocol network. Use when the user asks to add indexers, spin up another indexer, get more indexers up, bring up new indexers, or wants extra indexers for testing. Also trigger when user says a number followed by 'indexers' (e.g. 'add 3 indexers', 'spin up 2 more')."
argument-hint: "[count]"
allowed-tools:
- Bash
- Read
- Grep
---

# Add Extra Indexers

Add N extra indexers to the running local network. Each extra indexer gets a fully isolated stack: postgres, graph-node, indexer-agent, indexer-service, and tap-agent. Protocol subgraphs (network, epoch, TAP) are read from the primary graph-node -- extra graph-nodes only handle actual indexing work.

The argument is the number of NEW indexers to add (defaults to 1).

## Accounts

Extra indexers use hardhat "junk" mnemonic accounts starting at index 2, with a maximum of 18 extras (indices 2-19).

Each indexer gets a unique operator derived from a mnemonic of the form `test test test ... test {bip39_word}` (11 "test" + 1 valid checksum word). The generator handles mnemonic validation, operator address derivation, ETH funding, on-chain `setOperator` authorization for both SubgraphService and HorizonStaking, and PaymentsEscrow deposits for DIPs signer validation.

| Suffix | Mnemonic Index | Address |
|--------|---------------|---------|
| 2 | 2 | 0x3C44CdDdB6a900fa2b585dd299e03d12FA4293BC |
| 3 | 3 | 0x90F79bf6EB2c4f870365E785982E1f101E93b906 |
| 4 | 4 | 0x15d34AAf54267DB7D7c367839AAf71A00a2C6A65 |
| 5 | 5 | 0x9965507D1a55bcC2695C58ba16FB37d819B0A4dc |

## Steps

### 1. Determine current extra indexer count

```bash
docker ps --format '{{.Names}}' | grep 'indexer-agent-' | sed 's/indexer-agent-//' | sort -n | tail -1
```

If no matches, current extra count is 0. Otherwise the highest suffix minus 1 gives the count (suffix 2 = 1 extra, suffix 3 = 2 extras, etc.).
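The suffix-to-count rule can be sketched in Python (an illustrative helper, not part of the repo's scripts):

```python
def extra_count(highest_suffix):
    """Map the highest indexer-agent suffix to the number of extras.

    Suffixes start at 2 (suffix 2 = 1 extra, suffix 3 = 2 extras, ...);
    None means grep found no extra indexers running.
    """
    return 0 if highest_suffix is None else highest_suffix - 1
```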

### 2. Calculate new total

New total = current extra count + number requested by user.

Cap at 18. If the user asks for more than available slots, warn and cap.
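The capping logic amounts to (hypothetical helper for illustration):

```python
MAX_EXTRAS = 18  # mnemonic indices 2-19

def new_total(current_extras, requested):
    """Compute the new extra-indexer total, capping at the 18-slot limit."""
    total = current_extras + requested
    if total > MAX_EXTRAS:
        # Warn the user before capping
        print(f"warning: only {MAX_EXTRAS - current_extras} slots left; capping at {MAX_EXTRAS}")
    return min(total, MAX_EXTRAS)
```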

### 3. Regenerate compose file

```bash
python3 scripts/gen-extra-indexers.py <NEW_TOTAL>
```

This regenerates the full compose file for ALL extras (existing + new). It's idempotent -- running it with the same number produces the same file.

### 4. Bring up new containers

Two-step process to avoid bouncing shared services.

First, run `start-indexing-extra` to register new indexers on-chain (stake, operator auth, escrow deposits):

```bash
DOCKER_DEFAULT_PLATFORM= docker compose \
-f docker-compose.yaml \
-f compose/dev/dips.yaml \
-f compose/extra-indexers.yaml \
run --rm start-indexing-extra
```

Then start all new containers in a single command with `--no-deps --no-recreate`. List all new service names space-separated:

```bash
DOCKER_DEFAULT_PLATFORM= docker compose \
-f docker-compose.yaml \
-f compose/dev/dips.yaml \
-f compose/extra-indexers.yaml \
up -d --no-deps --no-recreate postgres-2 graph-node-2 indexer-agent-2 indexer-service-2 tap-agent-2 [... all suffixes ...]
```

`--no-deps` prevents compose from walking the dependency tree and bouncing shared services. `--no-recreate` prevents touching already-running containers.

### 5. Verify container health

Indexer-services share a `flock`-serialized cargo build, so they come up sequentially. The first service to start builds the binary (~2-3 minutes if not cached); subsequent services acquire the lock, find the binary already built, and start immediately.
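The serialization pattern can be sketched in Python with `fcntl.flock` (a toy stand-in: the real services wrap a cargo build with the `flock` CLI, here a sentinel file plays the binary):

```python
import fcntl
import os

def build_once(lock_path="/tmp/demo-build.lock", bin_path="/tmp/demo-binary"):
    """Sketch of the flock-serialized build.

    The first caller to acquire the lock performs the "build" (creates
    the binary); later callers block on the lock, find the binary already
    present, and skip straight to startup. Returns True if this caller
    did the build.
    """
    with open(lock_path, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # blocks until the lock is free
        already_built = os.path.exists(bin_path)
        if not already_built:
            open(bin_path, "w").close()  # stand-in for the cargo build
        return not already_built
    # lock is released when the file handle closes
```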

Wait 30 seconds after `up -d` completes, then check status:

```bash
docker ps --format '{{.Names}}\t{{.Status}}' | grep -E '(indexer-agent|indexer-service)-[0-9]' | sort
```

All agents and services should show `(healthy)`. If a service is still `(health: starting)`, it may be waiting for the cargo build lock -- wait another 60 seconds and recheck.

### 6. Wait for network subgraph to index URL registrations

After agents start, they call `subgraphService.register(url, geo)` on-chain. The network subgraph must index these events before IISA or dipper can see the new indexers. Poll until all indexers have URLs:

```bash
curl -s -X POST -H "Content-Type: application/json" \
-d '{"query":"{ indexers(where: { url_not: \"\" }) { id } }"}' \
http://localhost:8000/subgraphs/name/graph-network \
| python3 -c "import json,sys; print(len(json.load(sys.stdin)['data']['indexers']))"
```

This should return `TOTAL_EXPECTED` (1 primary + N extras). If it's lower, the subgraph is still catching up -- wait 10 seconds and recheck. Typically takes 30-90 seconds after agents register.
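The poll-until-ready loop can be written generically; `fetch_count` below is a stand-in for any wrapper around the curl query above:

```python
import time

def wait_for_count(fetch_count, expected, interval=10, timeout=120):
    """Poll fetch_count() until it reaches `expected` or the timeout expires.

    fetch_count is a zero-argument callable returning the current number
    of indexers with registered URLs. Returns True on success, False on
    timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if fetch_count() >= expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```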

### 7. Trigger IISA score refresh

The IISA cronjob exposes `POST /run` on port 9090 for manual scoring runs. Without triggering it, IISA won't see the new indexers until the next scheduled cycle (default 120s).

```bash
DOCKER_DEFAULT_PLATFORM= docker compose \
-f docker-compose.yaml \
-f compose/dev/dips.yaml \
-f compose/extra-indexers.yaml \
exec iisa-cronjob curl -s -X POST http://localhost:9090/run
```

Then verify scores were written for the expected number of indexers:

```bash
DOCKER_DEFAULT_PLATFORM= docker compose \
-f docker-compose.yaml \
-f compose/dev/dips.yaml \
-f compose/extra-indexers.yaml \
logs iisa-cronjob --since 30s 2>&1 | grep -E "Wrote|indexers"
```

### 8. Report

Show a summary including:
- All running indexers (primary + extras) with container names, addresses, and health status
- Number of indexers visible in the network subgraph (with URLs)
- Number of indexers scored by IISA
- Confirmation that the pipeline is ready for `/send-indexing-request`

## Constraints

- Always prefix docker compose with `DOCKER_DEFAULT_PLATFORM=`
- Always use all three compose files: `-f docker-compose.yaml -f compose/dev/dips.yaml -f compose/extra-indexers.yaml`
- Never use `--force-recreate` when adding indexers to a running stack
- The generator script is at `scripts/gen-extra-indexers.py`
- The `start-indexing-extra` container handles on-chain GRT staking, operator authorization, and PaymentsEscrow deposits
- Agents poll for on-chain staking automatically (up to 450s), so `start-indexing-extra` can run in parallel with container startup
- Agents retry automatically (30 attempts, 10s delay) -- don't manually restart unless the error is persistent and non-transient
- If COMPOSE_FILE in .environment doesn't include `compose/extra-indexers.yaml`, warn the user to add it
- The `/fresh-deploy` skill must include `compose/extra-indexers.yaml` in its `down -v` command, otherwise extra indexer postgres volumes survive and agents have stale state on the next deploy

File: .claude/skills/deploy-test-subgraphs/SKILL.md (new file, +15 lines)

---
name: deploy-test-subgraphs
description: Publish test subgraphs to GNS on the local network. Use when the user asks to "deploy subgraphs", "add subgraphs", "deploy 50 subgraphs", "create test subgraphs", or wants to populate the network with subgraphs for testing. Also trigger when the user says a number followed by "subgraphs" (e.g. "deploy 500 subgraphs").
argument-hint: "[count] [prefix]"
---

Run `python3 scripts/deploy-test-subgraph.py <count> [prefix]` from the local-network repo root.

- `count` defaults to 1 if the user doesn't specify a number
- `prefix` defaults to `test-subgraph` -- each subgraph is named `<prefix>-1`, `<prefix>-2`, etc.
- Subgraphs are published to GNS on-chain only -- they are NOT deployed to graph-node and will not be indexed
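The naming convention above amounts to (illustrative only, not the script's actual code):

```python
def subgraph_names(count, prefix="test-subgraph"):
    """Names the script publishes: <prefix>-1 through <prefix>-<count>."""
    return [f"{prefix}-{i}" for i in range(1, count + 1)]
```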

The script builds once (~10s), then each publish is sub-second. 100 subgraphs takes ~30s total.

After publishing, run `python3 scripts/network-status.py` and output the result in a code block so the user can see the updated network state.

File: .claude/skills/fresh-deploy/SKILL.md (new file, +165 lines)

---
name: fresh-deploy
description: Full stack reset and fresh deploy of the local-network Docker Compose environment. Use when the user asks to tear down and redeploy, do a fresh deploy, reset the stack, or bring everything up from scratch. Also use after merging PRs that change container code, or when debugging stuck state.
---

# Fresh Deploy

Reset the local-network Docker Compose environment to a clean state and bring all services up ready for DIPs testing.

## Prerequisites

The contracts repo at `$CONTRACTS_SOURCE_ROOT` (typically `/Users/samuel/Documents/github/contracts`) must be on the `fix/horizon-staking-ignition-dependency` branch (or `mde/dips-ignition-deployment` plus the BUG-007 fix). This branch includes `IndexingAgreementManager`, native RecurringCollector support in toolshed/ignition, and the HorizonStaking deployment ordering fix.

After checking out the branch, the toolshed package must be compiled: `cd packages/toolshed && pnpm build:self`.

To verify: `cd $CONTRACTS_SOURCE_ROOT && git log --oneline -3` should show the HorizonStaking fix on top of the mde branch.

## Steps

### 1. Tear down everything including volumes

Build the compose file list dynamically to include extra-indexers if present. This is critical -- omitting `compose/extra-indexers.yaml` leaves extra indexer containers and their postgres volumes alive, causing stale state on the next deploy (agents think they're registered on the old chain).

```bash
COMPOSE_FILES="-f docker-compose.yaml -f compose/dev/dips.yaml"
[ -f compose/extra-indexers.yaml ] && COMPOSE_FILES="$COMPOSE_FILES -f compose/extra-indexers.yaml"
DOCKER_DEFAULT_PLATFORM= docker compose $COMPOSE_FILES down -v
```

This destroys all data: chain state, postgres (including extra indexer postgres volumes), subgraph deployments, config volume with contract addresses.

### 2. Clear stale Ignition journals

If a previous deployment failed (especially `graph-contracts`), the Hardhat Ignition journal at `$CONTRACTS_SOURCE_ROOT/packages/subgraph-service/ignition/deployments/chain-1337/` will contain partial state that prevents a clean redeploy. Delete it:

```bash
rm -rf $CONTRACTS_SOURCE_ROOT/packages/subgraph-service/ignition/deployments/chain-1337
```

This is safe after a `down -v` since the chain state it references no longer exists.

### 3. Bring everything up

Use only the base compose files for the initial deploy. Extra indexers are added separately via the `/add-indexers` skill after the core stack is healthy.

```bash
DOCKER_DEFAULT_PLATFORM= docker compose -f docker-compose.yaml -f compose/dev/dips.yaml up -d --build
```

The `--build` flag ensures any changes to `run.sh` scripts or Dockerfiles are picked up (e.g. chain's `--block-time` flag, config changes baked into images). Without it, Docker reuses cached images and local changes are silently ignored.

Wait for containers to stabilize. The `graph-contracts` container runs first (deploys all Solidity contracts and writes addresses to the config volume), then `subgraph-deploy` deploys three subgraphs (network, TAP, block-oracle). Other services start as their health check dependencies are met.

**Note:** The initial `up -d` may exit with an error if `start-indexing` fails. This is expected -- see step 5. If `graph-contracts` itself fails, check its logs -- the most likely cause is a missing prerequisite commit (see Prerequisites) or a stale Ignition journal (see step 2).

### 4. Verify RecurringCollector was written to horizon.json

```bash
DOCKER_DEFAULT_PLATFORM= docker compose -f docker-compose.yaml -f compose/dev/dips.yaml exec indexer-agent \
jq '.["1337"].RecurringCollector' /opt/config/horizon.json
```

If this returns null, the contracts toolshed wasn't rebuilt. Run `cd $CONTRACTS_SOURCE_ROOT/packages/toolshed && pnpm build:self` and repeat from step 1.

### 5. Fix nonce race failures

Multiple containers use ACCOUNT0 concurrently after `graph-contracts` finishes (`start-indexing`, `tap-escrow-manager`). This causes "nonce too low" errors that can fail either container. The cascade is the real problem: if `start-indexing` fails, `dipper` and `ready` never start because they depend on it.
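The race can be illustrated with a toy model (hypothetical, not the actual client code): both containers read the same pending nonce for ACCOUNT0, the first submission lands, and the second is rejected.

```python
class ToyChain:
    """Minimal account-nonce model: a tx must carry the next expected nonce."""

    def __init__(self):
        self.expected_nonce = 0

    def pending_nonce(self):
        return self.expected_nonce

    def submit(self, nonce):
        if nonce < self.expected_nonce:
            raise ValueError("nonce too low")
        self.expected_nonce = nonce + 1

chain = ToyChain()
# start-indexing and tap-escrow-manager both read ACCOUNT0's nonce...
n1 = chain.pending_nonce()
n2 = chain.pending_nonce()
chain.submit(n1)       # first tx lands; expected nonce advances
try:
    chain.submit(n2)   # second tx reuses the now-stale nonce
except ValueError as e:
    print(e)           # a restarted container re-reads a fresh nonce instead
```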

Check whether `start-indexing` exited successfully:

```bash
DOCKER_DEFAULT_PLATFORM= docker compose -f docker-compose.yaml -f compose/dev/dips.yaml ps -a start-indexing --format '{{.Status}}'
```

If it shows `Exited (1)`, restart it:

```bash
DOCKER_DEFAULT_PLATFORM= docker compose -f docker-compose.yaml -f compose/dev/dips.yaml start start-indexing
```

Always restart `tap-escrow-manager` regardless of whether `start-indexing` succeeded. Even when authorization succeeds, the deposit step can hit "nonce too low" from competing with `start-indexing`. The `AlreadyAuthorized` error on restart is harmless -- it re-runs the deposit with a fresh nonce.

```bash
DOCKER_DEFAULT_PLATFORM= docker compose -f docker-compose.yaml -f compose/dev/dips.yaml restart tap-escrow-manager
```

### 6. Bring up any cascade-failed containers

If `start-indexing` failed on the initial `up -d`, containers that depend on it (`dipper`, `ready`) will be stuck in `Created` state. Run `up -d` again to catch them:

```bash
DOCKER_DEFAULT_PLATFORM= docker compose -f docker-compose.yaml -f compose/dev/dips.yaml up -d --build
```

This is idempotent -- already-running containers are left alone.

### 7. Verify signer authorization

```bash
DOCKER_DEFAULT_PLATFORM= docker compose -f docker-compose.yaml -f compose/dev/dips.yaml logs tap-escrow-manager --since 60s 2>&1 | grep -i "authorized"
```

Expected: either `authorized signer=0x70997970C51812dc3A010C7d01b50e0d17dc79C8` (fresh auth) or `AuthorizableSignerAlreadyAuthorized` (already done on first run). Both are fine.

### 8. Wait for TAP subgraph indexing, then verify dipper

The TAP subgraph needs to index the `SignerAuthorized` event before the indexer-service will accept paid queries. Dipper may restart once or twice with "bad indexers: BadResponse(402)" during this window -- this is normal and self-resolves.

Check:

```bash
DOCKER_DEFAULT_PLATFORM= docker compose -f docker-compose.yaml -f compose/dev/dips.yaml ps dipper --format '{{.Name}} {{.Status}}'
```

Should show `dipper Up ... (healthy)`. If still restarting after 60 seconds, check gateway logs for persistent 402s.

### 9. Full status check

```bash
DOCKER_DEFAULT_PLATFORM= docker compose -f docker-compose.yaml -f compose/dev/dips.yaml ps --format '{{.Name}} {{.Status}}' | sort
```

All services should be Up. The key health-checked services are: chain, graph-node, postgres, ipfs, redpanda, indexer-agent, indexer-service, gateway, iisa-scoring, iisa, block-oracle, dipper.

## Architecture notes

The authorization chain that makes gateway queries work:

1. `graph-contracts` deploys all contracts, writes addresses to config volume (`horizon.json`, `tap-contracts.json`)
2. `subgraph-deploy` deploys the TAP subgraph pointing at the Horizon PaymentsEscrow address (from `horizon.json`)
3. `tap-escrow-manager` authorizes ACCOUNT1 (gateway signer) on the PaymentsEscrow contract
4. The TAP subgraph indexes the `SignerAuthorized` event
5. `indexer-service` queries the TAP subgraph, sees ACCOUNT1 is authorized for ACCOUNT0 (the payer)
6. Gateway queries signed by ACCOUNT1 are accepted with 200 instead of 402

## Known issues

- **ACCOUNT0 nonce race**: `start-indexing` and `tap-escrow-manager` both use ACCOUNT0 concurrently after `graph-contracts` finishes. Either can fail with "nonce too low". If `start-indexing` fails, `dipper` and `ready` never start (cascade). The fix is to restart the failed container and run `up -d` again.
- **Stale Ignition journals**: After a failed `graph-contracts` deployment, the journal at `packages/subgraph-service/ignition/deployments/chain-1337/` contains partial state. A fresh `down -v` destroys the chain but not the journal (it's in the mounted source). Always delete it before retrying (step 2).
- The contracts toolshed must be compiled (JS, not just TS) for the RecurringCollector whitelist to take effect. Use `pnpm build:self` in `packages/toolshed` (not `pnpm build` which fails on the `interfaces` package).
- **Extra indexer stale state**: If `compose/extra-indexers.yaml` is not included in the `down -v` command, extra indexer containers and their postgres volumes survive the teardown. On the next deploy, agents have stale state from the old chain -- they believe they're already registered and never re-register URLs on the new chain. The network subgraph then shows `url: null` for these indexers and IISA can't select them.

## Key contract addresses (change each deploy)

Read from the config volume:

```bash
# All Horizon contracts
docker compose exec indexer-agent cat /opt/config/horizon.json | jq '.["1337"]'

# TAP contracts
docker compose exec indexer-agent cat /opt/config/tap-contracts.json

# Important ones for manual testing:
# GRT Token: jq '.["1337"].L2GraphToken.address' horizon.json
# PaymentsEscrow: jq '.["1337"].PaymentsEscrow.address' horizon.json
# RecurringCollector: jq '.["1337"].RecurringCollector.address' horizon.json
# GraphTallyCollector: jq '.["1337"].GraphTallyCollector.address' horizon.json
```

## Accounts

- ACCOUNT0 (`0xf39Fd6e51aad88F6F4ce6aB8827279cffFb92266`): deployer, admin, payer
- ACCOUNT1 (`0x70997970C51812dc3A010C7d01b50e0d17dc79C8`): gateway signer
- RECEIVER (`0xf4EF6650E48d099a4972ea5B414daB86e1998Bd3`): indexer (mnemonic index 0 of "test...zero")

File: .claude/skills/network-status/SKILL.md (new file, +8 lines)

---
name: network-status
description: Show the current state of the local Graph protocol network. Use when the user asks for "network status", "show me the network", "what's deployed", "which indexers", "which subgraphs", "what's running", or wants to see allocations, sync status, or the network tree.
---

Run `python3 scripts/network-status.py` from the local-network repo root to fetch the current network state.

Output the FULL result directly as text in a code block so it renders inline without the user needing to expand tool results. Do NOT truncate, summarize, or abbreviate any part of the output -- show every line including all deployment hashes.