@RedaRahmani

Fix: Prevent duplicate activation race condition (Issue #2257)

Problem

The activator has two independent triggers for entity activation:

  • websocket_task - real-time blockchain events
  • get_snapshot_poll - 60-second polling snapshot

Both can observe the same entity with status=Pending and enqueue it on the
same mpsc::channel(128) queue. The single-threaded processor then receives
both events one after the other, which causes:

  1. First event: allocates resources (tunnel_id, tunnel_net, dz_ip) and
    sends the ActivateUser TX successfully
  2. Second event: still carries the stale Pending status it had when it was
    queued, allocates a second set of resources, and sends a duplicate TX,
    which fails with InvalidStatus because the on-chain status is already
    Activated
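
The sequence above can be reproduced in miniature. The following is only a sketch under stated assumptions, not the activator's code: std::sync::mpsc::sync_channel stands in for the mpsc::channel(128) queue, two threads stand in for websocket_task and get_snapshot_poll, a HashMap stands in for on-chain state, and Pubkey, Status, Event, and run_race are hypothetical names introduced for illustration.

```rust
use std::collections::HashMap;
use std::sync::mpsc::sync_channel;
use std::thread;

// Hypothetical stand-ins; the real activator types are not shown in this PR text.
type Pubkey = [u8; 32];

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Pending,
    Activated,
}

#[derive(Debug, Clone, Copy)]
struct Event {
    pubkey: Pubkey,
    status: Status, // status as observed when the event was enqueued
}

/// Returns (successful activations, InvalidStatus failures).
fn run_race() -> (u32, u32) {
    // Bounded queue shared by both triggers (assumed stand-in for mpsc::channel(128)).
    let (tx, rx) = sync_channel::<Event>(128);
    let entity: Pubkey = [7u8; 32];

    // Two independent triggers observe the same Pending entity and enqueue it.
    let producers: Vec<_> = (0..2)
        .map(|_| {
            let tx = tx.clone();
            thread::spawn(move || {
                tx.send(Event { pubkey: entity, status: Status::Pending })
                    .unwrap();
            })
        })
        .collect();
    for p in producers {
        p.join().unwrap();
    }
    drop(tx); // close the channel so the consumer loop below terminates

    // Simulated on-chain state: the entity starts out Pending.
    let mut on_chain: HashMap<Pubkey, Status> = HashMap::new();
    on_chain.insert(entity, Status::Pending);

    let (mut activated, mut invalid_status) = (0, 0);
    // Single-threaded processor: handles queued events one at a time.
    for event in rx {
        if event.status == Status::Pending {
            // Resources would be allocated here, before the TX is sent.
            let current = on_chain.get_mut(&event.pubkey).unwrap();
            if *current == Status::Pending {
                *current = Status::Activated; // first TX succeeds
                activated += 1;
            } else {
                invalid_status += 1; // duplicate TX is rejected on-chain
            }
        }
    }
    (activated, invalid_status)
}
```

Calling run_race() yields one successful activation and one InvalidStatus failure, because the second queued event still says Pending even though the simulated on-chain status has already moved to Activated.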

Solution

Add an in-flight guard at the dispatcher level (Processor::process_event)
that tracks which pubkeys are currently being processed:

  • A HashSet<Pubkey> tracks in-flight entities
  • Before processing Pending/Updating states, check whether the pubkey is
    already in-flight
  • If a duplicate is detected: skip processing, emit a metric, and log at
    info level
  • Remove the pubkey from the set after processing completes
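
The guard's bookkeeping can be sketched roughly as follows. This is an illustration only: InFlightGuard, try_begin, finish, and the skipped counter are hypothetical names, Pubkey is a byte-array stand-in, and the point at which Processor::process_event actually inserts and removes entries is not shown in this description.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for the real Pubkey type.
type Pubkey = [u8; 32];

/// Tracks which entities are currently being processed.
#[derive(Default)]
struct InFlightGuard {
    in_flight: HashSet<Pubkey>,
    skipped: u64, // stand-in for the duplicate_event_skipped metric
}

impl InFlightGuard {
    /// Returns true if the caller may process `key`, false if it is a duplicate.
    fn try_begin(&mut self, key: Pubkey) -> bool {
        // HashSet::insert returns false when the value was already present.
        if self.in_flight.insert(key) {
            true
        } else {
            self.skipped += 1;
            false
        }
    }

    /// Marks processing of `key` as complete so later events are accepted again.
    fn finish(&mut self, key: &Pubkey) {
        self.in_flight.remove(key);
    }
}
```

A dispatcher would call try_begin before handling a Pending or Updating event, return early when it yields false, and call finish once handling is done. Relying on the return value of insert keeps the check and the registration in a single step.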

This prevents resource waste and duplicate on-chain calls while preserving the
existing InvalidStatus handling as a safety-net fallback.

Metrics Added

  • doublezero_activator_duplicate_event_skipped{entity_type=user|link} -
    counts events skipped by the in-flight guard (expected to increment
    during normal operation)
  • doublezero_activator_invalid_status_encountered{entity_type=user|link} -
    counts on-chain InvalidStatus errors (should now be rare; an increase
    indicates the guard missed something)

Testing

  • 4 new unit tests verifying guard logic
  • All 37 activator tests pass

@RedaRahmani changed the title from "activator: add in-flight dedupe guard for duplicate pending events (#…" to "Fix: Prevent duplicate activation race condition" on Jan 2, 2026