fix: deflake //rs/tests/idx:ii_delegation_test by basvandijk · Pull Request #9305 · dfinity/ic

basvandijk · 2026-03-10T23:26:27Z

//rs/tests/idx:ii_delegation_test flaked twice in the last week in the same way:

$ bazel run //ci/githubstats:query -- last --flaky //rs/tests/idx:ii_delegation_test --columns=last_started_at,errors
...
╒════╤═════════════════════════╤════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╕
│    │   last started at (UTC) │ errors per attempt                                                                                                 │
╞════╪═════════════════════════╪════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│  0 │ Fri 2026-03-06 16:47:11 │ 1: test: Minimum expected counter value 1 on counter canister was not observed after 10 r...                       │
├────┼─────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
...
├────┼─────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│  2 │ Wed 2026-03-04 16:43:08 │ 1: test: Minimum expected counter value 1 on counter canister was not observed after 10 r...                       │
╘════╧═════════════════════════╧════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╛

Root Cause

The test was flaky because the update call to increment the counter canister used update() which fires the HTTP request but does NOT wait for the update to be processed. The test then immediately tried to read the counter value, but since the update hadn't been executed yet, the counter was still 0. The retry loop (10 retries, 1 second each) sometimes wasn't enough time for the update to be processed on the application subnet.

Fix

Use update_and_wait() instead of update() to properly wait for the update to be processed before checking the counter value. This also lets us directly verify the counter value from the delegated query response, removing the need for the separate retry-based assertion (assert_canister_counter_with_retries).

Note that the test already used update_and_wait() for the universal canister call later in the test (line 164), making this fix consistent with the existing pattern.

We also re-implement the retry-loop in update_and_wait() with ic_system_test_driver::retry_with_msg_async! such that it has proper logging and uses a backoff duration in case it fails.

This PR was created following the steps in .claude/skills/fix-flaky-tests/SKILL.md.

The test was flaky because the update call to increment the counter canister used `update()` which fires the request but does NOT wait for it to be processed. The test then immediately tried to read the counter value, but since the update hadn't been executed yet, the counter was still 0. The retry loop (10 retries, 1 second each) sometimes wasn't enough for the update to be processed. Fix: Use `update_and_wait()` instead of `update()` to wait for the update to be processed before asserting the counter value. This also lets us directly verify the counter value from the delegated query response, removing the need for the separate retry-based assertion.

Copilot

Pull request overview

This PR addresses flakiness in the ii_delegation_test system test by ensuring delegated update calls actually complete before the test asserts on the resulting counter state.

Changes:

Replaced the fire-and-forget delegated update() with update_and_wait() for the counter increment.
Removed the retry-based counter assertion helper and directly asserted the delegated query response value.
Cleaned up unused imports related to the removed retry logic.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rs/tests/idx/ii_delegation_test.rs

Address review comment: update_and_wait() polls up to polling_timeout, which was set to 10s via UPDATE_POLLING_TIMEOUT. If the app-subnet update occasionally takes >10s to execute, the test can still intermittently fail with a polling timeout. Increase to 60s to reliably wait long enough.

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rs/tests/driver/src/util/delegations.rs

… with retry_with_msg_async!

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rs/tests/driver/src/util/delegations.rs

cgundy

Nice!

pierugo-dfinity

Approving single line Consensus-owned change

Instead of only polling for the reply leaf, first check request_status/<id>/status to detect terminal states early. Returns immediately with a descriptive error on 'rejected' (including reject code and message) or 'done', and only reads the reply leaf when the status is 'replied'.

The 'rejected' and 'done' states are terminal — retrying them via the consistent with other terminal error handling in update_and_wait.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rs/tests/driver/src/util/delegations.rs

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rs/tests/driver/src/util/delegations.rs

for terminal states (rejected, done, malformed reply). Uses a nested Result pattern (anyhow::Result<Result<Vec<u8>, String>>) so that: - Ok(Err(...)) stops retrying immediately for terminal failures - Ok(Ok(...)) returns success Error messages now include canister_id, method_name, and request_id for better debuggability. This makes the function composable so callers can handle errors instead of unwinding on panic.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rs/tests/driver/src/util/delegations.rs

…sync closure Inside the retry_with_msg_async! polling closure in update_and_wait, several .unwrap()/.expect() calls could panic and bypass the retry logic entirely. Convert them to ? with anyhow::Context so failures are logged and retried instead of crashing the test process.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…ling The read_state status 'pruned' is a terminal state in the IC spec, but it was falling into the default retry arm, causing pointless retries until polling_timeout. Treat it as terminal (like 'done') and return an immediate error.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rs/tests/driver/src/util/delegations.rs

…mplex for this code

…ry loops Change send_http_request() to return anyhow::Result<Response> instead of panicking on transport failures. This ensures that transient errors inside retry_with_msg_async! are retried cleanly rather than bypassing the retry loop via a panic. Update the signatures of update() and query() accordingly, and adjust all call sites to handle the new Result return type.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-11T18:07:51Z

rs/tests/driver/src/util/delegations.rs

+        let _ = self
+            .send_http_request("call", canister_id, body)
+            .await
+            .context("failed to send update call")?;


update() ignores the HTTP response status from the call endpoint. If the replica returns a non-success status (e.g., 4xx/5xx), the code will still return a request_id and then update_and_wait may poll until timeout with a misleading error. Consider failing fast by checking the HTTP status (e.g., via error_for_status or an explicit status check) before returning Ok(request_id).

Suggested change

let _ = self

.send_http_request("call", canister_id, body)

.await

.context("failed to send update call")?;

let response = self

.send_http_request("call", canister_id, body)

.await

.context("failed to send update call")?;

if !response.status().is_success() {

bail!("update call failed with HTTP status: {}", response.status());

}

Copilot · 2026-03-11T18:07:52Z

rs/tests/driver/src/util/delegations.rs

            sender_pubkey: Some(Blob(self.pubkey.clone().into_vec())),
            sender_sig: Some(Blob(signature.signature.unwrap())),
        };
        let body = serde_cbor::ser::to_vec(&envelope).unwrap();
-        let _ = self.send_http_request("call", canister_id, body).await;
-        request_id
+        let _ = self


Now that update() returns anyhow::Result, serde_cbor::ser::to_vec(&envelope).unwrap() can still panic and bypass the structured error handling you added. Prefer propagating the serialization failure as an error with context instead of unwrapping.

Copilot · 2026-03-11T18:07:52Z

rs/tests/driver/src/util/delegations.rs

@@ -213,9 +225,15 @@
            sender_sig: Some(Blob(signature.signature.unwrap())),
        };
        let body = serde_cbor::ser::to_vec(&envelope).unwrap();


query() returns anyhow::Result, but serde_cbor::ser::to_vec(&envelope).unwrap() can still panic and bypass the new error handling path. Prefer returning a contextual error instead of unwrapping.

Suggested change

let body = serde_cbor::ser::to_vec(&envelope).unwrap();

let body = serde_cbor::ser::to_vec(&envelope)

.context("failed to serialize query envelope")?;

Copilot · 2026-03-11T18:07:52Z

rs/tests/driver/src/util/delegations.rs

            .send()
            .await
-            .unwrap()
+            .context(format!("failed to send {method} request to {canister_id}"))


send_http_request builds the context string eagerly via context(format!(...)), which allocates even on the success path and is called frequently during update_and_wait polling. Prefer with_context(|| format!(...)) (or equivalent) so the formatting only happens when there is an error.

Suggested change

.context(format!("failed to send {method} request to {canister_id}"))

.with_context(|| format!("failed to send {method} request to {canister_id}"))

github-actions bot added the fix label Mar 10, 2026

basvandijk requested a review from Copilot March 10, 2026 23:26

Copilot started reviewing on behalf of basvandijk March 10, 2026 23:27 View session

Copilot AI reviewed Mar 10, 2026

View reviewed changes

rs/tests/idx/ii_delegation_test.rs Show resolved Hide resolved

basvandijk requested a review from Copilot March 10, 2026 23:36

Copilot started reviewing on behalf of basvandijk March 10, 2026 23:37 View session

basvandijk marked this pull request as ready for review March 10, 2026 23:38

basvandijk requested a review from a team as a code owner March 10, 2026 23:38

github-actions bot added the @idx label Mar 10, 2026

Copilot AI reviewed Mar 10, 2026

View reviewed changes

rs/tests/driver/src/util/delegations.rs Show resolved Hide resolved

chore: implement the AgentWithDelegation's update_and_wait retry loop…

cec2b32

… with retry_with_msg_async!

basvandijk requested a review from a team as a code owner March 11, 2026 09:20

github-actions bot added the @consensus label Mar 11, 2026

basvandijk requested a review from Copilot March 11, 2026 09:24

Copilot started reviewing on behalf of basvandijk March 11, 2026 09:25 View session

Copilot AI reviewed Mar 11, 2026

View reviewed changes

rs/tests/driver/src/util/delegations.rs Outdated Show resolved Hide resolved

cgundy approved these changes Mar 11, 2026

View reviewed changes

pierugo-dfinity approved these changes Mar 11, 2026

View reviewed changes

basvandijk requested a review from Copilot March 11, 2026 11:18

Copilot started reviewing on behalf of basvandijk March 11, 2026 11:19 View session

chore: panic on terminal rejection/done states instead of retrying

9b6948b

The 'rejected' and 'done' states are terminal — retrying them via the consistent with other terminal error handling in update_and_wait.

Copilot AI reviewed Mar 11, 2026

View reviewed changes

rs/tests/driver/src/util/delegations.rs Outdated Show resolved Hide resolved

rs/tests/driver/src/util/delegations.rs Show resolved Hide resolved

basvandijk requested a review from Copilot March 11, 2026 11:23

Copilot started reviewing on behalf of basvandijk March 11, 2026 11:24 View session

Copilot AI reviewed Mar 11, 2026

View reviewed changes

rs/tests/driver/src/util/delegations.rs Outdated Show resolved Hide resolved

basvandijk requested a review from Copilot March 11, 2026 11:37

Copilot started reviewing on behalf of basvandijk March 11, 2026 11:38 View session

Copilot AI reviewed Mar 11, 2026

View reviewed changes

rs/tests/driver/src/util/delegations.rs Outdated Show resolved Hide resolved

rs/tests/driver/src/util/delegations.rs Show resolved Hide resolved

basvandijk requested a review from Copilot March 11, 2026 15:47

Copilot started reviewing on behalf of basvandijk March 11, 2026 15:47 View session

Copilot AI reviewed Mar 11, 2026

View reviewed changes

basvandijk requested a review from Copilot March 11, 2026 16:02

Copilot started reviewing on behalf of basvandijk March 11, 2026 16:03 View session

Copilot AI reviewed Mar 11, 2026

View reviewed changes

rs/tests/driver/src/util/delegations.rs Outdated Show resolved Hide resolved

rs/tests/driver/src/util/delegations.rs Outdated Show resolved Hide resolved

basvandijk added 2 commits March 11, 2026 17:51

don't lookup reject_code since it needs to be decoded which is too co…

8f9865e

…mplex for this code

basvandijk requested a review from Copilot March 11, 2026 18:03

Copilot started reviewing on behalf of basvandijk March 11, 2026 18:03 View session

Copilot AI reviewed Mar 11, 2026

View reviewed changes

	let body = serde_cbor::ser::to_vec(&envelope).unwrap();
	let body = serde_cbor::ser::to_vec(&envelope)
	.context("failed to serialize query envelope")?;

	.context(format!("failed to send {method} request to {canister_id}"))
	.with_context(\|\| format!("failed to send {method} request to {canister_id}"))

Conversation

basvandijk commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Root Cause

Fix

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

cgundy left a comment

Choose a reason for hiding this comment

Uh oh!

pierugo-dfinity left a comment

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

basvandijk commented Mar 10, 2026 •

edited

Loading