Skip to content

fix(deployment): reduce noisy websocket errors in Sentry#3050

Open
baktun14 wants to merge 8 commits intomainfrom
fix/deployment-reduce-noisy-websocket-errors
Open

fix(deployment): reduce noisy websocket errors in Sentry#3050
baktun14 wants to merge 8 commits intomainfrom
fix/deployment-reduce-noisy-websocket-errors

Conversation

@baktun14
Copy link
Copy Markdown
Contributor

@baktun14 baktun14 commented Apr 9, 2026

Why

Sentry issue DEPLOY-WEB-27J is escalating — 595 occurrences across 79 users in ~2 days. The Error: websocket error fires from the deployment logs WebSocket stream (followLogs) on /deployments/[dseq].

Most of these errors are caused by users navigating away from the LOGS tab, which aborts the WebSocket connection and triggers the error handler unnecessarily. The remaining genuine errors have a generic message that makes Sentry triage difficult.

Fixes CON-204

What

  • Skip Sentry reporting when abort signal has fired — when the useEffect cleanup runs (user navigates away), the abort controller fires, closing the WS. Previously this still reported to Sentry; now we check abortController.signal.aborted before reporting.
  • Downgrade severity to warning — genuine WebSocket connection failures (provider down, network issue) are transient and expected in a streaming context; warning is more appropriate than error.
  • Include close code and reason in error message — instead of the generic "websocket error", the message now reads "websocket error: 1006 unknown reason", making Sentry grouping and triage much easier.

Summary by CodeRabbit

  • Bug Fixes
    • Deployment log streaming: avoid updating loading/connection indicators and prevent duplicate/error-level reports when a log request is intentionally cancelled — such cases are now reported as warnings.
    • Websocket errors: improved error messages include close-event reasons when available and use a more consistent, standardized message for non-close errors.

…27J)

Skip error reporting when the abort signal has fired (user navigated
away from the logs tab) and downgrade remaining websocket errors to
warning severity. Include close code and reason in the error message
for better Sentry triage.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 9, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Refined WebSocket and deployment log error handling: WebSocket errors now include a reason and standardized message; tests updated to match. Deployment log stream catch-handler now checks abort signals, conditions UI state updates, and reports errors with severity and an optional cause.

Changes

Cohort / File(s) Summary
WebSocket implementation & test
apps/deploy-web/src/lib/websocket/createWebsocket.ts, apps/deploy-web/src/lib/websocket/createWebsocket.spec.ts
createWsError(event) now sets cause.reason from event.reason (or "unknown reason") and changes the Error message from "Generic websocket error" to "websocket error". Updated test assertion to expect the new exact message.
Deployment logs error handling
apps/deploy-web/src/components/deployments/DeploymentLogs.tsx
.catch handler for getLogsStream(...) now checks abortController.signal.aborted to skip UI updates when aborted; when not aborted it sets setIsLoadingLogs(false) and setIsConnectionEstablished(false) before reporting. errorHandler.reportError payload now includes severity ("warning" if aborted, otherwise "error") and a cause field populated from the caught Error when available.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰✨ I nibble at bugs with a careful hop,
Websockets now tell why they had to stop,
Abort signs noticed, states set true or light,
Errors reported with cause in sight —
Hooray for logs that behave polite!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main change: reducing noisy websocket errors in Sentry by improving error handling and severity reporting.
Description check ✅ Passed The PR description is comprehensive and well-structured. It clearly explains the motivation (Sentry issue escalation), the specific problems (noisy aborted connections and generic error messages), and provides detailed bullet points covering all implemented changes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/deployment-reduce-noisy-websocket-errors

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/deploy-web/src/components/deployments/DeploymentLogs.tsx`:
- Around line 137-145: The catch block for getLogsStream(...) leaves
isLoadingLogs true on non-abort failures causing the skeleton to hang; in the
catch handler (the block referencing abortController.signal.aborted and
errorHandler.reportError) set isLoadingLogs to false (or call the setter that
updates the loading state) before calling errorHandler.reportError so the UI
clears the loading state on any non-abort error; ensure you reference the same
abortController and do not change abort handling logic for the abort case.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ab3f6078-41ce-43e8-a0d1-f85a8215e34f

📥 Commits

Reviewing files that changed from the base of the PR and between 9ccf71a and 5486e16.

📒 Files selected for processing (3)
  • apps/deploy-web/src/components/deployments/DeploymentLogs.tsx
  • apps/deploy-web/src/lib/websocket/createWebsocket.spec.ts
  • apps/deploy-web/src/lib/websocket/createWebsocket.ts

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 9, 2026

Codecov Report

❌ Patch coverage is 16.66667% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.69%. Comparing base (9ccf71a) to head (6c948ff).
⚠️ Report is 2 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
...-web/src/components/deployments/DeploymentLogs.tsx 0.00% 4 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3050      +/-   ##
==========================================
- Coverage   59.71%   59.69%   -0.02%     
==========================================
  Files        1034     1034              
  Lines       24273    24279       +6     
  Branches     6003     6007       +4     
==========================================
  Hits        14494    14494              
- Misses       8528     8533       +5     
- Partials     1251     1252       +1     
Flag Coverage Δ
api 81.30% <ø> (ø)
deploy-web 43.13% <16.66%> (-0.02%) ⬇️
log-collector 85.17% <ø> (ø)
notifications 86.06% <ø> (ø)
provider-console 81.48% <ø> (ø)
provider-proxy 85.21% <ø> (ø)
tx-signer 76.26% <ø> (ø)
Files with missing lines Coverage Δ
...ps/deploy-web/src/lib/websocket/createWebsocket.ts 97.26% <100.00%> (+0.03%) ⬆️
...-web/src/components/deployments/DeploymentLogs.tsx 0.00% <0.00%> (ø)

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

baktun14 added 2 commits April 9, 2026 15:22
…e target

Extract error handling from DeploymentLogs useEffect into a testable
handleLogsStreamError function and add unit tests covering both the
abort-skip and warning-report paths.
@github-actions github-actions bot added size: S and removed size: XS labels Apr 9, 2026
setIsLoadingLogs: (v: boolean) => void,
setIsConnectionEstablished: (v: boolean) => void
): void {
if (signal.aborted) return;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't you want to report errors if signal is aborted?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is what is causing the spam of this error no? Why do we want to report an aborted error?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

honestly, idk. i'm just asking. my concern is we could miss some important ones

…ws error message

Inline handleLogsStreamError into the component per review feedback, use a
stable "websocket error" message for Sentry grouping, and forward cause as
extra context.
@github-actions github-actions bot added size: XS and removed size: S labels Apr 10, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/deploy-web/src/components/deployments/DeploymentLogs.tsx`:
- Around line 138-147: The reportError call currently fires even for aborted
streams and sets severity inversely; change it so you skip reporting entirely
when abortController.signal.aborted is true, and for non-abort failures set
severity to "warning" (not "error"); update the error payload sent to
errorHandler.reportError (the call in this block) to include the error message
(e.g., error instanceof Error ? error.message : String(error)) as a property or
tag so grouping is stable, and keep using the existing tags { category:
"deployments", label: "followLogs" } while leaving setIsLoadingLogs and
setIsConnectionEstablished behavior unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f7ac9ae8-d042-4c8d-9a70-bdf7eb902dd9

📥 Commits

Reviewing files that changed from the base of the PR and between 34f4035 and f0b99b1.

📒 Files selected for processing (1)
  • apps/deploy-web/src/components/deployments/DeploymentLogs.tsx

Comment on lines +138 to +147
if (!abortController.signal.aborted) {
setIsLoadingLogs(false);
setIsConnectionEstablished(false);
}

errorHandler.reportError({
error,
tags: { category: "deployments", label: "followLogs" }
severity: abortController.signal.aborted ? "warning" : "error",
tags: { category: "deployments", label: "followLogs" },
cause: error instanceof Error ? (error as Error & { cause?: unknown }).cause : undefined
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Skip reporting aborted streams and downgrade real stream failures to warning.

Line 143 still reports even when abortController.signal.aborted is true, and Line 145 sets non-abort failures to "error". This keeps expected navigation aborts noisy and inverts the intended severity policy. Also, Line 147 adds cause as extra context, but that alone won’t standardize grouping by message.

💡 Proposed fix
     ).catch(error => {
-      if (!abortController.signal.aborted) {
-        setIsLoadingLogs(false);
-        setIsConnectionEstablished(false);
-      }
-
-      errorHandler.reportError({
-        error,
-        severity: abortController.signal.aborted ? "warning" : "error",
-        tags: { category: "deployments", label: "followLogs" },
-        cause: error instanceof Error ? (error as Error & { cause?: unknown }).cause : undefined
-      });
+      if (abortController.signal.aborted) return;
+
+      setIsLoadingLogs(false);
+      setIsConnectionEstablished(false);
+
+      errorHandler.reportError({
+        error,
+        severity: "warning",
+        tags: { category: "deployments", label: "followLogs" }
+      });
     });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (!abortController.signal.aborted) {
setIsLoadingLogs(false);
setIsConnectionEstablished(false);
}
errorHandler.reportError({
error,
tags: { category: "deployments", label: "followLogs" }
severity: abortController.signal.aborted ? "warning" : "error",
tags: { category: "deployments", label: "followLogs" },
cause: error instanceof Error ? (error as Error & { cause?: unknown }).cause : undefined
).catch(error => {
if (abortController.signal.aborted) return;
setIsLoadingLogs(false);
setIsConnectionEstablished(false);
errorHandler.reportError({
error,
severity: "warning",
tags: { category: "deployments", label: "followLogs" }
});
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/deploy-web/src/components/deployments/DeploymentLogs.tsx` around lines
138 - 147, The reportError call currently fires even for aborted streams and
sets severity inversely; change it so you skip reporting entirely when
abortController.signal.aborted is true, and for non-abort failures set severity
to "warning" (not "error"); update the error payload sent to
errorHandler.reportError (the call in this block) to include the error message
(e.g., error instanceof Error ? error.message : String(error)) as a property or
tag so grouping is stable, and keep using the existing tags { category:
"deployments", label: "followLogs" } while leaving setIsLoadingLogs and
setIsConnectionEstablished behavior unchanged.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We intentionally kept reporting for aborted streams — the reviewer raised a concern about silently missing important errors. The current approach differentiates by severity: aborted streams get "warning", real failures get "error". This way nothing is silenced, but aborted noise is clearly separated in Sentry. What's your take on this tradeoff vs silencing aborts entirely?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@baktun14, that's a fair and well-reasoned tradeoff. Here's the breakdown of each side:

Keep reporting at "warning" for aborts (current implementation):

  • ✅ Full observability — no errors silently discarded
  • ✅ Severity levels in Sentry can be filtered/alerted on independently
  • ⚠️ Still generates Sentry events for every navigation-away (just at a lower level), so the volume reduction from the PR's stated goal (~595 occurrences) may be limited in practice — they shift from error to warning rather than disappearing

Silence aborts entirely (my suggestion):

  • ✅ Completely eliminates navigation-abort noise — cleaner signal for real failures
  • ✅ Aligns directly with the PR's stated motivation (reduce noisy Sentry errors from followLogs)
  • ⚠️ Loses visibility into abort-coincident failures — though since abortController.abort() is only ever called from the useEffect cleanup (i.e., a controlled, intentional abort by the component), these are never genuinely unexpected errors

The core question is whether you want Sentry to reflect operational health (where user navigations are irrelevant) or full observability including user behavior patterns (where abort rate itself might be a signal). If the team has no use case for monitoring abort frequency in Sentry, silencing them entirely is the cleaner approach and better matches the PR's motivation. If you want to retain that data for trend analysis, the current implementation is a reasonable compromise.

Given your team's explicit concern about silently missing errors, the current approach is defensible — just worth knowing that warning-level events still contribute to Sentry volume. Either way, I'll defer to your judgment here.


✏️ Learnings added
Learnt from: baktun14
Repo: akash-network/console PR: 3050
File: apps/deploy-web/src/components/deployments/DeploymentLogs.tsx:138-147
Timestamp: 2026-04-10T18:43:35.084Z
Learning: In `DeploymentLogs.tsx` (`apps/deploy-web/src/components/deployments/DeploymentLogs.tsx`), the team intentionally reports aborted WebSocket streams (when `abortController.signal.aborted === true`) to Sentry at `"warning"` severity rather than silencing them entirely. This is a deliberate tradeoff: the team wants full observability and to avoid silently missing errors, while using severity levels to separate expected navigation aborts from real WebSocket failures (`"error"`). Do not suggest silencing aborted stream errors entirely in future reviews.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: ygrishajev
Repo: akash-network/console PR: 2863
File: apps/log-collector/src/services/pod-logs-collector/pod-logs-collector.service.ts:173-180
Timestamp: 2026-03-05T19:43:55.998Z
Learning: In `PodLogsCollectorService` (`apps/log-collector/src/services/pod-logs-collector/pod-logs-collector.service.ts`), the `remainingBuffer` (holding bytes received without a trailing `\n`) is intentionally NOT flushed when `logStream` emits `close`. Two cases cover all close scenarios: (1) AbortSignal fires because the pod was untracked — collection resumes from the last written timestamp on re-discovery, so the partial line is re-delivered; (2) the K8s log stream ends because the container terminated — the partial line is genuinely truncated and malformed. Writing a truncated line in either case would produce malformed output or a redundant duplicate, so the deliberate drop is correct design.

Learnt from: baktun14
Repo: akash-network/console PR: 1725
File: apps/api/src/utils/constants.ts:5-5
Timestamp: 2025-07-24T17:00:52.361Z
Learning: In the Akash Network Console project, when cross-cutting concerns or broader refactoring issues are identified during PR review, the preferred approach is to create a separate GitHub issue to track the work rather than expanding the scope of the current PR. This maintains focus and allows for proper planning of architectural improvements.

Learnt from: stalniy
Repo: akash-network/console PR: 2700
File: apps/deploy-web/src/components/shared/DynamicMonacoEditor/DynamicMonacoEditor.tsx:39-54
Timestamp: 2026-02-11T13:51:34.635Z
Learning: In TypeScript/JSX/TSX files, retain comments that explain the rationale behind non-obvious implementation decisions (e.g., why a workaround or error suppression is necessary). Prefer comments that justify the approach and trade-offs rather than describing what the code already does. Use clear, concise language and ensure the rationale is accessible to future maintainers.

tags: { category: "deployments", label: "followLogs" }
severity: abortController.signal.aborted ? "warning" : "error",
tags: { category: "deployments", label: "followLogs" },
cause: error instanceof Error ? (error as Error & { cause?: unknown }).cause : undefined
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to log cause explicitly, sentry walks over .causes, additionally our logger logs error as well and that info becomes a part of sentry issue breadcrumbs

setIsConnectionEstablished(false);
}

errorHandler.reportError({
Copy link
Copy Markdown
Contributor

@stalniy stalniy Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: I doubt that this PR actually reduces websocket errors because all AbortErrors are ignored at errorHandler level ->

if (error && typeof error === "object" && "name" in error && error.name === "AbortError") {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

browser websocket on(error) event doesn't expose details about the actual issue. That's why we probably need to send additional http request to websocket, in case of error to get the actual error details (for example, 404, provider network issues, provider is down, etc)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants