Skip to content

Conversation

@sfmskywalker
Copy link
Member

@sfmskywalker sfmskywalker commented Apr 30, 2025

This commit introduces a new ErrorMetrics class to track and record workflow errors, leveraging OpenTelemetry. It replaces DefaultErrorSpanHandler with DefaultExceptionHandler and adds support for custom metric handling through the IErrorMetricHandler interface. Additionally, the middleware and service configuration have been updated to integrate error metrics tracking seamlessly.

image

This change is Reviewable

sfmskywalker and others added 2 commits April 30, 2025 15:16
Introduce WorkflowErrorSpanHandler interface and context for improved error span handling in workflows. Separate activity and workflow error handling using dedicated abstractions and context models. Update error handling logic in tracing middleware and adjust DI configuration accordingly.
This commit introduces a new `ErrorMetrics` class to track and record workflow errors, leveraging OpenTelemetry. It replaces `DefaultErrorSpanHandler` with `DefaultExceptionHandler` and adds support for custom metric handling through the `IErrorMetricHandler` interface. Additionally, the middleware and service configuration have been updated to integrate error metrics tracking seamlessly.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces error metrics tracking into workflow incident handling by replacing the old error span handler with a new exception handler and adding support for custom error metric handlers using OpenTelemetry. Key changes include:

  • Adding the new ErrorMetrics class and IErrorMetricHandler interface for tracking workflow errors.
  • Updating middleware and DI registrations to integrate error metric tracking.
  • Renaming and consolidating error span handlers to align with the new behavior.

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/modules/Elsa.OpenTelemetry/Models/ErrorMetricContext.cs Introduces a new context model for error metrics using a modern constructor syntax.
src/modules/Elsa.OpenTelemetry/Middleware/OpenTelemetryTracingActivityExecutionMiddleware.cs Updates the middleware constructor and tag assignments, and integrates error tracking.
src/modules/Elsa.OpenTelemetry/Metrics/ErrorMetrics.cs Implements the error metrics tracking mechanism with OpenTelemetry counters.
src/modules/Elsa.OpenTelemetry/Handlers/FaultExceptionHandler.cs Renames and extends the fault exception handler to process error metric contexts.
src/modules/Elsa.OpenTelemetry/Handlers/DefaultExceptionHandler.cs Introduces a unified exception handler covering span and metric contexts.
src/modules/Elsa.OpenTelemetry/Features/OpenTelemetryFeature.cs Updates DI registrations to incorporate the new error metric tracking components.
src/modules/Elsa.OpenTelemetry/Contracts/IErrorMetricHandler.cs Adds an interface to support customizable error metric handling.
src/apps/Elsa.Server.Web/Program.cs Adjusts metrics configuration to register the new error metrics and updates exporter settings.
Comments suppressed due to low confidence (2)

src/modules/Elsa.OpenTelemetry/Middleware/OpenTelemetryTracingActivityExecutionMiddleware.cs:33

  • Ensure that using activity.Id instead of activity.NodeId for the 'activity.id' tag is consistent with the identifier usage elsewhere in the system.
span.SetTag("activity.id", activity.Id);

src/apps/Elsa.Server.Web/Program.cs:157

  • [nitpick] Verify that the instrumentation lines commented out are intentionally disabled; if so, consider adding a comment to clarify their purpose or remove them if no longer necessary.
.AddMeter(ErrorMetrics.MeterName)

.AddScoped<IWorkflowErrorSpanHandler, DefaultExceptionHandler>()
.AddScoped<IWorkflowErrorSpanHandler, FaultExceptionHandler>()
.AddScoped<IErrorMetricHandler, DefaultExceptionHandler>()
.AddScoped<IErrorMetricHandler, FaultExceptionHandler>()
Copy link

Copilot AI Apr 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Include a brief comment explaining the purpose of the ErrorMetrics registration to aid future maintainability.

Suggested change
.AddScoped<IErrorMetricHandler, FaultExceptionHandler>()
.AddScoped<IErrorMetricHandler, FaultExceptionHandler>()
// Registers the ErrorMetrics service to track and report error-related metrics for OpenTelemetry.

Copilot uses AI. Check for mistakes.
Renamed `FaultExceptionActivityErrorSpanHandler` to `FaultExceptionErrorSpanHandler` for consistency and clarity. Updated error attribute tagging to align with Datadog's well-known attributes. Adjusted incident selection logic to use the first incident instead of the last.
Added well-known Datadog attributes to ensure proper handling of fault exception tags. Introduced a separate object for unknown attributes to prevent them from being ignored by Datadog.
Previously, the middleware selected the last incident in the list, which could lead to incorrect behavior. This change ensures the middleware selects the first incident, aligning with the expected logic.
Standardized comments in the .gitignore file by adding missing spaces and capitalizing as needed. Updated the `/docker/data/` entry to `/docker/azurite-data/` for clarity.
Replaced `LastOrDefault` with `FirstOrDefault` to ensure the correct activity execution context is retrieved when handling errors. This change resolves potential inaccuracies in identifying the faulted activity node.
Base automatically changed from enh/otel-error to develop/3.5.0 May 7, 2025 08:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants