instrumentation telemetry: validate session id headers#6510

Draft
mabdinur wants to merge 30 commits into main from munir/test-stable-headers

@mabdinur mabdinur commented Mar 16, 2026

Motivation

Enable telemetry session ID header tests (DD-Session-ID, DD-Root-Session-ID, DD-Parent-Session-ID) across process forks per the Stable Service Instance Identifier RFC.

Changes

  • GET /spawn_child – New endpoint in weblogs (Python flask, Node.js express4, Ruby rails72, PHP, Go net-http, Java spring-boot, .NET poc). Params: sleep, crash, fork. Uses fork when supported, exec otherwise. Runtimes without fork (Java, Go, PHP, .NET) return 400 for fork=true.
  • Tests – test_session_id_headers_across_forks and test_session_id_headers_across_spawned validate session ID headers in lifecycle telemetry. They use get_lifecycle_events() to avoid lib-datadog metrics/log events. They assert: DD-Session-ID = runtime_id, one root per app instance, at least two runtimes (parent + child).
  • Library interface – get_lifecycle_events() added to filter lifecycle events.
  • Docs – Endpoint spec in docs/understand/weblogs/end-to-end_weblog.md.
  • Manifests – Enabled for Ruby rails72; missing_feature for other weblogs and non-fork runtimes.
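
For reviewers, the three assertions listed under Tests can be sketched roughly as below. This is an illustration only, not the code in tests/test_telemetry.py: the event shape (a dict with `runtime_id` and `headers` keys) and the helper name `check_session_headers` are assumptions made for this example, as is treating a missing DD-Root-Session-ID as "this process is the root".

```python
def check_session_headers(events):
    """Validate session ID headers on lifecycle telemetry events.

    Each event is assumed (hypothetically) to look like
    {"runtime_id": str, "headers": {header_name: value}}.
    """
    roots = set()
    runtimes = set()
    for event in events:
        # Header names are compared case-insensitively.
        headers = {k.lower(): v for k, v in event["headers"].items()}
        session_id = headers.get("dd-session-id")
        assert session_id, "DD-Session-ID must be present"
        assert session_id == event["runtime_id"], "DD-Session-ID must equal runtime_id"
        runtimes.add(session_id)
        # Assumption: a process that sends no root header is itself the root.
        roots.add(headers.get("dd-root-session-id", session_id))
    assert len(roots) == 1, f"expected one root per app instance, got {roots}"
    assert len(runtimes) >= 2, "expected at least two runtimes (parent + child)"
    return roots.pop(), runtimes
```

A parent event carrying only DD-Session-ID plus a child event carrying all three headers would pass; a single-process capture would fail the "at least two runtimes" check.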

Workflow

  1. ⚠️ Create your PR as draft ⚠️
  2. Work on your PR until the CI passes
  3. Mark it as ready for review
    • Test logic is modified? → Get a review from RFC owner.
    • Framework is modified, or non-obvious usage of it → Get a review from R&P team

🚀 Once your PR is reviewed and the CI is green, you can merge it!

🛟 #apm-shared-testing 🛟

SDK Implementations

Nodejs: DataDog/dd-trace-js#7821
Go: DataDog/dd-trace-go#4574
Java: DataDog/dd-trace-java#10914

khanayan123 commented Mar 17, 2026

As per https://dd.slack.com/archives/D032MDTSCR1/p1773765779731369

We need to assert:

  1. Headers Present & Valid For Every Telemetry Event: DD-Session-ID always present, DD-Root-Session-ID present when a child process is forked/spawned
  2. Root Stability Across Fork: Session-ID regenerates per process, Root-Session-ID inherited and never changes
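
To make the second point concrete, here is a small simulation of the expected header behaviour across spawns. It is a sketch under stated assumptions, not SDK code: `session_headers` and `spawn` are hypothetical helpers, and omitting DD-Root-Session-ID and DD-Parent-Session-ID in the root process is an assumption drawn from this thread rather than a statement about the RFC.

```python
import uuid


def session_headers(runtime_id, root_session_id=None, parent_session_id=None):
    """Build the session headers one process would send (illustrative only)."""
    headers = {"DD-Session-ID": runtime_id}
    # Assumption: root/parent headers are only attached in child processes.
    if root_session_id and root_session_id != runtime_id:
        headers["DD-Root-Session-ID"] = root_session_id
    if parent_session_id and parent_session_id != runtime_id:
        headers["DD-Parent-Session-ID"] = parent_session_id
    return headers


def spawn(parent_headers):
    """Simulate a fork/spawn: fresh session ID, inherited root."""
    parent_id = parent_headers["DD-Session-ID"]
    root_id = parent_headers.get("DD-Root-Session-ID", parent_id)
    return session_headers(str(uuid.uuid4()), root_id, parent_id)
```

In this model a grandchild gets a new DD-Session-ID, its DD-Parent-Session-ID points at the child, and its DD-Root-Session-ID still equals the original parent's session ID.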

Co-authored-by: Munir Abdinur <munir.abdinur@datadoghq.com>

khanayan123 commented Mar 18, 2026

Remaining gaps I believe are:

Gap 1: The test should assert that for every event where DD-Session-ID != root_session_id (i.e. every child event), DD-Root-Session-ID must be present, not just that at least one event has it globally.

Gap 2 (exec vs fork): Same validation function for both test cases; exec propagation via env vars isn't distinctly tested.
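
A per-event version of the Gap 1 check might look like the sketch below. The helper name `assert_child_events_carry_root` and the event shape (a dict with a `headers` mapping) are hypothetical and chosen only for illustration; the real test may structure its data differently.

```python
def assert_child_events_carry_root(events, root_session_id):
    """Require DD-Root-Session-ID on every event sent by a non-root process."""
    for event in events:
        headers = {k.lower(): v for k, v in event["headers"].items()}
        session_id = headers["dd-session-id"]
        if session_id == root_session_id:
            continue  # event from the root process, nothing to check here
        root = headers.get("dd-root-session-id")
        assert root is not None, f"child {session_id} is missing DD-Root-Session-ID"
        assert root == root_session_id, (
            f"child {session_id} reports root {root}, expected {root_session_id}"
        )
```

Unlike a global "at least one event has it" check, this fails as soon as a single child event omits the header.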

mabdinur (Contributor Author) commented

> Remaining gaps I believe are:
>
> Gap 1: The test should assert that for every event where DD-Session-ID != root_session_id (i.e. every child event), DD-Root-Session-ID must be present, not just that at least one event has it globally.
>
> Gap 2 (exec vs fork): Same validation function for both test cases; exec propagation via env vars isn't distinctly tested.

Both cases should be covered by the current test. We can discuss it in our next sync.

github-actions bot commented Mar 18, 2026

CODEOWNERS have been resolved as:

utils/build/docker/dotnet/weblog/Endpoints/SpawnChildEndpoint.cs        @DataDog/apm-dotnet @DataDog/asm-dotnet @DataDog/system-tests-core
utils/build/docker/golang/app/_shared/common/spawn_child.go             @DataDog/dd-trace-go-guild @DataDog/system-tests-core
utils/build/docker/nodejs/express/fork_child.js                         @DataDog/dd-trace-js @DataDog/system-tests-core
utils/build/docker/php/common/spawn_child.php                           @DataDog/apm-php @DataDog/system-tests-core
docs/understand/weblogs/end-to-end_weblog.md                            @DataDog/system-tests-core
manifests/cpp.yml                                                       @DataDog/dd-trace-cpp
manifests/cpp_httpd.yml                                                 @DataDog/dd-trace-cpp
manifests/cpp_kong.yml                                                  @DataDog/system-tests-core
manifests/cpp_nginx.yml                                                 @DataDog/dd-trace-cpp
manifests/dotnet.yml                                                    @DataDog/apm-dotnet @DataDog/asm-dotnet
manifests/golang.yml                                                    @DataDog/dd-trace-go-guild
manifests/java.yml                                                      @DataDog/asm-java @DataDog/apm-java
manifests/nodejs.yml                                                    @DataDog/dd-trace-js
manifests/php.yml                                                       @DataDog/apm-php @DataDog/asm-php
manifests/python.yml                                                    @DataDog/apm-python @DataDog/asm-python
manifests/ruby.yml                                                      @DataDog/ruby-guild @DataDog/asm-ruby
tests/test_telemetry.py                                                 @DataDog/libdatadog-telemetry @DataDog/apm-sdk-capabilities @DataDog/system-tests-core
utils/build/docker/cpp_nginx/nginx/backend.c                            @DataDog/system-tests-core
utils/build/docker/cpp_nginx/nginx/nginx-waf.conf                       @DataDog/system-tests-core
utils/build/docker/cpp_nginx/nginx/nginx.conf                           @DataDog/system-tests-core
utils/build/docker/dotnet/weblog/Program.cs                             @DataDog/apm-dotnet @DataDog/asm-dotnet @DataDog/system-tests-core
utils/build/docker/golang/app/net-http/main.go                          @DataDog/dd-trace-go-guild @DataDog/system-tests-core
utils/build/docker/java/spring-boot/src/main/java/com/datadoghq/system_tests/springboot/App.java  @DataDog/apm-java @DataDog/asm-java @DataDog/system-tests-core
utils/build/docker/nodejs/express/app.js                                @DataDog/dd-trace-js @DataDog/system-tests-core
utils/build/docker/nodejs/install_ddtrace.sh                            @DataDog/dd-trace-js @DataDog/system-tests-core
utils/build/docker/php/apache-mod/php.conf                              @DataDog/apm-php @DataDog/system-tests-core
utils/build/docker/php/php-fpm/php-fpm.conf                             @DataDog/apm-php @DataDog/system-tests-core
utils/build/docker/python/flask/app.py                                  @DataDog/apm-python @DataDog/asm-python @DataDog/system-tests-core
utils/build/docker/ruby/rails72/app/controllers/system_test_controller.rb  @DataDog/ruby-guild @DataDog/asm-ruby @DataDog/system-tests-core
utils/build/docker/ruby/rails72/config/routes.rb                        @DataDog/ruby-guild @DataDog/asm-ruby @DataDog/system-tests-core
utils/interfaces/_library/core.py                                       @DataDog/system-tests-core

@khanayan123 khanayan123 left a comment

Thanks for addressing the comments, tests LGTM

@mabdinur mabdinur requested review from a team as code owners April 13, 2026 15:24
@mabdinur mabdinur requested review from Anilm3, BridgeAR, Yun-Kim, ZStriker19, daniel-romano-DD, manuel-alvarez-alvarez and pawelchcki and removed request for a team April 13, 2026 15:24

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 50bdced3a6

"""
# Use lifecycle events only; metrics and log events from lib-datadog can contain
# runtime/session_ids that do not map to tracer-generated telemetry.
telemetry_data = list(interfaces.library.get_lifecycle_events())

P1: Scope lifecycle checks to the spawn_child-triggered request

This validator reads all lifecycle telemetry (interfaces.library.get_lifecycle_events()) without filtering to the /spawn_child request that the setup just issued, so it can pass on unrelated startup/shutdown events even when child-process behavior is broken (or /spawn_child returns 404). Because the assertions are not request-scoped, the test does not reliably prove cross-process session header propagation.
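
One possible way to scope the check, sketched under assumptions: fail fast when /spawn_child did not succeed, then keep only events seen after the request was issued. The function name, the `timestamp` field on events, and the `request_started_at` argument are all hypothetical; the actual system-tests interfaces may expose request correlation differently.

```python
def scoped_lifecycle_events(events, response_status, request_started_at):
    """Return only the events that can be attributed to the /spawn_child call.

    `events` is assumed to be a list of dicts with a numeric "timestamp" key.
    """
    # A 404 or 400 means no child was spawned, so nothing below is meaningful.
    assert response_status == 200, f"/spawn_child returned {response_status}"
    scoped = [e for e in events if e["timestamp"] >= request_started_at]
    assert scoped, "no lifecycle events observed after /spawn_child"
    return scoped
```

With a guard like this, unrelated startup events emitted before the request can no longer satisfy the assertions on their own.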

Comment on lines +329 to +330
bool do_crash = strcmp(crash_str, "true") == 0;
bool use_fork = strcmp(fork_str, "true") == 0;

P1: Parse /spawn_child booleans case-insensitively in cpp_nginx

The cpp_nginx handler compares crash and fork to "true" case-sensitively, but the new test setup passes Python booleans as query params (True/False). That makes fork=True evaluate as false here, so the fork path is silently skipped and crash requests are ignored; this invalidates fork-specific coverage for cpp_nginx.
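
Besides making the handler case-insensitive, the mismatch could also be avoided on the test side by serialising booleans as lowercase strings before they reach the query string. A minimal sketch, assuming a hypothetical helper `spawn_child_params` (not part of the PR) that builds the sleep, crash and fork params:

```python
def spawn_child_params(sleep, crash, fork):
    """Build /spawn_child query params with lowercase boolean strings."""
    return {
        "sleep": str(sleep),
        # str(True) is "True"; lower() yields the "true" the C handler compares against.
        "crash": str(crash).lower(),
        "fork": str(fork).lower(),
    }
```

This keeps every weblog on the same wire format regardless of how strictly each one parses booleans.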

@mabdinur mabdinur marked this pull request as draft April 13, 2026 16:34