WIP: OTLP trace export #1641

Open
rachelyangdog wants to merge 11 commits into main from rachel.yang/OTLP-trace-export

@rachelyangdog rachelyangdog commented Mar 2, 2026

What does this PR do?

Summary

This PR adds an OTLP HTTP/JSON trace export path to TraceExporter. When OTEL_TRACES_EXPORTER=otlp is set, the exporter converts Datadog msgpack trace payloads to OTLP and POSTs them to the configured OTLP endpoint instead of the Datadog agent. The caller (e.g. dd-trace-py) sends traces in the same format as before — no changes are required on the tracer side.
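As an illustration of the kind of field mapping such a conversion involves, here is a minimal sketch. It is not the PR's code: `DdSpan`, `OtlpSpanFields`, and `to_otlp_fields` are hypothetical names, and the span is reduced to IDs and timing. It relies on two OTLP/JSON encoding rules: trace and span IDs are hex-encoded strings, and 64-bit integers are written as decimal strings.

```rust
/// Hypothetical, simplified view of a Datadog span (not the PR's actual type).
pub struct DdSpan {
    pub trace_id: u128,
    pub span_id: u64,
    /// Start time in nanoseconds since the Unix epoch.
    pub start_ns: i64,
    /// Duration in nanoseconds.
    pub duration_ns: i64,
}

/// Hypothetical subset of an OTLP/JSON span, holding already-encoded values.
#[derive(Debug, PartialEq)]
pub struct OtlpSpanFields {
    pub trace_id: String,
    pub span_id: String,
    pub start_time_unix_nano: String,
    pub end_time_unix_nano: String,
}

pub fn to_otlp_fields(span: &DdSpan) -> OtlpSpanFields {
    OtlpSpanFields {
        // 16-byte trace ID as 32 lowercase hex characters.
        trace_id: format!("{:032x}", span.trace_id),
        // 8-byte span ID as 16 lowercase hex characters.
        span_id: format!("{:016x}", span.span_id),
        // 64-bit integers are serialized as decimal strings in OTLP/JSON.
        start_time_unix_nano: span.start_ns.to_string(),
        // Datadog stores start + duration; OTLP stores start + end.
        end_time_unix_nano: span.start_ns.saturating_add(span.duration_ns).to_string(),
    }
}
```

Keeping the encoding in a pure function like this makes it easy to unit-test separately from the HTTP POST.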

Not in scope

  • http/protobuf and grpc protocols (they are parsed from env but not yet implemented; http/json is the default and only supported format). This is currently a POC so we will add additional support later.
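The "parsed but not yet implemented" behaviour can be sketched as below. This is an assumption-laden illustration, not the PR's implementation: `OtlpProtocol`, `parse_protocol`, and `is_supported` are hypothetical names, and the raw value is assumed to come from a setting such as the standard `OTEL_EXPORTER_OTLP_PROTOCOL` variable, passed in by the caller.

```rust
/// Hypothetical enum for the three protocol values named in the PR description.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum OtlpProtocol {
    HttpJson,
    HttpProtobuf,
    Grpc,
}

/// Parses the raw protocol value. An unset or empty value falls back to
/// http/json, which the PR describes as the default.
pub fn parse_protocol(raw: Option<&str>) -> Result<OtlpProtocol, String> {
    match raw.map(|s| s.trim().to_ascii_lowercase()).as_deref() {
        None | Some("") | Some("http/json") => Ok(OtlpProtocol::HttpJson),
        Some("http/protobuf") => Ok(OtlpProtocol::HttpProtobuf),
        Some("grpc") => Ok(OtlpProtocol::Grpc),
        Some(other) => Err(format!("unknown OTLP protocol: {other}")),
    }
}

/// Only http/json has an export path in this proof of concept.
pub fn is_supported(protocol: OtlpProtocol) -> bool {
    matches!(protocol, OtlpProtocol::HttpJson)
}
```

Separating "recognised" from "supported" lets the exporter report a clear error for `grpc` or `http/protobuf` instead of silently falling back.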

Motivation

We are seeing an increasing number of scenarios where users have applications instrumented with the OTel SDK sending data to OTel collectors, and they would like to get additional features offered by the DD SDK without needing to update their OTel collector deployments. Although there will be follow-up work, this provides the ability for users to write vendor-neutral API instrumentation and emit vendor-neutral telemetry data so DD SDK users don't have to feel locked in when setting up Datadog APM.

Additional Notes

Anything else we should know when reviewing?

How to test the change?

Describe here in detail how the change can be validated.

@rachelyangdog rachelyangdog requested a review from a team as a code owner March 2, 2026 21:42
@rachelyangdog rachelyangdog marked this pull request as draft March 2, 2026 21:42

github-actions bot commented Mar 2, 2026

📚 Documentation Check Results

⚠️ 1409 documentation warning(s) found

📦 libdd-data-pipeline - 860 warning(s)

📦 libdd-trace-utils - 549 warning(s)


Updated: 2026-03-11 23:32:59 UTC | Commit: bb71f5c | missing-docs job results

github-actions bot commented Mar 2, 2026

Clippy Allow Annotation Report

Comparing clippy allow annotations between branches:

  • Base Branch: origin/main
  • PR Branch: origin/rachel.yang/OTLP-trace-export

Summary by Rule

Rule | Base Branch | PR Branch | Change
unwrap_used | 2 | 2 | No change (0%)
Total | 2 | 2 | No change (0%)

Annotation Counts by File

File | Base Branch | PR Branch | Change
libdd-data-pipeline/src/trace_exporter/mod.rs | 2 | 2 | No change (0%)

Annotation Stats by Crate

Crate | Base Branch | PR Branch | Change
clippy-annotation-reporter | 5 | 5 | No change (0%)
datadog-ffe-ffi | 1 | 1 | No change (0%)
datadog-ipc | 28 | 28 | No change (0%)
datadog-live-debugger | 6 | 6 | No change (0%)
datadog-live-debugger-ffi | 10 | 10 | No change (0%)
datadog-profiling-replayer | 4 | 4 | No change (0%)
datadog-remote-config | 3 | 3 | No change (0%)
datadog-sidecar | 59 | 59 | No change (0%)
libdd-common | 10 | 10 | No change (0%)
libdd-common-ffi | 12 | 12 | No change (0%)
libdd-data-pipeline | 5 | 5 | No change (0%)
libdd-ddsketch | 2 | 2 | No change (0%)
libdd-dogstatsd-client | 1 | 1 | No change (0%)
libdd-profiling | 13 | 13 | No change (0%)
libdd-telemetry | 19 | 19 | No change (0%)
libdd-tinybytes | 4 | 4 | No change (0%)
libdd-trace-normalization | 2 | 2 | No change (0%)
libdd-trace-obfuscation | 9 | 9 | No change (0%)
libdd-trace-utils | 15 | 15 | No change (0%)
Total | 208 | 208 | No change (0%)

About This Report

This report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality.

github-actions bot commented Mar 2, 2026

🔒 Cargo Deny Results

⚠️ 1 issue(s) found, showing only errors (advisories, bans, sources)

📦 libdd-data-pipeline - 1 error(s)

error[vulnerability]: Denial of Service via Stack Exhaustion
    ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:293:1
    │
293 │ time 0.3.41 registry+https://github.com/rust-lang/crates.io-index
    │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ security vulnerability detected
    │
    ├ ID: RUSTSEC-2026-0009
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0009
    ├ ## Impact
      
      When user-provided input is provided to any type that parses with the RFC 2822 format, a denial of
      service attack via stack exhaustion is possible. The attack relies on formally deprecated and
      rarely-used features that are part of the RFC 2822 format used in a malicious manner. Ordinary,
      non-malicious input will never encounter this scenario.
      
      ## Patches
      
      A limit to the depth of recursion was added in v0.3.47. From this version, an error will be returned
      rather than exhausting the stack.
      
      ## Workarounds
      
      Limiting the length of user input is the simplest way to avoid stack exhaustion, as the amount of
      the stack consumed would be at most a factor of the length of the input.
    ├ Announcement: https://github.com/time-rs/time/blob/main/CHANGELOG.md#0347-2026-02-05
    ├ Solution: Upgrade to >=0.3.47 (try `cargo update -p time`)
    ├ time v0.3.41
      └── tracing-appender v0.2.3
          └── libdd-log v1.0.0
              └── (dev) libdd-data-pipeline v2.0.0

advisories FAILED, bans ok, sources ok

📦 libdd-trace-utils - ✅ No issues


Updated: 2026-03-11 23:36:34 UTC | Commit: bb71f5c | dependency-check job results

pr-commenter bot commented Mar 2, 2026

Benchmarks

Comparison

Benchmark execution time: 2026-03-11 23:45:37

Comparing candidate commit 82435cb in PR branch rachel.yang/OTLP-trace-export with baseline commit 001bd56 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 58 metrics, 2 unstable metrics.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because often changes are not that big:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'
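
The significance rule described above can be sketched as a small predicate. This is an illustration only, not the benchmarking platform's code: `is_significant` is a hypothetical name, and the threshold is assumed to apply symmetrically around 0%, with all values expressed as relative differences in percent.

```rust
/// Returns true when the confidence interval [ci_lower, ci_upper] lies
/// entirely outside the band [-threshold, +threshold].
/// Assumption: the threshold is symmetric around 0%.
pub fn is_significant(ci_lower: f64, ci_upper: f64, threshold: f64) -> bool {
    ci_lower > threshold || ci_upper < -threshold
}
```

With a 1% threshold, the interval (1.3%, 3.1%) from the second diagram is significant, while the interval (-0.6%, 1.2%) from the first diagram is not, because it overlaps the band around 0%.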

Contributor

dd-octo-sts bot commented Mar 2, 2026

Artifact Size Benchmark Report

aarch64-alpine-linux-musl
Artifact Baseline Commit Change
/aarch64-alpine-linux-musl/lib/libdatadog_profiling.a 97.30 MB 97.66 MB +.36% (+368.04 KB) 🔍
/aarch64-alpine-linux-musl/lib/libdatadog_profiling.so 8.51 MB 8.57 MB +.73% (+64.02 KB) 🔍
aarch64-unknown-linux-gnu
Artifact Baseline Commit Change
/aarch64-unknown-linux-gnu/lib/libdatadog_profiling.a 112.92 MB 113.38 MB +.40% (+472.60 KB) 🔍
/aarch64-unknown-linux-gnu/lib/libdatadog_profiling.so 11.12 MB 11.13 MB +.07% (+8.14 KB) 🔍
libdatadog-x64-windows
Artifact Baseline Commit Change
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.dll 27.16 MB 27.27 MB +.40% (+111.50 KB) 🔍
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.lib 76.26 KB 76.61 KB +.46% (+360 B) 🔍
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.pdb 185.98 MB 186.54 MB +.30% (+576.00 KB) 🔍
/libdatadog-x64-windows/debug/static/datadog_profiling_ffi.lib 916.65 MB 920.13 MB +.38% (+3.48 MB) 🔍
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.dll 9.93 MB 9.99 MB +.52% (+53.50 KB) 🔍
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.lib 76.26 KB 76.61 KB +.46% (+360 B) 🔍
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.pdb 24.76 MB 24.87 MB +.41% (+104.00 KB) 🔍
/libdatadog-x64-windows/release/static/datadog_profiling_ffi.lib 51.43 MB 51.63 MB +.40% (+211.46 KB) 🔍
libdatadog-x86-windows
Artifact Baseline Commit Change
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.dll 22.97 MB 23.06 MB +.42% (+99.50 KB) 🔍
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.lib 77.44 KB 77.80 KB +.45% (+364 B) 🔍
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.pdb 190.16 MB 190.74 MB +.30% (+600.00 KB) 🔍
/libdatadog-x86-windows/debug/static/datadog_profiling_ffi.lib 900.31 MB 903.79 MB +.38% (+3.48 MB) 🔍
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.dll 7.53 MB 7.57 MB +.47% (+36.50 KB) 🔍
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.lib 77.44 KB 77.80 KB +.45% (+364 B) 🔍
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.pdb 26.52 MB 26.63 MB +.41% (+112.00 KB) 🔍
/libdatadog-x86-windows/release/static/datadog_profiling_ffi.lib 47.06 MB 47.25 MB +.40% (+197.12 KB) 🔍
x86_64-alpine-linux-musl
Artifact Baseline Commit Change
/x86_64-alpine-linux-musl/lib/libdatadog_profiling.a 85.27 MB 85.59 MB +.37% (+326.28 KB) 🔍
/x86_64-alpine-linux-musl/lib/libdatadog_profiling.so 10.04 MB 10.08 MB +.35% (+36.02 KB) 🔍
x86_64-unknown-linux-gnu
Artifact Baseline Commit Change
/x86_64-unknown-linux-gnu/lib/libdatadog_profiling.a 105.90 MB 106.31 MB +.39% (+425.24 KB) 🔍
/x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so 11.79 MB 11.83 MB +.31% (+37.65 KB) 🔍

codecov-commenter commented Mar 3, 2026

Codecov Report

❌ Patch coverage is 76.73861% with 97 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.30%. Comparing base (001bd56) to head (82435cb).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1641      +/-   ##
==========================================
+ Coverage   71.25%   71.30%   +0.04%     
==========================================
  Files         429      431       +2     
  Lines       63547    63964     +417     
==========================================
+ Hits        45279    45608     +329     
- Misses      18268    18356      +88     
Components Coverage Δ
libdd-crashtracker 62.40% <ø> (-0.02%) ⬇️
libdd-crashtracker-ffi 17.18% <ø> (ø)
libdd-alloc 98.77% <ø> (ø)
libdd-data-pipeline 88.04% <70.00%> (-0.29%) ⬇️
libdd-data-pipeline-ffi 76.91% <12.50%> (-0.04%) ⬇️
libdd-common 79.73% <ø> (ø)
libdd-common-ffi 73.40% <ø> (ø)
libdd-telemetry 62.48% <ø> (ø)
libdd-telemetry-ffi 16.75% <ø> (ø)
libdd-dogstatsd-client 82.64% <ø> (ø)
datadog-ipc 80.35% <ø> (-0.12%) ⬇️
libdd-profiling 81.60% <ø> (ø)
libdd-profiling-ffi 63.65% <ø> (ø)
datadog-sidecar 32.61% <ø> (ø)
datdog-sidecar-ffi 8.35% <ø> (ø)
spawn-worker 54.69% <ø> (ø)
libdd-tinybytes 93.16% <ø> (ø)
libdd-trace-normalization 81.71% <ø> (ø)
libdd-trace-obfuscation 91.80% <ø> (ø)
libdd-trace-protobuf 68.25% <ø> (ø)
libdd-trace-utils 88.66% <79.79%> (-0.42%) ⬇️
datadog-tracer-flare 88.95% <ø> (ø)
libdd-log 74.69% <ø> (ø)

/// Sets the tracer/SDK name for OTLP resource attribute `telemetry.sdk.name`
/// (e.g. "dd-trace-py"). When unset, OTLP uses "libdatadog".
#[no_mangle]
pub unsafe extern "C" fn ddog_trace_exporter_config_set_tracer_name(

Is the tracer name a required field in the OTLP trace payload? It would be nice to go with a minimal implementation where we defer unnecessary configuration. Also, can the tracer name be inferred from other configuration?


#[derive(Debug, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct KeyValue {

Did we look into vendoring opentelemetry-rust, or adding it as a dependency of libdd-data-pipeline, so we can re-use existing functionality (ex)?

At the minimum we should make it clear which components are borrowed from opentelemetry-rust and include a versioned link to the original implementation.

Author

I talked to Paul about this and we agreed that adding open-telemetry as a dependency would be more work and not worth the hassle.

Author

> At the minimum we should make it clear which components are borrowed from opentelemetry-rust and include a versioned link to the original implementation.

I can add this. 🫡

}

#[derive(Debug, Serialize)]
#[serde(rename_all = "camelCase")]

Do we need the rename_all option here?

}

/// OTLP SpanKind enum values.
pub mod span_kind {

Why are we using snake_case here? This is inconsistent with AnyValue.

Contributor

I don't really understand this comment. Module names should be snake_case, and the naming does not affect how the structs are serialised.
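
To make the point concrete, here is a minimal sketch of a snake_case module holding the OTLP `SpanKind` values. The numeric values are the ones defined by the OTLP trace protobuf. The helper `otlp_span_kind`, its input strings (Datadog `span.kind` tag values), and the choice of `UNSPECIFIED` as the fallback are assumptions for illustration, not necessarily what the PR does.

```rust
/// OTLP SpanKind enum values. The module name follows Rust's snake_case
/// convention; it never appears in the serialized payload, only the
/// integers do.
pub mod span_kind {
    pub const UNSPECIFIED: i32 = 0;
    pub const INTERNAL: i32 = 1;
    pub const SERVER: i32 = 2;
    pub const CLIENT: i32 = 3;
    pub const PRODUCER: i32 = 4;
    pub const CONSUMER: i32 = 5;
}

/// Hypothetical helper: maps a Datadog `span.kind` tag value to the OTLP
/// integer. Falling back to UNSPECIFIED is an assumption of this sketch.
pub fn otlp_span_kind(dd_span_kind: Option<&str>) -> i32 {
    match dd_span_kind {
        Some("internal") => span_kind::INTERNAL,
        Some("server") => span_kind::SERVER,
        Some("client") => span_kind::CLIENT,
        Some("producer") => span_kind::PRODUCER,
        Some("consumer") => span_kind::CONSUMER,
        _ => span_kind::UNSPECIFIED,
    }
}
```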

@rachelyangdog rachelyangdog marked this pull request as ready for review March 6, 2026 19:18
Contributor

@paullegranddc paullegranddc left a comment

To test the otlp trace export, you can add test agent snapshot tests here

Comment on lines +1 to +2
// Copyright 2024-Present Datadog, Inc. https://www.datadoghq.com/
// SPDX-License-Identifier: Apache-2.0
Contributor

I think we should parse the configuration in the host language, not in libdatadog.

For instance, this is how we would configure the TraceExporter for agent_url:

1) language_configuration resolves agent_url from different sources
2) when the tracer wants to instantiate a TraceExporter, it first creates a TraceExporterBuilder and calls setters on it (set_agent_url in this case)
3) TraceExporterBuilder::build gives a TraceExporter containing the agent_url
4) In TraceExporter::send we can just read self.agent_url
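
The flow in steps 1 to 4 can be sketched as follows. The types below are simplified stand-ins that reuse the names from the comment (`TraceExporterBuilder`, `set_agent_url`, `build`, `send`); they are not libdatadog's real signatures, and `send` only formats a description of the request instead of performing I/O.

```rust
/// Simplified stand-in for the real builder. The host language resolves the
/// configuration and passes the final value in.
#[derive(Default)]
pub struct TraceExporterBuilder {
    agent_url: Option<String>,
}

impl TraceExporterBuilder {
    /// Step 2: the tracer calls setters with already-resolved values.
    pub fn set_agent_url(mut self, url: &str) -> Self {
        self.agent_url = Some(url.to_owned());
        self
    }

    /// Step 3: build produces an exporter that owns its configuration.
    pub fn build(self) -> Result<TraceExporter, String> {
        let agent_url = self
            .agent_url
            .ok_or_else(|| "agent_url must be set".to_string())?;
        Ok(TraceExporter { agent_url })
    }
}

/// Simplified stand-in for the real exporter.
pub struct TraceExporter {
    agent_url: String,
}

impl TraceExporter {
    /// Step 4: send reads self.agent_url and never touches the environment.
    /// This sketch returns a description of the request instead of sending it.
    pub fn send(&self, payload: &[u8]) -> String {
        format!("POST {} ({} bytes)", self.agent_url, payload.len())
    }
}
```

Under this design the OTLP endpoint, protocol, and headers would follow the same route: resolved by the tracer, passed through setters, and stored on the exporter.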

@rachelyangdog rachelyangdog requested a review from a team as a code owner March 10, 2026 14:08
Comment on lines +23 to +25
/// Note: dynamic OTLP headers from `OTEL_EXPORTER_OTLP_HEADERS` are not forwarded because
/// [`send_with_retry`] requires `&'static str` header keys. Support for arbitrary OTEL headers
/// would require the API to accept `HashMap<String, String>`.
Contributor

mmm true 🤔
I'll refactor send_with_retry so it's possible
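
For context, dynamic headers need owned keys because their names are only known at runtime. A minimal sketch of parsing an `OTEL_EXPORTER_OTLP_HEADERS`-style value (comma-separated `key=value` pairs) into a `HashMap<String, String>` is shown below. `parse_otlp_headers` is a hypothetical helper, not part of the PR, and it deliberately skips the percent-decoding of values that a complete implementation would need.

```rust
use std::collections::HashMap;

/// Hypothetical helper: parses "k1=v1,k2=v2" into owned header pairs.
/// Entries without '=' or with an empty key are skipped.
/// Assumption: values are used as-is (no percent-decoding in this sketch).
pub fn parse_otlp_headers(raw: &str) -> HashMap<String, String> {
    raw.split(',')
        .filter_map(|pair| {
            // Split on the first '=' only, so values may contain '='.
            let (key, value) = pair.split_once('=')?;
            let key = key.trim();
            if key.is_empty() {
                return None;
            }
            Some((key.to_string(), value.trim().to_string()))
        })
        .collect()
}
```

Because the keys here are `String`s built at runtime, they cannot satisfy a `&'static str` bound, which is the limitation the doc comment describes.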

// (equivalent to OTEL_TRACES_SAMPLER=parentbased_always_on).
if let Some(ref config) = self.otlp_config {
return self.send_otlp_traces_inner(traces, config).await;
}
Contributor

I missed it on the first pass, but we should not send non-sampled spans: https://docs.google.com/document/d/1AsUrJxjJavLvSG33kUAJLYzGU8IYgrnf1SMbFDsuGmo/edit?tab=t.0#bookmark=id.rj2adtyzxygy

So this should be moved after the stats::process_traces_for_stats call on line 630.
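
As a rough illustration of the intent, the sketch below drops trace chunks that were not sampled before they reach an export path. It is an assumption-heavy stand-in: `Span`, `drop_unsampled`, and `is_sampled` are hypothetical, the rule "priority greater than 0 means keep" follows the Datadog sampling priority convention (-1 and 0 reject, 1 and 2 keep), and the real logic around stats::process_traces_for_stats may apply additional rules.

```rust
/// Hypothetical, simplified span carrying only what the filter needs.
pub struct Span {
    pub name: String,
    /// Datadog sampling priority: -1 and 0 mean drop, 1 and 2 mean keep.
    pub sampling_priority: Option<i32>,
}

/// A chunk counts as sampled when the first span that carries a priority has
/// a priority greater than 0. Keeping chunks with no priority at all is an
/// assumption of this sketch.
fn is_sampled(chunk: &[Span]) -> bool {
    chunk
        .iter()
        .find_map(|span| span.sampling_priority)
        .map_or(true, |priority| priority > 0)
}

/// Removes unsampled chunks so that only sampled traces are exported.
pub fn drop_unsampled(traces: Vec<Vec<Span>>) -> Vec<Vec<Span>> {
    traces.into_iter().filter(|chunk| is_sampled(chunk)).collect()
}
```

Running the OTLP branch after this kind of filtering, rather than before it, is what the suggested reordering achieves.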

4 participants