feat: change regex to regex-lite #1712

Open
Aaalibaba42 wants to merge 9 commits into main from jwiriath/regex-crate-size-reduction

@Aaalibaba42
Contributor

What does this PR do?

  1. Where a full regex engine is used for trivial parsing, replace it with a hand-written parser.
  2. Where a pattern is trivially convertible to regex-lite, use regex-lite instead of regex.
  3. Figure out what to do for the cases that fit neither 1 nor 2.
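
To illustrate option 1, here is a minimal sketch of replacing a trivial pattern with a hand-written parser. The pattern `^(\d+)\.(\d+)\.(\d+)$` and the names `parse_version` and `parse_component` are hypothetical and are not taken from this PR; the sketch also assumes ASCII digits only.

```rust
/// Hand-written stand-in for the hypothetical pattern `^(\d+)\.(\d+)\.(\d+)$`
/// (ASCII digits only). Returns the three numeric components on success.
fn parse_version(input: &str) -> Option<(u32, u32, u32)> {
    let mut parts = input.split('.');
    let major = parse_component(parts.next()?)?;
    let minor = parse_component(parts.next()?)?;
    let patch = parse_component(parts.next()?)?;
    // Reject trailing components such as "1.2.3.4".
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// Parses one non-empty run of ASCII digits.
fn parse_component(s: &str) -> Option<u32> {
    // `str::parse::<u32>` alone would accept a leading '+', which `\d+`
    // does not, so the digits are checked explicitly first.
    if s.is_empty() || !s.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // Unlike the regex, this returns None for values that overflow u32.
    s.parse().ok()
}
```

Under these assumptions, `parse_version("1.22.3")` yields `Some((1, 22, 3))`, while `"1.2"`, `"1..3"` and `"+1.2.3"` are rejected. The trade-off is more code to maintain in exchange for dropping the regex dependency at that call site.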

Motivation

Reduce binary sizes

Additional Notes

Three places currently fall under option 3:

  • datadog-live-debugger/src/expr_eval.rs
  • datadog-ffe/src/rules_based/ufc/models.rs
  • libdd-trace-obfuscation/src/replacer.rs

In each of these, the patterns are user-provided. We should work out whether such patterns can contain syntax that regex-lite does not support.
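
One rough way to start answering that question is to scan the patterns for syntax regex-lite is documented as lacking. The sketch below covers a single category, Unicode class escapes such as `\p{Greek}`, which as far as I know regex-lite does not support. The function name `uses_unicode_class` is hypothetical, and this is a heuristic rather than a full compatibility check.

```rust
/// Heuristic: returns true if `pattern` contains a `\p` or `\P` escape,
/// i.e. a Unicode class such as `\p{Greek}` or `\PL`.
/// This is an illustrative helper, not part of regex or regex-lite.
fn uses_unicode_class(pattern: &str) -> bool {
    let mut chars = pattern.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            // Consume the escaped character as well, so that `\\p`
            // (an escaped backslash followed by a literal 'p') is not flagged.
            if let Some('p' | 'P') = chars.next() {
                return true;
            }
        }
    }
    false
}
```

A pattern that passes this check can still behave differently under regex-lite: my understanding is that classes like `\w` and `\d` and case-insensitive matching are ASCII-only there. Compiling each stored pattern with `regex_lite::Regex::new` and comparing match results against regex on real inputs would be a stronger test.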

How to test the change?

The existing test suite is comprehensive enough.

@Aaalibaba42 Aaalibaba42 requested review from a team as code owners March 11, 2026 14:20
@pr-commenter

pr-commenter bot commented Mar 11, 2026

Benchmarks

Comparison

Benchmark execution time: 2026-03-12 17:46:05

Comparing candidate commit d2a1974 in PR branch jwiriath/regex-crate-size-reduction with baseline commit 0e8c2c6 in branch main.

Found 1 performance improvement and 1 performance regression! Performance is the same for 56 metrics, 2 unstable metrics.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because changes are often small:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'
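
The significance rule described above can be sketched as follows. This is a reading of the explanation, not the benchmarking platform's actual code: it assumes the threshold is a symmetric band of ±threshold percent around 0%, and the names `RelativeCi`, `Verdict` and `classify` are hypothetical.

```rust
/// Confidence interval over the relative difference of means, in percent,
/// with the baseline as the reference.
#[derive(Debug, Clone, Copy)]
struct RelativeCi {
    lower: f64,
    upper: f64,
}

#[derive(Debug, PartialEq)]
enum Verdict {
    /// The whole CI lies above +threshold.
    Increase,
    /// The whole CI lies below -threshold.
    Decrease,
    /// The CI overlaps the band [-threshold, +threshold].
    NotSignificant,
}

/// Classifies a change as significant only when the CI is entirely
/// outside the (assumed symmetric) threshold band.
fn classify(ci: RelativeCi, threshold: f64) -> Verdict {
    if ci.lower > threshold {
        Verdict::Increase
    } else if ci.upper < -threshold {
        Verdict::Decrease
    } else {
        Verdict::NotSignificant
    }
}
```

With the 1% threshold from the diagram, the CI [1.3%; 3.1%] classifies as `Increase`, and the earlier CI [-0.6%; +1.2%] classifies as `NotSignificant`. Whether an increase counts as better or worse depends on the metric: it is worse for execution time and better for throughput.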

scenario:concentrator/add_spans_to_concentrator

  • 🟩 execution_time [-2.310ms; -2.304ms] or [-17.698%; -17.653%]

scenario:ip_address/quantize_peer_ip_address_benchmark

  • 🟥 execution_time [+2.914µs; +2.931µs] or [+57.424%; +57.768%]

Candidate

Candidate benchmark details

Group 1

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
benching string interning on wordpress profile execution_time 160.976µs 161.503µs ± 0.298µs 161.451µs ± 0.121µs 161.593µs 162.012µs 162.693µs 163.453µs 1.24% 2.741 12.837 0.18% 0.021µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
benching string interning on wordpress profile execution_time [161.461µs; 161.544µs] or [-0.026%; +0.026%] None None None

Group 2

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
concentrator/add_spans_to_concentrator execution_time 10.716ms 10.745ms ± 0.016ms 10.744ms ± 0.010ms 10.754ms 10.773ms 10.788ms 10.813ms 0.64% 0.793 1.336 0.15% 0.001ms 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
concentrator/add_spans_to_concentrator execution_time [10.743ms; 10.747ms] or [-0.020%; +0.020%] None None None

Group 3

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
ip_address/quantize_peer_ip_address_benchmark execution_time 7.944µs 7.997µs ± 0.041µs 7.987µs ± 0.014µs 7.998µs 8.086µs 8.092µs 8.093µs 1.33% 1.331 0.467 0.51% 0.003µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
ip_address/quantize_peer_ip_address_benchmark execution_time [7.991µs; 8.003µs] or [-0.071%; +0.071%] None None None

Group 4

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
profile_add_sample_frames_x1000 execution_time 4.164ms 4.169ms ± 0.007ms 4.168ms ± 0.001ms 4.169ms 4.172ms 4.177ms 4.257ms 2.13% 11.498 147.981 0.16% 0.000ms 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
profile_add_sample_frames_x1000 execution_time [4.168ms; 4.170ms] or [-0.022%; +0.022%] None None None

Group 5

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... execution_time 204.800µs 205.737µs ± 0.403µs 205.706µs ± 0.301µs 206.014µs 206.364µs 206.723µs 207.342µs 0.80% 0.459 0.448 0.20% 0.028µs 1 200
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... throughput 4822950.152op/s 4860584.622op/s ± 9511.298op/s 4861295.736op/s ± 7116.029op/s 4868157.310op/s 4874519.912op/s 4879147.531op/s 4882815.077op/s 0.44% -0.446 0.415 0.20% 672.550op/s 1 200
normalization/normalize_name/normalize_name/bad-name execution_time 18.505µs 18.639µs ± 0.058µs 18.638µs ± 0.042µs 18.682µs 18.733µs 18.766µs 18.875µs 1.28% 0.362 0.503 0.31% 0.004µs 1 200
normalization/normalize_name/normalize_name/bad-name throughput 52978847.970op/s 53651400.820op/s ± 166529.819op/s 53654899.418op/s ± 119934.446op/s 53773302.239op/s 53900552.407op/s 54010649.957op/s 54040265.493op/s 0.72% -0.340 0.452 0.31% 11775.436op/s 1 200
normalization/normalize_name/normalize_name/good execution_time 10.817µs 10.932µs ± 0.044µs 10.926µs ± 0.026µs 10.954µs 11.014µs 11.045µs 11.125µs 1.82% 0.822 1.793 0.40% 0.003µs 1 200
normalization/normalize_name/normalize_name/good throughput 89889305.780op/s 91479563.073op/s ± 364145.314op/s 91521319.138op/s ± 213604.470op/s 91726834.686op/s 92006063.698op/s 92189976.345op/s 92448835.568op/s 1.01% -0.785 1.685 0.40% 25748.962op/s 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... execution_time [205.682µs; 205.793µs] or [-0.027%; +0.027%] None None None
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... throughput [4859266.447op/s; 4861902.796op/s] or [-0.027%; +0.027%] None None None
normalization/normalize_name/normalize_name/bad-name execution_time [18.631µs; 18.647µs] or [-0.043%; +0.043%] None None None
normalization/normalize_name/normalize_name/bad-name throughput [53628321.389op/s; 53674480.251op/s] or [-0.043%; +0.043%] None None None
normalization/normalize_name/normalize_name/good execution_time [10.926µs; 10.938µs] or [-0.055%; +0.055%] None None None
normalization/normalize_name/normalize_name/good throughput [91429096.035op/s; 91530030.112op/s] or [-0.055%; +0.055%] None None None

Group 6

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
normalization/normalize_trace/test_trace execution_time 246.257ns 256.289ns ± 13.356ns 251.205ns ± 3.786ns 257.128ns 284.744ns 305.809ns 308.203ns 22.69% 2.221 4.548 5.20% 0.944ns 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
normalization/normalize_trace/test_trace execution_time [254.438ns; 258.141ns] or [-0.722%; +0.722%] None None None

Group 7

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
receiver_entry_point/report/2598 execution_time 3.422ms 3.454ms ± 0.029ms 3.447ms ± 0.009ms 3.459ms 3.502ms 3.528ms 3.721ms 7.96% 4.664 36.647 0.83% 0.002ms 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
receiver_entry_point/report/2598 execution_time [3.450ms; 3.458ms] or [-0.115%; +0.115%] None None None

Group 8

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
write only interface execution_time 1.181µs 3.234µs ± 1.440µs 2.997µs ± 0.037µs 3.028µs 3.650µs 14.093µs 14.935µs 398.30% 7.274 54.424 44.42% 0.102µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
write only interface execution_time [3.034µs; 3.434µs] or [-6.172%; +6.172%] None None None

Group 9

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
two way interface execution_time 18.048µs 26.635µs ± 10.020µs 18.516µs ± 0.410µs 35.389µs 45.971µs 46.508µs 65.694µs 254.79% 0.836 -0.131 37.52% 0.709µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
two way interface execution_time [25.247µs; 28.024µs] or [-5.214%; +5.214%] None None None

Group 10

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
sql/obfuscate_sql_string execution_time 89.164µs 89.353µs ± 0.394µs 89.297µs ± 0.057µs 89.367µs 89.563µs 90.003µs 94.468µs 5.79% 11.297 141.340 0.44% 0.028µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
sql/obfuscate_sql_string execution_time [89.298µs; 89.408µs] or [-0.061%; +0.061%] None None None

Group 11

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
profile_add_sample2_frames_x1000 execution_time 731.940µs 733.492µs ± 0.587µs 733.465µs ± 0.357µs 733.835µs 734.505µs 735.177µs 735.406µs 0.26% 0.362 0.607 0.08% 0.042µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
profile_add_sample2_frames_x1000 execution_time [733.411µs; 733.574µs] or [-0.011%; +0.011%] None None None

Group 12

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
sdk_test_data/rules-based execution_time 144.040µs 146.196µs ± 1.802µs 145.893µs ± 0.511µs 146.434µs 148.647µs 156.081µs 160.885µs 10.28% 4.815 30.547 1.23% 0.127µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
sdk_test_data/rules-based execution_time [145.946µs; 146.445µs] or [-0.171%; +0.171%] None None None

Group 13

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
redis/obfuscate_redis_string execution_time 34.107µs 34.636µs ± 0.777µs 34.266µs ± 0.069µs 34.482µs 36.217µs 36.282µs 38.386µs 12.02% 1.869 2.622 2.24% 0.055µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
redis/obfuscate_redis_string execution_time [34.528µs; 34.743µs] or [-0.311%; +0.311%] None None None

Group 14

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
benching serializing traces from their internal representation to msgpack execution_time 14.093ms 14.150ms ± 0.034ms 14.145ms ± 0.012ms 14.154ms 14.237ms 14.267ms 14.340ms 1.38% 2.605 8.846 0.24% 0.002ms 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
benching serializing traces from their internal representation to msgpack execution_time [14.145ms; 14.155ms] or [-0.033%; +0.033%] None None None

Group 15

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
tags/replace_trace_tags execution_time 2.387µs 2.447µs ± 0.022µs 2.439µs ± 0.008µs 2.467µs 2.481µs 2.489µs 2.491µs 2.11% -0.063 -0.136 0.88% 0.002µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
tags/replace_trace_tags execution_time [2.444µs; 2.450µs] or [-0.123%; +0.123%] None None None

Group 16

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
benching deserializing traces from msgpack to their internal representation execution_time 49.251ms 49.548ms ± 1.300ms 49.357ms ± 0.045ms 49.420ms 49.608ms 57.351ms 62.138ms 25.89% 8.241 68.663 2.62% 0.092ms 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
benching deserializing traces from msgpack to their internal representation execution_time [49.368ms; 49.728ms] or [-0.364%; +0.364%] None None None

Group 17

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... execution_time 495.178µs 496.277µs ± 0.811µs 496.183µs ± 0.269µs 496.516µs 496.925µs 497.413µs 506.007µs 1.98% 8.685 101.750 0.16% 0.057µs 1 200
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... throughput 1976256.595op/s 2015010.478op/s ± 3248.258op/s 2015383.873op/s ± 1093.513op/s 2016282.828op/s 2017845.489op/s 2018557.347op/s 2019474.919op/s 0.20% -8.547 99.556 0.16% 229.687op/s 1 200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて execution_time 371.785µs 372.573µs ± 0.362µs 372.576µs ± 0.266µs 372.838µs 373.131µs 373.386µs 373.575µs 0.27% 0.054 -0.390 0.10% 0.026µs 1 200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて throughput 2676835.227op/s 2684037.259op/s ± 2608.316op/s 2684013.015op/s ± 1917.927op/s 2685928.903op/s 2688145.576op/s 2689657.459op/s 2689727.533op/s 0.21% -0.049 -0.392 0.10% 184.436op/s 1 200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters execution_time 169.261µs 169.893µs ± 0.209µs 169.896µs ± 0.139µs 170.041µs 170.196µs 170.390µs 170.410µs 0.30% -0.309 0.245 0.12% 0.015µs 1 200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters throughput 5868187.410op/s 5886058.018op/s ± 7237.340op/s 5885947.719op/s ± 4818.356op/s 5890470.270op/s 5898457.506op/s 5904582.026op/s 5908050.942op/s 0.38% 0.317 0.252 0.12% 511.757op/s 1 200
normalization/normalize_service/normalize_service/[empty string] execution_time 38.036µs 38.294µs ± 0.127µs 38.281µs ± 0.091µs 38.388µs 38.510µs 38.566µs 38.648µs 0.96% 0.174 -0.494 0.33% 0.009µs 1 200
normalization/normalize_service/normalize_service/[empty string] throughput 25874636.761op/s 26113940.204op/s ± 86642.330op/s 26122292.802op/s ± 62267.927op/s 26169962.875op/s 26260650.301op/s 26280630.971op/s 26290624.162op/s 0.64% -0.159 -0.502 0.33% 6126.538op/s 1 200
normalization/normalize_service/normalize_service/test_ASCII execution_time 46.183µs 46.307µs ± 0.053µs 46.302µs ± 0.034µs 46.336µs 46.402µs 46.434µs 46.550µs 0.54% 0.835 1.520 0.12% 0.004µs 1 200
normalization/normalize_service/normalize_service/test_ASCII throughput 21482261.738op/s 21595251.977op/s ± 24875.780op/s 21597269.242op/s ± 15764.591op/s 21613058.179op/s 21628702.586op/s 21639666.999op/s 21653046.499op/s 0.26% -0.825 1.485 0.11% 1758.983op/s 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... execution_time [496.164µs; 496.389µs] or [-0.023%; +0.023%] None None None
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... throughput [2014560.301op/s; 2015460.656op/s] or [-0.022%; +0.022%] None None None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて execution_time [372.523µs; 372.624µs] or [-0.013%; +0.013%] None None None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて throughput [2683675.771op/s; 2684398.746op/s] or [-0.013%; +0.013%] None None None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters execution_time [169.864µs; 169.922µs] or [-0.017%; +0.017%] None None None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters throughput [5885054.992op/s; 5887061.043op/s] or [-0.017%; +0.017%] None None None
normalization/normalize_service/normalize_service/[empty string] execution_time [38.277µs; 38.312µs] or [-0.046%; +0.046%] None None None
normalization/normalize_service/normalize_service/[empty string] throughput [26101932.410op/s; 26125947.998op/s] or [-0.046%; +0.046%] None None None
normalization/normalize_service/normalize_service/test_ASCII execution_time [46.299µs; 46.314µs] or [-0.016%; +0.016%] None None None
normalization/normalize_service/normalize_service/test_ASCII throughput [21591804.434op/s; 21598699.521op/s] or [-0.016%; +0.016%] None None None

Group 18

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
single_flag_killswitch/rules-based execution_time 190.528ns 193.341ns ± 1.865ns 193.138ns ± 1.267ns 194.138ns 197.093ns 199.031ns 201.149ns 4.15% 1.183 1.910 0.96% 0.132ns 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
single_flag_killswitch/rules-based execution_time [193.083ns; 193.600ns] or [-0.134%; +0.134%] None None None

Group 19

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz d2a1974 1773336536 jwiriath/regex-crate-size-reduction
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
credit_card/is_card_number/ execution_time 3.894µs 3.914µs ± 0.003µs 3.914µs ± 0.002µs 3.916µs 3.918µs 3.920µs 3.923µs 0.22% -1.857 15.604 0.07% 0.000µs 1 200
credit_card/is_card_number/ throughput 254925245.541op/s 255471900.796op/s ± 170867.389op/s 255481694.749op/s ± 100347.089op/s 255566332.243op/s 255687994.602op/s 255745225.153op/s 256777029.199op/s 0.51% 1.886 15.847 0.07% 12082.149op/s 1 200
credit_card/is_card_number/ 3782-8224-6310-005 execution_time 79.473µs 80.136µs ± 0.154µs 80.124µs ± 0.094µs 80.217µs 80.322µs 80.609µs 81.194µs 1.34% 1.865 13.253 0.19% 0.011µs 1 200
credit_card/is_card_number/ 3782-8224-6310-005 throughput 12316153.088op/s 12478842.073op/s ± 23899.066op/s 12480705.458op/s ± 14614.046op/s 12495353.604op/s 12505414.282op/s 12513392.167op/s 12582894.853op/s 0.82% -1.798 12.888 0.19% 1689.919op/s 1 200
credit_card/is_card_number/ 378282246310005 execution_time 67.804µs 67.931µs ± 0.086µs 67.917µs ± 0.047µs 67.971µs 68.057µs 68.149µs 68.621µs 1.04% 3.002 19.962 0.13% 0.006µs 1 200
credit_card/is_card_number/ 378282246310005 throughput 14572804.154op/s 14720834.952op/s ± 18470.424op/s 14723782.341op/s ± 10189.061op/s 14733037.612op/s 14742117.436op/s 14746202.154op/s 14748460.032op/s 0.17% -2.954 19.449 0.13% 1306.056op/s 1 200
credit_card/is_card_number/37828224631 execution_time 3.895µs 3.916µs ± 0.003µs 3.916µs ± 0.002µs 3.918µs 3.921µs 3.922µs 3.926µs 0.25% -1.073 6.709 0.08% 0.000µs 1 200
credit_card/is_card_number/37828224631 throughput 254716161.955op/s 255349765.555op/s ± 216466.072op/s 255341739.195op/s ± 139240.258op/s 255494289.705op/s 255668506.026op/s 255725205.945op/s 256718075.927op/s 0.54% 1.092 6.835 0.08% 15306.463op/s 1 200
credit_card/is_card_number/378282246310005 execution_time 64.588µs 64.733µs ± 0.072µs 64.725µs ± 0.049µs 64.780µs 64.857µs 64.906µs 64.979µs 0.39% 0.582 0.043 0.11% 0.005µs 1 200
credit_card/is_card_number/378282246310005 throughput 15389494.198op/s 15447998.413op/s ± 17164.820op/s 15450070.780op/s ± 11782.168op/s 15460812.638op/s 15472286.660op/s 15476640.447op/s 15482698.085op/s 0.21% -0.576 0.032 0.11% 1213.736op/s 1 200
credit_card/is_card_number/37828224631000521389798 execution_time 45.431µs 45.722µs ± 0.088µs 45.727µs ± 0.060µs 45.781µs 45.864µs 45.893µs 45.932µs 0.45% -0.301 0.178 0.19% 0.006µs 1 200
credit_card/is_card_number/37828224631000521389798 throughput 21771532.717op/s 21871478.950op/s ± 42085.283op/s 21868964.665op/s ± 28801.252op/s 21902045.103op/s 21940004.417op/s 21972918.961op/s 22011362.084op/s 0.65% 0.313 0.197 0.19% 2975.879op/s 1 200
credit_card/is_card_number/x371413321323331 execution_time 6.536µs 6.619µs ± 0.020µs 6.620µs ± 0.015µs 6.636µs 6.642µs 6.645µs 6.648µs 0.42% -1.021 1.343 0.29% 0.001µs 1 200
credit_card/is_card_number/x371413321323331 throughput 150426297.382op/s 151089113.464op/s ± 447236.873op/s 151059001.584op/s ± 339435.262op/s 151338210.044op/s 151907886.603op/s 152422745.320op/s 153001175.741op/s 1.29% 1.041 1.424 0.30% 31624.423op/s 1 200
credit_card/is_card_number_no_luhn/ execution_time 3.894µs 3.914µs ± 0.003µs 3.914µs ± 0.001µs 3.916µs 3.918µs 3.920µs 3.922µs 0.20% -2.206 19.666 0.07% 0.000µs 1 200
credit_card/is_card_number_no_luhn/ throughput 254980782.632op/s 255467076.053op/s ± 169715.880op/s 255480302.469op/s ± 87801.399op/s 255558580.416op/s 255662895.620op/s 255705723.992op/s 256837274.638op/s 0.53% 2.240 19.975 0.07% 12000.725op/s 1 200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 execution_time 65.498µs 65.702µs ± 0.051µs 65.698µs ± 0.028µs 65.729µs 65.784µs 65.840µs 65.850µs 0.23% 0.112 1.385 0.08% 0.004µs 1 200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 throughput 15185961.026op/s 15220311.374op/s ± 11724.467op/s 15221218.020op/s ± 6594.120op/s 15227279.842op/s 15237720.771op/s 15243146.545op/s 15267582.342op/s 0.30% -0.105 1.392 0.08% 829.045op/s 1 200
credit_card/is_card_number_no_luhn/ 378282246310005 execution_time 53.352µs 53.451µs ± 0.045µs 53.449µs ± 0.031µs 53.481µs 53.526µs 53.556µs 53.598µs 0.28% 0.375 -0.121 0.08% 0.003µs 1 200
credit_card/is_card_number_no_luhn/ 378282246310005 throughput 18657412.590op/s 18708813.540op/s ± 15605.668op/s 18709274.053op/s ± 10860.394op/s 18719614.699op/s 18731352.776op/s 18736451.641op/s 18743264.139op/s 0.18% -0.371 -0.128 0.08% 1103.487op/s 1 200
credit_card/is_card_number_no_luhn/37828224631 execution_time 3.894µs 3.916µs ± 0.003µs 3.916µs ± 0.002µs 3.918µs 3.919µs 3.921µs 3.922µs 0.17% -2.127 15.649 0.07% 0.000µs 1 200
credit_card/is_card_number_no_luhn/37828224631 throughput 254940609.315op/s 255381197.909op/s ± 181376.937op/s 255375759.701op/s ± 116163.189op/s 255486484.775op/s 255624273.032op/s 255684874.334op/s 256774809.899op/s 0.55% 2.155 15.907 0.07% 12825.286op/s 1 200
credit_card/is_card_number_no_luhn/378282246310005 execution_time 50.152µs 50.224µs ± 0.035µs 50.226µs ± 0.026µs 50.250µs 50.282µs 50.294µs 50.300µs 0.15% 0.004 -0.742 0.07% 0.002µs 1 200
credit_card/is_card_number_no_luhn/378282246310005 throughput 19880608.991op/s 19910741.272op/s ± 13719.308op/s 19910045.419op/s ± 10356.200op/s 19920631.226op/s 19932183.143op/s 19938176.182op/s 19939364.393op/s 0.15% -0.001 -0.743 0.07% 970.102op/s 1 200
credit_card/is_card_number_no_luhn/37828224631000521389798 execution_time 45.452µs 45.706µs ± 0.088µs 45.710µs ± 0.054µs 45.766µs 45.843µs 45.898µs 45.899µs 0.41% -0.340 0.041 0.19% 0.006µs 1 200
credit_card/is_card_number_no_luhn/37828224631000521389798 throughput 21787151.271op/s 21878962.744op/s ± 42227.412op/s 21876868.512op/s ± 26080.643op/s 21902272.911op/s 21956479.791op/s 21982434.073op/s 22001369.108op/s 0.57% 0.351 0.050 0.19% 2985.929op/s 1 200
credit_card/is_card_number_no_luhn/x371413321323331 execution_time 6.544µs 6.623µs ± 0.018µs 6.626µs ± 0.012µs 6.636µs 6.644µs 6.649µs 6.653µs 0.40% -0.982 1.180 0.28% 0.001µs 1 200
credit_card/is_card_number_no_luhn/x371413321323331 throughput 150312210.988op/s 151000554.302op/s ± 419387.524op/s 150913935.405op/s ± 280073.087op/s 151226260.490op/s 151759503.165op/s 152260832.336op/s 152803859.874op/s 1.25% 1.000 1.252 0.28% 29655.176op/s 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
credit_card/is_card_number/ execution_time [3.914µs; 3.915µs] or [-0.009%; +0.009%] None None None
credit_card/is_card_number/ throughput [255448220.220op/s; 255495581.373op/s] or [-0.009%; +0.009%] None None None
credit_card/is_card_number/ 3782-8224-6310-005 execution_time [80.115µs; 80.157µs] or [-0.027%; +0.027%] None None None
credit_card/is_card_number/ 3782-8224-6310-005 throughput [12475529.892op/s; 12482154.254op/s] or [-0.027%; +0.027%] None None None
credit_card/is_card_number/ 378282246310005 execution_time [67.919µs; 67.943µs] or [-0.017%; +0.017%] None None None
credit_card/is_card_number/ 378282246310005 throughput [14718275.129op/s; 14723394.776op/s] or [-0.017%; +0.017%] None None None
credit_card/is_card_number/37828224631 execution_time [3.916µs; 3.917µs] or [-0.012%; +0.012%] None None None
credit_card/is_card_number/37828224631 throughput [255319765.439op/s; 255379765.670op/s] or [-0.012%; +0.012%] None None None
credit_card/is_card_number/378282246310005 execution_time [64.723µs; 64.743µs] or [-0.015%; +0.015%] None None None
credit_card/is_card_number/378282246310005 throughput [15445619.534op/s; 15450377.292op/s] or [-0.015%; +0.015%] None None None
credit_card/is_card_number/37828224631000521389798 execution_time [45.710µs; 45.734µs] or [-0.027%; +0.027%] None None None
credit_card/is_card_number/37828224631000521389798 throughput [21865646.335op/s; 21877311.566op/s] or [-0.027%; +0.027%] None None None
credit_card/is_card_number/x371413321323331 execution_time [6.616µs; 6.621µs] or [-0.041%; +0.041%] None None None
credit_card/is_card_number/x371413321323331 throughput [151027130.735op/s; 151151096.194op/s] or [-0.041%; +0.041%] None None None
credit_card/is_card_number_no_luhn/ execution_time [3.914µs; 3.915µs] or [-0.009%; +0.009%] None None None
credit_card/is_card_number_no_luhn/ throughput [255443555.065op/s; 255490597.042op/s] or [-0.009%; +0.009%] None None None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 execution_time [65.695µs; 65.709µs] or [-0.011%; +0.011%] None None None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 throughput [15218686.476op/s; 15221936.273op/s] or [-0.011%; +0.011%] None None None
credit_card/is_card_number_no_luhn/ 378282246310005 execution_time [53.445µs; 53.457µs] or [-0.012%; +0.012%] None None None
credit_card/is_card_number_no_luhn/ 378282246310005 throughput [18706650.745op/s; 18710976.336op/s] or [-0.012%; +0.012%] None None None
credit_card/is_card_number_no_luhn/37828224631 execution_time [3.915µs; 3.916µs] or [-0.010%; +0.010%] None None None
credit_card/is_card_number_no_luhn/37828224631 throughput [255356060.809op/s; 255406335.008op/s] or [-0.010%; +0.010%] None None None
credit_card/is_card_number_no_luhn/378282246310005 execution_time [50.219µs; 50.229µs] or [-0.010%; +0.010%] None None None
credit_card/is_card_number_no_luhn/378282246310005 throughput [19908839.908op/s; 19912642.636op/s] or [-0.010%; +0.010%] None None None
credit_card/is_card_number_no_luhn/37828224631000521389798 execution_time [45.694µs; 45.718µs] or [-0.027%; +0.027%] None None None
credit_card/is_card_number_no_luhn/37828224631000521389798 throughput [21873110.431op/s; 21884815.057op/s] or [-0.027%; +0.027%] None None None
credit_card/is_card_number_no_luhn/x371413321323331 execution_time [6.620µs; 6.625µs] or [-0.038%; +0.038%] None None None
credit_card/is_card_number_no_luhn/x371413321323331 throughput [150942431.225op/s; 151058677.380op/s] or [-0.038%; +0.038%] None None None

Baseline

Omitted due to size.

@codecov-commenter

codecov-commenter commented Mar 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.28%. Comparing base (b04809b) to head (d2a1974).
⚠️ Report is 8 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1712      +/-   ##
==========================================
+ Coverage   71.18%   71.28%   +0.10%     
==========================================
  Files         429      429              
  Lines       63503    63802     +299     
==========================================
+ Hits        45207    45484     +277     
- Misses      18296    18318      +22     
Components Coverage Δ
libdd-crashtracker 62.37% <ø> (+0.04%) ⬆️
libdd-crashtracker-ffi 16.71% <ø> (+0.15%) ⬆️
libdd-alloc 98.77% <ø> (ø)
libdd-data-pipeline 87.33% <ø> (-0.87%) ⬇️
libdd-data-pipeline-ffi 72.87% <ø> (-3.09%) ⬇️
libdd-common 79.73% <ø> (ø)
libdd-common-ffi 73.40% <ø> (ø)
libdd-telemetry 62.48% <ø> (ø)
libdd-telemetry-ffi 16.75% <ø> (ø)
libdd-dogstatsd-client 82.64% <ø> (ø)
datadog-ipc 80.35% <ø> (-0.12%) ⬇️
libdd-profiling 81.60% <ø> (+0.01%) ⬆️
libdd-profiling-ffi 63.65% <ø> (ø)
datadog-sidecar 33.10% <ø> (+0.61%) ⬆️
datdog-sidecar-ffi 10.49% <ø> (+2.68%) ⬆️
spawn-worker 54.69% <ø> (ø)
libdd-tinybytes 93.16% <ø> (ø)
libdd-trace-normalization 81.71% <ø> (ø)
libdd-trace-obfuscation 91.80% <ø> (ø)
libdd-trace-protobuf 68.25% <ø> (ø)
libdd-trace-utils 89.24% <ø> (+0.16%) ⬆️
datadog-tracer-flare 88.28% <ø> (-2.17%) ⬇️
libdd-log 74.69% <ø> (ø)

@dd-octo-sts
Contributor

dd-octo-sts bot commented Mar 11, 2026

Artifact Size Benchmark Report

aarch64-alpine-linux-musl
Artifact Baseline Commit Change
/aarch64-alpine-linux-musl/lib/libdatadog_profiling.a 100.28 MB 100.77 MB +.48% (+498.75 KB) 🔍
/aarch64-alpine-linux-musl/lib/libdatadog_profiling.so 8.63 MB 8.70 MB +.72% (+64.04 KB) 🔍
aarch64-unknown-linux-gnu
Artifact Baseline Commit Change
/aarch64-unknown-linux-gnu/lib/libdatadog_profiling.so 11.21 MB 11.29 MB +.67% (+77.40 KB) 🔍
/aarch64-unknown-linux-gnu/lib/libdatadog_profiling.a 116.94 MB 117.52 MB +.49% (+590.82 KB) 🔍
libdatadog-x64-windows
Artifact Baseline Commit Change
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.dll 27.16 MB 27.43 MB +.96% (+269.00 KB) 🔍
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.lib 76.26 KB 76.26 KB 0% (0 B) 👌
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.pdb 186.01 MB 187.42 MB +.75% (+1.40 MB) 🔍
/libdatadog-x64-windows/debug/static/datadog_profiling_ffi.lib 917.15 MB 920.55 MB +.37% (+3.40 MB) 🔍
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.dll 9.93 MB 9.99 MB +.55% (+56.50 KB) 🔍
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.lib 76.26 KB 76.26 KB 0% (0 B) 👌
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.pdb 24.77 MB 24.91 MB +.56% (+144.00 KB) 🔍
/libdatadog-x64-windows/release/static/datadog_profiling_ffi.lib 51.43 MB 51.70 MB +.50% (+268.06 KB) 🔍
libdatadog-x86-windows
Artifact Baseline Commit Change
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.dll 22.97 MB 23.19 MB +.98% (+231.00 KB) 🔍
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.lib 77.44 KB 77.44 KB 0% (0 B) 👌
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.pdb 190.26 MB 191.62 MB +.71% (+1.35 MB) 🔍
/libdatadog-x86-windows/debug/static/datadog_profiling_ffi.lib 900.80 MB 903.96 MB +.35% (+3.15 MB) 🔍
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.dll 7.53 MB 7.57 MB +.57% (+44.00 KB) 🔍
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.lib 77.44 KB 77.44 KB 0% (0 B) 👌
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.pdb 26.51 MB 26.66 MB +.53% (+144.00 KB) 🔍
/libdatadog-x86-windows/release/static/datadog_profiling_ffi.lib 47.05 MB 47.29 MB +.50% (+244.20 KB) 🔍
x86_64-alpine-linux-musl
Artifact Baseline Commit Change
/x86_64-alpine-linux-musl/lib/libdatadog_profiling.a 87.50 MB 87.94 MB +.50% (+455.52 KB) 🔍
/x86_64-alpine-linux-musl/lib/libdatadog_profiling.so 10.21 MB 10.26 MB +.49% (+52.04 KB) 🔍
x86_64-unknown-linux-gnu
Artifact Baseline Commit Change
/x86_64-unknown-linux-gnu/lib/libdatadog_profiling.a 109.81 MB 110.29 MB +.43% (+488.93 KB) 🔍
/x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so 11.95 MB 12.01 MB +.47% (+58.67 KB) 🔍

@github-actions

github-actions bot commented Mar 11, 2026

Clippy Allow Annotation Report

Comparing clippy allow annotations between branches:

  • Base Branch: origin/main
  • PR Branch: origin/jwiriath/regex-crate-size-reduction

Summary by Rule

Rule Base Branch PR Branch Change
unwrap_used 9 9 No change (0%)
Total 9 9 No change (0%)

Annotation Counts by File

File Base Branch PR Branch Change
datadog-live-debugger/src/redacted_names.rs 2 2 No change (0%)
libdd-common/src/azure_app_services.rs 1 1 No change (0%)
libdd-common/src/entity_id/unix/container_id.rs 3 3 No change (0%)
libdd-trace-obfuscation/src/ip_address.rs 3 3 No change (0%)

Annotation Stats by Crate

Crate Base Branch PR Branch Change
clippy-annotation-reporter 5 5 No change (0%)
datadog-ffe-ffi 1 1 No change (0%)
datadog-ipc 28 28 No change (0%)
datadog-live-debugger 6 6 No change (0%)
datadog-live-debugger-ffi 10 10 No change (0%)
datadog-profiling-replayer 4 4 No change (0%)
datadog-remote-config 3 3 No change (0%)
datadog-sidecar 59 59 No change (0%)
libdd-common 10 10 No change (0%)
libdd-common-ffi 12 12 No change (0%)
libdd-data-pipeline 5 5 No change (0%)
libdd-ddsketch 2 2 No change (0%)
libdd-dogstatsd-client 1 1 No change (0%)
libdd-profiling 13 13 No change (0%)
libdd-telemetry 19 19 No change (0%)
libdd-tinybytes 4 4 No change (0%)
libdd-trace-normalization 2 2 No change (0%)
libdd-trace-obfuscation 9 9 No change (0%)
libdd-trace-utils 15 15 No change (0%)
Total 208 208 No change (0%)

About This Report

This report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality.

@github-actions

github-actions bot commented Mar 11, 2026

📚 Documentation Check Results

⚠️ 690 documentation warning(s) found

📦 libdd-common - 168 warning(s)

📦 libdd-trace-obfuscation - 522 warning(s)


Updated: 2026-03-12 17:36:59 UTC | Commit: 821f815 | missing-docs job results

@github-actions

github-actions bot commented Mar 11, 2026

🔒 Cargo Deny Results

No issues found!

📦 libdd-common - ✅ No issues

📦 libdd-trace-obfuscation - ✅ No issues


Updated: 2026-03-12 17:40:38 UTC | Commit: 821f815 | dependency-check job results

Contributor

@VianneyRuhlmann VianneyRuhlmann left a comment

LGTM. Any idea why the benchmarks are not showing any size improvement? Is it because of the remaining "option 3" cases?

Contributor

@yannham yannham left a comment

Is the goal of the hand-made parsers (1.), as opposed to using regex-lite, to get rid of any regex dependency entirely in some crates, is it performance, or both? While the parsers are reasonably simple, they do have a small cost in terms of readability and maintenance IMHO (manual indexing is quite prone to off-by-one errors). Just wondering about the overall trade-off.

}

let candidate = &s[s.len() - 36..];
const TEMPLATE: &[u8; 36] = b"hhhhhhhh-hhhh-hhhh-hhhh-hhhhhhhhhhhh";
Contributor

Do you expect this template to change in the future? If yes, then maybe we should consider the regex for easier updates/edits. If no, maybe we should go to the end of the hand-rolled parser approach and directly check the different slices/chars with try_match_hex64 and co.?

Contributor Author

I don't think it could change, no.

we should go to the end of the hand-rolled parser approach and directly check the different slices/chars with try_match_hex64 and co.?

Isn't that what we're doing? I don't think I understand what you mean by that.

Contributor

@yannham yannham Mar 12, 2026

Sorry, I wasn't being very clear. I felt like keeping a template string that is "interpreted" at runtime is a bit of an in-between solution, since the structure is known statically. So if you sacrifice the flexibility and readability of regexes, you could alternatively write something like:

is_hex(&candidate[0..8])?;
is_hex(&candidate[9..13])?;
is_hex(&candidate[14..18])?;
is_hex(&candidate[19..23])?;
is_hex(&candidate[24..36])?;

// Separators sit between the hex groups, at byte offsets 8, 13, 18 and 23.
if [8, 13, 18, 23].iter().any(|&i| !matches!(candidate[i], b'-' | b'_')) {
    return None;
}

Though it is a bit verbose as well, so I'm not sure it's actually better (especially if performance wasn't in fact your original motivation). I'll just leave this here, feel free to resolve either way.

Contributor Author

I actually had something like that originally, but felt it was really not readable and error-prone. AI found this trick, which gives something immediately readable and less error-prone: you see the pattern right away, as with a regex, but without the associated complexity.
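
For readers following the thread, the template idea discussed here can be sketched roughly as follows. This is not the code from the PR: the function name `match_uuid_suffix` is hypothetical, the restriction to lowercase hex digits is an assumption, and the choice to accept either `-` or `_` as a separator is taken from the reviewer's snippet above. Only the `TEMPLATE` constant and the "last 36 bytes" candidate come from the quoted diff.

```rust
/// `h` stands for one hex digit, `-` for one separator.
const TEMPLATE: &[u8; 36] = b"hhhhhhhh-hhhh-hhhh-hhhh-hhhhhhhhhhhh";

/// Hypothetical helper: returns the trailing 36 bytes of `s` if they fit `TEMPLATE`.
fn match_uuid_suffix(s: &str) -> Option<&str> {
    // Too short to contain a 36-byte id at all.
    let start = s.len().checked_sub(TEMPLATE.len())?;
    // `get` yields None (instead of panicking) if `start` is not a char boundary.
    let candidate = s.get(start..)?;
    // Walk the candidate and the template in lockstep, one byte at a time.
    let fits = candidate.bytes().zip(TEMPLATE.iter()).all(|(c, &t)| match t {
        // Assumption: lowercase hex only.
        b'h' => matches!(c, b'0'..=b'9' | b'a'..=b'f'),
        // Assumption: both `-` and `_` count as separators.
        b'-' => matches!(c, b'-' | b'_'),
        _ => false,
    });
    fits.then_some(candidate)
}
```

With this sketch, `match_uuid_suffix("docker-34dc0b5e-626f-2c5c-4c51-70e34b10e765")` yields the 36-character id, while an input with a non-hex character or a misplaced separator yields `None`. The positions of the separators are read off the template rather than hard-coded as indices, which is the readability argument made in the comment.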

@Aaalibaba42
Contributor Author

Is the goal of the hand-made parsers (1.), as opposed to using regex-lite, to get rid of any regex dependency entirely in some crates, is it performance, or both? While the parsers are reasonably simple, they do have a small cost in terms of readability and maintenance IMHO (manual indexing is quite prone to off-by-one errors). Just wondering about the overall trade-off.

It's mostly that I find pulling in the whole regex machinery for trivial parsing to be a code smell. I did not measure the performance impact. I think it would be a net positive for the majority of these cases, but that was not really the point.

I don't think those places are performance sensitive, and arguably what's done in container_id.rs is more than trivial parsing, so if it is thought that using regex-lite for this is preferable I won't fight it.

if resource_group.is_empty() {
return None;
}
Some(resource_group.to_string())
Contributor

By the way, the behavior of this parser is different from the original regexp. For example, foo+bar-baz-webspace-Linux is accepted by the original regexp but isn't accepted by this function (which basically doesn't allow a - just before webspace). On the other hand, it doesn't check anything after webspace, so for example foo+bar-bazwebspace-MacOS is accepted, while it is not by the original regexp. Not sure this is important/intentional, or if the original regexp was wrong, but just in case 👀

Contributor Author

Yep, since this was mostly about capturing the correct thing, I omitted that. I don't know if it's relevant; I should check with the consumers of this.
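
To make the two divergences raised in this thread concrete, here is a minimal std-only sketch of a parser that validates the tail of the string before capturing anything. It is neither the PR's code nor a faithful translation of the original regex, which is not shown here. The function name `extract_resource_group` and the assumed input shape `<subscription>+<resource group>-<region>webspace`, optionally followed by `-Linux`, are assumptions made for illustration only.

```rust
/// Hypothetical helper. Assumed shape:
/// `<subscription>+<resource group>-<region>webspace[-Linux]`
fn extract_resource_group(s: &str) -> Option<&str> {
    // Validate the tail first: an optional "-Linux", then the literal "webspace".
    // Any other suffix (for example "-MacOS") is rejected here.
    let head = s.strip_suffix("-Linux").unwrap_or(s);
    let head = head.strip_suffix("webspace")?;

    // Split off the subscription at the last '+'.
    let (subscription, rest) = head.rsplit_once('+')?;
    if subscription.is_empty() {
        return None;
    }

    // The region must be non-empty, so ignore the last character when
    // searching for the '-' that ends the resource group. This allows a
    // '-' to appear immediately before "webspace".
    let (last_idx, _) = rest.char_indices().next_back()?;
    let dash = rest[..last_idx].rfind('-')?;
    let group = &rest[..dash];
    if group.is_empty() {
        return None;
    }
    Some(group)
}
```

Under these assumptions, `foo+bar-baz-webspace-Linux` is accepted (capturing `bar`) and `foo+bar-bazwebspace-MacOS` is rejected, matching the behavior the reviewer attributes to the original regexp for those two inputs. Unlike a regex engine, this sketch never backtracks to an earlier `+`, so it should not be read as equivalent in general.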

@yannham
Contributor

yannham commented Mar 12, 2026

I don't think those places are performance sensitive, and arguably what's done in container_id.rs is more than trivial parsing, so if it is thought that using regex-lite for this is preferable I won't fight it.

Once again, the parsers aren't too complex, so this is somewhat of a grey area. This PR adds a bit more code than it removes, the parsing code is more involved to understand, and it doesn't exactly preserve behavior (which isn't entirely trivial with manual rfind, indexing, etc.). I think it could be entirely justified for performance or binary size reduction reasons. Here I just find the motivation a bit lacking, "code smell" not being a very objective or precise one. All of that being said, this is in no way critical, and just, like, my opinion. Feel free to make what you want of it and don't consider it blocking 🙂

@paullegranddc
Contributor

paullegranddc commented Mar 12, 2026

The code for parsing container id is run only once per process anyway so I don't think using regex-lite would have any impact.
And when capture groups are involved, using a regex looks somewhat cleaner than manual calls to trim/strip/match.

On the manual parsing code, one recommendation I would have is to structure the code more like parser combinators, or maybe to straight up use nom.
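
As a rough illustration of what "structured like parser combinators" could mean without pulling in nom, the sketch below uses small functions that each consume a prefix of the input and return the remainder, chained with `?`. All names (`hex`, `sep`, `uuid_prefix`) are hypothetical, and the lowercase-hex and `-`/`_` separator rules are assumptions carried over from the snippets earlier in the thread, not confirmed behavior of the PR.

```rust
/// Consumes exactly `n` lowercase hex digits and returns the rest of the input.
fn hex(input: &str, n: usize) -> Option<&str> {
    // `get` yields None if the input is too short or `n` is not a char boundary.
    let head = input.get(..n)?;
    let tail = &input[n..];
    if head.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f')) {
        Some(tail)
    } else {
        None
    }
}

/// Consumes one separator (`-` or `_`) and returns the rest of the input.
fn sep(input: &str) -> Option<&str> {
    input.strip_prefix(|c: char| c == '-' || c == '_')
}

/// Parses an 8-4-4-4-12 hex id at the start of `input`; returns what follows it.
fn uuid_prefix(input: &str) -> Option<&str> {
    let rest = hex(input, 8)?;
    let rest = hex(sep(rest)?, 4)?;
    let rest = hex(sep(rest)?, 4)?;
    let rest = hex(sep(rest)?, 4)?;
    hex(sep(rest)?, 12)
}
```

Each step only knows about its own piece of the grammar, so no absolute byte offsets appear anywhere, which avoids the off-by-one risk mentioned earlier in the review. For example, `uuid_prefix("34dc0b5e_626f_2c5c_4c51_70e34b10e765.scope")` returns `Some(".scope")`.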

@Aaalibaba42
Contributor Author

I'll favor regex-lite then 👍🏻

@Aaalibaba42
Contributor Author

The hand-rolled parsers were removed. The PR now only changes regex to regex-lite where the pattern is not user-provided.


5 participants