Skip to content

fix(tornado): match http routes in correct order and missing routes certain regex patterns#17515

Open
florentinl wants to merge 5 commits intomainfrom
florentin.labelle/APPSEC-62190/tornado-routes-for-streamlit
Open

fix(tornado): match http routes in correct order and missing routes certain regex patterns#17515
florentinl wants to merge 5 commits intomainfrom
florentin.labelle/APPSEC-62190/tornado-routes-for-streamlit

Conversation

@florentinl
Copy link
Copy Markdown
Contributor

@florentinl florentinl commented Apr 14, 2026

Description

Two fixes to _find_route, the helper that determines the http.route tag for Tornado requests.

1. Route matching order in nested applications

_find_route pushes sub-rules of a nested Application onto the front of the processing deque with extendleft. Because extendleft reverses the input sequence as it inserts, the routes were processed in reverse declaration order — a catch-all defined after a specific route would incorrectly win. Fix: extendleft(reversed(rule.target.rules)).

2. http.route missing for complex regex patterns

Tornado's PathMatches._path is None whenever the route regex cannot be reversed (e.g. patterns containing non-capturing groups (?:...), lookaheads, or lookbehinds). Previously http.route was absent for those routes. This adds _regex_to_route, a lru_cache-backed helper that converts any regex pattern to a route string: capturing groups become %s (consistent with Tornado's own format), non-capturing constructs are kept verbatim.

Both issues were surfaced by Streamlit, which mounts a nested Tornado application with non-trivial route patterns.

Testing

  • test_nested_application_route_order: nested app with a specific route before a catch-all; asserts the specific route wins. Fails without fix 1.
  • test_complex_pattern_route: covers a pure non-capturing group route and a mixed non-capturing + capturing group route; asserts http.route is populated and capturing groups are rendered as %s. Fails without fix 2.
  • test_regex_to_route: parametrised unit tests for _regex_to_route in isolation, covering anchors, positional/named capturing groups, non-capturing groups, lookaheads, nested groups, and escaped parens.

Risks

None. Both changes only affect route tag computation; request dispatch is handled by Tornado independently.

  • This changes only adds http.route values for cases that were previously not supported.

Additional Notes

N/A

@cit-pr-commenter-54b7da
Copy link
Copy Markdown

cit-pr-commenter-54b7da bot commented Apr 14, 2026

Codeowners resolved as

ddtrace/contrib/internal/tornado/handlers.py                            @DataDog/apm-core-python @DataDog/apm-idm-python
tests/contrib/tornado/test_regex_to_route.py                            @DataDog/apm-core-python @DataDog/apm-idm-python

…mplex patterns

In nested Tornado applications, _find_route used deque.extendleft without
reversing the sub-rules first, causing them to be processed in reverse
declaration order. A catch-all route declared after a specific one would
incorrectly win.

Also introduces _regex_to_route, a cached helper that populates http.route
even for patterns Tornado cannot reverse (e.g. those containing non-capturing
groups). Capturing groups are normalised to %s to stay consistent with
Tornado's own PathMatches._path format.
@florentinl florentinl changed the title fix(tornado): match http routes in correct order fix(tornado): match http routes in correct order and missing routes certain regex patterns Apr 14, 2026
CodeQL flagged RouteMixedPatternHandler as a reflected XSS sink because
it wrote the URL-captured item_id directly to the response. The handler
only exists to verify route matching; use a static response instead.
@florentinl florentinl marked this pull request as ready for review April 14, 2026 14:37
@florentinl florentinl requested review from a team as code owners April 14, 2026 14:37
Copy link
Copy Markdown
Contributor

@KowalskiThomas KowalskiThomas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Release notes make sense to me.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f962375805

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…ute parsing

_regex_to_route now skips over [...] character classes verbatim so that
'(' inside a class is not mistaken for a capturing group boundary.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants