Merged
10 changes: 10 additions & 0 deletions CHANGES.md
@@ -1,5 +1,15 @@
# Python Liquid Change Log

## Version 2.1.0 (unreleased)

**Features**

- Added the `escapejs` filter for escaping characters for use in JavaScript string literals. Whereas the standard `escape` filter replaces `&`, `<`, `>`, `'` and `"` with their equivalent HTML escape sequence, `escapejs` replaces control characters and potentially dangerous symbols with their corresponding Unicode escape sequences.

**Docs**

- Improved documentation for HTML auto escaping and the `escape` filter.

## Version 2.0.2

- Fixed static analysis of filters in ternary expressions. See [#180](https://github.com/jg-rp/liquid/issues/180).
29 changes: 27 additions & 2 deletions docs/environment.md
@@ -119,9 +119,25 @@ env = Environment(

## HTML auto escape

When `autoescape` is `True`, [render context variables](render_context.md) will be automatically escaped to produce HTML-safe strings on output.
When `autoescape=True`, all rendered context variables are automatically escaped to produce HTML-safe output. This protects against injection attacks by escaping characters that could be interpreted as HTML when inserted into a page.

You can be explicitly mark strings as _safe_ by wrapping them in `Markup()` and [drops](variables_and_drops.md) can implement the [special `__html__()` method](variables_and_drops.md#__html__).
Autoescaping replaces the following characters with their HTML-safe equivalents:

- `&` -> `&amp;`
- `<` -> `&lt;`
- `>` -> `&gt;`
- `'` -> `&#39;`
- `"` -> `&#34;`

This escaping is equivalent to applying the [`escape` filter](./filter_reference.md#escape) to every variable, unless the variable is explicitly marked as safe.
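
The replacement table above can be sketched in a few lines of plain Python. This is only an illustration of the mapping, not the library's implementation: `escape_html` and `_HTML_ESCAPES` are hypothetical names, and the real escaping is delegated to `markupsafe`.

```python
# Hypothetical helper illustrating the replacement table; not part of Python Liquid.
_HTML_ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    "'": "&#39;",
    '"': "&#34;",
}


def escape_html(text: str) -> str:
    """Replace each HTML-special character with its escape sequence."""
    # Mapping character by character avoids re-escaping the `&` that
    # the replacement sequences themselves contain.
    return "".join(_HTML_ESCAPES.get(ch, ch) for ch in text)


print(escape_html("<a href='x'>Tom & Jerry</a>"))
# &lt;a href=&#39;x&#39;&gt;Tom &amp; Jerry&lt;/a&gt;
```

Working one character at a time, rather than chaining `str.replace()` calls, means the order of the table entries does not matter.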

!!! warning

    Auto escape and the [`escape`](./filter_reference.md#escape) filter do **not** make strings safe for use in JavaScript, including in `<script>` blocks, inline event handler attributes (e.g. `onerror`), or other JavaScript contexts. For those cases, see the [`escapejs`](./filter_reference.md#escapejs) filter instead.

### Safe strings

You can explicitly mark a string as _safe_ by wrapping it in `Markup()`, and [drops](variables_and_drops.md) can implement the [special `__html__()` method](variables_and_drops.md#__html__).

```python
from markupsafe import Markup
@@ -132,6 +148,15 @@ template = env.from_string("<p>Hello, {{ you }}</p>")
print(template.render(you=Markup("<em>World!</em>")))
```

There's also the [`safe`](./filter_reference.md#safe) filter which allows template authors to mark a string as safe to use in HTML. If that sounds like a bad idea, you can remove the `safe` filter from your Liquid environment.

```python
from liquid import Environment

env = Environment()
del env.filters["safe"]  # `safe` is registered as a filter, not a tag
```

## Resource limits

For deployments where template authors are untrusted, you can set limits on some resources to avoid malicious templates from consuming too much memory or too many CPU cycles. Limits are set by subclassing [`Environment`](api/environment.md) and setting some class attributes.
59 changes: 57 additions & 2 deletions docs/filter_reference.md
@@ -510,7 +510,21 @@ If the input is undefined, an empty string is returned.
<string> | escape
```

Return the input string with characters `&`, `<` and `>` converted to HTML-safe sequences.
Escape special characters in a string for safe use in HTML.

This filter replaces the characters `&`, `<`, `>`, `'`, and `"` with their corresponding HTML-safe sequences:

- `&` -> `&amp;`
- `<` -> `&lt;`
- `>` -> `&gt;`
- `'` -> `&#39;`
- `"` -> `&#34;`

This helps prevent HTML injection when rendering untrusted content in HTML element bodies or attributes.

!!! warning

    This filter does **not** make strings safe for use in JavaScript, including in `<script>` blocks, inline event handler attributes (e.g. `onerror`), or other JavaScript contexts. For those cases, use the [`escapejs`](#escapejs) filter instead.

```liquid2
{{ "Have you read 'James & the Giant Peach'?" | escape }}
@@ -520,13 +534,54 @@ Return the input string with characters `&`, `<` and `>` converted to HTML-safe
Have you read &#39;James &amp; the Giant Peach&#39;?
```
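
One detail worth checking when autoescape is off: the implementation in this change calls `html.escape()` in that case, and, assuming that is the standard library's `html` module, the two quote characters are spelled differently from the list above. The snippet below uses only the standard library and makes no claims about Python Liquid's own output.

```python
import html

# Standard-library escaping: quotes come out as a hex reference and a named
# entity rather than the decimal references `&#39;` and `&#34;`.
print(html.escape("'"))  # &#x27;
print(html.escape('"'))  # &quot;
print(html.escape("Have you read 'James & the Giant Peach'?"))
# Have you read &#x27;James &amp; the Giant Peach&#x27;?
```

Browsers decode `&#x27;` and `&#39;` to the same character, so the difference only matters when comparing rendered output as text, for example in tests.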

## escapejs

**_New in version 2.1.0_**

```
<string> | escapejs
```

Escape characters for safe use in JavaScript string literals.

This filter escapes a string for embedding inside **JavaScript string literals**, using either single or double quotes (e.g. `'...'` or `"..."`). It replaces control characters and potentially dangerous symbols with their corresponding Unicode escape sequences.

Escaped characters include:

- ASCII control characters (U+0000 to U+001F)
- Characters like quotes, angle brackets, ampersands, equals signs
- Line/paragraph separators (U+2028, U+2029)

!!! warning

    This filter does **not** make strings safe for use in JavaScript template literals (backtick strings), or in raw JavaScript expressions. Use it only when placing data inside quoted JS strings within inline `<script>` blocks or event handlers.

**Recommended alternatives:**

- Pass data using HTML `data-*` attributes and read them in JS via `element.dataset`.
- For structured data, prefer a JSON-serialization approach using the [JSON filter](./optional_filters.md#json).

```liquid2
{% assign some_string = "<script>alert('x')</script>" %}
<img src="" onerror="{{ some_string | escapejs }}" />
```

```plain title="output"
<img src="" onerror="\u003Cscript\u003Ealert(\u0027x\u0027)\u003C/script\u003E" />
```
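
To try the substitution outside a template, the lookup table added in `liquid/builtin/filters/extra.py` can be rebuilt with the standard library alone. The sketch below copies that table; `escape_js`, `ESCAPE_MAP` and `ESCAPE_RE` are hypothetical standalone names, and the autoescape `Markup` wrapping done by the real filter is left out.

```python
import re

# Same table as `_ESCAPE_MAP` in this change, rebuilt here so the snippet
# runs without Python Liquid installed.
ESCAPE_MAP = {
    "\\": "\\u005C",
    "'": "\\u0027",
    '"': "\\u0022",
    ">": "\\u003E",
    "<": "\\u003C",
    "&": "\\u0026",
    "=": "\\u003D",
    "-": "\\u002D",
    ";": "\\u003B",
    "`": "\\u0060",
    "\u2028": "\\u2028",
    "\u2029": "\\u2029",
}
# ASCII control characters U+0000 to U+001F.
ESCAPE_MAP.update({chr(c): f"\\u{c:04X}" for c in range(32)})
ESCAPE_RE = re.compile("[" + re.escape("".join(ESCAPE_MAP)) + "]")


def escape_js(text: str) -> str:
    """Hypothetical standalone equivalent of the `escapejs` filter body."""
    return ESCAPE_RE.sub(lambda m: ESCAPE_MAP[m.group()], text)


print(escape_js("<script>alert('x')</script>"))
# \u003Cscript\u003Ealert(\u0027x\u0027)\u003C/script\u003E
print(escape_js("a=b; c-d"))
# a\u003Db\u003B c\u002Dd
```

A single compiled character class keeps the replacement to one pass over the input, so an escaped backslash is never escaped a second time.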

## escape_once

```
<string> | escape_once
```

Return the input string with characters `&`, `<` and `>` converted to HTML-safe sequences while preserving existing HTML escape sequences.
Escape a string for HTML, but avoid double-escaping existing entities.

Converts characters like `&`, `<`, and `>` to their HTML-safe sequences, but leaves existing HTML entities untouched (e.g., `&amp;` stays `&amp;`).

This is useful when escaping content that may already be partially escaped.

See the [`escape`](#escape) filter for details and limitations.
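
One common way to get this behaviour is to decode any existing entities first and then escape the result exactly once. The sketch below shows that idea with the standard library; `escape_once` here is a hypothetical standalone function, and it is an assumption that the filter follows the same unescape-then-escape approach.

```python
import html


def escape_once(text: str) -> str:
    """Hypothetical sketch: decode existing entities, then escape exactly once."""
    return html.escape(html.unescape(text))


once = escape_once("James &amp; the <b>Giant</b> Peach")
print(once)
# James &amp; the &lt;b&gt;Giant&lt;/b&gt; Peach

# Applying it again changes nothing, which is the point of the filter.
print(escape_once(once) == once)
# True
```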

```liquid2
{{ "Have you read 'James &amp; the Giant Peach'?" | escape_once }}
2 changes: 1 addition & 1 deletion liquid/__init__.py
@@ -56,7 +56,7 @@

from . import future

__version__ = "2.0.2"
__version__ = "2.1.0"

__all__ = (
"AwareBoundTemplate",
2 changes: 2 additions & 0 deletions liquid/builtin/__init__.py
@@ -21,6 +21,7 @@
from .filters.array import sum_
from .filters.array import uniq
from .filters.array import where
from .filters.extra import escapejs
from .filters.extra import safe
from .filters.math import abs_
from .filters.math import at_least
@@ -197,3 +198,4 @@ def register(env: Environment) -> None: # noqa: PLR0915
    env.add_filter("date", date)

    env.add_filter("safe", safe)
    env.add_filter("escapejs", escapejs)
62 changes: 62 additions & 0 deletions liquid/builtin/filters/extra.py
@@ -2,6 +2,7 @@

from __future__ import annotations

import re
from typing import TYPE_CHECKING

from liquid import Markup
@@ -19,3 +20,64 @@ def safe(val: str, *, environment: Environment) -> str:
    if environment.autoescape:
        return Markup(val)
    return val


# `escapejs` is inspired by https://github.com/salesforce/secure-filters and Django's
# escapejs filter, https://github.com/django/django/blob/485f483d49144a2ea5401442bc3b937a370b3ca6/django/utils/html.py#L63

_ESCAPE_MAP = {
    "\\": "\\u005C",
    "'": "\\u0027",
    '"': "\\u0022",
    ">": "\\u003E",
    "<": "\\u003C",
    "&": "\\u0026",
    "=": "\\u003D",
    "-": "\\u002D",
    ";": "\\u003B",
    "`": "\\u0060",
    "\u2028": "\\u2028",
    "\u2029": "\\u2029",
}

_ESCAPE_MAP.update({chr(c): f"\\u{c:04X}" for c in range(32)})
_ESCAPE_RE = re.compile("[" + re.escape("".join(_ESCAPE_MAP.keys())) + "]")


@with_environment
@string_filter
def escapejs(val: str, *, environment: Environment) -> str:
    """Escape characters for safe use in JavaScript string literals.

    This filter escapes a string for embedding inside **JavaScript string
    literals**, using either single or double quotes (e.g. `'...'` or `"..."`).
    It replaces control characters and potentially dangerous symbols with
    their corresponding Unicode escape sequences.

    **Important:** This filter does **not** make strings safe for use in
    JavaScript template literals (backtick strings), or in raw JavaScript
    expressions. Use it only when placing data inside quoted JS strings
    within inline `<script>` blocks or event handlers.

    **Recommended alternatives:**
    - Pass data using HTML `data-*` attributes and read them in JS via
      `element.dataset`.
    - For structured data, prefer a JSON-serialization approach using the
      JSON filter.

    Escaped characters include:
    - ASCII control characters (U+0000 to U+001F)
    - Characters like quotes, angle brackets, ampersands, equals signs
    - Line/paragraph separators (U+2028, U+2029)

    Args:
        val: The input string to escape.
        environment: The active Liquid environment.

    Returns:
        A JavaScript-safe string, with problematic characters escaped as Unicode.
    """
    escaped = _ESCAPE_RE.sub(lambda m: _ESCAPE_MAP[m.group()], val)
    if environment.autoescape:
        return Markup(escaped)
    return escaped
40 changes: 36 additions & 4 deletions liquid/builtin/filters/string.py
@@ -56,7 +56,35 @@ def downcase(val: str) -> str:
@with_environment
@string_filter
def escape(val: str, *, environment: Environment) -> str:
    """Return _val_ with the characters &, < and > converted to HTML-safe sequences."""
    """Escape special characters in a string for safe use in HTML.

    This filter replaces the characters `&`, `<`, `>`, `'`, and `"` with their
    corresponding HTML-safe sequences:

    - `&` -> `&amp;`
    - `<` -> `&lt;`
    - `>` -> `&gt;`
    - `'` -> `&#39;`
    - `"` -> `&#34;`

    This helps prevent HTML injection (XSS) when rendering untrusted content in
    HTML element bodies or attributes.

    Important: This filter does **not** make strings safe for use in JavaScript,
    including in `<script>` blocks, inline event handler attributes (e.g. `onerror`),
    or other JavaScript contexts. For those cases, use the `escapejs` filter instead.

    When `autoescape` is enabled in the environment, this filter uses the same
    escaping logic as the environment (via `markupsafe.escape()`). Otherwise, it
    falls back to Python's standard `html.escape()`.

    Args:
        val: The input string to escape.
        environment: The current rendering environment.

    Returns:
        A string with HTML-special characters replaced by safe escape sequences.
    """
    if environment.autoescape:
        return markupsafe_escape(str(val))
    return html.escape(val)
@@ -65,10 +93,14 @@ def escape(val: str, *, environment: Environment) -> str:
@with_environment
@string_filter
def escape_once(val: str, *, environment: Environment) -> str:
    """Return _val_ with the characters &, < and > converted to HTML-safe sequences.
    """Escape a string for HTML, but avoid double-escaping existing entities.

    Converts characters like `&`, `<`, and `>` to their HTML-safe sequences,
    but leaves existing HTML entities untouched (e.g., `&amp;` stays `&amp;`).

    This is useful when escaping content that may already be partially escaped.

    It is safe to use `escape_one` on string values that already contain HTML escape
    sequences.
    See the `escape` filter for details and limitations.
    """
    if environment.autoescape:
        return Markup(val).unescape()
80 changes: 80 additions & 0 deletions tests/filters/test_escapejs.py
@@ -0,0 +1,80 @@
import operator
from dataclasses import dataclass
from dataclasses import field
from functools import partial
from inspect import isclass
from typing import Any

import pytest

from liquid import Environment
from liquid.builtin.filters.extra import escapejs
from liquid.exceptions import FilterArgumentError
from liquid.exceptions import LiquidError


@dataclass
class Case:
    description: str
    val: Any
    expect: Any
    args: list[Any] = field(default_factory=list)
    kwargs: dict[str, Any] = field(default_factory=dict)


ENV = Environment()

TEST_CASES = [
    Case(
        description="escape <script> tag",
        val="<script>alert('x')</script>",
        args=[],
        kwargs={},
        expect="\\u003Cscript\\u003Ealert(\\u0027x\\u0027)\\u003C/script\\u003E",
    ),
    Case(
        description="escape quotes and backslash",
        val='"foo\\bar"',
        args=[],
        kwargs={},
        expect="\\u0022foo\\u005Cbar\\u0022",
    ),
    Case(
        description="escape control characters",
        val="foo\x00bar\x1fbaz",
        args=[],
        kwargs={},
        expect="foo\\u0000bar\\u001Fbaz",
    ),
    Case(
        description="not a string",
        val=123,
        args=[],
        kwargs={},
        expect="123",
    ),
    Case(
        description="unexpected argument",
        val="test",
        args=[1],
        kwargs={},
        expect=FilterArgumentError,
    ),
    Case(
        description="undefined left value",
        val=ENV.undefined("test"),
        args=[],
        kwargs={},
        expect="",
    ),
]


@pytest.mark.parametrize("case", TEST_CASES, ids=operator.attrgetter("description"))
def test_escapejs_filter(case: Case) -> None:
    _escapejs = partial(escapejs, environment=ENV)
    if isclass(case.expect) and issubclass(case.expect, LiquidError):
        with pytest.raises(case.expect):
            _escapejs(case.val, *case.args, **case.kwargs)
    else:
        assert _escapejs(case.val, *case.args, **case.kwargs) == case.expect