JustHTML is vulnerable to XSS via code fence breakout in <pre> content

Summary

to_markdown() is vulnerable when serializing attacker-controlled <pre> content. The <pre> handler emits a fixed three-backtick fenced code block, but writes decoded text content into that fence without choosing a delimiter longer than any backtick run inside the content.

An attacker can place backticks and HTML-like text inside a sanitized <pre> element so that the generated Markdown closes the fence early and leaves raw HTML outside the code block. When that Markdown is rendered by a CommonMark/GFM-style renderer that allows raw HTML, the injected markup is passed through to the page, and any script it carries (for example an onerror handler) can run.

This is a bypass of the v1.12.0 Markdown hardening. That fix escaped HTML-significant characters for regular text nodes, but <pre> uses a separate serialization path and does not apply the same protection.

Details

The vulnerable <pre> Markdown path:

extracts decoded text from the <pre> subtree
opens a fenced block with a fixed delimiter of ```
writes the decoded text directly into the output
closes with another fixed ```

Because the fence length is fixed, attacker-controlled content containing a line that consists of a backtick run of length 3 or more can terminate the code block early. If the content also contains decoded HTML-like text such as <img ...>, that text appears outside the fence in the resulting Markdown and is treated as raw HTML by downstream Markdown renderers.

The issue is not that HTML-like text appears inside code blocks. The issue is that the serializer allows attacker-controlled <pre> text to break out of the fixed fence.
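
The pattern described above can be modeled in a few lines. This is a minimal sketch, not JustHTML's actual source: the function name pre_to_markdown_fixed_fence is hypothetical, and the input is the already-decoded text of the <pre> element.

```python
def pre_to_markdown_fixed_fence(text: str) -> str:
    """Hypothetical model of the vulnerable pattern (not JustHTML's code)."""
    fence = "```"  # fixed length, chosen without looking at the content
    return f"{fence}\n{text}\n{fence}"


# Decoded text of the <pre> element: a backtick line, then HTML-like text.
decoded = "```\n<img src=x onerror=alert(1)>"
markdown = pre_to_markdown_fixed_fence(decoded)
print(markdown)
```

The second output line comes from the attacker's content, yet it is indistinguishable from a closing fence, so everything after it is no longer code.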

Reproduction

from justhtml import JustHTML

payload = "<pre>&#96;&#96;&#96;\n&lt;img src=x onerror=alert(1)&gt;</pre>"
doc = JustHTML(payload, fragment=True)  # default sanitize=True

print(doc.to_html(pretty=False))
# <pre>```
# &lt;img src=x onerror=alert(1)&gt;</pre>

print(doc.to_markdown())
# ```
# ```
# <img src=x onerror=alert(1)>
# ```

Rendered as CommonMark/GFM-style Markdown, that output is interpreted as:

Line 1 opens a fenced code block
Line 2 closes it
Line 3 is raw HTML outside the fence
Line 4 opens a new fence
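
The line-by-line reading above can be checked with a small fence tracker. This is a simplified sketch of the CommonMark fenced-code rules, written for illustration only: it handles backtick fences with up to three spaces of indentation and ignores tilde fences, block containers, and every other Markdown construct. The name classify_lines is an assumption, not part of any library.

```python
import re

# A run of 3+ backticks at the start of a line (up to 3 spaces of indent),
# followed by the rest of the line (info string or trailing whitespace).
FENCE = re.compile(r"^ {0,3}(`{3,})(.*)$")


def classify_lines(markdown: str) -> list[tuple[str, str]]:
    """Label each line as fence-open, fence-close, inside, or outside."""
    open_len = 0  # length of the currently open fence; 0 means none is open
    labeled = []
    for line in markdown.split("\n"):
        match = FENCE.match(line)
        if open_len == 0:
            # An opening fence may carry an info string without backticks.
            if match and "`" not in match.group(2):
                open_len = len(match.group(1))
                labeled.append(("fence-open", line))
            else:
                labeled.append(("outside", line))
        elif match and len(match.group(1)) >= open_len and not match.group(2).strip():
            # A closing fence is at least as long as the opener, with no info string.
            open_len = 0
            labeled.append(("fence-close", line))
        else:
            labeled.append(("inside", line))
    return labeled


output = "```\n```\n<img src=x onerror=alert(1)>\n```"
for label, line in classify_lines(output):
    print(f"{label:12} {line}")
```

Under these simplified rules the <img> line is labeled outside, which is where a renderer that allows raw HTML would pass it through.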

Impact

Applications that treat JustHTML(..., sanitize=True).to_markdown() output as safe for direct rendering in Markdown contexts may be exposed to XSS, depending on the downstream Markdown renderer's raw-HTML handling.

Root Cause

The <pre> Markdown serializer uses a fixed fence instead of selecting a delimiter longer than the longest backtick run in the content.

Fix

When serializing <pre> content to Markdown, choose a fence length longer than any backtick run present in the code block content, with a minimum length of 3.
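
One way to implement that rule is sketched below. It is an illustration under stated assumptions, not the patch that shipped in JustHTML: choose_fence and pre_to_markdown are hypothetical names, and the input is the decoded text of the <pre> element.

```python
import re


def choose_fence(text: str, minimum: int = 3) -> str:
    """Return a backtick fence longer than any backtick run in text."""
    longest = max((len(run) for run in re.findall(r"`+", text)), default=0)
    return "`" * max(minimum, longest + 1)


def pre_to_markdown(text: str) -> str:
    """Hypothetical hardened serializer for decoded <pre> text."""
    fence = choose_fence(text)
    return f"{fence}\n{text}\n{fence}"


decoded = "```\n<img src=x onerror=alert(1)>"
print(pre_to_markdown(decoded))
```

With the reproduction payload, the longest run in the content is three backticks, so the sketch emits a four-backtick fence and the attacker's backtick line stays inside the code block as ordinary text.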

References

EmilStenstrom published to EmilStenstrom/justhtml Mar 21, 2026

Published to the GitHub Advisory Database Mar 24, 2026

Reviewed Mar 24, 2026

Weaknesses

Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')

Improper Neutralization of Script-Related HTML Tags in a Web Page (Basic XSS)
