Fix bug with unclosed bold and italics in succession #372

Merged
kristian-clausal merged 1 commit into main from extra-apostrophes
Apr 14, 2025

@kristian-clausal
Collaborator

Fixes wiktextract issue #1120 and others

What was broken: if you had text like `aaa '''bolded''`, with a typo that makes the formatters unbalanced, the earlier token would eat the rest of the article.

The issue was fixed by adding a `continue` to `text_fn()` in parser.py, in the `while True` block that loops over the current parser stack in reverse and pops its items to handle end-of-line breakpoints for those items.

The `elif` branch for italics and bold was missing a `continue`, so the `break` after the `if` block would execute, meaning the bold token was never popped and parsing continued with everything that followed becoming its children.
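
To make the control flow easier to follow, here is a rough sketch of the bug in isolation. It is not the actual `text_fn()` code: `Kind`, `OTHER_LINE_SCOPED`, `close_at_newline()` and the `fixed` flag are hypothetical names made up for this illustration, and the stack is a plain list rather than the real parser stack.

```python
from enum import Enum, auto


class Kind(Enum):
    # Hypothetical node kinds, for illustration only.
    ROOT = auto()
    BOLD = auto()
    ITALIC = auto()
    OTHER_LINE_SCOPED = auto()  # stand-in for any other construct closed by a newline


def close_at_newline(stack, fixed):
    """Pop constructs that must end at a newline and return the popped kinds.

    With fixed=False the bold/italic branch falls through to `break`,
    which is the behaviour described above. With fixed=True it continues,
    so the loop goes on to inspect the next item on the stack.
    """
    popped = []
    while True:
        top = stack[-1]
        if top is Kind.OTHER_LINE_SCOPED:
            popped.append(stack.pop())
            continue
        elif top in (Kind.BOLD, Kind.ITALIC):
            popped.append(stack.pop())
            if fixed:
                continue  # the missing statement
        break  # reached for ROOT, or right after one pop when not fixed
    return popped


# Stack as it might look at the end of the line `aaa '''bolded''`:
# the bold was opened first, then the stray '' opened an italic.
buggy_stack = [Kind.ROOT, Kind.BOLD, Kind.ITALIC]
fixed_stack = [Kind.ROOT, Kind.BOLD, Kind.ITALIC]

buggy_popped = close_at_newline(buggy_stack, fixed=False)
fixed_popped = close_at_newline(fixed_stack, fixed=True)

print("without continue:", buggy_popped, "left on stack:", buggy_stack)
print("with continue:   ", fixed_popped, "left on stack:", fixed_stack)
```

In this toy model the version without `continue` leaves `BOLD` on the stack after the newline, so anything parsed afterwards would be attached to it, while the version with `continue` unwinds down to `ROOT`.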

Also, I just spent too damn long trying to figure out why `python -m unittest` was stuck (I couldn't even kill it with SIGINT)... It was because my uncommitted debugging code had a `breakpoint()` call, which of course opened the Python debugger in the background of certain tests... I don't use pdb enough for this not to be a surprise.

@kristian-clausal kristian-clausal merged commit 3185d56 into main Apr 14, 2025
10 checks passed
@kristian-clausal kristian-clausal deleted the extra-apostrophes branch April 14, 2025 10:45
xxyzz added a commit to xxyzz/wikitextprocessor that referenced this pull request Apr 15, 2025
@xxyzz
Collaborator

xxyzz commented Apr 15, 2025

The test process should be killable with SIGKILL or SIGTERM.

xxyzz added a commit that referenced this pull request Apr 15, 2025
Add test for previous pull request #372