MDEV-38904: Fix latin7 collation corruption and my_convert infinite loop #4737

Open
itzanway wants to merge 9 commits into MariaDB:main from itzanway:fix-mdev-38904

@itzanway commented Mar 5, 2026

This PR addresses MDEV-38904, which caused false index corruption reports on latin7 tables and a subsequent server hang during string conversion. The fix has two parts: correcting the collation logic and hardening the string conversion utility.

  1. Collation Fix (strings/ctype-extra.c)
    The root cause was a transitivity violation in the latin7_general_ci collation. In the original sort_order_latin7_general_ci array, the hyphen (-, 0x2D) and the space ( , 0x20) were assigned weights that caused collisions during MyISAM/Aria index compression.

Changes: Adjusted the sort_order_latin7_general_ci weights to ensure that the space character (Index 32) is uniquely weighted as the minimum printable value and that the hyphen (Index 45) has a distinct weight.

Impact: This prevents CHECK TABLE from falsely reporting "Key in wrong position" and prevents the creation of circular B-tree pointers that caused the server to loop.
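
For illustration only, here is a minimal sketch of how a 256-entry weight table drives byte-wise comparison, and why two bytes that share a weight become indistinguishable to the comparison. The names `init_identity_weights` and `compare_by_weight` are hypothetical, the table values are toy values rather than the real `sort_order_latin7_general_ci` contents, and trailing-space padding is deliberately ignored. This is not the code changed in this PR.

```c
#include <assert.h>
#include <stddef.h>

/* Toy table: every byte is its own weight (NOT the real latin7 weights). */
static void init_identity_weights(unsigned char table[256])
{
  for (int i = 0; i < 256; i++)
    table[i] = (unsigned char) i;
}

/*
  Compare two byte strings by looking up each byte's weight.
  Returns -1, 0 or 1. Hypothetical helper; ignores PAD SPACE semantics.
*/
static int compare_by_weight(const unsigned char table[256],
                             const char *a, size_t alen,
                             const char *b, size_t blen)
{
  size_t n = alen < blen ? alen : blen;
  assert(table != NULL);
  for (size_t i = 0; i < n; i++)
  {
    int wa = table[(unsigned char) a[i]];
    int wb = table[(unsigned char) b[i]];
    if (wa != wb)
      return wa < wb ? -1 : 1;
  }
  if (alen == blen)
    return 0;
  return alen < blen ? -1 : 1;
}
```

With the identity table, "a-b" sorts after "a b" because 0x2D is greater than 0x20. If the entry for '-' is overwritten with the weight of ' ', the same two strings compare as equal, which is the kind of collision the description above refers to.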

  2. Failsafe Loop Fix (strings/ctype.c)
    Even in cases of table corruption, the server should not hang at 100% CPU. The previous implementation of my_convert_using_func and my_convert_fix did not explicitly force a pointer advancement when the character set's mb_wc function returned a length of 0 (encountered during malformed/corrupt byte sequence reads).

Changes: Added an explicit from++ advancement when cnvres == 0.

Impact: This ensures the conversion loop always terminates, replacing malformed bytes with a '?' placeholder instead of looping infinitely.
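
The forced-advance pattern described above can be sketched as follows. This is a simplified stand-alone illustration, not the body of my_convert_using_func or my_convert_fix: the `decode_fn` type, the toy `decode_ascii` decoder, and `convert_with_progress` are hypothetical names, and the sketch assumes a decoder that returns 0 for a malformed byte and writes single-byte output only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical decoder type: returns bytes consumed (> 0), or 0 if malformed. */
typedef int (*decode_fn)(unsigned long *wc,
                         const unsigned char *s, const unsigned char *end);

/* Toy decoder that accepts 7-bit ASCII only; anything else is "malformed". */
static int decode_ascii(unsigned long *wc,
                        const unsigned char *s, const unsigned char *end)
{
  if (s >= end || *s > 0x7F)
    return 0;
  *wc = *s;
  return 1;
}

/*
  Copy input to output one character at a time. Every iteration consumes
  at least one input byte, so the loop always terminates.
  Returns the number of bytes written; *errors counts replaced bytes.
*/
static size_t convert_with_progress(decode_fn decode,
                                    const unsigned char *from, size_t from_len,
                                    char *to, size_t to_len, size_t *errors)
{
  const unsigned char *end = from + from_len;
  size_t written = 0;
  assert(decode != NULL && errors != NULL);
  *errors = 0;
  while (from < end && written < to_len)
  {
    unsigned long wc;
    int res = decode(&wc, from, end);
    if (res > 0)
      from += res;          /* normal case: skip the decoded bytes */
    else
    {
      from++;               /* malformed byte: force progress */
      wc = '?';             /* and substitute a placeholder */
      (*errors)++;
    }
    to[written++] = (char) wc;
  }
  return written;
}
```

Feeding the bytes 'a', 0xFF, 'b' through this sketch yields "a?b" with one recorded error, rather than spinning on the 0xFF byte.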

Bug: https://jira.mariadb.org/browse/MDEV-38904?filter=-4

@gkodinov added the "External Contribution" label (All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.) Mar 6, 2026
@gkodinov (Member) left a comment

Thank you for your contribution! This is a preliminary review.

A couple of things before I start reviewing the substance:

  • Please squash all of your commits into a single one.
  • Please have a commit message that complies with CODING_STANDARDS.md.
  • Please set your text editor to not convert spaces to tabs or vice versa.
  • Please do not make whitespace-only changes.
  • Please add test cases.
  • Please make sure all the buildbot hosts compile and run the tests successfully.

After a brief consultation with the future final reviewer, I should add that we can't "adjust weights" on an existing collation, especially one that is used to store data on disk: that is not backwards compatible. This is just a heads-up, to be resolved during the final review.
