Normalize

A library for normalizing Unicode text. It implements all four Unicode Normalization Forms (NFC, NFD, NFKC and NFKD). Normalization is buffered and takes O(n) time and O(1) space.

Note: the iterator version takes O(1) space, but the proc takes O(n) space.

Install

nimble install normalize

Compatibility

Nim 1.0.0 or later

Usage

import normalize

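# Normalization
# "E" followed by the combining grave accent (U+0300) composes to "È" (U+00C8)
assert toNfc("E\u0300") == "È"
assert toNfc("\u0045\u0300") == "\u00C8"
# "È" decomposes back into the same two code points
assert toNfd("È") == "E\u0300"
assert toNfd("\u00C8") == "\u0045\u0300"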

# toNfkc and toNfkd are also available

# Canonical comparison
assert cmpNfd(
  "Voulez-vous un caf\u00E9?",
  "Voulez-vous un caf\u0065\u0301?")

# Normalization check (not always reliable, see docs)
assert isNfd(toNfd("\u1E0A"))

# isNfc, isNfkc and isNfkd are also available

Note: terminal output can be visually misleading, because the composed and decomposed forms of a character often render identically. To see the difference, print the string's len or its runes instead.
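One way to inspect a result without relying on how the terminal renders it is sketched below. It uses `runeLen` and `runes` from the standard `unicode` module and `toHex` from `strutils`; the `codePoints` helper is hypothetical and not part of this library.

```nim
import std/[unicode, strutils]
import normalize

proc codePoints(s: string): seq[string] =
  ## Hypothetical helper: lists the code points of `s` as "U+XXXX" strings.
  for r in runes(s):
    result.add("U+" & toHex(int32(r), 4))

let composed = toNfc("\u0045\u0300")
let decomposed = toNfd("\u00C8")

# Both strings look like "È" in a terminal, but their sizes differ.
echo composed.len, " bytes, ", composed.runeLen, " rune(s)"      # 2 bytes, 1 rune(s)
echo decomposed.len, " bytes, ", decomposed.runeLen, " rune(s)"  # 3 bytes, 2 rune(s)

# Printing the code points makes the difference explicit.
echo codePoints(composed)
echo codePoints(decomposed)
```

The byte lengths follow from UTF-8: U+00C8 and U+0300 each take two bytes, while U+0045 takes one.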

docs

Optimizations

The best optimization is to avoid normalizing when the text is already normalized. The isNf family of procs can be used for this purpose.

import normalize

template fastNfc(s: var string) =
  if not isNfc(s):
    s = toNfc(s)

Beware that the isNf procs may return false even for text that has already been normalized. The internal check has three possible outcomes: "Yes", "No" and "Maybe". For certain texts the outcome is always "Maybe", so the check never confirms them as normalized and the normalization runs anyway.
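As a small usage sketch of the fastNfc template above: whatever the check reports, the final string is in NFC, so an inconclusive check costs time but never correctness. The sample text is reused from the Usage section.

```nim
import normalize

template fastNfc(s: var string) =
  # Normalize only when the quick check does not confirm NFC.
  if not isNfc(s):
    s = toNfc(s)

var text = "Voulez-vous un caf\u0065\u0301?"  # "e" followed by a combining acute accent
fastNfc(text)
assert text == "Voulez-vous un caf\u00E9?"    # now the precomposed "é"

# A second call leaves the string unchanged, whether or not the check short-circuits.
fastNfc(text)
assert text == "Voulez-vous un caf\u00E9?"
```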

Tests

nimble test

LICENSE

MIT
