niho is a command-line tool for converting romanized Japanese text to Japanese characters.
$ echo _niho ha, Ro-ma ji_ wo nihongo_ni henkan_surutameno Tu-ru desu. | niho -d dics/sile.jsonl
nihoは、ローマ字を日本語に変換するためのツールです。

$ cargo install niho
A command-line tool for converting romanized Japanese text to Japanese characters
Usage: niho [OPTIONS]
Options:
--version Print version
-h, --help Print help ('--help' for full help, '-h' for summary)
-t, --tokenize Output tokenized input as JSON instead of converting to Japanese
-d, --dictionary-file <PATH> Path to dictionary file [env: NIHO_DICTIONARY_FILE]

niho converts romanized Japanese text to Japanese characters using the following syntax:
- Regular text: Converted to hiragana (e.g., konnnichiwa → こんにちは)
- Capitalized text: Converted to katakana (e.g., Ko-hi- → コーヒー)
- Text ending with _: Converted to kanji using dictionary lookup (e.g., nihongo_ → 日本語)
- Text ending with multiple _: Selects a specific kanji from multiple candidates (e.g., ka__ → second kanji option for "ka")
- Text prefixed with _: Kept as raw text until whitespace (e.g., _Hello desu → Helloです)
- Whitespace: A whitespace character immediately following a non-whitespace character is removed; all other whitespace is preserved as-is in the output
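
The syntax rules above can be sketched as a small token classifier. This is an illustration in Python, not niho's actual implementation: the function names `classify` and `collapse_whitespace` are hypothetical, the order in which the rules are checked is an assumption, and only markers at the start or end of a single token are handled.

```python
import re


def classify(token: str) -> tuple[str, str, int]:
    """Return (kind, body, index) for one whitespace-delimited token.

    The precedence of the checks is an assumption, not taken from niho.
    """
    if token.startswith("_"):
        # Leading underscore: keep the rest of the token as raw text.
        return ("raw", token[1:], 0)
    body = token.rstrip("_")
    underscores = len(token) - len(body)
    if underscores > 0:
        # Trailing underscores: kanji lookup, count selects the candidate.
        return ("kanji", body, underscores)
    if token[:1].isupper():
        # Capitalized token: katakana.
        return ("katakana", token, 0)
    return ("hiragana", token, 0)


def collapse_whitespace(text: str) -> str:
    # Drop a whitespace character that directly follows a non-whitespace
    # character; every other whitespace character is kept.
    return re.sub(r"(?<=\S)\s", "", text)


print(classify("nihongo_"))
print(classify("ka__"))
print(collapse_whitespace("a b  c"))
```

In this sketch a single space between words disappears, while the second of two consecutive spaces survives, which is one way to read the whitespace rule.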
# Convert hiragana
$ echo konnnichiwa | niho
こんにちは
# Convert katakana (use uppercase)
$ echo Ko-hi- | niho
コーヒー
# Convert kanji (use underscore suffix)
$ echo nihongo_ | niho
日本語
# Select from multiple kanji options
$ echo ka__ | niho
掛
# Mix different types
$ echo watashi ha Ko-hi- wo nomimasu | niho
わたしはコーヒーをのみます
# Keep raw text
$ echo '_English _to nihongo_' | niho
English to 日本語

The dictionary is stored in JSONL (JSON Lines) format, where each line contains a JSON object representing a character or word mapping. The dictionary contains the following types of entries:
- hiragana: Maps romanized text to hiragana characters
- katakana: Maps romanized text to katakana characters
- kanji: Maps hiragana text to kanji characters (with support for multiple options)
Example entries:
{"type": "hiragana", "from": "ka", "to": "か"}
{"type": "katakana", "from": "ka", "to": "カ"}
{"type": "kanji", "from": "にほんご", "to": ["日本語"]}

The default dictionary can be found at dictionaries/default.jsonl.
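
To make the entry format concrete, here is a minimal Python sketch that reads such entries into one lookup table per type. The function name `load_dictionary` and the table layout are assumptions made for illustration; niho itself may organize its dictionary differently.

```python
import json

# The three example entries from this README, one JSON object per line.
SAMPLE = """\
{"type": "hiragana", "from": "ka", "to": "か"}
{"type": "katakana", "from": "ka", "to": "カ"}
{"type": "kanji", "from": "にほんご", "to": ["日本語"]}
"""


def load_dictionary(lines):
    """Group entries by their "type" field (hypothetical helper)."""
    tables = {"hiragana": {}, "katakana": {}, "kanji": {}}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        entry = json.loads(line)
        tables.setdefault(entry["type"], {})[entry["from"]] = entry["to"]
    return tables


tables = load_dictionary(SAMPLE.splitlines())
print(tables["hiragana"]["ka"])
print(tables["kanji"]["にほんご"])
```

Note that hiragana and katakana entries map to a single string, while kanji entries map to a list of candidates.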
For kanji conversion (text ending with _), the tool performs the following process:
- Romanized → Hiragana: First converts the romanized text to hiragana
- Hiragana → Kanji: Then looks up the hiragana in the kanji dictionary
- Multiple Options: If multiple kanji options exist, use additional _ to select (e.g., ka_ for the first option, ka__ for the second)
- Unknown words: If no mapping is found in the kanji dictionary, the hiragana-converted text is output with trailing underscores matching the selection index (e.g., humei_ → ふめい_)
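
The kanji conversion steps can be sketched in Python as follows. This is an illustration under stated assumptions, not niho's code: the tables are toy data, the first candidate for か is made up (the README only shows that ka__ yields 掛), the greedy longest-match romaji conversion is a stand-in technique, and falling back when the selection index exceeds the number of candidates is a guess.

```python
# Toy tables for illustration only; the real dictionary is much larger.
# The first candidate for "か" is invented for this example.
HIRAGANA = {"ni": "に", "ho": "ほ", "n": "ん", "go": "ご",
            "hu": "ふ", "me": "め", "i": "い", "ka": "か"}
KANJI = {"にほんご": ["日本語"], "か": ["蚊", "掛"]}


def to_hiragana(romaji: str) -> str:
    """Step 1: romanized text to hiragana via greedy longest match."""
    out, i = [], 0
    longest = max(map(len, HIRAGANA))
    while i < len(romaji):
        for size in range(min(longest, len(romaji) - i), 0, -1):
            chunk = romaji[i:i + size]
            if chunk in HIRAGANA:
                out.append(HIRAGANA[chunk])
                i += size
                break
        else:
            out.append(romaji[i])  # pass unknown characters through
            i += 1
    return "".join(out)


def convert_kanji(token: str) -> str:
    body = token.rstrip("_")
    index = len(token) - len(body)      # number of trailing underscores
    kana = to_hiragana(body)
    candidates = KANJI.get(kana, [])    # step 2: hiragana to kanji lookup
    if 1 <= index <= len(candidates):
        return candidates[index - 1]    # step 3: pick the selected option
    # Unknown word (or, by assumption, an out-of-range index):
    # emit the hiragana followed by the original underscores.
    return kana + "_" * index


print(convert_kanji("nihongo_"))
print(convert_kanji("ka__"))
print(convert_kanji("humei_"))
```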