Turkish Syllable Splitter

turkish-syllable is a library for syllabifying Turkish text. The core is written in C and exposed to Python through ctypes bindings. It is fast, produces results that follow Turkish syllabification rules, and can optionally include punctuation in its output.

Important Note: This library splits words of Turkish origin into syllables according to the rules of the Turkish Language Association (TDK), but it does not guarantee correct results for words of foreign origin. Such words are often split correctly, but errors can occur because their structure may not follow Turkish syllable patterns.

Features

Turkish Syllabification: Follows the syllabification rules of the Turkish language (for example, “merhaba” → ['mer', 'ha', 'ba']).
Punctuation Support: Optionally includes punctuation marks and spaces in the syllable list (with_punctuation parameter).
Fast Performance: The C-based algorithm stays fast even on large texts.
Platform Compatibility: The library is platform independent as of version 0.2.0.

Installation

You can install it via PyPI:

pip install turkish-syllable

Sample Usage

Using with Python:

from turkish_syllable import syllabify

# with punctuation
result = syllabify("Merhaba, dünya!") # default value of with_punctuation is True
print(result)
# output: ['Mer', 'ha', 'ba', ',', ' ', 'dün', 'ya', '!']

# without punctuation
result = syllabify("Merhaba, dünya!", with_punctuation=False)
print(result)
# output: ['Mer', 'ha', 'ba', 'dün', 'ya']
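
Because spaces and punctuation come back as separate list items when with_punctuation is True, the flat list can be regrouped into one syllable list per word. The snippet below is a minimal sketch and not part of the library: group_syllables_by_word is a hypothetical helper, and the sample list is copied from the documented output above so that the snippet runs without the package installed.

```python
def group_syllables_by_word(tokens):
    """Group a flat syllable list into one list per word.

    Hypothetical helper (not part of turkish-syllable). It assumes the
    input came from syllabify(..., with_punctuation=True), so that words
    are separated by space or punctuation tokens.
    """
    words, current = [], []
    for token in tokens:
        if any(ch.isalnum() for ch in token):
            # A syllable: it belongs to the word being built.
            current.append(token)
        elif current:
            # A space or punctuation mark closes the current word.
            words.append(current)
            current = []
    if current:
        words.append(current)
    return words


# Sample data copied from the documented output of syllabify("Merhaba, dünya!").
tokens = ['Mer', 'ha', 'ba', ',', ' ', 'dün', 'ya', '!']
words = group_syllables_by_word(tokens)
print(words)
# output: [['Mer', 'ha', 'ba'], ['dün', 'ya']]
print([len(word) for word in words])
# output: [3, 2]
```

Grouping only works on the with_punctuation=True output; without the separator tokens, word boundaries cannot be recovered from the list.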

Or process a file directly:

from turkish_syllable.csyllable_tr import process_input_output

input_file = "input.txt"
output_file = "output.txt"

"""
function:
	- process_input_output: syllabifies the contents of a file
parameters:
	- input_file: path of the file containing the text to syllabify
	- output_file: path of the file the syllabified text is written to
	- with_punctuation: whether punctuation and space characters are included in the output (default=True)
"""
process_input_output(input_file=input_file, output_file=output_file, with_punctuation=True)

with open(output_file, "r", encoding="utf-8") as f:
    print("With punctuation:")
    print(f.read())

process_input_output(input_file=input_file, output_file=output_file, with_punctuation=False)

with open(output_file, "r", encoding="utf-8") as f:
    print("\nWithout punctuation:")
    print(f.read())
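
When the text is already in memory, the file-based function can be wrapped so that temporary files are created and cleaned up automatically. This is a minimal sketch under stated assumptions: syllabify_via_files is a hypothetical helper, it expects a callable with the same keyword arguments as process_input_output shown above, and it assumes both files are UTF-8 encoded, as in the example above.

```python
import tempfile
from pathlib import Path


def syllabify_via_files(text, process, with_punctuation=True):
    """Run a file-based syllabifier on an in-memory string.

    Hypothetical helper. `process` is assumed to accept the keyword
    arguments input_file, output_file and with_punctuation, like
    process_input_output in the example above.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "input.txt"
        dst = Path(tmp) / "output.txt"
        src.write_text(text, encoding="utf-8")
        process(input_file=str(src), output_file=str(dst),
                with_punctuation=with_punctuation)
        # Read the result before the temporary directory is removed.
        return dst.read_text(encoding="utf-8")


# Intended use (requires the package to be installed):
#   from turkish_syllable.csyllable_tr import process_input_output
#   print(syllabify_via_files("Merhaba, dünya!", process_input_output))
```

Passing the processing function as an argument keeps the helper independent of the package import, which also makes it easy to exercise with a stand-in function.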

Using the command line:

# with punctuation (default)
python3 -m turkish_syllable -i input.txt -o output.txt -p
# or enter the text directly:
python3 -m turkish_syllable -p
# sample input: "Merhaba, dünya!"
# output: Mer ha ba ,   dün ya !

# without punctuation
python3 -m turkish_syllable -i input.txt -o output.txt --no-punctuation
# or:
python3 -m turkish_syllable --no-punctuation
# sample input: "Merhaba, dünya!"
# output: Mer ha ba dün ya
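
The same commands can be launched from a Python script, for example in a batch job. The sketch below only uses the flags shown above (-i, -o, -p, --no-punctuation); build_cli_command and run_cli are hypothetical helper names, not part of the package.

```python
import subprocess
import sys


def build_cli_command(input_path, output_path, with_punctuation=True):
    """Build the argument list for the command-line calls shown above.

    Hypothetical helper; it relies only on the documented flags.
    """
    flag = "-p" if with_punctuation else "--no-punctuation"
    return [sys.executable, "-m", "turkish_syllable",
            "-i", input_path, "-o", output_path, flag]


def run_cli(input_path, output_path, with_punctuation=True):
    """Run the CLI and raise CalledProcessError if it exits with an error."""
    command = build_cli_command(input_path, output_path, with_punctuation)
    subprocess.run(command, check=True)


# Show the arguments without running anything (interpreter path omitted).
print(build_cli_command("input.txt", "output.txt", with_punctuation=False)[1:])
# output: ['-m', 'turkish_syllable', '-i', 'input.txt', '-o', 'output.txt', '--no-punctuation']
```

Using sys.executable instead of a literal python3 runs the module with the same interpreter, and therefore the same installed packages, as the calling script.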

Technical Details

Language: The algorithm is written in C and bound to Python with ctypes.
Syllabification Algorithm: It splits words at the natural boundaries between vowels and consonants, following Turkish syllabification rules. Special cases (for example, words with 3 or 4 letters) are handled by optimized paths.
Dependencies: No third-party Python packages are required; only the standard library is used.
File Structure:
- syllable.c: C source code containing the syllabification logic.
- libsyllable.so: Compiled shared library (Linux, manylinux).
- libsyllable.dll: Compiled shared library (Windows).
- libsyllable.dylib: Compiled shared library (macOS).
- csyllable_tr.py: Python wrapper (ctypes bindings).
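
The vowel-and-consonant rule mentioned under Technical Details can be illustrated in a few lines of pure Python. This is a simplified sketch of the general Turkish rule (every syllable has exactly one vowel, and the last consonant before a vowel starts the next syllable). It is not the library's C implementation, it handles single words only, and sketch_syllabify is a hypothetical name.

```python
# Turkish vowels, including circumflexed forms; upper case is listed
# explicitly because str.lower() does not map "İ" to a plain "i".
VOWELS = set("aeıioöuüâîûAEIİOÖUÜÂÎÛ")


def sketch_syllabify(word):
    """Split one word into syllables with the basic vowel/consonant rule.

    Illustrative only; not the algorithm used by turkish-syllable.
    """
    if not word:
        return []
    vowel_positions = [i for i, ch in enumerate(word) if ch in VOWELS]
    if len(vowel_positions) <= 1:
        # Zero or one vowel: the word is a single syllable.
        return [word]

    syllables, start = [], 0
    for prev, nxt in zip(vowel_positions, vowel_positions[1:]):
        consonants_between = nxt - prev - 1
        # Adjacent vowels split between them; otherwise the last
        # consonant of the cluster opens the next syllable.
        cut = nxt if consonants_between == 0 else nxt - 1
        syllables.append(word[start:cut])
        start = cut
    syllables.append(word[start:])
    return syllables


print(sketch_syllabify("merhaba"))
# output: ['mer', 'ha', 'ba']
print(sketch_syllabify("Türkçe"))
# output: ['Türk', 'çe']
print(sketch_syllabify("saat"))
# output: ['sa', 'at']
```

The sketch makes no attempt to handle punctuation, whole sentences, or the irregularities of words of foreign origin.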

Requirements

Python 3.6 or higher
Runs on Linux, Windows, and macOS.
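
In a ctypes-based package, cross-platform support generally comes down to loading the shared library that matches the host system. The sketch below shows one way to choose among the three library files listed under Technical Details. It is an assumption about how such a loader could look, not the package's actual loading code, and both function names are hypothetical.

```python
import ctypes
import sys
from pathlib import Path


def shared_library_name(platform=None):
    """Return the library file name for a sys.platform value.

    Hypothetical helper; the file names come from the File Structure list.
    """
    platform = sys.platform if platform is None else platform
    if platform.startswith("win"):
        return "libsyllable.dll"
    if platform == "darwin":
        return "libsyllable.dylib"
    # Linux and other Unix-like systems fall back to the .so file.
    return "libsyllable.so"


def load_library(directory):
    """Load the matching shared library from `directory` with ctypes."""
    return ctypes.CDLL(str(Path(directory) / shared_library_name()))


print(shared_library_name("linux"))
# output: libsyllable.so
```

load_library raises OSError if the file is missing or was built for a different architecture, so a real loader would likely catch that and report a clearer message.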

License

This project is distributed under the MIT License.

Contribution

If you want to contribute:

Fork the repository on GitHub: ahmetozdemirrr/Turkish-Syllable
Make your changes and open a pull request.

Contact

For questions or suggestions: [email protected]

Version History

0.2.1: Platform independence, README improved.
0.1.4: README improved.
0.1.3: README improved, bug fixes.
0.1.2: Bug fixes.
0.1.1: Added with_punctuation parameter, shortened function name to syllabify.
0.1.0: Initial release.
