Linguistic Features in Text (LiFT)

LiFT is a library for extracting linguistic features from textual data.

LiFT is currently maintained by:

Procoli, FernUniversität in Hagen
Educational Measurement and Data Science, IPN Kiel

First steps

Philosophy

We rely on a UIMA CAS representation model based on the DKPro Core type system and preprocessing components. This makes LiFT multilingual: it supports all the languages included in DKPro Core. However, not every structure may be supported for every language.

LiFT distinguishes between linguistic structures (lemmas, POS tags, syllables, spelling errors, etc.) and features (based on these structures). Structures are represented in the document model and can be visualized. Features are numeric values that represent properties of the document. For example, SpellingErrorRatio may have a value of 0.06, meaning that 6% of all tokens in the text contain a spelling error.
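The distinction between structures and features can be sketched in a few lines of plain Python. This is only an illustration, not LiFT's API: the `Annotation` class, the type names `"Token"` and `"SpellingError"`, and the function `spelling_error_ratio` are assumptions made for this example. Structures are modelled as typed spans over the text, and the feature is a single number derived from them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A typed span over the document text (hypothetical stand-in for a structure)."""
    begin: int
    end: int
    type: str  # illustrative type names: "Token", "SpellingError"


def spelling_error_ratio(annotations):
    """Share of tokens that overlap at least one spelling-error span."""
    tokens = [a for a in annotations if a.type == "Token"]
    errors = [a for a in annotations if a.type == "SpellingError"]
    if not tokens:
        return 0.0
    flagged = sum(
        1
        for t in tokens
        if any(e.begin < t.end and t.begin < e.end for e in errors)
    )
    return flagged / len(tokens)


text = "This is a smal test"
annotations = [
    Annotation(0, 4, "Token"),            # This
    Annotation(5, 7, "Token"),            # is
    Annotation(8, 9, "Token"),            # a
    Annotation(10, 14, "Token"),          # smal
    Annotation(15, 19, "Token"),          # test
    Annotation(10, 14, "SpellingError"),  # "smal" is misspelled
]

ratio = spelling_error_ratio(annotations)
print(ratio)  # one of five tokens is flagged
```

Keeping the spans separate from the number computed over them means the same structures can feed several different features.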

The project is under heavy development, but we are working towards a stable release.

We plan to implement the following types of structures:

casing
lemmas
quotations
POS tags
phrases
spelling errors
stems
syllables
tokens
T-units
voice

We also support various meta-features of linguistic complexity:

readability measures
type-token ratio (TTR)
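
As a rough illustration of the second item, the type-token ratio is the number of distinct word forms (types) divided by the total number of tokens. The sketch below assumes a pre-tokenized input and case-insensitive comparison by default. The function name `type_token_ratio` and its `lowercase` parameter are hypothetical and do not reflect how LiFT computes the measure.

```python
def type_token_ratio(tokens, lowercase=True):
    """Distinct forms divided by total tokens; 0.0 for empty input."""
    if not tokens:
        return 0.0
    forms = [t.lower() for t in tokens] if lowercase else list(tokens)
    return len(set(forms)) / len(forms)


tokens = ["The", "cat", "saw", "the", "other", "cat"]

ttr = type_token_ratio(tokens)                          # 4 types / 6 tokens
ttr_cased = type_token_ratio(tokens, lowercase=False)   # 5 types / 6 tokens
print(ttr, ttr_cased)
```

Raw TTR decreases as texts get longer, so values are only comparable across texts of similar length.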

zesch/linguistic-features-in-text

Folders and files

Latest commit

History

Repository files navigation

Linguistic Features in Text (LiFT)

First steps

Philosophy

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 11

Uh oh!

Languages

Packages