Feature request: Rule-based Annotator for Cardinal/Ordinal #1429

While DKPro has UIMA types for Cardinal and Ordinal, it seems there are no annotators that can produce them.

So I implemented my own CardOrdAnnotator for English based on the Stanford NLP QuantifiableEntityNormalizer class.

If you are interested, I could roll that into dkpro-core-api-ner-asl, or whatever module you think is appropriate.

I have attached the classes and tests that I wrote for this. Note that you won't be able to run them, as they use some utilities that I wrote for myself, but they should give you an idea of how the annotator works.

Basically, the annotator uses a class, CardOrdParser, which I wrote based on QuantifiableEntityNormalizer. Because QuantifiableEntityNormalizer is GPL-licensed, the annotator would have to be GPLed as well.

Note that at the moment, the parser is only available for English, but it would probably be relatively easy to implement it for other languages. To do that, however, we would have to rewrite (or extend) QuantifiableEntityNormalizer, because its current implementation uses static variables to store the words for cardinals and ordinals (e.g. "first", "one", etc.). As a result, you cannot have different instances of QuantifiableEntityNormalizer for different languages. I guess we could rewrite QuantifiableEntityNormalizer altogether (using its code as "inspiration"). I am not sure whether that would be sufficient to remove the GPL constraint on CardOrdParser.
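To make the static-variable problem more concrete, here is a rough sketch of what per-instance word tables could look like. This is only an illustration under my own assumptions: `NumberWordLexicon`, its methods, and the tiny word lists are hypothetical names made up for this example. They are not part of Stanford NLP, DKPro, or the attached classes, and a real parser would also need to handle compound numbers.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.OptionalInt;

/**
 * Hypothetical per-language lexicon of number words (assumption for illustration).
 * The word tables are instance fields rather than static fields, so one
 * instance per language can coexist in the same JVM.
 */
final class NumberWordLexicon {
    private final Locale locale;
    private final Map<String, Integer> cardinals = new HashMap<>();
    private final Map<String, Integer> ordinals = new HashMap<>();

    NumberWordLexicon(Locale locale) {
        this.locale = locale;
    }

    /** Registers a cardinal word such as "one"; returns this for chaining. */
    NumberWordLexicon cardinal(String word, int value) {
        cardinals.put(word.toLowerCase(locale), value);
        return this;
    }

    /** Registers an ordinal word such as "first"; returns this for chaining. */
    NumberWordLexicon ordinal(String word, int value) {
        ordinals.put(word.toLowerCase(locale), value);
        return this;
    }

    /** Looks up a single token as a cardinal; empty if the word is unknown. */
    OptionalInt cardinalValue(String token) {
        return lookup(cardinals, token);
    }

    /** Looks up a single token as an ordinal; empty if the word is unknown. */
    OptionalInt ordinalValue(String token) {
        return lookup(ordinals, token);
    }

    private OptionalInt lookup(Map<String, Integer> table, String token) {
        Integer value = table.get(token.toLowerCase(locale));
        return value == null ? OptionalInt.empty() : OptionalInt.of(value);
    }

    /** Deliberately tiny English word list, for illustration only. */
    static NumberWordLexicon english() {
        return new NumberWordLexicon(Locale.ENGLISH)
                .cardinal("one", 1).cardinal("two", 2).cardinal("three", 3)
                .ordinal("first", 1).ordinal("second", 2).ordinal("third", 3);
    }

    /** Deliberately tiny French word list, for illustration only. */
    static NumberWordLexicon french() {
        return new NumberWordLexicon(Locale.FRENCH)
                .cardinal("un", 1).cardinal("deux", 2).cardinal("trois", 3)
                .ordinal("premier", 1).ordinal("second", 2);
    }
}
```

With something like this, `NumberWordLexicon.english()` and `NumberWordLexicon.french()` can be used side by side, and a language-specific parser could take the lexicon as a constructor argument instead of reading shared static state.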

Let me know if you are interested.


[CardOrdAnnotator_files.zip](https://github.com/dkpro/dkpro-core/files/3874781/CardOrdAnnotator_files.zip)

