Sort order for supertypes/subtypes #338

@zesch

When calling select(supertype) on a CAS that contains annotations of the supertype and of several subtypes (e.g. POS and POS_NOUN, POS_VERB, etc.), the returned iterable is ordered only within each subtype's subset, not overall.
Actual output for the POS example:

d.t.u.d.c.a.l.t.p.POS_VERB(PosValue=VAFIN, begin=4, end=7)
d.t.u.d.c.a.l.t.p.POS_VERB(PosValue=VVFIN, begin=25, end=29)
d.t.u.d.c.a.l.t.p.POS_VERB(PosValue=PTKVZ, begin=34, end=37)
d.t.u.d.c.a.l.t.p.POS_DET(PosValue=ART, begin=8, end=11)
d.t.u.d.c.a.l.t.p.POS(PosValue=PPER, begin=0, end=3)
d.t.u.d.c.a.l.t.p.POS(PosValue=VAFIN, begin=4, end=7)
d.t.u.d.c.a.l.t.p.POS(PosValue=ART, begin=8, end=11)
d.t.u.d.c.a.l.t.p.POS(PosValue=NN, begin=12, end=20)
d.t.u.d.c.a.l.t.p.POS(PosValue=KON, begin=21, end=24)
d.t.u.d.c.a.l.t.p.POS(PosValue=VVFIN, begin=25, end=29)
d.t.u.d.c.a.l.t.p.POS(PosValue=ADJD, begin=30, end=33)
d.t.u.d.c.a.l.t.p.POS(PosValue=PTKVZ, begin=34, end=37)
d.t.u.d.c.a.l.t.p.POS(PosValue=$., begin=37, end=38)
d.t.u.d.c.a.l.t.p.POS_CONJ(PosValue=KON, begin=21, end=24)
d.t.u.d.c.a.l.t.p.POS_PRON(PosValue=PPER, begin=0, end=3)
d.t.u.d.c.a.l.t.p.POS_NOUN(PosValue=NN, begin=12, end=20)
d.t.u.d.c.a.l.t.p.POS_PUNCT(PosValue=$., begin=37, end=38)
d.t.u.d.c.a.l.t.p.POS_ADJ(PosValue=ADJD, begin=30, end=33)

The expected behavior is that, when retrieving the supertype (POS in this case), all returned annotations, including those of the subtypes, are sorted by offset position:

d.t.u.d.c.a.l.t.p.POS(PosValue=PPER, begin=0, end=3)
d.t.u.d.c.a.l.t.p.POS_PRON(PosValue=PPER, begin=0, end=3)
d.t.u.d.c.a.l.t.p.POS(PosValue=VAFIN, begin=4, end=7)
d.t.u.d.c.a.l.t.p.POS_VERB(PosValue=VAFIN, begin=4, end=7)
d.t.u.d.c.a.l.t.p.POS(PosValue=ART, begin=8, end=11)
d.t.u.d.c.a.l.t.p.POS_DET(PosValue=ART, begin=8, end=11)
...

(I know that having two POS tags at the same offsets is strange; they are only there for the sake of the example.)

This might be related to #247.
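
To make the expected ordering concrete, here is a minimal sketch of the sort I have in mind. It does not use the UIMA or uimaFIT API: Ann is a hypothetical stand-in for an annotation (type name plus offsets), and sortByOffset and BY_OFFSET are illustrative names only. Breaking ties by type name is also just an assumption, chosen so that the supertype comes before its subtypes as in the expected output above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class OffsetSortSketch {

    // Hypothetical stand-in for an annotation: short type name plus offsets.
    static final class Ann {
        final String type;
        final int begin;
        final int end;

        Ann(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return type + "(begin=" + begin + ", end=" + end + ")";
        }
    }

    // Order by begin, then end, then type name (assumed tie-breaker, for illustration).
    static final Comparator<Ann> BY_OFFSET =
            Comparator.comparingInt((Ann a) -> a.begin)
                    .thenComparingInt(a -> a.end)
                    .thenComparing(a -> a.type);

    // Returns a sorted copy; the input list is left untouched.
    static List<Ann> sortByOffset(List<Ann> input) {
        List<Ann> copy = new ArrayList<>(input);
        copy.sort(BY_OFFSET);
        return copy;
    }

    public static void main(String[] args) {
        // Same grouping-by-subtype pattern as in the actual output above.
        List<Ann> grouped = Arrays.asList(
                new Ann("POS_VERB", 4, 7),
                new Ann("POS_DET", 8, 11),
                new Ann("POS", 0, 3),
                new Ann("POS", 4, 7),
                new Ann("POS", 8, 11),
                new Ann("POS_PRON", 0, 3));

        for (Ann a : sortByOffset(grouped)) {
            System.out.println(a);
        }
    }
}
```

Applied to the result of select(supertype) copied into a list, something along these lines could serve as a workaround until the ordering is handled by the library itself.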
