Description
When calling select(supertype) while the CAS contains annotations of both the supertype and several of its subtypes (e.g. POS and POS_NOUN, POS_VERB, etc.), the returned iterable is ordered only within each type's subset, not overall.
Example output for the POS example:
d.t.u.d.c.a.l.t.p.POS_VERB(PosValue=VAFIN, begin=4, end=7)
d.t.u.d.c.a.l.t.p.POS_VERB(PosValue=VVFIN, begin=25, end=29)
d.t.u.d.c.a.l.t.p.POS_VERB(PosValue=PTKVZ, begin=34, end=37)
d.t.u.d.c.a.l.t.p.POS_DET(PosValue=ART, begin=8, end=11)
d.t.u.d.c.a.l.t.p.POS(PosValue=PPER, begin=0, end=3)
d.t.u.d.c.a.l.t.p.POS(PosValue=VAFIN, begin=4, end=7)
d.t.u.d.c.a.l.t.p.POS(PosValue=ART, begin=8, end=11)
d.t.u.d.c.a.l.t.p.POS(PosValue=NN, begin=12, end=20)
d.t.u.d.c.a.l.t.p.POS(PosValue=KON, begin=21, end=24)
d.t.u.d.c.a.l.t.p.POS(PosValue=VVFIN, begin=25, end=29)
d.t.u.d.c.a.l.t.p.POS(PosValue=ADJD, begin=30, end=33)
d.t.u.d.c.a.l.t.p.POS(PosValue=PTKVZ, begin=34, end=37)
d.t.u.d.c.a.l.t.p.POS(PosValue=$., begin=37, end=38)
d.t.u.d.c.a.l.t.p.POS_CONJ(PosValue=KON, begin=21, end=24)
d.t.u.d.c.a.l.t.p.POS_PRON(PosValue=PPER, begin=0, end=3)
d.t.u.d.c.a.l.t.p.POS_NOUN(PosValue=NN, begin=12, end=20)
d.t.u.d.c.a.l.t.p.POS_PUNCT(PosValue=$., begin=37, end=38)
d.t.u.d.c.a.l.t.p.POS_ADJ(PosValue=ADJD, begin=30, end=33)
The expected behavior is that when retrieving the supertype (POS in this case), all returned annotations, including those of subtypes, are sorted by offset:
d.t.u.d.c.a.l.t.p.POS(PosValue=PPER, begin=0, end=3)
d.t.u.d.c.a.l.t.p.POS_PRON(PosValue=PPER, begin=0, end=3)
d.t.u.d.c.a.l.t.p.POS(PosValue=VAFIN, begin=4, end=7)
d.t.u.d.c.a.l.t.p.POS_VERB(PosValue=VAFIN, begin=4, end=7)
d.t.u.d.c.a.l.t.p.POS(PosValue=ART, begin=8, end=11)
d.t.u.d.c.a.l.t.p.POS_DET(PosValue=ART, begin=8, end=11)
...
(I know that having two POS tags at the same offsets is strange; it is only here for the sake of the example.)
This might be related to #247.
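
To make the expected ordering concrete, here is a minimal, library-independent sketch. It does not use the library's actual API or internals: `Anno`, `per_type`, and `offset_key` are hypothetical names made up for illustration, and the data is a shortened version of the example above. It assumes each per-type list is already sorted by offset, and it uses the type name as a tie-breaker purely to reproduce the order shown in the expected output.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Anno:
    """Hypothetical stand-in for an annotation; not the library's class."""
    type_name: str
    pos_value: str
    begin: int
    end: int


# Assumption: annotations are held per type, each list sorted by offset.
per_type = {
    "POS_VERB": [Anno("POS_VERB", "VAFIN", 4, 7), Anno("POS_VERB", "VVFIN", 25, 29)],
    "POS_DET": [Anno("POS_DET", "ART", 8, 11)],
    "POS": [
        Anno("POS", "PPER", 0, 3),
        Anno("POS", "VAFIN", 4, 7),
        Anno("POS", "ART", 8, 11),
    ],
    "POS_PRON": [Anno("POS_PRON", "PPER", 0, 3)],
}


def offset_key(anno):
    # Sort by begin, then end; the type name is an arbitrary tie-breaker
    # chosen only so that the result matches the expected output above.
    return (anno.begin, anno.end, anno.type_name)


# Observed behavior: per-type subsets simply follow one another.
concatenated = [a for annos in per_type.values() for a in annos]

# Expected behavior: the sorted per-type subsets are merged into one
# sequence that is ordered by offset overall.
merged = list(heapq.merge(*per_type.values(), key=offset_key))

for anno in merged:
    print(f"{anno.type_name}(PosValue={anno.pos_value}, begin={anno.begin}, end={anno.end})")
```

Until the ordering is fixed, sorting the result of select on the caller's side with a comparable key should give the same order, at the cost of materializing the iterable first.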