Conversation

@arubehn arubehn commented Aug 19, 2025

Pull request checklist

  • add new concept list
  • add new metadata
  • add new Concepticon concept sets
    • checked whether the new concept(s) can be applied to existing lists with
      concepticon notlinked --gloss "NEW_GLOSS"
  • add new Concepticon concept relations
  • refine existing Concepticon concept set mappings
  • refine Concepticon glosses
  • refine Concepticon concept relations
  • refine Concepticon concept definitions
  • retire data

Additional information

This PR adds the concept list used in the Comparative Austronesian Dictionary to Concepticon.

Author

arubehn commented Aug 19, 2025

@LinguList is there a way of explicitly referencing that this list is derived from Buck-1949-1110? I know this was discussed at some point, but I don't know if a workflow was established in the end.

Contributor

@LinguList LinguList left a comment

This looks nice. The mappings need a thorough review, so please find somebody to double-check them quickly. What I would ask for is: change the header for ORIGINAL_ID to BUCK_1949_1110, as we also do in Castro-2010-511.tsv. Also, check my answer to your question in the comment.
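A minimal sketch of the requested header rename, assuming the list is a plain tab-separated file whose first line is the header. The helper name `rename_column` and the column names other than ORIGINAL_ID and BUCK_1949_1110 are assumptions for illustration, not taken from the actual file:

```python
def rename_column(tsv_text, old, new):
    """Rename one header cell of a TSV string, leaving all data rows untouched."""
    header, sep, body = tsv_text.partition("\n")
    cells = header.split("\t")
    if old not in cells:
        raise ValueError(f"column {old!r} not found in header")
    cells = [new if cell == old else cell for cell in cells]
    return "\t".join(cells) + sep + body


# Illustrative sample; the header layout is assumed, the data row is from this PR.
sample = (
    "ID\tNUMBER\tENGLISH\tCONCEPTICON_ID\tCONCEPTICON_GLOSS\tORIGINAL_ID\n"
    "Tryon-1995-1310-119\t02.550\tCOUSIN\t1643\tCOUSIN\tBuck-1949-1110-80\n"
)
renamed = rename_column(sample, "ORIGINAL_ID", "BUCK_1949_1110")
```

Only the first line is rewritten, so the mappings themselves stay byte-for-byte identical.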

Schauenburg-2015-858 Schauenburg, Gesche and Ambrasat, Jens and Schröder, Tobias and Von Scheve, Christian and Conrad, Markus 2015 858 ratings German German https://doi.org/10.3758/s13428-014-0494-7 Schauenburg2015 This list contains ratings of valence, arousal, potency, authority and community for 858 German words. Participants were native German speakers recruited through university mailing lists. 720-735
Janschewitz-2008-460 Janschewitz, Kristin 2008 460 ratings English English https://doi.org/10.3758/BRM.40.4.1065 Janschewitz2008 This list of 460 English taboo, emotionally valenced, and emotionally neutral words was rated for frequency, inappropriateness, valence, arousal, and imageability by 78 native-English-speaking college students. 1065-1074
Tryon-1995-1310 Tryon, Darrell T. 1995 1310 questionnaire English Austronesian languages https://doi.org/10.1515/9783110884012 Tryon1995 This list comprises 1310 basic vocabulary items, extending the impactful 1110-items list by (:bib:Buck1949) by 200 items that are particularly relevant to Austronesian peoples.
Contributor

@arubehn, we reference lists directly, if they are linked, by using [1110-item list of Buck (1949)](:ref:Buck-1949-1110), if that list exists. You have (xxx) without the preceding [xxx] here anyway.
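The missing `[xxx]` part could be caught with a small check like the one below. This is only a sketch: the regular expression and the helper name `bare_references` are assumptions, and it only looks for `(:ref:...)` or `(:bib:...)` targets that are not directly preceded by a closing bracket.

```python
import re

# A reference target such as (:ref:Buck-1949-1110) or (:bib:Buck1949)
# that is NOT directly preceded by "]" lacks its [link text] part.
BARE_REFERENCE = re.compile(r"(?<!\])\(:(?:ref|bib):[^)\s]+\)")


def bare_references(note):
    """Return all reference targets in `note` that lack preceding [link text]."""
    return BARE_REFERENCE.findall(note)


original_note = (
    "extending the impactful 1110-items list by (:bib:Buck1949) "
    "by 200 items"
)
fixed_note = (
    "extending the impactful "
    "[1110-item list of Buck (1949)](:ref:Buck-1949-1110) by 200 items"
)

problems_before = bare_references(original_note)
problems_after = bare_references(fixed_note)
```

Run against the note in this PR, the first call flags the bare `(:bib:Buck1949)`, while the corrected wording passes.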

@alzkuc alzkuc requested review from alzkuc and removed request for alzkuc August 26, 2025 11:39
@alzkuc alzkuc self-assigned this Aug 26, 2025
Collaborator

@alzkuc alzkuc left a comment

Hey! I have just gone through the mappings and left a few comments.

Tryon-1995-1310-119 02.550 COUSIN 1643 COUSIN Buck-1949-1110-80
Tryon-1995-1310-120 02.560 ANCESTORS 1669 ANCESTORS Buck-1949-1110-81
Tryon-1995-1310-121 02.570 DESCENDANTS 490 DESCENDANTS Buck-1949-1110-82
Tryon-1995-1310-122 02.610 FATHER-IN-LAW (of a man) 1055 FATHER-IN-LAW Buck-1949-1110-83
Collaborator

You specify FATHER/MOTHER-IN-LAW (of a woman) as the concept FATHER/MOTHER-IN-LAW (OF WOMAN) but do not do the same for FATHER/MOTHER-IN-LAW (of a man). Perhaps you could specify for the male counterpart as well? This would result in using 2255 FATHER-IN-LAW (OF MAN) instead of 1055 and 2257 MOTHER-IN-LAW (OF MAN) instead of 1050.
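If the suggestion is adopted, the change amounts to a small override keyed by the list's own numbering. The sketch below assumes the rows are held as dictionaries with NUMBER, CONCEPTICON_ID and CONCEPTICON_GLOSS keys; the `OVERRIDES` table and `apply_overrides` helper are hypothetical names, while the IDs and glosses are the ones quoted in this comment.

```python
# Hypothetical override table: NUMBER -> (CONCEPTICON_ID, CONCEPTICON_GLOSS)
OVERRIDES = {
    "02.610": ("2255", "FATHER-IN-LAW (OF MAN)"),
    "02.620": ("2257", "MOTHER-IN-LAW (OF MAN)"),
}


def apply_overrides(rows, overrides):
    """Return copies of `rows` with the mapping replaced where NUMBER matches."""
    updated = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows stay unchanged
        if row["NUMBER"] in overrides:
            concept_id, gloss = overrides[row["NUMBER"]]
            new_row["CONCEPTICON_ID"] = concept_id
            new_row["CONCEPTICON_GLOSS"] = gloss
        updated.append(new_row)
    return updated


rows = [
    {"NUMBER": "02.610", "ENGLISH": "FATHER-IN-LAW (of a man)",
     "CONCEPTICON_ID": "1055", "CONCEPTICON_GLOSS": "FATHER-IN-LAW"},
    {"NUMBER": "02.611", "ENGLISH": "FATHER-IN-LAW (of a woman)",
     "CONCEPTICON_ID": "2254", "CONCEPTICON_GLOSS": "FATHER-IN-LAW (OF WOMAN)"},
    {"NUMBER": "02.620", "ENGLISH": "MOTHER-IN-LAW (of a man)",
     "CONCEPTICON_ID": "1050", "CONCEPTICON_GLOSS": "MOTHER-IN-LAW"},
]
remapped = apply_overrides(rows, OVERRIDES)
```

Keying on NUMBER rather than on the old concept ID keeps the change limited to these two rows.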

Tryon-1995-1310-123 02.611 FATHER-IN-LAW (of a woman) 2254 FATHER-IN-LAW (OF WOMAN)
Tryon-1995-1310-124 02.620 MOTHER-IN-LAW (of a man) 1050 MOTHER-IN-LAW Buck-1949-1110-84
Tryon-1995-1310-125 02.621 MOTHER-IN-LAW (of a woman) 2256 MOTHER-IN-LAW (OF WOMAN)
Tryon-1995-1310-126 02.630 SON-IN-LAW (of a man) 1056 SON-IN-LAW Buck-1949-1110-85
Collaborator

This is similar to the previous case. However, you are the most familiar with the dataset, so you have to decide whether the specification makes sense here. If yes, 2267 SON-IN-LAW (OF MAN) and 2265 DAUGHTER-IN-LAW (OF MAN) are available.

Tryon-1995-1310-146 02.960 THEY 817 THEY
Tryon-1995-1310-147 03.110 ANIMAL 619 ANIMAL Buck-1949-1110-97
Tryon-1995-1310-148 03.120 MALE (adj.) 2263 MALE (OF ANIMAL) Buck-1949-1110-98
Tryon-1995-1310-149 03.130 FEMALE (adj.) 2262 FEMALE (OF ANIMAL) Buck-1949-1110-99
Collaborator

Does the author specify that it only relates to animals? A broader term is available, in case it fits better: 1551 FEMALE

Author

Yes, although the glosses are not really clear. Compare 03.120 MALE (adj.) (referring to animals) to 02.230 MALE (more general or referring to humans).

Unfortunately, the author does not really clarify the glosses, so some of it is guesswork based on the context and the reference list by Buck. I am now considering whether the first instance (02.230) should rather be mapped to MALE (OF PERSON), but right now it is not exactly clear to me what the elicited forms mean (and what they do not).

Tryon-1995-1310-191 03.594 DOVE 1853 DOVE
Tryon-1995-1310-192 03.596 OWL 735 OWL
Tryon-1995-1310-193 03.610 DOG 2009 DOG Buck-1949-1110-137
Tryon-1995-1310-194 03.614 RABBIT 1190 HARE
Collaborator

There is a better concept available: either 1136 RABBIT or 2345 LEPORID (RABBIT OR HARE).

Tryon-1995-1310-201 03.654 GILL 1916 GILL
Tryon-1995-1310-202 03.655 SHELL 598 SHELL
Tryon-1995-1310-203 03.661 SHARK 1110 SHARK
Tryon-1995-1310-204 03.662 PORPOISE, DOLPHIN 1479 DOLPHIN
Collaborator

does the word provided in the dataset refer to dolphin only?

Author

Not sure. The way I understand it, however, a porpoise is a type of dolphin. So I would assume that the elicited terms will mostly, but not necessarily, refer to a porpoise (but can also refer to a dolphin more broadly). That's why I mapped it to the broader term DOLPHIN.

Collaborator

@alzkuc alzkuc Aug 26, 2025

A slight tangent alert: I initially thought so too, but it turns out that this is not the case. They look very similar, but only share the order (toothed whales). They belong to two different families. Is there a way you can determine which fits better for this list? If not, it is up to you which you find more likely, or you could always go for WHALE, as that is the broader concept covering both DOLPHIN and PORPOISE.

Collaborator

(@LinguList I will later update the concept definition on Concepticon for both Dolphin and Porpoise to be more accurate)

Author

Thanks for digging that up, @alzkuc, you are absolutely right :) Unfortunately, the data source does not allow me to better understand what exactly was elicited here. It is conceivable that many languages do not strictly differentiate between dolphins and porpoises (thinking of the older German "Tümmler", which was used to refer to both dolphins, "großer Tümmler", and porpoises, "kleiner Tümmler").

This would technically mean that the correct concept would be PORPOISE OR DOLPHIN (or maybe DELPHINOID?), as a union of these two concepts. What should we do about that?

Collaborator

Yes, the most accurate solution would be to introduce a broader concept which covers both Porpoise and Dolphin, e.g. DELPHINOID, to which you then map.

Collaborator

You can now map it as the new concept: 4074 DOLPHIN-LIKE WHALE (DOLPHIN OR PORPOISE)

Author

@arubehn arubehn Aug 27, 2025

Done, thank you!

Tryon-1995-1310-426 05.420 BREAKFAST 1322 BREAKFAST Buck-1949-1110-278
Tryon-1995-1310-427 05.430 LUNCH 768 LUNCH Buck-1949-1110-279
Tryon-1995-1310-428 05.440 DINNER 1833 DINNER (SUPPER) Buck-1949-1110-280
Tryon-1995-1310-429 05.450 SUPPER 1833 DINNER (SUPPER) Buck-1949-1110-281
Collaborator

This is a tricky one, because the difference between the two words (Supper and Dinner) is largely regional in English. Does the language distinguish between the two?

Collaborator

If not, I think you could map it only once and leave the other one unmapped.

Author

I'm not sure about this one. Many elicited forms seem to be identical anyway. In the original list Buck-1949-1110, both concepts are mapped to DINNER (SUPPER) as well, so I thought this would be the best way to proceed here as well.

In those cases where the forms do differ, it seems like DINNER refers to a slightly larger or more formal meal than SUPPER. However, the difference seems to be very subtle.
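Cases like this one, where two items of the list point to the same concept set, can be listed for review with a simple grouping step. A sketch under the assumption that each row is available as a (NUMBER, gloss, Concepticon ID) tuple; `shared_concept_sets` is a hypothetical helper, and the sample rows are the four from the diff above:

```python
from collections import defaultdict


def shared_concept_sets(rows):
    """Group list items by Concepticon ID and keep IDs used more than once."""
    by_id = defaultdict(list)
    for number, gloss, concepticon_id in rows:
        if concepticon_id:  # skip unmapped items
            by_id[concepticon_id].append((number, gloss))
    return {cid: items for cid, items in by_id.items() if len(items) > 1}


rows = [
    ("05.420", "BREAKFAST", "1322"),
    ("05.430", "LUNCH", "768"),
    ("05.440", "DINNER", "1833"),
    ("05.450", "SUPPER", "1833"),
]
shared = shared_concept_sets(rows)
```

Whether such a double mapping is kept, as in Buck-1949-1110, or reduced to a single mapping remains a judgement call for the person who knows the data.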

Tryon-1995-1310-568 07.580 ARCH 876 ARCH
Tryon-1995-1310-569 07.610 MASON 877 MASON
Tryon-1995-1310-570 07.620 BRICK 1006 BRICK
Tryon-1995-1310-571 07.630 MORTAR 1731 MORTAR BINDER
Collaborator

Does this refer to the paste and not the utensil?

Author

Once again, a case where the author uses the exact same gloss twice (in 05.580 and 07.630) without any disambiguation. But it is clear from the context that this instance refers to the paste (and the other one to the utensil).

Tryon-1995-1310-629 08.850 BANYAN 346 BANYAN Buck-1949-1110-442
Tryon-1995-1310-630 08.910 SWEET POTATO 159 SWEET POTATO
Tryon-1995-1310-631 08.912 YAM 410 YAM
Tryon-1995-1310-632 08.920 TAPIOCA, MANIOC, CASSAVA 925 CASSAVA
Collaborator

Manioc and Cassava are names for the exact same root, correct (name depends on the region)? So this is a more general question for another time: should we merge these two concepts into one, e.g. CASSAVA or MANIOC (Or just CASSAVA and mention Manioc as an alternative name in the description)? (927 MANIOC in fact currently has 0 links in Concepticon, Cassava has 44.) @LinguList

Author

Yup, that's the reason why I mapped it to CASSAVA, not MANIOC.

One could argue that the ideal mapping for this gloss would be TAPIOCA OR CASSAVA, but I didn't want to introduce new concepts at this point.

Tryon-1995-1310-865 12.330 TOP 1753 TOP Buck-1949-1110-667
Tryon-1995-1310-866 12.340 BOTTOM 690 BOTTOM Buck-1949-1110-668
Tryon-1995-1310-867 12.350 END 742 END (OF SPACE) Buck-1949-1110-669
Tryon-1995-1310-868 12.352 POINTED 2992 TIP (OF OBJECT) Buck-1949-1110-670
Collaborator

perhaps 372 POINTED?

Tryon-1995-1310-1302 22.310 HEAVEN 1565 HEAVEN Buck-1949-1110-1094
Tryon-1995-1310-1303 22.320 HELL 878 HELL Buck-1949-1110-1095
Tryon-1995-1310-1304 22.350 DEMON (evil spirit) 1973 DEMON Buck-1949-1110-1098
Tryon-1995-1310-1305 22.370 IDOL 3205 GRAVEN IMAGE Buck-1949-1110-1100
Collaborator

1945 IDOL? But again, up to your judgement based on the dataset.

Author

I agree. In the original list by Buck, this item apparently elicited false idols (=graven images), hence the link, but this does not seem to be the case in this dataset.

Author

arubehn commented Aug 28, 2025

@LinguList everything should be addressed now - could you check again?

3 participants