Open
Description
Because LLM output is not fully controllable, some generated entries are not empty but are still unusable, such as this one:
{
"question": "What is the official abbreviation of the project called ',',...?",
"answer": "I don't really know !"
}
So we need to implement a topic labeling workflow to see what is happening 'semantically' in our data.
As a first step we want to use BERTopic. After that, a very cheap LLM that still has good general knowledge (maybe gpt-4o-mini) would assess the extracted QA pairs and decide whether they are suitable to ask in an interview.
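
A rough sketch of how the two steps could fit together, to make the proposal more concrete. Everything named here is an assumption for illustration: `label_topics`, `judge_entries`, `JUDGE_PROMPT`, the `interview_ready` field and the GOOD/BAD verdict format are hypothetical, and the LLM is passed in as a plain callable so any client (for example a gpt-4o-mini wrapper) can be plugged in. Only `BERTopic()` and `fit_transform` are real BERTopic calls.

```python
from typing import Callable

# Hypothetical judge prompt; the wording is an assumption, not project code.
JUDGE_PROMPT = (
    "You are reviewing question/answer pairs for an interview.\n"
    "Reply with exactly one word: GOOD or BAD.\n"
    "BAD means the question is malformed or the answer is a non-answer.\n\n"
    "Question: {question}\n"
    "Answer: {answer}\n"
)


def label_topics(entries: list[dict]) -> list[int]:
    """Step 1: cluster the QA pairs with BERTopic, one topic id per entry.

    BERTopic assigns -1 to outliers, which is where odd entries may end up.
    It needs a reasonably large corpus; a handful of entries is not enough.
    """
    # Imported lazily so the rest of the sketch runs without the dependency.
    from bertopic import BERTopic  # requires `pip install bertopic`

    docs = [f"{e['question']} {e['answer']}" for e in entries]
    topic_model = BERTopic()
    topics, _probs = topic_model.fit_transform(docs)
    return list(topics)


def judge_entries(
    entries: list[dict], ask_llm: Callable[[str], str]
) -> list[dict]:
    """Step 2: ask a cheap LLM whether each QA pair is fit for an interview.

    `ask_llm` is a hypothetical hook: it takes a prompt and returns the
    model's raw text reply.
    """
    judged = []
    for entry in entries:
        prompt = JUDGE_PROMPT.format(
            question=entry["question"], answer=entry["answer"]
        )
        verdict = ask_llm(prompt).strip().upper()
        # Anything that is not an explicit GOOD is treated as rejected.
        judged.append({**entry, "interview_ready": verdict.startswith("GOOD")})
    return judged
```

Keeping the LLM behind a callable means the judging logic can be tried with a stub before spending any API budget, and the topic ids from step 1 can be used to sample which entries are sent to the judge first.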