Skip to content

Perform topic labeling and key words extraction #6

@Azzedde

Description

@Azzedde

Because the output of the LLM is not 100% controllable, we can find some entries that are not empty, but have some unwanted behaviour such as this entry:

    {
        "question": "What is the official abbreviation of the project called ',',...?",
        "answer": "I don't really know !"
    }

So we need to implement a topic labeling workflow to see 'semantically' what's happening in our data.
We want to use BERTopic in a first step, then use a very cheap LLM that also have a good knowledge (maybe gpt-4o-mini) to assess the extracted qa and see if they are good to be asked in an interview or not

Metadata

Metadata

Assignees

Labels

No labels
No labels

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions