
Confusion about answer_embeds Usage in eval_forward for Inference #41


Description

@hshc123

Hello author,

I have a question about my understanding of the code. In the eval_forward function, the code concatenates answer_embeds with input_embeds and feeds the combined sequence into the LLM. Since answer_embeds appears to be the embedding of the ground-truth answer, why is the answer also given to the model during the inference phase? My understanding is that at inference time the model should only receive the video features and the question features, with no access to the answer. Doesn't feeding the answer embeddings directly into the model cause answer leakage and affect the evaluation results?

Here is the relevant code snippet:


```python
def eval_forward(accelerator, model, input_embeds, answer_embeds, pad_id, answer_ids, tokenizer):
    # first append answer_embeds to input_embeds
    prompt_length = input_embeds.shape[1]
    labels_length = answer_embeds.shape[1]
    input_embeds = torch.cat([input_embeds, answer_embeds], dim=1)
```
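For reference, here is a minimal, self-contained sketch of the teacher-forced evaluation pattern I believe this function implements. The HF-style `model(inputs_embeds=...)` call, the function name, and all shapes are my own assumptions for illustration, not the repo's actual code:

```python
import torch

def teacher_forced_eval(model, input_embeds, answer_embeds, answer_ids):
    # Hypothetical sketch: input_embeds is [B, prompt_len, H],
    # answer_embeds is [B, answer_len, H], answer_ids is [B, answer_len].
    prompt_length = input_embeds.shape[1]
    labels_length = answer_embeds.shape[1]
    # Concatenate prompt and answer embeddings along the sequence axis.
    combined = torch.cat([input_embeds, answer_embeds], dim=1)
    # One forward pass over the whole sequence; with causal attention the
    # logits at position t can only attend to positions <= t.
    logits = model(inputs_embeds=combined).logits
    # The logit at position prompt_length - 1 predicts the first answer
    # token, position prompt_length predicts the second, and so on.
    pred = logits[:, prompt_length - 1 : prompt_length + labels_length - 1].argmax(dim=-1)
    # Token-level accuracy against the ground-truth answer ids.
    return (pred == answer_ids).float().mean()
```

If this is indeed the intent, the prediction for each answer token is still conditioned on the ground-truth answer tokens before it (teacher forcing), so I would like to confirm whether that is considered acceptable for this benchmark, or whether free-running generation should be used instead.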
