
Confusion about answer_embeds Usage in eval_forward for Inference #41


Description

@hshc123

Hello author,

I have a question about my understanding of the code. In the eval_forward function, the code concatenates answer_embeds with input_embeds and feeds the combined sequence into the LLM. Since answer_embeds appears to be the embedding of the ground-truth answer, why is the answer also given to the model during the inference phase? My understanding is that at inference time the model should only receive the video features and the question features, with no access to the answer. Doesn't feeding the answer embeddings directly into the model cause answer leakage and affect the evaluation results?

Here is the relevant code snippet:


```python
def eval_forward(accelerator, model, input_embeds, answer_embeds, pad_id, answer_ids, tokenizer):
    # first append answer_embeds to input_embeds
    prompt_length = input_embeds.shape[1]
    labels_length = answer_embeds.shape[1]
    input_embeds = torch.cat([input_embeds, answer_embeds], dim=1)
```
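For reference, here is a minimal, self-contained sketch of the teacher-forced evaluation pattern I believe this function implements. The HF-style `model(inputs_embeds=...)` call, the function name, and all shapes are my own assumptions for illustration, not the repo's actual code:

```python
import torch

def teacher_forced_eval(model, input_embeds, answer_embeds, answer_ids):
    # Hypothetical sketch: input_embeds is [B, prompt_len, H],
    # answer_embeds is [B, answer_len, H], answer_ids is [B, answer_len].
    prompt_length = input_embeds.shape[1]
    labels_length = answer_embeds.shape[1]
    # Concatenate prompt and answer embeddings along the sequence axis.
    combined = torch.cat([input_embeds, answer_embeds], dim=1)
    # One forward pass over the whole sequence; with causal attention the
    # logits at position t can only attend to positions <= t.
    logits = model(inputs_embeds=combined).logits
    # The logit at position prompt_length - 1 predicts the first answer
    # token, position prompt_length predicts the second, and so on.
    pred = logits[:, prompt_length - 1 : prompt_length + labels_length - 1].argmax(dim=-1)
    # Token-level accuracy against the ground-truth answer ids.
    return (pred == answer_ids).float().mean()
```

If this is indeed the intent, the prediction for each answer token is still conditioned on the ground-truth answer tokens before it (teacher forcing), so I would like to confirm whether that is considered acceptable for this benchmark, or whether free-running generation should be used instead.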
