Not working with Gemma2 9B IT #12

@pratik443

Description

Hello, thank you for the amazing work and code.
I'm trying to adapt the code to the Gemma 2 9B IT model. After changing the prompts to the required chat template and running the code, I get the following error:

attn_weights = attn_weights + causal_mask
RuntimeError: The size of tensor a (7538) must match the size of tensor b (46) at non-singleton dimension 3
It seems that past_key_values and the current inputs are causing this problem. Setting usePrompt to True makes generation work, but using only the cache with questions from the hotpot dataset (usePrompt set to False) doesn't work.

Does this imply that I need to make changes in the generate function, or is this some other issue?
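
To make the mismatch concrete, here is a minimal pure-Python sketch of the broadcasting check behind the error. It rests on an assumption the traceback does not confirm: that 7538 is the total number of key positions (cached tokens plus new tokens) and 46 is the number of new tokens only. `broadcast_mismatch` and `attention_shapes` are hypothetical helper names, not part of this repository or of transformers, and the head count is an illustrative value.

```python
def broadcast_mismatch(shape_a, shape_b):
    """Return the index of the first dimension, checked from the trailing
    dimension backwards, where the two shapes cannot be broadcast.
    Returns None if the shapes are broadcast-compatible."""
    ndim = max(len(shape_a), len(shape_b))
    # Right-align the shapes by padding the shorter one with 1s on the left.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    for dim in reversed(range(ndim)):
        # Two sizes are compatible if they are equal or one of them is 1.
        if a[dim] != b[dim] and a[dim] != 1 and b[dim] != 1:
            return dim
    return None


def attention_shapes(batch, heads, query_len, key_len, mask_len):
    """Hypothetical helper: shapes of the attention weights and of the
    causal mask that is added to them."""
    weights = (batch, heads, query_len, key_len)
    mask = (batch, 1, query_len, mask_len)
    return weights, mask


# ASSUMPTION: 46 new tokens attend over 7538 key positions in total,
# but the mask only covers the 46 new tokens.
weights, mask = attention_shapes(1, 16, 46, 7538, 46)
print(broadcast_mismatch(weights, mask))  # 3

# If the mask spanned all key positions, the addition would broadcast.
weights, mask = attention_shapes(1, 16, 46, 7538, 7538)
print(broadcast_mismatch(weights, mask))  # None
```

If this reading of the numbers is right, the mask used together with the cache would have to span the cached positions as well as the new tokens. Whether that is what goes wrong here depends on how the inputs and the cache are built, which this sketch does not verify.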
