Open
Description
Hello. Thank you for your wonderful code :)
I have a question about the freqs_cis term in the apply_rope function in modules/layers.py.
This function is used for attention, and if we look at model.py, we can see that the embeddings of txt_id and img_id are used as the freqs_cis term.
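To make the question concrete, here is a minimal sketch of how position ids generally feed a rotary position embedding (RoPE). It is not the repository's code: `rope_angles`, `rotate`, `dot`, and the base `theta=10000.0` are hypothetical names and values chosen for illustration, and the real `apply_rope` may pair channels differently or take ids with several axes.

```python
import math

def rope_angles(pos, dim, theta=10000.0):
    """Rotation angles for one position id; dim must be even (illustrative helper)."""
    return [pos / theta ** (2 * i / dim) for i in range(dim // 2)]

def rotate(vec, pos, theta=10000.0):
    """Rotate each consecutive pair (vec[2i], vec[2i+1]) by the angle for `pos`."""
    out = []
    for i, ang in enumerate(rope_angles(pos, len(vec), theta)):
        x, y = vec[2 * i], vec[2 * i + 1]
        c, s = math.cos(ang), math.sin(ang)
        out += [x * c - y * s, x * s + y * c]
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Toy query and key vectors (made-up values).
q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.9]

# The same offset between query and key at different absolute positions
# gives the same attention score.
score_a = dot(rotate(q, 5), rotate(k, 3))
score_b = dot(rotate(q, 12), rotate(k, 10))

# Without the rotation, the score carries no position information at all.
score_plain = dot(q, k)
```

In this sketch the position id is the only thing that tells attention where a token sits, so skipping the rotation leaves every token positionless. If the repository follows the same idea, that would be consistent with the poor results described below, but that is an assumption on my part.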
What are txt_id and img_id? Does training need any inputs besides the text/music pairs?
I commented out the apply_rope function and trained my model with just text/music pairs, but I didn't get good results.
It would be great if you could tell me what format this data is in.
Thank you