Some questions about decoder position embedding for masked tokens by chrisway613 · Pull Request #173 · lucidrains/vit-pytorch

chrisway613 · 2021-11-24T07:30:14Z

In the decoder position embedding matrix, the first dimension has size number of patches + 1, where the extra 1 is for ViT's cls_token. But when the positions of the masked tokens are embedded, their indices are not shifted by 1, so they can be confused with the position of ViT's cls_token. MAE does not use the cls_token, but this hurts extensibility if we want to use the cls_token later.
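To make the concern concrete, here is a minimal sketch of the indexing issue. It is not the repository's code: `num_patches`, `decoder_dim`, and `masked_indices` are hypothetical values chosen for illustration, and the table is assumed to be an `nn.Embedding` whose row 0 is reserved for the cls_token.

```python
import torch
from torch import nn

# Hypothetical sizes, chosen only for illustration.
num_patches, decoder_dim = 16, 8

# Assumed layout: one extra row, with row 0 reserved for the cls_token position.
decoder_pos_emb = nn.Embedding(num_patches + 1, decoder_dim)

# Patch indices of the masked tokens for a batch of 2 images,
# each value in the range [0, num_patches).
masked_indices = torch.tensor([[0, 3, 15], [1, 2, 7]])

# Unshifted lookup: patch 0 reads row 0, the row reserved for the cls_token.
unshifted = decoder_pos_emb(masked_indices)

# Shifted lookup: patch i reads row i + 1, so row 0 stays free for the cls_token.
shifted = decoder_pos_emb(masked_indices + 1)
```

With the shift, the largest patch index maps to the last row of the table, so all `num_patches + 1` rows are accounted for.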

lucidrains · 2021-11-24T16:17:07Z

@chrisway613 Hi Chris! While this is true, I think leaving untrained parameters in the wrapper class isn't elegant. You can always just concatenate a CLS token onto the decoder_pos_emb after you have finished training, something like:

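# new, randomly initialized positional row for the CLS slot
decoder_cls_token = nn.Parameter(torch.randn(1, decoder_dim))
# prepend it so that row 0 becomes the CLS position
pos_embs_with_cls_token = torch.cat((decoder_cls_token, self.decoder_pos_emb), dim = 0)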
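A self-contained sketch of that suggestion follows. It assumes the trained decoder position table is an `nn.Embedding` with one row per patch, so the concatenation goes through its `.weight` tensor. The sizes and the name `decoder_cls_pos` are hypothetical, and a freshly constructed embedding stands in for a trained one.

```python
import torch
from torch import nn

# Hypothetical sizes, chosen only for illustration.
num_patches, decoder_dim = 16, 8

# Stand-in for a trained, patch-only position table (assumed to be an nn.Embedding).
decoder_pos_emb = nn.Embedding(num_patches, decoder_dim)

# Fresh, untrained positional row for the CLS slot.
decoder_cls_pos = nn.Parameter(torch.randn(1, decoder_dim))

# Prepend the CLS row so that patch i now lives at row i + 1.
with torch.no_grad():
    weight = torch.cat((decoder_cls_pos, decoder_pos_emb.weight), dim=0)

# Wrap the extended table in a new, trainable embedding.
pos_embs_with_cls_token = nn.Embedding.from_pretrained(weight, freeze=False)
```

After this step, lookups for patches need their indices shifted by 1, since row 0 now holds the CLS position.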

some questions about decoder position embedding for masked tokens

6b7921f

lucidrains force-pushed the main branch 3 times, most recently from dbb7bd1 to b983bbe on December 21, 2021 18:23

lucidrains force-pushed the main branch from dfcfa20 to 2aae406 on March 23, 2022 17:42

lucidrains force-pushed the main branch 6 times, most recently from ddff7a7 to b3e90a2 on May 4, 2022 03:24

lucidrains force-pushed the main branch from ad1e6df to cb6d749 on October 29, 2022 18:35

lucidrains force-pushed the main branch from e051522 to 89e1996 on December 2, 2022 19:28

lucidrains force-pushed the main branch from 014df1e to df8733d on October 6, 2023 17:27

lucidrains force-pushed the main branch 3 times, most recently from 19eb6d4 to 5e808f4 on August 21, 2024 14:23

lucidrains force-pushed the main branch from 43cbcad to f50d7d1 on October 9, 2024 14:32

lucidrains force-pushed the main branch from 1de866d to db05a14 on March 5, 2025 18:50

lucidrains force-pushed the main branch from 0b273a2 to 3becf08 on September 25, 2025 13:21

lucidrains force-pushed the main branch 5 times, most recently from cbf6723 to 5cf8384 on October 28, 2025 19:17

lucidrains force-pushed the main branch from 7e703f2 to fb5014f on December 25, 2025 17:11

chrisway613 wants to merge 1 commit into lucidrains:main from chrisway613:chrisway
