Thanks for your work. Making full use of large-scale unlabeled data is highly valuable and deserves attention.
I'm curious about the following:
- Why not use strong perturbation augmentation when training the teacher model? As the paper mentions, the labeled images are already sufficient (1.5M). Perhaps a teacher trained this way would generalize as well as, or even better than, the semi-supervised model.
- Why not apply strong perturbation to the labeled data as well when training the student model?
@LiheYoung