An image captioning model that combines CLIP as the encoder and mT5 as the decoder
Updated Jun 24, 2025 - Python