This repository was archived by the owner on Jan 30, 2026. It is now read-only.
Update dataset.py to support multiple captions per image file when us…#267
Merged
victorchall merged 1 commit into victorchall:main on Mar 10, 2025
…ing json for data_root

Changes to support importing multiple captions for a single image file, without duplicating the image file, when using a json file as data_root.

Originally, at one stage in the process, the data loader used a dict with the image path as the key and a caption dict as the value. I changed this so that when loading from a json file, the image path is stored as just another entry in the dict and an int counter is used as the key in its place.

Using a json file as data_root was already supported in the original code, so this seemed the easiest path to supporting multiple captions for the same image file.

I tested with a json file as data_root and with a folder of images and txt captions, and both worked.
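To illustrate why the key change matters, here is a minimal sketch, not the actual dataset.py code. The function names and the `image` / `caption` field names are hypothetical, chosen only for this example. It shows that keying by image path silently collapses repeated paths, while keying by a counter keeps every caption.

```python
import json

def index_by_path(entries):
    # Path-keyed index (hypothetical): a repeated image path
    # overwrites the earlier entry, so only one caption survives.
    return {e["image"]: {"caption": e["caption"]} for e in entries}

def index_by_counter(entries):
    # Counter-keyed index (hypothetical): the image path becomes an
    # ordinary field, so the same path can appear under several keys.
    return {
        i: {"image": e["image"], "caption": e["caption"]}
        for i, e in enumerate(entries)
    }

# Assumed json layout for illustration only; the real data_root
# json schema may differ.
raw = json.loads("""
[
  {"image": "cat.jpg", "caption": "a cat on a sofa"},
  {"image": "cat.jpg", "caption": "a tabby cat sleeping"},
  {"image": "dog.jpg", "caption": "a dog in a park"}
]
""")

by_path = index_by_path(raw)
by_counter = index_by_counter(raw)

print(len(by_path))     # 2: the second cat.jpg caption replaced the first
print(len(by_counter))  # 3: both cat.jpg captions are kept
```

With the counter as key, the image file exists once on disk but can contribute one training item per caption.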
victorchall approved these changes on Mar 10, 2025