This repository was archived by the owner on Jan 30, 2026. It is now read-only.

Update dataset.py to support multiple captions per image file when us…#267

Merged
victorchall merged 1 commit into victorchall:main from scottshireman:patch-2
Mar 10, 2025

Conversation

@scottshireman
Contributor

…ing json for data_root

This change supports importing multiple captions for a single image file, without duplicating the image file, when a json file is used as data_root.

Originally, at one stage of loading, the data loader used a dict with the image path as the key and a caption dict as the value. Because dict keys are unique, each image path could hold only one caption dict. Now, when loading from a json file, the image path is stored as just another entry in the caption dict and an integer counter is used as the key in its place.
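
To illustrate the keying change, here is a minimal sketch rather than the actual code in dataset.py. The json record layout (`image` and `caption` fields), the `path` entry name, and the helper names `index_by_path` and `index_by_counter` are all assumptions made for this example.

```python
import json

# Hypothetical json data_root contents: the "image"/"caption" field names
# are assumed for illustration and may differ from the real file format.
RAW = """
[
  {"image": "img/cat.jpg", "caption": "a cat on a sofa"},
  {"image": "img/cat.jpg", "caption": "a tabby cat indoors"},
  {"image": "img/dog.jpg", "caption": "a dog in a park"}
]
"""


def index_by_path(records):
    """Old-style keying: the image path is the dict key.

    A second record for the same image overwrites the first,
    so only one caption per image survives.
    """
    items = {}
    for rec in records:
        items[rec["image"]] = {"caption": rec["caption"]}
    return items


def index_by_counter(records):
    """New-style keying: an integer counter is the dict key.

    The image path becomes an ordinary entry in the value dict,
    so several records can point at the same image file.
    """
    items = {}
    for counter, rec in enumerate(records):
        items[counter] = {"path": rec["image"], "caption": rec["caption"]}
    return items


records = json.loads(RAW)
by_path = index_by_path(records)
by_counter = index_by_counter(records)

print(len(by_path), "entries when keyed by path")
print(len(by_counter), "entries when keyed by counter")
```

With path keys, the two `img/cat.jpg` records collapse into one entry. With counter keys, both captions are kept and each still refers to the same image file on disk.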

Using a json file as data_root was already supported in the original code, so this seemed the easiest path to supporting multiple captions for the same image file.

I tested with a json file as data_root and with a folder of images and txt captions, and both worked.

@victorchall victorchall merged commit 059ec9e into victorchall:main Mar 10, 2025
1 of 2 checks passed
2 participants