Description
Is there an existing issue for this?
- I have searched the existing issues
Bug summary
When I use TorchIO within the __getitem__ method of a custom Dataset and return a tuple of (torch.Tensor, dict) instead of a torchio.Subject, I receive a warning suggesting that I use SubjectsLoader. However, replacing DataLoader with SubjectsLoader is not feasible in this case and leads to errors, because SubjectsLoader expects Subject instances. The warning seems unnecessary and cannot be easily suppressed.
Code for reproduction
import torch
import torchio as tio
from torch.utils.data import Dataset, DataLoader
from typing import Tuple, Dict
import numpy as np


class CustomDataset(Dataset):
    def __init__(self, data_list, transform=None):
        self.data_list = data_list
        self.transform = transform

    def __len__(self):
        return len(self.data_list)

    def __getitem__(self, idx) -> Tuple[torch.Tensor, Dict]:
        # Simulate loading image and mask arrays
        image_array = np.random.rand(1, 128, 128, 128).astype(np.float32)
        mask_array = np.random.randint(0, 2, (1, 128, 128, 128)).astype(np.int16)
        metadata = {'label': 0}
        # Create TorchIO Images
        image = tio.ScalarImage(tensor=image_array)
        mask = tio.LabelMap(tensor=mask_array)
        # Create a Subject
        subject = tio.Subject(image=image, mask=mask)
        # Apply transforms if any
        if self.transform:
            subject = self.transform(subject)
        # Extract the image tensor (further processing, e.g., masking, would go here)
        processed_image = subject['image'].data.float().contiguous()
        # Return tensor and metadata dictionary
        return processed_image, metadata


# Define any transforms (optional)
transform = tio.Compose([
    tio.RandomAffine(),
    tio.RandomFlip(),
])

# Create the dataset
dataset = CustomDataset(data_list=[0, 1, 2], transform=transform)
# A standard DataLoader works but triggers the warning:
# loader = DataLoader(dataset, batch_size=2)
# Attempting to replace `DataLoader` with `SubjectsLoader` as suggested:
from torchio import SubjectsLoader
loader = SubjectsLoader(dataset, batch_size=2)

# Iterate through the loader
for batch in loader:
    images, labels = batch
    print(type(images), type(labels))
    break
Actual outcome
Running the provided code with a standard DataLoader results in a warning message from TorchIO, even though the data returned by the custom Dataset is a tuple of (torch.Tensor, dict) and does not involve torchio.Subject objects in the final output. The warning is unnecessary because the actual batch structure is fully compatible with PyTorch's DataLoader.
When attempting to follow the warning's suggestion to replace DataLoader with SubjectsLoader, the program fails with an AttributeError because SubjectsLoader expects each dataset item to be a torchio.Subject, but the Dataset returns a tuple instead. This makes SubjectsLoader unusable for this scenario.
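To illustrate why a tuple item cannot pass through a collate function written for Subject instances, here is a minimal pure-Python sketch. The function `collate_by_keys` is a hypothetical, simplified stand-in and not TorchIO's actual implementation; it only assumes that the collate step iterates over the keys of the first item, which works for dict-like objects such as torchio.Subject but not for tuples.

```python
def collate_by_keys(batch):
    """Hypothetical, simplified stand-in for a Subject-style collate function."""
    first_item = batch[0]
    collated = {}
    # Assumes every item is dict-like; a tuple has no .keys() method.
    for key in first_item.keys():
        collated[key] = [item[key] for item in batch]
    return collated


# Dict-like items (as a Subject would be) collate without problems.
dict_batch = [{"image": "img0", "label": 0}, {"image": "img1", "label": 1}]
collated = collate_by_keys(dict_batch)

# Tuple items, shaped like the output of CustomDataset.__getitem__, do not.
tuple_batch = [("img0", {"label": 0}), ("img1", {"label": 0})]
try:
    collate_by_keys(tuple_batch)
    error_message = None
except AttributeError as error:
    error_message = str(error)
```

The strings stand in for tensors so that the sketch runs without PyTorch or TorchIO installed.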
Error messages
Traceback (most recent call last):
  File "example.py", line XX, in <module>
    batch = next(iter(loader))
  File "path_to_python/lib/python3.X/site-packages/torch/utils/data/dataloader.py", line XXX, in __next__
    data = self._next_data()
  File "path_to_python/lib/python3.X/site-packages/torch/utils/data/dataloader.py", line XXX, in _next_data
    data = self._dataset_fetcher.fetch(index)
  File "path_to_python/lib/python3.X/site-packages/torch/utils/data/_utils/fetch.py", line XX, in fetch
    return self.collate_fn(data)
  File "path_to_python/lib/python3.X/site-packages/torchio/data/loader.py", line XX, in _collate
    for key in first_subject.keys():
AttributeError: 'tuple' object has no attribute 'keys'
Expected outcome
When TorchIO is used inside the __getitem__ method but the dataset returns standard PyTorch data structures (e.g., torch.Tensor and dict), the warning about using SubjectsLoader should not be displayed. Alternatively, there should be a way to suppress this warning when it is not applicable.
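As a possible stopgap, and only under the assumption that the warning is emitted through Python's standard warnings module, a message filter might hide it. The sketch below is not TorchIO-specific: `ASSUMED_PATTERN` is a guess that the message mentions "SubjectsLoader", and `emit_stand_in_warning` is a hypothetical stand-in for the library call, so the pattern should be adjusted to the text actually printed.

```python
import warnings

# Assumption: the warning text mentions "SubjectsLoader". Adjust this pattern
# to match the message actually printed by your TorchIO version.
ASSUMED_PATTERN = r".*SubjectsLoader.*"


def silence_loader_warning() -> None:
    """Install a filter that ignores warnings whose message matches the pattern."""
    warnings.filterwarnings("ignore", message=ASSUMED_PATTERN)


def emit_stand_in_warning() -> None:
    # Hypothetical stand-in for the library warning, used only in this demo.
    warnings.warn("Replace your DataLoader with a SubjectsLoader", RuntimeWarning)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")      # record every warning by default
    silence_loader_warning()             # added in front, so it takes precedence
    emit_stand_in_warning()              # matches the pattern and is ignored
    warnings.warn("unrelated warning")   # does not match and is still recorded

remaining = [str(w.message) for w in caught]
```

In a real script, `silence_loader_warning()` would be called once before iterating over the loader. If the DataLoader uses worker processes, the filter may also need to be installed inside each worker, for example through the `worker_init_fn` argument.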
System info
Platform: Linux-5.15.0-107-generic-x86_64-with-glibc2.27
TorchIO: 0.20.2
PyTorch: 2.3.1
SimpleITK: 2.2.1 (ITK 5.3)
NumPy: 1.26.4
Python: 3.11.6 | packaged by conda-forge | (main, Oct 3 2023, 10:40:35) [GCC 12.3.0]