Unnecessary warning when using TorchIO inside Dataset __getitem__ without returning Subject objects #1247

@acqxi

Is there an existing issue for this?

  • I have searched the existing issues

Bug summary

When I use TorchIO within the __getitem__ method of a custom Dataset and return a tuple of (torch.Tensor, dict) instead of a torchio.Subject, I receive a warning suggesting that I use SubjectsLoader. However, replacing DataLoader with SubjectsLoader is not feasible in this case: it raises an error, because SubjectsLoader expects Subject instances. The warning seems unnecessary here and cannot be easily suppressed.

Code for reproduction

import torch
import torchio as tio
from torch.utils.data import Dataset, DataLoader
from typing import Tuple, Dict
import numpy as np

class CustomDataset(Dataset):
    def __init__(self, data_list, transform=None):
        self.data_list = data_list
        self.transform = transform

    def __len__(self):
        return len(self.data_list)

    def __getitem__(self, idx) -> Tuple[torch.Tensor, Dict]:
        # Simulate loading image and mask arrays
        image_array = np.random.rand(1, 128, 128, 128).astype(np.float32)
        mask_array = np.random.randint(0, 2, (1, 128, 128, 128)).astype(np.int16)
        metadata = {'label': 0}

        # Create TorchIO Images
        image = tio.ScalarImage(tensor=image_array)
        mask = tio.LabelMap(tensor=mask_array)

        # Create a Subject
        subject = tio.Subject(image=image, mask=mask)

        # Apply transforms if any
        if self.transform:
            subject = self.transform(subject)

        # Extract the transformed image tensor (further processing, e.g. masking, would go here)
        processed_image = subject['image'].data.float().contiguous()

        # Return tensor and metadata dictionary
        return processed_image, metadata

# Define any transforms (optional)
transform = tio.Compose([
    tio.RandomAffine(),
    tio.RandomFlip(),
])

# Create the dataset
dataset = CustomDataset(data_list=[0, 1, 2], transform=transform)

# Standard DataLoader: iterates fine, but TorchIO emits the warning
# loader = DataLoader(dataset, batch_size=2)

# Attempting to replace `DataLoader` with `SubjectsLoader` as the warning suggests:
from torchio import SubjectsLoader
loader = SubjectsLoader(dataset, batch_size=2)

# Iterate through the DataLoader
for batch in loader:
    images, labels = batch
    print(type(images), type(labels))
    break

Actual outcome

Running the provided code with a standard DataLoader results in a warning message from TorchIO, even though the data returned by the custom Dataset is a tuple of (torch.Tensor, dict) and does not involve torchio.Subject objects in the final output. The warning is unnecessary because the actual batch structure is fully compatible with PyTorch's DataLoader.

When attempting to follow the warning's suggestion to replace DataLoader with SubjectsLoader, the program fails with an AttributeError because SubjectsLoader expects each dataset item to be a torchio.Subject, but the Dataset returns a tuple instead. This makes SubjectsLoader unusable for this scenario.

Error messages

Traceback (most recent call last):
  File "example.py", line XX, in <module>
    batch = next(iter(loader))
  File "path_to_python/lib/python3.X/site-packages/torch/utils/data/dataloader.py", line XXX, in __next__
    data = self._next_data()
  File "path_to_python/lib/python3.X/site-packages/torch/utils/data/dataloader.py", line XXX, in _next_data
    data = self._dataset_fetcher.fetch(index)
  File "path_to_python/lib/python3.X/site-packages/torch/utils/data/_utils/fetch.py", line XX, in fetch
    return self.collate_fn(data)
  File "path_to_python/lib/python3.X/site-packages/torchio/data/loader.py", line XX, in _collate
    for key in first_subject.keys():
AttributeError: 'tuple' object has no attribute 'keys'
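
Judging only from the last frame of the traceback, the collate step appears to call .keys() on the first item of the batch. The sketch below reproduces that failure mode in plain Python. Note that collate_by_keys is a hypothetical stand-in written for illustration, not TorchIO's actual implementation, and the sample batches are made up.

```python
def collate_by_keys(batch):
    # Hypothetical simplification of a dict-style collate step:
    # group values across the batch by the keys of the first item.
    first_item = batch[0]
    return {key: [item[key] for item in batch] for key in first_item.keys()}


# Dict-like items (as a Subject would be) collate without problems.
dict_batch = [{"image": 1}, {"image": 2}]
collated = collate_by_keys(dict_batch)

# Tuple items, like the (tensor, metadata) pairs returned by CustomDataset,
# fail because a tuple has no .keys() method.
tuple_batch = [("tensor", {"label": 0}), ("tensor", {"label": 0})]
try:
    collate_by_keys(tuple_batch)
    error_message = None
except AttributeError as error:
    error_message = str(error)
```

If this reading is right, any dataset whose items are not dict-like would hit the same error, regardless of what happens inside __getitem__.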

Expected outcome

When TorchIO is used inside the __getitem__ method but the dataset returns standard PyTorch data structures (e.g., torch.Tensor and dict), the warning about using SubjectsLoader should not be displayed. Alternatively, there should be a way to suppress this warning when it is not applicable.
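
As a possible stopgap, Python's standard warnings module can filter a warning by message pattern around the code that triggers it. The sketch below is not verified against TorchIO: load_item is a hypothetical stand-in that emits a made-up warning, and the assumption that the real warning text mentions "SubjectsLoader" comes only from the description above.

```python
import warnings


def load_item():
    # Hypothetical stand-in for the TorchIO calls inside __getitem__;
    # the warning text here is invented for illustration.
    warnings.warn("Consider replacing DataLoader with SubjectsLoader")
    return "tensor", {"label": 0}


def load_item_quietly():
    # Temporarily ignore any warning whose message mentions SubjectsLoader.
    # The previous filters are restored when the `with` block exits.
    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", message=".*SubjectsLoader.*")
        return load_item()


item = load_item_quietly()
```

In a real Dataset, the `with` block would wrap the body of __getitem__, so the filter would also apply inside worker processes. Note that warnings.catch_warnings is documented as not thread-safe.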

System info

Platform:   Linux-5.15.0-107-generic-x86_64-with-glibc2.27
TorchIO:    0.20.2
PyTorch:    2.3.1
SimpleITK:  2.2.1 (ITK 5.3)
NumPy:      1.26.4
Python:     3.11.6 | packaged by conda-forge | (main, Oct  3 2023, 10:40:35) [GCC 12.3.0]

Labels: bug (Something isn't working)
