@AndreaBozzo

Fixes #3129

Summary

The type hint for to_tensor_fn in LanceDataset.__init__ incorrectly declared a single-argument callable, but the function is called with additional keyword arguments.

Changes

  • Updated type hint from Callable[[pa.RecordBatch], ...] to Callable[..., ...] to accurately reflect that the function accepts arbitrary arguments
  • Enhanced docstring to document the expected signature with optional hf_converter and use_blob_api keyword arguments
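
The mismatch described above can be reproduced in isolation. This is a minimal sketch, not code from the Lance repository: `call_like_dataset` is a hypothetical stand-in for the caller, and the only detail taken from the PR is that `hf_converter` and `use_blob_api` are passed as keyword arguments.

```python
def single_arg_fn(batch):
    # Matches the old hint: one positional argument only.
    return batch


def kwargs_fn(batch, *, hf_converter=None, use_blob_api=False):
    # Accepts the extra keyword arguments that the caller passes.
    return batch


def call_like_dataset(fn, batch):
    # Hypothetical stand-in for the call site: passes the batch plus
    # the two keyword arguments named in the PR description.
    return fn(batch, hf_converter=None, use_blob_api=False)


try:
    call_like_dataset(single_arg_fn, [1, 2])
    single_arg_failed = False
except TypeError:
    # A function written to the old single-argument hint rejects the kwargs.
    single_arg_failed = True

kwargs_result = call_like_dataset(kwargs_fn, [1, 2])
```

A function that follows the old hint literally fails at call time, which is why the hint needed to change.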

Testing

  • Verified Python syntax compilation passes
  • No functional changes, only type hint correction

Fixes lance-format#3129

The type hint for to_tensor_fn in LanceDataset.__init__ was
declaring a single-argument callable, but the function is actually
called with additional keyword arguments (hf_converter and
use_blob_api).

Changes:
- Update type hint to use Callable[..., ...] to allow arbitrary args
- Enhance docstring to document the expected function signature
@github-actions bot added the bug and python labels on Dec 24, 2025
Contributor

@wjones127 left a comment

Can we fix the type hint by making it correct and specific?

Comment on lines 196 to 198
- Callable[[pa.RecordBatch], Union[dict[str, torch.Tensor], torch.Tensor]]
+ Callable[
+     ..., Union[dict[str, torch.Tensor], torch.Tensor]
+ ]
Contributor

It seems wrong to just make this more generic. How about this?

Callable[[pa.RecordBatch, ...], Union[dict[str, torch.Tensor], torch.Tensor]]

Alternatively, can be fancier and do:

from typing import Protocol

class ToTensorFn(Protocol):
    def __call__(self, batch: pa.RecordBatch, hf_converter, use_blob_api: bool) -> Union[dict[str, torch.Tensor], torch.Tensor]: ...

...
to_tensor_fn: ToTensorFn
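
A runnable sketch of the Protocol approach is shown below. It rests on several assumptions: `FakeBatch` and the `Tensor` alias are placeholders for `pa.RecordBatch` and `torch.Tensor` so that the example runs without either library, `load` is a hypothetical caller, and the keyword-only form and default values of `hf_converter` and `use_blob_api` are guesses based on the PR description rather than the actual Lance signature.

```python
from typing import Any, Optional, Protocol, Union


class FakeBatch:
    # Placeholder for pa.RecordBatch (assumption, for illustration only).
    def __init__(self, columns: dict[str, list[float]]):
        self.columns = columns


Tensor = list  # Placeholder for torch.Tensor (assumption).


class ToTensorFn(Protocol):
    # Describes a callable taking a batch plus two keyword arguments.
    def __call__(
        self,
        batch: FakeBatch,
        *,
        hf_converter: Optional[Any] = None,
        use_blob_api: bool = False,
    ) -> Union[dict[str, Tensor], Tensor]: ...


def my_to_tensor(batch, *, hf_converter=None, use_blob_api=False):
    # Converts each column into a plain list standing in for a tensor.
    return {name: list(values) for name, values in batch.columns.items()}


def load(batch: FakeBatch, to_tensor_fn: ToTensorFn):
    # Hypothetical caller that passes the keyword arguments explicitly.
    return to_tensor_fn(batch, hf_converter=None, use_blob_api=False)


result = load(FakeBatch({"x": [1.0, 2.0]}), my_to_tensor)
```

Unlike a plain `Callable`, a Protocol with `__call__` can name the keyword arguments, so a static type checker can verify them at the call site.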

Author

Thanks for pointing this out. I agree that the hint was too generic.
I changed it using the 'less fancy' approach and ran lint in the last commit.

@wjones127 self-assigned this on Jan 1, 2026

Labels

bug, python

Development

Successfully merging this pull request may close these issues.

lance.torch.data.LanceDataset(to_tensor_fn=...) typehint inconsistent with usage.

2 participants