A specialized PyTorch dataset for loading human hand manipulation trajectories from Lance columnar format, designed with two distinct data loading modes to serve different learning paradigms.
This dataset provides two fundamentally different loading mechanisms because supervised learning and reinforcement learning have incompatible data access patterns:
Traditional supervised learning processes data in synchronized batches where all sequences advance together. This mode is ideal for tasks like trajectory prediction, behavior cloning, or motion analysis where you need the complete temporal context.
Reinforcement learning with parallel environments requires:
- Stateful tracking: Each environment maintains its own position within a trajectory
- Asynchronous resets: When environment #5 completes its trajectory, only that specific trajectory should be replaced—not the entire batch
- Persistent GPU memory: All trajectories remain in GPU memory for efficient indexed access
The custom RL buffer addresses these requirements that standard DataLoaders cannot handle.
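The requirements above can be illustrated with a toy buffer in plain Python. This is only a minimal sketch and not the implementation of RLTrajectoryBuffer: the class name `ToyAsyncBuffer`, its fields, and the random replacement policy are assumptions made for illustration, and Python lists stand in for tensors held in GPU memory.

```python
import random


class ToyAsyncBuffer:
    """Toy per-environment trajectory cursor (hypothetical, for illustration only)."""

    def __init__(self, trajectories, num_envs, seed=0):
        self.trajectories = trajectories  # list of frame lists, kept resident
        self.rng = random.Random(seed)
        # Stateful tracking: each environment has its own trajectory id and timestep.
        self.traj_ids = [self.rng.randrange(len(trajectories)) for _ in range(num_envs)]
        self.timesteps = [0] * num_envs

    def step(self):
        """Return the current frame and done flag per environment, then advance."""
        obs, dones = [], []
        for env in range(len(self.timesteps)):
            traj = self.trajectories[self.traj_ids[env]]
            obs.append(traj[self.timesteps[env]])
            self.timesteps[env] += 1
            done = self.timesteps[env] >= len(traj)
            dones.append(done)
            if done:
                # Asynchronous reset: only this environment gets a new trajectory.
                self.traj_ids[env] = self.rng.randrange(len(self.trajectories))
                self.timesteps[env] = 0
        return obs, dones


# Three trajectories of different lengths; frames are labelled "<traj>-<t>".
trajectories = [[f"{i}-{t}" for t in range(n)] for i, n in enumerate([2, 3, 5])]
buffer = ToyAsyncBuffer(trajectories, num_envs=4)
for _ in range(6):
    obs, dones = buffer.step()
```

Because every environment carries its own cursor, a finished trajectory is swapped out without disturbing the timesteps of the other environments.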
pip install -e .
Requirements:
- Python 3.8+ (tested on Python 3.8)
- PyTorch 1.10+
- NumPy <1.24 (for chumpy compatibility)
from hand_trajectory_loader import HandTrajectoryDataset, create_dataloader
# Download dataset first (see Downloading Dataset section below)
# Or set MANO_DATASET_PATH=/path/to/dataset.lance
# Load dataset
dataset = HandTrajectoryDataset()
# Create DataLoader
dataloader = create_dataloader(dataset, batch_size=32, shuffle=True)
# Training loop
for batch in dataloader:
    mano_rotations = batch['mano_rotations'] # [B, T, 15, 3]
    object_pose = batch['object_pose'] # [B, T, 6]
    attention_mask = batch['attention_mask'] # [B, T]
    # Your model training here
    predictions = model(mano_rotations, object_pose)
    loss = criterion(predictions, targets)
Download the DexCanvas dataset from HuggingFace Hub:
# Install download dependencies
pip install huggingface_hub tqdm
# Download dataset (public)
python scripts/download_dataset.py --repo-id DEXROBOT/DexCanvas
# Download to specific directory
python scripts/download_dataset.py \
--repo-id DEXROBOT/DexCanvas \
--output-dir ./data
# For private datasets, authenticate first
huggingface-cli login
# Or provide token directly
python scripts/download_dataset.py \
--repo-id DEXROBOT/DexCanvas \
--token YOUR_HF_TOKEN
The download script will:
- Download the dataset to ~/.dexcanvas/datasets/ (or your specified directory)
- Verify the downloaded file integrity
- Provide instructions to set MANO_DATASET_PATH
The dataset path can be configured in two ways:
- Environment Variable (Recommended): Set the MANO_DATASET_PATH environment variable:
  export MANO_DATASET_PATH=/path/to/trajectories.lance
- Direct Parameter: Pass the path directly when creating the dataset:
  dataset = HandTrajectoryDataset(lance_path="/path/to/trajectories.lance")
If both are provided, the direct parameter takes precedence.
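The precedence rule can be sketched as a small resolver. The helper name `resolve_dataset_path` and the error raised when neither source is set are assumptions for illustration; the package may resolve the path differently internally.

```python
import os


def resolve_dataset_path(lance_path=None):
    """Hypothetical helper: explicit argument first, then MANO_DATASET_PATH."""
    if lance_path is not None:
        return lance_path
    env_path = os.environ.get("MANO_DATASET_PATH")
    if env_path:
        return env_path
    # Assumed behavior when nothing is configured.
    raise ValueError("Pass lance_path or set MANO_DATASET_PATH")


os.environ["MANO_DATASET_PATH"] = "/data/from_env.lance"
from_env = resolve_dataset_path()                          # falls back to the variable
explicit = resolve_dataset_path("/data/explicit.lance")    # argument wins
```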
Ideal for tasks where you need complete trajectory sequences with temporal context:
from hand_trajectory_loader import HandTrajectoryDataset, create_dataloader
# Create dataset (will use MANO_DATASET_PATH env var if lance_path not provided)
dataset = HandTrajectoryDataset(
lance_path="/path/to/trajectories.lance", # Optional if MANO_DATASET_PATH is set
operators=["s01", "s02"], # Filter by human operator
objects=["cube2", "cylinder"], # Filter by manipulated object
manipulation_types=["01", "02"], # Filter by manipulation strategy
min_rating=3.0, # Filter by trajectory quality rating
min_frames=50, # Filter by minimum number of frames
max_sequence_length=1024, # Sequences padded/truncated to this length
load_active_only=True, # Load only frames with active manipulation
device="cpu" # Keep on CPU for DataLoader
)
# Standard PyTorch DataLoader
dataloader = create_dataloader(
dataset,
batch_size=32,
num_workers=4,
shuffle=True
)
# Training loop (see examples/object_prediction.py)
for batch in dataloader:
    mano_rotations = batch['mano_rotations'] # [B, T, 15, 3] Hand joint rotations
    object_pose = batch['object_pose'] # [B, T, 6] Object 6D pose
    attention_mask = batch['attention_mask'] # [B, T] Marks valid vs padded frames
    fps = batch['fps'] # [B] Frame rate for each trajectory
    # Forward pass through your model
    predictions = model(mano_rotations, object_pose)
    # Loss computation with attention mask to ignore padded frames.
    # criterion must return unreduced per-element losses (e.g. reduction='none')
    # so that the mask can be applied before averaging.
    loss = criterion(predictions, targets)
    loss = loss * attention_mask.unsqueeze(-1)
    loss = loss.sum() / attention_mask.sum()
Essential for reinforcement learning where each environment progresses independently:
from hand_trajectory_loader.rl_buffer import RLTrajectoryBuffer
# Initialize buffer with persistent GPU storage
buffer = RLTrajectoryBuffer(
dataset=dataset,
num_envs=1024, # Number of parallel environments
max_trajectory_length=300, # Shorter sequences for RL
device="cuda"
)
# RL training loop (see examples/rl_training.py)
for step in range(num_steps):
    # Each environment at its own timestep
    obs, dones = buffer.step(auto_reset=True)
    # obs contains current frame for each environment
    mano_rotations = obs['mano_rotations'] # [1024, 15, 3]
    object_pose = obs['object_pose'] # [1024, 6]
    # Your policy network
    actions = policy(mano_rotations.flatten(1))
    # Physics simulation would go here
    # ...
    # Buffer automatically replaces only completed trajectories
    # Environment 5 might reset while others continue
Standard DataLoaders reset all sequences simultaneously, which is incompatible with RL's asynchronous nature. The custom buffer maintains independent trajectory positions and only replaces completed trajectories, enabling true parallel environment simulation.
The dataset uses Lance columnar format for efficient storage and retrieval:
- Fast metadata reading: ~2ms to read all trajectory metadata (vs ~100ms for directory scanning)
- Selective column loading: 23x faster loading by excluding mesh_vertices (219ms → 10ms per trajectory)
- Efficient filtering: Query by rating, FPS, frames, or any metadata field
- Compact storage: Columnar compression reduces storage by ~40%
- Row-based indexing: Direct integer indexing into dataset rows (no string operations)
The Lance format enables powerful metadata-based filtering:
# Filter by quality rating
dataset = HandTrajectoryDataset(
lance_path="trajectories.lance",
min_rating=3.5, # Only trajectories rated 3.5 or higher
)
# Get dataset statistics
stats = dataset.get_statistics()
print(f"Total trajectories: {stats['total_trajectories']}")
print(f"Average rating: {stats['avg_rating']:.2f}")
print(f"Average frames: {stats['avg_frames']:.1f}")
# Find trajectories by specific criteria
high_quality_indices = dataset.filter_by_rating(min_rating=4.0)
target_fps_indices = dataset.filter_by_fps(target_fps=100.0)
Trajectories are organized by:
- Operator: Human subject performing the manipulation (s01, s02, etc.)
- Object: Item being manipulated (cube2, cylinder, etc.)
- Manipulation Type: Different strategies for manipulating the same object (01, 02, 03)
- Rating: Quality score for the trajectory
- FPS: Frame rate (typically 100 Hz)
These attributes let you control data diversity—train on specific operator-object combinations or generalize across all variations.
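What the attribute filters mean can be sketched on a handful of in-memory records. The `TrajectoryMeta` record and the `select` function are hypothetical stand-ins, not part of the package; the real dataset applies its filters to Lance metadata.

```python
from dataclasses import dataclass


@dataclass
class TrajectoryMeta:
    """Hypothetical metadata record with the attributes listed above."""
    operator: str
    obj: str
    manipulation_type: str
    rating: float
    fps: float


def select(records, operators=None, objects=None, manipulation_types=None, min_rating=None):
    """Return indices of records that pass every filter that is not None."""
    keep = []
    for index, meta in enumerate(records):
        if operators is not None and meta.operator not in operators:
            continue
        if objects is not None and meta.obj not in objects:
            continue
        if manipulation_types is not None and meta.manipulation_type not in manipulation_types:
            continue
        if min_rating is not None and meta.rating < min_rating:
            continue
        keep.append(index)
    return keep


records = [
    TrajectoryMeta("s01", "cube2", "01", 4.5, 100.0),
    TrajectoryMeta("s01", "cylinder", "02", 2.0, 100.0),
    TrajectoryMeta("s02", "cube2", "03", 3.5, 100.0),
]
narrow = select(records, operators=["s01"], min_rating=3.0)  # one operator, good ratings
broad = select(records, objects=["cube2"])                   # one object, all operators
```

Leaving a filter as `None` keeps every value of that attribute, which is how a run generalizes across operators or objects.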
Real manipulation trajectories vary in length. The dataset provides two mechanisms:
Sequences are padded to max_sequence_length with attention masks distinguishing real data from padding:
# attention_mask[t] = 1.0 for real data, 0.0 for padding
# Properly handle in loss computation
loss = criterion(predictions, targets)
loss = loss * attention_mask.unsqueeze(-1) # Zero out padded positions
loss = loss.sum() / attention_mask.sum() # Average only over real frames
Most trajectories contain idle frames before/after actual manipulation. Enable load_active_only=True to load only meaningful frames:
# Focus on manipulation, skip idle frames
dataset = HandTrajectoryDataset(
lance_path="trajectories.lance",
load_active_only=True, # Load only active manipulation frames
max_sequence_length=512 # Can use smaller buffers
)
# Reduces memory usage by 50-75% and accelerates training
The Lance dataset contains trajectories in columnar format with the following schema:
- trajectory_meta_data: Metadata dictionary containing:
  - mocap_raw_data_source: Source information (operator, object, gesture, sequence)
  - data_fps: Frame rate in Hz
  - total_frames: Number of frames
  - rating: Quality rating
  - object_move_start_frame: Active manipulation start
  - object_move_end_frame: Active manipulation end
  - mano_hand_shape: Static hand shape parameters [10]
- sequence_info.timestamp: Frame timestamps
- sequence_info.hand_joint.position: Hand position [T, 3]
- sequence_info.hand_joint.rotation: Hand rotation [T, 3]
- sequence_info.hand_joint.finger_pose: Finger pose parameters [T, 45]
- sequence_info.object_info.position: Object position [T, 3]
- sequence_info.object_info.euler_angle: Object rotation [T, 3]
- sequence_info.mano_model_output.joints: Joint positions [T, 21, 3] (optional)
- sequence_info.mano_model_output.mesh_vertices: Mesh vertices [T, 778, 3] (not loaded by default)
The dataset provides consistently formatted tensors:
- mano_rotations: [T, 15, 3] - 15 hand joints × 3 axis-angle parameters
- mano_shape: [T, 10] - Hand shape parameters
- mano_translation: [T, 3] - Hand position
- mano_global_rotation: [T, 3] - Hand orientation
- object_pose: [T, 6] - Object 6D pose
- attention_mask: [T] - Valid frame indicator
- fps: Frame rate in Hz (typically 100 Hz)
- time_offset: Time offset in seconds when using active loading
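How a fixed-length output and its attention_mask relate can be shown with plain lists. This is a sketch under assumptions: the helpers `pad_with_mask` and `masked_mean` are hypothetical, zero is assumed as the padding value, and scalar frames stand in for the per-frame tensors.

```python
def pad_with_mask(frames, max_sequence_length, pad_value=0.0):
    """Truncate or pad a list of per-frame values; return (padded, attention_mask)."""
    kept = frames[:max_sequence_length]
    num_pad = max_sequence_length - len(kept)
    padded = kept + [pad_value] * num_pad
    mask = [1.0] * len(kept) + [0.0] * num_pad
    return padded, mask


def masked_mean(per_frame_loss, mask):
    """Average per-frame losses over real frames only."""
    valid = sum(mask)
    if valid == 0:
        return 0.0
    return sum(loss * m for loss, m in zip(per_frame_loss, mask)) / valid


targets, mask = pad_with_mask([1.0, 2.0, 3.0], max_sequence_length=5)
predictions = [1.5, 2.0, 2.0, 9.0, 9.0]  # the last two values fall on padding
per_frame = [(p - t) ** 2 for p, t in zip(predictions, targets)]
masked = masked_mean(per_frame, mask)          # 1.25 / 3, padding ignored
unmasked = sum(per_frame) / len(per_frame)     # dominated by the padded positions
```

Comparing `masked` with `unmasked` shows why the mask matters: without it, errors on padded positions dominate the average.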
The dataset automatically:
- Parses timestamps from Lance dataset
- Calculates FPS from metadata (~100 Hz)
- Converts timestamps to relative seconds from trajectory start
- Provides time offsets when using load_active_only=True
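The timestamp steps listed above can be sketched as follows. The helper names are hypothetical, and several details are assumptions rather than documented behavior: timestamps are taken to be in seconds, the active range is treated as inclusive of its end frame, and the offset is taken to be the start of the active window measured from the trajectory start.

```python
def to_relative_seconds(timestamps):
    """Shift timestamps so the trajectory starts at 0.0 seconds."""
    start = timestamps[0]
    return [t - start for t in timestamps]


def active_slice(relative_times, start_frame, end_frame):
    """Keep frames start_frame..end_frame (inclusive here, by assumption).

    Returns the kept times re-based to the first active frame, plus the
    offset of that frame from the trajectory start.
    """
    time_offset = relative_times[start_frame]
    active = [t - time_offset for t in relative_times[start_frame:end_frame + 1]]
    return active, time_offset


fps = 100.0
timestamps = [12.5 + i / fps for i in range(10)]  # made-up absolute timestamps
relative = to_relative_seconds(timestamps)
active, time_offset = active_slice(relative, start_frame=3, end_frame=6)

# Adding the offset back recovers the position within the full trajectory.
recovered_frame = round((active[1] + time_offset) * fps)
```

Here `recovered_frame` is 4, the second active frame's index in the full trajectory, so the offset is what lets an active-only sample be located in the original recording.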
Users can convert between frame indices and time using:
time_in_seconds = frame_index / fps
frame_index = round(time_in_seconds * fps) # round() rather than int(): int(0.29 * 100) == 28
The LanceAdapter uses selective column reading for optimal performance:
# Columns loaded by default (10ms per trajectory)
columns = [
'trajectory_meta_data',
'sequence_info.timestamp',
'sequence_info.hand_joint.position',
'sequence_info.hand_joint.rotation',
'sequence_info.hand_joint.finger_pose',
'sequence_info.object_info.position',
'sequence_info.object_info.euler_angle',
'sequence_info.mano_model_output.joints',
]
# mesh_vertices is excluded by default (would add 200ms per trajectory)
# Only load if you specifically need mesh data
This selective reading provides a 23x speedup compared to loading all columns.
Visualize hand trajectories with 3D rendering using Open3D:
# Clone with submodules to get MANO models and object meshes
git clone --recursive [email protected]:ai/hand_trajectory_loader.git
# Or if already cloned, initialize submodules
git submodule update --init --recursive
# Install visualization dependencies (or use pip install -e . to install all deps)
pip install open3d trimesh scipy chumpy
# Visualize a trajectory (use your downloaded dataset path)
python examples/visualize_trajectory.py ~/.dexcanvas/datasets/mocap_ver0.1.parquet 0 \
--mano-model assets/mano/models/MANO_RIGHT.pkl \
--object assets/objects/cube2.stl \
--show-joints \
--max-sequence-length 2048
- SPACE - Pause/Resume animation
- M - Toggle hand mesh visibility
- O - Toggle object mesh visibility
- Q - Quit (or close window)
The assets submodule contains:
- MANO models: assets/mano/models/MANO_LEFT.pkl, MANO_RIGHT.pkl
- Object meshes: assets/objects/*.stl (cube, cylinder, sphere, etc.)
See complete working examples:
- examples/visualize_trajectory.py - 3D visualization of hand trajectories
- examples/object_prediction.py - Supervised learning for trajectory prediction
- examples/rl_training.py - Parallel RL environment training
Current Status (Python 3.8):
- ✅ RLTrajectoryBuffer: All 13 tests passing
- ⚠️ Dataset tests: Legacy NPY format tests (deprecated, use Lance/Parquet format)
# Run RL Buffer tests (recommended)
pytest tests/test_rl_buffer.py -v
# Note: test_dataset.py and test_active_loading.py are for legacy NPY format
# Current version uses Lance/Parquet format - see examples/ for usage
The project has migrated from NPY to Lance/Parquet format. Legacy dataset tests are kept for reference but use deprecated APIs.
MIT