This repository contains the official implementation of the following paper:
Supercharging Floorplan Localization with Semantic Rays
Authors: Yuval Grader, Hadar Averbuch-Elor
Paper | arXiv | Project Page
The framework leverages both semantic and depth information for accurate camera pose estimation in indoor environments.
Floorplans provide a compact representation of the building's structure, revealing not only layout information but also detailed semantics such as the locations of windows and doors. However, contemporary floorplan localization techniques mostly focus on matching depth-based structural cues, ignoring the rich semantics communicated within floorplans. In this work, we introduce a semantic-aware localization framework that jointly estimates depth and semantic rays, consolidating over both for predicting a structural-semantic probability volume. Our probability volume is constructed in a coarse-to-fine manner: We first sample a small set of rays to obtain an initial low-resolution probability volume. We then refine these probabilities by performing a denser sampling only in high-probability regions and process the refined values for predicting a 2D location and orientation angle. We conduct an evaluation on two standard floorplan localization benchmarks. Our experiments demonstrate that our approach substantially outperforms state-of-the-art methods, achieving significant improvements in recall metrics compared to prior works. Moreover, we demonstrate that our framework can easily incorporate additional metadata such as room labels, enabling additional gains in both accuracy and efficiency.
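The coarse-to-fine construction described above can be sketched with a toy selection step: score a coarse grid of candidate poses, keep only the highest-probability cells, and re-sample rays densely there. A minimal pure-Python illustration (`refine_mask` and `keep_frac` are hypothetical names for this sketch, not the repository's API):

```python
def refine_mask(probs, keep_frac=0.1):
    """Toy sketch of the coarse-to-fine step: mark the top keep_frac fraction
    of coarse cells, whose neighborhoods would then be re-sampled with denser
    rays. Illustrative only -- not the paper's implementation."""
    k = max(1, int(keep_frac * len(probs)))
    threshold = sorted(probs, reverse=True)[k - 1]
    # Ties at the threshold may select slightly more than k cells.
    return [p >= threshold for p in probs]
```

Only the surviving cells are re-evaluated at higher resolution, which is what makes the dense second pass affordable.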
For each dataset (S3D, ZInD):
1. **Download the dataset**
   - Use the official dataset pages.
   - You can use the provided script to automate download and extraction for S3D:

     ```bash
     python data_utils/s3d/download_and_extract.py
     ```
2. **Create Processed Datasets**
   - After downloading, process the raw data to create a processed folder for each dataset:

     ```bash
     # For S3D
     python data_utils/s3d/create_data_sets.py
     # For ZInD (adapt as needed for ZInD structure)
     python data_utils/zind/create_data_sets.py
     ```

   - This will create a `processed` folder with the required structure.
3. **Resize Images**
   - Resize all images to the required input size (360x640) to match prior work:

     ```bash
     python data_utils/resize_images.py
     ```
4. **Generate Raycast Maps**
   - Generate grid raycasts for floorplan maps:

     ```bash
     python -m data_utils.generate_maps_grid_raycasts_multi_thread
     ```
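Conceptually, each grid raycast marches a ray across the floorplan grid until it hits an annotated cell, yielding both a depth and a semantic label. A toy single-threaded sketch of the idea (`cast_ray` and its parameters are hypothetical; the repository's actual multi-threaded raycaster lives in `data_utils`):

```python
import math

def cast_ray(grid, x, y, theta, max_dist=50.0, step=0.1):
    """Toy grid raycast: march from (x, y) along angle theta until hitting a
    non-zero cell; return (depth, semantic_label). Illustrative only."""
    dx, dy = math.cos(theta), math.sin(theta)
    d = 0.0
    while d < max_dist:
        cx, cy = int(x + d * dx), int(y + d * dy)
        if not (0 <= cy < len(grid) and 0 <= cx < len(grid[0])):
            break                       # ray left the floorplan
        if grid[cy][cx] != 0:
            return d, grid[cy][cx]      # hit a wall/door/window cell
        d += step
    return d, 0                         # no hit within range
```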
5. **Map Room Types**
   - Use the helper functions in `modules/semantic/semantic_mapper.py` to map room types as needed for your dataset.
   - **Important for ZInD:** Since ZInD contains over 250 different room types, we use a semantic mapper to consolidate these into a more manageable set of semantic categories. This mapping is essential for effective training and evaluation.
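As a sketch of what such a consolidation might look like (the label strings and categories below are illustrative assumptions; the actual mapping lives in `modules/semantic/semantic_mapper.py`):

```python
# Hypothetical example of consolidating ZInD's 250+ raw room labels into a
# small category set -- labels and categories here are illustrative only.
ROOM_TYPE_MAP = {
    "master bedroom": "bedroom",
    "guest bedroom": "bedroom",
    "half bath": "bathroom",
    "powder room": "bathroom",
    "eat-in kitchen": "kitchen",
    "kitchenette": "kitchen",
}

def map_room_type(raw_label, default="other"):
    """Normalize a raw room label to a consolidated semantic category."""
    return ROOM_TYPE_MAP.get(raw_label.strip().lower(), default)
```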
Our training framework is built on PyTorch Lightning, providing robust training capabilities with automatic logging and checkpointing.
- Each training run is controlled by a YAML config file in `Train_models/configurations/S3D/` or the corresponding dataset folder.
- Adjust the main parameters (e.g., learning rate, batch size, loss weights) in the config file before training.
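As a rough illustration, a config might look like the following (all field names here are hypothetical; consult the actual files in `Train_models/configurations/` for the real schema):

```yaml
# Hypothetical config sketch -- keys are illustrative, not the repo's schema.
dataset: S3D
batch_size: 8
learning_rate: 1.0e-4
depth_loss_weight: 1.0
semantic_loss_weight: 0.5
max_epochs: 100
```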
- **Start Training:**

  ```bash
  python -m Train_models.Train
  ```

- **Monitor Training:**
  - **TensorBoard:** Training progress is automatically logged to TensorBoard. Launch TensorBoard to monitor:

    ```bash
    tensorboard --logdir lightning_logs
    ```

  - **Checkpoints:** PyTorch Lightning automatically saves checkpoints in the `lightning_logs/` directory.
  - **Logs:** Detailed training logs are saved in the `logs/` directory.

- **Training Features:**
  - **Automatic Mixed Precision:** Enabled by default for faster training
  - **Gradient Clipping:** Configured to prevent gradient explosion
  - **Early Stopping:** Monitors validation loss to prevent overfitting
  - **Model Checkpointing:** Saves the best model based on validation metrics
- Monitor the training curves in TensorBoard to ensure proper convergence
- Adjust learning rate and batch size based on your hardware capabilities
- Use the validation set to tune hyperparameters
- Check the `lightning_logs/` directory for saved model checkpoints
- For evaluation, specify the evaluation config, weights directory, and results directory in the evaluation script/config.
- Run evaluation:

  ```bash
  python -m evaluation.eval_localization
  ```
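The benchmarks report recall-style metrics: the fraction of predicted poses that land within a distance (and optionally orientation) threshold of the ground truth. A minimal sketch of such a metric (`recall_at` and its thresholds are hypothetical, not the repository's evaluation code):

```python
import math

def recall_at(preds, gts, dist_thresh=1.0, angle_thresh_deg=30.0):
    """Toy localization recall: fraction of (x, y, theta_deg) predictions
    within a distance and orientation threshold of the ground-truth pose.
    Illustrative only."""
    hits = 0
    for (px, py, pt), (gx, gy, gt) in zip(preds, gts):
        dist = math.hypot(px - gx, py - gy)
        ang = abs((pt - gt + 180.0) % 360.0 - 180.0)  # wrapped angle diff
        if dist <= dist_thresh and ang <= angle_thresh_deg:
            hits += 1
    return hits / max(1, len(preds))
```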
- Download pretrained weights and place them in the appropriate directory (e.g., `modules/weights/s3d/`).
- Download weights here
See `requirements.txt` for all dependencies.
If you use this code, please cite our paper:
@misc{grader2025superchargingfloorplanlocalizationsemantic,
title={Supercharging Floorplan Localization with Semantic Rays},
author={Yuval Grader and Hadar Averbuch-Elor},
year={2025},
eprint={2507.09291},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2507.09291},
}