This repository was tested on Ubuntu 22 with an NVIDIA RTX 4090 GPU. To use this repository, please follow these steps:
- Open a terminal in your preferred working directory and enter the following commands:
git clone https://github.com/ShuvoNewaz/2D-Object-Tracking-KITTI
cd 2D-Object-Tracking-KITTI
- Make sure Anaconda is installed. Enter the following command:
conda env create -f setup/environment.yml
This will create a conda environment with the required libraries. Activate the environment by typing
conda activate object_tracking_2d
- Download and organize the dataset by entering
bash setup/download.sh
The above command will also download pretrained weights required for a subproblem (object detection). Refer to this repository to get your own trained weights.
- To view the results, open the notebook, activate the installed environment, and run all cells.
The dataset used for this project is the KITTI 2D Tracking Evaluation. The parts of this dataset used are:
- RGB Images (15 GB)
- Training Labels (9 MB) (optional)
If the above steps were followed successfully, the dataset will already be in the required directory. The labels are optional for this project because we use pretrained weights from a different dataset.
The RGB images are continuous frames from different videos. The time difference between consecutive frames is 0.1 seconds.
The labels exist only for the training set. For this work, we have not used the labels provided for the tracking dataset. Instead, we use the weights generated from this repository to predict our own bounding boxes.
The details on the object detector can be found here.
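The state to be determined is,
$$\mathbf{x} = \begin{bmatrix} x & y & v_x & v_y \end{bmatrix}^T$$
where $(x, y)$ is the position of the bounding-box center and $(v_x, v_y)$ is its velocity. Under a constant-velocity motion model, the state evolves as
$$\mathbf{x}_k = F\,\mathbf{x}_{k-1} + \mathbf{w}_{k-1}, \qquad F = \begin{bmatrix} 1 & 0 & \Delta t & 0 \\ 0 & 1 & 0 & \Delta t \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
where $\Delta t$ is the time between consecutive frames and $\mathbf{w}$ is the process noise.
The acceleration is modeled as a zero-mean Gaussian process noise with variance $\sigma_a^2$,
$$a_x,\, a_y \sim \mathcal{N}(0, \sigma_a^2)$$
where $a_x$ and $a_y$ are the unmodeled accelerations along each axis, assumed independent.
The process noise vector arises purely from acceleration,
$$\mathbf{w} = \begin{bmatrix} \tfrac{1}{2}\Delta t^2\, a_x & \tfrac{1}{2}\Delta t^2\, a_y & \Delta t\, a_x & \Delta t\, a_y \end{bmatrix}^T$$
Now, the full 2D process noise covariance matrix models the uncertainty in the process,
$$Q = E\left[\mathbf{w}\mathbf{w}^T\right] = \sigma_a^2 \begin{bmatrix} \tfrac{\Delta t^4}{4} & 0 & \tfrac{\Delta t^3}{2} & 0 \\ 0 & \tfrac{\Delta t^4}{4} & 0 & \tfrac{\Delta t^3}{2} \\ \tfrac{\Delta t^3}{2} & 0 & \Delta t^2 & 0 \\ 0 & \tfrac{\Delta t^3}{2} & 0 & \Delta t^2 \end{bmatrix}$$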
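The model matrices can be sketched in NumPy as follows. This is a minimal illustration, not the repository's code: the state ordering `[x, y, vx, vy]`, the function name `constant_velocity_model`, and the value `sigma_a=1.0` are all assumptions chosen for the example. Only `dt=0.1` comes from the frame spacing stated above.

```python
import numpy as np

def constant_velocity_model(dt: float, sigma_a: float):
    """Return (F, Q) for a 2D constant-velocity model.

    Assumes the state is ordered as [x, y, vx, vy] (an assumption of this sketch).
    """
    # Position advances by velocity * dt; velocity is carried over unchanged.
    F = np.array([
        [1.0, 0.0, dt, 0.0],
        [0.0, 1.0, 0.0, dt],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ])
    # Effect of a constant acceleration over one step on [position, velocity].
    g = np.array([0.5 * dt**2, dt])
    q_axis = sigma_a**2 * np.outer(g, g)  # 2x2 covariance for a single axis
    # Copy the per-axis block onto x and y; the two axes are treated as independent.
    Q = np.zeros((4, 4))
    for axis in (0, 1):
        idx = [axis, axis + 2]  # [position index, velocity index]
        Q[np.ix_(idx, idx)] = q_axis
    return F, Q

# dt = 0.1 s matches the frame spacing; sigma_a is a hypothetical tuning value.
F, Q = constant_velocity_model(dt=0.1, sigma_a=1.0)
```

A larger `sigma_a` tells the filter to trust the constant-velocity assumption less, so the track reacts faster to detections at the cost of smoothness.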
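Initialize a random error covariance matrix $P_0$.
The new estimated state is,
$$\hat{\mathbf{x}}_k^- = F\,\hat{\mathbf{x}}_{k-1}$$
The new error covariance matrix is,
$$P_k^- = F\,P_{k-1}\,F^T + Q$$
where $F$ is the state transition matrix, $Q$ is the process noise covariance matrix, and the superscript $-$ marks a predicted quantity that has not yet been corrected by a measurement.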
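The prediction step can be sketched as below. The function name `kf_predict`, the identity initial covariance, and the placeholder `Q` are assumptions made for illustration; the state is assumed to be ordered as `[x, y, vx, vy]`.

```python
import numpy as np

def kf_predict(x, P, F, Q):
    """Propagate the state estimate and its error covariance one step forward."""
    x_pred = F @ x            # predicted state
    P_pred = F @ P @ F.T + Q  # predicted error covariance
    return x_pred, P_pred

dt = 0.1                      # time between consecutive frames, in seconds
F = np.eye(4)
F[0, 2] = F[1, 3] = dt        # position += velocity * dt
Q = 0.01 * np.eye(4)          # placeholder process noise, for illustration only
x = np.array([100.0, 50.0, 10.0, -5.0])  # hypothetical [x, y, vx, vy] in pixels
P = np.eye(4)                 # identity used here instead of a random matrix

x_pred, P_pred = kf_predict(x, P, F, Q)
```

The predicted covariance is never smaller than the propagated one, because `Q` adds uncertainty at every step in which no measurement is used.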
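The observation is the center of the bounding box,
$$\mathbf{z}_k = \begin{bmatrix} x_c & y_c \end{bmatrix}^T$$
which is obtained from the object detector. The observation matrix maps the predicted state to the observed measurement. Since the observation vector only contains position, for a state ordered as $\begin{bmatrix} x & y & v_x & v_y \end{bmatrix}^T$ it is
$$H = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}$$
The measurement noise covariance matrix $R$ represents the uncertainty in the observation. It depends on the accuracy of the bounding box detection.
The Kalman gain is,
$$K_k = P_k^- H^T \left(H P_k^- H^T + R\right)^{-1}$$
The updated state is,
$$\hat{\mathbf{x}}_k = \hat{\mathbf{x}}_k^- + K_k\left(\mathbf{z}_k - H\,\hat{\mathbf{x}}_k^-\right)$$
The error covariance matrix is updated as,
$$P_k = \left(I - K_k H\right) P_k^-$$
where $\hat{\mathbf{x}}_k^-$ and $P_k^-$ are the predicted state and predicted error covariance.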
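The update step can be sketched as below. The function name `kf_update` and all numeric values are hypothetical, and `H` assumes a state ordered as `[x, y, vx, vy]`.

```python
import numpy as np

def kf_update(x_pred, P_pred, z, H, R):
    """Correct a predicted state with a measurement z (bounding-box center)."""
    y = z - H @ x_pred                       # innovation: measured minus predicted position
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ y                   # updated state
    P_new = (np.eye(len(x_pred)) - K @ H) @ P_pred  # updated error covariance
    return x_new, P_new, K

# Observation matrix: picks the position out of [x, y, vx, vy].
H = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
R = 4.0 * np.eye(2)                          # hypothetical detector noise (pixels^2)
x_pred = np.array([101.0, 49.5, 10.0, -5.0])
P_pred = 4.0 * np.eye(4)
z = np.array([103.0, 50.5])                  # hypothetical detected center

x_new, P_new, K = kf_update(x_pred, P_pred, z, H, R)
```

With equal prediction and measurement uncertainty, as in this example, the gain on position is 0.5, so the updated position lands halfway between the prediction and the detection.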
Unlike the Kalman filter, which estimates a single state, a particle filter maintains many guesses of the object's state. The Kalman filter tracks the mean and covariance of a state under the assumption of linear dynamics and Gaussian noise. The particle filter instead approximates the entire distribution using a set of samples called particles. Each particle is a hypothesis of where the tracked object might be, and each has a weight representing how likely that hypothesis is. Over time, the filter
- Predicts where each particle would move (based on a motion model + noise).
- Updates each particle's weight based on how well it matches the current observation.
- Resamples, keeping the most likely particles and discarding unlikely ones.
We start with a cloud of particles centered around the initial measurement (detection). If there are no initial detections, we create a cloud that is uniformly distributed.
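A minimal sketch of this initialization follows. The Gaussian shape of the cloud, the `spread` value, the zero initial velocities, the function name `init_particles`, and the default frame size of 1242 x 375 pixels are assumptions for illustration, not values taken from the repository.

```python
import numpy as np

def init_particles(n, rng, detection=None, spread=10.0, frame_size=(1242, 375)):
    """Create n particles with state [x, y, vx, vy] and uniform weights."""
    particles = np.zeros((n, 4))  # velocities start at zero (an assumption)
    if detection is not None:
        # Cloud centered on the first detection; Gaussian spread is an assumption.
        particles[:, :2] = rng.normal(loc=detection, scale=spread, size=(n, 2))
    else:
        # No detection available: spread particles uniformly over the frame.
        width, height = frame_size
        particles[:, 0] = rng.uniform(0, width, size=n)
        particles[:, 1] = rng.uniform(0, height, size=n)
    weights = np.full(n, 1.0 / n)
    return particles, weights

rng = np.random.default_rng(0)
particles, weights = init_particles(1000, rng, detection=np.array([600.0, 180.0]))
uniform_particles, _ = init_particles(1000, rng)
```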
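For each particle $i$, the state is propagated through the motion model $f$ with added noise $\boldsymbol{\eta}$,
$$\mathbf{x}_k^{(i)} = f\left(\mathbf{x}_{k-1}^{(i)}\right) + \boldsymbol{\eta}_k^{(i)}$$
For each particle $i$, the weight is updated according to the likelihood of the current observation $\mathbf{z}_k$ given that particle's state, and the weights are then normalized to sum to one,
$$w_k^{(i)} \propto w_{k-1}^{(i)}\, p\left(\mathbf{z}_k \mid \mathbf{x}_k^{(i)}\right)$$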
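The per-particle prediction and weight update can be sketched as below. The constant-velocity motion model, the Gaussian likelihood on the distance to the detected center, the noise levels, and the function names `pf_predict` and `pf_update` are all assumptions of this sketch; the repository may use a different motion model or likelihood.

```python
import numpy as np

def pf_predict(particles, dt, rng, pos_std=2.0, vel_std=1.0):
    """Move every particle [x, y, vx, vy] with constant velocity plus Gaussian noise."""
    moved = particles.copy()
    moved[:, :2] += moved[:, 2:] * dt                              # deterministic motion
    moved[:, :2] += rng.normal(0.0, pos_std, size=(len(moved), 2))  # position noise
    moved[:, 2:] += rng.normal(0.0, vel_std, size=(len(moved), 2))  # velocity noise
    return moved

def pf_update(particles, weights, z, meas_std=5.0):
    """Reweight particles by an (assumed) Gaussian likelihood of the observation z."""
    sq_dist = np.sum((particles[:, :2] - z) ** 2, axis=1)
    likelihood = np.exp(-0.5 * sq_dist / meas_std**2)
    new_weights = weights * likelihood + 1e-300  # guard against all-zero weights
    return new_weights / new_weights.sum()       # normalize to sum to one

rng = np.random.default_rng(0)
demo = np.array([[0.0, 0.0, 10.0, 0.0],
                 [100.0, 100.0, 0.0, 0.0]])
# Noise is switched off here only to make the demonstration deterministic.
moved = pf_predict(demo, dt=0.1, rng=rng, pos_std=0.0, vel_std=0.0)
new_weights = pf_update(moved, np.array([0.5, 0.5]), z=np.array([1.0, 0.0]))
```

In the demonstration, the first particle lands exactly on the observation and therefore receives almost all of the weight.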
We draw a new set of particles by sampling from the current set, with probability proportional to their weights. Particles with high likelihood are duplicated, and those with low likelihood are discarded.
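A minimal sketch of this resampling step follows, using multinomial sampling with replacement. The function name `resample` and the choice to reset the weights to uniform afterwards are assumptions; other schemes such as systematic resampling are also common.

```python
import numpy as np

def resample(particles, weights, rng):
    """Draw len(weights) particles with replacement, with probability equal to their weight."""
    n = len(weights)
    idx = rng.choice(n, size=n, p=weights)  # weights must already sum to one
    # After resampling, every surviving particle is equally likely.
    return particles[idx].copy(), np.full(n, 1.0 / n)

rng = np.random.default_rng(0)
demo_particles = np.arange(16.0).reshape(4, 4)  # four hypothetical particles
demo_weights = np.array([0.0, 0.0, 1.0, 0.0])   # all weight on the third particle
new_particles, new_weights = resample(demo_particles, demo_weights, rng)
```

In this extreme example every resampled particle is a copy of the third one, which shows how low-weight hypotheses disappear from the set.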
The current estimated position is the weighted average of the particles.
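The weighted average can be computed in one call, as in this short sketch. The function name `estimate_state` and the sample values are hypothetical, and the particles are assumed to be stored as rows of `[x, y, vx, vy]`.

```python
import numpy as np

def estimate_state(particles, weights):
    """Return the weighted mean of the particles; the first two entries are the position."""
    return np.average(particles, axis=0, weights=weights)

demo_particles = np.array([[0.0, 0.0, 0.0, 0.0],
                           [10.0, 20.0, 0.0, 0.0]])
demo_weights = np.array([0.25, 0.75])
state = estimate_state(demo_particles, demo_weights)
position = state[:2]
```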
The red dots represent the predicted position (