Out-of-Distribution (OOD) Detection system using two different approaches: Classifier-based (ResNet18 + MC Dropout) and CVAE-based (Bayesian Convolutional Variational Autoencoder) methods.
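- 최현수 (Hyunsu Choi): brainstorming, code writing, presentation material planning and production, dataset collection
- 심준호 (Junho Sim): brainstorming, code review and revision, presentation material production, dataset collection
- 신무현 (Muhyun Sin): brainstorming, code review, presentation material production, dataset collection
- 채경원 (Kyungwon Chae): brainstorming, code review, presentation material planning and production, dataset collection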
- Overview
- System Architecture
- Directory Structure
- Methods
- Quick Start
- Docker Usage Guide
- Output Format
- Configuration
- Troubleshooting
- References & Learning Resources
This system implements Out-of-Distribution (OOD) Detection using two different approaches to identify images that don't belong to the training distribution. The system is designed to work with the Animals-10 dataset (In-Distribution) and Pokemon dataset (Out-of-Distribution).
OOD detection is the task of identifying whether a new input belongs to the same distribution as the training data. In this system:
- ID (In-Distribution): Animals-10 dataset (butterfly, cat, chicken, cow, dog, elephant, horse, sheep, spider, squirrel)
- OOD (Out-of-Distribution): Pokemon dataset (images that are not animals)
The system consists of two independent OOD detection methods. Below are 5 different architectural views of the system:
High-level component diagram showing the overall system structure:
OOD Detection System
- Method 1: Classifier (ResNet18 + MC Dropout)
  - Pretrained on ImageNet
  - Fine-tuned on Animals-10
  - Entropy-based OOD detection
- Method 2: VAE (Bayesian VAE)
  - Encoder-decoder
  - Latent space (128D)
  - Reconstruction-based OOD detection
- Results Layer (fed by both methods)
  - CSV reports
  - Histograms
  - Sorted images
How data flows through the system from input to output:
Input images, Animals-10 (ID) / Pokemon (OOD), are routed to one of three pipelines:
- Classifier pipeline
  - Preprocessing: 224x224, Normalize
  - 30x MC forward passes
  - Entropy calculation
- VAE pipeline
  - Preprocessing: 64x64, ToTensor
  - 30x MC reconstructions
  - MSE + variance calculation
- Single image detection
  - Preprocessing: 224x224, Normalize
  - 30x MC forward passes
  - Entropy calculation

All three pipelines feed:
- Decision logic: ID/OOD threshold comparison
- Results storage: CSV, images, plots
How different components interact with each other:
Component Interaction View
- Docker Containers (Classifier, VAE) mount the Source Code (train.py, evaluate)
- Source Code trains the models saved in Models Storage (.pth weights)
- Source Code reads images through the Data Loader (Animals, Pokemon)
- Docker Containers, Data Loader, and Models Storage all feed the Evaluation Engine (MC sampling, score calculation, threshold)
- Evaluation Engine passes its output to the Results Manager (CSV writer, image copy, plot generation)
Detailed flow of the training process for both methods:
Training Pipeline Architecture

Classifier training pipeline:
- [Animals Dataset]
- [DataLoader] → [Transform: 224x224, Normalize]
- [ResNet18] → [Pretrained ImageNet Weights]
- [Modify FC] → [Dropout(0.5) + Linear(10)]
- [Training Loop]
  - Forward pass
  - CrossEntropy loss
  - Backward pass
  - Adam optimizer
- [Save Model] → /app/models/Animals-10/classifier/

VAE training pipeline:
- [Animals Dataset]
- [DataLoader] → [Transform: 64x64, ToTensor]
- [Bayesian VAE]
  - [Encoder] → [μ, log(σ²)] → [z ~ N(μ, σ²)]
  - [Decoder] → [Reconstruction]
- [Loss Calculation]
  - MSE (reconstruction)
  - KL divergence (regularization)
- [Training Loop] (BF16 mixed precision)
  - Forward pass
  - Loss backward
  - Adam optimizer
- [Save Model] → /app/models/Animals-10/vae/
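The VAE loss named in the training pipeline (MSE reconstruction plus KL divergence, with z sampled from N(μ, σ²)) can be sketched in plain Python as follows. This is a minimal illustration rather than the repository's `vae/train.py`: the function names, the `kl_weight` parameter, and the reductions (mean over pixels for MSE, sum over latent dimensions for KL) are assumptions.

```python
import math
import random


def reparameterize(mu, logvar, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), where sigma = exp(0.5 * logvar)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def kl_divergence(mu, logvar):
    """Closed-form KL(N(mu, sigma^2) || N(0, I)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


def mse(original, reconstruction):
    """Mean squared error over flattened pixel values."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstruction)) / len(original)


def vae_loss(original, reconstruction, mu, logvar, kl_weight=1.0):
    """Reconstruction term plus weighted KL term (kl_weight is a hypothetical knob)."""
    return mse(original, reconstruction) + kl_weight * kl_divergence(mu, logvar)
```

The KL term is zero when the encoder outputs μ = 0 and log(σ²) = 0, so it only penalizes latent codes that drift away from the standard normal prior.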
Detailed flow of the OOD detection/evaluation process:
Inference Pipeline Architecture

Classifier inference pipeline:
- [Input Image] → [Preprocess: 224x224, Normalize]
- [MC Dropout Loop: 30 iterations]
  - [Forward Pass 1] → [Logits] → [Softmax]
  - [Forward Pass 2] → [Logits] → [Softmax]
  - ...
  - [Forward Pass 30] → [Logits] → [Softmax]
- [Average Probabilities] → [Mean Distribution]
- [Entropy Calculation]: H = -Σ(p_i * log(p_i))
- [Decision]
  - if H > 0.6: OOD
  - else: ID (with predicted class)

VAE inference pipeline:
- [Input Image] → [Preprocess: 64x64, ToTensor]
- [MC Sampling Loop: 30 iterations]
  - [Encode] → [Sample z₁] → [Decode] → [Recon₁]
  - [Encode] → [Sample z₂] → [Decode] → [Recon₂]
  - ...
  - [Encode] → [Sample z₃₀] → [Decode] → [Recon₃₀]
- [Calculate Scores]
  - Mean Reconstruction = mean(Recon₁...Recon₃₀)
  - Reconstruction Error = MSE(Original, Mean Recon)
  - Uncertainty = Variance(Recon₁...Recon₃₀)
- [Anomaly Score]: Score = Reconstruction Error + Uncertainty
- [Decision]
  - if Score > 0.025: OOD
  - else: ID

Common output processing:
- [OOD Decision] → [Result Storage]
  - [CSV File]
  - [Image Copy]
  - [Histogram Plot]
- [OOD Decision] → [Visualization] → [Results Directory]
OOD/
├── data/                              # Dataset storage
│   ├── animals/                       # In-Distribution data (Animals-10)
│   │   ├── butterfly/
│   │   ├── cat/
│   │   ├── chicken/
│   │   └── ... (10 animal classes)
│   └── pokemon/                       # Out-of-Distribution data
│       └── unknown/
│
├── models/                            # Trained model weights
│   └── Animals-10/
│       ├── classifier/                # ResNet18 classifier model
│       │   └── animals10_resnet18.pth
│       └── vae/                       # Bayesian VAE model
│           └── vae_final.pth
│
├── results/                           # Evaluation results
│   └── Animals-10/
│       ├── classifier/
│       │   └── run_1/                 # Each run creates a new folder
│       │       ├── ood_results_run_1.csv
│       │       ├── histogram_run_1.png
│       │       └── sorted_images/
│       └── vae/
│           └── run_1/
│               ├── vae_results_run_1.csv
│               ├── histogram_run_1.png
│               └── sorted_images/
│
├── src/                               # Source code
│   └── Animals-10/
│       ├── classifier/                # Classifier-based OOD detection
│       │   ├── model.py               # ResNet18 with MC Dropout
│       │   ├── train.py               # Training script
│       │   ├── evaluate_ood.py        # Batch evaluation
│       │   └── detect_ood.py          # Single image detection
│       └── vae/                       # VAE-based OOD detection
│           ├── model.py               # Bayesian VAE architecture
│           ├── train.py               # Training script
│           └── evaluate_ood.py        # Evaluation script
│
├── docker/                            # Docker configuration
│   ├── Dockerfile.classifier          # Classifier container
│   └── Dockerfile.vae                 # VAE container
│
└── docker-compose.yml                 # Container orchestration
- Model: ResNet18 (pretrained on ImageNet)
- Technique: Monte Carlo (MC) Dropout for uncertainty estimation
- Detection Metric: Entropy of predicted class probabilities
- Training Phase (classifier/train.py):
  - Loads ResNet18 pretrained on ImageNet
  - Replaces the final layer with Dropout (p=0.5) + Linear layer
  - Fine-tunes on the Animals-10 dataset
  - Saves the model to /app/models/Animals-10/classifier/
- Detection Phase (classifier/evaluate_ood.py):
  - For each image, performs 30 forward passes with Dropout enabled
  - Calculates the average probability distribution across all passes
  - Computes the entropy of the distribution: Entropy = -Σ(p_i * log(p_i))
  - High entropy → Model is uncertain → Likely OOD
  - Low entropy → Model is confident → Likely ID
- Decision Rule:
  - If entropy > 0.6 → OOD (Pokemon/Unknown)
  - If entropy ≤ 0.6 → ID (Animal class)
- MC Dropout: Enables uncertainty quantification during inference
- Entropy-based scoring: Measures prediction confidence
- Batch processing: Efficient evaluation of large datasets
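The detection phase and decision rule for the classifier can be sketched in plain Python. This is an illustrative sketch, not the code in `classifier/evaluate_ood.py`: the function names are hypothetical, each inner list stands in for the softmax output of one MC Dropout forward pass, and the natural logarithm is assumed for the entropy.

```python
import math


def mean_distribution(prob_samples):
    """Average per-class probabilities across MC Dropout forward passes."""
    num_samples = len(prob_samples)
    num_classes = len(prob_samples[0])
    return [sum(sample[c] for sample in prob_samples) / num_samples
            for c in range(num_classes)]


def entropy(probs):
    """Shannon entropy H = -sum(p_i * log(p_i)), skipping zero-probability classes."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def classify(prob_samples, threshold=0.6):
    """Return (label, predicted_class_index, entropy) using the documented threshold."""
    mean_probs = mean_distribution(prob_samples)
    h = entropy(mean_probs)
    if h > threshold:
        return "OOD", None, h
    predicted = max(range(len(mean_probs)), key=mean_probs.__getitem__)
    return "ID", predicted, h
```

With 10 classes and the natural logarithm, the entropy ranges from 0 (all mass on one class) to ln(10) ≈ 2.30 (uniform), which gives a sense of where the 0.6 threshold sits.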
- Model: Bayesian Variational Autoencoder (VAE)
- Technique: Reconstruction error + uncertainty estimation
- Detection Metric: Anomaly score (MSE + variance)
- Training Phase (vae/train.py):
  - Trains a VAE to reconstruct animal images
  - Encoder: Compresses images to latent space (128 dimensions)
  - Decoder: Reconstructs images from latent codes
  - Uses MSE loss + KL divergence (standard VAE loss)
  - Optimized for H100 GPU with mixed precision (BF16)
  - Saves the model to /app/models/Animals-10/vae/
- Detection Phase (vae/evaluate_ood.py):
  - For each image, performs 30 reconstructions (MC sampling)
  - Calculates:
    - Reconstruction Error: MSE between original and mean reconstruction
    - Uncertainty: Variance across 30 reconstructions
    - Anomaly Score = Reconstruction Error + Uncertainty
  - High score → Poor reconstruction → Likely OOD
  - Low score → Good reconstruction → Likely ID
- Decision Rule:
  - If anomaly_score > 0.025 → OOD (Pokemon)
  - If anomaly_score ≤ 0.025 → ID (Animal)
- Reconstruction-based: Learns the distribution of ID data
- Bayesian uncertainty: Quantifies model uncertainty
- H100 optimized: Uses torch.compile and BF16 precision
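The anomaly score described for the VAE method can be sketched in plain Python on flattened pixel lists. This is a hedged illustration, not the code in `vae/evaluate_ood.py`: the function names are hypothetical, and reducing the per-pixel variance to a single number by averaging over pixels is an assumption, since the text does not say how the variance is aggregated.

```python
def anomaly_score(original, reconstructions):
    """Reconstruction error (MSE vs. mean reconstruction) plus mean per-pixel variance."""
    num_samples = len(reconstructions)
    num_pixels = len(original)

    # Mean reconstruction across the MC samples, pixel by pixel.
    mean_recon = [sum(r[i] for r in reconstructions) / num_samples
                  for i in range(num_pixels)]

    # MSE between the original image and the mean reconstruction.
    recon_error = sum((o - m) ** 2 for o, m in zip(original, mean_recon)) / num_pixels

    # Population variance per pixel across samples, averaged over pixels (assumption).
    uncertainty = sum(
        sum((r[i] - mean_recon[i]) ** 2 for r in reconstructions) / num_samples
        for i in range(num_pixels)
    ) / num_pixels

    return recon_error + uncertainty


def is_ood(score, threshold=0.025):
    """Apply the documented decision rule: score > threshold means OOD."""
    return score > threshold
```

Both terms are on the scale of squared pixel values, which is why a small threshold such as 0.025 is meaningful for inputs in the [0, 1] range produced by ToTensor.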
# Extract datasets
unzip data/animals.zip -d data/
unzip pokemon.zip -d data/pokemon/

# Start containers in detached mode
docker-compose up -d

Train Classifier:
docker exec -it animals_classifier_container bash
cd /app/src/Animals-10/classifier
python train.py

Train VAE:
docker exec -it ood_vae_container bash
cd /app/src/Animals-10/vae
python train.py

Evaluate with Classifier:
docker exec -it animals_classifier_container bash
cd /app/src/Animals-10/classifier
python evaluate_ood.py

Evaluate with VAE:
docker exec -it ood_vae_container bash
cd /app/src/Animals-10/vae
python evaluate_ood.py

# Single image detection
docker exec -it animals_classifier_container bash
cd /app/src/Animals-10/classifier
python detect_ood.py --image /path/to/image.jpg

- Docker: Version 20.10 or higher
- Docker Compose: Version 2.0 or higher
- NVIDIA Docker Runtime: For GPU support (nvidia-docker2)
- NVIDIA GPU: With CUDA support (for training/evaluation)
# Check Docker version
docker --version
# Check Docker Compose version
docker-compose --version
# Check NVIDIA Docker runtime
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

The system uses two Docker containers:

Classifier Container:
- Container Name: animals_classifier_container
- Image: animals-classifier:v1
- Purpose: Classifier training and evaluation
- Base Image: nvcr.io/nvidia/pytorch:23.10-py3
- Ports: 8889:8888 (Jupyter Lab), 6006:6006 (TensorBoard)

VAE Container:
- Container Name: ood_vae_container
- Image: ood-vae:h100
- Purpose: VAE training and evaluation
- Base Image: nvcr.io/nvidia/pytorch:23.10-py3
- Ports: 8888:8888 (Jupyter Lab)
- Optimization: H100 GPU optimized with BF16 support
# Build all containers
docker-compose build
# Or build individually
docker-compose build classifier
docker-compose build vae

# Start containers in detached mode
docker-compose up -d
# View container status
docker-compose ps

# Check if containers are running
docker ps
# View container logs
docker-compose logs classifier
docker-compose logs vae

# Start all containers
docker-compose up -d
# Start specific container
docker-compose up -d classifier
docker-compose up -d vae

# Stop and remove all containers
docker-compose down
# Stop containers without removing them
docker-compose stop
# Stop specific container
docker-compose stop classifier

# Restart all containers
docker-compose restart
# Restart specific container
docker-compose restart classifier

# View logs for all containers
docker-compose logs
# View logs for specific container
docker-compose logs classifier
docker-compose logs vae
# Follow logs in real-time
docker-compose logs -f classifier
# View last 100 lines
docker-compose logs --tail=100 classifier

Classifier Container:
# Enter interactive bash shell
docker exec -it animals_classifier_container bash
# Once inside, you're in /app directory
cd /app/src/Animals-10/classifier

VAE Container:
# Enter interactive bash shell
docker exec -it ood_vae_container bash
# Once inside, you're in /app directory
cd /app/src/Animals-10/vae

From Host (without entering container):
# Run classifier training
docker exec -it animals_classifier_container \
python /app/src/Animals-10/classifier/train.py
# Run classifier evaluation
docker exec -it animals_classifier_container \
python /app/src/Animals-10/classifier/evaluate_ood.py
# Run VAE training
docker exec -it ood_vae_container \
python /app/src/Animals-10/vae/train.py
# Run VAE evaluation
docker exec -it ood_vae_container \
python /app/src/Animals-10/vae/evaluate_ood.py
# Single image detection
docker exec -it animals_classifier_container \
python /app/src/Animals-10/classifier/detect_ood.py \
--image /app/data/pokemon/unknown/image.jpg

From Inside Container:
# Enter container first
docker exec -it animals_classifier_container bash
# Then run scripts
cd /app/src/Animals-10/classifier
python train.py
python evaluate_ood.py
python detect_ood.py --image /app/data/pokemon/unknown/image.jpg

Both containers are configured with GPU support. Verify GPU access:
# Check GPU in classifier container
docker exec -it animals_classifier_container nvidia-smi
# Check GPU in VAE container
docker exec -it ood_vae_container nvidia-smi
# Run Python with GPU check
docker exec -it animals_classifier_container \
python -c "import torch; print(torch.cuda.is_available())"

The containers use volume mounts to share data between host and containers:

| Host Path | Container Path | Purpose |
|---|---|---|
| ./src | /app/src | Source code |
| ./data | /app/data | Datasets |
| ./models | /app/models | Trained models |
| ./results | /app/results | Evaluation results |
From Host to Container:
- Files in ./src/ are accessible at /app/src/ in container
- Files in ./data/ are accessible at /app/data/ in container
- Models saved to /app/models/ appear in ./models/ on host
- Results saved to /app/results/ appear in ./results/ on host
Example:
# On host: create a test file
echo "test" > ./src/test.txt
# In container: access the file
docker exec -it animals_classifier_container cat /app/src/test.txt

- Real-time Sync: Changes in host directories are immediately visible in containers
- No Copy Needed: Files are shared, not copied
- Persistent Storage: Data persists even after container removal (unless using the -v flag)
| Container | Host Port | Container Port | Service |
|---|---|---|---|
| Classifier | 8889 | 8888 | Jupyter Lab |
| Classifier | 6006 | 6006 | TensorBoard |
| VAE | 8888 | 8888 | Jupyter Lab |
Jupyter Lab (Classifier):
# Access at: http://localhost:8889
# Default password/token: Check container logs
docker-compose logs classifier | grep token

Jupyter Lab (VAE):
# Access at: http://localhost:8888
# Default password/token: Check container logs
docker-compose logs vae | grep token

TensorBoard (Classifier):
# Start TensorBoard inside container
docker exec -it animals_classifier_container \
tensorboard --logdir=/app/results --port=6006 --host=0.0.0.0
# Access at: http://localhost:6006

# 1. Start containers
docker-compose up -d
# 2. Train classifier
docker exec -it animals_classifier_container \
python /app/src/Animals-10/classifier/train.py
# 3. Train VAE
docker exec -it ood_vae_container \
python /app/src/Animals-10/vae/train.py
# 4. Evaluate classifier
docker exec -it animals_classifier_container \
python /app/src/Animals-10/classifier/evaluate_ood.py
# 5. Evaluate VAE
docker exec -it ood_vae_container \
python /app/src/Animals-10/vae/evaluate_ood.py

# 1. Start containers
docker-compose up -d
# 2. Enter classifier container
docker exec -it animals_classifier_container bash
# 3. Inside container, navigate and work
cd /app/src/Animals-10/classifier
python train.py # Edit code on host, run in container

# Test single image with classifier
docker exec -it animals_classifier_container \
python /app/src/Animals-10/classifier/detect_ood.py \
--image /app/data/pokemon/unknown/pikachu.jpg

# Terminal 1: Start training
docker exec -it animals_classifier_container \
python /app/src/Animals-10/classifier/train.py
# Terminal 2: Monitor logs
docker-compose logs -f classifier
# Terminal 3: Check GPU usage
watch -n 1 docker exec animals_classifier_container nvidia-smi

Symptoms: Container exits immediately after starting
Solutions:
# Check logs
docker-compose logs classifier
# Check if port is already in use
netstat -tulpn | grep 8889
# Rebuild container
docker-compose build --no-cache classifier
docker-compose up -d classifier

Symptoms: torch.cuda.is_available() returns False
Solutions:
# Verify NVIDIA Docker runtime
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
# Check container GPU access
docker exec -it animals_classifier_container nvidia-smi
# Verify docker-compose.yml has runtime: nvidia
cat docker-compose.yml | grep runtime

Symptoms: Cannot write to mounted volumes
Solutions:
# Check file permissions
ls -la ./models
ls -la ./results
# Fix permissions (if needed)
sudo chown -R $USER:$USER ./models ./results

Symptoms: CUDA out of memory errors
Solutions:
# Reduce batch size in training scripts
# Edit: src/Animals-10/classifier/train.py
# Change: BATCH_SIZE = 32 # Reduce from 64

Symptoms: ImportError: No module named 'X'
Solutions:
# Install missing package in container
docker exec -it animals_classifier_container pip install package_name
# Or rebuild container with new dependencies
# Edit Dockerfile, then:
docker-compose build classifier

# Start all
docker-compose up -d
# Stop all
docker-compose down
# View logs
docker-compose logs -f
# Enter container
docker exec -it animals_classifier_container bash
docker exec -it ood_vae_container bash
# Run script
docker exec -it animals_classifier_container python /app/src/.../script.py
# Check GPU
docker exec -it animals_classifier_container nvidia-smi
# Rebuild
docker-compose build
# Clean restart
docker-compose down && docker-compose up -d

| Task | Host Path | Container Path |
|---|---|---|
| Edit code | ./src/... | /app/src/... |
| Add data | ./data/... | /app/data/... |
| Check models | ./models/... | /app/models/... |
| View results | ./results/... | /app/results/... |
Each evaluation run creates a new run_X folder:
results/Animals-10/classifier/run_1/
├── ood_results_run_1.csv       # Detailed results per image
├── mean_entropy_run_1.txt      # Summary statistics
├── histogram_run_1.png         # Visualization
└── sorted_images/
    ├── Predicted_ID/           # Images classified as ID
    └── Predicted_OOD/          # Images classified as OOD
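One way to obtain the incrementing run_X folders described above is to scan the results directory for existing runs and create the next one. The sketch below is an assumption about how this could be done, not the repository's actual logic; `next_run_dir` is a hypothetical helper.

```python
import re
from pathlib import Path


def next_run_dir(base_dir):
    """Create and return base_dir/run_N, where N is one more than the highest existing run."""
    base = Path(base_dir)
    base.mkdir(parents=True, exist_ok=True)

    # Collect the numeric suffix of every existing run_<number> directory.
    pattern = re.compile(r"run_(\d+)")
    existing = [
        int(match.group(1))
        for path in base.iterdir()
        if path.is_dir() and (match := pattern.fullmatch(path.name))
    ]

    run_dir = base / f"run_{max(existing, default=0) + 1}"
    # Pre-create the folders that sorted images are copied into.
    (run_dir / "sorted_images" / "Predicted_ID").mkdir(parents=True)
    (run_dir / "sorted_images" / "Predicted_OOD").mkdir(parents=True)
    return run_dir
```

For example, `next_run_dir("results/Animals-10/classifier")` would return `run_1` on the first call and `run_2` on the second.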
Classifier Results:
- Filename: Image filename
- True_Label: ID(Animal) or OOD(Pokemon)
- Entropy_Score: Uncertainty score
- Final_Prediction: ID or OOD
- Pred_Class: Predicted animal class
- Full_Path: Original image path

VAE Results:
- Filename: Image filename
- True_Label: Animals or Pokemon
- Anomaly_Score: Reconstruction error + uncertainty
- Prediction: ID or OOD
- Original_Path: Original image path
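The result CSVs can be summarized with the standard library alone. The sketch below counts (true label, prediction) pairs; `summarize_results` is a hypothetical helper, and the exact strings stored in the label columns are not confirmed by this document, so the function makes no assumption about them.

```python
import csv
from collections import Counter


def summarize_results(csv_file, label_col="True_Label", pred_col="Final_Prediction"):
    """Count (true label, prediction) pairs in an evaluation CSV opened as a text file."""
    reader = csv.DictReader(csv_file)
    return Counter((row[label_col], row[pred_col]) for row in reader)


def print_summary(counts):
    """Print one line per (true label, prediction) pair, most frequent first."""
    for (true_label, prediction), count in counts.most_common():
        print(f"{true_label:>15} -> {prediction:<5} {count}")
```

For the classifier output, open the file with `open(path, newline="")` and pass it in as is; for the VAE output, pass `pred_col="Prediction"` to match its column name.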
Classifier:
- NUM_MC_SAMPLES = 30: Number of forward passes for uncertainty estimation
- ENTROPY_THRESHOLD = 0.6: OOD detection threshold
- BATCH_SIZE = 64: Evaluation batch size
- NUM_EPOCHS = 10: Training epochs

VAE:
- ANOMALY_THRESHOLD = 0.025: OOD detection threshold
- BATCH_SIZE = 256: Training batch size
- NUM_EPOCHS = 50: Training epochs
- latent_dim = 128: Latent space dimensionality
| Aspect | Classifier Method | VAE Method |
|---|---|---|
| Approach | Discriminative | Generative |
| Detection | Entropy (uncertainty) | Reconstruction error |
| Training | Faster (10 epochs) | Slower (50 epochs) |
| Inference | 30 forward passes | 30 reconstructions |
| Interpretability | Class probabilities | Visual reconstruction |
| Use Case | When you have labels | When you only have ID data |
- Model not found: Ensure training scripts have been run first
- CUDA out of memory: Reduce batch size in evaluation scripts
- No data found: Check that datasets are extracted in the data/ directory
- Container issues: Use docker-compose logs to check container status
- Low entropy (≤ 0.6): Model is confident → ID
- High entropy (> 0.6): Model is uncertain → OOD
- Low anomaly score (≤ 0.025): Good reconstruction → ID
- High anomaly score (> 0.025): Poor reconstruction → OOD
The histogram plots show the distribution of scores for ID and OOD samples. A good OOD detector should show:
- Clear separation between ID and OOD distributions
- ID samples clustered at low scores
- OOD samples spread at high scores
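The separation visible in the histograms can also be summarized as a single number. The sketch below computes AUROC, the probability that a randomly chosen OOD sample scores higher than a randomly chosen ID sample. AUROC is not mentioned elsewhere in this document and is offered here only as one common way to quantify separation; the function name is hypothetical.

```python
def auroc(id_scores, ood_scores):
    """Fraction of (ID, OOD) pairs where the OOD score is higher; ties count as half."""
    wins = 0.0
    for ood in ood_scores:
        for ind in id_scores:
            if ood > ind:
                wins += 1.0
            elif ood == ind:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))
```

A value of 1.0 means the two distributions are perfectly separated, and 0.5 means the score carries no information. The pairwise loop is quadratic, so a rank-based implementation is preferable for large datasets.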
- MC Dropout: Gal, Y., & Ghahramani, Z. (2016). Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning. ICML.
- VAE: Kingma, D. P., & Welling, M. (2013). Auto-Encoding Variational Bayes. arXiv:1312.6114.
- ResNet: He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. CVPR.
The following video series will help you improve your computer vision skills and deepen your understanding of the concepts used in this OOD detection system:
Direct Link: Computer Vision Course - YouTube
This course covers essential computer vision topics that are directly relevant to this OOD detection system, including:
- Deep learning architectures (ResNet, VAE)
- Uncertainty estimation techniques
- Out-of-distribution detection methods
- Model evaluation and interpretation
- Docker Documentation: https://docs.docker.com/
- Docker Compose Documentation: https://docs.docker.com/compose/
- NVIDIA Container Toolkit: https://github.com/NVIDIA/nvidia-docker
- Both methods use Monte Carlo sampling (30 samples) for uncertainty estimation
- Results are automatically organized into run_X folders to track multiple experiments
- Images are copied to sorted_images/ folders for visual inspection
- The system is optimized for GPU execution (CUDA)
- VAE method is specifically optimized for H100 GPUs with BF16 precision
Last Updated: Complete documentation for OOD Detection System with Animals-10 and Pokemon datasets