202 changes: 189 additions & 13 deletions Deeplens_Self_Supervised_Learning_Yashwardhan_Deshmukh/README.md
Original file line number Diff line number Diff line change
@@ -1,40 +1,216 @@

# Self-Supervised Learning for Strong Gravitational Lensing :sparkles: :milky_way:
*Special thanks to my mentors Anna Parul, Yurii Halychanskyi, and Sergei Gleyzer.*
<hr>
<hr>

Click on this button below to read the detailed blog (Part 1): <br><br> [![medium](https://img.shields.io/badge/medium-000?style=for-the-badge&logo=medium&logoColor=white)](https://yaashwardhan.medium.com/self-supervised-learning-for-strong-gravitational-lensing-part1-5a049e976b51)

<img src="header.png">

Code is written in: <br>[![WrittenIN](https://skillicons.dev/icons?i=python,tensorflow)](https://skillicons.dev)

---

## Project Overview

This module implements **self-supervised learning (SSL)** techniques — specifically **Contrastive Learning** (Rotation & Gaussian Noise pretext tasks) and **Bootstrap Your Own Latent (BYOL)** — for classifying strong gravitational lensing images from [DeepLense](https://github.com/ML4SCI/DeepLense) datasets.

**Key goals:**
- Pre-train a ResNet50 encoder on unlabelled lensing images using SSL pretext tasks.
- Fine-tune the pre-trained encoder on labelled data for downstream classification (axion / CDM / no sub-structure).
- Benchmark SSL performance against a fully supervised ResNet50 baseline.

The pipeline covers:
1. Data loading and augmentation
2. Self-supervised pre-training (BYOL / Contrastive Learning)
3. Fine-tuning on labelled data
4. Evaluation (AUC, ROC curves, confusion matrix)
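
The evaluation stage reports one ROC AUC per class (axion / CDM / no sub-structure), one-vs-rest. As a framework-agnostic illustration of that metric (a sketch, not the repository's `9.eval.py`), per-class AUC can be computed from the Mann-Whitney pairwise statistic:

```python
import numpy as np

def binary_auc(y_true, scores):
    """ROC AUC via the Mann-Whitney statistic: the fraction of
    (positive, negative) pairs that the score ranks correctly."""
    scores = np.asarray(scores, dtype=float)
    y_true = np.asarray(y_true)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    correct = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (correct + 0.5 * ties) / (len(pos) * len(neg))

def one_vs_rest_auc(labels, probs, class_names=("axion", "cdm", "no_sub")):
    """Per-class AUC from an (N, 3) array of softmax probabilities."""
    labels = np.asarray(labels)
    return {name: binary_auc(labels == i, probs[:, i])
            for i, name in enumerate(class_names)}
```

`scikit-learn`'s `roc_auc_score` computes the same quantity; the sketch just makes the pairwise-ranking interpretation explicit.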

---

## Installation

### Prerequisites
- Python ≥ 3.8
- CUDA-compatible GPU (recommended for training)

### 1. Clone the repository

```bash
git clone https://github.com/ML4SCI/DeepLense.git
cd DeepLense
```

### 2. (Optional) Create a virtual environment

```bash
python -m venv venv
# On Linux/macOS
source venv/bin/activate
# On Windows
venv\Scripts\activate
```

### 3. Install dependencies

```bash
pip install "tensorflow>=2.6" keras numpy pandas scikit-learn pillow imageio tqdm matplotlib
```

> **Note:** TensorFlow ≥ 2.6 is required. Quote the version specifier so the shell does not treat `>=` as a redirection. On recent releases the standard `tensorflow` package already includes GPU support; the separate `tensorflow-gpu` package is deprecated:
> ```bash
> pip install "tensorflow>=2.6"
> ```

---

## Repository Structure

```
Deeplens_Self_Supervised_Learning_Yashwardhan_Deshmukh/
├── byol_learning/                  # BYOL self-supervised pipeline
│   ├── 1.imports.py                # All imports
│   ├── 2.parameters.py             # Hyperparameters (image size, epochs, lr, batch size)
│   ├── 3.augmentation.py           # Data augmentation strategies
│   ├── 4.data_generator.py         # Custom Keras data generator
│   ├── 5.encoder.py                # Encoder/projection head definition
│   ├── 6.resnet50_baseline.py      # Supervised ResNet50 baseline
│   ├── 7.byol_pretraining.py       # BYOL pre-training loop
│   ├── 8.finetuning.py             # Fine-tuning on labelled data
│   ├── 9.eval.py                   # Evaluation: AUC, ROC, confusion matrix
│   └── notebooks/                  # Jupyter notebooks for exploration
├── contrastive_learning/           # Contrastive Learning pipeline
│   ├── 1.imports.py
│   ├── 2.parameters.py
│   ├── 3.data_generator.py
│   ├── 4.augmentations.py          # Rotation & Gaussian Noise augmentations
│   ├── 5.encoder.py
│   ├── 5.vit_encoder.py            # Vision Transformer encoder (experimental)
│   ├── 6.resnet50_baseline.py
│   ├── 7.rotation_pretraining.py   # Contrastive pre-training (Rotation pretext)
│   ├── 8.rotation_finetuning.py
│   ├── 9.gaussian_pretraining.py   # Contrastive pre-training (Gaussian Noise pretext)
│   ├── 10.gaussian_finetuning.py
│   ├── 11.eval.py
│   └── notebooks/
├── real_data/                      # Scripts tested on real observational data
├── regression_notebooks/           # Regression experiments with SSL representations
├── header.png
├── byol.png
├── contrastive.png
├── augmentations.png
└── README.md
```

---

## Usage

The numbered Python scripts in each folder build on one another in order; the commands below run the main stages of each pipeline.

### BYOL Pipeline

```bash
cd byol_learning

# Step 1 – (Optional) Review/adjust imports
python 1.imports.py

# Step 2 – (Optional) Adjust hyperparameters
# Edit 2.parameters.py: width, num_epochs, batch_size, lr, input_shape

# Step 3 – Pre-train the encoder using BYOL
python 7.byol_pretraining.py

# Step 4 – Fine-tune on labelled data
python 8.finetuning.py

# Step 5 – Evaluate the model
python 9.eval.py
```

### Contrastive Learning Pipeline (Rotation Pretext)

```bash
cd contrastive_learning

# Pre-train with rotation pretext task
python 7.rotation_pretraining.py

# Fine-tune
python 8.rotation_finetuning.py

# Evaluate
python 11.eval.py
```
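
Conceptually, the rotation pretext turns unlabelled images into a free 4-way classification problem: each image is rotated by 0°, 90°, 180° or 270°, and the encoder is trained to predict the rotation index. A minimal sketch of the labelling step (illustrative; the repository's augmentations live in `4.augmentations.py`):

```python
import numpy as np

def make_rotation_batch(images):
    """Expand each image into four views rotated by 0/90/180/270 degrees;
    the rotation index k serves as the (free) pretext label."""
    views, labels = [], []
    for img in images:
        for k in range(4):
            views.append(np.rot90(img, k))
            labels.append(k)
    return np.stack(views), np.array(labels)
```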

### Contrastive Learning Pipeline (Gaussian Noise Pretext)

```bash
cd contrastive_learning

# Pre-train with Gaussian noise pretext task
python 9.gaussian_pretraining.py

# Fine-tune
python 10.gaussian_finetuning.py

# Evaluate
python 11.eval.py
```
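
The Gaussian-noise pretext works the same way, except the second view of each image is produced by additive Gaussian noise rather than rotation. A hedged sketch (`sigma=0.1` is an illustrative value, not necessarily the repository's setting):

```python
import numpy as np

def gaussian_noise_view(image, sigma=0.1, seed=None):
    """Return a noisy view of `image` (pixel values assumed in [0, 1])."""
    rng = np.random.default_rng(seed)
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)
```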

### Key Hyperparameters (`2.parameters.py`)

| Parameter | Default | Description |
|:-------------:|:-------:|:----------------------------------:|
| `width` | 128 | Width of the projection head |
| `num_epochs` | 30 | Number of training epochs |
| `batch_size` | 128 | Batch size |
| `lr` | 1e-4 | Learning rate (Adam optimizer) |
| `input_shape` | (64,64,3) | Input image shape |
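
For orientation, `2.parameters.py` might look roughly like the following (a sketch consistent with the defaults above, not a verbatim copy of the file):

```python
# 2.parameters.py -- training hyperparameters (illustrative sketch)
width = 128                # width of the projection head
num_epochs = 30            # number of training epochs
batch_size = 128
lr = 1e-4                  # learning rate for the Adam optimizer
input_shape = (64, 64, 3)  # input image shape (H, W, C)
```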

---

## Contrastive Learning

Contrastive learning is a self-supervised method that learns representations by contrasting positive and negative examples: views of the same image are pulled together in embedding space, while views of different images are pushed apart.

<img src="contrastive.png">
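
For intuition, the standard NT-Xent contrastive objective can be written in a few lines of NumPy. This is a generic sketch of the idea, not the exact loss used in this repository:

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss: for each embedding, its augmented partner is the
    positive; every other embedding in the batch is a negative."""
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    sim = (z @ z.T) / temperature
    np.fill_diagonal(sim, -np.inf)                     # exclude self-similarity
    n = len(z1)
    # the positive pair for row i is row i+n (and i-n in the second half)
    pos_idx = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    log_denom = np.log(np.exp(sim).sum(axis=1))
    return float(np.mean(log_denom - sim[np.arange(2 * n), pos_idx]))
```

The loss is small when each pair of augmented views is more similar to each other than to the rest of the batch.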

## Bootstrap Your Own Latent (BYOL) Learning

BYOL trains two networks — an **online network** and a **target network** — in parallel. Unlike contrastive learning, there are no negative pairs. Two augmented views of the same image are passed through both networks; the online network is trained to predict the target network's representation, while the target network is updated as a moving average of the online network.

<img src="byol.png">
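
The two mechanisms that make this work — the moving-average target update and the prediction loss — can be sketched in NumPy as follows (a conceptual sketch; the BYOL paper additionally anneals `tau` towards 1 over training):

```python
import numpy as np

def ema_update(target_params, online_params, tau=0.99):
    """Update each target parameter as an exponential moving average
    of the corresponding online parameter (BYOL's slow target update)."""
    return [tau * t + (1.0 - tau) * o
            for t, o in zip(target_params, online_params)]

def byol_loss(online_prediction, target_projection):
    """BYOL's regression objective: 2 - 2 * cosine similarity between
    the online network's prediction and the target projection."""
    p = online_prediction / np.linalg.norm(online_prediction, axis=1, keepdims=True)
    z = target_projection / np.linalg.norm(target_projection, axis=1, keepdims=True)
    return float(2.0 - 2.0 * np.mean(np.sum(p * z, axis=1)))
```

Because only the online network receives gradients and the target moves slowly, the two networks cannot trivially collapse to a constant output.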

---

## Classification Results
The values in the Model columns are the AUC for axion, cdm and no_sub respectively.
e.g. `(0.97, 0.96, 1.0)` → 0.97 AUC for axion, 0.96 AUC for cdm, 1.0 AUC for no_sub.

All results were calculated on a **separate test set**.

| NN Architecture | Model I | Model II | Model III |
| :-------------------------------------------: | :-------------: | :-----------------: | :-----------------: |
| Baseline ResNet50 | 0.97, 0.96, 1.0 | 0.98, 0.92, 0.98 | 0.96, 0.95, 0.99 |
| Contrastive Learning Rotation Pretext | 0.92, 0.91, 1.0 | **0.99, 0.99, 1.0** | **1.0, 0.99, 1.0** |
| Contrastive Learning Gaussian Noise Pretext | 0.96, 0.95, 1.0 | **0.99, 0.99, 1.0** | **1.0, 0.99, 1.0** |
| Bootstrap Your Own Latent | 0.95, 0.93, 1.0 | **1.0, 0.98, 1.0** | 0.98, 0.92, 0.98 |

---

## Conclusion and Future Goals
Self-supervised pre-training matches or outperforms the supervised ResNet50 baseline on Models II and III (both contrastive pretexts beat it on both, and BYOL beats it on Model II), while on Model I the baseline remains slightly ahead.

Future goals:
- Test SSL representations for regression tasks
- Implement Vision Transformer (ViT) twin networks for SSL
- Explore additional SSL methods (SimSiam, MoCo, etc.)
- Evaluate on real observational lensing data
@@ -0,0 +1,9 @@
tensorflow>=2.10.0
keras>=2.10.0
numpy>=1.21.0
pandas>=1.3.0
matplotlib>=3.5.0
pillow>=9.0.0
scikit-learn>=1.0.0
tqdm>=4.62.0
imageio>=2.19.0
22 changes: 22 additions & 0 deletions Transformers_Classification_DeepLense_Kartik_Sachdev/README.md
@@ -88,6 +88,7 @@ python3 main.py \
| train_config | Transformer config: [CvT, CCT, TwinsSVT, LeViT, CaiT, CrossViT, PiT, Swin, T2TViT, CrossFormer] |
| cuda | Use cuda |
| no-cuda | Not use cuda |
| seed | Random seed for reproducible experiments (default: 42) |


### __Self-Supervised Learning__
@@ -136,6 +137,7 @@ python3 main_ray.py \
| cuda | Use cuda |
| no-cuda | Not use cuda |
| num_samples | Number of samples for [ASHA scheduler](https://docs.ray.io/en/latest/tune/api_docs/schedulers.html) |
| seed | Random seed for reproducible experiments (default: 42) |


<br>
@@ -155,6 +157,26 @@ sbatch < jobscript.sh
```
<br>

# __Reproducibility__

All training scripts support a `--seed` argument to ensure reproducible experiments. This sets seeds for Python's `random`, NumPy, PyTorch CPU, and CUDA:

```bash
# Supervised training with a custom seed
python3 main.py --dataset_name Model_II --train_config CvT --seed 123

# SSL pre-training with a custom seed
python3 pretrain.py --seed 123

# Finetuning with a custom seed
python3 finetune.py --seed 123
python3 finetune_byol.py --seed 123
```

The default seed is `42` if `--seed` is not specified.
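
The helper behind `--seed` can be sketched roughly as follows (a generic version; the repository's actual `seed_everything` in `utils/util.py` may differ in detail):

```python
import os
import random

import numpy as np

def seed_everything(seed: int = 42) -> None:
    """Seed Python, NumPy and (when installed) PyTorch CPU/CUDA RNGs."""
    random.seed(seed)
    np.random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # no-op without CUDA devices
    except ImportError:
        pass
```

Note that full determinism on GPU may additionally require cuDNN's deterministic mode; seeding alone makes runs repeatable but not necessarily bit-identical across hardware.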

<br>

# __Results__

## __Self-Supervised Learning__
19 changes: 17 additions & 2 deletions Transformers_Classification_DeepLense_Kartik_Sachdev/finetune.py
@@ -1,3 +1,4 @@
import os
import torch
import torch.nn as nn
import torch.optim as optim
@@ -6,13 +7,27 @@
from models.cnn_zoo import CustomResNet
from utils.losses.contrastive_loss import ContrastiveLossEuclidean
from utils.train import train_simplistic
from utils.util import load_model_add_head
from utils.util import load_model_add_head, seed_everything
from torchsummary import summary
from argparse import ArgumentParser

parser = ArgumentParser()
parser.add_argument(
"--seed", type=int, default=42, help="random seed for reproducible experiments"
)
args = parser.parse_args()

# Set reproducible seed
seed_everything(seed=args.seed)


# Set device
device = "cuda" # torch.device("cuda" if torch.cuda.is_available() else "cpu")
learning_method = "contrastive_embedding"
saved_model_path = "/home/kartik/git/deepLense_transformer_ssl/output/pretrained_contrastive_embedding.pth"

# Define base directory relative to script location
BASE_DIR = os.path.dirname(os.path.abspath(__file__))
saved_model_path = os.path.join(BASE_DIR, "output", "pretrained_contrastive_embedding.pth")

# Set hyperparameters
batch_size = 128
@@ -1,3 +1,4 @@
import os
import torch
import torch.nn as nn
import torch.optim as optim
@@ -12,16 +13,37 @@
load_model_add_head,
get_second_last_layer,
get_last_layer_features,
seed_everything,
)
from torchsummary import summary
from models.byol import BYOLSingleChannel, FinetuneModelByol
import torchvision
from models.utils.finetune_model import FinetuneModel
from argparse import ArgumentParser

parser = ArgumentParser()
parser.add_argument(
"--seed", type=int, default=42, help="random seed for reproducible experiments"
)
args = parser.parse_args()

# Set reproducible seed
seed_everything(seed=args.seed)


# Set device
device = "cuda" # torch.device("cuda" if torch.cuda.is_available() else "cpu")
learning_method = "contrastive_embedding"
saved_model_path = "/home/kartik/git/DeepLense/Transformers_Classification_DeepLense_Kartik_Sachdev/logger/2023-07-23-13-30-24/checkpoint/Resnet_finetune_Model_II_2023-07-23-13-30-24.pt"

# Define base directory relative to script location
BASE_DIR = os.path.dirname(os.path.abspath(__file__))
saved_model_path = os.path.join(
BASE_DIR,
"logger",
"2023-07-23-13-30-24",
"checkpoint",
"Resnet_finetune_Model_II_2023-07-23-13-30-24.pt"
)

# Set hyperparameters
batch_size = 512
17 changes: 14 additions & 3 deletions Transformers_Classification_DeepLense_Kartik_Sachdev/inference.py
@@ -1,5 +1,5 @@
from __future__ import print_function

import os
from turtle import down

from utils.dataset import DefaultDatasetSetupSSL
@@ -19,8 +19,19 @@ def main():
labels_map = {0: "axion", 1: "cdm", 2: "no_sub"}
image_size = 224
channels = 1
log_dir = "/home/kartik/git/DeepLense/Transformers_Classification_DeepLense_Kartik_Sachdev/logger/2023-07-23-13-30-24"
finetune_model_path = "/home/kartik/git/DeepLense/Transformers_Classification_DeepLense_Kartik_Sachdev/logger/2023-07-23-13-30-24/checkpoint/Resnet_finetune_Model_II.pt"

# Define base directory relative to script location
BASE_DIR = os.path.dirname(os.path.abspath(__file__))

# Use relative paths instead of hardcoded absolute paths
log_dir = os.path.join(BASE_DIR, "logger", "2023-07-23-13-30-24")
finetune_model_path = os.path.join(
BASE_DIR,
"logger",
"2023-07-23-13-30-24",
"checkpoint",
"Resnet_finetune_Model_II.pt"
)
batch_size = 512
num_workers = 8
