Robust Wild Face Restoration with Diffusion, LoRA, and CodeFormer Guidance
This repository implements a Robust Wild Face Restoration (RWFR) pipeline based on DiffBIR v2, enhanced with:
- LoRA fine-tuning (UNet + ControlNet),
- on-the-fly CodeFormer guidance for face priors,
- support for unaligned real-world faces (NTIRE-style setting).
The project supports:
- Stage-2 fine-tuning (LoRA training),
- Inference on unaligned images with CodeFormer-guided Diffusion.
Inference flow (Unaligned BFR):
- Input image (unaligned, wild face)
- Face detection & alignment (DiffBIR internal)
- CodeFormer restores aligned face (512×512)
- CodeFormer output is injected into ControlNet
- Diffusion restores face + background
- Faces are pasted back into the original image
Note: CodeFormer is used only as a face prior, not as a standalone enhancer.
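The flow above can be sketched as a small orchestration function. This is only an illustration of the order of operations. Every callable name here (`detect_and_align`, `codeformer_prior`, `restore_face`, `restore_background`, `paste_back`) is a hypothetical placeholder, not the API of `unaligned_bfr_loop.py`, and the split into separate face and background restoration calls is an assumption.

```python
def restore_unaligned(image, detect_and_align, codeformer_prior,
                      restore_face, restore_background, paste_back):
    """Run the unaligned restoration flow with injected (hypothetical) stages."""
    # Detection and alignment happen once; each entry pairs an aligned crop
    # with whatever is needed to map it back into the original image.
    faces = detect_and_align(image)

    # The background is restored separately from the face crops (assumption).
    canvas = restore_background(image)

    for aligned_face, inverse_transform in faces:
        # CodeFormer only supplies a prior for the aligned crop.
        prior = codeformer_prior(aligned_face)
        # The diffusion model restores the crop, conditioned on that prior.
        restored = restore_face(aligned_face, prior)
        # The restored crop is pasted back at its original location.
        canvas = paste_back(canvas, restored, inverse_transform)
    return canvas
```

Passing the stages in as arguments keeps the sketch independent of any particular model code, so it can be exercised with simple stubs.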
Code-Diff-Lo-RWFR/
├── diffbir/
│   ├── inference/
│   │   ├── loop.py
│   │   └── unaligned_bfr_loop.py   # main unaligned inference logic
│   ├── utils/
│   │   ├── common.py               # model loading (ckpt + safetensors)
│   │   └── codeformer_wrapper.py   # CodeFormer face restorer
│   └── model/
│       └── cldm.py
├── configs/
│   ├── train/
│   │   └── train_stage2.yaml       # LoRA training config
│   └── inference/
│       ├── bsrnet.yaml
│       └── swinir.yaml
├── train_stage2.py
├── inference.py
├── requirements.txt
└── README.md
wget https://huggingface.co/lxq007/DiffBIR/resolve/main/face_swinir_v1.ckpt
wget https://huggingface.co/Manojb/stable-diffusion-2-1-base/resolve/main/v2-1_512-ema-pruned.safetensors
from safetensors.torch import load_file
import torch
ckpt = load_file("v2-1_512-ema-pruned.safetensors")
torch.save(
{"state_dict": ckpt, "global_step": 0},
"v2-1_512-ema-pruned.ckpt"
)
print("✅ Converted safetensors → ckpt")
sd = torch.load("v2-1_512-ema-pruned.ckpt", map_location="cpu", weights_only=False)
print(type(sd), sd.keys())
Expected output
✅ Converted safetensors → ckpt
<class 'dict'> dict_keys(['state_dict', 'global_step'])
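For reference, a loader that accepts either format could dispatch on the file extension, as in the minimal sketch below. It is not the code in `diffbir/utils/common.py`. The function name `load_state_dict` and the handling of a wrapping `state_dict` key are assumptions, based on the converted checkpoint layout shown above.

```python
from pathlib import Path


def load_state_dict(path, device="cpu"):
    """Load weights from a .safetensors or .ckpt/.pt/.pth file (illustrative helper)."""
    suffix = Path(path).suffix.lower()

    if suffix == ".safetensors":
        # Imported lazily so the helper can be imported without safetensors installed.
        from safetensors.torch import load_file
        return load_file(path, device=device)

    if suffix in {".ckpt", ".pt", ".pth"}:
        import torch
        obj = torch.load(path, map_location=device, weights_only=False)
        # Checkpoints saved as {"state_dict": ..., "global_step": ...} are unwrapped.
        if isinstance(obj, dict) and "state_dict" in obj:
            return obj["state_dict"]
        return obj

    raise ValueError(f"Unsupported checkpoint format: {suffix!r}")
```

With a helper like this, the conversion step is only needed when downstream code insists on the `.ckpt` layout.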
- Python ≥ 3.9
- CUDA GPU recommended
- Make sure LQ, GT, and CodeFormer outputs share identical filenames
1️⃣ Clone the repository
git clone <this-repo-url>
cd Code-Diff-Lo-RWFR
2️⃣ Install dependencies
Install the packages listed in requirements.txt.
Torch and xFormers must be installed manually to match your GPU.
⚠️ On Pascal GPUs (P100):
- Do not install xFormers
- Use system torch
⚡ On Ampere / RTX GPUs:
pip install torch==2.2.2+cu118 torchvision==0.17.2+cu118 torchaudio==2.2.2+cu118 \
--extra-index-url https://download.pytorch.org/whl/cu118
pip install xformers==0.0.27.post2 --no-deps
3️⃣ Prepare dataset
- Place the LQ, GT, and CF (CodeFormer output) images in their respective folders
- Filenames must match exactly
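Because a single mismatched name can silently break the LQ / GT / CF pairing, a quick check before training is worthwhile. The sketch below is not part of this repository. It assumes each set lives in its own flat folder, and the helper names are made up for illustration.

```python
from pathlib import Path


def list_filenames(folder):
    """Return the set of file names (with extension) directly inside folder."""
    return {p.name for p in Path(folder).iterdir() if p.is_file()}


def find_unmatched(lq_dir, gt_dir, cf_dir):
    """Map each folder label to the file names that are not present in all three folders."""
    names = {
        "LQ": list_filenames(lq_dir),
        "GT": list_filenames(gt_dir),
        "CF": list_filenames(cf_dir),
    }
    shared = names["LQ"] & names["GT"] & names["CF"]
    # Only folders that actually contain unmatched files are reported.
    return {
        label: sorted(files - shared)
        for label, files in names.items()
        if files - shared
    }
```

An empty result means the three folders agree exactly. The comparison is case-sensitive, so `0002.png` and `0002.PNG` count as different files.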
4️⃣ Configure training
Edit:
configs/train/train_stage2.yaml
Key sections to adjust:
- train: (batch size, lr, iterations)
- dataset: (paths)
- lora: (rank_unet, rank_controlnet)
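A small sanity check on the edited config can catch a missing section before a long training run starts. The sketch below assumes the YAML has already been parsed into a plain dictionary (for example with a YAML loader) and that `rank_unet` and `rank_controlnet` sit directly under `lora`. The helper name is hypothetical, and the other keys inside `train` and `dataset` are not checked because their exact names are not listed here.

```python
# Sections named above, with the keys known to live inside each one.
REQUIRED = {
    "train": (),
    "dataset": (),
    "lora": ("rank_unet", "rank_controlnet"),
}


def missing_config_keys(cfg):
    """Return dotted names of required sections or keys absent from cfg."""
    missing = []
    for section, keys in REQUIRED.items():
        if section not in cfg:
            missing.append(section)
            continue
        # An empty YAML section parses to None, so fall back to an empty dict.
        section_cfg = cfg[section] or {}
        for key in keys:
            if key not in section_cfg:
                missing.append(f"{section}.{key}")
    return missing
```

An empty list means all three sections and both rank settings are present.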
5️⃣ Launch training
accelerate launch train_stage2.py \
--config configs/train/train_stage2.yaml
LoRA checkpoints will be saved in the experiment directory.
This project requires CodeFormer during inference.
CodeFormer is not vendored into this repo by design.
git clone <this-repo-url>
git clone https://github.com/sczhou/CodeFormer
Directory layout:
.
├── Code-Diff-Lo-RWFR
└── CodeFormer
This project optionally uses AdaFace as an identity embedding network during CHECKPOINT SELECTION only.
AdaFace is NOT vendored into this repository by design.
Clone the official AdaFace repository inside this repo:
cd dir/Code-Diff-Lo-RWFR
git clone https://github.com/mk-minchul/AdaFace.git
Install dependencies (skip this step: they are already included in this repo's main requirements.txt)
⚠️ Do NOT install AdaFace's original PyTorch or Lightning requirements.
This project uses modern PyTorch (2.x) and only requires AdaFace
for forward embedding inference, not training.
pip install -r Code-Diff-Lo-RWFR/requirements.txt
Install Torch / xFormers as per your GPU (see training section).
cd CodeFormer
python basicsr/setup.py develop
- LQ images (unaligned faces)
- LoRA weights from training
- Stage-1 + SD v2.1 weights (CodeFormer weights download automatically)
cd Code-Diff-Lo-RWFR
python inference.py \
--task unaligned_face \
--upscale 2 \
--version v2 \
--sampler spaced \
--steps 50 \
--cfg_scale 4.0 \
--captioner none \
--pos_prompt "" \
--neg_prompt "low quality, blurry, low-resolution, noisy, unsharp, weird textures" \
--input /path/to/LQ_images \
--output /path/to/output_dir \
--lora_path /path/to/lora_checkpoint.pt \
--rank_unet 64 \
--rank_controlnet 16 \
--batch_size 1 \
--n_samples 1 \
--precision fp32 \
--device cuda
After running inference for multiple LoRA checkpoints:
python diffbir/inference/select_best_lora.py
The script:
- Computes CLIP-IQA, MANIQA, MUSIQ, NIQE
- Applies AdaFace identity gating
- Ranks checkpoints using NTIRE weighted score
Output:
BEST CHECKPOINT: lora_0000200
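The gate-then-rank idea can be illustrated with a few lines of Python. This is a sketch under stated assumptions, not the logic of `select_best_lora.py`. The weights, the identity threshold, the `id_sim` field name, and all metric values below are invented placeholders, and the real NTIRE weighting may differ. NIQE gets a negative weight here because lower NIQE is better.

```python
def rank_checkpoints(metrics, weights, id_threshold):
    """Drop checkpoints failing the identity gate, then sort the rest by weighted score."""
    scored = []
    for name, m in metrics.items():
        # Identity gate: skip checkpoints whose identity similarity is too low.
        if m["id_sim"] < id_threshold:
            continue
        score = sum(w * m[key] for key, w in weights.items())
        scored.append((score, name))
    scored.sort(reverse=True)  # highest score first
    return [name for _, name in scored]


# Placeholder weights and made-up metric values, for illustration only.
weights = {"clipiqa": 1.0, "maniqa": 1.0, "musiq": 0.01, "niqe": -0.1}
metrics = {
    "lora_0000100": {"clipiqa": 0.60, "maniqa": 0.40, "musiq": 70.0, "niqe": 5.0, "id_sim": 0.50},
    "lora_0000200": {"clipiqa": 0.70, "maniqa": 0.45, "musiq": 72.0, "niqe": 4.5, "id_sim": 0.45},
    "lora_0000300": {"clipiqa": 0.80, "maniqa": 0.50, "musiq": 75.0, "niqe": 4.0, "id_sim": 0.10},
}
ranking = rank_checkpoints(metrics, weights, id_threshold=0.3)
```

In this toy data `lora_0000300` has the best quality metrics but fails the identity gate, so it is excluded before ranking. That is the purpose of gating: a checkpoint that looks sharper but changes the person's identity should not win.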
| Task | File |
|---|---|
| Training config | configs/train/train_stage2.yaml |
| Inference logic | diffbir/inference/unaligned_bfr_loop.py |
| CodeFormer usage | diffbir/utils/codeformer_wrapper.py |
| Model loading (ckpt / safetensors) | diffbir/utils/common.py |
| Main inference CLI | inference.py |
- CodeFormer is used only on aligned face crops
- Background fusion is handled by DiffBIR
- LoRA is trained jointly on UNet + ControlNet
- ControlNet guidance is preserved via CF face priors
- No double face detection or alignment occurs
This design avoids identity drift and boundary artifacts while improving realism.
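To give a feel for what the two rank settings mean (`--rank_unet 64` and `--rank_controlnet 16` in the inference command), the sketch below counts trainable parameters under the standard LoRA formulation, where the update to a `d_out × d_in` weight is factored as `B @ A` with `B` of shape `d_out × r` and `A` of shape `r × d_in`. It assumes this repository follows that formulation. The 1024×1024 layer size is purely illustrative and does not refer to a specific layer of the model.

```python
def full_params(d_in, d_out):
    """Parameters in a dense d_out x d_in weight matrix."""
    return d_in * d_out


def lora_extra_params(d_in, d_out, rank):
    """Trainable parameters added by a rank-r LoRA update B (d_out x r) @ A (r x d_in)."""
    return rank * (d_in + d_out)


d = 1024  # illustrative layer width, not taken from the model
for rank in (64, 16):
    extra = lora_extra_params(d, d, rank)
    share = extra / full_params(d, d)
    print(f"rank={rank}: {extra} trainable params ({share:.3%} of the full matrix)")
```

For a layer of this size, rank 64 trains 12.5% as many parameters as the full matrix and rank 16 trains about 3.1%, which is why a higher rank gives the update more capacity at a higher memory cost.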
- DiffBIR: original authors
- CodeFormer: Zhou et al.
- Stable Diffusion v2.1: Stability AI
- NTIRE RWFR Challenge