A high-performance image generation script that leverages multiple GPUs to generate hyper-realistic images using the FLUX.1-Krea model with Super Realism LoRA weights. This implementation uses multiprocessing to distribute image generation across available GPUs for maximum efficiency.
- Multi-GPU support: Automatically detects GPUs and distributes image generation tasks among them.
- LoRA weight integration for enhanced style adaptation.
- Multiple resolution and style prompt templates.
- Random seed control for reproducibility or random generation.
- Output images saved individually or as a zip archive.
- Configurable generation parameters: number of images, prompt, resolution, guidance scale, and inference steps.
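The save-and-zip behavior described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the function name `save_images_and_zip`, the `outputs` directory, and the archive name are assumptions.

```python
import os
import uuid
import zipfile

def save_images_and_zip(images, out_dir="outputs", make_zip=True):
    """Save each PIL image under a unique UUID filename, then optionally zip them."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for img in images:
        path = os.path.join(out_dir, f"{uuid.uuid4()}.png")
        img.save(path)  # PIL.Image.save
        paths.append(path)
    if make_zip:
        zip_path = os.path.join(out_dir, "images.zip")
        with zipfile.ZipFile(zip_path, "w") as zf:
            for p in paths:
                zf.write(p, arcname=os.path.basename(p))
        return paths, zip_path
    return paths, None
```

Saving each file individually first means the zip step is optional and can be skipped when serving images directly.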
- Python 3.8+
- PyTorch with CUDA and bfloat16 support
- NVIDIA GPUs with CUDA drivers installed
- Install dependencies with:
pip install -r requirements.txt

Example requirements.txt contents:
git+https://github.com/huggingface/diffusers.git
git+https://github.com/huggingface/transformers.git
git+https://github.com/huggingface/accelerate.git
git+https://github.com/huggingface/peft
huggingface_hub
sentencepiece
torch
pillow
hf_xet
numpy
torchvision
protobuf
gradio # optional if using Gradio interface
Edit the script app.py to specify your prompt and generation parameters:
prompt = "Your prompt here"
num_images = 10

Run the script:

python app.py

The script will detect your GPUs and perform image generation in parallel, saving images locally and optionally zipping them.
- save_image(img): Saves a PIL image with a unique UUID filename.
- randomize_seed_fn(seed, randomize_seed): Handles seed randomization if enabled.
- apply_style(style_name, positive): Applies preset style prompts to the positive prompt.
- generate_on_gpu(args): Loads the model and LoRA weights in a subprocess pinned to a specific GPU and generates images for its assigned prompt batch.
- generate(...): Main controller function; manages GPU count, divides the workload, triggers multiprocessing, and handles output zipping.
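The seed helper plausibly works along these lines. This is a hedged sketch, not the script's verbatim code; the `MAX_SEED` bound is an assumption.

```python
import random

MAX_SEED = 2**32 - 1  # assumed upper bound for valid generator seeds

def randomize_seed_fn(seed: int, randomize_seed: bool) -> int:
    """Return a fresh random seed when randomization is enabled; otherwise pass the seed through."""
    if randomize_seed:
        seed = random.randint(0, MAX_SEED)
    return seed
```

Passing the fixed seed through unchanged is what makes runs reproducible when randomization is off.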
- Utilizes torch.cuda.set_device(gpu_id) to direct the workload in each subprocess to its assigned GPU.
- Loads the full pipeline and LoRA weights into each GPU's memory context.
- Multiprocessing pool splits image generation workload evenly across all GPUs.
- Each GPU subprocess generates images independently and returns paths and generation duration.
- Zips output images for convenient download if enabled.
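The even split across GPUs can be sketched as below. The function name `split_workload` and the task tuple layout are illustrative assumptions, not the script's actual API.

```python
from multiprocessing import Pool  # each worker would pin its own GPU via torch.cuda.set_device
from typing import List

def split_workload(num_images: int, num_gpus: int) -> List[int]:
    """Divide num_images as evenly as possible across num_gpus workers."""
    base, extra = divmod(num_images, num_gpus)
    # The first `extra` workers each take one additional image.
    return [base + (1 if i < extra else 0) for i in range(num_gpus)]

# Each count would become one pool task, roughly:
# with Pool(num_gpus) as pool:
#     results = pool.map(generate_on_gpu,
#                        [(gpu_id, count, prompt) for gpu_id, count in enumerate(counts)])
```

For example, 10 images on 4 GPUs yields batches of [3, 3, 2, 2], so no GPU sits idle while another works through a long tail.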
- Ensure your environment has sufficient GPU VRAM to load the model with LoRA.
- If only one GPU is available, the script automatically falls back to single-GPU generation.
- Multiprocessing incurs some overhead but significantly speeds up batch generation.
- Seed management ensures reproducible or random image generations as needed.
- Modify or extend style_list to add custom prompt templates.
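A style_list entry likely follows the common name-plus-template pattern used in diffusers demos; the exact entries and structure below are assumptions for illustration.

```python
# Hypothetical style templates; "{prompt}" marks where the user prompt is substituted.
style_list = [
    {"name": "Photorealistic", "prompt": "hyper-realistic photo of {prompt}, 8k, detailed skin texture"},
    {"name": "Cinematic", "prompt": "cinematic still of {prompt}, dramatic lighting, film grain"},
]
styles = {s["name"]: s["prompt"] for s in style_list}

def apply_style(style_name: str, positive: str) -> str:
    """Substitute the user prompt into the chosen style template; unknown styles pass through."""
    template = styles.get(style_name, "{prompt}")
    return template.replace("{prompt}", positive)
```

Adding a custom template is then a one-line append to style_list.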
GitHub: https://github.com/PRITHIVSAKTHIUR/Flux-Krea-multi-GPU-Pool
Flux.1-Krea-Merged-Dev: prithivMLmods/flux1-krea-merged-dev