Mirage Pipeline #1

Open

DavidBert wants to merge 39 commits into main from mirage
Conversation

@DavidBert (Collaborator) commented Sep 26, 2025

Mirage pipeline

Fixes # (issue)

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@photoroman (Collaborator) left a comment

I did a brief initial pass. This looks really great! Thank you!

The main missing part for me is more docs. I think we should add docstrings to all public classes and methods. We should probably also add docs and examples in docs/source/en/api/pipelines/mirage.md.

if vae_type == "flux":
config_path = "/raid/shared/storage/home/davidb/diffusers/diffusers_pipeline_checkpoints/pipeline_checkpoint_fluxvae_gemmaT5_updated/transformer/config.json"
elif vae_type == "dc-ae":
config_path = "/raid/shared/storage/home/davidb/diffusers/diffusers_pipeline_checkpoints/pipeline_checkpoint_dcae_gemmaT5_updated/transformer/config.json"
Collaborator:

Should we change these hardcoded paths?

Collaborator (Author):

Absolutely!

#!/usr/bin/env python3
"""
Script to convert Mirage checkpoint from original codebase to diffusers format.
"""
Collaborator:

Should we release this script or put it into the computer vision repo and release the converted checkpoints?

Collaborator:

I think the idea is to keep this only internally, right?

Collaborator (Author):

Looking at other scripts in this folder, it seems that most companies actually put these kinds of scripts in diffusers.

mapping = {}

# RMSNorm: scale -> weight
for i in range(16): # 16 layers
Collaborator:

Use a MIRAGE_NUM_LAYERS: int = 16 constant at the top?

Collaborator (Author):

I changed it to come from the config instead.
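A minimal sketch of the config-driven approach (the "num_layers" key name is an assumption, not necessarily the real config field):

```python
# Hypothetical sketch: derive the layer count from the transformer config
# instead of hardcoding 16. The key name "num_layers" is an assumption.
def build_rmsnorm_mapping(config: dict) -> dict:
    """Map original RMSNorm keys (scale) to diffusers keys (weight)."""
    num_layers = config.get("num_layers", 16)  # old hardcoded value as fallback
    mapping = {}
    for i in range(num_layers):
        mapping[f"layers.{i}.norm.scale"] = f"layers.{i}.norm.weight"
    return mapping
```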

if vae_type == "flux":
ref_pipeline = "/raid/shared/storage/home/davidb/diffusers/diffusers_pipeline_checkpoints/pipeline_checkpoint_fluxvae_gemmaT5_updated"
else: # dc-ae
ref_pipeline = "/raid/shared/storage/home/davidb/diffusers/diffusers_pipeline_checkpoints/pipeline_checkpoint_dcae_gemmaT5_updated"
Collaborator:

Should we change these hardcoded paths?

Collaborator (Author):

I removed all dependencies on the previous ref pipeline; this was a mistake.

return mapping


def convert_checkpoint_parameters(old_state_dict: dict) -> dict:
Collaborator:

Probably good to use more specific types, like Dict[str, str]. Import Dict from typing: the diffusers library supports Python 3.8, and built-in types with generics, e.g. dict[str, str], are only supported since Python 3.9.
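Something like this sketch (function bodies are toy stand-ins, only the annotation style is the point):

```python
# Sketch of Python 3.8-compatible annotations: typing.Dict instead of the
# built-in generic dict[str, str] (3.9+ only). Bodies are illustrative only.
from typing import Any, Dict


def create_parameter_mapping() -> Dict[str, str]:
    """Map original checkpoint keys to diffusers keys (toy example)."""
    return {"final_norm.scale": "final_norm.weight"}


def convert_checkpoint_parameters(old_state_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Rename entries of old_state_dict according to the key mapping."""
    mapping = create_parameter_mapping()
    return {mapping.get(key, key): value for key, value in old_state_dict.items()}
```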

logger = logging.get_logger(__name__)


def get_image_ids(bs: int, h: int, w: int, patch_size: int, device: torch.device) -> Tensor:
Collaborator:

  • Use more readable variable names: batch_size, height, width.
  • Add docstrings for all public methods and classes.
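A pure-Python stand-in for the torch version, just to illustrate the suggested names and docstring (the (row, col) id layout is an assumption, not the actual implementation):

```python
# Stand-in sketch for get_image_ids with readable names and a docstring.
# The (row, col) positional-id layout is an assumption.
from typing import List, Tuple


def get_image_ids(batch_size: int, height: int, width: int, patch_size: int) -> List[List[Tuple[int, int]]]:
    """Return per-sample positional ids, one (row, col) pair per patch.

    Args:
        batch_size: number of samples in the batch.
        height: image height in pixels.
        width: image width in pixels.
        patch_size: side length of one square patch.
    """
    rows, cols = height // patch_size, width // patch_size
    ids = [(row, col) for row in range(rows) for col in range(cols)]
    return [list(ids) for _ in range(batch_size)]
```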

assert attention_mask.dim() == 2, f"Unsupported attention_mask shape: {attention_mask.shape}"
assert attention_mask.shape[-1] == l_txt, (
f"attention_mask last dim {attention_mask.shape[-1]} must equal text length {l_txt}"
)
Collaborator:

Should these be checked as if conditions that raise ValueError? Usually asserts are there to catch programming errors, not for input validation.
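A sketch of the suggested change (variable names mirror the snippet above; explicit raises also survive `python -O`, which strips asserts):

```python
# Suggested input validation: explicit ValueError instead of assert
# (asserts are stripped under `python -O`). Names mirror the snippet above.
def validate_attention_mask(attention_mask, l_txt: int) -> None:
    """Raise ValueError if attention_mask has an unsupported shape."""
    if attention_mask.dim() != 2:
        raise ValueError(f"Unsupported attention_mask shape: {attention_mask.shape}")
    if attention_mask.shape[-1] != l_txt:
        raise ValueError(
            f"attention_mask last dim {attention_mask.shape[-1]} must equal text length {l_txt}"
        )
```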

vae ([`AutoencoderKL`] or [`AutoencoderDC`]):
Variational Auto-Encoder (VAE) Model to encode and decode images to and from latent representations.
Supports both AutoencoderKL (8x compression) and AutoencoderDC (32x compression).
"""
Collaborator:

I think in addition to this, we need to add docs about Mirage in docs/source/en/api/pipelines/mirage.md. Have a look at the section "Adding a new pipeline/scheduler" in docs/README.md.


# 0. Default height and width to transformer config
height = height or 256
width = width or 256
Collaborator:

nit: put 256 into a constant.

Collaborator (Author) @DavidBert commented Sep 30, 2025:

I specify it in the checkpoint config now, falling back to a constant when it is not specified.
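Roughly this pattern (the attribute name default_sample_size is a hypothetical stand-in, not the real Mirage config key):

```python
# Hedged sketch of "config value with a constant fallback"; the attribute
# name default_sample_size is an assumption, not the real config key.
DEFAULT_SAMPLE_SIZE = 256


def resolve_height_width(height, width, config):
    """Fall back to the config's default size, then to the module constant."""
    default = getattr(config, "default_sample_size", None) or DEFAULT_SAMPLE_SIZE
    return height or default, width or default
```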

)

# Convert back to image format
from ...models.transformers.transformer_mirage import seq2img
Collaborator:

Any reason not to import this at the top?

Collaborator:

I've become quite keen on doing in-code imports for cases like this one, where seq2img is only used once in the whole file :D

Collaborator (Author):

I moved it to the top.

"""


class MiragePipeline(
Collaborator:

I remember Eliot started a thread about maybe changing the name of Mirage to something else, since there's already an ML model with that name. Did that end up going anywhere?

@photoroman (Collaborator) left a comment

LGTM!

EDIT: We should still discuss the name "Mirage" though before opening a PR to upstream.

pipe = MiragePipeline.from_pretrained("path/to/mirage_checkpoint")
pipe.to("cuda")

prompt = "A digital painting of a rusty, vintage tram on a sandy beach"
Collaborator:

We should use a more viral example prompt. Once we choose a viral name for the model, we can match the prompt to the name ;) Maybe we have to activate the Photoroom marketing team for this one.

from diffusers import MiragePipeline

# Load pipeline - VAE and text encoder will be loaded from HuggingFace
pipe = MiragePipeline.from_pretrained("path/to/mirage_checkpoint")
Collaborator:

I guess we'll be able to store the checkpoint on Hugging Face as well, right? If yes, we should not forget to update the paths here to the official one, to make this truly copy-paste and run.

@photoroman (Collaborator) left a comment

Had a quick look. Great! I trust Claude renamed everything correctly.

pipe.to("cuda")

prompt = "A digital painting of a rusty, vintage tram on a sandy beach"
prompt = "A vibrant night sky filled with colorful fireworks, with one large firework burst forming the glowing text “Photon” in bright, sparkling light"
Collaborator:

Haha, awesome!
