sd: relax size restrictions for DiT models #1986

Open
wbruna wants to merge 1 commit into LostRuins:concedo_experimental from wbruna:kcpp_sd_dit_image_size

@wbruna wbruna commented Feb 21, 2026

Round image dimensions to the specific multiple required by each DiT model. The required multiple ranges from 32 (certain Wan models) down to 1 (Chroma Radiance), with most models requiring multiples of 8 or 16. UNet models are still rounded to multiples of 64.

Current sd.cpp rounds the sizes internally, but it always rounds up, so we still need to round on our side to apply image size restrictions and to trigger VAE tiling correctly.

Also, remove a legacy test that could abort a generation with unsupported image sizes: it'd never run, because it was applied after the image side adjustments.
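As a rough illustration of why the rounding direction matters, the sketch below contrasts rounding down with rounding up. The helper names are hypothetical and are not taken from sd.cpp or from this patch; the multiple of 16 is just one of the values mentioned above.

```cpp
// Hypothetical helpers for illustration only, not the sd.cpp or KoboldCpp code.
// Both assume side > 0 and multiple > 0.

// Round a side length down to the nearest multiple.
constexpr int round_down_to_multiple(int side, int multiple) {
    return (side / multiple) * multiple;
}

// Round a side length up to the nearest multiple.
constexpr int round_up_to_multiple(int side, int multiple) {
    return ((side + multiple - 1) / multiple) * multiple;
}

// A 1000-pixel request for a model that needs multiples of 16:
static_assert(round_down_to_multiple(1000, 16) == 992, "stays within the request");
static_assert(round_up_to_multiple(1000, 16) == 1008, "grows past the request");
// Values that are already aligned are unchanged either way.
static_assert(round_down_to_multiple(1024, 16) == 1024, "already aligned");
static_assert(round_up_to_multiple(1024, 16) == 1024, "already aligned");
```

If only the round-up variant were applied downstream, a request sitting just under a size limit could end up above it, which is presumably why the size is fixed on the server side first.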


wbruna commented Feb 21, 2026

I'm not sure what the best approach for the stable-ui side would be. Maybe a new config item, like "Allow Larger Params", to change the granularity, still defaulting to 64? We could explain in a tooltip that 64 is the most compatible value, and that other values could be rounded by the server to model-specific multiples. Not that I am eager to program that or anything 🙂

@LostRuins
Owner

stable UI allows editing the slider, so no issue


However, are you sure it's safe to allow non-multiples of 64? Last time I tried, it resulted in various crashes and incoherent output.


    } else {
        if (params.width <= 0 || params.width % 64 != 0 || params.height <= 0 || params.height % 64 != 0) {
Owner


Still needed for handling negative numbers, I think.

Author


The sd_fix_resolution function deals with negatives right at its beginning:

    width = std::max(std::min(width, 8192), img_side_min);
    height = std::max(std::min(height, 8192), img_side_min);
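To see how that clamp treats out-of-range inputs, here is a minimal sketch that wraps the same expression in a hypothetical helper. The name clamp_side is made up for this example, and img_side_min is assumed to be 64 here; the real value is whatever sd_fix_resolution uses.

```cpp
#include <algorithm>

// Hypothetical wrapper around the clamp expression quoted above.
// img_side_min = 64 is an assumption made for this sketch.
constexpr int clamp_side(int side, int img_side_min = 64) {
    return std::max(std::min(side, 8192), img_side_min);
}

static_assert(clamp_side(-1) == 64, "negative sides are raised to the minimum");
static_assert(clamp_side(0) == 64, "zero is raised to the minimum");
static_assert(clamp_side(512) == 512, "in-range values pass through");
static_assert(clamp_side(100000) == 8192, "oversized values are capped");
```

Because the clamp runs first, a later check for width <= 0 or height <= 0 could no longer observe a non-positive value.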


wbruna commented Feb 21, 2026

> However, are you sure it's safe to allow non-multiples of 64? Last time I tried, it resulted in various crashes and incoherent output.

Yep; it was changed upstream at leejet/stable-diffusion.cpp#1073. I'm using the same code to get the needed multiple.


wbruna commented Feb 21, 2026

One way to test is with sides that are almost multiples of 64: a 127x127 request becomes 120x120 for ZIT, 112x112 for Klein 4B, and 64x64 for SDXL. I also tested VAE tiling, with 767x1023 requests.
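The 127x127 results can be reproduced with a plain round-down, as in the sketch below. The multiples of 8, 16 and 64 are inferred from the numbers reported here rather than read from the models, and the helper name is hypothetical.

```cpp
// Hypothetical round-down helper; assumes side > 0 and multiple > 0.
// The per-model multiples used below are inferred from the reported results,
// not taken from the actual model metadata.
constexpr int round_down(int side, int multiple) {
    return (side / multiple) * multiple;
}

static_assert(round_down(127, 8) == 120, "ZIT, assuming a multiple of 8");
static_assert(round_down(127, 16) == 112, "Klein 4B, assuming a multiple of 16");
static_assert(round_down(127, 64) == 64, "SDXL, multiple of 64");
```

A side of 127 is a convenient probe because it sits one pixel below 128, so each multiple produces a visibly different result.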

-1x-1 becomes 64x64 as before, though we should probably return an error at the API level instead.
