First Look at Stable Cascade, a New Text-to-Image AI Generator with Three-Stage Approach

Here’s a first look at Stable Cascade, a new text-to-image AI generator from Stability AI that takes a three-stage approach. It is built on the highly efficient Würstchen architecture, which compresses images hierarchically so the model can produce high-quality outputs while working in a highly compressed latent space.

What sets it apart from Stable Diffusion is the Latent Generator phase, or Stage C, which transforms user inputs into compact 24×24 latents. These are passed along to the Latent Decoder phase (Stages A & B), which handles image compression much like the VAE does in Stable Diffusion, but at a much higher compression ratio. See the project's GitHub page for details.
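To put the compression difference in numbers, here is a rough back-of-the-envelope sketch in Python. The 24×24 latent size comes from the description above; the 1024×1024 input size and the 8x per-side downsampling for Stable Diffusion's VAE are assumptions made for illustration, and the function name is hypothetical.

```python
def spatial_compression(image_side: int, latent_side: int) -> float:
    """Return how many image pixels (per side) map onto one latent cell."""
    return image_side / latent_side


# Assumption: a square 1024x1024 input image (not stated in the article).
IMAGE_SIDE = 1024

# Stable Cascade: Stage C works on 24x24 latents, per the article.
cascade_latent_side = 24
cascade = spatial_compression(IMAGE_SIDE, cascade_latent_side)

# Assumption: Stable Diffusion's VAE downsamples each side by a factor of 8.
sd_latent_side = IMAGE_SIDE // 8
sd = spatial_compression(IMAGE_SIDE, sd_latent_side)

print(f"Stable Cascade:   {cascade_latent_side}x{cascade_latent_side} latents, "
      f"about {cascade:.1f}x compression per side")
print(f"Stable Diffusion: {sd_latent_side}x{sd_latent_side} latents, "
      f"{sd:.1f}x compression per side")
print(f"Latent positions: {cascade_latent_side ** 2} vs {sd_latent_side ** 2}")
```

Under these assumptions, the generator has far fewer latent positions to work on per image, which is where the efficiency gain would come from.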


Stable Cascade Text-to-Image AI Generator

“Next to standard text-to-image generation, Stable Cascade can generate image variations and image-to-image generations. Image variations work by extracting image embeddings from a given image using CLIP and then returning this back to the model,” said Stability AI in a press release.
