Stability AI unveils Stable Diffusion 3.5 - The Most Powerful Models Yet

Stability AI has officially unveiled Stable Diffusion 3.5, featuring the most advanced models in the series to date. The release introduces multiple customizable variants designed to run efficiently on consumer hardware, making them accessible to a wider audience. Under the Stability AI Community License, the models are free for non-commercial use and for commercial use by those with less than $1M in annual revenue.

The release includes Stable Diffusion 3.5 Large and Stable Diffusion 3.5 Large Turbo, which are now available for download on Hugging Face, with inference code accessible via GitHub. For those awaiting more options, Stable Diffusion 3.5 Medium is set to be released on October 29th.

Earlier this year, in June, Stability AI launched Stable Diffusion 3 Medium, marking the first open release of the Stable Diffusion 3 series. However, this version did not fully meet the company’s standards or the community’s expectations. After gathering feedback, Stability AI decided to take extra time to develop a more robust version. Stable Diffusion 3.5 now reflects their commitment to transforming the visual media landscape, providing users with powerful tools to create and customize high-quality images.

With this release, Stability AI continues its effort to give creators everywhere access to openly released, cutting-edge image generation models.

Key Takeaways:

Today we are introducing Stable Diffusion 3.5. This open release includes multiple model variants, including Stable Diffusion 3.5 Large and Stable Diffusion 3.5 Large Turbo. Additionally, Stable Diffusion 3.5 Medium will be released on October 29th.
These models are highly customizable for their size, run on consumer hardware, and are free for both commercial and non-commercial use under the permissive Stability AI Community License.
You can download Stable Diffusion 3.5 Large and Stable Diffusion 3.5 Large Turbo from Hugging Face and the inference code on GitHub now.

Stability AI shared on its official website: “Today we are releasing Stable Diffusion 3.5, our most powerful models yet. This open release includes multiple variants that are customizable, run on consumer hardware, and are available for use under the permissive Stability AI Community License. You can download Stable Diffusion 3.5 Large and Stable Diffusion 3.5 Large Turbo models from Hugging Face and the inference code on GitHub now.

In June, we released Stable Diffusion 3 Medium, the first open release from the Stable Diffusion 3 series. This release didn’t fully meet our standards or our communities’ expectations. After listening to the valuable community feedback, instead of a quick fix, we took the time to further develop a version that advances our mission to transform visual media.

Stable Diffusion 3.5 reflects our commitment to empower builders and creators with tools that are widely accessible, cutting-edge, and free for most use cases. We encourage the distribution and monetization of work across the entire pipeline – whether it’s fine-tuning, LoRA, optimizations, applications, or artwork.”

What’s being released

Stable Diffusion 3.5 offers a variety of models developed to meet the needs of scientific researchers, hobbyists, startups, and enterprises alike:

Stable Diffusion 3.5 Large: At 8 billion parameters, with superior quality and prompt adherence, this base model is the most powerful in the Stable Diffusion family. This model is ideal for professional use cases at 1 megapixel resolution.
Stable Diffusion 3.5 Large Turbo: A distilled version of Stable Diffusion 3.5 Large that generates high-quality images with exceptional prompt adherence in just 4 steps, making it considerably faster than Stable Diffusion 3.5 Large.
Stable Diffusion 3.5 Medium (to be released on October 29th): At 2.5 billion parameters, with improved MMDiT-X architecture and training methods, this model is designed to run “out of the box” on consumer hardware, striking a balance between quality and ease of customization. It can generate images at resolutions between 0.25 and 2 megapixels.
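
To make the megapixel figures above concrete, the following sketch converts a pixel budget into a width and height for a given aspect ratio. The function name dims_for_megapixels, the convention that 1 megapixel equals 1024 x 1024 pixels, and the rounding down to multiples of 64 are assumptions made for illustration; none of them come from the release notes.

```python
import math

# Assumed convention for this sketch: 1 megapixel = 1024 x 1024 pixels.
PIXELS_PER_MEGAPIXEL = 1024 * 1024


def dims_for_megapixels(megapixels, ratio_w=1, ratio_h=1, multiple=64):
    """Return (width, height) that fits within a megapixel budget.

    ratio_w:ratio_h is the desired aspect ratio as integers (e.g. 16 and 9).
    Both sides are rounded down to a multiple of `multiple`, which is an
    assumed constraint for illustration, so the result never exceeds the budget.
    """
    if megapixels <= 0 or ratio_w <= 0 or ratio_h <= 0 or multiple <= 0:
        raise ValueError("all arguments must be positive")

    budget = int(megapixels * PIXELS_PER_MEGAPIXEL)

    # Integer math avoids floating-point surprises near exact squares.
    height = math.isqrt(budget * ratio_h // ratio_w)
    width = height * ratio_w // ratio_h

    def snap(side):
        # Round down to the nearest multiple, but never below one multiple.
        return max(multiple, (side // multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    for mp in (0.25, 1, 2):
        print(mp, "MP ->", dims_for_megapixels(mp))
```

Under these assumptions, 0.25 megapixels corresponds to 512 x 512, 1 megapixel to 1024 x 1024, and 2 megapixels to 1408 x 1408 after rounding down.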

Developing the models

In developing the models, we prioritized customizability to offer a flexible base to build upon. To achieve this, we integrated Query-Key Normalization into the transformer blocks, stabilizing the model training process and simplifying further fine-tuning and development.
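
As a rough illustration of the idea behind Query-Key Normalization, the sketch below normalizes each query and key vector before computing attention scores, which keeps the attention logits in a bounded range regardless of how large the raw activations grow. This is a minimal single-head example in plain Python, not Stability AI's implementation: the use of RMS normalization, the omission of any learnable scale, and all function names here are assumptions for illustration.

```python
import math


def rms_norm(vec, eps=1e-6):
    """Scale a vector so its root-mean-square is approximately 1."""
    rms = math.sqrt(sum(x * x for x in vec) / len(vec) + eps)
    return [x / rms for x in vec]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def qk_norm_attention(queries, keys, values):
    """Single-head attention with queries and keys normalized first.

    queries, keys: lists of vectors with the same dimension.
    values: list of vectors, one per key.
    """
    dim = len(queries[0])
    q_normed = [rms_norm(q) for q in queries]
    k_normed = [rms_norm(k) for k in keys]

    outputs = []
    for q in q_normed:
        # After normalization each logit is bounded by roughly sqrt(dim)
        # in magnitude, no matter how large the raw inputs were.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in k_normed
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs
```

Because the queries and keys are rescaled before the dot product, multiplying the raw inputs by a large constant leaves the attention output essentially unchanged. That boundedness is the property generally associated with more stable training.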

To support this level of downstream flexibility, we had to make some trade-offs. Greater variation in outputs from the same prompt with different seeds may occur, which is intentional as it helps preserve a broader knowledge-base and diverse styles in the base models. However, as a result, prompts lacking specificity might lead to increased uncertainty in the output, and the aesthetic level may vary.

For the Medium model specifically, we made several adjustments to the architecture and training protocols to enhance quality, coherence, and multi-resolution generation abilities.

Where the models excel

Stable Diffusion 3.5 excels in the following areas, making it one of the most customizable and accessible image models on the market, while maintaining top-tier performance in prompt adherence and image quality:

Customizability: Easily fine-tune the model to meet your specific creative needs, or build applications based on customized workflows.
Efficient Performance: Optimized to run on standard consumer hardware without heavy demands, especially the Stable Diffusion 3.5 Medium and Stable Diffusion 3.5 Large Turbo models.
Diverse Outputs: Creates images representative of the world, not just one type of person, with different skin tones and features, without the need for extensive prompting.

Additionally, our analysis shows that Stable Diffusion 3.5 Large leads the market in prompt adherence and rivals much larger models in image quality.

Stable Diffusion 3.5 Large Turbo offers some of the fastest inference times for its size, while remaining highly competitive in both image quality and prompt adherence, even when compared to non-distilled models of similar size.

Stable Diffusion 3.5 Medium outperforms other medium-sized models, offering a balance of prompt adherence and image quality, making it a top choice for efficient, high-quality performance.

[Image: Stable Diffusion 3.5 Score]

More ways to access the models

While the model weights are available on Hugging Face now for self-hosting, you can also access the models through other platforms.

The Stability AI Community License at a glance

We are pleased to release this model under our permissive community license. Here are the key components of the license:

Free for non-commercial use: Individuals and organizations can use the model free of charge for non-commercial use, including scientific research.
Free for commercial use (up to $1M in annual revenue): Startups, small to medium-sized businesses, and creators can use the model for commercial purposes at no cost, as long as their total annual revenue is less than $1M.
Ownership of outputs: Retain ownership of the media generated without restrictive licensing implications.
