This Open-Source AI Image Generator Plans to Take on Stable Diffusion 3

Finally, a truly open-source Stable Diffusion contender!

The ability of generative AI (GenAI) to create images has been evolving at a rapid pace thanks to the many innovations in the field, ushering in a new era of creative spontaneity.

Unfortunately, that has come at the cost of many people's hard work being plagiarized without their knowledge. OpenAI knows what I am talking about.

In any case, the use and development of such tech has not slowed down, with one of the most talked-about GenAI models in recent times being Stable Diffusion 3.

However, its licensing terms didn't sit well with many, and rightly so, highlighting the need for a capable, more permissive model.

That is where “AuraFlow”, a challenger from the U.S.-based AI outfit fal.ai, has stepped in.

🚧
This is an under-development piece of software that's not recommended for production use, but is still good enough for casual use.

AuraFlow: A New Contender Enters The Ring

Various images generated using AuraFlow

To address the need for a state-of-the-art open-source model, fal collaborated with developers and researchers to introduce an initial AuraFlow 0.1 release, which has been made available under the Apache 2.0 License.

Once it is set up, users can perform text-to-image generation tasks with AuraFlow, provided they have the necessary hardware, as it's quite resource-intensive.

fal shared that AuraFlow was put through four weeks of compute-intensive training, with pretraining on images of various sizes such as 256×256, 512×512, and 1024×1024, followed by aspect ratio fine-tuning and a few other tweaks.

All that resulted in a GenEval score of 0.63 to 0.67 during pretraining for the final model, with the score going up to 0.703 after the use of a prompt-enhancement pipeline.

Moving on from the numbers, fal has provided an online demo for users to check out AuraFlow in action.

Screenshot of the AuraFlow online demo

When I ran a prompt for generating an image with a happy Tux looking into the horizon, I was presented with the above result, which was creepy as heck.

Like, why does it have only one limb? Where are its flippers, its face? Mind you, I didn't touch any of the advanced settings. This was the result with the default settings, running on fal's NVIDIA A100 GPU-equipped machine.

I do have a hunch that its face is actually pointing towards the horizon, but its one limb facing in the wrong direction is enough to terrify people.

Anyhow, it's in a very early phase of development, so I can let it slide.

There are also plans to introduce a less resource-intensive model for running on GPUs with lower VRAM and compute power. fal says to expect this very soon.

Wrapping up, if you are eager to learn how they were able to pull off AuraFlow, I highly suggest you give the announcement blog a read.

Moreover, if you are curious, Decrypt's original coverage goes in deep, with very comprehensive benchmarks pitting AuraFlow against Stable Diffusion 3 Medium.

Get AuraFlow

As is the standard for most open AI models, the model weights for AuraFlow are available on Hugging Face, where you will find all the relevant details and files.

It has seen over 30,000 downloads already. 😯
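For those who would rather try the weights locally than through the demo, here is a minimal sketch of what that could look like in Python. It assumes the Hugging Face `diffusers` library with its `AuraFlowPipeline` class, the repository ID `fal/AuraFlow`, and a CUDA GPU with plenty of VRAM; please verify all of these against the model page before relying on them. The `generation_settings` helper, the step count, and the guidance scale are illustrative choices of mine, not values confirmed by fal, and the allowed sizes simply mirror the pretraining resolutions mentioned earlier.

```python
# Sketch only. Assumes: pip install torch diffusers transformers accelerate
# The repo ID "fal/AuraFlow" and the AuraFlowPipeline class are assumptions
# to check against the Hugging Face model page.

def generation_settings(prompt, size=1024, steps=50, guidance_scale=3.5):
    """Validate inputs and build keyword arguments for the pipeline call.

    This helper is hypothetical; steps and guidance_scale are illustrative
    defaults, not values confirmed by the article.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    # Conservative choice: only the square pretraining sizes from the article.
    if size not in (256, 512, 1024):
        raise ValueError("size must be 256, 512, or 1024")
    return {
        "prompt": prompt,
        "height": size,
        "width": size,
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
    }


def generate(prompt, out_path="auraflow.png", **overrides):
    """Load the weights (downloaded on first use) and save one image."""
    # Heavy imports are kept local so the helper above works without a GPU.
    import torch
    from diffusers import AuraFlowPipeline

    pipe = AuraFlowPipeline.from_pretrained(
        "fal/AuraFlow", torch_dtype=torch.float16
    ).to("cuda")
    image = pipe(**generation_settings(prompt, **overrides)).images[0]
    image.save(out_path)
    return out_path
```

With that in place, calling `generate("a happy Tux looking into the horizon")` should write `auraflow.png` to the current directory, though the first run has to download the full set of weights.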

Those looking to build Comfy workflows with this model can get started with the latest version of ComfyUI.

If you are interested in being part of its community, then you can join fal's official Discord server.

💬 If you have played around with AuraFlow, do let me know how your experience was!
