The ability of generative AI (GenAI) to create images has been evolving at a rapid pace thanks to the many innovations in the field, ushering in a new era of creative spontaneity.
Unfortunately, that has come at the cost of many people's hard work being plagiarized without their knowledge. OpenAI knows what I am talking about.
In any case, the use and development of such tech has not slowed down, with one of the most talked-about GenAI models in recent times being Stable Diffusion 3.
However, its licensing terms didn't sit well with many, and rightly so, highlighting the need for a capable, more permissive model.
That is where fal.ai, a U.S.-based AI outfit, has stepped in with its challenger, “AuraFlow”.
AuraFlow: A New Contender Enters The Ring
Born out of a need for a state-of-the-art open-source model, fal collaborated with developers and researchers to introduce an initial AuraFlow 0.1 release, which has been made available under the Apache 2.0 License.
Once it is set up, users can perform text-to-image generation tasks with AuraFlow, provided they have the necessary hardware, as it's quite resource-intensive.
fal shared that AuraFlow went through four weeks of intensive training, with pretraining on images at 256×256, 512×512, and 1024×1024, followed by aspect ratio fine-tuning and a few other tweaks.
All that resulted in a GenEval score of 0.63 to 0.67 during pretraining for the final model, with the score going up to 0.703 after the use of a prompt-enhancement pipeline.
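To put those GenEval numbers in perspective, here is a quick back-of-the-envelope calculation of the relative gain from the prompt-enhancement pipeline. It only uses the three scores quoted above; the function and variable names are mine, purely for illustration.

```python
def relative_gain(before, after):
    """Return the percentage improvement of `after` over `before`."""
    return (after - before) / before * 100


# Scores as reported by fal: a range during pretraining, one value after enhancement.
PRETRAIN_LOW, PRETRAIN_HIGH = 0.63, 0.67
ENHANCED = 0.703

gain_from_high = relative_gain(PRETRAIN_HIGH, ENHANCED)
gain_from_low = relative_gain(PRETRAIN_LOW, ENHANCED)

print(f"Relative gain: {gain_from_high:.1f}% to {gain_from_low:.1f}%")
```

In other words, depending on which end of the pretraining range you start from, prompt enhancement accounts for an improvement of roughly 5% to 12%.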
Moving on from the numbers, fal has provided an online demo for users to check out AuraFlow in action.
When I ran a prompt for generating an image with a happy Tux looking into the horizon, I was presented with the above result, which was creepy as heck.
Like, why does it have only one limb? Where are its flippers, its face? Mind you, I didn't touch any of the advanced settings. This was the result with the default settings, running on fal's NVIDIA A100 GPU-equipped machine.
I do have a hunch that its face is actually pointing towards the horizon, but its one limb facing in the wrong direction is enough to terrify people.
Anyhow, it's in a very early phase of development, so I can let it slide.
There are also plans to introduce a less resource-intensive model for GPUs with lower VRAM and compute power. fal says to expect this very soon.
Wrapping up, if you are eager to learn how they were able to pull off AuraFlow, I highly suggest you give the announcement blog a read.
Moreover, Decrypt's original coverage goes in deep, with very comprehensive benchmarks pitting it against Stable Diffusion 3 Medium, if you are curious to know.
Get AuraFlow
As is the standard for most open AI models, the model weights for AuraFlow are available on Hugging Face, where you will find all the relevant details and files.
It has seen over 30,000 downloads already. 😯
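If you would rather script it than use the online demo, here is a minimal sketch of how a local run could look with Hugging Face's diffusers library. I am assuming the weights live under the `fal/AuraFlow` repo id and that your diffusers version ships `AuraFlowPipeline`. The helper names, the 50 steps, and the 3.5 guidance scale are my own illustrative choices, not official recommendations.

```python
from pathlib import Path

# Assumption: "fal/AuraFlow" is the Hugging Face repo id for the 0.1 weights.
MODEL_ID = "fal/AuraFlow"


def generation_settings(prompt, size=1024, steps=50, guidance_scale=3.5):
    """Collect the keyword arguments for one text-to-image call.

    The defaults are illustrative, not tuned values.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {
        "prompt": prompt,
        "height": size,
        "width": size,
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
    }


def generate_image(prompt, out_path="auraflow.png"):
    """Load the pipeline, run one generation, and save the result as a PNG."""
    # Imported lazily so the helper above works without a GPU stack installed.
    import torch
    from diffusers import AuraFlowPipeline

    pipe = AuraFlowPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")  # assumes an NVIDIA GPU with plenty of VRAM
    image = pipe(**generation_settings(prompt)).images[0]
    image.save(out_path)
    return Path(out_path)
```

Calling `generate_image("a happy Tux looking into the horizon")` would download the weights on first use, so expect a hefty download and a long wait if your GPU is on the modest side.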
Those looking to build Comfy workflows with this model can get started with the latest version of ComfyUI.
If you are interested in being part of its community, then you can join fal's official Discord server.
💬 If you have played around with AuraFlow, do let me know how your experience was!