
Expanding and enhancing Stable Diffusion with specialized models

Now that you have Stable Diffusion 1.5 installed on your local system and have learned how to make cool generative prompts, it might be time to take the next step: trying different latent diffusion models.

There is more than one model out there for Stable Diffusion, and they can generate vastly different images.

Check out this article to learn how to install and use some of the popular models that work with Stable Diffusion:

  • F222 – People have found it useful for generating beautiful female portraits with correct body part relations. It’s also quite good at generating aesthetically pleasing clothing.
  • Anything V3 – A special-purpose model trained to produce high-quality anime-style images. You can use danbooru tags (like 1girl, white hair) in the text prompt.
  • Open Journey – A model fine-tuned with images generated by Midjourney v4.
  • DreamShaper – A model fine-tuned for a portrait illustration style that sits between photorealistic and computer graphics.
  • Waifu-diffusion – Japanese anime style.
  • Arcane Diffusion – The style of the TV show Arcane.
  • Robo Diffusion – An interesting robot-style model that will turn your subject into a robot.
  • Mo-di-diffusion – Generates images in a Pixar-like style.
  • Inkpunk Diffusion – Generates images in a unique illustration style.

Better stable diffusion and AI generated art prompts

Now that you have Stable Diffusion on your system, how do you start taking advantage of it?

One way is to start with some sample prompts. Techspot has some good ones (halfway through the article) to whet your appetite.

You can get inspiration by looking at good examples on free public prompt marketplaces.

Then you might want to learn how to fix some common problems.

When you’re really ready to dive in, this article from Metaverse gives you a list of excellent getting started guides to help get you from beginner to proficient in generating your own awesome art.

The key to it all is learning the syntax, parameters, and art of crafting AI prompts. It’s as much art as it is science. It’s complex enough that there is everything from beginner examples and free guides to helper tools and paid marketplaces.
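
To make "syntax and parameters" a little more concrete, here is a minimal sketch of how a prompt and its settings might be assembled before being handed to an image generator. The `Prompt` class, `build_settings` function, and the dictionary keys are hypothetical names made up for illustration. The `(term:weight)` emphasis notation is the one used by the AUTOMATIC1111 web UI; other front ends may ignore it or use a different notation. The default values for steps, CFG scale, and seed are only plausible starting points, not recommendations from any of the linked guides.

```python
from dataclasses import dataclass, field


@dataclass
class Prompt:
    """Hypothetical container for the parts of a text-to-image prompt."""
    subject: str
    modifiers: list[str] = field(default_factory=list)        # style, medium, lighting
    emphasis: dict[str, float] = field(default_factory=dict)  # term -> weight
    negative: list[str] = field(default_factory=list)         # things to avoid

    def positive_text(self) -> str:
        parts = [self.subject, *self.modifiers]
        # "(term:weight)" follows the AUTOMATIC1111 web UI emphasis syntax;
        # treat it as an assumption if you use a different tool.
        parts += [f"({term}:{weight:g})" for term, weight in self.emphasis.items()]
        return ", ".join(parts)

    def negative_text(self) -> str:
        return ", ".join(self.negative)


def build_settings(prompt: Prompt, steps: int = 30,
                   cfg_scale: float = 7.5, seed: int = 42) -> dict:
    """Bundle the prompt text with common sampling parameters.

    The keys are illustrative; map them to whatever your tool expects.
    """
    return {
        "prompt": prompt.positive_text(),
        "negative_prompt": prompt.negative_text(),
        "steps": steps,          # more steps: slower, often more detail
        "cfg_scale": cfg_scale,  # how strongly the image follows the prompt
        "seed": seed,            # fix the seed to compare prompt changes fairly
    }


example = Prompt(
    subject="portrait of an old fisherman",
    modifiers=["oil painting", "dramatic lighting"],
    emphasis={"weathered face": 1.2},
    negative=["blurry", "extra fingers"],
)
settings = build_settings(example)
print(settings["prompt"])
print(settings["negative_prompt"])
```

Keeping the seed fixed while changing one modifier at a time is a simple way to see what each part of a prompt contributes.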

The learning resources have gotten a lot better in the last six months, since people started learning how to write prompts for AI image generators last year.

Stable diffusion in other languages

Stable Diffusion was developed by CompVis, Stability AI, and LAION. It mainly uses the English subset LAION2B-en of the LAION-5B dataset for its training data and, as a result, requires English text prompts to produce images.

This means that the tagging and correlating of images and text are based on English-tagged datasets, which naturally tend to come from English-speaking sources and regions. Users of other languages must first translate their prompts into English, which often loses the nuances or even the core meaning. On top of that, the images the latent diffusion model learned from are mostly limited to sources from English-speaking regions.

For example, one of the more common Japanese terms re-interpreted from the English word “businessman” is “salary man”, which we most often imagine as a man wearing a suit. You would get results that look like this, which might not be very useful if you’re trying to generate images for a Japanese audience.

rinna Co., Ltd. has developed a Japanese-specific text-to-image model named “Japanese Stable Diffusion”. Japanese Stable Diffusion accepts native Japanese text prompts and generates images that reflect the naming and tagged pictures of the Japanese-speaking world, which may be difficult to express through translation and whose images may simply not be present in the Western world. Their new text-to-image model was trained on source material that comes directly from Japanese culture, identity, and unique expressions, including slang.

They did this with a two-step approach that is instructive about how Stable Diffusion works.

First, they left the latent diffusion model alone and replaced the English text encoder with a Japanese-specific text encoder. This allowed the text encoder to understand Japanese natively, but the model would still generate Western-style images because the latent diffusion model remained intact. This was still better than just translating the prompt.

At this point, Stable Diffusion could understand the Japanese concept of a ‘businessman’, but it still generated images of decidedly Western-looking businessmen because the underlying latent diffusion model had not been changed.

The second step was to retrain the latent diffusion model on more Japanese-tagged data sources with the new text encoder. This stage was essential to make the model more language-specific. After this, the model could finally generate businessmen with the Japanese faces users would expect.
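
The two steps above can be sketched as a toy model of which component changes at each stage. This is only a conceptual illustration of the description in this post, not rinna's actual training code: the class and function names are hypothetical, no real training happens, and whether the text encoder keeps training during the second stage is an assumption of the sketch.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Component:
    """Hypothetical stand-in for one part of a text-to-image model."""
    name: str
    data_language: str       # language of the data the component learned from
    trainable: bool = False  # True if its weights are updated in this stage


@dataclass(frozen=True)
class TextToImageModel:
    text_encoder: Component
    latent_diffusion: Component

    def trainable_parts(self) -> list[str]:
        parts = (self.text_encoder, self.latent_diffusion)
        return [part.name for part in parts if part.trainable]


def stage_one(model: TextToImageModel) -> TextToImageModel:
    """Swap in a Japanese text encoder; keep the latent diffusion model frozen."""
    return TextToImageModel(
        text_encoder=Component("japanese_text_encoder", "ja", trainable=True),
        latent_diffusion=replace(model.latent_diffusion, trainable=False),
    )


def stage_two(model: TextToImageModel) -> TextToImageModel:
    """Unfreeze the latent diffusion model and retrain it on Japanese-tagged data.

    Assumption: the text encoder stays trainable here as well.
    """
    return TextToImageModel(
        text_encoder=model.text_encoder,
        latent_diffusion=replace(
            model.latent_diffusion, data_language="ja", trainable=True
        ),
    )


base = TextToImageModel(
    text_encoder=Component("english_text_encoder", "en"),
    latent_diffusion=Component("latent_diffusion", "en"),
)
after_one = stage_one(base)
after_two = stage_two(after_one)

for label, model in [("base", base), ("stage 1", after_one), ("stage 2", after_two)]:
    print(label, model.trainable_parts(), model.latent_diffusion.data_language)
```

After the first stage only the text encoder has changed, which matches the observation that prompts were understood in Japanese while the images still looked Western. The image side only shifts once the latent diffusion model itself is retrained in the second stage.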

Read more about it on the links below.

Links:

It’s nothing… forever

Nothing, Forever is a Twitch stream with an amazing premise. It runs 24 hours a day, 365 days a year, and delivers new content every minute. Everything you see, hear, or experience (with the exception of the artwork and laugh track) is brand new content, continually generated via machine learning and AI algorithms. It never repeats (except when the AI happens to generate the same content).

It was launched by Mismatch Media, a media lab focused on creating experimental forms of television shows, video games, and more, using generative and other machine learning technologies.

Give it a watch and be amazed. Sadly, it’s probably better than 50% of current TV shows.

Physical pixel art

The creators of PIXIO magnetic building blocks have invented another fun toy for building 3D art. VOXART uses lightweight panels which click together to create life-size voxel art. They fold flat, so you can store thousands without taking up much space. Sign up for a pre-launch discount, or get notified on Kickstarter.

Games Done Quick prizes

Games Done Quick had some unique video game themed prizes donated by various artists. I spent some time finding some of the more interesting artists.

Overlapping roof

Daoming Town in Sichuan Province, China, is known for its bamboo weaving traditions. “In Bamboo” is an homage to this rich local custom. Constructed in just 52 days back in 2018, the multi-use pavilion stretches 1,800 square meters and contains space for exhibitions, gatherings, and dining. The steel and wood structure supports a twisting, infinity-shaped roof of small ceramic tiles, which slopes down near a reflective pool at the center of the building.

Link: