
AI-based digital re-aging

Disney published a paper about using AI to digitally age and de-age actors in a fraction of the time required by the manual, frame-by-frame aging techniques used today.

FRAN (which stands for Face Re-aging Network) is a neural network trained on a large database of pairs of randomly generated synthetic faces at varying ages. This bypasses the need to find thousands of images of real people at different (documented) ages that depict the same facial expression, pose, lighting, and background. Synthetically generated training data has been used before for tasks like training self-driving cars to handle situations that aren’t easily reproducible.

The age changes are then merged onto the face. This approach appears to fix many of the issues common to this kind of work: facial identity loss, poor resolution, and unstable results across subsequent video frames. It still struggles with greying hair and with aging very young actors, but it produces better results than techniques from just a few years ago (not that the bar was very hard to beat).

AI architecture

Architects and designers are increasingly experimenting with AI generated art and designs. Michael Arellanes II of MA2 Studio created a series called ‘Synthetic Futures’ in which he experiments primarily with Midjourney in an attempt to create a consistent and controlled aesthetic for architecture. 

I personally think wide-scale use of AI based art generation to continue a theme or even explore and create new ideas/directions is a foregone conclusion at this point. I’m continually astounded by the results these algorithms generate. Results that will just get better very quickly.

Arellanes seems to agree when he says: ‘The current open platforms for AI imagery work from word descriptions alone, as opposed to architectural 3D modeling and/or encoding surface parameters. This leaves the operator with flat images or AI impressions based on descriptions with extraordinary results of the unexpected. The unexpected results are the most exciting aspect of this new paradigm. As designers test the limits of AI’s imagination and complex image compositions, new possibilities emerge that have never been seen before.’

Enhancing your stable diffusion game

AI-generated art has caught fire, but writing the prompts that produce good results is still a matter of trial and error. Some folks are helping by sharing example prompts so you can learn what works and what doesn’t.

All of the items below were 100% auto-generated and included on the page. People are exploring and sharing prompts for many different kinds of art:

  • Portrait Photography
  • Graphic Design
  • Architecture
  • Clothing design
  • 3D and game concept art

Rise of the AI Prompt Engineer

Online AI generation services like DALL·E, Midjourney, and GPT-3 aren’t free or unlimited for most folks. For example, DALL·E 2 was charging 10 cents per prompted generation attempt, and trying a few hundred prompts can quickly add up. Even with free generators like Stable Diffusion, experimenting with prompts is time consuming.

It only makes sense that we’re witnessing the rise of specialist prompt writers and online marketplaces where you can buy and sell high-quality prompts that produce the desired results much faster. This saves users money on API fees and time spent tuning prompts to get what they want.

These even have names now. A prompt engineer is a specialist adept at writing the text prompts necessary for an AI model to generate reliable outputs (such as graphics, text, or code) at a reasonable price. They can then sell the specialized prompts they generate on a prompt marketplace. These are sites where users can purchase and sell prompts. The prompt maker usually keeps 80% of the sale, and the marketplace takes a 20% cut.
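The 80/20 split works out like this (a hedged sketch with my own example price, not any specific marketplace’s fee schedule):

```python
# Hypothetical example: split a $5.00 prompt sale under the 80/20 model
# described above.  Amounts are in cents to avoid floating-point rounding.
def split_sale(sale_cents, seller_pct=80):
    """Return (seller_cut, marketplace_cut) in cents."""
    seller_cut = sale_cents * seller_pct // 100
    marketplace_cut = sale_cents - seller_cut
    return seller_cut, marketplace_cut

seller, marketplace = split_sale(500)
print(seller, marketplace)  # 400 100 -> $4.00 to the prompt engineer, $1.00 to the site
```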

Below are some of the top paid Prompt Marketplaces. Definitely worth browsing to see the amazing work that can be generated by AI art algorithms.

  • PromptBase – offers an amazing number of prompts for just a few dollars
  • PromptHero – seems geared towards higher-end generation
  • Arthub.ai
  • PromptSea
  • Visualise.ai

Installing Stable-diffusion 1.4

Time to play with some AI generated art!

Here are some great instructions for installing the older Stable Diffusion 1.4:
https://www.howtogeek.com/830179/how-to-run-stable-diffusion-on-your-pc-to-generate-ai-images/

8GB of VRAM or less

One of the first things you’ll run into: you won’t be able to generate images at 512×512 or larger if your graphics card has 8GB of VRAM or less (even smaller images if you only have 4GB). The first and easiest fix is to limit the output image size. There is also an option that splits the model into four parts and loads each separately (though it will take longer), or you can use a more optimized/compressed set of trained model weights.

So what do you do if you have an older graphics card with only 4GB or 8GB of VRAM? TingTingin has some tips at the end of his installation video for cards with 8GB of VRAM (an NVIDIA RTX 3070, for example).

Summary (at 15:45): Modify your txt2img.py and add the line ‘model.half()’ after model = instantiate_from_config(config.model) in the load_model_from_config() function.
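Why does that one line help? `model.half()` converts the model’s weights from 32-bit to 16-bit floats, which halves their VRAM footprint. A minimal illustration of the arithmetic (using NumPy arrays as a stand-in for model weights, not the actual txt2img.py code):

```python
import numpy as np

# Stand-in for a layer of model weights: converting float32 -> float16
# halves the memory the values occupy, which is what model.half() buys you.
weights_fp32 = np.zeros((1024, 1024), dtype=np.float32)
weights_fp16 = weights_fp32.astype(np.float16)

print(weights_fp32.nbytes)  # 4194304 bytes
print(weights_fp16.nbytes)  # 2097152 bytes -- exactly half
```

The trade-off is reduced numeric precision, which in practice has little visible effect on generated images.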

Where Zillow’s AI went wrong

What went wrong with Zillow’s $500 million AI-based home purchasing program? A host of factors, but it highlights a problem unique to AI.

It turns out you can’t just set up an AI model and let it crank for years. You need to pay attention to something called drift. You can tell whether your AI model is drifting by monitoring model accuracy, outputs, and inputs on an ongoing basis and re-balancing as needed.
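The accuracy-monitoring side of that can be sketched very simply (a hypothetical example of the general idea, not Zillow’s actual system): track recent prediction errors and raise a flag when they climb well above the error level seen at deployment.

```python
# Flag drift when the rolling mean absolute error over the most recent
# `window` predictions exceeds `factor` times the baseline error measured
# when the model was deployed.
def detect_drift(errors, baseline_mae, window=100, factor=1.5):
    recent = errors[-window:]
    mae = sum(abs(e) for e in recent) / len(recent)
    return mae > factor * baseline_mae

stable = [0.1] * 100                    # errors near the 0.10 baseline
drifted = [0.1] * 50 + [0.3] * 100      # errors have tripled recently

print(detect_drift(stable, baseline_mae=0.10))   # False
print(detect_drift(drifted, baseline_mae=0.10))  # True
```

Real systems also monitor the input distribution itself (data drift), since inputs can shift long before labeled outcomes reveal the accuracy drop.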

Next 10 years of AI

Andrew Ng is one of the biggest names in AI. He makes a few predictions in this article, and I thought it had some good observations.

His current big focus is using AI in manufacturing. Ng founded Landing AI in 2017. Its focus was primarily consulting, but after working on many customer projects, Ng and Landing AI developed a new toolkit and playbook for making AI work in manufacturing and industrial automation, which led to their data-centric approach to AI.

“In consumer software, you can build one monolithic AI system to serve a hundred million or a billion users, and truly get a lot of value in that way,” he said. “But in manufacturing, every plant makes something different. So every manufacturing plant needs a custom AI system that is trained on their data.”

The challenge many companies in the AI world face is how to help 10,000 manufacturing plants build 10,000 custom systems. In short: scale.

In manufacturing, there is often no big data to go by; the data for each product is unique. Their first observation was that it makes more sense to keep the models relatively fixed and focus on quality data to fine-tune them, rather than continuing to push for marginal improvements in the models themselves.

This uniqueness of data also means there are almost never enough images of faults or edge cases to train models. The only way out of this dilemma is to build tools that empower customers to build their own models, letting product experts engineer the data and express their domain knowledge. Ng and Landing AI do that through Landing Lens, which lets domain experts express their knowledge through data labeling instead of constantly tweaking the models.

Worth a read.

AI-illustrated book gets unprecedented copyright

Earlier this year, the US Copyright Office ruled against awarding copyrights to AI systems themselves. “The courts have been consistent in finding that non-human expression is ineligible for copyright protection,” the Office reasoned in February, citing previous cases involving attempts to copyright based on “divine inspiration,” as well as that time someone tried to secure copyright protection for a monkey selfie.

In the face of this, New York-based artist Kris Kashtanova claims to be the first known artist to receive a US copyright registration for Zarya of the Dawn, a graphic novel featuring latent-diffusion AI-assisted artwork.

“I was open how it was made and put Midjourney on the cover page. It wasn’t altered in any other way. Just the way you saw it here,” Kashtanova wrote in an announcement posted to Instagram last week. “I tried to make a case that we do own copyright when we make something using AI. I registered it as visual arts work. My certificate is in the mail and I got the number and a confirmation today that it was approved.” Kashtanova also noted that they first got the idea to show that artists “do own copyright when we make something using AI” from a “friend lawyer.”

The industry starts taking sides

On September 21, Getty Images CEO Craig Peters told The Verge that the company would no longer accept AI-generated artwork into its catalogue, citing concerns over copyright legality and privacy. “There are real concerns with respect to the copyright of outputs from these models and unaddressed rights issues with respect to the imagery, the image metadata and those individuals contained within the imagery.”

Or embrace it!

Even more interesting is that there is now a whole website for comic books created with AI-generated artwork.

Using Stable Diffusion for compression

Last week, Swiss software engineer Matthias Bühlmann discovered that the popular image synthesis model Stable Diffusion could compress existing 2D images with fewer visual artifacts than JPEG or WebP at high compression ratios, though there are some important limitations.

When Stable Diffusion analyzes and “compresses” images into weight form, they reside in what researchers call “latent space,” which is a way of saying that they exist as a sort of fuzzy potential that can be realized into images once they’re decoded. With Stable Diffusion 1.4, the weights file is roughly 4GB, but it represents knowledge about hundreds of millions of images.

While most people use Stable Diffusion with text prompts, Bühlmann cut out the text encoder and instead forced his images through Stable Diffusion’s image encoder process, which takes a low-precision 512×512 image and turns it into a higher-precision 64×64 latent space representation. At this point, the image exists at a much smaller data size than the original, but it can still be expanded (decoded) back into a 512×512 image with fairly good results.
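A back-of-the-envelope sketch shows why the latent makes a decent compressed format (my own numbers and an assumed 8-bit quantization of the latents, not Bühlmann’s exact pipeline):

```python
# Compare the raw size of a 512x512 RGB image with Stable Diffusion's
# 64x64x4-channel latent representation, assuming each latent value is
# quantized down to one byte.
image_bytes = 512 * 512 * 3        # 786,432 bytes of raw RGB pixels
latent_bytes = 64 * 64 * 4 * 1     # 16,384 bytes, one byte per latent value
ratio = image_bytes / latent_bytes

print(image_bytes, latent_bytes, ratio)  # 786432 16384 48.0
```

That’s a 48:1 reduction before any entropy coding; the catch, as noted below, is that decoding requires the full multi-gigabyte weights file.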

Bühlmann’s method currently comes with significant limitations. It’s not good with faces or text, and in some cases it can inject detail features into the decoded image that were not present in the source. (You probably don’t want your image compressor inventing details that don’t exist.) Decoding also requires the 4GB Stable Diffusion weights file and the extra decoding time inherent to Stable Diffusion.

This isn’t the first time AI has been explored as a method of compression as much as generation. Daniel Holden of Ubisoft presented an astounding paper at GDC in 2018 on using neural nets to compress character animation data in video games.

DALL·E 2 AI-generated music video

DALL·E 2 is a pretty astounding new natural language AI system that can create realistic images and art from simple written descriptions. It can also combine concepts, attributes, and styles – all by simply typing in what you want in text.

Below (and on the website) are some examples of what the AI generated from the simple text description.
1. “An astronaut riding a horse in a photorealistic style”

2. “A bowl of soup that is a portal to another dimension as digital art”

3. “Teddy bears shopping for groceries in the style of ukiyo-e”