Browsed by
Category: AI

Next 10 years of AI

Andrew Ng is one of the biggest names in AI. He makes a few predictions, and I thought the article had some good observations.

His current big focus is using AI in manufacturing. Andrew Ng founded Landing AI in 2017, initially focusing on consulting, but after working on many customer projects, Ng and Landing AI developed a new toolkit and playbook for making AI work in manufacturing and industrial automation. This led to the development of a data-centric approach to AI.

“In consumer software, you can build one monolithic AI system to serve a hundred million or a billion users, and truly get a lot of value in that way,” he said. “But in manufacturing, every plant makes something different. So every manufacturing plant needs a custom AI system that is trained on their data.”

The challenge that many companies in the AI world face is how to help 10,000 manufacturing plants build 10,000 custom systems. In short – scale.

In manufacturing, there is often no big data to go by; the data for manufacturing different products is unique. Their first observation was that it makes more sense to keep the models relatively fixed and focus on quality data to fine-tune them, rather than continuing to push for marginal improvements in the models themselves.

This uniqueness of data also means there are almost never enough images of faults or edge cases to train models. The only way out of this dilemma is to build tools that empower customers to build their own models and let product experts engineer the data and express their domain knowledge. Ng and Landing AI do that through Landing Lens, which enables domain experts to express their knowledge with data labeling instead of constantly tweaking the models.

Worth a read.

AI-illustrated book gets unprecedented copyright

Earlier this year, the US Copyright Office ruled against awarding copyrights to AI systems themselves. “The courts have been consistent in finding that non-human expression is ineligible for copyright protection,” the Office reasoned in February, citing previous cases involving attempts to copyright based on “divine inspiration,” as well as that time someone tried to secure copyright protection for a monkey selfie.

In the face of this, New York-based artist Kris Kashtanova claims to be the first known artist to receive a US copyright registration for Zarya of the Dawn, a graphic novel featuring AI-assisted artwork generated with latent diffusion.

“I was open how it was made and put Midjourney on the cover page. It wasn’t altered in any other way. Just the way you saw it here,” Kashtanova wrote in an announcement posted to Instagram last week. “I tried to make a case that we do own copyright when we make something using AI. I registered it as visual arts work. My certificate is in the mail and I got the number and a confirmation today that it was approved.” Kashtanova also noted that they first got the idea to show that artists “do own copyright when we make something using AI” from a “friend lawyer.”

The industry starts taking sides

On September 21, Getty Images CEO Craig Peters told The Verge that the company would no longer accept AI-generated artwork into its catalogue, citing concerns over copyright legality and privacy. “There are real concerns with respect to the copyright of outputs from these models and unaddressed rights issues with respect to the imagery, the image metadata and those individuals contained within the imagery.”

Or embrace it!

Even more interesting is that there is now a whole website for comic books created with AI-generated artwork.

Using Stable Diffusion for compression

Last week, Swiss software engineer Matthias Bühlmann discovered that the popular image synthesis model Stable Diffusion could compress existing 2D images with fewer visual artifacts than JPEG or WebP at high compression ratios, though there are some important limitations.

When Stable Diffusion analyzes and “compresses” images into weight form, they reside in what researchers call “latent space,” which is a way of saying that they exist as a sort of fuzzy potential that can be realized into images once they’re decoded. With Stable Diffusion 1.4, the weights file is roughly 4GB, but it represents knowledge about hundreds of millions of images.

While most people use Stable Diffusion with text prompts, Bühlmann cut out the text encoder and instead forced his images through Stable Diffusion’s image encoder process, which takes a low-precision 512×512 image and turns it into a higher-precision 64×64 latent space representation. At this point, the image exists at a much smaller data size than the original, but it can still be expanded (decoded) back into a 512×512 image with fairly good results.
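The appeal of that encoder trick is easy to see with some back-of-the-envelope arithmetic. The sketch below only computes storage sizes for the standard Stable Diffusion 1.x shapes mentioned above (a 512×512 RGB image in, a 64×64×4 latent out); the actual savings in Bühlmann's experiment depend on how aggressively the latents are quantized, so treat these numbers as an upper-bound illustration, not his reported results.

```python
# Rough size comparison between a raw 512x512 RGB image and the
# 64x64x4 latent that Stable Diffusion's image encoder produces.

def nbytes(shape, bytes_per_value):
    """Total bytes for a tensor of the given shape and element size."""
    n = 1
    for dim in shape:
        n *= dim
    return n * bytes_per_value

raw_image  = nbytes((512, 512, 3), 1)  # 8-bit RGB pixels
latent_f32 = nbytes((64, 64, 4), 4)    # latents stored as float32
latent_u8  = nbytes((64, 64, 4), 1)    # latents quantized to 8 bits

print(raw_image)                # 786432 bytes (~768 KiB)
print(latent_f32)               # 65536 bytes  (~64 KiB)
print(latent_u8)                # 16384 bytes  (~16 KiB)
print(raw_image // latent_u8)   # 48 — i.e. ~48x smaller than raw pixels
```

Even the unquantized float32 latent is 12× smaller than the raw pixel data, which is why a lossy decode back to 512×512 can still compete with JPEG at high compression ratios.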

Bühlmann’s method currently comes with significant limitations. It’s not good with faces or text, and in some cases, it can inject detail features into the decoded image that were not present in the source image. (You probably don’t want your image compressor inventing details that don’t exist.) Also, decoding requires the 4GB Stable Diffusion weights file and the extra decoding time inherent to Stable Diffusion.

This isn’t the first time AI has been explored as a method of compression rather than generation. Daniel Holden of Ubisoft presented an astounding paper at GDC in 2018 about using neural nets to compress the animation data used in video game character animation.

DALL·E2 AI generated music video

DALL·E 2 is a pretty astounding new natural language AI system that can create realistic images and art from simple written descriptions. It can also combine concepts, attributes, and styles – all by simply typing in what you want in text.

Below (and on the website) are some examples of what the AI generated from simple text descriptions.
1. “An astronaut riding a horse in a photorealistic style”

2. “A bowl of soup that is a portal to another dimension as digital art”

3. “Teddy bears shopping for groceries in the style of ukiyo-e”

Grocery Trip

“Computers are good at lots of tasks – but they’ll never replace creative activities and artists.”

May I present Pouff’s grocery shopping video (Grocery Trip). It was created back in 2015 using neural network technology that attempted to identify animal faces in places where they didn’t actually exist.

Incidentally, Mario Klingemann disagrees with the first statement. “Humans are not original,” he says. “We only reinvent, make connections between things we have seen.” While humans can only build on what we have learned and what others have done before us, “machines can create from scratch.”

Everybody Dance now!

More astounding technology. Take any source dancer and a clip of the target person, and the system makes the target perform the same dance.

This will probably soon be used to create whole troupes of dancers in movies and music videos, all in perfect sync, while only paying for one source dancer.

It could be used to bring back deceased dancers, or apply the dances of deceased dancers onto new artists.

Full paper and details here: