Category: AI

Riffusion

Riffusion (Riff-fusion) is a music AI: you type in a prompt and it generates music for you. It’s not going to win any awards anytime soon, but it does seem to handle smooth and electronic tunes pretty well. Honestly, if I heard some of this in an elevator, I doubt I would notice.

One more step towards our automatically generated content future.

AI trained to get images from MRI brain scans

Top: the image the person saw; bottom: what the AI re-created from their brain scan

Researchers at Osaka University in Japan are among the ranks of scientists using AI to make sense of human brain scans. While others have tried using AI with MRI scans to visualize what people are seeing, the Osaka approach is unique because it used Stable Diffusion to generate the images. This greatly simplified their model, requiring only a few thousand training parameters instead of millions.

Normally, Stable Diffusion takes text descriptions/prompts which are run through a language model. That language model is trained against a huge library of images to generate a text-to-image latent space that can be queried to generate new amalgamated images (yes, a gross simplification).

The Osaka researchers took this a step further. They used functional MRI (fMRI) scans from an earlier, unrelated study in which four participants looked at 10,000 different images of people, landscapes, and objects while being monitored in an fMRI scanner. The Osaka team then trained a second AI model to link brain activity in the fMRI data with text descriptions of the pictures the participants looked at.

Together, these two models allowed Stable Diffusion to turn fMRI data into relatively accurate images that were not part of the AI training set. Based on the brain scans, the first model could recreate the perspective and layout the participant had seen, but its generated images were cloudy and nonspecific figures. Then the second model kicked in: it could recognize what object people were looking at by using the text descriptions from the training images. So, if it received a brain scan resembling one from its training marked as a person viewing an airplane, it would put an airplane into the generated image, following the perspective from the first model. The technology achieved roughly 80 percent accuracy.

The team shared more details in a new paper, which has not been peer-reviewed, published on the preprint server bioRxiv.

Anadol’s data projections

Refik Anadol makes projection mapping and LED screen art. His unique approach, however, is embracing massive data sets churned through various AI algorithms as his visualization source.

I think one of his distinctive additions to the space is visualizing the latent space generated during the machine-learning stages.

Install Stable Dreamfusion on Windows

I wrote about Stable Dreamfusion previously. Dreamfusion first takes normal Stable Diffusion text prompts to generate 2D images of the desired object. Stable Dreamfusion then uses those 2D images to generate 3D meshes.

A hamburger

The authors appear to have used NVIDIA A100 cards on an Ubuntu system. I wanted to see if I could get this to work locally on my home Windows PC, and found that I could.

System configuration I am using for this tutorial:

  • NVIDIA GeForce RTX 3090
  • Intel 12th gen processor
  • Windows 10

Setting Stable Dreamfusion up locally:

Step 1: Update your Windows and drivers

  1. Update Windows
  2. Ensure you have the latest NVIDIA driver installed.

Step 2: Install Windows Subsystem for Linux (WSL)

  1. Install Windows Subsystem for Linux (WSL). The WSL install is a simple command-line step, and you’ll need to reboot after installing. Make sure you install Ubuntu 22.04, the default as of Feb 2023, since that is what Stable Dreamfusion likes. WSL currently installs the latest Ubuntu distro by default, so this works:
    wsl --install
    If you want to be certain you get Ubuntu 22.04, use this command line:
    wsl --install -d Ubuntu-22.04
  2. After installing WSL, Windows will ask to reboot.
  3. Upon reboot, the WSL will complete installation and ask you to create a user account.
  4. Start Ubuntu 22.04 on WSL by clicking the Windows Start menu and typing ‘Ubuntu’, or by typing ‘ubuntu’ at a command prompt.

Step 2b (optional): Install Ubuntu wherever you want on your Windows system. By default, WSL installs the distro image under your C:\Users directory – which is kind of annoying.
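If you do want to relocate the distro, one common approach is to export the image, unregister it, and re-import it at the new location. This is a sketch: the distro name Ubuntu-22.04 is the WSL default, but the D:\wsl target folder is my assumption – use whatever path you like.

```shell
:: Run these from a Windows command prompt, not inside Ubuntu
wsl --shutdown
wsl --export Ubuntu-22.04 D:\wsl\ubuntu-22.04.tar
wsl --unregister Ubuntu-22.04
wsl --import Ubuntu-22.04 D:\wsl\Ubuntu-22.04 D:\wsl\ubuntu-22.04.tar
```

Note that an imported distro may log you in as root by default; you can set default=<your username> under a [user] section in /etc/wsl.conf inside the distro to restore your normal account.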

Step 3: Install dependent packages on Ubuntu

  1. If you don’t have Ubuntu started, go ahead and start Ubuntu 22.04 on WSL by clicking the Windows Start menu and typing ‘Ubuntu’ (or by typing ‘ubuntu’ at a command prompt). A new shell terminal should appear.
  2. You need to install the NVIDIA CUDA toolkit on Ubuntu. On NVIDIA’s CUDA downloads page, select the Linux > x86_64 > WSL-Ubuntu target:
    • You will then get a set of install instructions at the bottom of the page (wget, apt-get, etc.). Simply copy the lines one by one into your Ubuntu terminal. Ensure each step passes without errors before continuing.
    • The ‘sudo apt-get -y install cuda’ line will install a lot of packages. It can take 10-15 minutes.
  3. Install python3 pip. This is required for the Dreamfusion requirements installation script.
    • sudo apt install python3-pip
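Before moving on, it’s worth a quick sanity check that the GPU and toolchain are visible from inside WSL. The nvcc path below is an assumption – it varies with the CUDA version you installed.

```shell
nvidia-smi                           # the Windows driver should expose the GPU inside WSL
/usr/local/cuda/bin/nvcc --version   # confirms the CUDA toolkit installed
python3 -m pip --version             # confirms pip is wired to python3
```

If nvidia-smi fails here, fix your Windows-side driver before continuing – the CUDA toolkit inside Ubuntu relies on it.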

Step 4: Install Stable Dreamfusion and dependent packages

  1. You should now follow the install instructions found on the Dreamfusion page.
  2. Clone the project as directed: git clone https://github.com/ashawkey/stable-dreamfusion.git
  3. Install with pip: install the prerequisites via pip as directed on the Dreamfusion GitHub page:
    • pip install -r requirements.txt
    • I also installed both optional packages nvdiffrast and CLIP.
    • Add this export line to your .bashrc to ensure python can find libcudnn:
      export LD_LIBRARY_PATH=/usr/lib/wsl/lib:$LD_LIBRARY_PATH
  4. I did not install the optional build extensions.
  5. Exit and restart your shell so that all path changes take effect.
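The .bashrc export line above can be scripted so it is safe to run more than once; this small sketch appends it only if it is not already present:

```shell
# Append the libcudnn path line to ~/.bashrc only if it is not already there
LINE='export LD_LIBRARY_PATH=/usr/lib/wsl/lib:$LD_LIBRARY_PATH'
grep -qxF "$LINE" ~/.bashrc 2>/dev/null || echo "$LINE" >> ~/.bashrc
```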

Step 5: Run a workload!

Follow the instructions in the USAGE section of the Dreamfusion readme, but use ‘python3’ instead of ‘python’. There are a number of options you can specify, like negative prompts and the GUI interface (which does not work under WSL).

The very first run will take a long time. It will download several gigabytes of model data, then train 100 epochs, which can take up to an hour.

$> python3 main.py --text "a hamburger" --workspace trial -O 
$> python3 main.py --text "a hamburger" --workspace trial -O --sd_version 1.5 
$> python3 main.py --workspace trial -O --test 
$> python3 main.py --workspace trial -O --test --save_mesh 

Check Your Output:

Look in the results directory under the workspace name:

./stable-dreamfusion/<workspace name>/mesh/ #directory holds the .obj, .mtl, and .png files
./stable-dreamfusion/<workspace name>/results/ #directory holds an mp4 video that shows the object rotating

Copying them to Windows:
All Windows drives are pre-mounted in /mnt/<drive letter>/ for WSL.
Ex: /mnt/c/
So you can copy the output files to your Windows side by doing:
cp -rP ./<workspace name> /mnt/c/workdir/
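Going the other way, the /mnt drive mapping is mechanical, so a tiny helper can translate a Windows path into its WSL mount point. The win2wsl name is hypothetical, not part of WSL – note that WSL also ships a real wslpath utility that does this conversion for you.

```shell
# Convert a Windows path like C:\workdir to its WSL mount point /mnt/c/workdir
win2wsl() {
  drive=$(printf '%s' "$1" | cut -c1 | tr 'A-Z' 'a-z')   # drive letter, lowercased
  rest=$(printf '%s' "$1" | cut -c3- | tr '\\' '/')      # flip the backslashes
  printf '/mnt/%s%s\n' "$drive" "$rest"
}

win2wsl 'C:\workdir'    # prints /mnt/c/workdir
```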

Looking at the generated meshes with materials:

  1. Install Blender
  2. File->Import->Wavefront (.obj) (legacy)
  3. Or, use 3D Viewer (though it seems to have issues with material loading at times)

Fixes:

  1. You might get an error about missing libcudnn_cnn_infer.so.8:
==> Start Training trial Epoch 1, lr=0.050000 …
0% 0/100 [00:00<?, ?it/s]Could not load library libcudnn_cnn_infer.so.8. Error: libcuda.so: cannot open shared object file: No such file or directory

Add this to your .bashrc to ensure it can find libcudnn:
export LD_LIBRARY_PATH=/usr/lib/wsl/lib:$LD_LIBRARY_PATH

2. If you load the object in Blender but it doesn’t load the texture maps, try Alt-Z

Midjourney intro + prompt guide

Matt Wolfe briefly walks you through getting Midjourney set up (via Discord) and then gives you some great getting-started prompts to help you learn different styles and image generation capabilities.

He also recommends Guy Parsons who gives out lots of tips on building prompts and who has a free e-book with some of his best tips.

Comparing AI art generators for common artist workloads

Gamefromscratch runs side-by-side tests on DALL-E 2, Stable Diffusion, and Midjourney across a variety of art generation tasks.

His conclusion is that they are not going to replace artists for all tasks, but for concept art, pixel art, and some other simple tasks, these AI generators can.

How close are we to a complete world transformation due to AI?

Tom Scott distills his encounter with an AI doing a job he used to do (almost equally well), then reflects on why this could be a completely transformative development for the world – much like when the internet really took off in the late ’90s. I think he’s probably right. As someone who has played with AI art generation and watched the groundbreaking papers using AI for even traditional rendering and modeling tasks in the graphics world alone, I think we’re just at the first part of his sigmoid curve.

This transformation is likely to be very different from the early internet upheavals of the music industry, cellular phones, and stores/commerce that he describes. Those were largely transformations of market form serving the same commercial and societal needs.

I think this is different in at least two ways. First, AI is bringing about a change in which thought, analysis, creativity, and the response to problems themselves are likely about to be abdicated (and somewhat blindly, by the lazy or by those who aren’t critically examining what is being generated). And we’ll be abdicating that power to systems that aren’t truly or fully understood, controlled, or protected.

With things like ChatGPT, we will very easily start abdicating the hard work of thinking itself. If we are no longer crafting the actual language of our responses, or doing the hard logical work of building arguments for our daily actions or the policies we live by, we will never develop the critical-thinking ability to even question what is generated for us. What would that do to us long term? Especially when we already see that ChatGPT and other AI systems can get things terribly wrong – and not give us the first clue that they are wrong.

Second, like all tools, AI could be controlled or manipulated by nefarious agents. Today, our most deadly and horrific tools of destruction (nuclear bombs and sophisticated strategic weapons) are largely contained within government military systems and by the highly specialized ability needed to build them.

AI can be wielded by anyone, anywhere in the world, with any motivation (political, personal, etc.). With just a small rack of commercially available servers, one can unleash infinitely scalable social media posting, auto-responding, narrative control, news-story generation, and possibly subverted think-for-you devices upon the whole world.

We have known since at least 2019 that this is happening on all major social media platforms, despite the best efforts of some of the smartest people in Silicon Valley. Smarter Every Day did a series of stories on the problem. Research has shown again and again that these things are happening, are very, very easy to do, and are very, very hard to stop:

A few clever AI systems that would likely cost less than a single cruise missile could easily overwhelm social media forums, message boards, Wikipedia edits, generated news articles, etc. – before we could ever hope to verify the claims or combat their ability to generate hundreds of thousands of responses, up/down votes, and planted webpage articles every hour. How could one even verify the claims if everything is suspect? Why WOULDN’T a country do this if it costs less than a single missile? Even better, what if the AI itself can be subverted to bias certain responses (which we have already seen too)?

In the post-truth internet, people are well into putting their trust in anonymous influencer opinions and echo-chamber forum posts ahead of well-verified facts. What will this mean in an internet era in which ‘objective facts are less influential in shaping public opinion than appeals to emotion and personal belief’?

My, how far we’ve come from the idea that the internet would become a forum in which people share ideas and the best ones rise to the top. How dangerously naïve we were…

Leaking corporate code via chatGPT

After catching snippets of text generated by OpenAI’s powerful ChatGPT tool that looked a lot like company secrets, Amazon is now trying to head its employees off from leaking anything else to the algorithm.

This issue seems to have come to a head recently because Amazon staffers and other tech workers throughout the industry have begun using ChatGPT as a “coding assistant” of sorts to help them write or improve strings of code, the report notes.

While this isn’t necessarily a problem from a proprietary data perspective, it’s a different story when employees start using the AI to improve upon existing internal code — which is already happening, according to the lawyer.

Installing Stable Diffusion 2.0/2.1

Stable Diffusion 2.0 was largely seen as a dud. Be aware that past version 1.5, the outcry from various artists against having their works sampled led the 2.x branches to use fewer of these public sources. This means a more limited training set and likely more limited output variety.

If you are interested in trying Stable Diffusion 2.1, use this tutorial to install and use the 2.1 models in the AUTOMATIC1111 GUI, so you can make your own judgement by using it.

Here are two different Stable Diffusion 2.1 tutorials:

You might also try this tutorial by TingTing

Retro games with modern graphics – using AI

We’re already seeing a real revolution in retro gaming via emulation. Preservation of old hardware is important, but it’s also seen as an almost impossible task as devices mass-produced to last only 5-10 years in the consumer market reach decades of age. Failure rates will eventually reach 100% given enough time (unless people re-create the hardware). But with modern emulators, you can still play all the different games on modern hardware.

On a separate development note, we’ve also seen graphics effects like anti-aliasing and upscaling get the AI treatment. Instead of hand-coded anti-aliasing kernels, these can be generated automatically by AI, and the results now ship in offerings from all the major hardware vendors.

But what about the graphics content itself? Retro game art has its own charm, but what if we gave it the AI treatment too?

Jay Alammar wanted to see what he could achieve by feeding some retro game graphics from the MSX game Nemesis 2 (Gradius) into the Stable Diffusion, DALL-E, and Midjourney art generators. He presents a lot of interesting experiments and conclusions, using features like in-painting, out-painting, Dream Studio, and all kinds of other ideas to see what he could come up with.

The hand-picked results were pretty great:

He even went so far as to convert the original opening sequence to use the new opening graphics here:

I think this opens up a whole new idea. What if you replaced all of a game’s graphics elements with updated AI graphics? The results would essentially be a themed re-skinning with no gameplay (or even level) changes, but it definitely suggests starting your re-theming for new levels (fire levels, ice levels, space levels, etc.) by auto-generating the graphics.

Then it brings up the non-art idea of re-theming the gameplay itself – possibly using AI-generated movement or gameplay rules. Friction, gravity, jump height, etc. could all be given different models (Mario-style physics, Super Meat Boy physics, slidy ice-level physics), letting the AI come up with the gravity, bounce, and jump parameters.

Interesting times…
