Stable Diffusion 2.0 was largely seen as a dud. Be aware that past version 1.5, the outcry from various artists against having their works sampled led the 2.x branch to draw on fewer of these public sources. That means a more limited training set and likely more limited output variety.
We’re already seeing a real revolution in retro gaming via emulation. Preserving old hardware is important, but it’s also seen as a nearly impossible task as devices mass-produced to last only 5-10 years in the consumer market reach decades of age. Given enough time, failure rates will eventually reach 100% (unless people re-create the hardware). But with modern emulators, you can still play all those games on modern hardware.
On a separate note, we’ve also seen graphics effects like anti-aliasing and upscaling get the AI treatment. Instead of hand-coded anti-aliasing kernels, these effects can be generated automatically by AI, and the results now ship from all the major hardware vendors.
But what about the graphics content itself? Retro game art has its own charm, but what if we gave it the AI treatment too?
Jay Alammar wanted to see what he could achieve by feeding retro game graphics from the MSX game Nemesis 2 (Gradius) into the Stable Diffusion, DALL-E, and Midjourney art generators. He presents a lot of interesting experiments and conclusions. He used features like in-painting, out-painting, Dream Studio, and all kinds of other ideas to see what he could come up with.
The hand-picked results were pretty great:
He even went so far as to convert the original opening sequence to use the new opening graphics here:
I think this opens up a whole new idea. What if you replaced all of a game’s graphics elements with updated AI graphics? The result would essentially be a themed re-skin with no gameplay (or even level) changes, but it definitely suggests starting your re-theming for new levels (fire levels, ice levels, space levels, etc.) by auto-generating the graphics.
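Just to make the idea concrete, here’s a minimal sketch of what one pass of that re-skinning loop could look like using the open-source diffusers img2img pipeline. The sprite filename and prompt are made up for illustration – this isn’t what Alammar actually ran:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Load a Stable Diffusion 1.5 checkpoint for image-to-image generation
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Hypothetical sprite exported from the original game
sprite = Image.open("vic_viper_sprite.png").convert("RGB").resize((512, 512))

restyled = pipe(
    prompt="detailed sci-fi spaceship, metallic hull, ice level, concept art",
    image=sprite,
    strength=0.6,        # how far the result may drift from the original pixels
    guidance_scale=7.5,  # how strongly to follow the text prompt
).images[0]

restyled.save("vic_viper_ice_theme.png")
```

Run the same loop over every sprite sheet and tile set with a per-theme prompt and you have the raw material for a re-skin.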
That in turn raises the non-art idea of re-theming the gameplay itself – possibly using AI-generated movement or gameplay rules. Friction, gravity, jump height, and so on could all be given different models (Mario-style physics, Super Meat Boy physics, slidy ice-level physics), and the AI could come up with the gravity, bounce, and jump parameters.
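As a rough sketch of what those “different models” could be – assuming each theme is just a bag of tunable constants that a generator (AI or otherwise) fills in or blends – all the numbers here are made up:

```python
import random
from dataclasses import dataclass, astuple

@dataclass
class PhysicsProfile:
    gravity: float        # downward acceleration, pixels/s^2
    friction: float       # 0.0 = frictionless ice, 1.0 = instant stop
    jump_velocity: float  # initial upward speed of a jump, pixels/s
    bounce: float         # restitution when landing (0 = none)

# Hand-tuned presets for a few familiar "feels"
PRESETS = {
    "mario":    PhysicsProfile(gravity=1800, friction=0.85, jump_velocity=650, bounce=0.0),
    "meat_boy": PhysicsProfile(gravity=2600, friction=0.60, jump_velocity=900, bounce=0.0),
    "ice":      PhysicsProfile(gravity=1800, friction=0.05, jump_velocity=650, bounce=0.1),
}

def blend(a: PhysicsProfile, b: PhysicsProfile, t: float) -> PhysicsProfile:
    """Interpolate between two presets; the generator only has to pick t."""
    return PhysicsProfile(*(x + (y - x) * t for x, y in zip(astuple(a), astuple(b))))

# e.g. a new "slippery Mario" level theme
slippery_mario = blend(PRESETS["mario"], PRESETS["ice"], random.random())
print(slippery_mario)
```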
Photographer and filmmaker Nicholas Kouros spent “hundreds of hours” over 4 years creating a stop-motion meme-themed music video using paper prints and cutouts for a song called Ruined by the metal band Blame Kandinsky. He then created a new version using AI – in 4 days.
The work on the original physical shoot was intense:
“Cutting out all individual pieces was a serious task. Some of the setups were so labor-intensive, I had friends over for days to help out,” says Kouros.
“Every piece was then assembled using various methods, such as connecting through rivets and hinges. We shot everything at 12fps using Dragonframe on a DIY rostrum setup with a mirrorless Sony a7S II and a Zeiss ZE f/2 50mm Macro-Planar lens.”
In a move that likely avoided copyright issues, he used freely usable images. “Most of Ruined was made using public domain paintings and art found on museum websites like Rijks or the Met.”
After everything had been shot, the RAW image sequences were imported to After Effects and later graded in DaVinci Resolve.
Using AI instead
Kouros then created a second music video, but this time he used AI. The video took a fraction of the time to make. “In direct contrast with my previous work for the same band, Vague by Blame Kandinsky, it took a little over four days of experimenting, used a single line of AI text prompting, and 20 hours of rendering.”
“The text prompt line used was: ‘Occult Ritual, Rosemary’s Baby Scream, Flemish renaissance, painting by Robert Crumb, Death.’”
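Kouros doesn’t say which generator or settings he used beyond that prompt line, but as an illustration of how a single prompt can drive a whole batch of frames, here’s a minimal sketch with the open-source diffusers library (the model choice, frame count, and seeds are assumptions):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = ("Occult Ritual, Rosemary's Baby Scream, Flemish renaissance, "
          "painting by Robert Crumb, Death.")

# Same prompt, different seeds -> a batch of related but distinct frames
for i in range(8):
    image = pipe(prompt, generator=torch.Generator("cuda").manual_seed(i)).images[0]
    image.save(f"frame_{i:03d}.png")
```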
Kouros describes his experience with AI as “fun” and was impressed with the results that the image synthesizer gave him.
What was his final take?
“In my opinion, this specific style of animation won’t stand the test of time, but it will probably be a reminder of times before this AI thing really took off.
I embrace new tech as it comes along and I have already started making images with the aid of image generators. I’ve actually learned more about art history in this last year using AI, than in seven years of art schools.”
‘Les Tontons flingueurs’ doesn’t seem to have the first clue how silencers sound – or even how guns work. Many claim this was done on purpose, and it’s a well-known comedy scene. Still, there are some real laughs in the YouTube comments.
World Dance New York seems to teach some really high-end classes, ranging from hip-hop, flamenco, samba, fire dancing, and belly dance all the way to prenatal dance and even self-defense.
They publish some really high-quality performances like this one, which shows a great combination of class and artistry. How she can dance while balancing a sword on her head is astounding.
CETI (Creative and Emergent Technology Institute) is a local creative group that experiments with different technologies for creating unique experiences. Sarah Turner is a local artist who has been experimenting with different media and video technologies through a project she calls the Mobile Projection Unit. The project has set up projection mapping displays at a number of art and media festivals.
In this video she goes over some of the things she’s learned from these projection mapping setups:
3blue1brown makes lots of good videos on mathematics. One of those videos covers how to visualize and understand quaternions.
Quaternions are a higher-dimensional number system that can be used to describe 2D and 3D rotations. How they work, however, is often much harder to understand than simple matrix rotations.
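The core trick is representing a rotation as a unit quaternion q and computing q·v·q⁻¹ on the vector (treated as a quaternion with zero real part). Here’s a small sketch of that in plain NumPy, just to make the formula tangible:

```python
import numpy as np

def quat_mul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def rotate(v, axis, angle):
    """Rotate 3D vector v by `angle` radians about `axis` via q * (0, v) * q^-1."""
    axis = np.asarray(axis, dtype=float)
    axis /= np.linalg.norm(axis)
    q = np.concatenate([[np.cos(angle / 2)], np.sin(angle / 2) * axis])  # unit quaternion
    q_conj = q * np.array([1.0, -1.0, -1.0, -1.0])  # inverse of a unit quaternion
    p = np.concatenate([[0.0], v])                  # vector as a "pure" quaternion
    return quat_mul(quat_mul(q, p), q_conj)[1:]     # drop the scalar part

# Rotating the x-axis 90 degrees about z should give (0, 1, 0)
print(rotate(np.array([1.0, 0.0, 0.0]), [0, 0, 1], np.pi / 2))
```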
They made a very good video on the subject, but it required me to stop a lot and spend time thinking. These are complex concepts, and almost more complex to visualize or hold in your mind.
What’s nice is that there’s a written page covering these concepts as well at https://eater.net/quaternions. I found it much easier to digest than a fast-running video.
Also, if you want to play with the visualization in realtime, they even have a super-cool tool that lets you play with Quaternions in 2D, 3D, and 4D:
“Give me a stock clerk with a goal and I’ll give you a man that can make history. Give me a man without a goal, and I’ll give you a stock clerk.” – J.C. Penney
My dad almost always had a Zig Ziglar tape playing in the car when I was growing up. These simple principles of setting and achieving goals have stuck with me almost my entire life – and have led me along a tremendously successful career that I could never have imagined as a small kid growing up in rural Indiana.
I can attest that by following his simple and time-proven method of goal-setting, I have achieved many of the biggest goals I have wanted in my life. Give it a listen.
nVidia GPUs top the Stable Diffusion performance charts
They tried a number of different combinations and experiments, such as changing the sampling algorithms (though those didn’t make much difference in performance), output size, etc. I wish, however, that they had discussed and compared the memory sizes on these cards more clearly. Stable Diffusion is a memory hog, and having more memory definitely helps. They also didn’t check any of the ‘optimized models’ that let you run Stable Diffusion on as little as 4GB of VRAM.
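For reference, the standard memory-saving switches in the diffusers library look roughly like the sketch below – half precision, attention slicing, and CPU offload. I’m assuming these are broadly the kind of low-VRAM ‘optimized’ setups the benchmark skipped; they trade speed for memory:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the weights in half precision to cut VRAM roughly in half
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe.enable_attention_slicing()   # compute attention in chunks instead of all at once
pipe.enable_model_cpu_offload()   # keep idle submodules in system RAM, move them to the GPU as needed

image = pipe("a retro spaceship, pixel art").images[0]
image.save("test.png")
```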
There were some fun anomalies – like the RTX 2080 Ti often outperforming the RTX 3080 Ti.
AMD and Intel cards seem to be leaving a lot of performance on the table; their hardware should be able to do better than it currently does. The Arc GPUs’ matrix cores should provide performance similar to the RTX 3060 Ti and RX 7900 XTX, give or take, with the A380 down around the RX 6800. In practice, Arc GPUs are nowhere near those marks. This doesn’t shock me personally, since nVidia has been much more invested and at the forefront of developing and optimizing AI libraries.