Browsed by Month: February 2023

Amazing balance

World Dance New York seems to teach some really high-end classes, from hip-hop, flamenco, samba, fire dancing, and belly dance all the way to prenatal dance and even self-defense.

They publish some really high-quality performances, like this one, which shows a great combination of class and artistry. How she can dance while balancing a sword on her head is astounding.

Projection Mapping with MadMapper

CETI (Creative and Emergent Technology Institute) is a local creative group that experiments with different technologies for creating unique experiences. Sarah Turner is a local artist who has been experimenting with different media and video technologies through a project she calls the Mobile Projection Unit. The project has set up projection mapping displays at a number of different art and media festivals.

In this video she goes over some of the things she’s learned from these projection mapping setups:

Learning Quaternions

3blue1brown makes lots of good videos on mathematics. One of those videos covers how to visualize and understand quaternions.

Quaternions are a higher-dimensional number system that can be used to describe 2D and 3D rotations. How they work, however, is often much harder to understand than simple matrix rotations.
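
To make that concrete, here's a minimal pure-Python sketch (my own illustration, not something from the video): to rotate a 3D vector v by a unit quaternion q, you treat v as a quaternion with a zero real part and compute q * v * q⁻¹.

    import math

    def qmul(a, b):
        # Hamilton product of two quaternions stored as (w, x, y, z).
        aw, ax, ay, az = a
        bw, bx, by, bz = b
        return (aw*bw - ax*bx - ay*by - az*bz,
                aw*bx + ax*bw + ay*bz - az*by,
                aw*by - ax*bz + ay*bw + az*bx,
                aw*bz + ax*by - ay*bx + az*bw)

    def rotate(v, axis, angle):
        # Unit quaternion for a rotation of `angle` radians about `axis`
        # (axis is assumed normalized).
        s = math.sin(angle / 2)
        q = (math.cos(angle / 2), axis[0]*s, axis[1]*s, axis[2]*s)
        q_inv = (q[0], -q[1], -q[2], -q[3])  # conjugate = inverse for unit q
        w, x, y, z = qmul(qmul(q, (0.0, *v)), q_inv)
        return (x, y, z)

    # Rotate the x axis 90 degrees about z: (1, 0, 0) -> (0, 1, 0).
    print(rotate((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2))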

They made a very good video on the subject, but it required me to stop a lot and spend time thinking. These are complex concepts, and they're almost harder still to visualize or conceptualize in your mind.

What’s nice is that there’s a written page that goes over these concepts as well at https://eater.net/quaternions. I found it much easier to digest than a fast-paced video.

Also, if you want to play with the visualizations in real time, they even have a super-cool tool that lets you play with quaternions in 2D, 3D, and 4D:

https://eater.net/quaternions/video/rotation

Zig Ziglar and the power of setting goals

“Give me a stock clerk with a goal and I’ll give you a man that can make history. Give me a man without a goal, and I’ll give you a stock clerk.” – J.C. Penney

My dad almost always had a Zig Ziglar tape playing in the car when I was growing up. These simple principles of setting and achieving goals have stuck with me almost my entire life – and have led me through a tremendously successful career that I could never have imagined as a small kid growing up in rural Indiana.

I can attest that by following his simple and time-proven method of goal-setting, I have achieved many of the biggest goals of my life. Give it a listen.

nVidia GPUs top the Stable Diffusion performance charts

Tom's Hardware did a great benchmarking test on which GPUs do the best on Stable Diffusion.

They tried a number of different combinations and experiments, such as changing the sampling algorithms (though those didn't make much difference in performance), output size, etc. I wish, however, that they had discussed and compared the differences in memory sizes on these cards more clearly. Stable Diffusion is a memory hog, and having more memory definitely helps. They also didn't check any of the 'optimized models' that allow you to run Stable Diffusion on as little as 4GB of VRAM.
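
For what it's worth, here's roughly what those low-VRAM tricks look like in practice. This is just a sketch using Hugging Face's diffusers library (my choice of toolchain; the article doesn't say what they benchmarked with): half-precision weights plus attention slicing cut peak VRAM considerably, at some cost in speed.

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,   # half-precision weights roughly halve VRAM use
    ).to("cuda")
    pipe.enable_attention_slicing()  # compute attention in chunks: a bit slower,
                                     # but peak VRAM drops substantially

    image = pipe("a photo of an astronaut riding a horse").images[0]
    image.save("astronaut.png")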

There were some fun anomalies – like the RTX 2080 Ti often outperforming the RTX 3080 Ti.

AMD and Intel cards seem to be leaving a lot of performance on the table, because their hardware should be able to do better than it currently is. The Arc GPUs' matrix cores should provide performance similar to the RTX 3060 Ti and RX 7900 XTX, give or take, with the A380 down around the RX 6800. In practice, Arc GPUs are nowhere near those marks. This doesn't shock me personally, since nVidia has been much more invested in, and at the forefront of, developing and optimizing AI libraries.

lekktor Demoscene compressor

The '90s demoscene subculture was famous for building incredible visual demos with astoundingly small executable sizes. Many demoscene gatherings had maximum size requirements – often just a few hundred, or even a few DOZEN, kilobytes. Figuring out how to fit the most amazing tech into the smallest size was one of the great innovation drivers for these contests.

Once developers had exhausted their technical chops generating amazing art with minuscule code (using every trick in the book they could think of), they quickly found that hand-tuned compression was far too tedious and brittle. So they started building tools to do the compression for them.

I wrote about a more modern take on this where MattKC tried to fit an entire game into a QR code. Part of his adventure was compressing the executable using an old demoscene tool called Crinkler.

There were others; one of them, called lekktor, was first used on the .kkrieger demo. The story behind its development is a fun read, as is an interview its author did in 2005.

Apparently it used a form of code coverage as part of its analysis: it watched which code actually executed while you ran the application, and anything that never ran was treated as dead and stripped out. This had the dubious side effect of letting people use the down arrow on menus but not the up arrow, because nobody ever pressed the up arrow while training the compressor.
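
Here's a toy sketch of that idea in Python (purely my own illustration; lekktor itself operated on compiled C++ executables): record which functions actually run during a 'training' session, then treat everything never observed as dead code.

    import sys

    called = set()

    def tracer(frame, event, arg):
        # Record the name of every function that actually gets entered.
        if event == "call":
            called.add(frame.f_code.co_name)
        return tracer

    def menu_down():
        print("cursor moved down")

    def menu_up():
        print("cursor moved up")

    # The "training run": exercise the program the way a user would...
    sys.settrace(tracer)
    menu_down()   # ...except nobody ever presses the up arrow.
    sys.settrace(None)

    # Anything never observed running is assumed dead and gets stripped,
    # which is exactly how the up-arrow bug above happens.
    for name in ("menu_down", "menu_up"):
        print(name, "keep" if name in called else "strip")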

Auto-generation of 3D models from text

I've already written about nVidia's GET3D code, which can generate a wide variety of 3D objects using AI-trained networks. Those models, however, are tuned to generate specific classes of objects (chairs, cars, etc.), which requires a large labeled 3D dataset. nVidia provides simple ones, but if you want to generate particular styles or eras (only '50s-era cars, only 1800s-style furniture), you'll need to collect the data, label it, and train the model for that.

There's another player in town called DreamFusion that goes in a slightly different direction. Researchers from Google and UC Berkeley are using a similar method to generate 3D models from text. It gets around the need for lots of labeled 3D training data by using 2D text-to-image diffusion models (like Stable Diffusion, DALL-E, and MidJourney). They developed a loss metric that uses the 2D diffusion model to score rendered views of a candidate 3D model, then optimize the 3D model against that loss. They come up with some astounding results.
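
As a rough sketch of how that optimization loop works (my paraphrase of the paper's Score Distillation Sampling idea, with the renderer and the diffusion model stubbed out as hypothetical stand-ins so the structure is clear):

    import torch

    # Stand-in "3D model": a single learnable image rather than a real NeRF.
    params = torch.randn(1, 3, 64, 64, requires_grad=True)
    opt = torch.optim.Adam([params], lr=1e-2)

    def render(params, view):
        # Hypothetical differentiable renderer; DreamFusion renders a NeRF
        # from a randomly sampled camera pose here.
        return torch.sigmoid(params)

    def predict_noise(noisy_image, t):
        # Hypothetical frozen text-conditioned diffusion model. DreamFusion
        # uses Imagen; this stub returns random noise so the sketch runs.
        return torch.randn_like(noisy_image)

    for step in range(100):
        view = torch.rand(3)                # random camera pose
        image = render(params, view)
        t = torch.randint(1, 1000, (1,))    # random diffusion timestep
        noise = torch.randn_like(image)
        noisy = image + noise               # noise-schedule weighting omitted
        eps_hat = predict_noise(noisy, t)
        # Score distillation: nudge the render toward images the diffusion
        # model finds likely for the prompt; the gradient skips the
        # diffusion model itself.
        image.backward(gradient=(eps_hat - noise))
        opt.step()
        opt.zero_grad()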

There is also a paper by Nikolay Jetchev called ClipMatrix that attempts the same text-to-2D-to-3D generation. He also seems to be experimenting with animations and something called VolumeCLIP that does ray-casting.

This kind of end-to-end pipeline is exactly what content makers want. Unfortunately, it also means it could likely decimate an art department. This kind of technology could easily be used to fill the non-critical areas of ever-expanding 3D worlds in games and VR with very minimal effort or cost. In theory, it could even be done in pseudo-realtime. Imagine worlds in which you can walk in any direction, forever, and constantly see new locations and objects.

CLIPMatrix and VolumeCLIP AI based 3D model generation

As I mentioned in my previous article, there is a paper by Nikolay Jetchev called ClipMatrix that attempts to generate 3D models from 2D images that are generated by text-to-image diffusion models (like Stable Diffusion, DALL-E, MidJourney, etc.). A list of his other papers can be found here.

He now seems to be working on auto-generated models that are animated automatically. (Content note: he seems to love generating content based on heavy metal lyrics, demons, and other fantastical creations, which I don't think demonstrates how well this would work on more 'normal'-looking models):

Originally tweeted by Nikolay Jetchev (@NJetchev) on March 10, 2022.

Looking at his Twitter stream, he also seems to be working on a version called VolumeCLIP that appears to generate voxel objects he can ray-cast into.

“The Fire Dwarf Blacksmith”

Originally tweeted by Nikolay Jetchev (@NJetchev) on January 26, 2023.

My heads are gone!

Are the heads getting cut off of the images you're generating in Stable Diffusion?

Try adding these keywords to your prompt:

  • “A view of”
  • “A scene of”
  • “Viewed from a distance”
  • “Standing on a “
  • “longshot”, “full shot”, “wideshot”, “extreme wide shot”, “full body”
  • Start the prompt with “Head, face, eyes”
  • Try adjusting the aspect ratio of the image to be taller instead of wider (see the sketch after this list). Be careful not to go too tall (or too wide) or you’ll get the double-head artifact or start generating combinations of two people.
  • Much of the source material was scanned in a taller aspect ratio, so try adjusting the x-side of your ratio
  • Use img2img on a crop that includes part of the chest to make it match the rest of the drawing
  • Cinematography terms tend to work well. In order of close to far: Extreme close-up, close-up, medium close-up, medium shot, medium full shot, full shot, long shot, extreme long shot.
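
For example, here's what the taller-aspect-ratio and framing tips could look like in code, using Hugging Face's diffusers library (the post doesn't name a specific tool; this is just one common way to set output dimensions, and the prompt below is only illustrative):

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # Taller-than-wide output plus distant-framing keywords helps keep
    # heads in frame. Dimensions must be multiples of 8, and going much
    # taller than this risks the double-head artifact mentioned above.
    image = pipe(
        "full body shot of a knight standing on a hill, long shot,"
        " viewed from a distance",
        height=768,
        width=512,
    ).images[0]
    image.save("knight.png")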
