Koe Recast, from Texas-based developer Asara Near, lets you dramatically transform your voice into a wide variety of styles – even the opposite gender. The website demo lets you convert a 20-second clip as a preview of the commercial product, which is currently in private alpha testing.
I guess the old TV trope of concealing your voice with a handkerchief over the telephone is long gone.
I have my opinions about cryptocurrency, all of which have come true – in spades. Like many, I have a fascination with watching each crypto train wreck unfold.
If you like predicting train wrecks, PricePredictions has a little AI bot that tries to predict crypto prices.
They start with some simple AI image generation and move on to more and more complex examples, including a brief introduction to some key parameters, changing and including broader image sources, and then generating various famous artistic styles.
They finish out the intro with some links to help you learn more:
Lexica — a repository of images generated using Stable Diffusion and the corresponding prompt. Searchable by keyword.
Stable Diffusion Artist Style Studies — A non-exhaustive list of artists Stable Diffusion might recognize, as well as general descriptions of their artistic style. There is a ranking system to describe how well Stable Diffusion responds to the artist’s name as a part of a prompt.
The AI Art Modifiers List — A photo gallery showcasing some of the strongest modifiers you can use in your prompts, and what they do. They’re sorted by modifier type.
Top 500 Artists Represented in Stable Diffusion — We know exactly what images were included in the Stable Diffusion training set, so it is possible to tell which artists contributed the most to training the AI. Generally speaking, the more strongly represented an artist was in the training data, the better Stable Diffusion will respond to their name as a keyword.
The Stable Diffusion Subreddit — The Stable Diffusion subreddit has a constant flow of new prompts and fun discoveries. If you’re looking for inspiration or insight, you can’t go wrong.
Remember old-school movies that were damaged, in black and white, and where everyone ran around at 2x speed? AI processing can fix many of those problems. The Olden Days YouTube channel has a number of great restored videos like this.
Amazing to see that, once fixed, this looks just like a snowball fight you might see today – proving we aren't as different from the people of the past as we'd like to think.
Disney published this paper about using AI to digitally age and de-age actors in a fraction of the time it usually takes for normal frame-by-frame manual aging techniques used today.
FRAN (which stands for face re-aging network) is a neural network that was trained using a large database containing pairs of randomly generated synthetic faces at varying ages, which bypasses the need to otherwise find thousands of images of real people at different (documented) ages that depict the same facial expression, pose, lighting, and background. Using synthetically generated training data is a method that’s been utilized for things like training self-driving cars to handle situations that aren’t easily reproducible.
The age changes are then added/merged onto the face. This approach appears to fix a lot of the issues common to this kind of work: facial identity loss, poor resolution, and unstable results across subsequent video frames. It still has some trouble with greying hair and with aging very young actors, but it produces better results than techniques from just a few years ago (not that the bar was very hard to beat).
Architects and designers are increasingly experimenting with AI generated art and designs. Michael Arellanes II of MA2 Studio created a series called ‘Synthetic Futures’ in which he experiments primarily with Midjourney in an attempt to create a consistent and controlled aesthetic for architecture.
I personally think wide-scale use of AI-based art generation to continue a theme, or even to explore and create new ideas and directions, is a foregone conclusion at this point. I’m continually astounded by the results these algorithms generate – results that will only get better, and quickly.
Arellanes seems to agree when he says: ‘The current open platforms for AI imagery work from word descriptions alone, as opposed to architectural 3D modeling and/or encoding surface parameters. This leaves the operator with flat images or AI impressions based on descriptions with extraordinary results of the unexpected. The unexpected results are the most exciting aspect of this new paradigm. As designers test the limits of AI’s imagination and complex image compositions, new possibilities emerge that have never been seen before.’
AI generated art has caught fire. Learning how to craft the text prompts that generate the art is still a matter of trial and error. But some folks are helping by sharing example prompts so you can learn what works and what doesn’t.
All the items below were 100% auto-generated and included on the page. It looks like people are exploring and sharing different prompts to generate different kinds of art.
Using online AI art generation sites like DALL-E, Midjourney, and GPT-3 isn’t free or unlimited for most folks. For example, DALL-E 2 was charging 10 cents per prompted generation attempt, so trying a few hundred prompts can quickly add up. Even with free generators like Stable Diffusion, experimenting with prompts can be time consuming.
It only makes sense that we’re witnessing the rise of specialist prompt writers and online marketplaces where you can buy and sell high-quality prompts that get the desired results much faster. This saves users money on API fees and the time spent tuning a prompt to get what they want.
These roles even have names now. A prompt engineer is a specialist adept at writing the text prompts an AI model needs to generate reliable outputs (such as graphics, text, or code) at a reasonable price. They can then sell their specialized prompts on a prompt marketplace – a site where users can buy and sell prompts. The prompt maker usually keeps 80% of each sale, and the marketplace takes a 20% cut.
Below are some of the top paid Prompt Marketplaces. Definitely worth browsing to see the amazing work that can be generated by AI art algorithms.
PromptBase – offers an amazing number of prompts for just a few dollars:
PromptHero – seems to be geared towards higher-end generation
One of the first things you’ll run into is that you won’t be able to generate images at 512×512 or larger on a graphics card with 8GB of VRAM or less – even smaller images if you only have 4GB of VRAM. The first/easiest fix is to limit the output image size. There is also an option that splits the model into 4 parts and loads each separately (though it will take longer), or you can use a more optimized/compressed set of trained model data.
So how do you do that if you have an older graphics card with only 4GB or 8GB of VRAM? TingTingin has some tips at the end of his installation video for cards with 8GB of VRAM (NVIDIA 3070s, for example).
Summary (at 15:45): Modify your txt2img.py and add the line ‘model.half()’ after ‘model = instantiate_from_config(config.model)’ in the load_model_from_config() function.
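To make the change above concrete, here’s a minimal runnable sketch of what that edit does. The real script builds the model via instantiate_from_config(); the DummyModel class below is a hypothetical stand-in so the example is self-contained. The key idea is that calling .half() (a real PyTorch nn.Module method) converts the weights from 32-bit to 16-bit floats, roughly halving the VRAM needed to hold them.

```python
class DummyModel:
    """Hypothetical stand-in for the Stable Diffusion model object.
    In the real script this comes from instantiate_from_config(config.model)."""
    def __init__(self):
        self.dtype = "float32"   # default precision: 4 bytes per weight

    def half(self):
        # PyTorch's nn.Module.half() converts all parameters to float16,
        # roughly halving the VRAM needed to hold the weights.
        self.dtype = "float16"
        return self

def load_model_from_config(config=None):
    model = DummyModel()   # real script: instantiate_from_config(config.model)
    model.half()           # <-- the one added line from the video's tip
    return model

print(load_model_from_config().dtype)  # float16
```

The trade-off is slightly reduced numerical precision, which for image generation is usually invisible in the output.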
It turns out you can’t just set up an AI model and let it crank for years. You need to watch for something called drift. You can tell whether your AI model is drifting by monitoring model accuracy, outputs, and inputs on an ongoing basis and re-balancing or re-training as needed.
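As a toy illustration of input monitoring, here’s a hypothetical sketch of one crude drift check: measure how far the mean of live production inputs has shifted from the training-time mean, in units of the training data’s standard deviation. Real systems use more rigorous tests (e.g. population stability index or Kolmogorov–Smirnov), but the principle is the same.

```python
import statistics

def drift_score(reference, live):
    """How many reference standard deviations the live-data mean has
    shifted from the training-time mean. A score above some threshold
    (here we use 3.0) flags the input feature for investigation."""
    ref_mean = statistics.mean(reference)
    ref_std = statistics.stdev(reference)
    return abs(statistics.mean(live) - ref_mean) / ref_std

reference = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2]   # inputs seen at training time
stable    = [10.0, 10.1, 9.9, 10.2]              # production inputs, no drift
drifted   = [14.9, 15.2, 15.1, 14.8]             # production inputs, drifted

print(drift_score(reference, stable) > 3.0)   # False
print(drift_score(reference, drifted) > 3.0)  # True
```

Run periodically against a sliding window of recent inputs, a check like this catches the case where the world has changed out from under your model before accuracy metrics (which need labeled data) can tell you.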