Category: AI

VR ready to replace your desktop?

People are starting to experiment with the latest VR headsets – especially the Meta Quest 3 and Quest Pro. One of the big questions: can I finally get rid of my desktop environment and work purely with a VR headset?

It turns out most reviewers believe the time is almost here.

Hallden thinks it is possible, but points out some issues with working in moving environments (like airplanes), connectivity and lag, and the possible advantages of an AR versus a VR solution. His take is primarily from a coder's point of view.

Alan Truly also believes the time is almost here, but points out app quirks with copy-paste, the browser, and content editing, and that the extra pound of weight on your head might be too much for a full 8-hour workday.

Articles:

Autonomously plowing your fields – from a phone 1500 miles away

At the John Deere booth at this year's CES in the Las Vegas Convention Center, conventiongoers could do something incredible with an iPhone. They could push the PAUSE button on an iPhone and, thirteen hundred miles away, in the middle of a field outside Austin, Texas, a giant, bright green, driverless tractor stopped short. Hit RESUME and the tractor started up again. Put down the iPhone and the tractor resumed tilling the field, all by itself.

The breadth of what you could do with the tractor via the demo app was limited: you could stop and resume the tractor and increase or decrease its speed, both in a straight line and while turning. There are no steering controls. But what this signals is huge.

In the demo, a farmer first geo-fences the field boundaries, and the tractor then determines its own path based on how wide the tiller is. Tillage is the only job the technology currently handles, but John Deere hopes to have a complete autonomous production system supporting every step of the farming process by 2030.
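
To make the idea concrete, here's a toy sketch of how a coverage path could be derived from a field boundary and an implement width. It is purely illustrative – it assumes a simple rectangular field and is emphatically not John Deere's actual planner:

```python
# Toy sketch (not John Deere's actual planner): given a rectangular
# geo-fenced field and an implement width, generate a simple
# back-and-forth ("boustrophedon") coverage path. The function name
# and the rectangular-field simplification are assumptions.

def coverage_path(field_width_m, field_length_m, tiller_width_m):
    """Yield (x, y) waypoints for parallel passes one tiller width apart."""
    x = tiller_width_m / 2  # center the first pass on the implement
    heading_up = True
    while x <= field_width_m - tiller_width_m / 2:
        start_y, end_y = (0, field_length_m) if heading_up else (field_length_m, 0)
        yield (x, start_y)  # enter the pass
        yield (x, end_y)    # drive its full length
        x += tiller_width_m
        heading_up = not heading_up  # alternate direction each pass

# Example: a 100 m x 200 m field tilled with a 6 m implement.
for waypoint in coverage_path(100, 200, 6):
    print(waypoint)
```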

The John Deere spokespeople ballparked such a tractor at $600,000 to $700,000, with the autonomous technology adding a further $100,000 on top of that. Tractors from the 2020 model year and up can also likely be retrofitted with the tech; the update should "take only about a day," according to a 2022 CNET story.

There's no doubt in my mind this is how the future of farming will look. It's been coming for a long time, and spending long hours out in the field will almost certainly be a thing of the past very soon.

There are already predictions that John Deere and other equipment manufacturers will run fully autonomous fleets that they manage themselves and simply send to your fields on a subscription-like basis.

Article

Google adds watermarks to AI generated audio and images

AI-generated audio created using Google DeepMind's Lyria model, or YouTube's new audio-generation features, will be watermarked with SynthID to let people identify it as AI-generated. Google says the watermark shouldn't be detectable by the human ear, and it should still be detectable even if an audio track is compressed, sped up or slowed down, or has extra noise added.

SynthID also works on images and is supposed to remain detectable even after modifications like adding filters, changing colors, and saving in lossy formats like JPEG.

This is part of the response to the presidential executive order on AI-generated content that was issued back in October 2023.

Links:

Fully robotic/automated restaurant

“To our knowledge, this is the world’s first operating restaurant where both ordering and every single cooking process are fully automated”

CaliExpress is a burger joint in Pasadena with a unique staff. Instead of people flipping burgers, its burgers and fries are made by a robot called Flippy from Miso Robotics, and orders are placed through PopID, a biometrics-powered payment technology. In short, robots run the place. The claim is that it's safer, since no humans risk oil burns or spills; stressful jobs are reduced; and those running the systems make much higher wages.

The location also has displays on the Flippy development timeline, including robotic arms from retired Flippy models, photographic displays, and 3-D printed artifacts from its development. The group encourages local schools to come for tours, with the goal of inspiring future AI and automation development.

Now, if you combine that technology with experiences like Inamo in London…I think you have an amazing restaurant that can almost literally just run itself.

This kind of automation shouldn't be a surprise. California passed a bill requiring chain restaurants to pay a $20/hour minimum wage starting in April 2024, so it's no wonder restaurants are finding that automating jobs is cheaper than paying employees.

If the goal was to create more living-wage jobs, the law is doing exactly that: removing low-skilled burger-flipping and hospitality jobs and replacing them with tech jobs. But that means the burger flippers need to be able to take those robot-repair jobs, and that's not something all of them can do. It's another in a long line of examples of how well-meaning but short-sighted legislators may actually be working against their lowest-skill constituents because they don't think through the long-term consequences.

Or, it could be exactly what they intended: widening the divide between rich and poor as low-skill workers' jobs are replaced with high-skill jobs they may not be able to do.

Articles:

Everyone looked real

How's this for clever – and frightening? Scammers deepfaked several employees on a video conference and managed to steal $25 million from a Hong Kong-based firm.

The scam used digitally recreated versions of the company's chief financial officer and other employees, who appeared in a live deepfaked video conference call instructing an employee to transfer funds. The animated faces and voices were good enough to fool the victim.

The scam began as a phishing attempt: an employee in the finance department of the company's Hong Kong branch received a message, purportedly from the company's UK-based chief financial officer, instructing them to execute a secret transaction. Despite initial doubts, the employee was convinced by the presence of the CFO and others in a group video call, and made 15 transfers totaling HK$200 million to five different Hong Kong bank accounts.

Articles:

Attacking AI with Adversarial Machine Learning

Adversarial machine learning is a branch of machine learning that tries to trick AI models by providing carefully crafted, deceptive inputs designed to break their algorithms.

Adversarial attacks are attracting more and more research, but they had humble beginnings. The first attempts were by protest activists using very simple defacing or face-painting techniques. Dubbed CV Dazzle, the approach sought to thwart early computer-vision detection routines by painting geometric patterns over faces and objects.

These worked on very early computer-vision algorithms but are largely ineffective against modern CV systems. The creators of this kind of face painting were largely artists, who now describe the effort more as a political and fashion statement than as an effective countermeasure.

More effective approaches

It turns out that you can often fool algorithms in ways not actually visible to average users. This paper shows that you can cause AIs to consistently misclassify adversarially modified images by applying small but intentionally worst-case perturbations to examples from the dataset. The perturbed input makes the model output an incorrect answer with high confidence. In the paper's famous example, a panda photo is combined with a perturbation to produce an image that looks unchanged to a human but is classified by the model as something else entirely, at high confidence.
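
For the curious, here's a minimal PyTorch sketch of the fast gradient sign method (FGSM) described in that paper. The choice of pretrained model, the epsilon value, and the function name are illustrative assumptions, not anything from the article:

```python
# Minimal FGSM sketch. Model choice and epsilon are illustrative.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

def fgsm_attack(image, true_label, epsilon=0.007):
    """Return a copy of `image` perturbed to raise the model's loss."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    # Nudge every pixel a tiny step in the direction that increases
    # the loss; the result looks identical to a human.
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0, 1).detach()

# Usage: x is a (1, 3, 224, 224) image tensor scaled to [0, 1],
# y its true class index. model(fgsm_attack(x, y)) will now often
# misclassify, and do so with high confidence.
```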

This isn't the only technique; there are many more. One of them, the Generative Adversarial Network (GAN), is actually used to improve AI models: one network attempts to fool another, and that adversarial pressure helps train the model to be more robust – like working out at a gym, or practicing the same thing with many variations.
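
A GAN training step boils down to two moves: the discriminator learns to separate real samples from generated ones, and the generator learns to fool it. A minimal PyTorch sketch, where the tiny networks and data shapes are placeholder assumptions:

```python
# Minimal GAN training-step sketch illustrating the adversarial game.
# Network sizes and data shapes are placeholder assumptions.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))  # noise -> fake sample
D = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))   # sample -> real/fake logit
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def train_step(real_batch):
    noise = torch.randn(real_batch.size(0), 16)
    fake = G(noise)

    # Discriminator: label real data 1, generated data 0.
    d_loss = bce(D(real_batch), torch.ones(len(real_batch), 1)) + \
             bce(D(fake.detach()), torch.zeros(len(fake), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: try to make the discriminator output 1 on fakes.
    g_loss = bce(D(fake), torch.ones(len(fake), 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```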

Nightshade and Glaze

This kind of attack isn't just academic. Some artists see themselves in an ongoing battle with generative AI algorithms.

Nightshade is a tool that artists can use to alter the pixels of an image in a way that fools AI and computer-vision systems while looking unchanged to human eyes. If poisoned images are scraped into an AI model's training set, they get misclassified, progressively corrupting the trained model.

Glaze is a tool that prevents style mimicry. It computes a set of minimal changes that appear unchanged to human eyes but look to AI models like a dramatically different art style. For example, a human sees a charcoal portrait, but an AI model might see the glazed version as a modern abstract portrait. So when someone then prompts the model to generate art mimicking the charcoal artist, they get something quite different from what they expected.
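
Conceptually, both Nightshade and Glaze amount to a similar optimization: nudge an image's pixels so that a model's feature extractor sees something different, while keeping the changes below human visibility. Here's a heavily simplified sketch of that idea – emphatically not either tool's actual algorithm; the feature extractor, visibility budget, and step counts are placeholder assumptions:

```python
# Conceptual sketch only -- NOT Nightshade's or Glaze's actual algorithm.
# Idea: find a small perturbation that pushes an image's features (as a
# model sees them) toward a different target while keeping pixel changes
# below a visibility budget.
import torch

def cloak(image, target_features, feature_extractor,
          budget=0.03, steps=200, lr=0.01):
    delta = torch.zeros_like(image, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        feats = feature_extractor(image + delta)
        # Pull the cloaked image's features toward the target...
        loss = torch.nn.functional.mse_loss(feats, target_features)
        opt.zero_grad(); loss.backward(); opt.step()
        # ...while clamping the perturbation so it stays invisible.
        with torch.no_grad():
            delta.clamp_(-budget, budget)
    return (image + delta).clamp(0, 1).detach()
```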

The AI Arms Race is On

As with anything, we're now in an arms race, with lots of papers on the various kinds of adversarial attacks and on how to protect your models and training data from them. Viso.ai has a good overview of the space that will get you started.

Links:

Procedurally generated VR city

Vuntra City is a procedural VR city generator built in Unreal Engine 5, developed by a single person over the last few years. I know, I know – procedurally generated content has some serious shortcomings. Too many games with procedural content are just thinly veiled programmer art designed to fill space rather than be part of the experience.

The author does a great job of recognizing those traditional limitations and attempting to fix them. Probably their best observations come not from the technical side, but from the aesthetic side.

It turns out they have arrived at an excellent solution from just a few good observations and shockingly simple engineering. As an engineer, I see far, far too many projects over-complicate things that could be done much more simply. Simplicity is how you know you're on the right track; complexity leads to tears.

After two years of experimenting, they have a really interesting solution. Check out the VuntraCity YouTube channel for videos of how they experimented with different techniques and solutions. I particularly liked how they used a plain old treemap layout to break up boring city-grid structures. Combining it with a caching and pooled-allocation system is nothing new, but it was a good little optimization.
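
To give a flavor of the treemap trick, here's a tiny sketch of recursively splitting a district rectangle into uneven city blocks. The split rules and sizes are my own assumptions, not Vuntra City's actual code:

```python
# Treemap-style subdivision sketch: recursively split a rectangle to get
# varied city blocks instead of a uniform grid. Parameters are assumptions.
import random

def subdivide(x, y, w, h, min_size=20):
    """Recursively split a rectangle into uneven 'city block' rects."""
    if w < 2 * min_size and h < 2 * min_size:
        return [(x, y, w, h)]  # too small to split further: one block
    blocks = []
    if w >= h:  # split the longer axis at a random point
        cut = random.uniform(min_size, w - min_size)
        blocks += subdivide(x, y, cut, h, min_size)
        blocks += subdivide(x + cut, y, w - cut, h, min_size)
    else:
        cut = random.uniform(min_size, h - min_size)
        blocks += subdivide(x, y, w, cut, min_size)
        blocks += subdivide(x, y + cut, w, h - cut, min_size)
    return blocks

# Example: break a 500 x 500 district into irregular blocks.
print(len(subdivide(0, 0, 500, 500)))
```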

Links:

US Copyright, Patents, and Generative AI

There's a lot of misinformation and misrepresentation of copyright and patent law when it comes to generative AI. The US Copyright Office has already flip-flopped on this issue, and the Chinese courts have come to a completely different ruling, holding that AI-generated images can be copyrighted. It appears this could be a political war as much as a legal one.

A lot of the social-media hyperbole is being fueled by fear and uncertainty. Not that there isn't a real problem with generative AI threatening people's livelihoods or possibly violating copyright; but it's worth knowing what one is talking about before heading off with pitchforks and torches.

I found this article on IPWatchdog informative about the actual legal arguments – but it's important to remember that the jury is still out, and the US Copyright Office ruled exactly the opposite way on this issue just a year ago. First off, what does copyright protect (compared to a patent)?

The Supreme Court laid out the difference first in Baker v. Selden, and re-emphasized it a century later in Mazer v. Stein. "Unlike a patent, a copyright gives no exclusive right to the art disclosed; protection is given only to the expression of the idea—not the idea itself." In this way, each type of intellectual property right exists in different types of creations, which arise in different ways and have different requirements for protection. "[C]opyright protects originality rather than novelty or invention," which is the domain of patents, said the Court in Mazer.

Indeed, what the Court made clear in Feist v. Rural is that authorial works need to be original; that is, both created independently and "creative." Other cases, such as Bleistein v. Donaldson, spoke of original expression as a "personal reaction upon nature," where the author contributes "something recognizably his own," per Alfred Bell.

So the question for copyright becomes: is AI creative? This is a tough point, because it's not clear what creativity really is. However, that philosophical or neuroscientific question is not that important when it comes to law. What matters is the language previously used to describe what is protected.

The article's author indicates that the emerging legal arguments suggest the kind of "creativity" covered by copyright is specifically human creativity. Neither the courts nor the US Copyright Office have so far found AI to be creative with respect to the wording of existing copyright law.

Whether that argument holds up is a whole other story; law is fickle and can change. It also doesn't touch on the question of fair use of publicly displayed images, or the argument that AI might simply be using copyrighted work to learn techniques while producing its own derivative works – which is exactly what art students do, and the whole point of going to art school.

Either way, we're likely to see one of the most important legal decisions in decades, with profound repercussions for future generations.

Links:

China rules AI Generated art is copyrightable

In stark contrast to Western rulings (well, except some early ones) and China's previous stance of tight control over generative AI, a Chinese court has just awarded copyright protection to AI-generated images.

The case revolved around the generation of a pop-idol image, not the use of copyrighted images to train a generative AI model, which is the subject of a current US lawsuit brought by artists.

The argument was one we've been hearing already: because a human being wrote the relevant parameters for the AI model and ultimately selected the image in question, the final output is directly generated from their intellectual input and "reflects the plaintiff's personalized expression."

It will be interesting to see how this goes.

Generative AI legal battles heat up

More developments in the copyright fight between generative AI companies and artists: the previous lawsuit has been amended and updated.

After a first round in which the judge rejected several arguments, things have been tightened up a bit.

  1. New plaintiffs – photographers and game artists among them – have joined the lawsuit.
  2. New arguments have been added:
    • In an effort to expand what counts as copyrighted by artists, the complaint claims that even non-copyrighted works may automatically be eligible for copyright protection if they include the artist's "distinctive mark," such as a signature, which many do contain.
    • AI companies that relied on the widely used LAION-400M and LAION-5B datasets – which contain only links to copyrighted works plus metadata about them, and were made available for research purposes – would have had to download the actual images to train their models, thus making "unauthorized copies."
    • The suit claims that the very architecture of diffusion models – in which visual "noise" is added to an image in multiple steps, and the model then learns to reverse the process to get back to the initial image – is itself designed to come as close as possible to replicating the initial training material. The lawsuit cites several papers about diffusion models and claims they are simply "reconstructing the (possibly copyrighted) training set." (A minimal sketch of this noising-and-denoising loop follows the list.)
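
For reference, here's what that noising-and-denoising loop looks like in a minimal DDPM-style PyTorch sketch. The schedule values and the toy denoiser are illustrative assumptions:

```python
# Minimal DDPM-style sketch of the diffusion idea the suit describes:
# noise an image forward in steps, then train a model to undo the noise.
# Schedule values and the tiny denoiser are illustrative assumptions.
import torch
import torch.nn as nn

T = 1000
betas = torch.linspace(1e-4, 0.02, T)            # per-step noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative signal kept

# Toy denoiser; a real one is a U-Net that also conditions on the step t.
denoiser = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 784))

def training_step(x0):  # x0: batch of flattened images in [0, 1]
    t = torch.randint(0, T, (x0.size(0),))
    noise = torch.randn_like(x0)
    a = alphas_bar[t].unsqueeze(1)
    x_t = a.sqrt() * x0 + (1 - a).sqrt() * noise   # forward: add noise
    pred = denoiser(x_t)                           # reverse: predict the noise
    return nn.functional.mse_loss(pred, noise)     # learn to undo the noising
```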

This third point is likely the actual meat of the suit, but they haven't spelled it out as clearly as I think they should have. To me, the questions that are really the crux of the matter are:

  1. Do large-scale models work by generating novel output, or do they just copy and interpolate between individual training examples?
  2. Is training on copyrighted art covered by fair use, or does it qualify as copyright violation?

Even if generative AI loses all of these arguments, it doesn't mean generative AI is going away. Models can still be trained on huge volumes of non-copyrighted images and data, or on data purchased and licensed for the purpose. Beyond that, companies have already been training models on data collected from their own products (which you give to them for free by using services like Siri on the iPhone, Amazon Alexa, and Google's assistant) and on generated synthetic training data.

Links: