
Use AI to teach you better AI prompting techniques


It’s becoming more and more common to use AI to help you understand what AI is doing. Task decomposition, asking AI to break a job into steps it can implement, has been around a long time. Prompt decomposition is a newer technique I had not seen before: the idea is to use AI to help generate a good prompt that will get you what you want.

You could run this on a cheaper local model for free to generate the prompts you then use on expensive models.

The query goes something like this:

I want to create a high-quality prompt for this task:

[TASK]

Before writing the prompt, identify the 5–7 high-leverage prompt dimensions for this task — the core variables, constraints, context, output requirements, or stylistic choices that will most determine the quality of the result.

For each dimension, briefly explain:

1. Why it matters

2. What tradeoff or decision it controls

3. How it should influence the final prompt

Then turn those dimensions into a polished, copy-ready prompt.

The final prompt should be clear, specific, and structured. It should include the necessary context, role, task instructions, constraints, output format, quality criteria, and guidance for handling ambiguity. 
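If you want to reuse the query above, a minimal sketch is to keep it as a template and drop the task in with a small helper. The names `META_PROMPT_TEMPLATE` and `build_meta_prompt`, and the `low`/`high` parameters, are my own placeholders for illustration. How you send the resulting text to a model (local or hosted) is left out on purpose, since that depends on your setup.

```python
# Template text copied from the query above; {task} and the dimension range
# are filled in by build_meta_prompt (a hypothetical helper, not a library API).
META_PROMPT_TEMPLATE = """I want to create a high-quality prompt for this task:

{task}

Before writing the prompt, identify the {low}-{high} high-leverage prompt dimensions for this task: the core variables, constraints, context, output requirements, or stylistic choices that will most determine the quality of the result.

For each dimension, briefly explain:
1. Why it matters
2. What tradeoff or decision it controls
3. How it should influence the final prompt

Then turn those dimensions into a polished, copy-ready prompt.

The final prompt should be clear, specific, and structured. It should include the necessary context, role, task instructions, constraints, output format, quality criteria, and guidance for handling ambiguity."""


def build_meta_prompt(task: str, low: int = 5, high: int = 7) -> str:
    """Return the meta-prompt with the task description filled in."""
    task = task.strip()
    if not task:
        raise ValueError("task must not be empty")
    if low > high:
        raise ValueError("low must not exceed high")
    return META_PROMPT_TEMPLATE.format(task=task, low=low, high=high)


if __name__ == "__main__":
    # Paste the printed text into whichever model you use to draft prompts.
    print(build_meta_prompt("Write release notes for a mobile banking app update"))
```

Keeping the template in one place means you can tweak the wording once and reuse it for every task.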

Automated sand delivery


Driverless trucks are becoming a thing in remote areas of Texas, where they are used to feed an endless supply of sand to fracking wells. AI and fracking. How could that go wrong? 😀

DeepSWE rates your LLM


Rating exactly how well an AI does on tasks is still an open field. There are benchmarks, but there have been lots of arguments that these early benchmarks are too limited or biased. A new player is on the field, and they seem to have discovered that some benchmarks are actually evaluating incorrectly a shocking amount of the time.

A startup called Datacurve released a benchmark it says does a much better job. DeepSWE, a 113-task evaluation spanning 91 open-source repositories and five programming languages, produces a dramatically wider spread among the same frontier models.

Its biggest shock is the claim that many benchmarks aren’t even correct. Their tests cover an impressively wide range of characteristics, including bigger, more representative tasks. They avoid what they call ‘contamination’, which results from benchmarks that rely on simple GitHub coding samples that some models simply regurgitate rather than truly generate. And most damning: they found that some benchmarks’ verifiers (the part of the code that checks whether what the AI built is correct) give false positives or false negatives 8-24% of the time.
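To make the verifier error rates concrete, here is a rough sketch of how you could measure them yourself if you had hand-checked a sample of results. This is not how Datacurve computed its numbers (the post I read does not say); `verifier_error_rates` is a hypothetical helper and the sample data is made up.

```python
from typing import Sequence, Tuple


def verifier_error_rates(
    verdicts: Sequence[bool], truths: Sequence[bool]
) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate) for a verifier.

    verdicts[i] is True when the verifier accepted solution i.
    truths[i] is True when a human review found solution i actually correct.
    """
    if len(verdicts) != len(truths):
        raise ValueError("verdicts and truths must have the same length")

    # Accepted by the verifier but actually wrong.
    false_pos = sum(1 for v, t in zip(verdicts, truths) if v and not t)
    # Rejected by the verifier but actually right.
    false_neg = sum(1 for v, t in zip(verdicts, truths) if not v and t)

    wrong_total = sum(1 for t in truths if not t)
    right_total = sum(1 for t in truths if t)

    fp_rate = false_pos / wrong_total if wrong_total else 0.0
    fn_rate = false_neg / right_total if right_total else 0.0
    return fp_rate, fn_rate


if __name__ == "__main__":
    # Made-up sample: four solutions, the verifier gets two of them wrong.
    verdicts = [True, True, False, False]
    truths = [True, False, True, False]
    print(verifier_error_rates(verdicts, truths))
```

A verifier that passes wrong solutions inflates a model’s score, and one that fails correct solutions deflates it, so both rates matter when you compare models.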


More benchmarks and more testing are valuable for evaluating models, so hopefully these guys will help push the industry toward more scrutiny and reproducible real-world results.

You can even go download it and try it on your own models.


BMAD and Ralph Wiggum


Do you want to write an app? Don’t know anything? How about something so simple that even Ralph Wiggum could use it to generate a working app?

BMAD and the Ralph Wiggum loop (invented by Geoffrey Huntley) are methods that create an AI loop that first builds something, then tests it. They can help you not only create apps and solutions, but also continually improve them.
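The build-then-test idea can be sketched in a few lines. To be clear, this is not BMAD’s or Huntley’s actual implementation, just a generic loop under my own assumptions: `generate` stands in for whatever call asks your AI for a new attempt, and `run_tests` stands in for whatever checks that attempt. Both are hypothetical callables you would supply.

```python
from typing import Callable, Optional, Tuple


def build_test_loop(
    generate: Callable[[Optional[str]], str],
    run_tests: Callable[[str], Tuple[bool, str]],
    max_iterations: int = 5,
) -> Tuple[Optional[str], int]:
    """Repeat build -> test until the tests pass or the attempts run out.

    generate(feedback) returns a new candidate; feedback is None on the
    first attempt and the last test report afterwards.
    run_tests(candidate) returns (passed, report).
    Returns (candidate, attempts_used), or (None, max_iterations) on failure.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_iterations + 1):
        candidate = generate(feedback)            # build
        passed, feedback = run_tests(candidate)   # test
        if passed:
            return candidate, attempt
    return None, max_iterations


if __name__ == "__main__":
    # Stand-in functions so the sketch runs without any AI service.
    def fake_generate(feedback: Optional[str]) -> str:
        return "a + b" if feedback else "a - b"

    def fake_run_tests(candidate: str) -> Tuple[bool, str]:
        return "+" in candidate, "expected addition, got subtraction"

    print(build_test_loop(fake_generate, fake_run_tests))
```

The cap on iterations matters: without it, a loop that never passes its tests would keep spending tokens forever.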

The existential crisis is real


Vibe coding is causing software engineers to have an existential crisis. What happens when you have an ‘easy’ button that largely spits out things that just work? What are you even doing anymore?

3D AI Generated worlds


Project Genie is an experimental Google DeepMind AI system that creates interactive, navigable 3D worlds from text prompts, sketches, or images. Powered by the Genie 3 world model, it simulates physics and consistent environments in real time.

90% of losses caused by drones


The Russians are having a very rough spring. Ukrainian forces have reliable, verifiable numbers showing that over 35,000 Russian soldiers were killed or seriously wounded in March alone.

Even crazier is that 95% of those casualties were caused by drones. Drones are becoming so prevalent that they’re regularly attacking and killing targets up to 100km behind enemy lines.

Modern warfare is changing profoundly as we watch. While these drones are all piloted by actual people, imagine turning thousands and thousands of AI-controlled drones armed with grenades loose on a battlefield. Entire battles could be won with automated killers.

This could also be done by terrorist groups or assassins. Drones could be turned loose at a rally or in a government office building to seek and destroy key targets or cause mass casualties. The future is frightening.

Profound turn


Intuit TurboTax, Credit Karma, QuickBooks, Mailchimp, and Intuit Enterprise Suite are now available in Claude. Whether you’re a consumer estimating a tax refund or seeking help improving your credit, or a business owner looking to improve cash flow or trying to find new customers, Claude users can access powerful Intuit capabilities and experiences to provide personalized financial insights and recommendations you can seamlessly act on directly inside a Claude conversation.

https://www.intuit.com/blog/news-social/intuit-apps-now-available-in-claude/

If you’ve been paying attention at all, you’ve seen that people’s workflows are completely changing. People who know and work with AI now do almost everything in their AI agents: summarize this data, reword this report, write this code, design this logo, etc. From data analysis to programming to crafting reports and architectural documents, all of it is changing overnight right before our eyes.

It’s no surprise that software companies are being forced to face this new reality. But this announcement from Intuit marks an interesting second turn: software being absorbed into AI instead of AI being added to software as a bolt-on feature.

I’ve already heard of developers looking up simple apps and tools and, on finding that the app cost $10 and not wanting to pay, simply asking AI to make the app for them. People are now using AI to generate everything from apps to things as complicated as compilers.

What does the world look like if you never have to buy software at all anymore? What if even your grandma can just ask AI to make the exact software she needs? The idea is profound and disturbing for the software industry, but the reality is that it IS happening right now.

Many are probably correct in saying that any company doing software as a service (SaaS) is a dead man walking. Software won’t be going away, but the people writing it likely will be. Maybe that’s why Computer Science grads have been among the most unemployed college graduates since 2024.

AI is making hidden moral choices in its responses


AI is starting to amorally consider alternatives and make moral judgements you might not even be aware of.