Earth shattering…

While it might not look like much, this software program, called the Eureka machine, takes standard video input, examines the behavior of a system, and, with no previous knowledge of the system’s physics, kinetics, etc., generates equations that accurately describe what is going on.

From simple video input and a little massaging, the program was able to generate the Hamiltonian equation for the difficult double-pendulum problem in about 30 minutes, and, in another case, a Lagrangian equation that describes a double harmonic oscillator – all in very short periods of time.



This is very cool, and to some degree it is just an expansion of what we have been able to do with neural net programming that ‘learns’ by trying out techniques and checking their results against reality. But the ability of the program to generate equations takes this all a step further.

To give an example of what this brings about: they recently applied the algorithm to some complex data collected on cell interaction. While the scientists had struggled to make any meaning of the patterns, the program was able to come up with a formula that accurately described how these cells worked. But this presented a new problem. While the equation seemed to match exactly what was going on, the scientists who fed in the data couldn’t figure out what physical components the equation’s variables related to. They made the decision NOT to publish the equation, because they didn’t actually understand how it modeled the system.

While that is not surprising from a program that simply generates an equation from data, it’s the first time that these computers might actually be out-matching us at modeling systems. And since they are unfettered by any need to make the variables correspond to real-world phenomena, they are free to generate equations whose variables aren’t necessarily based on the underlying physical phenomena. THIS is the interesting part.

It seems (rightly) that just modeling the situation isn’t sufficient to say you understand it. Does understanding a phenomenon require understanding the underlying principles? Should it? Sure, you might be able to come up with an equation that models what’s going on for the cases you have, but without understanding the principles behind it, you’re just putting your faith in the equations generated. But is this what we do today?

I have been taught since 6th grade science class that every scientific principle is only one repeatable counterexample away from being refuted.  History is full of such events – including the most deeply held principles, such as Newton’s laws of motion. Depending on the scale at which they are used, they either work very well or, in the quantum/astrophysical realms, fall apart completely.  Those rules have been getting ‘touch-ups’ for years.  While Newton certainly isn’t categorically wrong, it’s clear we didn’t (and still don’t) have all the corners fleshed out.

So we find the crux of the matter – why should the equations generated by this program be any less deserving of our trust than Newton’s?  I’d say the answer lies in a few requirements: rigorous review, numerous experimentally repeatable verifications, and, apparently, that the equation be explainable with principles and terms that we DO understand.  The first part is very understandable.  No scientific statement worth its salt should be accepted without lots of peer review, repetition of the experiment by others in different conditions, public discussion, and confirmation via different methods.  This program required user intervention to strike a balance between absolute accuracy and ‘simplicity’, which means it had to go through numerous iterations, with a little bit of prior knowledge, to get it to generate equations that corresponded to principles we understand. This implies it could generate different equations for the same phenomenon.  More on this later…

But the second reason, and the one the jury appears to be out on, is that one needs to be able to explain WHY the equation works, or at least that it be based on terms we do understand.  In other words, just pulling the ‘answer’ out of the back of the book isn’t real understanding.  The right answer doesn’t seem to be sufficient by itself for science to classify it as real knowledge.  For science, we also apparently need to be able to explain why it’s right.  Only then can we actually say we have a decent understanding of something.

The unanswered question is whether that requirement of being built on understood principles needlessly inhibits us.  What if we just ‘went with the flow’ and let machines like this generate those horribly difficult equations for us?  What would that look like, and what would it imply?  The equations that the software generated didn’t always correspond to previously known/modeled phenomena, and it needed to be ‘guided’ by the user to answers in the form they wanted.  But does this imply that the computer, in other circumstances, might be revealing a different *kind* of thinking that we could backtrace?  What if those equations are just like another ‘culture’ or ‘language’ that sees the same reality in a different, but no less valid, way that we could explore and understand? I think that could be an interesting discussion for another entry.

This instance reminds us that there are very important philosophical principles behind what is considered scientifically known and not.  Principles that have real and interesting effects; and depending on when/where you lived, there were/are very different requirements for what is considered knowledge.

In case you’re interested, the philosophical study of what it means to know something is called Epistemology – and it might be worth a look.  (Is my philosophy undergrad work showing?)

A parking meter can teach a software engineer many things.

I go to The Old New Thing a fair amount. It’s a blog written by a veteran systems-level guy at Microsoft. He puts some fun articles up, and I had to respond to this one. Here was his original article:

A car park in Birmingham switches from English to German in times of stress. Over a decade ago, a colleague of mine noticed an error message on the screen at the exit to the parking garage at the Seattle-Tacoma International Airport. The way the airport works, you pick up a ticket as you enter, and you pay your parking fee at vending machines stationed around the parking garage, and at the exit, you insert the (paid) ticket into the machine, which verifies that you paid your parking fee and opens the gate.

When my colleague pulled up to the machine, it had an error message. In German. Fortunately, my colleague knows German, and he recognized the error message as a Windows 95 serial port conflict resolution dialog. While he was trying to figure out how to click Abbrechen on a machine with no mouse or keyboard, an attendant walked up, took his ticket, and opened the gate.

A conversation ensued about what went wrong and the right way to fix it. Here was one guy’s response:

yuhong2: Well, as I remember, the dialog in question relates to multiple DOS virtual machines trying to access a serial port at the same time. There was several ways to virtualize a hardware device across multiple DOS VMs, and one of them was to allow only one DOS VM to access a device at a time. If two of them tried to access at the same time, the only option was to pop up a conflict resolution dialog like this one, and Win3.x and Win9x had built-in support for doing this. More information on all this stuff can be found in the old DDK docs (like the Windows Server 2003 SP1 DDK) that had the VxD documentation.

At which point I needed to reply with the following, and got some serious ‘Amen brothers’:

@yuhong2 – your answer is totally plausible, but shows what a lot of us do (me included) – in the face of a bug like this – we just keep jumping down the engineering rat/rabbit hole without stopping to ask a more fundamental question of whether you’re even on the right track with your architecture.  I’ve learned to stop looking down the hole and to try and smell them coming before I start. Trust me – there IS no bottom. And more importantly, when you’re spending your time looking off the cliff, you’re not looking at your goal anymore.

After being a software engineer for 10 years on major projects, I’ve learned this one lesson about code usability: if you’re using more than 3-4 sentences your mom can’t understand to describe a fix or architecture, something has likely gone, or will soon go, very wrong.

While I completely understand what yuhong2 is saying and it’s very plausible and intelligent, if it’s true, it shows me that someone made a terrible choice of platform/architecture when solving this problem of a parking meter. I’ve worked with more than a few tech leads who come up with horrendously complicated algorithms and architectures to solve problems that could be solved MUCH more easily. And you know what? The easier solution, while perhaps not the fastest-running or coolest-looking, is almost always the fastest to ship and the best long-term. Why? Because those complex architectures have even more complex problems. See yuhong2’s answer.

As the project grows, it just gets worse and worse – not better. Until you need a PhD to figure out why some part of your threaded, interconnected data structures is hanging or getting corrupted once every 30 days. This should be a moment to stop and ask yourself what you’re really doing. Sure, certain problems really do require complex solutions (e.g. 3D graphics, threaded applications and drivers – some of what I do), but know your tools and their strengths and limitations (the hardware platform, the language you’re using, the software/OS stack, etc). Yes, you darn well better have a good toolbox of the latest algorithms, languages and OS info, but your toolbox will do nothing but get you into trouble if you don’t know how and when (or when NOT) to really use those tools. If yuhong2 is right, you see how the complex solution backed you into a corner, and it shows that someone probably picked the wrong stack or algorithmic approach when a very simple box that checked times would have worked.

I have worked long enough to also know that many times platform choices are out of your control as an engineer – budgeting and marketing often limit you. But if you’re forced to use a platform, design so as to avoid the limitations of that platform – don’t try to force it into doing what you want. As a rule of thumb, always stick with the simplest solution that completely solves the problem first; then you’ll be re-writing to solve performance and feature issues instead of core this-thing-doesn’t-even-work-yet problems. At worst, I’ll have a working system that’s slow, and I can then optimize and re-design the parts that need it. I won’t have wasted time optimizing things that may not have needed any help. Now the caveat is that you really must know what you’re doing and why you’re making those choices for simplicity. It’s just as bad to stupidly pick an architecture that’s too simple for the problem and backs you just as far into a corner. But I’ll argue a working, slow system will always sell and ship before one that’s 10 months late and MIGHT be faster. Simplicity also makes the code more maintainable and much more extensible long term, as there’s less inherent inertia in the code to move about.

Software is like using clay to make art. Sure, you can beat and force and manipulate it to create very complex structures – but you start making more and more complex problems for yourself, e.g. if you try to build a car engine from clay. If you work *with* the clay/code’s natural strengths and don’t force it to do what it isn’t good at, however, you end up with a beautiful thing that’s simple, works well for its intended use, and is very easy to maintain (a clay bowl).

Another guy summarized even better:

For any sufficiently complicated system, there will always be failure modes where the average Joe has but two options: call an expert to fix it, or arrange his life to not use that system. In the case of the parking garage, Joe’s only rational response is to use a different exit lane.

Computer interface design goes wrong when the programmer refuses to admit that the system is in such a state. The parking garage’s system should have displayed “Closed – use a different lane”, with the error message shown on a maintenance screen.

This echoes some of the great wisdom I’ve been learning from this great book:

I believe it should be on every programmer’s shelf.  I’ll do some more postings as I get through it.

An interesting puzzle: eyAnOicgPT4gJycsICcgJyA9PiAnLScsICdzXG4nID0+ICdzLmNvbVxuJyB9 (3548, 4648)

This mysterious email popped up on craigslist in the jobs section – spawning an interesting online contest that sucked up most of yesterday.

http://www.networkmirror.com/hUmsXHsC3yihic9B/denver.craigslist.org/sof/514727825.html

I was very skeptical, suspecting it was more viral marketing for Cloverfield (http://www.1-18-08.com/), which I was not at all interested in promoting. But the puzzles got interesting, then more interesting, then more. I got interested in the coding parts, and a small community popped up to answer the questions.

The solutions broke down like this.

1. The original text was simply Base64-encoded (the numbers in the title point to RFCs 3548 and 4648, “The Base16, Base32, and Base64 Data Encodings”), which gave you some instructions:
{ ':' => '', ' ' => '-', 's\n' => 's.com\n' }

on how to decode the message title – which gave you a web address to go to: wanted-master-software-developers.com
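
As a rough illustration of step 1, here is a minimal sketch for Node.js. The Base64 string is the one from the title. The craigslist title itself is not quoted in this post, so the title passed in below is a hypothetical reconstruction from the resulting address, and reading '\n' as "end of the title line" is also an assumption.

```javascript
// Decode the Base64 hint (Buffer is a Node.js global).
const hint = "eyAnOicgPT4gJycsICcgJyA9PiAnLScsICdzXG4nID0+ICdzLmNvbVxuJyB9";
const decodedHint = Buffer.from(hint, "base64").toString("utf8");

// Apply the three replacement rules from the decoded hint, in order.
// Assumption: '\n' in the hint stands for the end of the title line.
function applyRules(title) {
  return (title + "\n")
    .split(":").join("")          // ':'   => ''
    .split(" ").join("-")         // ' '   => '-'
    .replace(/s\n$/, "s.com\n")   // 's\n' => 's.com\n'
    .trim();
}

// Hypothetical title, reconstructed from the address it has to produce.
const address = applyRules("wanted: master software developers");

console.log(decodedHint);
console.log(address); // wanted-master-software-developers.com
```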

2. You then had to code up a function that satisfied the sequence of test sections. It turned out to be a logic diagram that had ‘falling’ true/false parts of the matrix that acted like tetris pieces with an extra ‘sticky’ rule. There were a variety of ways to solve this coding function – brute force, or mimic the logic of the falling true/false sections. Here was a short answer:

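f = function(d) {
  // Walk the rows from the bottom up, comparing each row with the one above it.
  for (var i = d.length - 1; i > 0; i--)
  {
    for (var j = 0; j < d[i].length; j++)
    {
      // A true cell drops into the empty cell below it only if it has no
      // true neighbour to its left or right, and the cell below is not
      // flanked by true cells on both sides.
      if (d[i-1][j] == true && d[i][j] == false && d[i-1][j+1] != true
          && d[i-1][j-1] != true && (d[i][j-1] != true || d[i][j+1] != true)) {
        d[i-1][j] = false;
        d[i][j] = true;
      }
    }
  }
}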
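
To see the ‘falling’ and ‘sticky’ behaviour, here is a small sketch that runs the same logic on two made-up grids. The grids are hypothetical and are not from the contest's test suite; the function is repeated only so the sketch runs on its own.

```javascript
// Same logic as the short answer above, repeated so this runs standalone.
const f = function (d) {
  for (let i = d.length - 1; i > 0; i--) {
    for (let j = 0; j < d[i].length; j++) {
      if (d[i - 1][j] == true && d[i][j] == false &&
          d[i - 1][j + 1] != true && d[i - 1][j - 1] != true &&
          (d[i][j - 1] != true || d[i][j + 1] != true)) {
        d[i - 1][j] = false;
        d[i][j] = true;
      }
    }
  }
};

// Hypothetical grid 1: a lone true cell with empty space below it.
const lone = [
  [false, true, false],
  [false, false, false],
];
f(lone); // the cell moves down one row

// Hypothetical grid 2: two true cells side by side.
const pair = [
  [true, true, false],
  [false, false, false],
];
f(pair); // neither moves: each has a true neighbour in its own row

console.log(JSON.stringify(lone)); // [[false,false,false],[false,true,false]]
console.log(JSON.stringify(pair)); // [[true,true,false],[false,false,false]]
```

Note that one call moves a cell down by at most one row, so it behaves like a single tick of the simulation.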

People tried cheating by doing:
f = function(d) { TDD.assertEquals = function(a,b) { return true; } }

But when you got through all the tests successfully, the function spits out a weird list of words. These words are from the Wikipedia article on Henry Ford (found via the other clues embedded in the HTML). People wrote down the indexes of those words, then rewrote them as deltas (the distances between successive words), which led to the sequence:

0,1,1,2,1,1,2,1,1,2,1,1,2,2,2,1,1,2,1,1,2,4,2,3,3,1,1,-2,0,1,1,-2,0,1,1,-5,0,0,0,-1,2,-4,-2,1,-1,2,0,-2,1,-5,0,1,1,-4,-2,0,1,1,-4,-2,0,1,1,-2,0,-2,-2,0,1,1,0,-2,1,-5,0,0,0,-4,0,0,0,-2,-2,0,1,1,-6,0,1,1

When these are fed back into the correct F function (which you figured out above), the algorithm’s true/false matrix is converted to blue blocks that spell out “coLLAborATE” in the 2D grid below, which you add as the ?key= parameter to the address at the top:
http://www.wanted-master-software-developers.com/?key=coLLAborATE

Which gives a cryptic box with text and a strange pixely border around it.

3. Problem 2/3:

When viewing the HTML, the id tags on each section were strange. When pulled out in order, they gave this sequence:
IMCB OMC JHKC PHL ODLTP ACC DCOLDB OH IMCBJC VTT AOVDOCK BHI SHAOXBQ
PHL CZLVTA EC ODVBASHDOA OH DCVTEA LBJMVDOCK

Which was a simple substitution cypher:
WHEN THE CODE YOU TRULY SEE RETURN TO WHENCE ALL STARTED NOW POSTING YOU EQUALS ME
TRANSPORTS TO REALMS UNCHARTED
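
For the curious, here is a minimal sketch of the decoding step. The key below was read off by lining up the ciphertext with the plaintext given above. How the solvers originally worked it out is not recorded here, so treat the key table as a reconstruction and not as their method.

```javascript
// Ciphertext from the id tags (the two lines joined with a space).
const cipherText =
  "IMCB OMC JHKC PHL ODLTP ACC DCOLDB OH IMCBJC VTT AOVDOCK BHI SHAOXBQ " +
  "PHL CZLVTA EC ODVBASHDOA OH DCVTEA LBJMVDOCK";

// Cipher letter -> plain letter, reconstructed from the known answer.
const key = {
  I: "W", M: "H", C: "E", B: "N", O: "T", J: "C", H: "O", K: "D", P: "Y",
  L: "U", D: "R", T: "L", A: "S", V: "A", S: "P", X: "I", Q: "G", Z: "Q",
  E: "M",
};

// Replace every letter found in the key; leave spaces and unknowns alone.
function decodeSubstitution(text, table) {
  return text
    .split("")
    .map((ch) => table[ch] || ch)
    .join("");
}

const plainText = decodeSubstitution(cipherText, key);
console.log(plainText);
```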

So go back to problem 1/3, and enter the http address:

wanted-master-software-developers.com
and change it to:

wanted-master-software-developers.com/?you=me

Which leads you to page 3/3

Problem 3/3

People started noticing that the text in 2/3 hadn’t been solved – and that the image around 2/3 was unique and not around the 3/3 question. People noticed the name strawberry-rhubarb.css was strange too – along with the font name, Boulder-18. There were also some patterns in the bit layout of the weird border image. After looking at the image a bunch, they counted the number of grey pixels between black pixels and got: 3 1 4 1 5 9 2 6 5 = pi. From the first red pixel to the second red pixel is the pi encoding. From the 2nd to the 3rd red pixel, the number before the green pixel is an index into pi, and the number after the green pixel is the 6 digits of pi at that location (to verify you’re not insane). After the 3rd red pixel come many more indexes in this form.

So, someone downloaded the first million digits of pi and wrote a program to do the work for us. You get a big list of indexes into pi, and the values they point to. Every one of those indexes is a 6-digit value, and they all start with either a 0 or a 1. This got people thinking: if you split each 6-digit index into two 3-digit numbers and interpret those as ASCII codes, each index is a pair of ASCII characters.

So the first few indexes extracted from the image give:
111112 = 111 112 = o p
032099 = 032 099 = ‘space’ c
111100 = 111 100 = o d
101115 = 101 115 = e s
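
The split shown in the four lines above can be sketched in a few lines of code. This is only an illustration using those four indexes; the function name is made up, and the step that extracts the indexes from the border image is not shown.

```javascript
// Split a 6-digit index into two 3-digit decimal ASCII codes.
function indexToChars(index) {
  const padded = String(index).padStart(6, "0");
  const first = Number(padded.slice(0, 3));
  const second = Number(padded.slice(3));
  return String.fromCharCode(first, second);
}

// The first four indexes listed above, kept as strings so the
// leading zero of "032099" is not lost.
const indexes = ["111112", "032099", "111100", "101115"];
const decodedText = indexes.map(indexToChars).join("");

console.log(decodedText); // op codes
```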

Put together, that spells “op codes” – wow! Keep going and you get:

op codes: e: push integer value of next ascii char (list 1). u: pop
value and output as ascii char. l: pop value, push ceil (value/2). a:
pop two values, push sum. i: pop two values, push 1st popped - 2nd
popped. n: pop value, push value + 1. t: pop value, push value - 1. r:
pop value, push value * 2. other: discard. list 1: -, A, B, I, N, R.
eAeNlaueNe-nlaueAe-ttaueAe-ttaueBe-au = hello
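
The op codes above describe a small stack machine. What follows is a minimal JavaScript sketch of one way to read them, not the Java solver mentioned in this post. One assumption to flag: it lets `e` push the code of whatever character comes next and does not check that the character is in list 1. It also does no error handling for an empty stack.

```javascript
// Run a program written in the puzzle's op codes and return its output.
function run(program) {
  const stack = [];
  let output = "";
  for (let k = 0; k < program.length; k++) {
    switch (program[k]) {
      case "e": // push the character code of the next character, then skip it
        k++;
        stack.push(program.charCodeAt(k));
        break;
      case "u": // pop a value and output it as a character
        output += String.fromCharCode(stack.pop());
        break;
      case "l": // pop a value, push ceil(value / 2)
        stack.push(Math.ceil(stack.pop() / 2));
        break;
      case "a": // pop two values, push their sum
        stack.push(stack.pop() + stack.pop());
        break;
      case "i": { // pop two values, push 1st popped - 2nd popped
        const first = stack.pop();
        const second = stack.pop();
        stack.push(first - second);
        break;
      }
      case "n": stack.push(stack.pop() + 1); break; // increment
      case "t": stack.push(stack.pop() - 1); break; // decrement
      case "r": stack.push(stack.pop() * 2); break; // double
      default: break; // any other character is discarded
    }
  }
  return output;
}

const greeting = run("eAeNlaueNe-nlaueAe-ttaueAe-ttaueBe-au");
console.log(greeting); // hello
```

For example, `eAeNlau` pushes 65 and 78, halves 78 to 39, adds to get 104, and outputs `h`.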

This ‘algorithm’ makes sense when looking at the garbled text on 2/3 and 3/3. I followed the algorithm on the text by hand, but after 2 minutes, I realized that writing up the solver in Java would be faster. I wrote up the stack machine/rules in Java, ran the text on 2/3 through it, and got:
cerebrum, vere-tempus, together (adv).

The text on 3/3 gives:
Explain the significance of the date:
(with 1-18-2007). The button’s text is: Go.

So, you put the answer on 3/3, but the question is 2/3. But what did it mean?

So, folks brainstormed to get cerebrum=brain, vere-tempus = real-time, and together = simul/una = as one. After scads of folks googling all kinds of combinations, one guy hit on: “+brain, +real-time +una” comes up with a link to http://www.n-brain.net/faq.html

That is a collaborative project called UNA, being released mid-January, and it is based in Boulder, CO (hence the nonsense font name, Boulder-18).

So, the significance of 1-18-2007 is that it’s the release date for n-brain’s UNA project. More fiddling around with combinations followed (with spaces, without, etc). People looked at the code for the button, tried answers both encoded and not, and hit upon the phrase:

UNAreleasedate

Re-encode it using their method to get (it can be encoded in many different ways if you’d like):

eRnnnueNueAueRleIaue-leNaueRleBanue-leNaue-leIanueBleRaue-leNaueBleBanue-leIanueBleRanue-leNau

Enter that in the text box on 3/3 page and that gives you the solution and a link to the congratulations page – indicating I was solver #88. I entered the form, but declined the job interest (I’m happy where I am right now). But come mid-January I should have a copy of some free software!

Blogger v1.0 released

Check out my projects page to get a copy. It’s a little tool I developed for simultaneously generating the HTML and RSS blog feeds used on this website. You can see the project page for details, downloads, etc.  I generated this very entry with it!

RSS feed active!

I undertook a little coding project for fun.  After a number of requests, I’ve added an RSS feed to the site for you bloggers. I wrote up a little C++ app to update my blog in both HTML and RSS 0.91/XML format. The RSS feed only keeps the last 10 articles. The code for the blogger is really messy right now, but it works. I’m planning on cleaning it up and then publishing the writer class. It’s silly nobody has released such a tool before…
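
To give a feel for the RSS side, here is a minimal sketch in JavaScript. It is not the C++ writer class, which isn't shown here. The function names, the example.com addresses, and the post data are all hypothetical, and the output has not been validated against the RSS 0.91 spec. It assumes the posts arrive newest first, so keeping the first 10 keeps the latest 10.

```javascript
// Escape the characters that would break XML text content.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build an RSS 0.91 document, keeping at most `maxItems` posts.
// Assumption: `posts` is ordered newest first.
function buildRss(channel, posts, maxItems = 10) {
  const items = posts.slice(0, maxItems).map((post) => [
    "    <item>",
    `      <title>${escapeXml(post.title)}</title>`,
    `      <link>${escapeXml(post.link)}</link>`,
    `      <description>${escapeXml(post.description)}</description>`,
    "    </item>",
  ].join("\n"));

  return [
    '<?xml version="1.0"?>',
    '<rss version="0.91">',
    "  <channel>",
    `    <title>${escapeXml(channel.title)}</title>`,
    `    <link>${escapeXml(channel.link)}</link>`,
    `    <description>${escapeXml(channel.description)}</description>`,
    `    <language>${escapeXml(channel.language)}</language>`,
    ...items,
    "  </channel>",
    "</rss>",
  ].join("\n");
}

// Hypothetical data: twelve posts, numbered 12 (newest) down to 1.
const posts = Array.from({ length: 12 }, (_, n) => ({
  title: `Post ${12 - n}`,
  link: `http://example.com/blog/${12 - n}.html`,
  description: `Entry number ${12 - n} & its summary`,
}));

const feed = buildRss(
  {
    title: "Example blog",
    link: "http://example.com/",
    description: "Hypothetical feed",
    language: "en-us",
  },
  posts
);

console.log(feed);
```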