Putting spell check in 64k of RAM

Putting spell check in 64k of RAM

Abhinav Upadhyay walks us through a wonderful bit of computer history. He talks about how Steve Johnson at AT&T wrote one of the first spell checkers. His method could encode a word in just 14 bits of memory; so a dictionary with 30,000 entries would take up a fraction under 52 kB. This is even better compression than gzip – and it can perform fast lookups.

Once the dictionary grew to 30,000 words, the Bloom filter approach became impractical. Douglas McIlroy’s solution was to store differences between sorted hash codes , after discovering these differences followed a geometric distribution. These followed a distribution that could be easily run length encoded with something called Golomb’s code.

It’s a fantastic examination of applied computer science. Definitely worth a read

Articles:

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.