Help win a contest and get an iPhone 3GS!


The Story


Let's work together to win the Engine Yard programming contest! The prize is a new iPhone 3GS and $2,000 of cloud computing time. Help us win and we'll give you half of the prize - your choice of iPhone or the cloud time ...


How to Help

It's simple. To win the contest we have to find a sentence that hashes to a value similar to another sentence. The only way to do it is to try lots of variations, and that means lots of computers. That is where you come in.

We're a crowdsourcing company, so we figured we'd amass as many people as possible. Is it crowdsourcing? Maybe, maybe not, but we think it's the best way to win this contest. To help us out and possibly win an iphone, just download our app and run it.

Do you have lots of machines? Run it on all of them an increase your chances of winning :).

Download the application (click on the big DL link for more info.)


The app will message us occassionaly with the best matches that your computer finds. If we win the contest, we'll pass along half of the prize to the person that gave us the best answer the soonest.




More Details Than You May Want...

The goal is to find a short sentence from a dictionary of words with the lowest possible hamming distance to another short sentence.

SHA-1 is a hash function, and while some weaknesses have been exposed, there really isn't a way to find a close match much better than just searching sentences. We're crowdsourcing guys, not cypto/performance experts, so we just wanted a reasonably fast sha-1 and hamming distance implementation. The rest is up to how many people we can gather and a little bit of luck.


Further Thoughts

It was clear that with a reasonable iteration function and hamming function, the majority of time would be spent in the sha-1 calculation. We started with Steve Reid's public domain implementation , which ran twice as fast as the standard ruby library's implementation. We then switched to Christophe Devine's GPL implementation , which was roughly twice as fast again. On my machine I can check around 1.5 million sentences per second.

I was intrigued by Daniel Niggebrugge's very fast SHA-1 cracker and modified it to work for the contest, but found that it didn't always give correct answers on every architecture. I would guess that we're within an order of magnitude of the fastest possible implementation, and hopefully we can compensate for that in numbers :).


More implementation details

To generate our string we're allowed to choose words from a dictionary of 1000, change case, and append five arbitrary characters. The amazing thing about combinatorics, is if you have an 100 character sentence with 90 letters in it, just looking at uppercase vs lowercase would require checking 2^90 cases. 2^90 cases is roughly 10^(9*3), which at 1 million sentences per second would take 40 quadrillion computer-years to finish. Shorter strings are more efficient to SHA-1 - in particular if your string is longer than 63 characters, you have to do the whole operation twice.

So we restrict ourselves to a single string with 12 3-letter words, and 11 spaces, followed by 5 characters randomly generated by every process. To search the entire captialization space would require 2^(12*3+5)=2^41 checks, which is roughly 2 quadrillion states -- at 1million shas/sec that would take about 25 days.

More discussion