Thursday, December 28, 2006

2006 Year-End Language Version Wrapup

At year's end, to give a sense of the passage of time, we present the current versions of some of our favorite programming languages and tools.


We close this list with a quote from the British computer scientist who created the ML programming language, given in reply to the question "Did you learn Greek or Latin?":

"I learned Greek from the age of ten and Latin from the age of eight and I regarded them both as a kind of defective form of mathematics because they were taught with such accuracy."

--Robin Milner

Source: An Interview with Robin Milner by Martin Berger.

Thursday, December 21, 2006

All I want for Christmas

Attn: Request Department, Geek specialist
House of Santa Claus
North Pole

As I could not find a reputable looking web site to post my request, I'm just putting this stuff on my blog and assuming your Google alert for me will notify the right people.

Thanks for all the great stuff you got me last Christmas. Like, a new job, this blog, the chance to meet that guy from the New Zealand Supercomputing Centre, and an entertaining and thought-provoking rant in the form of The Kingdom of Nouns. It wasn't what I asked for, except for the job, but I liked it all anyway. Even the Microsoft Windows Compute Cluster 64MB promotional USB memory key that was so cheap and ubiquitous I got two at different events.

For Christmas 2006, I'd like an assortment from the following wishes:


  1. A better mental model of the web browser/web server interaction. The one I've got now is pretty rusty and broken down. I mean, how am I supposed to think about the back button in the presence of an AJAX interface? What's the right way to view multiple user browsers pointed to the same web site, as far as sharing or not sharing cookies and sessions?

  2. An elegant solution to the Tower of Babel situation regarding data representation.

  3. Tab completion built in to more widgets that I use on a regular basis.

  4. Better Ant documentation.

  5. Better techniques for isolating code in Java. Onejar is a pretty good start.

  6. Time to catch up on Java. I'm still stuck on 1.4.2, and now they've announced 1.6.

  7. Time to play with Ruby.


I don't know how you're going to package the "time" requests, but try to do better than giving me insomnia or something.

Tuesday, December 12, 2006

Splitting the Zip

Can you remember a time when you thought "Goodbye to splitting a file across a bunch of floppies, it all fits on a CD now"? Maybe, like me, you were never 100% comfortable with this form of information surgery, where you split a file, usually a valuable one, into parts, transmit it somewhere, and reassemble it. If so, you breathed a sigh of relief when you thought those days were over, because hey, CDs store everything. Well, it was only a matter of time before the scenario revisited me.

In this case, the patient was a 1.2GB zip file. In order to help a certain close relation meet an academic commitment at the end of the semester, I was called upon to install at home a trial version of the program used in the school lab. The problem came after the two-hour download, when the file was reported to be corrupted and we noticed about 45 megabytes were missing from the expected size.

We figured that transmitting such an enormous file left too much opportunity for error. Something got dropped in transmission. We could re-download the file on a computer with a better, faster Internet connection than our DSL hookup, and we did this fairly quickly. But we still needed to get the file to our computer at home somehow. I was surprised that Firefox wasn't doing some kind of data integrity checking. My solution then was to do it manually: split the file into chunks and verify each chunk with its MD5 hash. Each chunk that arrived intact would be real progress, because a failed transfer would cost us only that chunk instead of sending us back to the start.

Hoping to find standard Unix commands to do this, I found that split and cat both support byte mode, so they could be used. I did a quick check that the Cygwin versions worked as I thought. The procedure I used is this:

Step 1. Decide what size chunks to use

I settled on a size of 300,000,000 bytes, which splits the file into five chunks.

Step 2. Split the file


% split -b 300000000 valuablefile.zip

This created a series of files xaa, xab, xac, xad, xae. The first four have the exact size specified, 300000000, and the last (xae) has the remaining bytes.
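To see the naming and sizing behavior without a 1.2GB file on hand, here is a scaled-down sketch. It assumes a throwaway 1000-byte file called demo.bin in a scratch directory called demo_split (both names are made up for illustration) and uses 300-byte chunks in place of 300,000,000-byte ones.

```shell
# Scaled-down sketch: demo.bin and demo_split are hypothetical names,
# and 300-byte chunks stand in for the 300000000-byte chunks in the post.
mkdir -p demo_split
(
  cd demo_split
  # Create a 1000-byte stand-in for the large zip file.
  head -c 1000 /dev/urandom > demo.bin
  # Split into 300-byte pieces named xaa, xab, xac, ...
  split -b 300 demo.bin
  # Print the byte count of each chunk: three full chunks and one remainder.
  wc -c x??
)
```

The same wc -c check on the real chunks is a cheap way to confirm that only the last one is short.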

Step 3. Transmit and check integrity of the chunks

As each chunk file finished downloading, we ran md5sum on the chunk and compared the hash with the one obtained on the originating computer.
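Comparing hashes by eye works, but md5sum can also do the comparison itself with its -c option. The sketch below is not what we did at the time; it assumes two scratch directories, sender and receiver, standing in for the two computers, a small demo.bin standing in for the zip file, and a manifest named chunks.md5 (all of these names are made up for illustration). A local cp plays the part of the download.

```shell
# Sketch with hypothetical names: sender/ and receiver/ stand in for the
# two computers, and a local cp stands in for the download.
mkdir -p sender receiver
head -c 1000 /dev/urandom > sender/demo.bin

# On the originating side: split, then record one MD5 line per chunk.
( cd sender && split -b 300 demo.bin && md5sum x?? > chunks.md5 )

# "Transmit" the chunks and the manifest.
cp sender/x?? sender/chunks.md5 receiver/

# On the receiving side: -c re-hashes every file listed in the manifest,
# reports OK or FAILED per file, and exits non-zero if any check fails.
( cd receiver && md5sum -c chunks.md5 )
```

Only the chunks reported as FAILED need to be fetched again, which is the whole point of splitting in the first place.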

Step 4. Reassemble the chunks

% cat -B x* > valuablefile.zip

This step took about 18 minutes on a Sony Vaio laptop, and I suppose it's no surprise that virtually none of it was CPU time.
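One more check worth considering is to confirm that the reassembled file matches the original. The sketch below assumes a scratch directory demo_join and small files demo.bin and rebuilt.bin (all hypothetical names), and it uses plain cat; the -B flag is left out on the assumption that it is specific to the Cygwin build.

```shell
# Sketch with hypothetical names: demo_join, demo.bin, rebuilt.bin.
mkdir -p demo_join
(
  cd demo_join
  head -c 1000 /dev/urandom > demo.bin
  split -b 300 demo.bin
  # The shell expands x?? in sorted order, so the chunks are joined in sequence.
  cat x?? > rebuilt.bin
  # The two hashes printed here should be identical.
  md5sum demo.bin rebuilt.bin
  # cmp prints nothing and exits 0 when the files match byte for byte.
  cmp demo.bin rebuilt.bin && echo "reassembly verified"
)
```

Across two machines cmp is not an option, so the equivalent is to run md5sum on the whole file at each end and compare the results. Note also that a glob like x* picks up every file whose name starts with x, so it is safest to reassemble in a directory that holds nothing but the chunks.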

At this point we had our file back, and it was full and complete. I offer this account in hopes it will be helpful. If you find it helpful, leave a comment!