Tuesday, July 24, 2007

Not programming for the ICFP 2007 Programming Contest

Earlier this year I read Tom Moertel's Seven Lessons from the ICFP Programming Contest. The appeal of a round-the-clock contest has dwindled over the years for me, but Tom's experience made me resolve to look into entering this year, and above all, not to write any code until it was patently clear why I needed to. I think this turned out to be a good thing. I did register but I didn't have the time to spend much more than a few hours Friday afternoon and a couple more Saturday afternoon. Still, in that time, I had the satisfaction of thinking about the problem and anticipating several things which turned out to be true. It's been interesting to compare my experience with those of people actually entering the contest.

Think, then code

The chief lesson of Tom's that I want to focus on is Lesson 3, understand the problem before solving it. This is so important that's it odd to see how often it gets broken. Tom relates:
So, doing the math, it would seem that in the space of thirty scant minutes, I managed to get the Task, print it out, shower, dress, read the entire excruciatingly detailed specification -- all of seven 10-point typeset pages -- and decide, as the next course of action, that it would be logical to begin writing a parser Right Freakin' Now.
Learning from this, I printed out this year's 22 page specs and took them to the library, away from my computer, where I could read. I got the idea that DNA was transformed to RNA by the execute task, followed by the transformation of RNA to a bit-mapped image by the build task. I noted that the given DNA input was supposed to contain hints and messages.

Keeping the goal at the fore

Section 3 of the spec, "From DNA to RNA" started moving me from the fairy tale about Endo and his spaceship into the realm of programming. I started reading the specification of DNA sequences and the operations on them:
xs [diamond] ys - concatenate two sequences, xs and ys
[m..] - subsequence of xs, starting at m (abbreviation of xs[m..len xs])").
I could feel gears turn and cams snap into place in my head. Upon turning the page, and the next, we started to get into real code.

When I hit page 5, decoding patterns I actually stopped reading. I did something that I'd started to do recently when assembling cheap furniture: I turned to the end of the instructions to see how it would turn out, and work backward from that. I'm glad I did, because I was starting to lose the thread of what the problem was about among the details of an algorithm.

On page 20, section 5 "How to make Endo live", it was revealed that the whole point, what we were supposed to submit, was a prefix to Endo's DNA which would transform the image. The shorter a prefix, the better, and the lower the number of wrong pixels, was ten times better. This helped me realize that whatever approach was taken, there was an optimization component to it.

After reading about the RNA graphics language, I had enough of the basics to proceed. I had the idea that the DNA represented a sort of programming language and that one needed to write code in it which would generate graphics commands to transform the images. I hoped there would be symbols to generate things like the blades of grass or the lettering or the flowers. I think this now turns out to be right, as the "genes" hinted at in the reference are subroutines for helping to transform the image, or at least generate the new target image.

Observing

The next thing I did, again without programming, was to study the two pictures, noting any differences I found. A framework I use for studying informs that "observation is the first step." This was useful, as I started to realize how many differences there were. Even similar elements, like the grass, the fish in the stream and the trees on the left horizon differ in the details (the fish are facing different directions). My office mate got pulled into the problem, and he postulated that perhaps even the blades of the windmill were not at the same angle; he was right. We started using Mathematica to test our hypotheses. This turns out to be a good tool for the problem, because imported images immediately are accessible as Raster objects, which contain a list of lists of RGB values, perfect for manipulating and running statistical analyses on. We wired up a Dynamic expression which would allow us to move the mouse over the source image and display information about that pixel position in the source and target images. With this interactive tool we quickly found that the night-to-day transformation was not a simple gray-level shift or color channel shift.

The conclusion of this observation, as I walked to my car late Friday afternoon, was that more knowledge would be needed to do the transformation. The target picture was simply too complex to think of the problem as a series of basic graphic operations. The result would be not much different than encoding the target picture as the DNA prefix. I remember telling my office mate, the best thing to do is probably to write a disassembler for the DNA and study the original string.

Disappointing Saturday

I wish I'd followed my advice, of studying the original DNA string, because I never did it. My two or three hours of work on Saturday consisted of going back to study the DNA and RNA specs in detail, followed by my first coding session which only got as far as a non-functioning pattern function.

Conclusion

Even though I submitted nothing, for the time that put into it the non programming approach at the outset was good. Of course to solve the problem you need the computer time, but the forced discipline of thinking about the goal and the problem at a higher level seemed to work. Others reported spending most of their weekend coding things that in the end did not help them solve the problem.

If I'd truly followed my own thinking on Friday, I could also have submitted something like a prefix to set all the pixels to the target's dominant color, or learned more about the problem by reverse engineering the DNA than I did by coding up an algorithm that I already understood.

Monday, July 23, 2007

Diffing Endo

In the brief time I had to look at the ICFP Programming Contest this past weekend, I came up with one result I'd like to share. For background, roughly speaking the task was to provide a prefix to Endo's DNA which would transform this nighttime picture of an alien saying "lambda" to this other daytime picture of an alien cow say "mu":



Source picture



Target picture

Without time to inspect the DNA to find how it generated the source picture, I resorted to merely studying the two pictures to imagine how the transformation could be done in a symbolic way that wouldn't devolve to enumerating all the pictures of the target. What I come up with was this image, which is what you get when you subtract the target picture's RGB values from the corresponding values in the source:


Diff of target-source