Thursday, May 25, 2006

Bright ideas

I'm really glad I have a light with a touch-sensitive switch. Last night I was holding my 2-month old daughter, who had just fallen asleep, and I wanted her to stay that way, so I went to turn off the light. With my hands full, I was able to slip out of my right house shoe and use my bare toe to turn off the light. It's irrelevant that she woke up just 20 minutes later, I was still happy about being able to turn off the light when I need to.

Saturday, May 20, 2006

Quine Dissected

A program that prints its own source code is one of the fun puzzles in computer science. I attempted this problem a couple years ago, and now I wanted to introspect a bit on the process that led me to a solution. Solutions for C++ and Standard ML are found at the end of the post.

A first attempt in some language would look something like this:


or rather it would start like this, because of course you get stuck. Your brain's Recursion Detector goes off, and you realize you just can't go on like this, into the infinite reflections of opposing mirrors, because you'll miss lunch, and your editor won't be able to hold it all anyway.

So we take a step back. We take stock of what we know:

1. The program has to start some way.
2. Similarly, the program has to end some way.
3. We assume that our technique will involve quoting something from our program. There may be quines that don't quote anything, but we'll try it.

Armed with these observations, let's go back to our program.


This looks good for the form of a program that is going to print something static (not based on any input). However, we already tried it, and ran into the recursion problem. But let's look at the program at or around the point things went wrong:


After this, our string starts recursing. You could perhaps say it starts recursing at the second Print, but that is the first time the program appears in a quoted string.

My thought was to put a variable marker ("S" for "self") in the quoted string to mark the recursion:


My idea was to write a function, let's call it f, which takes the string above and prints it contents, except when it hits the S. At that point don't print it but instead skip it and then print the remainder. Of course the function definition would need to precede the Print call, but as long as the function includes no quotes, it's static text and falls into Observation #1, the program must start some way.

I wrote a version like this, but then realized the marker was a little arbitrary, and that really you could simply split the program into two parts, the part before the recursion point, and the part after. Why not then make a function which takes two strings.

If our program had the form A"A","B"B, where A and B were arbitrary strings of source code which did not attempt to contain their own quoted source code, then we could write a function taking the strings A and B which would print this pattern: A+quote(A)+","+quote(B)+B. This function would then be printing the pattern of the source, without having to include the quoted form of the source itself. The function quote returns a quoted form of its argument string, using the type of quotation marks appropriate to the language, and including any escaping needed (e.g. if the given string includes quotation marks itself).

From here, we can deductively arrive at an actual implementation. We decompose A and B a little bit:

  • Apreamble: The code for f plus any setup or overhead needed at the beginning of a program in the language (e.g. #includes or Program SelfPrinter).

  • Bpostamble: Code to finish up the "main" part of the program and finish the program.

Hence A=[Apreamble]f(, and B=)[Bpostamble]. (If a function call looks different in the language, the f() syntax is easily changed.) Does this work? Does it solve the problem? Let's substitute these parts into the original form:

Original form:


Substitutions from A applied:


If Apreamble and Bpostamble don't include any "recursive strings", this will work. A full example will be a little difficult to read, due to the repetitions inherent in the pattern.

Enough symbolics, let's get to a real example, in Standard ML/New Jersey. The function f is given by this definition, with some supporting definitions:

val q="\""
fun quote s = concat[q,String.toString a,q]
fun f(a,b)=print(concat[a,quote a,",",quote b,b])

The code, with the string literals removed looks like this:

val q="\""
fun quote s = concat[q,String.toString s,q]
fun f(a,b)=print(concat[a,quote a,",",quote b,b])

Hopefully now you can see how A can be replaced by all the code leading up to, but not including, the quotation mark before A, and likewise B by the code after the quotation mark following B.

Finally then, here are full solutions following this pattern for SML and then C++.

SML Quine

val q="\""
fun quote s = concat[q,String.toString s,q]
fun f(a,b)=print(concat[a,quote a,",",quote b,b])
f("let\n val q=\"\\\"\"\n fun quote s = concat[q,String.toString s,q]\n fun f(a,b)=print(concat[a,quote a,\",\",quote b,b])\nin\n f(",")\nend\n")

C++ Quine

#include <iostream>
#include <string>
typedef std::string str;
str str2lit(str s)
str r;
for(str::iterator i=s.begin(); i != s.end(); ++i)
if(*i == '\n') r += "\\n";
else if(*i == '\\') r += "\\\\";
else if(*i == '\"') r += "\\\"";
else r += *i;
return r;
void p(str a,str b){ std::cout << a << str2lit(a) << "\",\"" << str2lit(b) << b; }
int main()
p("#include <iostream>\n#include <string>\ntypedef std::string str;\nstr str2lit(str s)\n{\n str r;\n for(str::iterator i=s.begin(); i != s.end(); ++i)\n if(*i == '\\n') r += \"\\\\n\";\n else if(*i == '\\\\') r += \"\\\\\\\\\";\n else if(*i == '\\\"') r += \"\\\\\\\"\";\n else r += *i;\n return r;\n}\nvoid p(str a,str b){ std::cout << a << str2lit(a) << \"\\\",\\\"\" << str2lit(b) << b; }\nint main()\n{\n p(\"","\");\n return 0;\n}\n");
return 0;

Tuesday, May 16, 2006

Java's unsigned bit shift right

Not having had the need for bit-twiddling in Java, I just now learned that Java has an "unsigned shift right" operator, >>>. The signed operator >> retains the sign bit (for turning -8 into -4, for example) and >>> shifts in a 0 to the most significant bit. This unsigned right shift operator has been part of the language since 1.0. I suppose the need for the two operators arises because all Java's integral types are signed, meant to support both signed and unsigned operations.

Friday, May 12, 2006

Functional languages in the real world

Once in the mid 1990's I was interviewing with a Fortune 500 company. "What's with all this functional programming experience?" I was asked. "What's the point? Are they really going anywhere?" Especially for this real-time embedded system company, functional programming sounded like a waste of time.

A lot of us who enjoy functional programming have wondered if it will catch on. My answer to that interviewer was that Java was catching on, and it featured garbage collection and allowed anonymous classes which can be used to imitate first-class functions. A little weak perhaps, but it did seem to make a point with him.

Then XSL/XSLT came along, and I was really happy, because here was something where you were supposed to think functionally, where you thought more about transformations and mappings than storage elements and order of evaluation.

And then recently, I finally stumbled across a truth which was out there but unknown to me: Javascript is functional. At least, it offers first-class functions, garbage collection, and lets you program in a functional style. The featured link shows how the author implemented a Lisp to Javascript translator in just a few hundred lines of Javascript, made easy in part because Javascript has anonymous functions to start with.

Then today I ran across this presentation, which shows how to do both functional and aspect-oriented programming in Javascript. The authors are behind Dojo, an open source Javascript toolkit for doing things like AJAX, eventing, and animations. I haven't examined the source for Dojo, but I presume that given the presentation, the Dojo system actually used the functional programming capabilities of Javascript.

The amazing thing is that Javascript had me fooled: in the words of one of Cryptopunk's commenters, "I thought JavaScript was just super-dumb Java". Lots of programmers and web developers use Javascript this way, tweaking HTML DOM elements, declaring variables without types, and generally writing like it was BASIC for the web. Now we find that the C-style syntax and Java-like supporting libraries were hiding our friend, the lambda.

Even neater, this blindness can go both ways. Developers who appreciate functional programming can take advantage of Javascript's functional features, perhaps incorporating Dojo to achieve some really high-quality web interfaces quickly and with some code they can possibly reuse at the end of the day. And nobody is going to ask them "what's the point". All the developer's manager probably knows is that the implementation language is Javascript, and who doesn't use that?

So have we arrived in a world where functional languages have a real-world need to justify their development and support? I think we're getting there. If you couldn't already get a job writing in OCaml for a New York securities firm or with a European firm coding in ERlang, you can almost certainly get a web monkey position hacking web pages in a functional style with the help of Dojo, and no one will know the difference.