Wednesday, January 30, 2008

Digging into Arc In 24 Macros Or Less

OK, so Paul Graham has a subtle way of getting me sucked into his much-anticipated language: he includes a tempting "hello, world" web application one-liner at the end of the tutorial that gives me a very (this is for you, Todd) pedestrian error message, and suddenly I'm knee deep in his world of macros trying to figure out what could possibly go wrong. Quick and dirty indeed.

So, what we're doing in this post is tracing through some of the Arc source code, finding the answer to why it doesn't work in Windows, and providing a patch. The source is from the arc0 distribution released Tuesday, January 29, 2008. This post is meant to be read in conjunction with the language description from the tutorial.

The one-liner in question is the first line in this snippet from the tutorial:

arc> (defop hello req (pr "hello world"))
arc> (asv)
ready to serve port 8080
Cool: it would be nice to define a function-like form that takes a web request and has its standard output interpreted as an HTTP response.

However, I get this on executing (asv) on Windows:
arc> (asv)
The syntax of the command is incorrect.
The syntax of the command is incorrect.
Error: "open-output-file: cannot open output file: \"c:\\apps\\arc0\\arc/hpw\" (The system cannot find the path specified.; errno=3)"
This seems to happen only on Windows. On Ubuntu Linux the server runs, though there's another error for me in serving the page.

I was curious to find the line of code that seemed to be Unix-specific, or at least to understand the source of this error.

For the record, and since the documentation is just a shade sketchy, this is how I'm installing and setting things up:
  1. Install MzScheme version 352 as recommended
  2. Put MzScheme's bin directory on the path
  3. From the arc0 directory, run the command mzscheme -m -f as.scm, as recommended

Debugging asv, the Arc Server

The definition for asv is in app.arc:
(def asv ((o port 8080))
(load-userinfo)
(serve port))
The (o port 8080) means asv takes an optional argument port with a default value of 8080. Brevity is certainly a hallmark of the Arc dialect.

Let's dig into load-userinfo, since it is first and sounds like it would be using the file system:
(def load-userinfo ()
(= hpasswords* (safe-load-table hpwfile*)
admins* (map string (errsafe (readfile adminfile*)))
cookie->user* (safe-load-table cookfile*))
(maptable (fn (k v) (= (user->cookie* v) k))
The safe-load-table of hpwfile* sounds like a likely place to look when getting an error involving an hpw path.

The definition of hpwfile* at the top of app.arc is "arc/hpw", so we are evaluating (safe-load-table "arc/hpw").

Evaluating (safe-load-table "arc/hpw")

safe-load-table is in arc.arc and defined as

(def safe-load-table (filename)
(or (errsafe (load-table filename))
(table)))
Here we get or, which evaluates its arguments in order until one is true and returns that value. Remembering that filename is bound to "arc/hpw", we need to look at (errsafe (load-table "arc/hpw")).

Evaluating (errsafe (load-table "arc/hpw"))

The definition of errsafe shows we have arrived at our first macro:

(mac errsafe (expr)
`(on-err (fn (c) nil)
(fn () ,expr)))
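The backtick (`) means quote the expression, except for any parts preceded by a comma.

The comma preceding expr means to substitute the value of expr into the quoted expression, but I wasn't sure when the argument itself would be evaluated, so I played with it a little bit: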

arc> (mac m (expr)
`(do (prn "m started")
,expr))
*** redefining m
#3(tagged mac #)
arc> (m 5)
m started
arc> (m (do (prn "evaluating expr") 5))
m started
evaluating expr
This clearly shows the evaluation happening at the point where expr appears in the expansion. If expr appears twice in the macro, it is evaluated twice:

arc> (mac m2 (expr)
`(do (prn "m2 started")
,expr
(prn "m2 discarded first expr")
,expr))
#3(tagged mac #)
arc> (m2 5)
m2 started
m2 discarded first expr
arc> (m2 (do (prn "evaluating expr") 5))
m2 started
evaluating expr
m2 discarded first expr
evaluating expr
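
Put another way, the comma splices the argument form into the template, and the form is evaluated later, wherever it lands in the expansion. Here is a minimal sketch of that in plain Scheme rather than Arc. It assumes begin as a stand-in for Arc's do, and expand-m2 is a hypothetical helper written for this illustration, not part of Arc:

```scheme
; A hand-written stand-in for the m2 macro: an ordinary function that
; takes the unevaluated argument form and returns the expanded form.
(define (expand-m2 expr)
  `(begin (display "m2 started") (newline)
          ,expr
          (display "m2 discarded first expr") (newline)
          ,expr))

; The argument is passed quoted, which is how a macro receives it.
(define m2-expansion
  (expand-m2 '(begin (display "evaluating expr") (newline) 5)))

; Printing the result shows the argument form spliced in twice.
(write m2-expansion)
(newline)
```

Nothing is printed by the argument form while the expansion is being built; it would only print once the resulting list is handed to the evaluator, and then twice.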

At this point I have to stop and thank PG for reacquainting me with Lisp macros. It's been... a long while since I've used these.

Getting back to the thread of execution, the expression (errsafe (load-table "arc/hpw")) means this:

(on-err (fn (c) nil)
(fn () (load-table "arc/hpw")))
We now know that (load-table "arc/hpw") may or may not be evaluated, depending on what the wrapping function, on-err, does.

Definition of on-err

We find on-err defined in ac.scm. Since that's a Scheme file, not Arc source, on-err is essentially a language primitive, defined thus:

; If an err occurs in an on-err expr, no val is returned and code
; after it doesn't get executed. Not quite what I had in mind.

(define (on-err errfn f)
((call-with-current-continuation
(lambda (k)
(lambda ()
(with-handlers ((exn:fail? (lambda (c)
(k (lambda () (errfn c))))))
(f)))))))
(xdef 'on-err on-err)
First, the easy part: xdef is a macro call that puts this Scheme code into the initial environment of Arc. Second, let's interpret the sometimes formidable call-with-current-continuation.

Presuming a bit about the behavior of MzScheme's with-handlers, we start by asking to evaluate the thunk f, which was the second argument to on-err: the function wrapping (load-table "arc/hpw") in our case. If evaluating f raises a Scheme exception, the handler invokes the current continuation k, so on-err returns whatever (errfn c) evaluates to, where errfn is the first argument to on-err and c is a representation of the error condition itself.
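
To check that reading, here is a small sketch that can be pasted into MzScheme directly. The on-err here is modeled on the mechanism just described and is not guaranteed to match ac.scm character for character. The file path and the 'failed marker are made up for the example:

```scheme
; Modeled on the on-err described above: capture the continuation k of
; the whole call, run the thunk f, and on failure jump back through k.
(define (on-err errfn f)
  ((call-with-current-continuation
    (lambda (k)
      (lambda ()
        (with-handlers ((exn:fail? (lambda (c)
                                     (k (lambda () (errfn c))))))
          (f)))))))

; No exception: the value of the thunk comes back.
(define ok-result
  (on-err (lambda (c) 'failed) (lambda () (+ 1 2))))

; Exception: opening a file under a directory that does not exist.
(define bad-result
  (on-err (lambda (c) 'failed)
          (lambda () (open-input-file "no-such-dir/no-such-file"))))

(write (list ok-result bad-result))
(newline)
```

In both cases call-with-current-continuation hands back a thunk that is then called by the outer pair of parentheses. In the failure case the thunk is the one the handler built around errfn.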

Example of on-err

We can see an example usage of on-err at the top-level of the Arc prompt earlier in ac.scm:

(define (tl2)
(display "arc> ")
(on-err (lambda (c)
(set! last-condition* c)
(display "Error: ")
(write (exn-message c))
(lambda ()
(let ((expr (read)))
(if (eqv? expr ':a)
(let ((val (arc-eval expr)))
(write (ac-denil val))
(namespace-set-variable-value! '_that val)
(namespace-set-variable-value! '_thatexpr expr)
This defines what to do at the top level if an error is encountered.

Now that We Know How on-err Works...

We were trying to understand what (safe-load-table "arc/hpw") does, thinking it would explain why we get an error related to a missing hpw directory only in Windows.

That led us to the definition of safe-load-table as

(def safe-load-table (filename)
(or (errsafe (load-table filename))
(table)))
which led us to the errsafe macro, which in our case expanded to

(on-err (fn (c) nil)
(fn () (load-table "arc/hpw")))
We can now work out how this evaluates. The second argument to on-err is called, so we do need to know what (load-table "arc/hpw") does. If that raises an exception in the underlying MzScheme implementation, the error handler (fn (c) nil) is invoked and the whole expression returns nil. The or in safe-load-table treats that as false and evaluates its second argument, (table), which I take to mean: return a newly constructed, empty table.

But we need to look at load-table. We should ask whether the with-handlers expression in on-err's definition is missing an exception thrown by the underlying Scheme implementation.

Definition of load-table

We find load-table in arc.arc:

(def load-table (file (o eof))
(w/infile i file (read-table i eof)))
There is an optional argument eof, but no value is given at the call site, so apparently it defaults to nil:

arc> (def f (x (o y)) (list x y))
arc> (f 1)
(1 nil)
arc> (f 1 2)
(1 2)
The w/infile macro definition is also in arc.arc, defined alongside a few other macros that share a helper function named expander and another macro, after:

(mac after (x . ys)
`(protect (fn () ,x) (fn () ,@ys)))

(let expander
(fn (f var name body)
`(let ,var (,f ,name)
(after (do ,@body) (close ,var))))

(mac w/infile (var name . body)
(expander 'infile var name body))

(mac w/outfile (var name . body)
(expander 'outfile var name body))

(mac w/instring (var str . body)
(expander 'instring var str body))
The . body notation, as I had to remind myself, lets body stand for the list of whatever arguments remain from the call site. For example:

> (define (f x y . z) (list x y z))
> (f 1 2)
(1 2 ())
> (f 1 2 3)
(1 2 (3))
At any rate, at this point since I'm trying to get the post out, I'm not going to pretend I understand the evaluation order of how and when the expander definition gets substituted into the macro body.

I tried for a while without success to understand why the variable i doesn't get evaluated too early. The fact that expander is an ordinary function and not a macro leads me to believe that its arguments, which certainly include the undefined i, should be evaluated before being passed to expander so it can create its backquoted let form.

So instead, we'll assume that what appears to be the intent is the case, that (w/infile i file (read-table i eof)) is expanded to this:

(expander 'infile i file ((read-table i eof)))
and then this:

(let i (infile file)
(after (do (read-table i eof)) (close i)))
and finally this, when the symbols file and eof are evaluated:

(let i (infile "arc/hpw")
(after (do (read-table i nil)) (close i)))
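
As randallsquared points out in the comments below, expander runs at macro-expansion time and receives symbols, not values. A rough sketch of that step in plain Scheme, where the quoted arguments imitate what the w/infile macro would pass along (an illustration of the idea, not Arc's actual machinery):

```scheme
; expander is an ordinary function: it receives symbols and lists and
; returns a new list. Nothing here looks up the value of i.
(define (expander f var name body)
  `(let ,var (,f ,name)
     (after (do ,@body) (close ,var))))

; The pieces of (w/infile i file (read-table i eof)) arrive unevaluated:
; var is the symbol i, name is the symbol file, body is a list of forms.
(define infile-expansion
  (expander 'infile 'i 'file '((read-table i eof))))

(write infile-expansion)
(newline)
```

The list that comes back is the let form shown above, and only that form is handed to the evaluator.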
The function infile is the next candidate.

Is infile the end?

As it happens, infile in Arc is directly tied to MzScheme's open-input-file, defined in ac.scm:

(xdef 'infile open-input-file)
So, we try this in MzScheme directly, and reproduce the error:

$ mzscheme
Welcome to MzScheme version 352, Copyright (c) 2004-2006 PLT Scheme Inc.
> (define i (open-input-file "arc/hpw"))
open-input-file: cannot open input file: "c:\Apps\arc0\arc/hpw" (The system cannot find the path specified.; errno=3)
This is all very interesting, but as it turns out this is not the source of the errors.

If you evaluate (load-userinfo) in the top level, you don't get an error message.

Getting a Clue

So I erred a little bit on the depth-first side here. After some simple debugging you find that (load-userinfo) runs fine.

What doesn't run fine is the (serve port) call, the code of which is in srv.arc:

(def serve ((o port 8080))
(nil! quitsrv*)
(let s (open-socket port)
(prn "ready to serve port " port) ; (flushout)
(= currsock* s)
(after (while (no quitsrv*)
(if breaksrv*
(handle-request s)
(errsafe (handle-request s))))
(close s)
(prn "quit server"))))
And what doesn't work here is ensure-install:

(def ensure-install ()
(ensure-dir arcdir*)
(ensure-dir logdir*)
(when (empty hpasswords*)
(create-acct "frug" "frug")
(writefile1 'frug adminfile*))
And what doesn't work here is ensure-dir, defined in arc.arc:

(def ensure-dir (path)
(unless (dir-exists path)
(system (string "mkdir " path))))
And finally our search is at an end.

Arc is shelling out to create the directory, and Windows' mkdir does not accept the same command line. MzScheme does provide the directory-exists? function portably (it detected my E:/ drive), so why is Arc executing mkdir? MzScheme also defines a make-directory function.

Patching Arc

This has not been a total loss. Now we know enough to fix Arc so it works in Windows.

In ac.scm, at line 873 after the definition of dir-exists, define mkdir:
 (xdef 'mkdir (lambda (path)
(make-directory path)))

In arc.arc, at line 1202, replace the system call with the call to our new mkdir function:

(mkdir path)))
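
The same idea can be tried on the Scheme side alone before touching Arc. This is a sketch using only MzScheme's directory-exists? and make-directory. The scratch directory name is made up for the example, and the ensure-dir here is a Scheme stand-in, not the Arc definition:

```scheme
; Portable stand-in for Arc's ensure-dir: no shelling out to mkdir.
(define (ensure-dir path)
  (unless (directory-exists? path)
    (make-directory path)))

; Try it on a scratch directory under the system temp directory.
; The name "arc-ensure-dir-demo" is hypothetical.
(define demo-dir
  (build-path (find-system-path 'temp-dir) "arc-ensure-dir-demo"))

(ensure-dir demo-dir)   ; creates the directory if it is missing
(ensure-dir demo-dir)   ; second call is a no-op, not an error
(define exists-after (directory-exists? demo-dir))

; Clean up so the example can be run again.
(delete-directory demo-dir)
```

Note that make-directory creates a single level only, so parent directories have to exist first.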
And finally we have enough to retry the Hello, World one-liner:

arc> (defop hello req (pr "hello world"))
arc> (asv)
ready to serve port 8080
srv thread took too long
user break

=== context ===
c:\apps\mzscheme\collects\mzlib\ system*/exit-code
c:\apps\mzscheme\collects\mzlib\ system*
OK, so it timed out after 5 seconds.


This Arc release got me moving from thinking about Lisp/Scheme to actually using it, and appreciating macros. I do admit that I enjoy the brevity of the keywords and the design of the syntactic forms, and if that's all Arc ever is, it's still value-added.

Releasing a version before things were polished may be a good thing for Lisp as a whole, and the project in general. I recall the initial Java releases had just enough of a standard library to get people's attention immediately, and when you looked closely at it you could see how it could be built out. I think that combination may prove irresistible for hackers, who have always been Paul's intended audience.


Amazing said...

My hat is on the floor.

randallsquared said...

"The fact that expander is an ordinary function and not a macro leads me to believe that its arguments, which certainly include the undefined i, [...]"

No, expander's arguments include 'name', not 'i', which is a symbol that happens to be the *value* of the 'name' argument to expander. The expander function is getting run at macro-expand time, and the result of it is inserted in place of the macro that called it.

John said...

Funny, I came to the same conclusion, but by a much less scientific method: I found where "frug" was being written into the hash file. That led me to ensure-dir and ultimately to the mkdir problem, too.

Thanks for writing this up, though. I am a total LISP noob and seeing your thought process was valuable to me.

JFKBits said...

randallsquared: Thanks, though your explanation is not in the terms I would prefer, it encouraged me to try to find the right answer.

The error in my thinking was in not noticing or understanding the significance that the call to expander is not backticked.

Typical macros that I'd seen have bodies that are backticked. I made the mistake of imagining the call to expander being quoted this way, and then after macro expansion a call to expander being presented to the evaluator.

In fact, this example shows that macro expansion can run arbitrary code, using the macro arguments which are implicitly quoted expressions. After seeing some of what can be achieved with C++ templates, I can appreciate the implicit power.

murphee said...

Nice walkthrough.

Slightly off-topic Arc observation: the "o" for optional arguments strikes me as an unfortunate decision... doesn't really suggest that it's an operator and will also cause weird behavior if one chooses to name an argument "o" (which of course no one would).

Better alternatives (in my view):
Ruby: def foo(x=42)
Mathematica: F[k_:kdef]

Anonymous said...

This post seems to be missing the point. The problem is not that Arc had a bug on Windows. The problem was that THERE WAS NO BACKTRACE. The only way to hunt down the bug was to go through and do basically a binary search on the code. In every sensible programming language, when you get an error, you get a message with the line number of the error. This allows you to simply jump directly to the buggy bit.

Making programs shorter? Good idea. Making debugging shorter? Priceless.

JFKBits said...

Anonymous said: "This post seems to be missing the point...THERE WAS NO BACKTRACE"

Yes, this release has a lot of... shall we say simplifying assumptions.

Ken Shirriff said...

Nice analysis. Is there a solution to the 'srv thread took too long' problem on Windows? I'm hitting that too.

Jeff Bonevich said...

Ran into the same issue with mkdir on Mac OS X (10.4.11) and the arc1 release, only in this case the problem was with a bad option (-f) to mkdir. Thanx for the excellent guide.