Friday, May 16, 2008

Call by Need Lambda a Poor Man's Macro?

I've been seriously considering why more languages don't include a call-by-need lambda (hereafter called "lazy lambda"). With its delayed evaluation it offers macro-like powers for writing non-looping forms (they're bad for loop forms since the arguments are evaluated but once by design), but they don't have the bloating effect of macro expansion (if code size is more important to you than the time overhead of a function call), and they are hygienic. They're not a cure-all, but this seems to be an approach which can still be wielded effectively by trained professionals.

How to Use Lazy Lambda


Here's how a lazy lambda would work in an existing language like Scheme. You have a new abstraction lazy-lambda with precisely the same syntax as lambda. When applied, lazy-lambda follows call by need evaluation rules. That means arguments to the lazy procedure are not evaluated at the call site, but only when their corresponding formal parameter is first referenced inside the body. On this reference, the evaluated argument value is remembered for future references inside the body. Here's how you might write something like Groovy's Elvis operator:

(define ?: (lazy-lambda (expr default)
(if (null? expr) default expr)))
The thing I like is that this is automatically hygienic: it works fine even if you call it in an environment with variables named expr or default.

Implementation


I like to divide my thinking about new features into two phases: introduction and use. When lazy-lambda is introduced, the parser needs to create a structure essentially identical to that of a lambda, namely an arguments list and a body expression, but of course it needs to be marked as a different type from lambda so the evaluator can distinguish the two. Lazy lambda is used in two ways, once when it is evaluated (e.g. when a lazy-lambda expression is returned from a function) and once when it is applied.

Summary of Lazy Lambda Implementation
Introduction: '(lazy-lambda args body)
Evaluation: '(closure (lazy-lambda args body) env)
Application: Bind args to thunks, evaluate thunks in a memoizing way

When lazy-lambda is evaluated, it should create a closure, the pair of the lazy-lambda expression and the environment in which it was evaluated. This closure needs to be marked as a different type from a regular closure. Alternatively the evaluator can be arranged to check the type of the "lambda" expression: a closure may look like '(closure (lambda args body) env) or '(closure (lazy-lambda args body) env).

What happens a lazy-lambda closure is applied? We know you don't evaluate the arguments, but what then? As with eager closures, you first create a new environment scope. Then, instead of binding the formal arguments to the evaluated values of the arguments, you bind the formal arguments to some box (container) of the unevaluated argument expressions. The container needs to be distinct from any other language type so that the evaluator knows how to treat it. That is, once we set up the new environment scope, we will simply evaluate the body of the lazy-lambda under this new environment, and the references to the formal arguments need to be handled in this special memoizing way. So let's introduce an expression type delayed, which is not directly creatable in the language, and we bind each formal argument x with actual argument expression A to the "value" (delayed A env). The env value will be needed when we evaluate A, because we will need to evaluate it in the calling environment, not whatever environment is in effect when the symbol is first referenced. (Think about what gets returned by (lambda (x) ((lazy-lambda (x) (begin (set! x (+1 x)) x))) x).) Then when the evaluator handles variable references and gets a delayed value back from the environment lookup, it's time to do the memoizing: evaluate A in environment env, rebind (set!) the referenced variable with its evaluated value, and return that.

Conclusion


None of the popular scripting languages I can think of (Javascript, Perl, Python, Ruby) have a macro facility, but most of them have anonymous functions which evaluate to closures in the Scheme sense. On the other hand, they also tend to have a richer set of control structures (Perl's unless), and they have other mechanisms (dare I say classes and objects?) which address most of the killer applications for macros, and hence for lazy-lambda. But for all that, I'd have to figure that those languages, and the tinier languages or DSLs, could add this feature.

Macros are subtle things to get right, and I'm sure there are deficiencies I haven't addressed here. But that shouldn't stop us from thinking about these issues, and I think there's some potential value in the call by need lambda.

2 comments:

Reginald Braithwaite said...

Given macros and strict lambda, isn't it possible to define lazy-lambda without resorting to changing the interpreter?

JFKBits said...

reginald braithwaite: Yes, if you have macros, you don't need this "macro substitute", and I think it sounds possible to do what you suggest. Perhaps it was confusing for me to use Scheme, which has its famous macro system, to describe the lazy-lambda implementation. My focus is on the languages without macros. My inspiration is my employer's language, Mathematica, which has numerous evaluation controls but no macro system per se, and still it lets you define your own control structures and looping constructs, and object-oriented frameworks (a la Common LISP) at will. My reasoning was that language implementers may not be comfortable designing and implementing macros, but adding a variation of lambda, particularly if the language already has strict lambda, is quite straightforward and should give users the evaluation control otherwise missing.

Out of curiosity, why do you ask?