Wednesday, March 21, 2007

Introducing Ila

Today I'd like to tell you a little bit about a programming language you've never heard of. I know you've never heard of it, because I'm not going to use its real name. (That was comment bait: name the logical error I just made.) It wouldn't really help if I did use its real name, because it exists only in academic papers written by my Master's degree advisor and in the compiler work I did for that degree.

This language is the chief reason this blog exists at all: I want to share the things I learned from the work on Ila. From the design of the language, I learned about object-oriented programming: what it is, what it isn't, and how simple it actually is. The language design, which is mostly my advisor's, exposes the nuts and bolts of individual OO features. And even though I had worked on several interpreter and compiler projects before, with the Ila compiler I learned a lot about language implementation, such as how to do type inference in the presence of recursive types.

I don't claim to be an omniscient programming language expert. But the Ila project exposed me to the inner workings of some stuff you usually don't hear too much about, and I think I have something to contribute to both the programming community about things object-oriented and the language and tools community. I had hoped to write some things up for PLDI 2002, but life went on and I never did. Now this blog gives me an outlet to articulate some of what I learned.

My hope is that you can learn from what I did by reading about it, without actually having to do it, though perhaps my words to come on this topic will inspire you to do something with the ideas.

Let's get to a very brief high level introduction then.

The Ila Programming Language

Ila (pronounced "EE-la") is what I'm now calling the language accepted by the compiler I wrote for my master's thesis work between 1998 and 2001. Here's a quick run-down of what it is:
  • Ila is a statically-typed object-oriented language conducive to functional programming with SML-like syntax.
  • Hello, world looks like
    println "Hello, World."
  • Ila is influenced by and conceptually related to the Pizza programming language and the Moby programming language.
  • Ila is implemented as a translation to Java source code.
  • Ila is type-crazy (static typing, no typecasts, elaborate subtyping rules) but has the type inference to make it endurable.
  • Side-effects are clearly isolated by the type system: code is divided into expressions (free of side-effects) and commands (allow side-effects). An Ila program can be either of these, an expression (purely functional) or a command (usually a sequence of commands).
  • Functions are pure and have no side-affects, in syntax something like fn x => x.
  • Procedures can have side-effects, and in syntax look something like prc n => println "Hi, "+n. Functions cannot call procedures (of course procedures can call functions).
  • Data structures are built from lists, tuples, and records. The syntax for these is thus:
    []                               (* empty list *)
    ["lax","sjc","ord"] (* list with 3 strings *)
    (54,40) (* pair, or 2-tuple *)
    {lat=41.97944,long=-87.90417} (* record with 2 fields *)

    {dep=("21/3/2007 1020-8","lax"),
    arr=("21/3/2007 2312-6","ord")} (* tuples nested in a record *)
  • The most primitive object is the variable, which has get and put methods. Get is a function, and put is a procedure. Variables are instantiated by the Var class, which is a generic class that operates on any type, the type of the value you want the variable to store.
Some of the more mind-blowing, or at least esoteric, stuff about Ila:
  • When we say "conducive to functional programming", we include the notion that classes are first-class. This makes it possible to use a combining style of programming, writing functions that take classes as arguments and return a new class.
  • Ila has limited support for parametric polymorphism (or "generics") , in the sense that of necessity some built-in types and functions could operate on any type. (User-introduced generics were something that was on the hazy roadmap.)
  • Most objects are represented by records, which in Ila (as in many theoretical languages used for academic study of object-oriented theory) is an orderless collection of named values, like {lat=40,long=37}. This notion of record is a little like a dictionary or hash with a fixed collection of key-value pairs. Though an individual record value has a fixed number of fields, there are record operators for producing records with more fields (to implement subclassing) or with different values for some fields (to implement method override in a subclass).
  • There is also a fixed-point operator which works on records. While the need for the effect of the fixed-point operator is universal and included in all OO languages I'm aware of, I've never heard of its being exposed as an operator. Implementing this was quite interesting.
Since you're a smart reader who can fill in a lot of details without my explaining, let's end this introduction with a couple of real Ila programs.

The first, add2.ila, is an elaborate interface to your computer's ALU, but it also shows how variables (x and y) are created, which happens to be the same way any other objects would be instantiated, and it also shows a subtlety of subtyping in action. Bonus points to whoever can identify the subtlety. Hint: you will need to make an assumption about the types of promptread's parameters.

add2.ila
new x isa Var[int];
new y isa Var[int];
promptread("Enter a number:",x);
promptread("Enter another number:",y);
print "The sum of your numbers is ";
print(x+y);
println "."

The second program, redhen.ila, is an example fun to code up in a functional language because it emits a familiar children's story with a strong repetitive pattern, and the code shows data structures, functional style, and the command/expression interaction. The app function is like in SML, it takes a procedure and a list and applies the procedure to each element of the list.

redhen.ila
let
val LRH = "Little Red Hen"
val lazyanimals = ["duck","dog","cat"]
val actions = ["plant the wheat","harvest the wheat",
"grind the wheat","bake the bread"]
fun respond willing = prc animal =>
(println("\t\""+
(if willing then
"I will"
else
"Not I")+
",\" said the "+animal+"."))
fun questionResponse willing = prc action =>
(println("\t\"Who will help me "+action+"?\" asked the Little Red Hen.");
app (respond willing) lazyanimals)

proc unwilling action =
(questionResponse false action;
println("\t\"Then I will do it all myself,\" said the "+LRH+".");
println"")

proc willingly action =
(questionResponse true action;
println("\t\"Hmm, that seems strange,\" remarked the "+LRH+",\"");
app (prc action => (println("\t I did "+action+"."))) actions;
println "\t I should also eat the bread.\"")
in
println("\n\n\t\tThe Story of the "+LRH+"\n\t\t\tas told by Ila\n\n");
app unwilling actions;
willingly "eat the bread";
println("\tSo the "+LRH+" shared the bread with her little chicks.");
println"\n\t\t\t\tTHE END"
end

That's Ila in a tiny nutshell. Let me know what you think, and I'll try to pick a good place to begin discussing how Ila can help us understand the rest of the programming world.

4 comments:

Ken Bateman said...

I've been having similar ideas about a language based on ML with an pure expression/impure command distinction. I was thinking that it was possible in some circumstances for a function with side effects to be considered an expression in a wider context.

let incr nref = nref := nref + 1

Obviously incr has side effects.

let plus3 n =
let nref = ref n in
incr nref; incr nref; incr nref;
!nref

plus3 calls a procedure with side effects, but the variable referenced isn't accessible from anywhere else. It's the FP version of "if a tree falls in the forest when there's no one to hear it, does it make a sound?". From the point of view of any caller, plus3 is indistinguishable from a purely functional expression.

JFKBits said...

Ken:

That sense of local mutable state is definitely useful.

For example, there's the trick where local state is used for caching results. To the observer, the function is free of side-effects.

The imperative language C++ also supports this with the "mutable" keyword. A const method in C++ is supposed to be guaranteed to be free of side effects (on the object anyway, I think it's allowed to modify objects accessible via the global scope). But instance variables marked "mutable" can be modified in a const method, and it's recommended that you do use it for things like caching. It's a "shut up turkey, I know what I'm doing" keyword, and it makes C++ const methods have "cooperative referential transparency".

All that said, the command/expression separation is not something I intend to focus on in Ila, but it must be explained.

Back to your language, any special use cases or motivations for the command/expression separation feature?

Ken Bateman said...

I was a little dissatisfied with ocaml because there is no way to effectively tell if a function causes or depends on side effects.

I also like the erlang model where you have a bunch of functional processes that transactionally record their state. This architecture enables some pretty cool highly reliable, parallel, redundant systems.

If an ocaml variant could assume that a function didn't depend on or cause any side effects, then it would be easier to automatically parallelize programs. You could also detect when programs are depending on an unspecified order of operations.

I think that a pure function should not only cause no side effects, but should not depend on any other side effects. In other words, a pure function should not read any references that may change in between invocations. Or you could say that a pure function should depend only on constant values in its closure and the parameters given it.

It might also be acceptable for a pure function to have a reference passed in as a parameter, and for the pure function to modify the value of the reference. Then purity would be defined has causing and depending on no 'hidden' side effects, and you would have the assurance that if you call the function multiple times with equivalent arguments the function will always have the same effect and return value.

I suppose repeatability might be a better term than purity for this quality.

JFKBits said...

Ken:

Your description of ERlang and automatically parallelizing programs made me realize there's another type of language which does this: machine language. Hardware designers control the semantics of each instruction and the local state (registers) and so they know how to reorder and parallelize an instruction stream. It might be instructive to think about that area a little bit.

I do wonder how easy it would be to statically determine the "repeatability" property. I'd guess there'd be a set of circumstances where it could be confirmed, but not all code would be like that. Hopefully typical scientific codes, or whatever application area you're parallelizing, may fall into the category of things that can be determined to not depend on local state.