Friday, January 18, 2008

Gradual Typing

There was a nice exchange on the TYPES list a couple days ago announcing a new paper "Gradual Typing with Unification-based Inference" (GTUBI hereafter) by Jeremy Siek & Manish Vachharajani at Colorado.edu. It drew remarks from Tim Sweeney, the only game programmer I know who gives invited POPL talks, and Phil Wadler.

While I was aware of rumblings to add type inference to dynamic languages like Ruby with enhancements to RDT/Aptana and Python with Pysco, I had not been following the academic developments. Aside from its contributions of adding type inference to gradual typing (as well as discussing some approaches that do not work), this paper opened my eyes to the notion of gradual typing making its way through the theoretical programming language crowd.

Which Time to Run: The Raging Debate

Run-time or Compile-time? That is the question. I've been watching Raganwald chronicle this debate from time and time, and it's at least productive in finding out what worth each approach has. I'm still back on the compile-time side, if you didn't notice. GTUBI (the paper we're talking about, remember?) has this characterization:
Dynamic languages such as Perl, Ruby, Python, and Javascript are popular for scripting and web applications where rapid development and prototyping is prized above other features. The problem with dynamic languages is that they forego the benefits of static typing: there is no machine checked documentation, execution is less efficient, and errors are caught only at runtime, often after deployment.
--Siek & Vachharajani, GTUBI
While I can't claim to appreciate dynamic typing, I do acknowledge that with all those Javascript errors flooding my browser error console have somehow not stopped me from basically doing what I need to do on the web.

What Makes it Gradual

I always like to know the origin of a name. In this case, gradual refers to the idea that in a gradually typed language, you can gradually add type annotations to what is otherwise a dynamically typed program.

Dynamic Typing is Not Typeless

The static type folks like to point out that dynamic languages are not typeless -- they're just languages with one type. In GTUBI, they write this type as ?; in Wadler's paper it's written Dyn. Rightly or wrongly, I typically think of this type as "expression," influenced by the LISP notion of everything as an S-expression.

What's Essential in Gradual Typing

One essential element of a dynamic language is the business of type conversions or coercions: since x+3.14159 isn't statically compiled to know which + operator to use (int+int, real+real, string+string), the types are examined at runtime (when the + function is applied) and applicable conversions take place. Conversions in dynamic languages are thus usually implicit, but you could have explicit conversions (the Ruby to_s method). GTUBI states
Allowing implicit conversions (e.g. from ? to int) that may fail is the distinguishing feature of gradual typing and gives it the flavor of dynamic typing.
--Siek & Vachharajani, GTUBI

Type Consistency

The main idea, say the GTUBI authors in their review of Gradual Typing, is to replace type equality in the type checker, with a relation called type consistency written as ~. Their examples:

int ~ int (int is consistent with int)
int !~ bool (int is not consistent with bool)
? ~ int (the unknown type is consistent with int)
int ~ ? (int is consistent with the unknown type)
(int -> ?) ~ (? -> int)
(function from int to the unknown type
is consistent with
a function from the unknown type to int)
(int -> ?) ~ (? -> bool)
(function from int to the unknown type
is consistent with
a function from the unknown type to bool)
(int -> ?) !~ (bool -> ?)
(these two function aren't consistent,
as their arguments aren't consistent)
(int -> int) !~ (? -> bool)
(these two function aren't consistent,
as their return types aren't consistent)

Update: fixed last two examples to state !~
They then give the rules for the ~ relation, noting it is reflexive and symmetric but not transitive.

Conclusion


The Gradual Typing system is interesting, and I would probably give a language using it a whirl. One thing I don't think it addresses is the eval function present in the well-known dynamic languages that allow symbols to be redefined dynamically. That appears to me to kill any hope of the usual static type checking assurances.

5 comments:

aog said...

With regard to Perl, they got some basic facts wrong. For instance, "$x + 3.14159" is not ambiguous, because "+" is a numeric operator. String concatenation has a different operator. Moreover, this shows that the idea that such languages have only one type is wrong, or the entire question of what to do with regard to types in that expression wouldn't come up. The true essence is that these languages have a small, fixed set of types. One can put a gloss over those to make it look like one can create types (e.g., Class::Struct in Perl) but it's still really only those basic types.

I would also say that dynamically typed languages are not just for RAD and prototyping, but for dynamic environments where it's not possible to enumerate all useful types ahead of time. This is related to the discussion we had once upon a time about the ITE type system and the deadly "middle ground" of type complexity. It's not unrelated that, given a statically typed language, we found it necessary to build a dynamic type system to handle the application space. The only question is where you want the dynamic support.

JFKBits said...

aog: If anybody "got some basic facts wrong" it would be me, because the x+k example was my way of trying to explain the Dyn type. Just so nobody is confused, the guys writing the paper I linked did not invent gradual typing, they're just building on it and I'm referencing material from their background section.

As far as "$x + 3.14159" in particular, I didn't mean Perl in particular but was trying to brush a broad stroke across all the languages. Even in Perl, is there a different internal representation for ints versus floats? Ultimately a function needs to make some assumptions about the data passed to it (its contract?) and I picture run-time checking of type tags: is it a machine-sized int? A Bignum? A 32-bit float? A double? Each of these has different + code and somebody has to dispatch or point to the right one.

"Having only one type is wrong:" I see what you mean, how we can we say there's just one type and then go on to talk about checking different types dynamically. Perhaps we should say from a static perspective there's only one type (expression) and then at runtime you see the different types come alive. Interesting distinction, I'm glad you pointed that out.

I did think of the ITE several times while preparing this post. Oddly enough, I was thinking of it as an example of achieving the same goals as the dynamic systems (compare with Ruby on Rails brand ActiveRecord) in a statically typed system (C++). Maybe you can clue me in offline, where are you buying me lunch?

dak said...

The way I've always heard the distinction made is that in a dynamically typed program, only the values have types.

aog said...

jfk;

Sure. We should definitely do lunch. You should also set up a comment feed.

Whether ints and floats have different representation in Perl is, I think, undefined (i.e., up to the implementor). Float must be supported, but integers are considered purely an efficiency issue.

dak;

That's true, but for some reason (an interesting subject by itself) generic variable languages tend to have a very small set of types (e.g., Lisp, Perl). I suspect that it's the "deadly middle ground" issue. What we found in working with type systems is that you should either be very simple, or very complex. But the middle ground tends to get the worst of both worlds, complexity without the corresponding power.

JFKBits said...

aog: Comment feeds have already been set up, look for them at the very end where it says "Subscribe to: Post Comments (Atom)".