Friday, April 20, 2007

Proofs, Education and Open Problems

Announced on the Types mailing list today: http://tlca.di.unito.it/opltlca/, a convenient list of open problems for research in the area of typed lambda calculus.

After reading Dijkstra's writings about formal proofs being essential to computer science, I feel ill at ease to call myself a computer scientist. After reading his derision of software engineering, I feel equally uncomfortable calling myself a software engineer. Maybe I should try to learn a bit more of my impulse purchase How to Prove It by Daniel Velleman. Then I can take on the whole list, starting with a direct proof of the confluence of βη  -strong reduction!

I was comparing notes with my wife on the role of proofs in education. She was introduced to proofs in sixth grade. She didn't understand how I could be so lacking in familiarity with proof technique, which is what "How to Prove It" teaches. My experience was simply the straightforward geometry proofs that I think are common to U.S. education, and my undergrad computer science degree was also relatively light on proofs, or at least on the level of rigor expected from my graduate school experience. Thus I frequently lament my weak background in proofs.

The Dijkstra essay has been Redditized a couple of times with good discussion. I characterize what Dijkstra wants to accomplish as this: sitting in a library in a comfortable chair with a stack of blank paper and pencils, crafting both algorithm and its proof. This is the Dijkstra picture of computer science, I gather, and he appears to scoff at testing. If someone wants to implement the algorithm, it has to be keyed in, perhaps translated to a programming language, and then what do you suppose should be done to ensure no errors were made in translating or keying this embodiment of the provably correct algorithm? Why, I suppose the program should be tested. I wonder if Dijkstra would call such errors "manufacturing defects."

One Reddit reader claims, that the problem with formal proofs of algorithms is that they run into the halting problem. That is, you end up needing to run the program being proved to understand it. I don't know if I buy it, but it's an intriguing idea, because a proof is done on a static representation of an algorithm.

I would like to improve my understanding of formal methods, so I'll hopefully get around to reading How to Prove It, and take a deeper look at Hoare logic. I'd also like to work on understanding the connection between programs, tests, and proofs because they do seem related in a deeper way than is usually acknowledged.

As I was reminded recently, "clear thinking is useful in any field", and whether you endorse formal proofs as the computing cure-all or if you're a web monkey, anything you can do to advance the frontier of knowledge, understanding and wisdom is a welcome contribution.

5 comments:

Anonymous said...

Let me respectfully suggest you read A Discipline of Programming instead.

That book makes it clear that Dijkstra wasn't talking about taking an existing piece of code and proving its correctness. (Down that path lies the halting problem, among other difficulties.) Instead, his approach used the requirements of proof of correctness to guide the development of the code itself (which, ironically, predates Test-Driven Development by about 30 years).

jfklein said...

jn: I gratefully accept your referral to "Discipline of Programming". My order has been placed with Amazon. Your characterization of Dijkstra's approach as proof-guided development does match how I perceived his presentation, my stumbling post notwithstanding, and I hope my agreement serves to correct any false impressions to other readers.

Certainly I'm not surprised that the approach of a brilliant researcher predates the common acceptance of something like test-driven development.

It feels like the test advocates are discovering a slice of the pie. Specifying inputs and outputs of a black box is the Hoare triple on a rough scale. The problem, as always, is that even on the level of a modest sized function under test, the true state space that it must work in is too large (accounting for exceptions, for example) for complete coverage, especially with case-by-case coverage as Dijkstra describes with the modified chessboard example.

I took a grad course in denotational semantics and got the feeling it was a booster shot of correct programming. I haven't totally understood why yet. But I think because it helped me see the underlying simplicity of the types out there and their usual operations, and how you can actually get complete coverage of cases if you can see past all the syntactic cruft through to the meanings. Either that or some sort of technical placebo effect.

Anonymous said...

I have a mathematics degree and own a copy of that book. It's a decent introduction, I suppose, but extremely basic.

If you're into CS then I recommend getting a solid book on discrete mathematics or graph theory. Any book published by Dover is probably good. If you're feeling particularly adventurous I recommend "Introduction to Discrete Mathematics" by Matousek and Nesetril.

jfklein said...

Jesse: "How to Prove It" is a bit of a basic introduction I agree. What I liked about it (or what it promised) is a survey of techniques. A bit like this list, but serious: http://www.ling.ed.ac.uk/~heycock/proof.html. This is the list with gems like Proof by eminent authority: "I saw Karp in the elevator and he said it was probably NP-complete." My electrical engineering friend added The Engineer's Proof: "If it works for three random x, it works for all x", which I'm sure is the approach taken by some unit test suites out there.

In addition to studying universal technique, I do also think it useful to study a variety of proofs in the field of interest, as you suggest. For theoretical programming languages this usually involves induction on trees: Given this environment and this typed phrase, the property holds for all ways to augment the tree by one step.

Theodore Norvell said...

"If someone wants to implement the algorithm, it has to be keyed in, perhaps translated to a programming language, and then what do you suppose should be done to ensure no errors were made in translating or keying this embodiment of the provably correct algorithm?" In the last chapter of A Discipline of Programming, Dijkstra addresses exactly this point. He talks of it being a third concern, the first being correctness of the algorithm and the second being the reliability of the hardware. He says "This third concern is not dealt with in this monograph, not because it is of no importance, but because it can (and therefore should) be separated from the others, and is dealt with by very different, specific precautions (proof reading, duplication, triplication, or other forms of redundancy)."