Friday, April 28, 2006

A Night at the Opera -- They're Doing HTTP/1.1

It finally sank in that Opera's text-to-speech feature might be useful. So I had it read some stuff to me. I listened to the news while browsing my code, but I couldn't concentrate on both. That's the trouble with not having enough mindless tasks.

Then I hit upon the perfect use. This is how I can actually get through all these RFCs I've been meaning to read. Browsing to those de-facto Internet specifications is like an insomnia cure to me lately. The sentences lull you, alternating between being impenetrably obtuse and stupefyingly obvious. But with Opera Man reading the HTTP/1.1 (he says "slash one point one" just like you'd expect) RFC while I study it, suddenly it's like I'm in a classroom. I take in the easy bits visually, and skip ahead to the hard ones, or study the diagrams. The steady voice of the reader sets a pace, helping me not to lose my place if I skip ahead or lag behind.

The text-to-speech technology is certainly adequate, and its accuracy in pronunciation, inflection, and intonation is quite good. Only once did I hear a slip-up, when "content negotiation" sounded like the negotiation was well-satisfied rather than negotiating about the innards of a resource.

So the next time you're having trouble getting through something, give the Opera reader a try. And if you need to fight insomnia, point Opera to Project Gutenberg and have it read a bedtime selection from Alice in Wonderland.

Wednesday, April 26, 2006

Chicken chicken; he declared

Give your variables meaningful names. That's Good Programming Practice. But what's "meaningful" is open to interpretation. Without "foreach", you can argue that "i" is as good as it gets in the C++ or Java array-like iteration idiom:

for (int i = 0; i < things.size(); ++i)

You can also argue that in simple mathematically-oriented functions, plain old "x" and "y" are about as meaningful as you can get:

int max(int x, int y) {
    return x > y ? x : y;
}

A lot of the need for meaningful variable names arises for variables whose type is so common, like int or string or Object, that you need a good name to differentiate this XML attribute name string from that fully-qualified classname string we're using as a hash key.

This leads us to the gray area of variables meaning just one thing. I'm talking about variables whose type is unique in your context: there's only one database Connection object, only one Runtime, or one (shopping) Cart. In cases like these, there's a strong temptation to reuse the type name itself as the variable name:

Connection connection;
Runtime runtime;
Cart cart;
Type type;
Chicken chicken;

There's one great thing about this practice: naming is a no-brainer. A downside is that the variable name is redundant when it could be carrying more information.

How do you decide whether "Cart cart" or "Connection connection" is a sufficient name? This is where context comes in. If the whole purpose of a module is to take a Chicken and make Soup -- how much more information do I need to know about that Chicken? I can reasonably proceed to chicken.pluck() and all the rest, according to taste.

Perhaps, then, it is the variables in "supporting roles", those that exist out of nuts-and-bolts coding necessity, that need good names. Things like "int returnCode;" and "boolean success;" and that int returned by a substring find function could probably use better names.

P.S. Today's title is brought to you by the seminal work on Chicken.

Monday, April 24, 2006

Bug Story: When Is 9+1 Not Supposed to Be 10?

It was a simple mistake, but in an embedded system it was hard to debug and had mysterious effects. Our team had a limited region of RAM for each task to store its own copy of a fixed-sized chunk of data. (A "task" in our operating system, pSOS, is the same as a thread or process.) In a C header, we defined constants to use as the start of each task's fixed region of memory, something like this:
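The listing itself is not shown here, so what follows is only a sketch of the shape such a header might have taken. The base address, the chunk size, and the macro names for the base and size are hypothetical; the tasks are named after the phonetic alphabet, following the TASK_..._STATICS pattern used later in the story.

```c
/* Sketch of the header: base address and chunk size are made-up values. */
#define STATICS_BASE  0x20000000UL  /* hypothetical start of the shared RAM region */
#define STATICS_SIZE  0x100UL       /* hypothetical size of one task's chunk, in bytes */

/* Each task's chunk starts a whole number of chunks past the base. */
#define TASK_ALPHA_STATICS    (STATICS_BASE + 0x0 * STATICS_SIZE)
#define TASK_BRAVO_STATICS    (STATICS_BASE + 0x1 * STATICS_SIZE)
#define TASK_CHARLIE_STATICS  (STATICS_BASE + 0x2 * STATICS_SIZE)
#define TASK_DELTA_STATICS    (STATICS_BASE + 0x3 * STATICS_SIZE)
#define TASK_ECHO_STATICS     (STATICS_BASE + 0x4 * STATICS_SIZE)
#define TASK_FOXTROT_STATICS  (STATICS_BASE + 0x5 * STATICS_SIZE)
```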


And so on. Things tooled along, we got the first features going, and quite a ways into the project we started seeing weird crashes and unexplained error conditions. We found that some of the tasks were writing past that limited region of RAM. But why? While it was limited, we had calculated it had room enough for twenty-some tasks, and we only had 18 or so in our application.

What went wrong? Well, imagine you are an engineer who has just finished some feature development, and for the next feature in the application you need to code up a new task (thread), let's say to process a new stream of input from a real-time source. One of the various things you need to do is edit that header we talked about, and add your task, let's say it's the Golf task:

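A sketch of what that edit might look like, with a made-up base address and chunk size and a hypothetical neighboring entry to copy from:

```c
#define STATICS_BASE  0x20000000UL  /* hypothetical base address */
#define STATICS_SIZE  0x100UL       /* hypothetical chunk size, in bytes */

/* The existing last entry, which gets copied... */
#define TASK_FOXTROT_STATICS  (STATICS_BASE + 0x5 * STATICS_SIZE)
/* ...and the pasted copy, with the name and the number changed. */
#define TASK_GOLF_STATICS     (STATICS_BASE + 0x6 * STATICS_SIZE)
```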

Simple: it's a cut-and-paste job, you just change the name and the number of your task, so your pointer points to the next section in this self-constructed array.
Let's look at the next few entries in this progression:
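Sketched with a made-up base address and chunk size, and with task names that are hypothetical apart from TASK_KILO_STATICS, those entries might have read:

```c
#define STATICS_BASE  0x20000000UL  /* hypothetical base address */
#define STATICS_SIZE  0x100UL       /* hypothetical chunk size, in bytes */

/* Each line is a copy of the one above it, with the name and number bumped. */
#define TASK_HOTEL_STATICS   (STATICS_BASE + 0x7  * STATICS_SIZE)
#define TASK_INDIA_STATICS   (STATICS_BASE + 0x8  * STATICS_SIZE)
#define TASK_JULIET_STATICS  (STATICS_BASE + 0x9  * STATICS_SIZE)
#define TASK_KILO_STATICS    (STATICS_BASE + 0x10 * STATICS_SIZE)
```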


and so on up to eighteen, which in this telling is the last task:
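A sketch of the tail end of the header, with a made-up base address and chunk size and a hypothetical name for the entry before Sierra:

```c
#define STATICS_BASE  0x20000000UL  /* hypothetical base address */
#define STATICS_SIZE  0x100UL       /* hypothetical chunk size, in bytes */

#define TASK_ROMEO_STATICS   (STATICS_BASE + 0x17 * STATICS_SIZE)
#define TASK_SIERRA_STATICS  (STATICS_BASE + 0x18 * STATICS_SIZE)  /* the last task */
```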


The keen-eyed C programmers among you will have seen the error. What we have here is the Sierra task addressing data that is actually 24 entries past the BASE pointer, not 18, because every one of those multipliers that a programmer had to edit has the "0x" hex prefix; that is, the constant is in base 16, not base 10. So those latter tasks were writing past their permitted region of memory, as if we really had 24 tasks in the system. It was a false lack of resources. We could have tested for this at runtime but didn't, because we figured we had already accounted for it when we counted tasks and added up the memory.

It is interesting to see where the error crept in. The TASK_KILO_STATICS entry is most likely where things went awry, continuing the subtly wrong pattern of "8, 9, 10" when it should have been "8, 9, A", or "0x8, 0x9, 0xA".
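To spell out the arithmetic, here are the multipliers as they were presumably typed, next to what was intended. The enumerator names are hypothetical, made up for this illustration; only the values come from the story.

```c
/* Multipliers as typed in the header versus what the programmer meant. */
enum {
    JULIET_AS_TYPED  = 0x9,   /* 9: hex and decimal still agree here */
    KILO_AS_TYPED    = 0x10,  /* 16, although ten was meant */
    KILO_INTENDED    = 0xA,   /* 10 */
    SIERRA_AS_TYPED  = 0x18,  /* 24 */
    SIERRA_INTENDED  = 0x12,  /* 18 */

    /* How many chunks too far out every entry from Kilo onward lands. */
    CHUNKS_SKIPPED   = KILO_AS_TYPED - KILO_INTENDED  /* 6 */
};
```

Six chunks sit unused between the Juliet and Kilo tasks, and the last six tasks reach six chunks beyond where the memory budget said they would.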

How could this error have been prevented? Is writing our constants in hex not the best choice? What other code structures could have prevented this? The only reasonable alternative I can think of is the "previous+1" approach:
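As a sketch, with a made-up base address and chunk size and hypothetical task names, each entry is defined as one chunk past the entry before it, so nobody ever types a multiplier:

```c
#define STATICS_BASE  0x20000000UL  /* hypothetical base address */
#define STATICS_SIZE  0x100UL       /* hypothetical chunk size, in bytes */

/* Each entry refers to the previous one instead of carrying its own number. */
#define TASK_ALPHA_STATICS    (STATICS_BASE)
#define TASK_BRAVO_STATICS    (TASK_ALPHA_STATICS   + STATICS_SIZE)
#define TASK_CHARLIE_STATICS  (TASK_BRAVO_STATICS   + STATICS_SIZE)
#define TASK_DELTA_STATICS    (TASK_CHARLIE_STATICS + STATICS_SIZE)
```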


This eliminates the "9+1" problem, but did we introduce some other possible problem, like forgetting to link an entry to a previous entry, especially when rearranging the order of these entries?

If anyone can think of other ways to have avoided or tested for this problem (without resorting to another programming language), I especially invite your comments.
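One possibility, offered only as a sketch and not as what our team did: let the compiler do the counting with an enum, and have it check the total against the size of the region at compile time. All names and values here are hypothetical, and static_assert assumes a C11 compiler, which we certainly did not have on that project.

```c
#include <assert.h>  /* static_assert (C11) */

#define STATICS_BASE         0x20000000UL  /* hypothetical base address */
#define STATICS_SIZE         0x100UL       /* hypothetical chunk size, in bytes */
#define STATICS_REGION_SIZE  0x1800UL      /* hypothetical: room for 24 chunks */

/* The compiler numbers the tasks 0, 1, 2, ... so nobody types a constant. */
enum task_index {
    TASK_ALPHA,
    TASK_BRAVO,
    TASK_CHARLIE,
    /* New tasks are added here, just above TASK_COUNT. */
    TASK_COUNT  /* always last: the number of tasks defined above */
};

/* Address of the chunk belonging to the task with the given index. */
#define TASK_STATICS(index)  (STATICS_BASE + (unsigned long)(index) * STATICS_SIZE)

/* Compilation fails if the tasks no longer fit in the region. */
static_assert(TASK_COUNT * STATICS_SIZE <= STATICS_REGION_SIZE,
              "task statics overflow their RAM region");
```

A task would then use TASK_STATICS(TASK_BRAVO) where it used to use TASK_BRAVO_STATICS. Rearranging the enum renumbers everything automatically, which also sidesteps the forgotten-link worry of the "previous+1" approach.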

If you're interested in reading excerpts of other bug stories and a discussion of causal analyses, check out "My hairiest bug war stories" by Marc Eisenstadt. I like the way he refers to the "detective work" programmers do in debugging.


Recently I was playing with Millstone, an open-source "web user interface component library." It makes you feel like you're writing a Swing-style UI, in Java source, and it creates the web interface for you. The nice thing is that if you happen to like the default "theme" (look-and-feel, presentation details), you never have to get your hands dirty with the HTML or JavaScript details. I find the optional Eclipse plugin highly useful, as it features an automatic end-to-end refresh -- you edit the source, and suddenly, it has recompiled everything and refreshed a browser with the new UI (displayed within Eclipse).

Of course, many applications will require some kind of customized theme, which can be done by tweaking, or overhauling, the XSL stylesheets. I have not played sufficiently with XSL to just go in and do this, especially compared to the kind of sweeping changes I like to envision, but it's nice to know that the smart Millstone engineers have achieved such a clean separation that I can do this without modifying my application's Java source code.

One area I wanted to play with more was validation. Millstone has a nice model for this, with Validator, Validatable, and ErrorMessage classes. Validators can be composed, and a bunch of them can be added to a Property or a component. The ErrorMessage is something that can be stuck onto a component so it shows a little error icon that you click for more info. Of course that's the default behavior and can be customized. It seems like a pretty flexible interface.

However, in my brief time I had difficulty getting my Validators to be invoked. According to the Millstone forums, validation used to be completely automatic, and now you must invoke it explicitly. This doesn't seem surprising, as I'm guessing the automatic stuff didn't completely suit everyone, and enough users needed explicit control. My goal was to use a text field where the user would enter a port number, so I decreed that the text field is only valid when it holds a number from 0 to 65535, or is simply blank (where a default is inferred). At first I tried adding a change listener which would invoke the validator, but it didn't seem to be invoked after I typed an invalid value ("-1" or "abc") and hit "return". It's quite possible my page needs some kind of submit button, I don't know.

When I called setInvalidAllowed(false) on my TextField, the validator did seem to run, and I would get the expected red error icon if I entered an invalid value. One interesting thing the system did on these errors was to reset the text field to blank. I find that intriguing, because I'm not generally used to a UI control changing its value on its own. Presumably you can override or nullify this behavior somehow, because I could envision the text field holding some long text string with a single typo, and the user looking on with dismay to see it disappear forever. Maybe the text field has an undo history, I don't know.
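For what it's worth, the port rule I was trying to enforce (a whole number from 0 to 65535, or blank for a default) is easy enough to state outside of any UI framework. Here is a minimal sketch in C. It is not Millstone's Validator API, and the function name, the default port, and the decision to tolerate surrounding whitespace are all my own assumptions.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define DEFAULT_PORT 8080  /* hypothetical default used when the field is blank */

/* Returns true and stores the port if text is blank or a whole number in 0..65535. */
bool parse_port(const char *text, int *port_out)
{
    /* Skip leading whitespace; an empty or all-blank field means "use the default". */
    while (isspace((unsigned char)*text)) {
        text++;
    }
    if (*text == '\0') {
        *port_out = DEFAULT_PORT;
        return true;
    }

    /* Require a digit first, so "-1", "+80" and "abc" are rejected outright. */
    if (!isdigit((unsigned char)*text)) {
        return false;
    }

    char *end;
    errno = 0;
    long value = strtol(text, &end, 10);
    if (errno == ERANGE || value > 65535) {
        return false;
    }

    /* Allow trailing whitespace, but nothing else, after the digits. */
    while (isspace((unsigned char)*end)) {
        end++;
    }
    if (*end != '\0') {
        return false;
    }

    *port_out = (int)value;
    return true;
}
```

Note that a failed parse leaves the caller's value alone, which is the opposite of the reset-to-blank behavior I saw in the text field.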

Despite this experience, I'm impressed with the Millstone package and would like to see it come more into the mainstream, because it looks like a well-designed package with plenty of room to grow.

Welcome, gentle reader. JFKBits is about software and life; hopefully how to make life better, or less painful, through software. I would like to publish several things here:

  1. What's new (or old) in technology -- what's out there, what I'm finding useful, what could be better. I don't necessarily stay very up to date, so don't be surprised if I write about something that's 10 years old, if I just started experiencing it.
  2. Stories -- "war stories" and "bug stories" about my own software experience, to generate discussion on what we can learn from them.
  3. Tips and tricks that I find for debugging specific problems. An ideal for me on this front is that whenever I encounter an error message or a funny bit of behavior, I blog it along with any solution I find. I feel this would be helpful in terms of replicating useful information, as well as adding to the reference count of any useful web sources.
  4. Theory Stories -- My own attempts to explain, for myself mostly, some concept or technology. In particular I would like to finish a project I started in my master's thesis where I was attempting to explain what I had learned, in a manner that was less formal than is required by a thesis.
  5. Reflections on technology and how it fits into the rest of life. I intend most of my audience to be tech-focused, and I think we tend to assume technology *is* life, or at least an important element of Life As it Was Intended To Be, but we should probably force ourselves to admit, occasionally, that technology serves some higher purposes. If we have to do that, might as well use bits.
Those are the goals. I'm looking forward to this.