Tuesday, May 08, 2007

Iteration over neighbors

In working on a graph problem, I came up with a distant cousin of fold, which I'm calling foldNeighbors. I entertain no delusions this is anything new, but it was fun and wanted to share its derivation. Maybe someone knows a name for it.

Start with a list of vertices, taken to be a path through a graph, with the intent of summing the weights of the edges. The accompanying picture shows a graph of major US airports and some typical non-stop airfares that the Traveling Language Enthusiast might pay this week.

vertices = [ORD, SJC, AUS, JFK, TPA, LAX]


Note I'm using Standard ML-like notation here, []'s delimit lists, :: is cons or list prepend, and currying is rampant.

Decompose into edges and fold

A simple way to understand the problem is to split the vertices list into a list of edges, i.e. pairs, map the weight function W onto it, and feed that to fold for summing:
Given E = [[ORD,SJC], [SJC,AUS], [AUS,JFK], [JFK, TPA], [TPA, LAX],
cost = fold + 0 (map W E)

The interesting part is how to get the edge list E, and in particular how to write the base cases. Let's call neighbors the function that takes a list and returns a list of neighboring elements.

The case for the empty list:
neighbors [] = []
But what about an argument with a single element? This will work:
neighbors [x] = []
I'm not sure I can explain it better than by saying (1) a graph of one vertex has no neighbors and (2) it makes the case for a two-vertex graph work out.

The general case is then
neighbors (x::y::rest) = [x,y] :: neighbors(y::rest)

We check this with a sample run of the two and three vertex cases:
neighbors [ORD,SJC] =
neighbors(ORD::SJC::(rest=[])) =
neighbors [SJC] = []
neighbors [ORD,SJC] = [ORD,SJC]::[] =
Three vertices:
neighbors [ORD,SJC,AUS] =
neighbors(ORD::SJC::rest=[AUS]) =
neighbors [SJC,AUS] = [[SJC,AUS]] see 2-vertex case
neighbors [ORD,SJC,AUS] = [ORD,SJC]::[[SJC,AUS]] =

Generalizing into an iteration combinator

While I like the decomposition approach, it is easy to understand, it has the performance drawback of creating the intermediate list of pairs. Depending on what gets optimized by the compiler or interpreter, we may want to traverse the list once and simply perform the weight lookup and summing operations. This led me to the idea of fold. Fold itself won't work because it only examines one element at a time, and this function that sums the edge weights needs to examine pairs at a time.

Let's call the function of interest edgeSum. After we apply it to the edge summing, we can easily turn it into a fold-like iteration combinator. Let's say edgeSum needs the following arguments:
  • v - list of vertices

  • w - weighting function that takes a pair of vertices and computes or looks up the weight between them

  • f - the summing function appropriate to the type of the weight (e.g. integer plus or real plus)

  • z - an initial weight, typically zero, the applicable additive identity for the weight type

With this vocabulary, we can define edgeSum:
edgeSum w f z [] = z
edgeSum w f z [x] = z
edgeSum w f z (x::y::rest) = f(w {x,y}, edgeSum w f z (y::rest))
As the function edgeSum is defined (carefully) in general terms, all that remains is to rename it to something like foldNeighbors and we have a general-purpose iteration combinator.

The only refinement we may make is to have foldNeighbors take just one combining function f instead of one that operates on the pair and one that combines the result. This will typically look like fn((x,y),acc)=>body. When implementing edgeSum, we would use f = fn((x,y),acc)=>W(x,y)+acc.

foldNeighbors f z [] = z
foldNeighbors f z [x] = z
foldNeighbors f z (x::y::rest) =
f((x,y), foldNeighbors f z (y::rest))

No comments: