Following on to "Dependent Iteration", I tried the RC filter simulation on a semi-realistic problem with 1.2 million iterations in a few platforms. The two I want to report on seem to run the gamut of fast to slow:

Language | Total | Time/Iteration |
---|---|---|

Java | 29 ms | 24 ns/iteration |

Python | 1507 ms | 1255 ns/iteration |

I ran the filter function in a loop and averaged the times to report the numbers above. The Java code is reasonably close to the metal here: on a 3GHz CPU at 0.3ns/cycle, 24ns is 72 CPU cycles to perform 3 double precision arithmetic operations and two array lookups and some loop overhead, probably with cache block fetches amortized across the life of the loop:

There must be a better way to do this than my Python code (running on version 2.5.1). Aside from whether this uses single or double-precision arithmetic, I tried using both arrays and lists, and reported the faster (the list version, actually):

public double[] lowpass(double[] x, double dt, double RC) {

double[] y = new double[x.length];

double alpha = dt / (RC + dt);

y[0] = x[0];

for(int i=1; i<x.length; ++i)

y[i] = alpha * x[i] + (1-alpha) * y[i-1];

return y;

}

One interesting thing in these exercises is that for the number of language platforms I'd like to try, it takes a nontrivial amount of time to figure out (a) how to load from a file in one-float-per-line format and (b) how to get wall-clock time within the environment.

def lowpassArray(x,dt,RC):

y=array.array('f',range(len(x)))

y[0]=x[0]

yprev=x[0]

alpha=dt / (RC + dt)

for xi in x[1:]:

y[i] = alpha * xi + (1-alpha) * yprev

yprev=y[i]

return y

def lowpassList(x,dt,RC):

yprev=x[0]

alpha=dt / (RC + dt)

y = []

for xi in x[1:]:

yi=alpha * xi + (1-alpha)*yprev

yprev=yi

y.append(yi)

return y

What about your favorite functional or scripting platform? How fast can you get this dependent iteration loop with an input size of 1.2 million?

## 2 comments:

I assume the python numbers are for the regular interpreter and not using psyco?

The performance numbers using psyco (http://psyco.sourceforge.net/) would be interesting.

The 1507 ms total, 1255 ns/iteration for Python is indeed for the regular interpreter.

Taking a cue from you, I installed Psyco and got 588ms total, 490 ns/iteration for the list version. Very curiously, Pysco did nothing to improve the array version; it's quite possible I'm not doing something right there.

Post a Comment