Progress report: Reproducing the TIOBE PCI

In which reproducing the TIOBE PCI is found to be trickier than it first appeared.

Recently we looked at how TIOBE describes the computation of its Programming Community Index. The description, while detailed, is somewhat elusive, and curiosity overcame me. I've been hacking at a system for reproducing the results.

I've now got results, and they don't match what TIOBE publishes on the web. The current discrepancy is that I overlooked the phrase "for the last 12 months", which comes at the end of "The search query is executed for the regular Google, MSN, and Yahoo! web search and the Google newsgroups and blogs."

So, before I give you some notes about what I've learned, let's compare my results so far with what TIOBE published for March:

