Starting somewhere: SIGIR 2006

I had to start somewhere or else this blog would be idle forever, so I've decided to start with my impressions of SIGIR 2006, the first one that I've been to. I suddenly realized how much I should have worked with this group during my Ph.D. There is an interesting overlap with what I was working on, both in problems and solutions. I actually saw a presentation that covered pretty much 50% of the basis of my research, varying only the method used (and, a little, the results expected from its use). Of course nobody referenced any of my papers, but that's what happens when you are not in the same research area. There is just too much out there, so people tend to isolate themselves within a specific research group.

And this brings me to one of the most interesting things I noticed while listening to the talks: SIGIR is a very small community. There were about 700 people at the conference. At least 10% or so were from Microsoft, about 5% from Google, 5% from Yahoo, and some from other companies. As for actual researchers, I'm guessing there were about 300.

Aside from that, my other observation was about the tight relationship among these researchers. There was a core of about 20 labs that basically define SIGIR. They seem to have been there for years, citing each other, collaborating, and defining the state-of-the-art datasets and baseline solutions. Compared to the other conferences I've gone to, this one had the largest number of speakers pointing out people in the audience whom they cite, and of people standing up at the end of a talk to share their own experience with the dataset and make constructive suggestions based on how they had tried to tackle some of the issues the presenters observed.

This is a very good sign in many ways. It creates a very productive environment, and one where results are comparable, so you can draw better conclusions about what you have done (I know I've suffered a lot from the lack of that in my research). However, it also causes inbreeding. New ideas and types of solutions are harder to come by, mostly because labs are building large infrastructures for a certain type of system and it's hard to part with them (I listened to a couple of talks where the presenter had a very hard time explaining the work in 25 minutes, because it was a temporal slice of results from a system that has been in the works for 10+ years), but also because there is an incentive to create things that are comparable to what other people have created.

In summary, it was a very exciting conference. I still have to follow up with some people I met there and send them links to some of my papers, so that I can feel that what I've done in the past is not going to waste in the "paper cloud". We'll see what comes out of it.

I wish I had more time to write reviews of specific papers, but I'll have to leave that for some other lifetime, or parallel dimension.