## Tuesday, August 28, 2007

### Streaming Summer School Report

(Ed: This post is by Piotr Indyk, who is always willing to be your near neighbor)

Greetings from the Kingdom of Denmark! The country of Vikings, meatballs, and football teams that just refuse to win hosted the Summer School on Data Stream Algorithms last week (August 20-23). The school was organized under the banner of MADALGO, a new research center dedicated to MAssive DAta ALGOrithms, set up at Aarhus University. The inauguration ceremony for the center took place on August 24, with several people giving invited lectures.

Muthu (one of the invited lecturers) has covered the inauguration ceremony, so I will skip the detailed description. Suffice it to say, it was a pleasure to see that the Danish Research Foundation (or as the locals like to say, Grundforskningsfond) is eager to support an algorithmic research center with a budget of roughly $10M over 5 years, while its US counterpart spends about $7M per year for the entire Theory of Computing program. Did I mention that the population of Denmark is roughly 2% of that of the US?

Anyway, back to the summer school. We had 70+ participants altogether, including 5 lecturers. The school covered the following topics:
• The dynamic Sudipto Guha gave two lectures. The first lecture was on algorithms for clustering. Massive amounts of data were clustered, including metric data, graph data, and a few careless participants sitting in the first row. In the second lecture, Sudipto covered the "random stream model", where the elements are assumed to be arriving in a random order, which circumvents the usual worst-case paranoia.
• The twin duo of T.S. Jayram and Ravi Kumar covered lower bounds: communication complexity, information complexity, and generally "everything you wanted to know but were afraid to ask". It was the first time I had seen the details of the linear-space lower bound for estimating the L_∞ distance, and I am happy to report that I understood everything, or at least that is what I thought at the time. Jayram and Ravi also occasionally ventured into the land of upper bounds, covering algorithms for longest increasing subsequences and probabilistic data streams.
• The scholarly Martin Strauss gave an overview of the algorithms for finding frequent elements, heavy hitters (sometimes on steroids) and their more recent versions used in compressed sensing.
• I covered the basic upper bounds for L_p norm/frequency moments estimation, as well as the algorithms for geometric data (clustering, MST, matching), notably those based on core-sets. The latter topic was originally supposed to be covered by Sariel Har-Peled; however, the ~~dark forces~~ highly enlightened and insightful geniuses of the INS [Sariel's corrections] have jeopardized his plans. I guess the force was not strong enough with this one...

We also had an open problem session. Some of the problems were copy-pasted from the "Kanpur list", but a few new problems were posed as well. The list will be available shortly on the school website, so sharpen your pencils, prepare your napkins, pour some coffee, and ... give all of this to your students!

The lecture slides are also available on-line. If you spot any typos, let the lecturers know.

Overall, I think the school was a success, perhaps with the notable exception of the weather: it started to rain promptly after the school began, and it stopped when the school ended. One has to admire the timing though.

SOCG 2009 will be held in Aarhus. See you then!

(Ed: But what about the beer report?)