Wednesday, May 26, 2004

Zipf's Law and Log-Normal Distributions.

If you have ever done data analysis, you have probably bumped into the (in)famous Zipfian distributions, popularized by the "80-20" law:
80% of the work is done by 20% of the people
The probability of the ith most frequent item is roughly proportional to i^(-alpha), for some constant alpha > 0
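
To make the rank-frequency rule concrete, here is a minimal sketch in Python. It assumes a finite number of items, with probabilities proportional to rank^(-alpha) and normalized to sum to 1. The function names, the 1000 items and alpha = 1.0 are illustrative choices, not anything taken from the survey.

```python
def zipf_probabilities(n_items, alpha):
    """Probabilities proportional to rank^(-alpha), normalized over n_items ranks."""
    weights = [rank ** -alpha for rank in range(1, n_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def top_share(probs, fraction):
    """Total probability held by the top `fraction` of items (probs sorted by rank)."""
    k = max(1, int(len(probs) * fraction))
    return sum(probs[:k])


# Illustrative parameters (assumptions): 1000 items, alpha = 1.0.
probs = zipf_probabilities(1000, 1.0)
print("share of the top 20% of items:", top_share(probs, 0.2))
```

With these particular parameters the top 20% of items hold most of the probability mass, which is the flavor of the "80-20" statement. The exact share depends on both alpha and the number of items.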

Michael Mitzenmacher has a nice survey (PS) outlining the history of such "power-law" distributions, the generative processes that create such distributions and their relationship to log-normal distributions (a distribution where the log of the variable is normally distributed). Worth a read...
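
The log-normal definition can be checked numerically. The sketch below is only an illustration and is not taken from the survey: it draws normal samples, exponentiates them, and confirms that the logs have the mean and standard deviation that went in. The helper name, seed and parameters are assumptions.

```python
import math
import random
import statistics


def sample_lognormal(mu, sigma, n, seed=0):
    """Draw n log-normal samples as exp(X), where X ~ Normal(mu, sigma)."""
    rng = random.Random(seed)
    return [math.exp(rng.gauss(mu, sigma)) for _ in range(n)]


# Illustrative parameters (assumptions): mu = 0, sigma = 1, 10,000 samples.
samples = sample_lognormal(mu=0.0, sigma=1.0, n=10_000)
logs = [math.log(x) for x in samples]

# The logs should look normal with roughly the mu and sigma chosen above.
print("mean of logs:", statistics.mean(logs))
print("stdev of logs:", statistics.stdev(logs))

# The raw samples are skewed to the right: the mean sits above the median.
print("mean:", statistics.mean(samples), "median:", statistics.median(samples))
```

The gap between mean and median comes from the long right tail, which is part of why log-normal data can be mistaken for a power law.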

