1. Correct Ordering in the Zipf–Poisson Ensemble.
- Author
Dyer, Justin S. and Owen, Art B.
- Subjects
*POWER law (Mathematics), *PROBABILITY theory, *POISSON distribution, *STATISTICAL ensembles
- Abstract
Rankings based on counts are often presented to identify popular items, such as baby names, English words, or Web sites. This article shows that, in some examples, the number of correctly identified items can be very small. We introduce a standard error versus rank plot to diagnose possible misrankings. Then, to explain the slowly growing number of correct ranks, we model the entire set of count data via a Zipf–Poisson ensemble with independent $X_i \sim \mathrm{Poi}(N i^{-\alpha})$ for $\alpha > 1$, $N > 0$, and integers $i \geq 1$. We show that as $N \to \infty$, the first $n'(N)$ random variables have their proper order relative to each other, with probability tending to 1 for $n'$ up to $(AN/\log N)^{1/(\alpha+2)}$ for $A = \alpha^2(\alpha+2)/4$. We also show that the rate $N^{1/(\alpha+2)}$ cannot be achieved. The ordering of the first $n'(N)$ entities does not preclude $X_m > X_{n'}$ for some interloping $m > n'$. However, we show that the first $n''$ random variables are correctly ordered exclusive of any interlopers, with probability tending to 1 if $n'' \leq (BN/\log N)^{1/(\alpha+2)}$ for any $B$
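To get a feel for how slowly the number of correctly ordered ranks grows, the cutoff $(AN/\log N)^{1/(\alpha+2)}$ with $A = \alpha^2(\alpha+2)/4$ from the abstract can be evaluated directly. The following is a minimal sketch, not the authors' code: the function name `correct_order_cutoff` and the choice $\alpha = 2$ are assumptions made for illustration, and because the result is asymptotic in $N$, values at finite $N$ are only indicative.

```python
import math


def correct_order_cutoff(N: float, alpha: float) -> float:
    """Evaluate (A * N / log N) ** (1 / (alpha + 2)), A = alpha**2 * (alpha + 2) / 4.

    Hypothetical helper: it only evaluates the formula quoted in the
    abstract and does not simulate the Zipf-Poisson ensemble.
    """
    if alpha <= 1:
        raise ValueError("the model requires alpha > 1")
    if N <= 1:
        raise ValueError("need N > 1 so that log(N) is positive")
    A = alpha ** 2 * (alpha + 2) / 4
    return (A * N / math.log(N)) ** (1 / (alpha + 2))


# Illustrative values only; alpha = 2.0 is an assumed exponent.
for N in (1e3, 1e6, 1e9):
    print(f"N = {N:.0e}: cutoff is about {correct_order_cutoff(N, alpha=2.0):.1f}")
```

Since the exponent is $1/(\alpha+2)$, multiplying $N$ by a thousand raises the cutoff by well under a factor of ten when $\alpha = 2$, which is consistent with the abstract's remark that the number of correct ranks grows slowly.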
- Published
2012