2

Coupon Collector's Problem

 2 years ago
source link: https://sighingnow.github.io/math/coupon_collector_problem.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

Coupon Collector's Problem

Aug 23, 2018 • Tao He| Probability Theory

The coupon collector’s problem describes the following question: there are n different types of coupons and the collector will receive a random coupon every time, what is the expected trials that the collector can collect all n types of coupons? We can conclude that the expected trails grows as O(nlog⁡n).

The Distribution of Trails for a new Coupon

Let T be the time to collect all n types of coupons, and ti be the time to collect a new coupon after i−1 coupons have been collected. We can see that the probability of ti satisfies geometric distribution with pi=n−(i−1)n, then we conclude

E(ti)=1pi=nn−(i−1)

We also obviously know that ti and tj are independent. Thus

E(T)=∑E(ti)=1p1+1p2+⋯+1pn=nn+nn−1+⋯+n1=n⋅(1n+nn−1+⋯+11)≃nlog⁡n+γn+12+O(1n)

The variance of the random variable T can be calculated as (since ti are independent)

Var(T)=∑Var(ti)=1−p1p12+1−p2p22+⋯+1−pnpn2<=n2n2+n2(n−1)2+⋯+n212<=n2⋅(112+122+⋯+1n2)=n2⋅π26

A classical puzzle that can be modeled as Coupon Collector’s problem is that, for a cubic dice, we are expected to get all six points after how many trails? The answer for the question should be

∑166i≃14.7≃15

Generalization

The Coupon Collector’s Problem has been generalized as the expected trails of collect m copy of each n coupons. When m=2, the problem is also called The Double Dixie Cup Problem1. The expectation of Tm satisfies

E(Tm)=nlog⁡n+(m−1)nlog⁡log⁡n+O(n),asn→∞

In the more general case, when pi is nonuniform, Philippe Flajolet gives2

E(T)=∫0∞(1−∏i=1n1−e−pi⋅t)dt

If the collector receives d coupons each run3, in the asymptotic case the m copy of of each n coupons will be collected expected within time

E(Tm)=nlog⁡n/d+(m−1)(n/d)log⁡log⁡n+O(n⋅m)

The Coupon Collector’s Problem can also be generalized as stochastic process456. The paper from MAT7 also gives exhaustive analysis of this problem. The expectation can also be expressed using the Stirling numbers89.

References


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK