Discrete uniform distribution


In probability theory and statistics, the discrete uniform distribution is a symmetric probability distribution in which each of a finite number n of outcome values is equally likely to be observed. Thus each of the n outcome values has probability 1/n. Intuitively, a discrete uniform distribution is "a known, finite number of outcomes all equally likely to happen."

A simple example of the discrete uniform distribution comes from throwing a fair six-sided die. The possible values are 1, 2, 3, 4, 5, 6, and each time the die is thrown the probability of each given value is 1/6. If two dice were thrown and their values added, the possible sums would not have equal probability and so the distribution of sums of two dice rolls is not uniform.
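The contrast between one die and the sum of two dice can be checked by direct enumeration. The following is a minimal sketch; the variable names are illustrative, and exact fractions are used so that no rounding is involved.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# One fair die: every face has probability 1/6.
single_die = {face: Fraction(1, 6) for face in faces}

# Sum of two fair dice: count how many of the 36 equally likely
# ordered pairs (first die, second die) produce each total.
pair_counts = Counter(a + b for a, b in product(faces, repeat=2))
two_dice = {total: Fraction(count, 36)
            for total, count in sorted(pair_counts.items())}

for total, prob in two_dice.items():
    print(total, prob)
```

The single die assigns the same probability to every face, whereas the totals range from probability 1/36 (for 2 and 12) up to 1/6 (for 7), so the sum is not uniformly distributed.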

Although it is common to consider discrete uniform distributions over a contiguous range of integers, such as in this six-sided die example, one can define discrete uniform distributions over any finite set. For instance, the six-sided die could have abstract symbols rather than numbers on each of its faces. Less simply, a random permutation is a permutation generated uniformly randomly from the permutations of a given set and a uniform spanning tree of a graph is a spanning tree selected with uniform probabilities from the full set of spanning trees of the graph.

The discrete uniform distribution itself is non-parametric. However, in the common case where its possible outcome values are the integers in an interval ${\textstyle [a,b]}$, a and b are parameters of the distribution and ${\textstyle n=b-a+1.}$ In this case the cumulative distribution function (CDF) of the discrete uniform distribution can be expressed, for any real k, as

$F(k;a,b)=\min \left(\max \left({\frac {\lfloor k\rfloor -a+1}{b-a+1}},0\right),1\right),$

or simply

$F(k;a,b)={\frac {\lfloor k\rfloor -a+1}{b-a+1}}$

on the distribution's support ${\textstyle k\in [a,b].}$
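As an illustration, the clamped form of the CDF translates almost directly into code. This is a sketch only; the function name `discrete_uniform_cdf` is hypothetical and does not come from any particular library.

```python
import math

def discrete_uniform_cdf(k, a, b):
    """CDF of the discrete uniform distribution on the integers a..b.

    k may be any real number; a and b are integers with a <= b.
    """
    if b < a:
        raise ValueError("require a <= b")
    n = b - a + 1
    raw = (math.floor(k) - a + 1) / n
    # Clamp to [0, 1] so the formula is also valid outside the support.
    return min(max(raw, 0.0), 1.0)

print(discrete_uniform_cdf(3, 1, 6))    # half of the faces of a die are <= 3
print(discrete_uniform_cdf(0, 1, 6))    # below the support
print(discrete_uniform_cdf(6.9, 1, 6))  # at or above the maximum
```

Inside the support the clamping has no effect, which is why the simpler unclamped expression suffices there.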

Estimation of maximum

The problem of estimating the maximum $N$ of a discrete uniform distribution on the integer interval $[1,N]$ from a sample of k observations is commonly known as the German tank problem, following the practical application of this maximum estimation problem, during World War II, by Allied forces seeking to estimate German tank production.

A uniformly minimum variance unbiased (UMVU) estimator for the distribution's maximum in terms of m, the sample maximum, and k, the sample size, is

${\hat {N}}={\frac {k+1}{k}}m-1=m+{\frac {m}{k}}-1.$^[1]

This can be seen as a very simple case of maximum spacing estimation.

The estimator has a variance of

${\frac {1}{k}}{\frac {(N-k)(N+1)}{(k+2)}}\approx {\frac {N^{2}}{k^{2}}}{\text{ for small samples }}k\ll N,$^[1]

and therefore a standard deviation of approximately ${\tfrac {N}{k}}$, the population-average gap size between samples.

The sample maximum $m$ itself is the maximum likelihood estimator for the population maximum, but it is biased.
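The bias of the sample maximum, the unbiasedness of the estimator ${\hat {N}}$, and the variance formula can all be checked exactly for small cases. The sketch below assumes that the k observations are drawn without replacement from 1..N (the usual German tank setting), in which case the sample maximum equals j with probability C(j−1, k−1)/C(N, k). The function names are illustrative.

```python
from fractions import Fraction
from math import comb

def max_pmf(N, k):
    """P(m = j) for the maximum m of k draws without replacement from 1..N."""
    return {j: Fraction(comb(j - 1, k - 1), comb(N, k))
            for j in range(k, N + 1)}

def estimator(m, k):
    """Estimate of N from the sample maximum m and the sample size k."""
    return Fraction(k + 1, k) * m - 1

def exact_moments(N, k):
    """Exact E[m], E[N_hat] and Var[N_hat] by summing over the pmf of m."""
    pmf = max_pmf(N, k)
    mean_m = sum(j * p for j, p in pmf.items())
    mean_est = sum(estimator(j, k) * p for j, p in pmf.items())
    var_est = sum((estimator(j, k) - mean_est) ** 2 * p
                  for j, p in pmf.items())
    return mean_m, mean_est, var_est

N, k = 10, 2
mean_m, mean_est, var_est = exact_moments(N, k)
print(mean_m, mean_est, var_est)  # prints: 22/3 10 11
```

For N = 10 and k = 2 the expected sample maximum is 22/3, below the true maximum, while the corrected estimator has expectation exactly 10 and variance 11, matching (N − k)(N + 1)/(k(k + 2)).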

If samples from a discrete uniform distribution are not numbered in order but are recognizable or markable, one can instead estimate population size via a mark and recapture method.

Random permutation

See rencontres numbers for an account of the probability distribution of the number of fixed points of a uniformly distributed random permutation.
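For small sets, the distribution of the number of fixed points can be obtained by enumerating every permutation, each of which is equally likely. The following is a brute-force sketch intended only for small n, since it visits all n! permutations; the function name is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

def fixed_point_distribution(n):
    """Distribution of the number of fixed points of a uniformly
    random permutation of n items, by exhaustive enumeration."""
    counts = Counter(
        sum(1 for i, image in enumerate(perm) if i == image)
        for perm in permutations(range(n))
    )
    total = sum(counts.values())  # n! permutations, each equally likely
    return {fixed: Fraction(count, total)
            for fixed, count in sorted(counts.items())}

dist = fixed_point_distribution(4)
for fixed, prob in dist.items():
    print(fixed, prob)
```

For n = 4 the 24 permutations split into 9, 8, 6, 0 and 1 permutations with 0, 1, 2, 3 and 4 fixed points respectively; exactly three fixed points is impossible because the remaining element would then be fixed as well.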

Properties

The family of uniform discrete distributions over ranges of integers with one or both bounds unknown has a finite-dimensional sufficient statistic, namely the triple of the sample maximum, sample minimum, and sample size.
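Sufficiency can be made concrete through the likelihood function. Assuming independent draws from the uniform distribution on the integers a..b, the likelihood of (a, b) is (b − a + 1)^(−k) when every observation lies in [a, b] and 0 otherwise, so it depends on the data only through the minimum, the maximum, and the sample size k. The sketch below uses a hypothetical function name and made-up samples.

```python
def likelihood(sample, a, b):
    """Likelihood of integer bounds (a, b) for independent draws
    from the discrete uniform distribution on a..b."""
    if b < a:
        raise ValueError("require a <= b")
    if min(sample) < a or max(sample) > b:
        return 0.0  # some observation is impossible under (a, b)
    return (b - a + 1) ** (-len(sample))

# Two different samples sharing the same minimum (3), maximum (9) and size (5).
x = [3, 7, 4, 4, 9]
y = [9, 3, 5, 8, 6]

for a, b in [(1, 10), (3, 9), (0, 12), (4, 9)]:
    print((a, b), likelihood(x, a, b), likelihood(y, a, b))
```

Both samples yield identical likelihoods for every choice of bounds, which is what it means for the triple (minimum, maximum, size) to be sufficient.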

Uniform discrete distributions over bounded integer ranges do not constitute an exponential family of distributions because their support varies with their parameters.

For families of distributions whose supports do not depend on their parameters, the Pitman–Koopman–Darmois theorem states that only exponential families have sufficient statistics whose dimension remains bounded as the sample size increases. The discrete uniform distribution is thus a simple example showing that this condition on the support is necessary for the theorem.

References

[1] Johnson, Roger (1994), "Estimating the Size of a Population", Teaching Statistics, 16 (2 (Summer)): 50–52, CiteSeerX 10.1.1.385.5463, doi:10.1111/j.1467-9639.1994.tb00688.x

Discrete uniform
Probability mass function; n = 5, where n = b − a + 1
Cumulative distribution function
Notation	${\mathcal {U}}\{a,b\}$ or $\mathrm {unif} \{a,b\}$
Parameters	$a,b$ integers with $b\geq a$; $n=b-a+1$
Support	$k\in \{a,a+1,\dots ,b-1,b\}$
PMF	${\frac {1}{n}}$
CDF	${\frac {\lfloor k\rfloor -a+1}{n}}$
Mean	${\frac {a+b}{2}}$
Median	${\frac {a+b}{2}}$
Mode	N/A
Variance	${\frac {(b-a+1)^{2}-1}{12}}$
Skewness	$0$
Excess kurtosis	$-{\frac {6(n^{2}+1)}{5(n^{2}-1)}}$
Entropy	$\ln(n)$
MGF	${\frac {e^{at}-e^{(b+1)t}}{n(1-e^{t})}}$
CF	${\frac {e^{iat}-e^{i(b+1)t}}{n(1-e^{it})}}$
PGF	${\frac {z^{a}-z^{b+1}}{n(1-z)}}$
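
The closed forms for the mean, variance, and excess kurtosis listed above can be checked against direct summation over the support. The following is a sketch using exact rational arithmetic; the function name `moments` is illustrative, and b > a is assumed so that the variance is nonzero.

```python
from fractions import Fraction

def moments(a, b):
    """Mean, variance and excess kurtosis of the discrete uniform
    distribution on the integers a..b, by direct summation (needs b > a)."""
    n = b - a + 1
    support = range(a, b + 1)
    mean = Fraction(sum(support), n)
    var = sum((k - mean) ** 2 for k in support) / n
    fourth = sum((k - mean) ** 4 for k in support) / n
    excess_kurtosis = fourth / var ** 2 - 3
    return n, mean, var, excess_kurtosis

n, mean, var, excess_kurtosis = moments(1, 6)
print(mean, var, excess_kurtosis)  # prints: 7/2 35/12 -222/175
```

For a fair die (a = 1, b = 6) the summed values agree with (a + b)/2, (n² − 1)/12, and −6(n² + 1)/(5(n² − 1)).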
