In probability theory and statistics, the Poisson distribution (pronounced [pwasɔ̃]) is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time and/or space if these events occur with a known average rate and independently of the time since the last event.[1] The Poisson distribution can also be used for the number of events in other specified intervals such as distance, area or volume.
Figure: Probability mass function. The horizontal axis is the index k, the number of occurrences. The function is only defined at integer values of k. The connecting lines are only guides for the eye.
Figure: Cumulative distribution function. The horizontal axis is the index k, the number of occurrences. The CDF is discontinuous at the integers of k and flat everywhere else because a variable that is Poisson distributed only takes on integer values.

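| Notation | $\mathrm{Pois}(\lambda)$ |
|---|---|
| Parameters | λ > 0 (real) |
| Support | k ∈ { 0, 1, 2, 3, ... } |
| pmf | $\frac{\lambda^k e^{-\lambda}}{k!}$ |
| CDF | $\frac{\Gamma(\lfloor k+1 \rfloor, \lambda)}{\lfloor k \rfloor !}$, or $e^{-\lambda} \sum_{i=0}^{\lfloor k \rfloor} \frac{\lambda^i}{i!}$, or $Q(\lfloor k+1 \rfloor, \lambda)$ (for $k \ge 0$, where $\Gamma(x, y)$ is the upper incomplete gamma function and $Q$ is the regularized gamma function) |
| Mean | $\lambda$ |
| Median | $\approx \lfloor \lambda + 1/3 - 0.02/\lambda \rfloor$ |
| Mode | $\lceil \lambda \rceil - 1,\ \lfloor \lambda \rfloor$ |
| Variance | $\lambda$ |
| Skewness | $\lambda^{-1/2}$ |
| Ex. kurtosis | $\lambda^{-1}$ |
| Entropy | $\lambda [1 - \log \lambda] + e^{-\lambda} \sum_{k=0}^{\infty} \frac{\lambda^k \log(k!)}{k!}$ (for large $\lambda$: $\tfrac{1}{2} \log(2 \pi e \lambda) - \tfrac{1}{12 \lambda} - \tfrac{1}{24 \lambda^2} - \tfrac{19}{360 \lambda^3} + O(\lambda^{-4})$) |
| MGF | $\exp(\lambda (e^{t} - 1))$ |
| CF | $\exp(\lambda (e^{it} - 1))$ |
| PGF | $\exp(\lambda (z - 1))$ |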
For instance, suppose someone gets 4 pieces of mail per day on average. There will be, however, a certain spread: sometimes a little more, sometimes a little less, once in a while nothing at all.[2] Given only the average rate for a certain period of observation (pieces of mail per day, phone calls per hour, etc.), and assuming that the process, or mix of processes, that produces the event flow is essentially random, the Poisson distribution specifies how likely it is that the count will be 3, or 5, or 10, or any other number, during one period of observation. That is, it predicts the degree of spread around a known average rate of occurrence.[2]
The Derivation of the Poisson distribution section shows the relation with a formal definition.
The historical background of the Poisson distribution is described by Gullberg (1997).[3]
The distribution was first introduced by Siméon Denis Poisson (1781–1840) and published, together with his probability theory, in 1837 in his work Recherches sur la probabilité des jugements en matière criminelle et en matière civile (“Research on the Probability of Judgments in Criminal and Civil Matters”).[4] The work focused on certain random variables N that count, among other things, the number of discrete occurrences (sometimes called "events" or “arrivals”) that take place during a time-interval of given length. The result had been given previously by Abraham de Moivre (1711) in De Mensura Sortis seu; de Probabilitate Eventuum in Ludis a Casu Fortuito Pendentibus in Philosophical Transactions of the Royal Society, p. 219.[5]
A practical application of this distribution was made by Ladislaus Bortkiewicz in 1898 when he was given the task of investigating the number of soldiers in the Prussian army killed accidentally by horse kick; this experiment introduced the Poisson distribution to the field of reliability engineering.[6]
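A discrete random variable X is said to have a Poisson distribution with parameter λ > 0 if, for k = 0, 1, 2, ..., the probability mass function of X is given by:
$$f(k; \lambda) = \Pr(X = k) = \frac{\lambda^k e^{-\lambda}}{k!},$$
where e is the base of the natural logarithm, k! is the factorial of k, and λ is the average number of events expected when the number of events occurring is observed over the time interval in question.[7]
The positive real number λ is equal to the expected value of X and also to its variance:[8]
$$\lambda = \operatorname{E}(X) = \operatorname{Var}(X).$$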
The Poisson distribution can be applied to systems with a large number of possible events, each of which is rare. The Poisson distribution is sometimes called a Poissonian.
The coefficient of variation is $\lambda^{-1/2}$, while the index of dispersion is 1.[5]
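The mode of a Poisson-distributed random variable with non-integer λ is equal to $\lfloor \lambda \rfloor$, which is the largest integer less than or equal to λ. This is also written as floor(λ). When λ is a positive integer, the modes are λ and λ − 1.
Bounds for the median (ν) of the distribution are known and are sharp:[9]
$$\lambda - \ln 2 \le \nu < \lambda + \tfrac{1}{3}.$$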

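If $X_i \sim \mathrm{Pois}(\lambda_i)$, $i = 1, \dots, n$, are independent, and $\lambda = \sum_{i=1}^{n} \lambda_i$, then $Y = \sum_{i=1}^{n} X_i \sim \mathrm{Pois}(\lambda)$.[11] A converse is Raikov's theorem, which says that if the sum of two independent random variables is Poisson-distributed, then so is each of those two independent random variables.[12]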
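One way to see why sums of independent Poisson variables are again Poisson is through the probability generating function $\exp(\lambda(z-1))$. The following short derivation is a sketch under the assumption that the $X_i$ are independent with parameters $\lambda_i$ and that $Y$ denotes their sum; the notation $G$ for the generating function is introduced here for illustration.

```latex
\begin{align*}
G_Y(z) = \operatorname{E}\!\left[z^{X_1 + \cdots + X_n}\right]
       &= \prod_{i=1}^{n} \operatorname{E}\!\left[z^{X_i}\right]
        = \prod_{i=1}^{n} \exp\bigl(\lambda_i (z - 1)\bigr) \\
       &= \exp\Bigl(\Bigl(\textstyle\sum_{i=1}^{n} \lambda_i\Bigr)(z - 1)\Bigr),
\end{align*}
```

which is the generating function of a Poisson distribution with parameter $\sum_i \lambda_i$.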
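Bounds for the tail probabilities of a Poisson random variable $X \sim \mathrm{Pois}(\lambda)$ can be derived using a Chernoff bound argument.[15]
$$P(X \ge x) \le \frac{e^{-\lambda} (e \lambda)^{x}}{x^{x}}, \quad \text{for } x > \lambda.$$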
If $X_1 \sim \mathrm{Pois}(\lambda_1)$ and $X_2 \sim \mathrm{Pois}(\lambda_2)$ are independent, then the difference $Y = X_1 - X_2$ follows a Skellam distribution.
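If $X_1 \sim \mathrm{Pois}(\lambda_1)$ and $X_2 \sim \mathrm{Pois}(\lambda_2)$ are independent, then the distribution of $X_1$ conditional on $X_1 + X_2$ is a binomial distribution. Specifically, given $X_1 + X_2 = k$, $X_1 \sim \mathrm{Binom}(k, \lambda_1 / (\lambda_1 + \lambda_2))$. More generally, if X1, X2,..., Xn are independent Poisson random variables with parameters λ1, λ2,..., λn then, given $\sum_{j=1}^{n} X_j = k$, $X_i \sim \mathrm{Binom}\left(k, \lambda_i / \sum_{j=1}^{n} \lambda_j\right)$. In fact, conditional on the sum, $(X_1, \dots, X_n) \sim \mathrm{Multinom}\left(k; \lambda_1 / \sum_j \lambda_j, \dots, \lambda_n / \sum_j \lambda_j\right)$.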
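If $X \sim \mathrm{Pois}(\lambda)$ and the distribution of $Y$, conditional on X = k, is a binomial distribution, $Y \mid (X = k) \sim \mathrm{Binom}(k, p)$, then the distribution of Y follows a Poisson distribution $Y \sim \mathrm{Pois}(\lambda p)$. In fact, if $(Y_1, \dots, Y_m)$, conditional on X = k, follows a multinomial distribution, $(Y_1, \dots, Y_m) \mid (X = k) \sim \mathrm{Multinom}(k; p_1, \dots, p_m)$, then each $Y_i$ follows an independent Poisson distribution $Y_i \sim \mathrm{Pois}(\lambda p_i)$.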
For sufficiently large values of λ, the normal distribution with mean λ and variance λ (standard deviation $\sqrt{\lambda}$) is an excellent approximation to the Poisson distribution. If λ is greater than about 10, then the normal distribution is a good approximation if an appropriate continuity correction is performed, i.e., P(X ≤ x), where (lower-case) x is a non-negative integer, is replaced by P(X ≤ x + 0.5).
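The effect of the continuity correction can be checked numerically. The sketch below compares the exact Poisson CDF with the normal approximation (mean λ, variance λ), with and without the 0.5 shift; the function names are illustrative and not taken from any library.

```python
import math
from statistics import NormalDist


def poisson_cdf(x: int, lam: float) -> float:
    """Exact P(X <= x) for X ~ Pois(lam), by summing the pmf in log space."""
    return sum(
        math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
        for k in range(x + 1)
    )


def normal_approx_cdf(x: int, lam: float, continuity: bool = True) -> float:
    """Normal approximation to P(X <= x) using mean lam and variance lam."""
    shift = 0.5 if continuity else 0.0
    return NormalDist(mu=lam, sigma=math.sqrt(lam)).cdf(x + shift)


lam, x = 20.0, 20
print("exact          :", poisson_cdf(x, lam))
print("with correction:", normal_approx_cdf(x, lam, continuity=True))
print("no correction  :", normal_approx_cdf(x, lam, continuity=False))
```

For λ = 20 and x = 20 the corrected approximation lies noticeably closer to the exact value than the uncorrected one.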
Variance-stabilizing transformation: when a variable is Poisson distributed, its square root is approximately normally distributed with expected value of about $\sqrt{\lambda}$ and variance of about 1/4.[18][19] Under this transformation, the convergence to normality (as λ increases) is far faster than for the untransformed variable.[citation needed] Other, slightly more complicated, variance-stabilizing transformations are available,[19] one of which is the Anscombe transform. See Data transformation (statistics) for more general uses of transformations.
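The claim that the square root has variance of about 1/4 can be checked without simulation by summing over the probability mass function. This is a minimal sketch; the helper name and the truncation point of the sum (about 20 standard deviations above the mean) are assumptions made here for illustration.

```python
import math


def sqrt_poisson_moments(lam: float):
    """Return (mean, variance) of sqrt(X) for X ~ Pois(lam) by direct summation."""
    # Truncate the sum far into the upper tail, where the pmf is negligible.
    upper = int(lam + 20.0 * math.sqrt(lam) + 50)
    mean_sqrt = 0.0
    for k in range(upper + 1):
        pk = math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
        mean_sqrt += pk * math.sqrt(k)
    # E[(sqrt X)^2] = E[X] = lam, so Var(sqrt X) = lam - (E[sqrt X])^2.
    return mean_sqrt, lam - mean_sqrt ** 2


for lam in (5.0, 20.0, 50.0, 200.0):
    mean_sqrt, var_sqrt = sqrt_poisson_moments(lam)
    print(lam, mean_sqrt, var_sqrt)
```

The printed variances approach 1/4 as λ grows, while the mean stays close to the square root of λ.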

Applications of the Poisson distribution can be found in many fields related to counting:[23]
The Poisson distribution arises in connection with Poisson processes. It applies to various phenomena of discrete properties (that is, those that may happen 0, 1, 2, 3, ... times during a given period of time or in a given area) whenever the probability of the phenomenon happening is constant in time or space. Examples of events that may be modelled as a Poisson distribution include:
The Poisson distribution may be derived by considering an interval, in time, space or otherwise, in which events happen at random with a known average number $\lambda$. The interval is divided into $n$ subintervals $I_1, \dots, I_n$ of equal size. The probability that an event will fall in the subinterval $I_k$ is for each $k$ equal to $\lambda / n$, and the occurrence of an event in $I_k$ may be approximately considered to be a Bernoulli trial. The total number $X$ of events then will be approximately binomially distributed with parameters $n$ and $\lambda / n$. The approximation will be better with increasing $n$, and the $B(n, \lambda / n)$-distribution converges to the Poisson distribution with parameter $\lambda$.
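The convergence stated above can be made explicit with a short limit computation. This is a sketch for fixed $k$ and $\lambda$, using only the binomial probability with parameters $n$ and $\lambda/n$:

```latex
\begin{align*}
\lim_{n \to \infty} \binom{n}{k} \left(\frac{\lambda}{n}\right)^{k}
    \left(1 - \frac{\lambda}{n}\right)^{n-k}
&= \frac{\lambda^{k}}{k!} \lim_{n \to \infty}
    \frac{n (n-1) \cdots (n-k+1)}{n^{k}}
    \left(1 - \frac{\lambda}{n}\right)^{n}
    \left(1 - \frac{\lambda}{n}\right)^{-k} \\
&= \frac{\lambda^{k}}{k!} \cdot 1 \cdot e^{-\lambda} \cdot 1
 = \frac{\lambda^{k} e^{-\lambda}}{k!}.
\end{align*}
```

The first factor tends to 1 because it is a product of $k$ terms each tending to 1, the second tends to $e^{-\lambda}$, and the third tends to 1 because $k$ is fixed.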
In several of the above examples (such as the number of mutations in a given sequence of DNA) the events being counted are actually the outcomes of discrete trials, and would more precisely be modelled using the binomial distribution, that is
$$X \sim \mathrm{B}(n, p).$$
In such cases n is very large and p is very small (and so the expectation np is of intermediate magnitude). Then the distribution may be approximated by the less cumbersome Poisson distribution[citation needed]
$$X \sim \mathrm{Pois}(np).$$
This approximation is sometimes known as the law of rare events,[26] since each of the n individual Bernoulli events rarely occurs. The name may be misleading because the total count of success events in a Poisson process need not be rare if the parameter np is not small. For example, the number of telephone calls to a busy switchboard in one hour follows a Poisson distribution with the events appearing frequent to the operator, but they are rare from the point of view of the average member of the population who is very unlikely to make a call to that switchboard in that hour.[citation needed]
The word law is sometimes used as a synonym of probability distribution, and convergence in law means convergence in distribution. Accordingly, the Poisson distribution is sometimes called the law of small numbers because it is the probability distribution of the number of occurrences of an event that happens rarely but has very many opportunities to happen. The Law of Small Numbers is a book by Ladislaus Bortkiewicz about the Poisson distribution, published in 1898. Some have suggested that the Poisson distribution should have been called the Bortkiewicz distribution.[27]
The Poisson distribution arises as the distribution of counts of occurrences of events in (multidimensional) intervals in multidimensional Poisson processes in a directly equivalent way to the result for unidimensional processes. That is, if D is any region of the multidimensional space for which |D|, the area or volume of the region, is finite, and if N(D) is the count of the number of events in D, then
$$P(N(D) = k) = \frac{(\lambda |D|)^{k} e^{-\lambda |D|}}{k!},$$
where λ is the mean number of events per unit area or volume.
In a Poisson process, the number of observed occurrences fluctuates about its mean λ with a standard deviation $\sigma_k = \sqrt{\lambda}$. These fluctuations are denoted as Poisson noise or (particularly in electronics) as shot noise.[citation needed]
The correlation of the mean and standard deviation in counting independent discrete occurrences is useful scientifically. By monitoring how the fluctuations vary with the mean signal, one can estimate the contribution of a single occurrence, even if that contribution is too small to be detected directly. For example, the charge e on an electron can be estimated by correlating the magnitude of an electric current with its shot noise. If N electrons pass a point in a given time t on the average, the mean current is $I = eN/t$; since the current fluctuations should be of the order $\sigma_I = e\sqrt{N}/t$ (i.e., the standard deviation of the Poisson process), the charge $e$ can be estimated from the ratio $t \sigma_I^2 / I$.[citation needed]
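The ratio can be verified in one line. The sketch assumes, as in the shot-noise argument above, a mean current $I = eN/t$ and current fluctuations $\sigma_I = e\sqrt{N}/t$, where the count of electrons has Poisson standard deviation $\sqrt{N}$:

```latex
\[
\frac{t \, \sigma_I^{2}}{I}
  = t \cdot \frac{e^{2} N / t^{2}}{e N / t}
  = e .
\]
```

The unknown count $N$ cancels, which is why the charge can be recovered from the measured mean and fluctuation alone.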
An everyday example is the graininess that appears as photographs are enlarged; the graininess is due to Poisson fluctuations in the number of reduced silver grains, not to the individual grains themselves. By correlating the graininess with the degree of enlargement, one can estimate the contribution of an individual grain (which is otherwise too small to be seen unaided).[citation needed] Many other molecular applications of Poisson noise have been developed, e.g., estimating the number density of receptor molecules in a cell membrane.

In Causal Set theory the discrete elements of spacetime follow a Poisson distribution in the volume.
A simple algorithm to generate random Poisson-distributed numbers (pseudo-random number sampling) has been given by Knuth (see References below):
algorithm poisson random number (Knuth):
    init:
        Let L ← e^(−λ), k ← 0 and p ← 1.
    do:
        k ← k + 1.
        Generate uniform random number u in [0,1] and let p ← p × u.
    while p > L.
    return k − 1.
While simple, the algorithm has running time linear in λ. There are many other algorithms that overcome this. Some are given in Ahrens & Dieter, see References below. Also, for large values of λ, there may be numerical stability issues because the term $e^{-\lambda}$ underflows. One solution for large values of λ is rejection sampling, another is to use a Gaussian approximation to the Poisson.
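The pseudocode above translates almost line for line into Python. The following is a minimal sketch; the function name and the optional `rng` parameter (used only to make runs reproducible) are additions made here for illustration, and the routine is intended for small to moderate λ for the reasons just given.

```python
import math
import random


def poisson_knuth(lam: float, rng=None) -> int:
    """Draw one Poisson(lam) variate by multiplying uniforms (Knuth's method)."""
    rng = rng or random
    limit = math.exp(-lam)  # L in the pseudocode; underflows to 0 for very large lam
    k = 0
    p = 1.0
    while True:
        k += 1
        p *= rng.random()  # uniform on [0, 1)
        if p <= limit:
            return k - 1


rng = random.Random(12345)  # fixed seed, for reproducibility only
draws = [poisson_knuth(4.0, rng) for _ in range(10_000)]
print("sample mean:", sum(draws) / len(draws))
```

Each draw uses on average about λ + 1 uniform random numbers, which is the linear cost mentioned above.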
Inverse transform sampling is simple and efficient for small values of λ, and requires only one uniform random number u per sample. Cumulative probabilities are examined in turn until one exceeds u.
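Inverse transform sampling as described can be sketched as follows. The function name and the `rng` parameter are illustrative assumptions; the extra `p > 0.0` test is a safeguard added in this sketch so that rounding in the running sum cannot cause an endless loop when u is extremely close to 1.

```python
import math
import random


def poisson_inverse_transform(lam: float, rng=None) -> int:
    """Draw one Poisson(lam) variate from a single uniform random number."""
    rng = rng or random
    u = rng.random()
    k = 0
    p = math.exp(-lam)  # P(X = 0)
    cdf = p             # P(X <= 0)
    # Examine cumulative probabilities in turn until one exceeds u.
    while u >= cdf and p > 0.0:
        k += 1
        p *= lam / k    # P(X = k) from P(X = k - 1)
        cdf += p
    return k


rng = random.Random(2024)  # fixed seed, for reproducibility only
draws = [poisson_inverse_transform(3.0, rng) for _ in range(10_000)]
print("sample mean:", sum(draws) / len(draws))
```

The recurrence P(X = k) = P(X = k − 1) · λ / k avoids recomputing factorials at every step.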
Given a sample of n measured values ki we wish to estimate the value of the parameter λ of the Poisson population from which the sample was drawn. The maximum likelihood estimate is[28]
$$\hat{\lambda}_{\mathrm{MLE}} = \frac{1}{n} \sum_{i=1}^{n} k_i.$$
Since each observation has expectation λ, so does this sample mean. Therefore the maximum likelihood estimate is an unbiased estimator of λ. It is also an efficient estimator, i.e. its estimation variance achieves the Cramér–Rao lower bound (CRLB).[citation needed] Hence it is the minimum-variance unbiased estimator (MVUE). It can also be proved that the sum (and hence the sample mean, as it is a one-to-one function of the sum) is a complete and sufficient statistic for λ.
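For reference, the maximum likelihood estimate and the efficiency claim can be derived in a few lines. This is a sketch for independent observations $k_1, \dots, k_n$ from a Poisson distribution with parameter $\lambda$; the symbols $\ell$ for the log-likelihood and $\mathcal{I}$ for the Fisher information of one observation are notation introduced here.

```latex
\begin{align*}
\ell(\lambda) &= \sum_{i=1}^{n} \bigl( k_i \log \lambda - \lambda - \log k_i! \bigr), \\
\frac{d\ell}{d\lambda} &= \frac{1}{\lambda} \sum_{i=1}^{n} k_i - n = 0
  \quad \Longrightarrow \quad
  \hat{\lambda}_{\mathrm{MLE}} = \frac{1}{n} \sum_{i=1}^{n} k_i, \\
\mathcal{I}(\lambda) &= \operatorname{E}\!\left[ \frac{X}{\lambda^{2}} \right] = \frac{1}{\lambda},
  \qquad
  \operatorname{Var}\bigl(\hat{\lambda}_{\mathrm{MLE}}\bigr) = \frac{\lambda}{n}
  = \frac{1}{n \, \mathcal{I}(\lambda)} .
\end{align*}
```

The second derivative of $\ell$ is $-\sum_i k_i / \lambda^{2} \le 0$, so the stationary point is a maximum, and the variance of the estimate coincides with the Cramér–Rao lower bound.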
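To prove sufficiency we may use the factorization theorem. Write the sample as $\mathbf{x} = (x_1, \dots, x_n)$. Consider partitioning the probability mass function of the joint Poisson distribution for the sample into two parts: one which depends solely on the sample $\mathbf{x}$ (called $h(\mathbf{x})$) and one which depends on the parameter $\lambda$ and the sample $\mathbf{x}$ only through the function $T(\mathbf{x})$. Then, $T(\mathbf{x})$ is a sufficient statistic for $\lambda$.
$$P(\mathbf{x}) = \prod_{i=1}^{n} \frac{\lambda^{x_i} e^{-\lambda}}{x_i!} = \frac{1}{\prod_{i=1}^{n} x_i!} \times \lambda^{\sum_{i=1}^{n} x_i} e^{-n\lambda}$$
Note that the first term, $h(\mathbf{x})$, depends only on $\mathbf{x}$. The second term, $g(T(\mathbf{x}) \mid \lambda)$, depends on the sample only through $T(\mathbf{x}) = \sum_{i=1}^{n} x_i$. Thus, $T(\mathbf{x})$ is sufficient.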
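For completeness, a family of distributions is said to be complete if and only if $\operatorname{E}(g(T)) = 0$ implies that $P_\lambda(g(T) = 0) = 1$ for all $\lambda$. If the individual $X_i$ are iid $\mathrm{Pois}(\lambda)$, then $T(\mathbf{x}) = \sum_{i=1}^{n} X_i \sim \mathrm{Pois}(n\lambda)$. Knowing the distribution we want to investigate, it is easy to see that the statistic is complete.
$$\operatorname{E}(g(T)) = \sum_{t=0}^{\infty} g(t) \frac{(n\lambda)^{t} e^{-n\lambda}}{t!} = 0$$
For this equality to hold for all $\lambda$, $g(t)$ must be 0 for every $t$: the factor $e^{-n\lambda}$ is never 0, and what remains is a power series in $\lambda$ that vanishes identically only if each of its coefficients $g(t) n^{t} / t!$ is 0. Hence, $\operatorname{E}(g(T)) = 0$ for all $\lambda$ implies that $P_\lambda(g(T) = 0) = 1$ and the statistic has been shown to be complete.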
The confidence interval for the mean of a Poisson distribution can be expressed using the relationship between the cumulative distribution functions of the Poisson and chi-squared distributions. The chi-squared distribution is itself closely related to the gamma distribution, and this leads to an alternative expression. Given an observation k from a Poisson distribution with mean μ, a confidence interval for μ with confidence level 1 – α is
$$\tfrac{1}{2} \chi^{2}(\alpha/2;\, 2k) \le \mu \le \tfrac{1}{2} \chi^{2}(1 - \alpha/2;\, 2k + 2),$$
or equivalently,
$$F^{-1}(\alpha/2;\, k, 1) \le \mu \le F^{-1}(1 - \alpha/2;\, k + 1, 1),$$
where $\chi^{2}(p; n)$ is the quantile function (corresponding to a lower tail area p) of the chi-squared distribution with n degrees of freedom and $F^{-1}(p; n, 1)$ is the quantile function of a Gamma distribution with shape parameter n and scale parameter 1.[21][29] This interval is 'exact' in the sense that its coverage probability is never less than the nominal 1 – α.
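When quantiles of the Gamma distribution are not available, an accurate approximation to this exact interval has been proposed (based on the Wilson–Hilferty transformation):[30]
$$k \left( 1 - \frac{1}{9k} - \frac{z_{\alpha/2}}{3\sqrt{k}} \right)^{3} \le \mu \le (k + 1) \left( 1 - \frac{1}{9(k + 1)} + \frac{z_{\alpha/2}}{3\sqrt{k + 1}} \right)^{3},$$
where $z_{\alpha/2}$ denotes the standard normal deviate with upper tail area α / 2.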
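The Wilson–Hilferty approximation needs only a standard normal quantile, so it can be sketched with the Python standard library. The function name is illustrative, and returning a lower limit of 0 when k = 0 is an assumption of this sketch (the lower expression is undefined there, and the exact lower limit is 0).

```python
import math
from statistics import NormalDist


def poisson_ci_wilson_hilferty(k: int, alpha: float = 0.05):
    """Approximate (1 - alpha) confidence interval for a Poisson mean.

    k is the single observed count; the limits follow the Wilson-Hilferty
    approximation to the exact chi-squared / gamma interval.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper-tail area alpha / 2
    if k == 0:
        lower = 0.0  # assumption: use the exact lower limit when k = 0
    else:
        lower = k * (1.0 - 1.0 / (9.0 * k) - z / (3.0 * math.sqrt(k))) ** 3
    k1 = k + 1
    upper = k1 * (1.0 - 1.0 / (9.0 * k1) + z / (3.0 * math.sqrt(k1))) ** 3
    return lower, upper


print(poisson_ci_wilson_hilferty(10))
```

For k = 10 and α = 0.05 the result is close to the exact interval of roughly (4.80, 18.39); the approximation is coarser for very small counts.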
For application of these formulae in the same context as above (given a sample of n measured values ki each drawn from a Poisson distribution with mean λ), one would set
$$k = \sum_{i=1}^{n} k_i,$$
calculate an interval for μ = nλ, and then derive the interval for λ.
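In Bayesian inference, the conjugate prior for the rate parameter λ of the Poisson distribution is the gamma distribution.[31] Let
$$\lambda \sim \mathrm{Gamma}(\alpha, \beta)$$
denote that λ is distributed according to the gamma density g parameterized in terms of a shape parameter α and an inverse scale parameter β:
$$g(\lambda \mid \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} \lambda^{\alpha - 1} e^{-\beta \lambda} \quad \text{for } \lambda > 0.$$
Then, given the same sample of n measured values ki as before, and a prior of Gamma(α, β), the posterior distribution is
$$\lambda \sim \mathrm{Gamma}\left( \alpha + \sum_{i=1}^{n} k_i,\; \beta + n \right).$$
The posterior mean E[λ] approaches the maximum likelihood estimate $\hat{\lambda}_{\mathrm{MLE}}$ in the limit as $\alpha \to 0,\ \beta \to 0$.[citation needed]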
The posterior predictive distribution for a single additional observation is a negative binomial distribution,[citation needed] sometimes called a Gamma–Poisson distribution.
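The conjugate update amounts to two additions, as the following minimal sketch shows. The function names are illustrative; the prior is assumed to be Gamma with shape α and inverse scale (rate) β, so that its mean is α / β.

```python
def gamma_poisson_update(alpha: float, beta: float, counts):
    """Return the posterior (shape, rate) after observing Poisson counts.

    Prior: lambda ~ Gamma(alpha, beta), with beta an inverse scale (rate).
    Posterior: Gamma(alpha + sum of counts, beta + number of counts).
    """
    counts = list(counts)
    return alpha + sum(counts), beta + len(counts)


def gamma_mean(alpha: float, beta: float) -> float:
    """Mean of a Gamma(shape=alpha, rate=beta) distribution."""
    return alpha / beta


observed = [3, 5, 4]  # made-up counts, for illustration only
post_alpha, post_beta = gamma_poisson_update(2.0, 1.0, observed)
print("posterior:", post_alpha, post_beta, "mean:", gamma_mean(post_alpha, post_beta))

# With a very weak prior the posterior mean is close to the sample mean.
weak = gamma_poisson_update(1e-9, 1e-9, observed)
print("weak-prior posterior mean:", gamma_mean(*weak))
```

Shrinking α and β towards 0 moves the posterior mean towards the sample mean, in line with the limit stated above.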
This distribution has been extended to the bivariate case.[32] The generating function for this distribution is
$$g(u, v) = \exp[(\theta_1 - \theta_{12})(u - 1) + (\theta_2 - \theta_{12})(v - 1) + \theta_{12}(uv - 1)]$$
with
$$\theta_1, \theta_2 > \theta_{12} > 0.$$
The marginal distributions are Poisson(θ1) and Poisson(θ2) and the correlation coefficient is limited to the range
$$0 \le \rho \le \min\left\{ \sqrt{\theta_1 / \theta_2},\ \sqrt{\theta_2 / \theta_1} \right\}.$$
The Skellam distribution is a particular case of this distribution.[citation needed]
Gallagher in 1976 showed that prime numbers in short intervals obey a Poisson distribution.[33] Ernie Croot (2010) stated this in informal mathematical language in his lecture notes on the Poisson distribution.[34] To understand this relationship the Prime Number Theorem will be required.
This theorem states that the number of primes $\le x$ is about $x / \log x$, where the logarithm is taken to the base e. In symbols, if $\pi(x)$ denotes the number of primes less than x then
$$\lim_{x \to \infty} \frac{\pi(x)}{x / \log x} = 1.$$
This implies that
$$\pi(x) \sim \frac{x}{\log x}.$$
In what follows the notation $\pi[a, b]$ is used to denote the number of primes in a given interval $[a, b]$.
Suppose that $x$ is a large number, say $x = 10^{100}$. Then a number chosen at random $\le x$ has about a $1 / \log(10^{100})$ (~ 0.43%) chance that it will be prime. A typical interval $[n, n + \log n]$ will contain about one prime.
More than this is true: if a number $n \le x$ is chosen at random, and $j \ge 0$ and $\lambda > 0$ are chosen not too large, then the number of primes in $[n, n + \lambda \log n]$ is approximately Poisson-distributed:
$$P\{ \pi[n, n + \lambda \log n] = j \} \approx \frac{e^{-\lambda} \lambda^{j}}{j!}$$
Notice that equality was not used here: in order to obtain equality we would have to let $x \to \infty$ in some fashion. The larger $x$ is, the closer the above probability comes to $e^{-\lambda} \lambda^{j} / j!$.
It is an interesting exercise to work out why the primes would be expected to be Poisson-distributed.