Probability Distributions

Discrete Distributions

Bernoulli Distribution

Perhaps the simplest of all probabilistic experiments is tossing a coin. The outcome is either heads or tails, which we may encode as 0 or 1. Such an experiment is called a Bernoulli experiment and the corresponding distribution is called the Bernoulli distribution.

The Bernoulli distribution has a single parameter \(p\), with \(0\leq p\leq 1\), which is the probability of the outcome 1. If the random variable \(X\) is Bernoulli distributed we write \(X\sim\Bernoulli(p)\).

The probability mass function is:

\[\begin{split}p_X(x) = \begin{cases} 1-p &: x=0\\ p &: x=1\\ 0 &: \text{elsewhere} \end{cases}\end{split}\]

The expectation equals:

\[\begin{split}\E(X) &= \sum_{x=-\infty}^{\infty} x\,p_X(x)\\ &= 0\cdot(1-p) + 1\cdot p\\ &= p\end{split}\]

and its variance:

\[\Var(X) = p(1-p)\]
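The expectation and variance can be checked numerically by summing over the two outcomes that carry probability mass. The following is a minimal sketch using only the Python standard library; the function names `bernoulli_pmf` and `bernoulli_moments` and the value \(p=0.3\) are chosen here for illustration and do not come from the text.

```python
def bernoulli_pmf(x, p):
    """Probability mass function of Bernoulli(p)."""
    if x == 1:
        return p
    if x == 0:
        return 1 - p
    return 0.0  # no probability mass elsewhere


def bernoulli_moments(p):
    """Return (expectation, variance) computed directly from the pmf."""
    support = (0, 1)  # the only outcomes with non-zero probability
    mean = sum(x * bernoulli_pmf(x, p) for x in support)
    var = sum((x - mean) ** 2 * bernoulli_pmf(x, p) for x in support)
    return mean, var


# Illustrative parameter value (an assumption, not from the text).
mean, var = bernoulli_moments(0.3)
print(mean, var)  # should agree with p and p * (1 - p) up to rounding
```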

Binomial Distribution

Consider a Bernoulli distributed random variable \(Y\sim\Bernoulli(p)\), and let us repeat the experiment \(n\) times. What then is the probability of \(k\) successes? A success is defined as the outcome \(Y=1\) of the Bernoulli experiment.

So we define a new random variable \(X\) that is the sum of the \(n\) outcomes of repeated independent and identically distributed (iid) Bernoulli experiments.

The outcomes of \(X\) run from \(0\) to \(n\) and the probability of finding \(k\) successes is given as:

\[p_X(k) = P(X=k) = {n \choose k}\,p^k\,(1-p)^{n-k}\]

This is called the Binomial Distribution. For a random variable \(X\) that has a binomial distribution we write \(X\sim\Bin(n,p)\).

The expectation is:

\[\E(X) = n\,p\]

and the variance:

\[\Var(X) = n\,p\,(1-p)\]
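The binomial probability mass function translates almost literally into code. The sketch below uses `math.comb` from the standard library (Python 3.8 or later) for the binomial coefficient, and checks that the probabilities sum to one and reproduce the expectation and variance given above. The name `binomial_pmf` and the values \(n=10\), \(p=0.3\) are illustrative assumptions.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p); zero outside 0..n."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Illustrative parameter values (assumptions, not from the text).
n, p = 10, 0.3
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

total = sum(probs)                                  # should be close to 1
mean = sum(k * pk for k, pk in enumerate(probs))    # should be close to n * p
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(probs))  # close to n * p * (1 - p)
print(total, mean, var)
```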

Uniform Distribution

The discrete uniform distribution is used when all possible outcomes of a random experiment are equally probable. Let \(X\sim\Uniform(a,b)\), with \(a<b\) and \(a,b\in\setZ\), then

\[\begin{split}p_X(x) = \begin{cases} \frac{1}{b-a+1} &: a\leq x \leq b,\ x\in\setZ\\ 0 &: \text{elsewhere} \end{cases}\end{split}\]

Its expectation is:

\[\E(X) = \frac{a+b}{2}\]

and its variance:

\[\Var(X) = \frac{(b-a+1)^2-1}{12}\]
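As a concrete case, a fair six-sided die is uniformly distributed on the integers from 1 to 6. The sketch below computes the expectation and variance by direct summation, using exact rational arithmetic from the standard library so that the result can be compared with the closed-form expressions without rounding error. The function name `discrete_uniform_moments` and the die example are illustrative assumptions.

```python
from fractions import Fraction


def discrete_uniform_moments(a, b):
    """Exact (expectation, variance) of the discrete uniform distribution on a..b."""
    n = b - a + 1              # number of equally probable outcomes
    prob = Fraction(1, n)      # probability of each outcome
    outcomes = range(a, b + 1)
    mean = sum(x * prob for x in outcomes)
    var = sum((x - mean) ** 2 * prob for x in outcomes)
    return mean, var


# Fair die: a = 1, b = 6 (illustrative values, not from the text).
mean, var = discrete_uniform_moments(1, 6)
print(mean, var)  # compare with (a + b) / 2 and ((b - a + 1)**2 - 1) / 12
```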

Continuous Distributions

Uniform Distribution

The continuous uniform distribution characterizes a random experiment in which every real-valued outcome in the interval \([a,b]\subset\setR\), with \(a<b\), is equally likely, in the sense that the probability density is constant on that interval. The probability density function of \(X\sim\Uniform(a,b)\) is given by

\[\begin{split}f_X(x) = \begin{cases} \frac{1}{b-a} &: a \leq x \leq b\\ 0 &: \text{elsewhere} \end{cases}\end{split}\]
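As for the discrete distributions, the expectation and variance follow directly from the definition, with the sum replaced by an integral over the density. A short worked derivation, in the same notation as the rest of this section:

\[\begin{split}\E(X) &= \int_a^b \frac{x}{b-a}\,dx = \frac{b^2-a^2}{2(b-a)} = \frac{a+b}{2}\\ \Var(X) &= \int_a^b \frac{\left(x-\frac{a+b}{2}\right)^2}{b-a}\,dx = \frac{1}{b-a}\cdot\frac{(b-a)^3}{12} = \frac{(b-a)^2}{12}\end{split}\]

The expectation is the midpoint of the interval, just as in the discrete case.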

Normal Distribution

The normal distribution is undoubtedly the most often used distribution in probability and machine learning, and for good reasons: many natural phenomena of a random character turn out to be normally distributed (at least to a good approximation). Furthermore, the normal distribution is convenient to work with from a mathematical point of view.

Let \(X\sim\Normal(\mu,\sigma^2)\), then the probability density function is:

\[f_X(x) = \frac{1}{\sigma\sqrt{2 \pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\]

where the parameters \(\mu\) and \(\sigma^2\) are called the mean and variance for good reason:

\[\begin{split}&\E(X) = \mu\\ &\Var(X) = \sigma^2\end{split}\]
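These two properties can be checked numerically by integrating the density. The sketch below uses a simple midpoint rule over the range \(\mu\pm 10\sigma\), outside of which the density is negligible. The helper names `normal_pdf` and `integrate`, the integration range, the step count, and the parameter values are all assumptions made for this illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of Normal(mu, sigma^2) at x; sigma is the standard deviation."""
    z = (x - mu) ** 2 / (2 * sigma**2)
    return math.exp(-z) / (sigma * math.sqrt(2 * math.pi))


def integrate(g, lo, hi, steps=20_000):
    """Approximate the integral of g over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))


# Illustrative parameter values (assumptions, not from the text).
mu, sigma = 1.5, 2.0
lo, hi = mu - 10 * sigma, mu + 10 * sigma  # tails beyond this range are negligible

total = integrate(lambda x: normal_pdf(x, mu, sigma), lo, hi)              # close to 1
mean = integrate(lambda x: x * normal_pdf(x, mu, sigma), lo, hi)           # close to mu
var = integrate(lambda x: (x - mu) ** 2 * normal_pdf(x, mu, sigma), lo, hi)  # close to sigma**2
print(total, mean, var)
```

Note that `normal_pdf` takes the standard deviation \(\sigma\) as its argument, whereas the notation \(\Normal(\mu,\sigma^2)\) is parameterized by the variance.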