2.5.3. Cumulative Distribution Function
The cumulative distribution function puts discrete and continuous RVs on a common footing. For an RV \(X\), the cumulative distribution function (often simply called the distribution function) is defined as:
\[F_X(x) = \P(X\leq x)\]
Note that \(x\in\setR\) even when \(X\) is a discrete RV. Taking a discrete \(X\) to be integer-valued, we have:
\[\begin{split}F_X(x) = \begin{cases}
\sum_{k=-\infty}^{\lfloor x\rfloor} p_X(k) &\text{Discrete $X$}\\
\int_{-\infty}^{x} f_X(y)\, dy &\text{Continuous $X$}
\end{cases}\end{split}\]
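The discrete case of the definition can be sketched in a few lines of Python. This is only an illustration: the pmf values, the support `0..3`, and the helper name `cdf_discrete` are assumptions made for the example, not part of the text above.

```python
import numpy as np

# Hypothetical pmf on the integers 0..3 (values chosen for illustration only).
support = np.arange(4)
pmf = np.array([0.1, 0.2, 0.3, 0.4])

def cdf_discrete(x):
    """F_X(x): sum of p_X(k) over all integers k <= floor(x)."""
    return float(pmf[support <= np.floor(x)].sum())

# F_X is defined for every real x, not only for integers.
print(cdf_discrete(-0.5))  # x below the support
print(cdf_discrete(1.7))   # same value as cdf_discrete(1.0)
print(cdf_discrete(10.0))  # x above the support
```

Because only \(\lfloor x\rfloor\) enters the sum, \(F_X\) is constant between consecutive integers and jumps by \(p_X(k)\) at each integer \(k\).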
Below, a probability mass function \(p_X\) and the corresponding \(F_X\) are plotted.
import numpy as np
import matplotlib.pyplot as plt

plt.clf()

# Unnormalized weights; dividing by their sum yields a valid pmf.
pX = np.array([0,0,0,0,1,2,4,5,7,6,5,3,1,0,0,0])
pX = pX / np.sum(pX)
x = np.arange(len(pX)) - 7  # integer support -7, ..., 8
cpX = np.cumsum(pX)         # F_X(k) = sum of p_X(j) for j <= k

plt.subplot(211)
plt.title(r"Probability Mass Function")
plt.stem(x, pX)
plt.subplot(212)
plt.title(r"Cumulative Distribution Function")
plt.step(x, cpX, where='post')
plt.savefig('source/figures/cumprobfunc.png')
Next, a probability density function and its corresponding cumulative distribution function are plotted.
import numpy as np
import matplotlib.pyplot as plt

# The density is sampled on a fine grid and the integrals are approximated
# by Riemann sums, so both curves are numerical approximations.
x = np.linspace(-8, 8, 1000)
dx = x[1] - x[0]
pX = np.piecewise(x, [x<-4, np.logical_and(x>=-4, x<4), x>=4],
                  [0, lambda x: 16-x**2, 0])
pX = pX / (np.sum(pX) * dx)  # normalize so the density integrates to 1
cpX = np.cumsum(pX) * dx     # running integral approximates F_X

plt.clf()
plt.subplot(211)
plt.title(r"Probability Density Function")
plt.plot(x, pX)
plt.subplot(212)
plt.title(r"Cumulative Distribution Function")
plt.plot(x, cpX)
plt.savefig('source/figures/continuous_cdf.png')
The cumulative distribution function follows from the probability density function by integration. We can go the other way as well:
\[f_X = \frac{d}{dx} F_X\]
With some mathematical leniency we can say that this also holds for a discrete random variable. [1]
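For the continuous case, the relation \(f_X = \frac{d}{dx} F_X\) can be checked numerically. The sketch below assumes, purely for illustration, the density \(f_X(x) = \tfrac{3}{256}(16 - x^2)\) on \([-4, 4]\) (the normalized version of the parabola used in the figure code) together with its closed-form integral; the grid size is an arbitrary choice.

```python
import numpy as np

# Assumed density on [-4, 4]: f(x) = 3 (16 - x^2) / 256, which integrates to 1.
x = np.linspace(-4.0, 4.0, 2001)
f = 3.0 * (16.0 - x**2) / 256.0

# Closed-form CDF: the integral of f from -4 to x.
F = 3.0 * (16.0 * x - x**3 / 3.0 + 128.0 / 3.0) / 256.0

# Differentiate the CDF numerically and compare with the density.
f_numeric = np.gradient(F, x)
max_err = np.max(np.abs(f_numeric - f))
print(max_err)  # small; the largest deviation occurs at the grid end points
```

`np.gradient` uses central differences in the interior and one-sided differences at the two end points, so the agreement is closest away from the boundaries of the grid.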
Footnotes