This section is aimed at students in upper secondary education in the Danish
school system; some concepts will be simplified and some details will be omitted.
Continuous Random Variables
Some random processes do not have discrete sample spaces. For example, if we
take accurate enough measurements of almost anything, no two objects will
yield the same value. One classic approach is to group the data into intervals,
which in effect discards the accuracy, but another approach is to model the
process as a continuous random variable and analyse it as such. In this case
you describe the probabilities of the process with a so-called probability
density function (pdf), \(f(x)\).
In the discrete case you can make histograms such that the area of each "bin",
or pillar, represents the probability of each outcome. In the continuous case,
this is also true, but the probability of any single outcome is the area of a
line segment with height \(f(x)\) and width 0, which obviously has an area
of 0. This makes sense, since the probability of catching a fish that weighs
exactly 2.0001 kg is practically 0, and even more so for every decimal we add.
The way we use the pdf to calculate probabilities, then, is by integrating it
and interpreting the area under the graph as the probability of getting outcomes
in that range, as follows.
Probability Density Function & Cumulative Distribution Function
For a continuous random variable \(X\), its probability density function (pdf),
\(f(x)\), and the corresponding "canonical" antiderivative and area function,
called the cumulative distribution function (cdf), \(F(x)\), describe the
probability of getting an outcome in the range \([a,b]\) by the following
formulas
$$P(a< X< b)=\int_a^b f(x)dx=F(b)-F(a)$$
where
$$F(x)=P(X< x)=\int_{-\infty}^xf(t)dt$$
In the continuous case, we can always use strict inequalities, since any
specific value has probability 0.
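The relation \(P(a<X<b)=F(b)-F(a)\) can be checked numerically. Below is a minimal sketch using a hypothetical example pdf, \(f(x)=2x\) on \([0,1]\) (chosen for illustration; it is not from the text), whose exact cdf is \(F(x)=x^2\), so \(P(0.2<X<0.5)=0.5^2-0.2^2=0.21\).

```python
import math

def f(x):
    # Hypothetical example pdf: f(x) = 2x on [0, 1], 0 elsewhere
    return 2 * x if 0 <= x <= 1 else 0.0

def prob(a, b, n=10_000):
    """Approximate P(a < X < b) = integral of f from a to b (trapezoidal rule)."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    total += sum(f(a + i * h) for i in range(1, n))
    return total * h

print(prob(0.2, 0.5))  # ≈ 0.21 = F(0.5) - F(0.2)
```

The trapezoidal rule is exact here because this particular pdf is linear; for other pdfs the answer is approximate but improves as `n` grows.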
The Standard Normal Distribution
A continuous random variable is standard normally distributed, \(Z\sim
N(0,1)\), if it has the following pdf
$$\varphi(x)=\frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}$$
and cdf
$$\Phi(x)=\int_{-\infty}^x\varphi(t)dt$$
It is "standard" because its expected value is 0 (the pdf is symmetric about
\(x=0\)) and its standard deviation is 1, which is why we write \(Z\sim N(0,1)\).
It also has the beautiful property that
$$\varphi'(x)=-x\varphi(x)$$
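This identity is easy to verify numerically. The sketch below compares a central-difference estimate of \(\varphi'(x)\) with \(-x\varphi(x)\) at a few points (the step size and sample points are arbitrary choices for illustration).

```python
import math

def phi(x):
    # Standard normal pdf
    return math.exp(-x**2 / 2) / math.sqrt(2 * math.pi)

def dphi(x, h=1e-6):
    # Central-difference approximation of the derivative phi'(x)
    return (phi(x + h) - phi(x - h)) / (2 * h)

for x in [-2.0, -0.5, 0.0, 1.3]:
    # phi'(x) should agree with -x * phi(x) up to numerical error
    assert abs(dphi(x) - (-x * phi(x))) < 1e-8
```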
Normal Distribution
An arbitrary normal distribution, \(X\sim N(\mu,\sigma)\), does not have this
property in general, but we can always shift and squish it into a standard
normal distribution via the relation \(X=\sigma Z+\mu
\implies Z=\frac{X-\mu}{\sigma}\). It then has the following pdf
$$f(x)=\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}
\right)^2}$$
The cdf has no closed form and must be calculated numerically. But since
every normal distribution can be related to the standard normal distribution,
probabilities could historically be looked up in tables of values for \(\Phi\);
these days they are mostly calculated by machines.
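A machine calculation along these lines can be sketched with the standard library alone, using the known identity \(\Phi(z)=\tfrac{1}{2}\left(1+\operatorname{erf}\!\left(z/\sqrt{2}\right)\right)\) and the standardization \(P(X<x)=\Phi\!\left(\frac{x-\mu}{\sigma}\right)\). The fish-weight parameters below are hypothetical, chosen only to echo the earlier example.

```python
import math

def Phi(z):
    # Standard normal cdf via the error function:
    # Phi(z) = (1 + erf(z / sqrt(2))) / 2
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def normal_cdf(x, mu, sigma):
    # Standardize: P(X < x) = Phi((x - mu) / sigma)
    return Phi((x - mu) / sigma)

# Hypothetical example: fish weights X ~ N(2.0, 0.3), in kg
p = normal_cdf(2.5, mu=2.0, sigma=0.3) - normal_cdf(1.5, mu=2.0, sigma=0.3)
print(round(p, 4))  # P(1.5 < X < 2.5), about 0.904
```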
The Mean and Standard Deviation
To see that \(\mu\) and \(\sigma\) really are the mean and standard deviation
respectively, we can use the rules for linear transformations of random variables
$$E(X)=E(\sigma Z+\mu)=\cancel{\sigma E(Z)}+\mu$$
and
$$Var(X)=\sigma^2\cancel{Var(Z)}$$
Now, to see that the variance of the standard normal distribution really is
1, consider the following.
\begin{align}
E(Z^2)=&\int_{-\infty}^\infty x^2\varphi(x)dx\\
=&-\int_{-\infty}^\infty x(-x\varphi(x))dx\\
=&-\int_{-\infty}^\infty x\varphi'(x)dx\\
=&\cancel{-[x\varphi(x)]_{-\infty}^\infty}+\int_{-\infty}^\infty \varphi(x)dx\\
=&1
\end{align}
by integration by parts. The boundary term vanishes because \(\varphi(x)\to0\)
faster than \(x\) grows, and the last integral is 1 because \(\varphi\) is a
pdf. Therefore
$$Var(Z)=E(Z^2)-\cancel{E(Z)^2}=1$$
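The conclusion \(E(Z^2)=1\) can also be checked by brute-force numerical integration. The sketch below integrates \(x^2\varphi(x)\) over a finite interval \([-8,8]\), which is wide enough that the neglected tails are negligible (the cutoff and step count are arbitrary illustrative choices).

```python
import math

def phi(x):
    # Standard normal pdf
    return math.exp(-x**2 / 2) / math.sqrt(2 * math.pi)

def second_moment(lim=8.0, n=100_000):
    """Trapezoidal approximation of E(Z^2) = integral of x^2 * phi(x)."""
    h = 2 * lim / n
    vals = [(-lim + i * h) ** 2 * phi(-lim + i * h) for i in range(n + 1)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

print(second_moment())  # ≈ 1, so Var(Z) = E(Z^2) - E(Z)^2 = 1
```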
Binomial Approximation
Let \(X\sim B(n,p)\) be a binomially distributed discrete random variable.
For appropriate values of \(n\) and \(p\), it is approximately normal, with
mean \(\mu=np\) and SD \(\sigma=\sqrt{np(1-p)}\), i.e. \(X\approx N(np,\sqrt{
np(1-p)})\). The only question is what constitutes "appropriate values".
I will present no rigorous mathematical criterion for this, but the
approximation is best for large \(n\) and for \(p\) close to \(0.5\). Some
statisticians use the rule of thumb that \(np\geq5\) and \(n(1-p)\geq5\).
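How good the approximation is can be seen by comparing the exact binomial cdf with the normal one. Below is a minimal sketch for the illustrative choice \(n=100\), \(p=0.5\), using the common continuity correction (evaluating the normal cdf at \(k+0.5\) rather than \(k\), since the binomial is discrete).

```python
import math

def binom_cdf(k, n, p):
    # Exact P(X <= k) for X ~ B(n, p)
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_approx_cdf(k, n, p):
    # Normal approximation with continuity correction:
    # P(X <= k) ≈ Phi((k + 0.5 - np) / sqrt(np(1-p)))
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    z = (k + 0.5 - mu) / sigma
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, p, k = 100, 0.5, 55
exact = binom_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print(exact, approx)  # both about 0.86
```

With these well-behaved parameters the two answers agree to a few decimal places; for small \(n\) or extreme \(p\) the gap widens, which is what the rule of thumb guards against.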