There are several famous inequalities in the theory of probability. The simplest one is the Markov inequality:

$\Pr(|X|\ge a) \le \frac{E[|X|]}{a},$

for any random variable $$X$$ and real number $$a>0$$.

If we consider the random variable $$(X-E[X])^2$$, substitute into the Markov inequality, we have

\begin{align} \Pr((X-E[X])^2\ge a^2) &\le \frac{E[(X-E[X])^2]}{a^2} = \frac{Var[X]}{a^2} \\ \Pr(|X-E[X]|\ge a) &\le \frac{Var[X]}{a^2}. \end{align}

By substituting $$a=k\sigma$$ where $$Var[X]=\sigma^2$$, we have the Chebyshev inequality

$\Pr(|X-E[X]| \ge k\sigma) \le \frac{1}{k^2}.$

which it is required that $$k>0$$ and $$\sigma$$ is non-zero and finite.

Two other useful but more complicated probability inequalities are about deviation from mean. The Chernoff bound says that, for $$n$$ i.i.d. Bernoulli random variables $$X_1,X_2,\cdots,X_n$$, each having the probability $$p>\frac{1}{2}$$, the probability of having more than $$n/2$$ occurrences among the $$n$$ of them is

$\sum_{j=\lfloor n/2\rfloor+1}^{n} \binom{n}{j} p^j (1-p)^{n-j} \ge 1-\exp\big(-2n(p-\frac{1}{2})^2\big).$

While these $$n$$ Bernoulli random variable shall produce the expectation of $$np$$ occurrences, the probability of deviation from $$np$$ is bounded by the Hoefding’s inequality, saying that for $$\epsilon > 0$$, the probability of no less than $$n(p+\epsilon)$$ occurrences is

$P_{\ge n(p+\epsilon)} \le \exp(-2\epsilon^2 n),$

and the probability of no more than $$n(p-\epsilon)$$ occurrences is

$P_{\le n(p-\epsilon)} \le \exp(-2\epsilon^2 n),$

so the probability of having $$k$$ occurrences, which $$k\in[n(p-\epsilon),n(p+\epsilon)]$$, is

$P_{\in[n(p-\epsilon),n(p+\epsilon)]} > 1-2\exp(-2\epsilon^2 n).$

Hoefding’s inequality can be generalized so that, for $$X_i\in[a_i,b_i]$$ a.s. and $$X_1,X_2,\cdots,X_n$$ are independent, we have the empirical mean and its expectation

\begin{align} \bar{X} &= \frac{1}{n}(X_1+X_2+\cdots+X_n) \\ E[\bar{X}] &= \frac{1}{n}(E[X_1]+E[X_2]+\cdots+E[X_n]), \end{align}

then for $$t>0$$ we have

\begin{align} \Pr[\bar{X}-E[\bar{X}]\ge t] &\le \exp(-\frac{2t^2n^2}{\sum_{i=1}^n (b_i-a_i)^2}) \\ \Pr[E[\bar{X}]-\bar{X}\ge t] &\le \exp(-\frac{2t^2n^2}{\sum_{i=1}^n (b_i-a_i)^2}). \end{align}