Shannon defined the information rate as the number of bits transmitted per second, where the bit is a unit of entropy, so probability is involved. The information rate can be written as $$R=rH$$, where $$r$$ is the rate at which messages are delivered and $$H$$ is the entropy of each message. Kelly's 1956 paper looks for a physical example of the information rate, and it takes the gambler's ruin as its topic.
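As a small numeric illustration of $$R=rH$$, the sketch below computes the entropy of a message distribution in bits and multiplies it by a message rate. The function names and the example rate of 10 messages per second are assumptions made for illustration only.

```python
import math

def entropy_bits(probs):
    """Shannon entropy H in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def information_rate(r, probs):
    """R = r * H: messages per second times bits per message."""
    return r * entropy_bits(probs)

# Hypothetical source: 10 messages per second, two possible messages.
fair = information_rate(10, [0.5, 0.5])    # equally likely messages
biased = information_rate(10, [0.9, 0.1])  # one message dominates

print(fair, biased)
```

The biased source carries less information per message, so its rate is lower even though the message rate $$r$$ is the same.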

In its simplest form, the paper considers a gamble that wins with probability $$q$$ and loses with probability $$p=1-q$$. For $$p>0.5$$ the gambler's ruin is certain in the long run; otherwise the expected value of the fortune grows exponentially with the number of bets. Suppose we bet only a fraction $$\ell$$ of the total fortune each time and apply a logarithmic utility function to the fortune. The expected utility gained on each bet then becomes

$G = q\log(1+\ell) + p\log(1-\ell)$

Setting $$dG/d\ell=0$$ gives the optimal fraction $$1+\ell=2q$$, i.e. $$\ell=q-p$$. With logarithms taken to base 2, the maximum is

$G_\max = 1 + p\log p + q\log q$

This is also the information rate of a binary symmetric channel with error probability $$p$$.
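The closed form can be checked numerically. The sketch below assumes base-2 logarithms (which is what makes the constant in $$G_\max$$ equal to 1) and uses a plain grid search over $$\ell$$; the function names and the choice $$q=0.6$$ are illustrative assumptions.

```python
import math

def growth_rate(l, q):
    """G = q*log2(1+l) + p*log2(1-l) for an even-money bet, with p = 1 - q."""
    p = 1.0 - q
    return q * math.log2(1.0 + l) + p * math.log2(1.0 - l)

def g_max_closed_form(q):
    """G_max = 1 + p*log2(p) + q*log2(q)."""
    p = 1.0 - q
    return 1.0 + p * math.log2(p) + q * math.log2(q)

q = 0.6
# Grid over l in [0, 1); l = 1 is excluded because log2(0) is undefined.
grid = [i / 10000 for i in range(10000)]
best_l = max(grid, key=lambda l: growth_rate(l, q))

print(best_l, growth_rate(best_l, q), g_max_closed_form(q))
```

The grid maximizer should land on $$\ell=2q-1$$, and the value of $$G$$ there should agree with the closed form up to grid resolution.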

The paper later extends this binary channel example to an $$n$$-ary channel. The solution that maximizes the expected utility is called the Kelly criterion. To paraphrase, if a winning bet pays odds of $$b$$ to 1, so that a stake of $$\ell$$ earns a profit of $$\ell b$$, then the expected utility is

$G = q\log(1+\ell b) + (1-q)\log(1-\ell)$

and we can find its maximum with

\begin{align} \frac{dG}{d\ell} &= \frac{qb}{1+\ell b} - \frac{1-q}{1-\ell} = 0 \\ \therefore \ell &= \frac{qb+q-1}{b} \end{align}
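The stationary point above translates directly into a few lines of code. This is a minimal sketch: the helper names are hypothetical, and clipping the result to $$[0,1]$$ (no bet when the edge is negative, no borrowing) is an added assumption that the derivation itself does not state.

```python
def kelly_fraction(q, b):
    """Stationary point l = (q*b + q - 1) / b, clipped to [0, 1].

    The clipping is an assumption of this sketch: a negative value
    means the bet is unfavourable, so we bet nothing.
    """
    l = (q * b + q - 1.0) / b
    return min(max(l, 0.0), 1.0)

def dG_dl(l, q, b):
    """Derivative of G = q*log(1 + l*b) + (1 - q)*log(1 - l) with respect to l."""
    return q * b / (1.0 + l * b) - (1.0 - q) / (1.0 - l)

l_star = kelly_fraction(0.6, 2.0)      # win probability 0.6, odds 2 to 1
slope_at_optimum = dG_dl(l_star, 0.6, 2.0)
even_money = kelly_fraction(0.6, 1.0)  # reduces to 2q - 1 when b = 1
unfavourable = kelly_fraction(0.3, 1.0)

print(l_star, slope_at_optimum, even_money, unfavourable)
```

With $$b=1$$ the formula reduces to $$\ell=2q-1$$, matching the even-money case earlier.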

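If we choose a different utility function, the result will be different. For example, a linear utility $$f(x)=x$$ makes $$G$$ the expected return, $$G = 1 + \ell(qb-(1-q))$$, and the maximizer is $$\ell=1$$ or $$\ell=0$$ depending on whether $$qb > 1-q$$. Alternatively, if we pick the quadratic function $$f(x)=x^2$$ as the utility, then $$G = q(1+\ell b)^2 + (1-q)(1-\ell)^2$$ is convex in $$\ell$$, so its stationary point $$\ell = \frac{1-q-qb}{1-q+qb^2}$$ is a minimum rather than a maximum. The maximizer over $$0\le\ell\le 1$$ is again at an endpoint: $$\ell=1$$ if $$q(1+b)^2 > 1$$ and $$\ell=0$$ otherwise.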
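A grid search over $$\ell\in[0,1]$$ is a simple way to see how the choice of utility moves the best fraction. The sketch below is illustrative only: the helper names, the grid resolution, and the parameter values are assumptions, and the log utility is given the value $$-\infty$$ at zero wealth so that $$\ell=1$$ can be evaluated.

```python
import math

def expected_utility(l, q, b, utility):
    """q * f(1 + l*b) + (1 - q) * f(1 - l) for a utility function f."""
    return q * utility(1.0 + l * b) + (1.0 - q) * utility(1.0 - l)

def best_fraction(q, b, utility, steps=1000):
    """Grid maximizer of the expected utility over l in [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda l: expected_utility(l, q, b, utility))

def log_utility(x):
    # Zero wealth is treated as infinitely bad under log utility.
    return math.log(x) if x > 0 else float("-inf")

q, b = 0.6, 2.0
best_log = best_fraction(q, b, log_utility)
best_linear = best_fraction(q, b, lambda x: x)
best_quadratic = best_fraction(q, b, lambda x: x * x)

# An unfavourable bet under the quadratic utility: q*(1+b)**2 < 1.
quadratic_unfavourable = best_fraction(0.1, 1.0, lambda x: x * x)

print(best_log, best_linear, best_quadratic, quadratic_unfavourable)
```

Because $$x^2$$ is convex, its expected utility can only peak at an endpoint of the interval, whereas the concave log utility gives an interior optimum.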

In risk management (such as stock market investment), we quite often compute the return of a stock as $$\log(S_t/S_{t-1})$$ instead of the more intuitive $$(S_t-S_{t-1})/S_{t-1}$$. The above reveals why: we are applying a logarithmic utility function to the amount of money, which exhibits the law of diminishing returns and penalizes a loss more than it rewards a profit of the same size.
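A two-period example makes the asymmetry concrete. The price series below is made up for illustration: a 50% gain followed by a 50% loss.

```python
import math

# Hypothetical prices: +50% then -50%.
prices = [100.0, 150.0, 75.0]

simple_returns = [(cur - prev) / prev for prev, cur in zip(prices, prices[1:])]
log_returns = [math.log(cur / prev) for prev, cur in zip(prices, prices[1:])]

# Simple returns cancel out, although a quarter of the money is gone.
total_simple = sum(simple_returns)

# Log returns add up to the log of the overall wealth ratio.
total_log = sum(log_returns)
wealth_ratio = math.exp(total_log)

print(total_simple, total_log, wealth_ratio)
```

The simple returns sum to zero, while the log returns sum to a negative number whose exponential recovers the true wealth ratio of 0.75. The loss also has a larger magnitude in log terms than the gain of the same percentage.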

## Bibliographic data

@article{
author = "J. L. Kelly",
title = "A new interpretation of information rate",
journal = "Bell System Technical Journal",
volume = "35",
number = "4",
pages = "917--926",
year = "1956",
}