# Expected values

One of the most important concepts in learning to use probabilities quantitatively is the idea of **expected value**. When we consider probabilities, we look at a situation where something can happen many times with a variety of different results, and those results can have different values.

This applies to a wide variety of physical situations: the probability of a molecule moving from the cell in the body where it is produced to the cell where it is detected, the probability of creating a particular protein as a result of random interactions of molecules in a cell, or the probability of producing a number of offspring that survive to breed and produce the next generation of breeders. It is particularly relevant in medicine, where the expected value can be the fraction of patients who respond positively to a treatment!

But these situations are fairly complex and require us to set up a lot of contextual details. To keep the focus on the structure of the mathematical ideas, we'll choose a toy model that, while unrealistic, lets you clearly see how the various parts play a role.

**Example: Drawing balls out of a jar.**

A game at a county fair has a jar containing 1000 balls of different colors. For $\$1$, you are blindfolded and allowed to reach into the jar to pull out one ball. If you pull out a white ball, you get nothing. If you pull out a red ball, you get your $\$1$ back. If you pull out a blue ball, you get $\$10$, and if you pull out a green ball, you get $\$100$. Would you pay $\$1$ to play this game?

You might want to play just for fun even if you expect to lose. But if you want to decide if it's really worth it (or if the real example is whether a patient can expect to benefit or be harmed by a treatment!) you might want to do some calculations.

There is a set of four possible results: {W, R, B, G}. We'll index these by the label *i* and assign a value, $V_i$, to each of the four results. They are:

$$V_W = 0, \quad V_R = 1, \quad V_B = 10, \quad V_G = 100$$

What we really need to know is this: How many of each kind of ball are in the jar?

Suppose there is only 1 green ball in the jar. Then the probability of our drawing it is 1 in 1000. Suppose also that there are 50 blue balls, 100 red balls, and the rest (1000 - 100 - 50 - 1 = 849) are white. This means that the probabilities of drawing a white, red, blue, and green ball in our one try are:

$$\begin{aligned}
P_W &= 0.849 \quad \text{(849 out of 1000)} \\
P_R &= 0.100 \quad \text{(100 out of 1000)} \\
P_B &= 0.050 \quad \text{(50 out of 1000)} \\
P_G &= 0.001 \quad \text{(1 out of 1000)}
\end{aligned}$$
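As a quick check, the step from ball counts to probabilities can be sketched in a few lines of Python. The counts are the ones given in the text; the variable names (`counts`, `probabilities`) are only illustrative.

```python
# Ball counts taken from the text: 849 white, 100 red, 50 blue, 1 green.
counts = {"W": 849, "R": 100, "B": 50, "G": 1}

total = sum(counts.values())  # total number of balls in the jar

# Probability of each color = (number of that color) / (total number of balls).
probabilities = {color: n / total for color, n in counts.items()}

for color, p in probabilities.items():
    print(f"P_{color} = {p:.3f}")

# Sanity check: the probabilities should add up to 1 (up to rounding error).
print("sum of probabilities:", round(sum(probabilities.values()), 12))
```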

The **expected result** (the expected value) is defined as the average result per play that we would expect to get if we played the game a large number of times. It is the sum, over all possible results, of the value of each result times the probability of getting that result.

Thus, if we only get a green ball 1 try out of 1000 on the average, it costs us (again, on the average) $\$1000$ to win $\$100$. Not a great deal. But maybe the others make up for it? We see the overall result by adding up the value of each result times its probability:

$$\langle V \rangle = V_W P_W + V_R P_R + V_B P_B + V_G P_G = (0)(0.849) + (1)(0.100) + (10)(0.050) + (100)(0.001)$$

$$\langle V \rangle = 0 + 0.1 + 0.5 + 0.1 = 0.7$$

So for paying $\$1$, on the average we might expect to get back $\$0.70$ (70 cents) per play, a net loss of 30 cents. This gives an idea of the fairness of the game and the potential costs and benefits of playing.
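The idea of "playing the game a large number of times" can be illustrated with a small simulation. This is only a sketch under stated assumptions: each play is treated as independent (the ball goes back into the jar after each draw), and the seed and the number of plays are arbitrary choices, not part of the original example.

```python
import random

values = {"W": 0, "R": 1, "B": 10, "G": 100}  # payout in dollars
probabilities = {"W": 0.849, "R": 0.100, "B": 0.050, "G": 0.001}

# Exact expected payout: sum of value times probability.
exact = sum(values[c] * probabilities[c] for c in values)

# Simulated average payout over many independent plays.
rng = random.Random(0)   # fixed seed (arbitrary) so the run is repeatable
n_plays = 200_000        # arbitrary "large number" of plays
colors = list(values)
weights = [probabilities[c] for c in colors]
draws = rng.choices(colors, weights=weights, k=n_plays)
simulated = sum(values[c] for c in draws) / n_plays

print(f"exact expected payout:    ${exact:.2f}")
print(f"simulated average payout: ${simulated:.3f}")
```

The simulated average fluctuates from run to run (try a different seed), but it should settle closer to the exact value as `n_plays` grows.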

**The expectation equation**

We want to generalize this so we can apply this idea to more relevant and important examples. In general, we'll have a set of results that we will number by the label *i* ∈ {1, 2, 3, 4, ...}. The curly brackets mean a set of labels and "∈" means that *i* is one of the elements of the set. If the value of the *i*-th result is $V_i$ and the probability of getting the *i*-th result is $P_i$, then the expected value of $V$ over many trials is given by the expectation equation:

$$\langle V \rangle = \sum_i V_i P_i = V_1 P_1 + V_2 P_2 + V_3 P_3 + \dots$$

We use the angle brackets $\langle \dots \rangle$ to indicate the *average* of something, so $\langle V\rangle$ is the average (expected) value. The probabilities $P_i$ have to add up to 1 since *something* has to happen:

$$1 = \sum_i P_i = P_1 + P_2 + P_3 + \dots$$
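The expectation equation and the normalization condition translate directly into code. The following is a minimal sketch; the function name `expected_value` and the tolerance used in the normalization check are illustrative choices, not part of the original text.

```python
import math

def expected_value(values, probabilities):
    """Return sum_i V_i * P_i for paired sequences of values and probabilities."""
    if len(values) != len(probabilities):
        raise ValueError("values and probabilities must have the same length")
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")
    # Normalization: something has to happen, so the P_i must add up to 1.
    # math.fsum reduces floating-point rounding error in the sum.
    if not math.isclose(math.fsum(probabilities), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must add up to 1")
    return math.fsum(v * p for v, p in zip(values, probabilities))

# Usage with the jar example: results W, R, B, G.
jar_average = expected_value([0, 1, 10, 100], [0.849, 0.100, 0.050, 0.001])
print(round(jar_average, 6))
```

Checking the normalization before summing catches a common mistake: forgetting one of the possible results, so that the listed probabilities no longer add up to 1.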

Sometimes we will be interested in a number of different expected values. For example, in our study of diffusion along a line (in 1D) we will be interested in both the average displacement of a diffusing molecule, $\langle x \rangle$, and the average of the square of the displacement, $\langle x^2 \rangle$. These concepts will also play an important role in understanding entropy and the Boltzmann factor.
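To see why $\langle x \rangle$ and $\langle x^2 \rangle$ carry different information, consider a hypothetical toy model that is an assumption made here for illustration, not the diffusion model itself: in a single step a molecule moves one unit to the left or one unit to the right with equal probability.

```python
# Hypothetical toy model (an assumption for illustration): in one step, a
# molecule moves one unit left (x = -1) or one unit right (x = +1), each
# with probability 0.5.
displacements = [-1, +1]
probabilities = [0.5, 0.5]

# <x>: average displacement.
mean_x = sum(x * p for x, p in zip(displacements, probabilities))

# <x^2>: average of the squared displacement.
mean_x_squared = sum(x**2 * p for x, p in zip(displacements, probabilities))

print("<x>   =", mean_x)
print("<x^2> =", mean_x_squared)
print("<x>^2 =", mean_x**2)
```

In this toy model the left and right steps cancel in $\langle x \rangle$, but they both contribute positively to $\langle x^2 \rangle$, so the average of the square is not the same as the square of the average.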

Joe Redish and Bill Dorland 2/4/18

Last Modified: May 12, 2019