The Boltzmann distribution


The Second Law of Thermodynamics says that in response to random interactions (such as molecular collisions), an isolated system will tend to move toward more probable states. 

For example, consider a system with two objects: object A with 5 degrees of freedom, in contact with object B with 10 degrees of freedom (where degrees of freedom can be considered as "bins" where energy is stored on the microscopic scale). If 3 packets of energy are deposited into the system and the system is brought into thermal equilibrium, the most probable state has one of the packets of energy in object A and two of the packets of energy in object B. There are 5 microscopic arrangements for the one packet in object A, and 45 arrangements for the other two packets in object B (the number of ways to choose 2 of the 10 bins, with at most one packet per bin). In other words, this "macrostate" can be generated by 5 × 45 = 225 different arrangements, or "microstates". You can check for yourself that having one packet in A and two in B is a configuration with a higher number of microstates (or higher probability) than all three packets in object A, all three packets in object B, or one packet in B and two in A.
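If you would rather let a computer do the check, here is a minimal sketch of the counting. It assumes, as the figure of 45 implies, that the packets are indistinguishable and that each bin holds at most one packet; the names `BINS_A`, `BINS_B`, `PACKETS`, and `microstates` are made up for this illustration.

```python
from math import comb

BINS_A = 5    # degrees of freedom ("bins") in object A
BINS_B = 10   # degrees of freedom ("bins") in object B
PACKETS = 3   # total packets of energy shared between A and B


def microstates(packets_in_a):
    """Count arrangements with the given number of packets in A.

    Assumption: packets are indistinguishable and each bin holds
    at most one packet, so each object contributes a binomial coefficient.
    """
    packets_in_b = PACKETS - packets_in_a
    return comb(BINS_A, packets_in_a) * comb(BINS_B, packets_in_b)


# Number of microstates for each macrostate (0, 1, 2, or 3 packets in A)
counts = {n: microstates(n) for n in range(PACKETS + 1)}
total = sum(counts.values())

for n, w in counts.items():
    print(f"{n} in A, {PACKETS - n} in B: {w} microstates, "
          f"probability {w / total:.3f}")
```

Under these assumptions, the macrostate with one packet in A and two in B has the most microstates of the four possibilities.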

In the discussion above we assumed that every microstate of the combined system has the same total energy, so all microstates are equally probable. However, microscopic arrangements are not always equally probable. In particular, in a physical system, the probability of an arrangement may depend on its energy. For example, if objects A and B from the example above are still in thermal equilibrium but we now consider only object A as our system, then the energy in object A can vary from zero to three packets of energy.

To figure out what's going to happen when there are different energies involved, we need to take into account both energy and entropy. Figuring this out is an excellent example of how to construct mathematics that describes what we know to be true from our understanding of the physics (see Building equations from the physics).

Let's start with a simpler example: Let's focus on a molecule that collides with others in a thermal environment. Some of the kinetic energy might go into the internal energy of one or both of the molecules. It could set the molecule vibrating or spinning, putting energy into the internal degrees of freedom of the molecule. So for all the molecules of a particular type in a block of matter, we expect that some of them will have more energy in their rotational degrees of freedom than others. Let's try to figure out how the probability of an arrangement depends on its energy.

What we can infer from how two systems must combine

Let's imagine splitting our system into two parts, which we'll call A and B.  Now we can make two observations:

  • The energy of the entire system equals the energy of subsystem A plus the energy of subsystem B, or if you like, $E_{AB} = E_A + E_B$ .  This should need no explanation!
  • The two parts of the system are each in some arrangement.  If $P_{AX}$ is the probability that subsystem A is in arrangement X, and $P_{BY}$ is the probability that subsystem B is in arrangement Y, then what is the probability that the entire system (AB) is in this combined arrangement (XY)?  From the rules of probability, we know that we have to multiply them: $P_{(AB)(XY)} = P_{AX} \times P_{BY}$.  For example, if you flip a coin and roll a die at the same time, the probability that you both flip tails and roll 5 is (1/2)*(1/6) = (1/12).  (This is related to the principle that you saw in this '80s music video.)

In summary,

When systems combine, energies add and probabilities multiply. 

So if we express the probability of a given state as a function of the energy, it must satisfy the relationship $P(E_A + E_B) = P(E_A) \times P(E_B)$ for all possible values of $E_A$ and $E_B$.

Creating a distribution that has the right properties

We need a probability distribution for energy such that when you add the energies in two parts of the system, the probabilities multiply. What types of functions satisfy this relationship?  Let's try exponential, since we know $e^{x+y} = e^x  e^y$ for all $x$ and $y$.
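As a side note on why the exponential is a natural guess, here is a short sketch of the reasoning. It assumes that $P$ is positive and continuous, which is an added assumption and not something established above. Writing $f(E) = \ln P(E)$ turns the product rule into a sum rule:

$$f(E_A + E_B) = f(E_A) + f(E_B)$$

The continuous functions that satisfy this for all $E_A$ and $E_B$ are straight lines through the origin, $f(E) = cE$ for some constant $c$, so

$$P(E) = e^{cE}$$

This argument does not fix the constant $c$. Its units and its sign have to come from the physics.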

So can we just say the probability of arrangement i (with energy $E_i$) is $P(E_i) = e^{E_i}$? No way! $E_i$ has units of energy, and you can't take an exponential of a quantity that has units. You wouldn't want to get a different result if you calculated in calories instead of joules! So the exponent has to be dimensionless. We have to divide $E_i$ in the exponent by something else with units of energy so that the units cancel. Where can we get another energy? Is there a natural energy scale to which we can compare the energy of a given arrangement (so that we end up with just a dimensionless ratio)? Yes, there is: As we saw before, we can use the Boltzmann constant $k_B$ to convert the temperature of the system into an energy, so $k_BT$ has units of energy, and $E_i/k_BT$ is dimensionless. This makes our probability a function of both the energy of our arrangement, $E_i$, and the temperature, $T$. Since energy is really a continuous variable, we'll write the probability as $P(E,T)$.

Then can we say that the probability function is $P(E,T) = e^{E/k_BT}$?  That's not obviously wrong in the way that $e^{E_i}$ is — at least the expression now makes mathematical sense. But let's play the implications game to see whether it makes physical sense.  If you plug in numbers to this expression, you see that an arrangement with greater energy has a greater probability than another arrangement (at the same temperature) with less energy.  But that can't be right.  It should be more difficult to get to an arrangement with greater energy (and therefore higher energy arrangements should be less probable).

Let's try putting in a negative sign: $P(E,T) = e^{-E/k_BT}$. Now does this make physical sense? A graph of the relationship between probability and energy (at a given temperature) is shown at the right.

Reading the graph, it tells us that as the energy of an arrangement increases, the probability decreases. At very high energies, the probability never quite reaches zero, but becomes very small. This makes good physical sense. It's hard to accumulate a lot more than the average energy through fluctuations.

This exponential is known as the Boltzmann factor. (As we'll see below, it's not the whole story.)

Now let's see how this probability distribution $P(E,T)$ depends on temperature. This means we are looking at how the likelihood of getting a particular energy changes as we change the temperature. We might expect that as $T$ goes up, the probability of getting a given energy (especially a high energy) increases, since there is more energy available.

In our expression $P(E,T) = e^{-E/k_BT}$, the temperature $T$ is in the denominator inside the exponent, not something we're accustomed to thinking about! (Check it out by plotting $y = Ae^{-1/x}$ on the Desmos graphing calculator.)  

As $T$ increases, the denominator gets larger, which means the fraction $E/k_BT$ gets smaller. But this fraction in the exponent is negative, so the exponent actually gets less negative, i.e. larger, so the probability of a given arrangement increases as you go to higher temperature. In other words, raising the temperature of the whole system makes it easier to get to higher-energy arrangements, just as we expected. This relationship looks like the figure at the right.
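To see this trend in numbers, here is a minimal sketch that evaluates the bare factor $e^{-E/k_BT}$ (with no normalization constant) at a few temperatures. The energy of 0.1 eV and the three temperatures are arbitrary values chosen for illustration, and `boltzmann_factor` is a made-up helper name.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def boltzmann_factor(energy_ev, temperature_k):
    """Return exp(-E / (k_B T)) for an energy in eV and a temperature in K."""
    return math.exp(-energy_ev / (K_B_EV * temperature_k))


ENERGY_EV = 0.1  # hypothetical arrangement energy, chosen only for illustration

# The factor grows toward 1 as the temperature rises
factors = [boltzmann_factor(ENERGY_EV, t) for t in (150.0, 300.0, 600.0)]
for t, f in zip((150.0, 300.0, 600.0), factors):
    print(f"T = {t:.0f} K: exp(-E/kT) = {f:.3e}")
```

Each doubling of the temperature halves the size of the exponent, so the factor for this fixed energy climbs steeply.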

There's just one part of this result that can't possibly be right: Try plugging in 0 for the energy, and you get $e^0 = 1$ for the probability, suggesting that there is a 100% probability that the system will be in a state with zero energy! Impossible, because that would leave no probability for any other state, and all the probabilities together have to add up to 1. To fix this, we need to put a normalization constant out in front:

$$P(E, T) = P_0 e^{-E/k_BT}$$ 

Where we get the value of $P_0$ by adding up (integrating over) all the energies and demanding that the total probability is equal to 1. Note that this makes $P_0$ a function of $T$!
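For a concrete sense of how $P_0$ can come out, here is a minimal worked sketch under a simplifying assumption that is not made in the text above: the energy is continuous, runs from $0$ to $\infty$, and each energy corresponds to a single arrangement. Requiring the total probability to equal 1 gives

$$\int_0^\infty P_0 e^{-E/k_BT}\, dE = P_0 k_BT = 1 \quad\Rightarrow\quad P_0 = \frac{1}{k_BT}$$

In this simplified case $P_0$ is a probability per unit energy, and it falls as $T$ rises. For a real system the sum runs over that system's actual arrangements, so the value of $P_0$ will generally be different.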

Sometimes, we're mainly concerned about the relative probabilities of two different states, and when we take the ratio of two probabilities, the constant cancels out:  

$$\frac{P(E_1)}{P(E_2)} = e^{-(E_1-E_2)/k_BT}$$

so in that case we won't have to worry about what this constant actually is. 

Let's consider a specific example. At room temperature (300 K), $k_BT \approx \frac{1}{40}\ \mathrm{eV} \approx 4.14 \times 10^{-21}\ \mathrm{J}$, or, per mole, $RT = N_Ak_BT \approx 2.5\ \mathrm{kJ/mol}$.

So if there are two different arrangements with an energy difference of 5 kJ/mol between them, then the lower-energy arrangement is about $e^{5/2.5} = e^2 \approx 7.4$ times as probable as the higher-energy arrangement. (Run the numbers for yourself to check this.) However, if we heat the system up to 600 K, then $RT \approx 5\ \mathrm{kJ/mol}$ and the lower-energy arrangement is only about $e^1 \approx 2.7$ times as probable: it becomes easier to access the higher-energy arrangement.
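One way to run the numbers is with a few lines of code. This is a minimal sketch that uses the gas constant $R$ so the energy difference can stay in J/mol; `probability_ratio` is a made-up helper name.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)


def probability_ratio(delta_e_j_per_mol, temperature_k):
    """Return P(lower) / P(higher) for two arrangements.

    delta_e_j_per_mol is the (positive) energy gap between them in J/mol.
    The normalization constant cancels, so only the Boltzmann factors remain.
    """
    return math.exp(delta_e_j_per_mol / (R * temperature_k))


DELTA_E = 5000.0  # 5 kJ/mol, the energy difference from the example

ratio_300 = probability_ratio(DELTA_E, 300.0)
ratio_600 = probability_ratio(DELTA_E, 600.0)
print(f"300 K: lower-energy arrangement is {ratio_300:.1f} times as probable")
print(f"600 K: lower-energy arrangement is {ratio_600:.1f} times as probable")
```

The results come out close to 7.4 and 2.7. They differ slightly from $e^2$ and $e^1$ because $RT$ at 300 K is a little under 2.5 kJ/mol.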

Here's what the Boltzmann distribution looks like at different temperatures:


The Boltzmann distribution only applies to individual arrangements of particles. But if there are multiple arrangements that give the same overall energy (just as we showed that there are multiple ways to flip 10 coins and get 5 heads and 5 tails), we still need to multiply the probability of each arrangement by the number of arrangements.  We'll explore the consequences of combining energy and entropy (probability) in the follow-on page. 

Workout: The Boltzmann distribution


Ben Dreyfus 11/29/11 & Wolfgang Losert 2/8/2013


Article 611
Last Modified: April 11, 2019