These exercises are not tied to a specific programming language. Example implementations are provided under the Code tab, but the exercises can be implemented on whatever platform you wish to use (e.g., Excel, Python, MATLAB, etc.).
2-State Systems: Statistical Mechanics in Sports and the Story of the Purple Pandas
-
2-state systems are simple opportunities to explore the implications of the _fundamental assumption of statistical physics_:
+ Individual events are independent of one another, and thus all accessible **microstates** are equally likely to occur.
Common examples of the 2-state system are coin flips, paramagnets, and even protein switching. But let us take a look at a familiar 2-state system: sports seasons.
Imagine young children (four or five years old) playing pee-wee soccer. It is not beyond reason that no child is better at soccer than any other, and that no team is better than any other, so for the time being we will assume that the fundamental assumption of statistical mechanics does hold in this case (we'll think about the case of professional sports later). Let us follow one team in this youth league, the Purple Pandas, and consider how well they are likely to do this season.
Our goal is to learn how to do some simple combinatorics in Python, iterate over all possible macrostates, and use histograms of those numbers to make inferences about large systems.
In this activity, we'll refer to the final season outcome as a **macrostate**. For example, last season, the Purple Pandas played a 5-game season and won 3 of those games. This macrostate would be 3 - 2.
The exact sequence of wins and losses that produced this season outcome will be called a **microstate**. The Pandas lost their second and third games last season, so their microstate was WLLWW.
###A Test Case
Let's first analyze the Pandas' last season before we generalize this procedure in a computational model. Consider a season that contains 5 games.
**1.** How many possible sequences of wins and losses are there in a 5-game season? (That is, how many total *microstates* are there?)
**2.** Now let's look at a 3W - 2L season; write down all possible sequences of wins and losses that could produce this macrostate. How many combinations are there?
**3.** Recall the binomial combinatoric formula:
$$
\Omega(N,W) = \dfrac{N!}{W!L!} = \dfrac{N!}{W!(N-W)!},
$$
where $N$ is the season length (5 games), and $W$ and $L$ are the number of wins and losses, respectively. Check that this gives you the same result that you got by explicitly writing down all of the combinations. If not, go back to question 2 and make sure that you didn't miss or double-count any microstates.
**4.** What are the odds of the Pandas winning 3 games in a 5 game season?
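The hand count in this section can be cross-checked by brute force. The following is a minimal sketch in Python, one possible platform among many; the helper names `enumerate_microstates` and `multiplicity` are illustrative, and the 4-game, 2-win case is chosen deliberately so that it differs from the 5-game exercise.

```Python
from itertools import product
from math import factorial

def enumerate_microstates(n_games):
    """Return every win/loss sequence of length n_games as a string such as 'WLLW'."""
    return ["".join(seq) for seq in product("WL", repeat=n_games)]

def multiplicity(n_games, wins):
    """Binomial formula Omega(N, W) = N! / (W! (N - W)!)."""
    return factorial(n_games) // (factorial(wins) * factorial(n_games - wins))

# Illustrative case (not the exercise's values): 4 games, 2 wins.
n_games, wins = 4, 2
all_sequences = enumerate_microstates(n_games)
matching = [s for s in all_sequences if s.count("W") == wins]

print(len(all_sequences))   # total number of microstates: 16
print(matching)             # every microstate in the 2 - 2 macrostate
print(len(matching) == multiplicity(n_games, wins))   # True
```

Changing `n_games` and `wins` to the exercise's values gives a way to check a handwritten list for missed or double-counted sequences.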
Computational Combinatorics
-
For small seasons (e.g., 5 games) it is simple to explicitly write down all possible combinations, but there are two obvious limitations to explicitly writing these out for large seasons (or large numbers of atomic dipoles, coins, etc.): First, the novelty quickly wears off and this becomes tedious for anything more than 20 or so combinations. Second, the total number of combinations quickly becomes very large, even for moderately sized values of $N$ and $W$. For these reasons, we can get a computer to do the dirty work for us. (A nice bonus is that we can create nice plots!)
Start by launching your programming environment. Depending on which environment you are using, you may need to load a plotting library, a library that contains a combinatoric function, and a library that allows you to manipulate arrays.
Once you have done this, it is time to write the meat of the program. What we want is a program that will do the following:
+ Calculate the odds of winning a specific number of games, given a total season length (or, alternatively, the total number of ways of winning $W$ games given $N$ total games)
+ Display a histogram showing the total number of microstates for each macrostate---that is, the total number of ways of winning $W$ games given $N$ total games for each possible value of $W$.
**5.** In your program, create a section for ```parameters``` and another for ```calculations```; you'll want to organize the various parts of your program. Be sure, also, to make copious notes throughout your program about what various parameters are and what various algorithms are doing; this will make it easier for you to come back to the code and easier for others to follow along.
A. Write the following lines of code into your ```parameters``` section:
```Python
N_games = ??? #the total season length
Wins = ??? #Total number of wins considered
totalmicrostates = ??? #Calculate the total number of microstates
```
Fill in values or equations in place of the question marks so as to be consistent with the 5-game and 3-win situation from above.
There are two ways to calculate binomial coefficients: we can compute them directly using the factorial function, but many environments also have a special function for calculating the binomial coefficient (sometimes called 'comb()' or 'combin()'):
$$N! \Longleftrightarrow \text{factorial}$$
$$\dfrac{N!}{W!(N-W)!} \Longleftrightarrow \text{N choose W}$$
Use the documentation of your programming environment to determine how to write both of these functions. The second of these will be easier to use, both because it's cleaner to write and because it will save on processing time (try them both and see!)
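As one concrete instance of the two approaches, here is a sketch assuming Python 3.8 or later, where the standard library provides `math.factorial` and `math.comb`. The helper name `choose_via_factorials` and the timing setup are illustrative choices; other environments will use different function names.

```Python
import math
import timeit

def choose_via_factorials(n, w):
    """N choose W written out with factorials: N! / (W! (N - W)!)."""
    return math.factorial(n) // (math.factorial(w) * math.factorial(n - w))

# Both routes should agree (illustrative values: 4 games, 2 wins).
n, w = 4, 2
print(choose_via_factorials(n, w), math.comb(n, w))

# Rough timing comparison for a larger case; results depend on the machine.
t_factorial = timeit.timeit(lambda: choose_via_factorials(500, 300), number=1000)
t_comb = timeit.timeit(lambda: math.comb(500, 300), number=1000)
print(f"factorials: {t_factorial:.4f} s, comb: {t_comb:.4f} s")
```

Integer division (`//`) is used in the factorial version so the result stays an exact integer even when the factorials are very large.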
**6.** In the ```calculations``` section of your code, write a) a command that will calculate $\Omega(N,W)$, b) a command that will calculate the probability, and c) a command that will print this probability.
**Run the program** and verify that you get what you calculated earlier.
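For comparison, here is a sketch of how such a program might be laid out in Python, using a different season (4 games, 2 wins) so that it can serve as an independent check rather than as the answer. It assumes that each game has exactly two outcomes, so the total number of microstates is $2^N$, and it uses `math.comb` from Python 3.8 or later; the variable names `multiplicity` and `probability` are illustrative.

```Python
from math import comb

# --- parameters ---
N_games = 4                       # total season length (illustrative value)
Wins = 2                          # number of wins considered (illustrative value)
totalmicrostates = 2 ** N_games   # assumption: two possible outcomes per game

# --- calculations ---
multiplicity = comb(N_games, Wins)              # Omega(N, W)
probability = multiplicity / totalmicrostates   # equal weight for every microstate
print(f"P({Wins} wins in {N_games} games) = {probability:.4f}")
```

With these values, 6 of the 16 microstates belong to the 2 - 2 macrostate, so the printed probability is 0.3750.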
##Generalizing this to multiple games
Now that you are able to calculate the probability of the Purple Pandas winning exactly 3 games in a 5-game season, we would like to know the probability of every possible season result, given $N$ total games. To do this, we will need to repeat our calculation for each number of wins from 0 to the total number of games, $N$.
Rather than using one value for the number of wins, we should use a list or an array with each possible number of wins. For five games, making such a list is easy; we can just explicitly type
```Python
Wins = [0, 1, 2, 3, 4, 5]
```
For longer seasons, explicitly writing it out might get tedious.
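One way to avoid typing the list by hand, sketched here in Python with an illustrative 4-game season, is to generate it from the season length and then apply the binomial coefficient to each entry. The name `multiplicities` is an assumption made for this example, and `math.comb` requires Python 3.8 or later.

```Python
from math import comb

N_games = 4   # illustrative season length

# Every possible number of wins, from a winless season to a perfect one.
Wins = list(range(N_games + 1))

# Number of microstates for each macrostate, in the same order as Wins.
multiplicities = [comb(N_games, w) for w in Wins]

print(Wins)             # [0, 1, 2, 3, 4]
print(multiplicities)   # [1, 4, 6, 4, 1]

# Sanity check: the macrostates together account for all 2**N microstates.
print(sum(multiplicities) == 2 ** N_games)   # True
```

The final check is a useful habit for any season length: if the multiplicities do not sum to $2^N$, a macrostate has been skipped or counted twice.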
**7.** Write an algorithm that will create a list of all possible numbers of wins for a given season length. How many items should be in the list?
**8.** Write an algorithm that will create an array that contains the binomial coefficient for each $W$ value in your wins list. Test your algorithm for the 5-game season (which you can directly calculate). Does this fit with your test calculations?
**9.** Create a bar-chart of your data.
**10.** Add a title, axis labels, etc. If you do not know how to do this, you can look it up online or in the documentation of your programming language.
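If no plotting library is at hand, a plain-text chart can serve as a quick preview of the histogram's shape. The sketch below uses only the Python standard library; the function name `text_bar_chart` and the 6-game season are illustrative, and in a real plotting library a bar-chart function with title and axis-label commands would replace the loop.

```Python
from math import comb

def text_bar_chart(n_games, width=40):
    """Return the lines of a plain-text bar chart of multiplicity versus wins."""
    counts = [comb(n_games, w) for w in range(n_games + 1)]
    peak = max(counts)
    lines = [
        f"Microstates per macrostate, N = {n_games}",   # title
        "wins | multiplicity",                           # axis labels
    ]
    for w, count in enumerate(counts):
        bar = "#" * round(width * count / peak)   # scale the tallest bar to `width`
        lines.append(f"{w:>4} | {bar} {count}")
    return lines

print("\n".join(text_bar_chart(6)))
```

Each bar is scaled relative to the most likely macrostate, so the chart shows the shape of the distribution rather than absolute counts.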
##Modify Your Program##
Modify your program to answer the following questions:
**11.** What are the odds that the Purple Pandas will win 3 games in a 5-game season?
**12.** What are the odds that the Purple Pandas will win 30 games in a 50-game season?
**13.** What are the odds that the Purple Pandas will win 300 games in a 500-game season?
**14.** In this last season, how would you compare the number of ways the Pandas might win 250 games to the number of ways they might win 300 games? What does this histogram tell you about the likelihood of divergences from the mean? What effect does the total number of games have on this?
**15.** The fundamental assumption of statistical mechanics is: *each microstate is as likely to occur as any other microstate*. What are some specific assumptions that we had to make about the Purple Pandas in order for the Fundamental Assumption to hold true?
**16.** Could we apply these mathematics to professional sports teams? What assumptions can we continue to make and what assumptions can we no longer make? In what ways are these assumptions no longer valid?
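For the comparison across season lengths asked about in question 14, one possible approach is to look at the ratio of two multiplicities rather than at the raw numbers, which become enormous. The sketch below is a hedged illustration in Python: the function name `relative_multiplicity`, the 60% win fraction, and the chosen season lengths are assumptions made for this example, and even season lengths are used so that the midpoint $N/2$ is a whole number.

```Python
from math import comb

def relative_multiplicity(n_games, win_fraction=0.6):
    """Ratio Omega(N, fraction * N) / Omega(N, N / 2) for an even season length."""
    wins = round(win_fraction * n_games)
    # Python integers have arbitrary precision, so both counts are exact
    # before the division turns the ratio into a float.
    return comb(n_games, wins) / comb(n_games, n_games // 2)

for n_games in (10, 100, 1000):
    print(n_games, relative_multiplicity(n_games))
```

Watching how this ratio changes as the season gets longer is one way to describe how the histogram narrows relative to its full range.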