These exercises are not tied to a specific programming language. Example implementations are provided under the Code tab, but the Exercises can be implemented in whatever platform you wish to use (e.g., Excel, Python, MATLAB, etc.).
# Stochastic Birth-Death Processes
___
Many processes in physics, biology, economics, etc. are characterized by some element of randomness: the outcome of a quantum event, for example, or the trajectory of a particle undergoing Brownian motion. These are called **stochastic** processes. Rather than try to model the details of the dynamics, it is often easier to zoom out to a coarser model based in random numbers.
For this exercise, we'll look at a simple model of population dynamics called a **birth-death processes**. In this model, population fluctuations---specifically the birth and death rates, $\beta_\text{birth}$ and $\beta_\text{death}$---depend only on the population size, $l$. A classic physics example is nuclear decay for which the rate of decay depends only on the number of remaining nuclei. For this exercise set, however, let's consider a different situation: protein synthesis.
In our simple model, we'll only consider two possible events: births (or synthesis) and deaths (or decays). Let's further imagine that this system has a synthesis rate---that is, the rate at which the population increases---that depends only on the presence of the ingredients and is thus is fixed ([^1]) and a clearance rate (deaths) that is proportional to the total population, $l$:
$$
\begin{aligned}
\beta_\text{birth} &= (\text{constant})\\
\beta_\text{death} &= (k_0)*l
\end{aligned}
$$
$\beta_\text{birth}$ and $k_0$ are parameters of the model that would, in principle, be determined by experimental measurements. Both $\beta$ rates have units of [events/time].
[^1]: In truth, birth rates are never truly constant, but there are some circumstances in which they are *approximately* constant. For the example of protein synthesis, since the "birth rate" (or rather, the "synthesis rate") is dependent only on the ingredient amino acids and enzymes---and not on the current protein population---then so long as the ingredients are maintained at a fixed amount, the synthesis rate of our protein will remain a constant.
To make this model stochastic, we'll let both the timing and the sequencing of individual births and deaths be probabilistic. This echos many physical and biological processes such as nuclear decay, protein synthesis, and predation where quantum, thermal, or social effects prevent us from precisely predicting individual events.
**The goal of this exercise set is to build a computational model of this simple birth-death process and analyze the results**
Once you start to see how this algorithm works, you can extend it to more complicated systems---for example systems that include multiple interacting populations, multiple decay paths, and/or different kinds of events (e.g. situations such as enzyme switching in which the rate parameters are altered.) But again, let's keep it simple for this exercise set.
## The Gillespie Algorithm
Computational approaches to modeling stochastic systems typically use **Monte-Carlo** algorithms, which involve generating and manipulating random numbers. In order to model a birth-death process, we'll turn to a particular Monte-Carlo algorithm called the **Gillespie Algorithm** which abstracts stochastic processes into a series of steps that can be modeled numerically.
The basic process is as follows:
1. Calculate the mean rate at which events (any event) occur
2. Calculate the wait time until the next event (a random process)
3. Determine which event actually occurs (a second random process)
4. Adjust population and time parameters based on the outcomes of the event
5. Repeat
Begin by loading any library required for generating random numbers, plotting, and manipulating arrays; Setting up the plots; and defining the model parameters:
```Python
beta_birth = 0.2
k_0 = 0.025
Initial_Sample = 20.0
```
### Exercise 1 -- Random wait times
Like the clicks of a Geiger counter, events in birth-death processes do not occur at regular time intervals; instead we can only say that the *odds* of an event occurring within any small time interval $\Delta t$ is equal to $\beta \Delta t$, where $\beta$ is the mean rate of event occurrences (and $1/\beta$ is the average time between events). A sequence of random wait times is an example of what is called a "**Poisson Process**."
Although the wait-times between sequential events fluctuate at random, they still follow an exponential distribution where the probability of the next event occurring after a duration $t_\text{w}$ (give-or-take $\Delta t$) is given by:
$$
\mathscr{P}(t_\text{w}) = \beta e^{-\beta t_\text{w}}
$$
In order to have a computer generate a random wait time that follows this distribution, you need to convert the *uniformly distributed* random numbers that the computer generates into an exponential distribution:
```WaitTime = -(1/beta)*log(random())```
**Create a function (or find an appropriate pre-written function) that will return a random, exponentially distributed wait time for a given mean rate.**
### Exercise 2 -- Calculate the mean rate of event occurrences
There are two possible events that might occur in a birth-death process and these might each happen at different rates: $\beta_\text{birth}$ and $\beta_\text{death}$.
**Determine a general relationship to calculate the total rate at which events occur. Write this in terms of parameters that have already been defined**
(*Hint: suppose births happen 2 times a minute and deaths happen 3 times a minute; how many events happen every minute? How did you combine these rates? Don't forget that your solution should be a general relationship.*)
### Exercise 3 -- Selecting between events
Since a particular event can either be a birth or a death, we need to be able determine which event actually occured. A **Bernoulli Trial** is the name of a process that randomly decides between two events with odds $\xi_1$ and $\xi_2 = 1-\xi_1$, respectively. The classic example is a coin flip for which $\xi_1 = \xi_2 = 1/2$.
**Create a function that will randomly pick between two events given a fairness parameter $\xi$** (the output can be true/false, 1/0, etc.)
### Exercise 4 -- What are the odds?
Given the rates at which births and deaths occur ($\beta_\text{birth}$ and $\beta_\text{death}$), as well as the total rate for which any event occurs $\beta$, what are the odds, $\xi$, that a particular event will be a birth? Can you use the same procedure to determine the odds that a particular event will be a death? Check that these odds are properly normalized, that is $\xi_\text{birth} + \xi_\text{death} = 1$
### Exercise 5 -- Building the Dynamics
Now we can combine everything together. **Inside the iterative loop, include code that:**
* Calculates the mean rate of event occurrence
* Calculates the wait time until the next event
* Determines which event (birth or death) occurs
* Updates the time and population accordingly
* Plots the population
Run the program and briefly describe the output. Does this fit with your understanding of randomness?
### Exercise 6 -- Continuous, Deterministic Solution
For the analytic solution, let's assume that the population $l$ is deterministic, continuously varying (i.e. events are continuous), and is continuously variable (i.e. the population $l$ is not restricted to integers).
* Write down a differential equation that describes the rate of change of $l$ (*Hint: what are the ways that $l$ might change? How can we take these into account?*)
* Use this differential equation to determine the theoretical steady-state population.
* Solve the differential equation to find the full theoretical population dynamics. (*Hint: one option is to look at the overall trend of the stochastic curve and 'guess' what the smooth analytic solution should be; you can then run it back through the differential equation and 'tune' it to fit. If you're comfortable enough with differential equations, you may also be able to 'guess' at its solution*)
* Edit your code to plot the analytic solution along side the stochastic model.
### Exercise 7 -- Making Sense of the Model
* How does the analytic solution compare to the stochastic model? What are the similarities and differences? Try running the model multiple times without changing any parameters; how do the stochastic simulations differ?
* Vary the parameters associated with this model (```beta_birth```, ```k_0```, and ```initial_sample```); what effect does this have on the simulation? According to your analytic solution, what effect *should* these have on the simulation? Are these consistent?
* The analytic solution for $l(t)$ resembles the velocity dynamics for a particle undergoing linear drag; why should this be?
* Qualitatively, why is the exponential component of the analytic solution $\text{e}^{-k_0t}$ and not $\text{e}^{-\beta t}$? (In other words, why should the parameter $k_0$ determine the relaxation rate to steady state and not the rate of event occurrences, $\beta$?)
### Excercise 8 -- Noise and Probability Distributions
Noise is an inherent part of a stochastic system; even once it reaches a steady state, the population still fluctuates about the analytical prediction. Because of this, we must talk about *probability distributions* rather than deterministic population dynamics. But what does this distribution look like? How big are the fluctuations? Are they dependent on parameters of the model (i.e. $\beta_\text{birth}$ or $k_0$)?
There are two ways that we might examine the steady-state noise: we can analyze the results of our model or we can come up with a theoretical prediction. For now, it is simplest to use our model to generate data that we can analyze.
Make sure you have loaded appropriate libraries for plotting and statistical analysis. In Python, for example, you'll need to make use of the following libraries:
```Python
import matplotlib.pyplot as plt
import numpy as np
```
In order to study the steady state of the model, **edit your code to**
- Start the population at the steady-state value
- Keep a list of every value the population reaches (one for each iteration!)
and then **write new code to**
- Calculate the mean, variance, and standard deviation of this list of population values. Calculate, also, the total number of population samples (i.e. how many data points make up your distribution?)
- Create a histogram of the population distribution. Make this look nice.
You may have to Google how to perform some of these functions. For example, you'll want the range of histogram bins to match the range of population values. To increase the precision of the statistical measures, you can increase the duration of your model (and increase the simulation rate so it doesn't take 10 times as long to run!) to increase the number of data points (or run it multiple times without overwriting data from previous runs).
**Discuss you results.** Your discussion should address the following points, but feel free to elaborate on anything that you find particularly compelling:
1. Does your sample mean precisely align with the theoretical steady-state population? Should we expect this to be the case? How different would they need to be before you expressed surprise?
2. Does you sample mean, variance, etc. precisely align with those calculated by another group using an identical model (or you running your model a second time)? how much difference would be surprising?
3. Your probability distribution should look fairly "normal;" Why does such regular pattern emerge from a stochastic model? More to the point: what does randomness "look like?"
4. Contrast how $l$ is used in the stochastic/deterministic model and how it is used when discussing probability distributions. (there are some subtle but important differences.)
###Exercise 9 - Theoretical Population Distributions
Once the population reaches steady state, the distribution of population values approaches what is called the "Poisson Distribution" (not to be confused with "Poisson Processes.") That is to say, given a distribution centered on $l_
\text{mean} = \mu = \dfrac{\beta_\text{birth}}{k_0}$, the odds of the population actually having $l$ total members at any given time is given by
$$
\mathcal{P}_\text{Poiss.}(l) = \dfrac{\mu^l}{l!}e^{-\mu}
$$
For this particular distribution, both the expectation (average) and variance (standard deviation squared) are equal to $\mu$:
$$
\begin{aligned}
\lt l \gt = \mu\\
\text{var}(l) = \mu
\end{aligned}
$$
**Plot this distribution as a function of $l$ and compare it to the histogram generated from your model. Discuss the results. Do your sample mean and variance align with the theoretical mean and variance? Don't forget that your histogram shows the number of counts and not the probability so you'll have to adjust your plots accordingly. Also don't forget that $l$ is a natural number (i.e. 0, 1, 2, 3, etc.).**