2.1 Using ordinary differential equations to describe chemical reactions inside the cell

In the last lecture, we learned that molecular interactions between protein molecules, other, small molecules and DNA binding sites can turn on and off the activities of proteins and genes. These regulatory interactions combine into complex regulatory networks that ultimately control how cells behave.

Here, we will use ordinary differential equations (ODEs) to describe how these regulatory networks work. ODEs provide a powerful tool for predicting how a regulatory network that is wired in a particular way will behave inside the cell. In this lecture, we will consider two rather simple but very important examples (an unregulated gene and a negatively autoregulated gene), but the same methods are used to analyse much more complicated networks with many tens of genes and proteins.

The interior of cells is complicated: eukaryotic cells contain different cell compartments (e.g. the nucleus), and the contents of these compartments can also be organised in complicated ways. Prokaryotic cells, such as bacteria, don’t have compartments but they are highly packed with proteins and DNA, and some proteins tend to occupy specific regions of the cell.

Although this spatial structure probably plays an important role in the ways in which cells function, we can understand many aspects of cell regulation without taking it into account. Here, we will make the important assumption that the interior of the cell (or a particular cellular compartment) is “well mixed”, as illustrated in Slide 2.

Slide 2 shows a well-mixed system comprising two types of molecule: A and B. Because the system is well mixed, type A molecules collide with type B molecules with a frequency that is proportional to the product of the number of molecules of A and the number of molecules of B. In this lecture, we will assume that the inside of the cell is well mixed, but we should keep in mind the fact that this may not always be the case.

Now let’s suppose that when an A molecule collides with a B molecule, the two can react to produce a molecule of type C. Starting from a mixture of A and B, we would like to know how many C molecules will have been produced at time t. We suppose that in a small interval of time dt, the probability of a C molecule being produced is qNANB/V, where V is the volume of the system, NA is the number of A molecules and NB is the number of B molecules (the probability scales with 1/V since a pair of A and B molecules will be less likely to meet each other in a larger volume). We can then write

Equation 1
Equation 1

The constant q is called the rate constant. (The usual symbol for a rate constant is k, but there are several rate constants in this lecture, so we are using q for this one to avoid having several different constants all called k.)

Slide 3 shows the output of a numerical simulation of the reaction

A + B → C

In these simulations, we have assumed that the volume, V, is set to 1. In the left-hand plot, we can see that the number of C molecules (NC) increases over time, and that the C molecules are produced at random points in time, whenever an A and a B molecule happen to collide. This randomness can be important if there are only a small number of A and B molecules, and we will return to this in the next lecture. The right-hand plot in Slide 3 shows the same reaction, but with many more A and B molecules. In this case, many collisions happen in a small time interval and the plot for the number of C molecules versus time is much smoother. In fact, we can assume that the number of A, B and C molecules (per unit volume) change continuously with time. This is an important assumption because it allows us to write ODEs to describe how the system changes with time. The variables in these ODEs are the concentrations (number per unit volume) of the chemical species, in this case A, B and C, which we denote cA, cB and cC (e.g. cA = NA/V). For example, the set of ODEs that represents the reaction of A and B to produce C is shown on Slide 4. It’s important to note that because this is a second order or bimolecular reaction (it involves two reacting molecules), the dimensions of the rate constant are (concentration–1)(time–1).

2.2 Production of protein from a gene

2.2.1 A simple model

We can use the same ordinary differential equation methods to understand how cells control the production of protein molecules from their genes. Here, we are interested in how the concentration, cP, of a specific protein molecule, P, changes with time inside the cell. Protein P is produced from its gene, gP, by transcription (to make messenger RNA) followed by translation (to make an amino-acid chain) and protein folding. We could model all of these processes in detail but for now let’s just suppose that protein P is produced at a constant rate, k, as long as the gene, gP, is active. This reaction is zeroth order: the protein P is created at a constant rate that does not depend on any other variables in the model. The dimensions of the rate constant for this reaction are therefore (concentration)(time–1).

We write this as a chemical reaction,

In this reaction, the “source” is actually the gene, gP, plus the whole machinery of transcription and translation. Here we just put this into a “black box” and assume that protein P is produced at a constant rate.

Protein molecules are also removed from the cell. This could be because another protein molecule actively degrades them or because the cell is growing and dividing into daughter cells (and every time the cell divides, a given protein molecule has a chance of being lost). For now, let’s just assume that there is a fixed probability per unit time, μ, that any given molecule of P is removed. We can also write this as a chemical reaction,

This is a first-order or unimolecular reaction: a single molecule of P reacts. For unimolecular reactions, the dimensions of the rate constant are (time)–1. The “sink” here is another black box; P might have been removed into a daughter cell or it might have degraded into unspecified products.

Combining the constant rate of production, k, and the constant rate, per molecule, of loss, μ, we can write a differential equation for the rate of change of the concentration, cP, of P molecules (see Slide 5),

Equation 2

2.2.2 The steady state

We can tell a lot about the system without actually solving this equation. Slide 6 shows the rate of change, dcP/dt, plotted for different concentrations of protein, cP, for parameter values k = 2 and μ= 1. When the concentration of protein is small (cP < k/μ), di>cP/dt is positive. This means that there will be net protein production (cP will increase). However, when the concentration of protein is large (cP > k/μ), dcP/dt is negative. This means there will be a net loss of protein (cP will decrease). We can also see that for cP = k/μ, dcP/dt is zero. When the protein concentration reaches this value, there will be no net change: production balances removal. This is the steady-state protein concentration, cP (ss),

Equation 3

We could have obtained the value of cP(ss) directly by setting dcP(t)/dtt to zero in our differential equation and solving for cP.

Steady-state concentrations are a very important property of regulatory networks, and quite often this is all that people focus on when they study a model for a particular regulatory network.

The size of cP(ss) depends on both k and μ. If protein removal is due to cell division and if the average time between cell divisions (the cell cycle time) is τ, then

Equation 4

Notice the similarity with radioactive half-life.

For the bacterium Escherichia coli (E. coli) on a good food source, τ is about 30 min, so μ is about 0.02/min. Protein production rates, k ,vary greatly, from virtually zero to about 50/min. So the number of protein molecules in a cell (assuming that there is only one copy of the gene) can vary from zero to several thousand.

2.2.3 Rise time

For the simple model discussed here, we can go beyond the steady state and also solve the model for the time-dependent protein concentration, cP(t). This is important because genes can be turned on or off in response to signals, and we’d like to know how fast the cell can respond to a given signal.

The time-dependent solution for protein concentration in this model can be found by simple integration,

Equation 5

where cP(0) is the concentration of protein present at time t = 0. (Obtaining this solution can be set as an exercise.)

Slide 7 shows the protein concentration as a function of time for two different starting concentrations, cP(0). In both cases, the protein concentration approaches the same steady-state value exponentially,

Equation 6

To work out how fast the cell can respond to a signal, let’s suppose that protein P is an enzyme that allows the cell to metabolise lactose. Initially, the gene, gP, is repressed because a repressor protein is bound to its promoter. We assume that initially no protein P is present: cP(0) = 0. At time zero, the cell detects some lactose and the repressor leaves the promoter, so the gene becomes activated. How quickly can the cell produce protein P and start metabolising lactose? If cP(0) = 0, then Equation 3 becomes

Equation 7

We define the rise time, trise, as the time it takes for protein P to reach half of its steady-state value. (Note here the similarities to equations that describe radioactive half-life and capacitor charging, which are well known from other areas of physics.)

Setting cP(t) to ½ cP (ss) and solving for trise, we obtain

Equation 8

which becomes, when we substitute in cP (ss) = k/μ,

trise = ln(2)/μ

Equation 9

This result tells us that the response time of this simple network is determined only by the protein-removal rate. For bacteria, protein removal is usually due to cell growth and division. As we saw earlier, the removal rate, μ, is typically ln(2)/τ, where τ is the cell cycle time (Equation 4). So the response time for bacterial gene networks is typically of the order of the cell cycle time, which is at least 30 min.

2.3 Negative autoregulation

We learned in the previous lecture that genes can be turned on and off by the binding of specific proteins to the DNA in the promoter region. In many cases, proteins actually turn off their own production (i.e. the protein product of a gene is a repressor that binds to its own gene and turns off protein production). This is an example of negative feedback and is called negative autoregulation. It turns out that for E. coli, and probably for other organisms too, negative autoregulation happens much more often than one would expect if the regulatory “connections” between genes were chosen at random. Why has negative autoregulation been selected by evolution as a favoured regulatory motif?

2.3.1 A differential equation model with negative autoregulation

To try to understand this, let’s write down the equivalent differential equation model for a protein that represses its own production. We recall from the previous lecture (Equation 3) that for a protein binding to a DNA binding site, the probability that the binding site is occupied is

Equation 10

where c/c0 is the concentration of protein (relative to some standard value, c0), β= 1/kBT (where kB is the Boltzmann constant) and δε is the change in energy when the protein binds. We can define a dissociation constant, Kd, as

Equation 11

For low concentrations (where c/c0 is very small), we can see that the probability, pbound, that the binding site is bound becomes proportional to the inverse of the dissociation constant: pboundc/Kd. This shows us that Kd is actually just the equilibrium constant for the dissociation of the protein from its binding site. The reason why this proportionality does not hold at higher concentrations is that the binding site becomes saturated with protein, as discussed in the previous lecture.

The more strongly the protein binds to its DNA binding site, the more negative δε will be. Strong negative autoregulation (large negative δε therefore corresponds to a small value of Kd.

Combining (10) and (11), we get

Equation 12

and the probability that the binding site is unoccupied is given by

Equation 13

Returning to our differential equation for the production and degradation of protein, the production rate is now proportional to the probability that the promoter binding site is not occupied by protein, so Equation 2 becomes

Equation 14

2.3.2 The steady-state protein concentration and robustness

We now have a nonlinear differential equation for the concentration of protein, cP(t). Let’s find out what the steady-state protein concentration is. Slide 8 shows a plot of the rate of change of cP versus cP for two values of the production rate, k. Also plotted are the results for a gene without negative autoregulation (which we saw in Slide 6).

As in Slide 6, we see that when the protein concentration, cP, is low, production dominates and dcP/dt is positive, while when the protein concentration is high, dcP/dt is negative, meaning that protein degradation dominates over production. For one particular value of protein concentration, cP = cP(ss), production and degradation are balanced (dcP/dt = 0), and this is the steady-state protein concentration.

We can see from Slide 8 that negative autoregulation affects the steady-state protein concentration in two important ways. First, the steady-state protein concentration is lower for the negatively autoregulated gene (shown in red) than for the unregulated gene (shown in blue). Second, when we compare the results for two different values of the production rate, k (solid and dotted lines), we can see that for the unregulated gene the steady-state protein concentration depends strongly on k (in fact, we know from our calculations above that it is proportional to k); while for the negatively autoregulated gene, cP(ss) changes only a little when k is changed by a factor of two. Both of these effects have important implications for the performance of the gene, as we shall see.

To get an expression for the steady-state protein concentration, cP(ss), for the negatively autoregulated gene, we set the rate of change of cP(t) (Equation 14) to zero,

Equation 15

The calculation is shown in Slide 9. The result is

Equation 16

For very strong autoregulation (where Kd is very small), the result reduces to

Equation 17

This result will be needed later.

Slide 9 shows cP(ss) as a function of the protein production rate, k, for several values of the dissociation constant, Kd. As the negative autoregulation gets stronger (as Kd decreases), the curves become flatter: the steady-state protein concentration becomes less dependent on the protein production rate.

In the cell, the protein production rate depends on the concentration of RNA polymerase, as well as the concentration of ribosomes, mRNA degradation enzymes, etc. All of these factors vary from cell to cell and over time inside any given cell. We therefore expect the protein production rate to fluctuate within and between cells. For a gene without negative autoregulation, this will cause the protein concentration to fluctuate, since cP(ss) is proportional to the production rate, k.

This problem can be avoided using negative autoregulation. Because the curve of cP(ss) versus k is much flatter in the case of negative autoregulation, the steady-state protein concentration will remain stable even if the intracellular environment (i.e. the protein production rate) fluctuates. In other words, negative autoregulation can make the performance of a gene robust to changes in protein production rate.

You may have noticed (as we see in Slide 9) that for negative autoregulation, cP(ss) does depend on the dissociation constant, Kd. Is this a problem for robustness? Probably not: we expect Kd to fluctuate much less than k because Kd depends only on how strongly the protein binds to its DNA binding site, which is determined by the structure of the protein and the sequence of the binding site.

2.3.3 Rise time

Negative autoregulation also has an important effect on the rise time, trise: the time the cell needs to turn the gene on (to the half-maximal protein level). We saw that for the unregulated gene this time was fixed by the protein-removal rate, trise = ln(2)/μ What happens for a negatively autoregulated gene? To calculate trise, in principle, we should solve the full version of Equation 15,

Equation 15

This is rather difficult. However, since we are interested in early times, when cP(t) is small, we can make the approximation that cP(t)/Kd <1,>

Equation 8

If we assume that the autoregulation is strong, we can substitute in our previous result,

Equation 17

to get

This result is plotted in Slide 10. As Kd decreases (i.e. as the negative autoregulation becomes stronger), trise decreases. This important result shows that negative autoregulation can help cells to respond more rapidly to changes in their environmental conditions than they would be able to without regulation. The units chosen in Slide 10 are rather arbitrary.

To get a feeling for some real numbers, a typical protein-removal rate, μ, in a bacterial cell would be 0.02/min, so the rise time for a typical protein without negative autoregulation would be ln(2)/μ (~35 min). While protein production rates and protein–DNA dissociation constants can vary enormously, a realistic value for k might be 0.2 molecules/min per cell volume and Kd might be 0.02 molecules per cell volume (for a protein that binds very strongly to its DNA binding site). The value of trise for a negatively autoregulated gene, assuming these parameter values, would then be 12.7 min: almost a factor of three faster than the gene without negative autoregulation.