A formal mathematical treatment of natural selection:

This treatment follows those given in Futuyma (1979) and Maynard Smith (1989), with somewhat different symbols to cope with the limited ability of ASCII and HTML to represent the symbols commonly used.

For simplicity we will deal with an asexual, haploid population of a single species with non-overlapping generations. However, mathematical treatments of more complex situations (e.g. diploid organisms with overlapping generations) exist (Futuyma, 1979; Maynard Smith, 1989).

First we will look at the characteristics of population growth. Once we have equations for the growth of organisms in a population, we will add in genetic variability and selection.

In any given environment the density and persistence of a population of organisms depend on its capacity to increase and on the factors that limit the organisms’ abundance.

The population size of the organism is N.

The rate of increase of the population is r, where r is the difference between the rate of birth or cell division b and the rate of mortality m. Thus the rate of change dN/dt of a population of size N is

1) dN/dt = bN - mN

Since r = b - m, this becomes

2) dN/dt = rN

Then Nt, the population size at time t, depends on r and the initial size N0:

3) Nt = N0 e^(rt)
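As a minimal sketch of equation 3, the following Python snippet evaluates Nt for a few times t and also reports the doubling time ln(2)/r that follows from the equation. The function names and the values N0 = 100 and r = 0.5 are illustrative assumptions, not figures from the text.

```python
import math


def exponential_size(n0: float, r: float, t: float) -> float:
    """Population size at time t under equation 3: Nt = N0 * e^(r*t)."""
    return n0 * math.exp(r * t)


def doubling_time(r: float) -> float:
    """Time needed for the population to double, ln(2)/r (requires r > 0)."""
    return math.log(2) / r


# Illustrative values only (assumed, not taken from the text).
n0, r = 100.0, 0.5
for t in (0, 1, 2, 5, 10):
    print(t, round(exponential_size(n0, r, t), 1))
print("doubling time:", round(doubling_time(r), 3))
```

Because growth is exponential, the doubling time depends only on r and not on the current population size.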

In this discussion we are limiting ourselves to organisms with non-overlapping generations, but modifications for organisms with overlapping generations exist.

Equation 3 describes an exponential growth curve, and is appropriate for modelling small populations early in their growth. However, no population can expand indefinitely; it will approach an equilibrium population size K. K is the carrying capacity of the environment and is determined by one or more of the following: resource limitation, predation, and other environmental pressures such as toxins or parasites. Usually, as the population approaches K, the birth/division rate decreases and the death rate increases. The effect of carrying capacity is dealt with by the logistic equation.

The observed rate of increase is r_obs = r - cN, where c is a constant and r/c is K, the carrying capacity.

Substituting r_obs for r in equation 2 gives dN/dt = (r - cN)N. Rearranging gives dN/dt = rN(1 - cN/r), and as r/c = K this gives us the logistic equation.

4) dN/dt = rN(1 - N/K)

Note that when N is small compared to K, equation 4 reduces to equation 2. Plotting equation 4 gives a sigmoidal curve, as seen in figure 1 for total population. Logistic curves for population growth are seen under these conditions for organisms as diverse as bacteria and rabbits.
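The behaviour of equation 4 can be sketched numerically. The snippet below integrates the logistic equation with simple Euler steps and prints the result next to the exponential curve of equation 3, so that the two can be seen to agree while N is small and to diverge as N approaches K. The step size, the function names and the values N0 = 10 and r = 0.5 are assumptions chosen for illustration; K = 100000 matches the value used in the figures.

```python
import math


def logistic_rate(n: float, r: float, k: float) -> float:
    """Right-hand side of equation 4: dN/dt = r*N*(1 - N/K)."""
    return r * n * (1.0 - n / k)


def simulate_logistic(n0, r, k, t_end, dt=0.01):
    """Integrate equation 4 with Euler steps; returns a list of (t, N) pairs."""
    steps = int(round(t_end / dt))
    n = n0
    out = [(0.0, n)]
    for i in range(1, steps + 1):
        n += logistic_rate(n, r, k) * dt
        out.append((i * dt, n))
    return out


# Illustrative values (N0 and r are assumed; K is the value used in the figures).
n0, r, k = 10.0, 0.5, 100000.0
trajectory = simulate_logistic(n0, r, k, t_end=40.0)

# Print every 500th step (every 5 time units) next to the exponential curve.
for t, n in trajectory[::500]:
    expo = n0 * math.exp(r * t)
    print(f"t={t:5.1f}  logistic N={n:12.1f}  exponential N={expo:16.1f}")
```

Euler integration is used only because it is short and easy to read; a smaller step or a library ODE solver would give a more accurate curve.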


In the above examples we have assumed that the population is genetically uniform, or at least that any genetic variation is neutral and has no effect on the rate of increase r of the variants. We are also assuming that generations are non-overlapping (e.g. viruses, bacteria, yeast and many insect species) and that we are dealing with haploid organisms (e.g. viruses, bacteria, some yeast and some insects).

Now let us assume that our population consists of organisms carrying one or other of two variants of a single gene locus L, allele A and allele B. In a given environment E, variant A has a rate of increase of Ra and variant B has a rate of increase of Rb. That is, variants A and B differ in reproductive success. It doesn’t matter whether this is due to differences in birth/division rate, death rate, or both; what matters is the combined effect (later we can treat these separately). Real world examples include the effect of streptomycin on S. typhimurium, where streptomycin reduces R for bacteria with the AAA42 variant of the rpsL gene far more than it reduces R for bacteria with the ACA42 variant, via an increase in death rate and a decrease in division rate.

The variants are initially present in numbers Na and Nb respectively, and the total population is N = Na + Nb. The proportions of A and B at time 0 are Na/N = p and Nb/N = q. The growth rate of the entire population is pRa + qRb = Rbar.

After one generation Na’ = NaRa and N’ = N Rbar, so the new proportion of A is p’ = NaRa/(N Rbar). Substituting Na = pN and Rbar = pRa + qRb gives

5) p’ = pNRa/(N(pRa + qRb)) = pRa/(pRa + qRb)

The change in proportion of A is

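6) Dp = p’ - p = pRa/(pRa + qRb) - p = pq(Ra - Rb)/Rbar

The last step uses p + q = 1, so that Ra - Rbar = Ra - pRa - qRb = q(Ra - Rb).

It can immediately be seen that as long as Ra > Rb, Dp is positive and p increases until A has replaced B.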
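Equations 5 and 6 can be checked numerically with a short sketch. The snippet below iterates equation 5 generation by generation and confirms that the step it takes equals the Dp of equation 6. The function names are illustrative; the values Ra = 2, Rb = 1.5 and a starting proportion of 5% for allele A are those used in Figure 1.

```python
def next_frequency(p: float, ra: float, rb: float) -> float:
    """Equation 5: proportion of A after one generation."""
    q = 1.0 - p
    r_bar = p * ra + q * rb  # mean rate of increase, Rbar
    return p * ra / r_bar


def delta_p(p: float, ra: float, rb: float) -> float:
    """Equation 6: change in the proportion of A over one generation."""
    q = 1.0 - p
    r_bar = p * ra + q * rb
    return p * q * (ra - rb) / r_bar


ra, rb = 2.0, 1.5  # values used in Figure 1
p = 0.05           # starting proportion of allele A, as in Figure 1
for generation in range(36):
    if generation % 5 == 0:
        print(f"generation {generation:2d}: p = {p:.4f}, Dp = {delta_p(p, ra, rb):.5f}")
    # The step taken by equation 5 should match equation 6.
    assert abs((next_frequency(p, ra, rb) - p) - delta_p(p, ra, rb)) < 1e-12
    p = next_frequency(p, ra, rb)
```

Dp is largest at intermediate values of p and shrinks towards zero as p approaches 0 or 1, which is why the frequency curve is sigmoidal.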

Thus there is selective replacement of variant B by variant A. This can be visualized in terms of organism numbers by iteratively calculating Rbar from p’ at each generation and substituting this for r in equation 4.

Worked examples of these can be seen in figures 1 and 2 and in the table below for different values of Rb with constant Ra and K.


Figure 1 A) The left hand panel shows the change over time of total organism number (N; yellow), the number of organisms carrying Allele A (Na; purple) and the number of organisms carrying Allele B (Nb) when Ra = 2, Rb = 1.5 and K = 100000, and the population starts with 5% of the organisms carrying allele A. B) The right hand panel shows the proportions of Allele A (purple) and Allele B (blue) over time when Ra = 2, Rb = 1.5 and K = 100000.


Figure 2 This graph shows the change over time of the number of organisms carrying Allele A (Na; purple) and the number of organisms carrying Allele B (Nb) when Ra = 2 and K = 100000 are held constant, and Rb is either 1.0, 1.5 or 1.75. The population starts with 5% of the organisms carrying allele A.
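The kind of calculation behind Figures 1 and 2 can be sketched as follows: at each generation Rbar is computed from the current p, p is updated with equation 5, and the total population grows at rate Rbar under the limit K. The exact discretisation used for the figures is not given in the text, so the density-limited update below (a Beverton-Holt step, the discrete-generation counterpart of the logistic equation) is an assumption, as are the function name and the starting population of 100 organisms.

```python
def simulate_selection(ra, rb, k, n0, p0, generations):
    """Return rows of (generation, N, Na, Nb, p) for two variants sharing K."""
    n, p = n0, p0
    rows = [(0, n, p * n, (1.0 - p) * n, p)]
    for generation in range(1, generations + 1):
        q = 1.0 - p
        r_bar = p * ra + q * rb   # mean rate of increase this generation
        p = p * ra / r_bar        # equation 5
        # ASSUMPTION: Beverton-Holt step standing in for equation 4.
        # It multiplies N by Rbar when N << K and levels off at N = K.
        n = r_bar * n / (1.0 + (r_bar - 1.0) * n / k)
        rows.append((generation, n, p * n, (1.0 - p) * n, p))
    return rows


# Ra, K and the 5% starting proportion follow the figure captions;
# the starting size n0 = 100 is a hypothetical value.
for rb in (1.0, 1.5, 1.75):
    print(f"Rb = {rb}")
    for generation, n, na, nb, p in simulate_selection(2.0, rb, 100000.0, 100.0, 0.05, 35)[::5]:
        print(f"  gen {generation:2d}: N={n:9.0f}  Na={na:9.0f}  Nb={nb:9.0f}  p={p:.4f}")
```

In this sketch the proportions follow equation 5 regardless of density, so Nb first rises with the growing population and then falls once N is close to K and A continues to increase in frequency.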

It is also immediately apparent that selective replacement can only occur if variant A and variant B of the organism are present in the same environment and limited by common environmental factors (K). It doesn’t matter whether this is due to a common limiting resource, a common predator or a common environmental toxin. Note that there doesn’t have to be direct or indirect competition between the organisms with different alleles.

This can be more easily seen by rearranging equation 4 to follow the individual populations.

7) dNa/dt = RaNa(1 - (Na + Nb)/K)

8) dNb/dt = RbNb(1 - (Na + Nb)/K)

K applies to both subpopulations, so the total population must be considered when applying K. If the environmental factors are NOT shared then the equations become:

9) dNa/dt = RaNa(1 - Na/Ka)

10) dNb/dt = RbNb(1 - Nb/Kb)

and the two populations will co-exist indefinitely; there will be no selective replacement of one by the other.
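The case where the limiting factors are not shared can be sketched numerically. The snippet below integrates the two independent logistic equations with Euler steps and shows each variant settling at its own carrying capacity, so that neither replaces the other. The function name, the step size, the starting numbers and the values Ka = 60000 and Kb = 40000 are hypothetical, chosen only for illustration.

```python
def simulate_independent(ra, rb, ka, kb, na0, nb0, t_end, dt=0.01):
    """Euler integration of two logistic equations with separate limits Ka and Kb."""
    na, nb = na0, nb0
    for _ in range(int(round(t_end / dt))):
        dna = ra * na * (1.0 - na / ka)  # A is limited only by Ka
        dnb = rb * nb * (1.0 - nb / kb)  # B is limited only by Kb
        na += dna * dt
        nb += dnb * dt
    return na, nb


# Hypothetical carrying capacities and starting numbers; Ra and Rb as in Figure 1.
na, nb = simulate_independent(2.0, 1.5, 60000.0, 40000.0, 5.0, 95.0, t_end=40.0)
print(f"Na = {na:.0f}, Nb = {nb:.0f}, proportion of A = {na / (na + nb):.3f}")
```

The final proportion of A is set by Ka/(Ka + Kb) rather than by the difference between Ra and Rb; the rates only determine how quickly each variant reaches its own limit.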

Again, as I have stated earlier, this is a simple formal treatment of selection in haploid organisms. It works well for predicting the course of natural selection in bacteria (for example the effect of streptomycin), haploid yeast and insects in simple (often experimental) environments.

I have presented equations that describe an expanding population in their present form, but these equations will also describe a population at equilibrium.

These equations can also be modified appropriately for more complex environments, diploid organisms, overlapping generations etc., but for space and simplicity I will ignore these and direct interested readers to Futuyma, Maynard Smith, or Li, where these formulations are presented. There are also a number of programs available on the web where the modification of these equations for diploid organisms is available, either directly or as a downloadable program. Nonetheless, this simple model still demonstrates selection, and is applicable to real world examples.


I have shown a simple, formal model of natural selection applicable to haploid organisms. In fact, it is generalisable to RNA strands in the Qbeta replicase system (Maynard Smith, 1989) and to computer programs.

TABLE: Frequencies of Allele A (p) and Allele B (q) at the beginning and after 35 generations with K=100000 and Rb as indicated.

Ra=2, Rb=1     Ra=2, Rb=1.5     Ra=2, Rb=1.75

When Rb = 1, p becomes 1.0 at generation 13
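Frequencies of the kind tabulated above can be computed directly from equation 5. Dividing p’ by q’ shows that the ratio p/q is multiplied by Ra/Rb each generation, which gives a closed form for any number of generations. The sketch below uses that closed form; the function name is illustrative, and the starting proportion of 5% for allele A is assumed from the figure captions. It follows the discrete recursion of equation 5 only and is not a reproduction of the original spreadsheet.

```python
def frequency_after(p0: float, ra: float, rb: float, generations: int) -> float:
    """Proportion of A after the given number of generations of equation 5.

    Uses the closed form p/q = (p0/q0) * (Ra/Rb)**generations.
    """
    odds = (p0 / (1.0 - p0)) * (ra / rb) ** generations
    return odds / (1.0 + odds)


p0 = 0.05  # ASSUMPTION: starting proportion taken from the figure captions
for rb in (1.0, 1.5, 1.75):
    p = frequency_after(p0, 2.0, rb, 35)
    print(f"Ra=2, Rb={rb}: p = {p:.4f}, q = {1.0 - p:.4f}")
```

Note that K does not appear in this calculation: under equation 5 the proportions depend only on the ratio Ra/Rb and the starting proportion.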


Futuyma, DJ (1979) Evolutionary Biology, Sinauer Associates, Chapters 4 and 12

Maynard Smith, J (1989) Evolutionary Genetics, Oxford University Press, Chapter 2, pp. 14-27 (see also Chapters 3-6)

Li, W-H (1997) Molecular Evolution, Sinauer Associates


An Excel spreadsheet in Office 2000 format containing numerical examples of the equations presented here which you can play with. http://home.mira.net/~reynella/selec_pub.xls

A zip file which contains versions in Excel5.0/95 format, Excel97 format and Excel2000 format. http://home.mira.net/~reynella/selec.zip

Models of natural selection in diploid organisms:

Downloadable population genetics program using the Java virtual machine which includes simulations of the logistic equation, and selection in diploid organisms. http://www.cbs.umn.edu/populus/