In the standard coalescent E[δa] = 2(a - 1)/n for a < n (Saunders et al. For example, the middle term in the numerator of (9) is the covariance in coalescence time at site x for sequences i and j and at site y for sequences i and k; see Figure 2. This is the same result as given by Ohta and Kimura (1971) and Weir and Hill (1986) and is the expected linkage disequilibrium as the sample size tends to infinity.

2001), …

The mutation rate is a nuisance parameter that can be eliminated by taking the limit as μ → 0 (Nielsen 2000). (8) where the subscripts refer to the three configurations of sample chromosomes. Patterns of linkage disequilibrium (LD) across a genome have multiple implications for a population’s ancestral demography.

However, if the expectation is conditioned on intermediate allele frequencies (e.g., >10%), the two are in close agreement (see Figure 3 and Hudson 1985). When there are just two alleles at both loci, the square of the disequilibrium coefficient is independent of how alleles are defined; hence we consider the identity coefficients between the derived mutations (denoted by an asterisk). Population structure: Population structure increases linkage disequilibrium because of the correlations in coalescence times induced by coalescent events within subpopulations.
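The statement that the squared disequilibrium coefficient does not depend on how alleles are labeled can be checked numerically. The sketch below assumes the usual definition D = f_AB - f_A f_B; the function name `disequilibrium` and the eight-chromosome `sample` are hypothetical and chosen only for illustration.

```python
def disequilibrium(haplotypes, allele_x, allele_y):
    """Return D = f_AB - f_A * f_B for the chosen alleles at sites x and y."""
    n = len(haplotypes)
    f_a = sum(1 for hx, _ in haplotypes if hx == allele_x) / n
    f_b = sum(1 for _, hy in haplotypes if hy == allele_y) / n
    f_ab = sum(
        1 for hx, hy in haplotypes if hx == allele_x and hy == allele_y
    ) / n
    return f_ab - f_a * f_b


# Hypothetical sample of 8 chromosomes (assumption, not data from the paper).
# Each tuple is (allele at site x, allele at site y); 1 = derived, 0 = ancestral.
sample = [(1, 1), (1, 1), (1, 0), (0, 1), (0, 0), (0, 0), (0, 0), (1, 1)]

# Compute D for all four ways of choosing the "focal" allele at each site.
d_values = {
    (a, b): disequilibrium(sample, a, b) for a in (0, 1) for b in (0, 1)
}
d_squared = {labels: d * d for labels, d in d_values.items()}

for labels, d in d_values.items():
    print(labels, "D =", d, "D^2 =", d_squared[labels])
```

Relabeling the focal allele at one site flips the sign of D, so D itself depends on the labeling while D² does not. This is why the choice to track the derived mutations loses no generality in the two-allele case.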

I show that the effects of population growth, population bottlenecks, and population structure on linkage disequilibrium can be described through their effects on the covariance in coalescence times.

Also note that the identity-coefficient approach of Sved (1971) is quite different from that presented here, because he implicitly assumes that allele frequencies remain constant over time.

$$\mathrm{Var}(D) = E[f_{A(x)B(y)}^2] - 2E[f_{A(x)B(y)} f_{A(x)} f_{B(y)}] + E[f_{A(x)}^2 f_{B(y)}^2] = F_{x(ij)y(ij)} - 2F_{x(ij)y(ik)} + F_{x(ij)y(kl)}.$$

Under the standard coalescent, E[t] = 1, hence the ratio of the expectations (2) is … (7)

$$\sigma_d^2 = \frac{10 + \rho}{22 + 13\rho + \rho^2}.$$

As a result, the pattern of linkage disequilibrium in a genome is a powerful signal of the population genetic processes that are structuring it. So the ratio $E[t]^2/\mathrm{Var}(t)$ is reduced relative to the case of no bottleneck.
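To give a feel for how the expression σ_d² = (10 + ρ)/(22 + 13ρ + ρ²) behaves, the following minimal sketch evaluates it over a range of ρ. It assumes ρ is the population-scaled recombination rate between the two sites (conventionally 4Nr, which is an assumption here, not stated in this passage); the function name `sigma_d2` is hypothetical.

```python
def sigma_d2(rho):
    """Evaluate sigma_d^2 = (10 + rho) / (22 + 13*rho + rho^2).

    rho is assumed to be the population-scaled recombination rate
    between the two sites (an assumption for this illustration).
    """
    if rho < 0:
        raise ValueError("rho must be non-negative")
    return (10.0 + rho) / (22.0 + 13.0 * rho + rho * rho)


# Tabulate the expected value across several orders of magnitude of rho.
for rho in (0.0, 0.1, 1.0, 10.0, 100.0, 1000.0):
    print(f"rho = {rho:8.1f}  sigma_d^2 = {sigma_d2(rho):.6f}")
```

At ρ = 0 the expression equals 10/22 = 5/11, and for large ρ it decays approximately as 1/ρ, so the expected disequilibrium falls off steadily with increasing recombination between the sites.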

The question of how statistics of linkage disequilibrium relate to aspects of the underlying genealogy is therefore of considerable interest.

Combining Equations 7 and 8 gives an expression for $\sigma_d^2$: where $I_{x(ij)m}$ is the branch length (in generations) leading from the most recent common ancestor (MRCA) of sequences i and j to the MRCA of the entire sample and $E[T_x T_y]$ is the expected product of the total tree length at sites x and y.

Most importantly, if mutations have no effect on organismal fitness, the genealogy of a sample can be separated entirely from the mutational process. (4) The result provides an intuitive basis for understanding how linkage disequilibrium behaves under different demographic scenarios.

2001). Consequently, in growing populations fewer recombination events will influence the history of a randomly chosen pair of chromosomes from the sample, leading to higher correlations in coalescence times; see Table 1. Associations between alleles are generated by the stochastic nature of mutation and sampling in a finite population, as well as certain forms of geographical structure (e.g., Ohta 1982), and natural selection (e.g., Strobeck 1983). (10) Note that for finite sample sizes, the possibility that i, j, k, and l are not all distinct has to be taken into account (Hudson 1985). (6)