Population genetic analysis of dominant marker data
- Publication Year :
- 2011
Abstract
- Since their development in the mid-1990s, amplified fragment length polymorphism (AFLP) markers have been widely used in genetic studies because of the speed and ease with which a large number of polymorphic markers can be generated. Genetic variation is generally represented by the presence or absence of amplified DNA fragments, whose signals behave as dominant markers, as in a number of other marker systems such as random amplified polymorphic DNA (RAPD), inter simple sequence repeat (ISSR) or directed amplification of minisatellite DNA polymerase chain reaction (DAMD-PCR). Each locus is thus less informative than a typical multi-allelic microsatellite (simple sequence repeat; SSR) locus, although the large number of AFLP markers available across the genome and their largely random distribution offset this drawback. Moreover, the use of dominant markers in population genetic studies is hampered by the lack of complete genotypic information. Recent advances in statistical methods make analyses of population diversity and structure based on dominant marker data possible, although some problems of bias cannot be completely eliminated. The two basic approaches are 'band-based' methods (direct study of AFLP band presences or absences at the individual level) and 'frequency-based' methods (estimation of allele frequencies from dominant biallelic data at the population level). Genetic diversity within populations can be estimated using Shannon's information index (Shannon, 1948; Lewontin, 1972), based on the frequencies of band presence/absence at each marker within the population. Shannon's information index can also be used to measure the total diversity (HT) and the average intrapopulation diversity (HP), from which the proportions of diversity within (HP/HT) and among populations [(HT–HP)/HT] can be obtained.
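The Shannon-based partition described above can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: the index is computed per locus as the entropy of band presence versus absence (natural logarithm) and averaged over loci, HP is the unweighted mean over populations, and HT is computed from all individuals pooled. The function names are hypothetical, and other variants of the index exist in the literature.

```python
import math

def locus_entropy(p):
    """Shannon entropy (natural log) of a band present with frequency p."""
    # Terms with zero frequency contribute nothing (limit of x*log(x) is 0).
    return -sum(x * math.log(x) for x in (p, 1.0 - p) if x > 0)

def shannon_index(bands):
    """Mean per-locus Shannon index for a list of 0/1 band profiles."""
    n_ind = len(bands)
    n_loci = len(bands[0])
    # Band-presence frequency at each locus.
    freqs = [sum(ind[j] for ind in bands) / n_ind for j in range(n_loci)]
    return sum(locus_entropy(p) for p in freqs) / n_loci

def partition_diversity(populations):
    """Return (HT, HP, HP/HT, (HT-HP)/HT) for a list of populations.

    Assumes HT > 0, i.e. at least one locus is polymorphic overall.
    """
    h_p = sum(shannon_index(pop) for pop in populations) / len(populations)
    pooled = [ind for pop in populations for ind in pop]
    h_t = shannon_index(pooled)
    return h_t, h_p, h_p / h_t, (h_t - h_p) / h_t

# Hypothetical toy data: two populations fixed for different bands.
pop_a = [[1, 0], [1, 0]]
pop_b = [[0, 1], [0, 1]]
h_t, h_p, within, among = partition_diversity([pop_a, pop_b])
print(h_t, h_p, within, among)
```

In this toy case each population is monomorphic, so all diversity falls among populations.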
Pairwise genetic distances among individuals can be calculated from binary (presence/absence) data using Dice's coefficient (Dice, 1945), which takes into account only positive matches between two individuals and is therefore unaffected by potentially homoplasious absent bands. The resulting Dice distance matrix can then be used as input for cluster analysis (using UPGMA or the neighbour-joining method), principal co-ordinate analysis (PCoA) and the analysis of molecular variance (AMOVA). AMOVA partitions the total variance into within-population and among-population variance components, which can be tested statistically by non-parametric randomisation tests. The simplest method for estimating allele frequencies from dominant data is the square-root procedure, which uses the inbreeding coefficient (known a priori) and the square root of the frequency of null homozygotes (i.e. of band absences) to calculate the frequency of the null allele. However, a Bayesian method with a non-uniform prior distribution of allele frequencies (Zhivotovsky, 1999) gives better estimates of marker allele frequencies. Genetic diversity and population genetic structure can be computed following the treatment of Lynch and Milligan (1994). Expected heterozygosity, or Nei's gene diversity (Nei, 1973), can be calculated for each population, and the total gene diversity (HT), the average gene diversity within populations (HW), the average gene diversity among populations in excess of that observed within populations (HB) and, finally, Wright's FST can be obtained. The marker allele frequencies can also be used as input for cluster analysis at the population level. Yet another Bayesian method, developed by Holsinger et al. (2002), can be used to estimate FIS (the inbreeding coefficient) from dominant marker data. However, these estimates of FIS often seem to be unreliable and have to be regarded with great caution.
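As a rough illustration of the band-based and frequency-based calculations mentioned above, the following Python sketch computes a Dice distance between two band profiles, a square-root estimate of the null-allele frequency, and the per-locus expected heterozygosity. It is a sketch under stated assumptions: the function names are hypothetical, the null-allele estimate assumes that the frequency of band absences x satisfies x = (1 − F)q² + Fq for a known inbreeding coefficient F, and the small-sample bias corrections of Lynch and Milligan (1994) and the Bayesian estimator of Zhivotovsky (1999) are not implemented.

```python
import math

def dice_distance(a, b):
    """1 - Dice similarity for two 0/1 band profiles of equal length.

    Only shared presences count as matches; shared absences are ignored.
    """
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    total = sum(a) + sum(b)  # bands in a plus bands in b
    if total == 0:
        raise ValueError("neither profile contains a band")
    return 1.0 - 2.0 * shared / total

def null_allele_frequency(absent_fraction, f_is=0.0):
    """Square-root estimate of the null-allele frequency q.

    Solves (1 - F) q^2 + F q = x for q, where x is the fraction of
    individuals lacking the band. With F = 0 this reduces to sqrt(x).
    """
    x = absent_fraction
    if f_is >= 1.0:
        return x  # complete inbreeding: no heterozygotes, so q = x
    disc = f_is ** 2 + 4.0 * (1.0 - f_is) * x
    return (-f_is + math.sqrt(disc)) / (2.0 * (1.0 - f_is))

def expected_heterozygosity(q):
    """Expected heterozygosity 2pq at a biallelic locus (no bias correction)."""
    return 2.0 * q * (1.0 - q)

# Hypothetical example values.
print(dice_distance([1, 1, 0, 0], [1, 0, 1, 0]))
q = null_allele_frequency(0.25)            # Hardy-Weinberg case, F = 0
print(q, expected_heterozygosity(q))
print(null_allele_frequency(0.25, f_is=0.5))
```

Averaging the per-locus heterozygosity over loci gives a simple within-population diversity value; in practice, dedicated software implementing the published corrections would normally be preferred.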
Finally, the model-based clustering program STRUCTURE (Pritchard et al., 2000) can be used to estimate the underlying population structure from individual multilocus genotypes. This Bayesian method identifies clusters of genetically similar individuals without prior knowledge of their population affiliation. The approach assumes that K populations contribute to the gene pool of the sampled populations. Spatial genetic structure can be analysed by a number of methods, including spatial pattern analysis of genetic diversity (Hardy, 2003; based on a kinship coefficient for dominant markers among individuals), isolation-by-distance analysis (Rousset, 1997; based on the FST matrix among populations) and Bayesian spatial modelling of genetic population structure (Corander et al., 2008).
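The isolation-by-distance analysis mentioned above is commonly carried out by regressing FST/(1 − FST) on the logarithm of geographic distance between population pairs. The sketch below assumes that form (appropriate for a two-dimensional habitat) and uses a hypothetical function name and made-up input values; it returns only the ordinary least-squares slope and does not perform the permutation (Mantel-type) test that would normally assess significance.

```python
import math

def ibd_slope(fst, dist):
    """Least-squares slope of FST/(1 - FST) on ln(distance).

    fst and dist are symmetric square matrices (lists of lists) indexed
    by population; only the upper triangle (i < j) is used.
    """
    xs, ys = [], []
    n = len(fst)
    for i in range(n):
        for j in range(i + 1, n):
            xs.append(math.log(dist[i][j]))
            ys.append(fst[i][j] / (1.0 - fst[i][j]))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical pairwise FST values and distances (km) for three populations.
fst = [[0.00, 0.05, 0.12],
       [0.05, 0.00, 0.08],
       [0.12, 0.08, 0.00]]
dist = [[0.0, 10.0, 80.0],
        [10.0, 0.0, 35.0],
        [80.0, 35.0, 0.0]]
print(ibd_slope(fst, dist))
```

A positive slope is consistent with differentiation increasing with distance, but with so few population pairs the value carries little weight on its own.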
Details
- Language :
- English
- Database :
- OpenAIRE
- Accession number :
- edsair.57a035e5b1ae..0f5bd5fd96aca7cce399c51c85406ff6