Decay of linkage disequilibrium as a function of physical distance, Figure 2. bioRxiv. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH or NRF. (2019) Discovery of ongoing selective sweeps within Anopheles mosquito populations using deep learning. Demographically, post-agriculture, India has experienced huge recent population expansion. Molecular Biology and Evolution 36 (9), 2040-2052. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. Interestingly, some of these highly differentiated SNPs are known to be associated with diseases/traits (as inferred from GWAS catalogue; Welter et al. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. PCA showing five KGP Phase 3 Indian populations and five caste-stratified populations from Andhra Pradesh from Moorjani study (Moorjani et al. Role of social hierarchy in the population substructure observed in the ITU. Taedong Yun et al. The second, more common, genetic variants have a mild effect and are thought to be implicated in complex traits (e.g. 2011;43:1193–1201. 2016). This page provides information about data generated by phase 2 of the Anopheles gambiae 1000 Genomes Project (Ag1000G), an international collaboration working to discover natural genetic variation in malaria mosquito populations and build an open data resource for mosquito research and surveillance. They proposed that the GIH subgroup that falls outside the main cline of Indian groups might harbor novel ancestry in addition to ASI and ANI ancestries but had no comparative data to test this hypothesis.
2016), and have been the melting pot of disparate ancestries originating from different parts of Eurasia and South-East Asia (Basu et al. 2015). 2015) and the results were visualized using GENESIS (Buchmann and Hazelhurst 2014). Power of discovery and heterozygote genotype discordance, Extended Data Figure 4. Summary of the callset generation pipeline, Extended Data Figure 2. The standardized number of variant sites per genome, partitioned by population…, Extended Data Figure 5. Molecular genetic studies of complex phenotypes. Supplementary figure S6a, Supplementary Material online, shows the GIH_1 and GIH_2 subgroups along with one representative population (least admixed) from each ancestral group from the Basu 2016 study (the ASI is represented by PNY, ATB by JAM, AAA by BIR and the ANI by KSH) along with STU and BEB samples from KGP. 2016). We performed a PCA on the five KGP-IS populations to identify possible population structure. The proportion of inferred ancestral component for each subgroup was estimated using ADMIXTURE (table 2), which clearly showed differences in the proportions of ancestral components for the two north-western IS populations. Both of these datasets were used for PC analysis. 2012 Feb;159(2):64-79. doi: 10.1016/j.trsl.2011.08.001. Nature Biotechnology 36, 1062-1066. The PC analysis clearly demonstrates that the two GIH subgroups show a difference in distribution across the IS north-south cline (Supplementary Data, Supplementary Material online). Cell. A map of human genome variation from population-scale sequencing. CEU NA12878 (daughter) and mother NA12892 and father NA12891. Social customs and hierarchies have also led to a complex diversity of largely endogamous populations which tolerate some degree of porosity (Reich et al.
If you use these data, please cite the following publication: If you use the CNV data, please also cite the following publication: If you have any technical or scientific questions regarding the data, or would like to report an issue, please email Chris Clarkson (cc28 [at] sanger.ac.uk) or raise an issue via GitHub. Kyros Kyrou et al. Weir and Cockerham’s FST statistic (Weir and Cockerham 1984) was calculated to estimate the genetic differentiation across all populations (GIH, PJL, BEB, STU, ITU, CEU and CHB) using PLINK (Chang et al. 2009; Narang et al. The upper caste has been shown to demonstrate significantly higher ANI ancestry in comparison to lower castes from the same geographic region, suggesting a relationship between the history of caste-formation and admixture among ancestral components (Reich et al. Front Genet. Although the KGP populations have a wide ethno-linguistic spread, there is a need to investigate the effect of sampling multiple diaspora populations to assess the genetic diversity of India. (2018) The genetic architecture of target-site resistance to pyrethroid insecticides in the African malaria vectors Anopheles gambiae and Anopheles coluzzii. Therefore, it was also necessary to investigate whether the classification of populations based only on language and geography, as used in the KGP, is sufficient to define ethnolinguistic units for the IS populations. Genomic Analysis in the Age of Human Genome Sequencing. Interestingly, no such difference was observed between ITU_1 and ITU_2. (2016) "Radical remodeling of the Y chromosome in a recent radiation of malaria mosquitoes." At k = 5, the colours green, yellow, magenta, blue and red represent the ANI, AAA, ASI, ATB and Andamanese archipelago ancestries, respectively. Genetic diversity on the Indian subcontinent is represented by four distinct ancestry components: ANI, ASI, ATB and AAA.
Genome variation and population structure among 1,142 mosquitoes of the African malaria vector species, Whole-genome sequencing reveals high complexity of copy number variation at insecticide resistance loci in malaria mosquitoes, A high throughput multi-locus insecticide resistance marker panel for tracking resistance emergence and spread in Anopheles gambiae, Assessing connectivity despite high diversity in island populations of a malaria mosquito, In Silico Karyotyping of Chromosomally Polymorphic Malaria Mosquitoes in the Anopheles gambiae Complex, Discovery of ongoing selective sweeps within Anopheles mosquito populations using deep learning, Robust Estimation of Recent Effective Population Size from Number of Independent Origins in Soft Sweeps, A CRISPR–Cas9 gene drive targeting doublesex causes complete population suppression in caged Anopheles gambiae mosquitoes, The genetic architecture of target-site resistance to pyrethroid insecticides in the African malaria vectors Anopheles gambiae and Anopheles coluzzii, Improved non-human variant calling using species-specific DeepVariant models, Massive introgression drives species radiation at the range limit of Anopheles gambiae, Panoptes: web-based exploration of large scale genome variation data, Radical remodeling of the Y chromosome in a recent radiation of malaria mosquitoes, Anopheles gambiae 1000 Genomes Project - phase 2 data resource. 2015), respectively, as per PubMed accessed in September 2016). While the clusters for the other two populations show similar features and it may be reasonable to speculate that they also result from social hierarchy or caste stratification, it is not possible to assess the role of caste as the basis for the observed subgrouping conclusively because of the unavailability of caste stratified data from these regions. 
However, the ITU subgroups, once again correlating to the admixture analysis results, do not show any separation along the north-south cline (Supplementary Data, Supplementary Material online). This resource will support genome-wide association studies and other studies relating genetic variation to health and disease. Alexander T. Xue et al. 2003, 2016; Sengupta et al. All five IS populations showed varying proportions of “European like” contributions, whereas the BEB, as expected, showed the presence of a significant “East-Asian like” genetic component. 2016), there is a scarcity of publicly available whole genome sequence data from the Indian subcontinent. Comparative analyses show that despite the distinct geographic origins of the KGP-IS populations, the ANI component is predominantly represented in this dataset. Though differences in the sample sizes could have potentially introduced errors and inflated the difference (especially for the ITU), a significant portion of these differences can be expected to be real. To study the genetic distance between the subgroups, we estimated the pairwise Weir and Cockerham’s FST statistic (Weir and Cockerham 1984) between the five KGP-IS populations and the subgroups of the three populations (table 1). Genes showing very…, Extended Data Figure 8. (2016), we have demonstrated that the five populations in the KGP, although aimed at representing the spectrum of Indian genetic diversity by wide geographic sampling, primarily capture genetic diversity from the ANI group, leaving the other three groups largely unrepresented. (2019) In Silico Karyotyping of Chromosomally Polymorphic Malaria Mosquitoes in the Anopheles gambiae Complex. D.S. Two LD pruned datasets were used for this analysis, the first one included the five KGP-IS populations only, whereas the second included two additional global populations CEU and CHB (both datasets had about 1 million variants).
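The pairwise Weir and Cockerham FST mentioned above was computed with PLINK in the source. As a rough illustration of what the estimator does for biallelic SNPs, here is a minimal pure-Python sketch. The function name `weir_cockerham_fst`, the nested-list genotype layout, and the choice to combine SNPs as a ratio of summed variance components are assumptions made for this example; it is not PLINK's implementation and is not meant for datasets of about 1 million variants.

```python
import math
from typing import Sequence


def weir_cockerham_fst(pops: Sequence[Sequence[Sequence[int]]]) -> float:
    """Weir and Cockerham (1984) FST estimate for biallelic SNPs.

    Hypothetical input layout: pops[k][i][j] is the alternate-allele count
    (0, 1 or 2) of individual i in population k at SNP j.
    """
    r = len(pops)                                # number of populations
    sizes = [len(pop) for pop in pops]           # individuals per population
    n_total = sum(sizes)
    n_bar = n_total / r                          # mean sample size
    n_c = (n_total - sum(n * n for n in sizes) / n_total) / (r - 1)
    n_snps = len(pops[0][0])

    num = 0.0  # sum over SNPs of the between-population component a
    den = 0.0  # sum over SNPs of a + b + c
    for j in range(n_snps):
        # Per-population allele frequency and observed heterozygosity.
        p = [sum(ind[j] for ind in pop) / (2 * len(pop)) for pop in pops]
        h = [sum(1 for ind in pop if ind[j] == 1) / len(pop) for pop in pops]

        p_bar = sum(n * pk for n, pk in zip(sizes, p)) / n_total
        s2 = sum(n * (pk - p_bar) ** 2 for n, pk in zip(sizes, p)) / ((r - 1) * n_bar)
        h_bar = sum(n * hk for n, hk in zip(sizes, h)) / n_total

        base = p_bar * (1 - p_bar) - (r - 1) / r * s2
        a = n_bar / n_c * (s2 - (base - h_bar / 4) / (n_bar - 1))
        b = n_bar / (n_bar - 1) * (base - (2 * n_bar - 1) / (4 * n_bar) * h_bar)
        c = h_bar / 2

        num += a
        den += a + b + c

    # If every SNP is monomorphic, the estimate is undefined.
    return num / den if den != 0 else math.nan
```

With this sketch, two populations fixed for opposite alleles at a SNP give an estimate of 1, while two samples with identical genotypes give a slightly negative value, which is expected behaviour for this estimator at small sample sizes.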
2005; Reich et al. 2015).