Australia: The Land Where Time Began
Analysis of Human Sequence Data Reveals 2 Pulses of Archaic Denisovan Admixture
Anatomically modern humans (AMH) interbred with Neanderthals and with Denisovans, a related archaic population. The genomes of several Neanderthals and 1 Denisovan have been sequenced, and these reference genomes have been used to detect genetic material that has introgressed into the genomes of present-day humans. Introgressed segments can also be detected without the use of reference genomes, which is advantageous for finding introgressed segments that are not closely related to the archaic genomes that have been sequenced. Browning et al. applied a new reference-free method for the detection of archaic introgression to 5,639 whole-genome sequences from Eurasia and Oceania. In populations from East and South Asia and in Papuans they found that Denisovan ancestry comprises 2 components with differing similarity to the sequenced Altai Denisovan individual. This indicates at least 2 distinct instances of admixture from Denisovans, involving Denisovan populations that had different levels of relatedness to the sequenced Altai Denisovan.
Sequencing of the Neanderthal genome (Green et al., 2010; Prüfer et al., 2014), the Denisovan genome (Reich et al., 2010), and several modern human genomes from Eurasia (Fu et al., 2015) has confirmed that archaic hominins left their mark in the genomes of modern humans (Plagnol & Wall, 2006; Sankararaman et al., 2014; Vernot & Akey, 2014; Vernot et al., 2016). Present-day Eurasian individuals inherited ~2% of their genome from Neanderthals (Green et al., 2010), and individuals from Oceania inherited ~5% of their genome from Denisovans (Reich et al., 2010). Suggestive evidence indicates that admixture from other, as yet unidentified, hominin species occurred in Africa (Hammer et al., 2011; Hsieh et al., 2016; Lachance et al., 2012; Plagnol & Wall, 2006; Wall et al., 2009). In order to understand the functional, phenotypic and evolutionary consequences of admixture, it is necessary to identify the specific haplotypes and alleles that were inherited from archaic hominin ancestors (Huerta-Sánchez et al., 2014; Juric et al., 2016; Sankararaman et al., 2014; Simonti et al., 2016; Vernot & Akey, 2014). Approaches to identifying introgressed haplotypes include methods that explicitly incorporate reference archaic hominin sequences and reference-free methods that do not use such information. The method of Sankararaman et al., which identifies archaic haplotypes by comparing modern human haplotypes to a reference archaic sequence, is an example of the former category. The latter category includes the S* statistic (Plagnol & Wall, 2006), which searches for the mutational signature that ancient admixture leaves in the genomes of present-day humans.
S* is powerful for finding introgressed haplotypes in the absence of an archaic reference genome because it leverages the mutational characteristics typical of introgressed haplotypes. Because of the long divergence time between Neanderthals and modern humans, Neanderthals carry many alleles that are specific to their lineage. Such alleles are present on introgressed haplotypes, though they are absent or rare in genomes from Africa. Also, because the admixture was recent, introgressed haplotypes are expected to be retained without recombination over distances of approximately 50 kb on average (Sankararaman et al., 2012), which results in high levels of linkage disequilibrium (LD) between Neanderthal-specific alleles in the genomes of non-African humans.
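The figure of about 50 kb can be related to the time since admixture with a back-of-the-envelope calculation. The sketch below assumes a simple model in which the distance to the nearest recombination breakpoint on an introgressed haplotype is exponentially distributed, with a mean of 1/(r·t) base pairs after t generations at a recombination rate of r per base pair per generation. The function name and the values r = 1e-8 and t = 2,000 generations are illustrative assumptions, not figures taken from Browning et al.

```python
def expected_tract_length_bp(generations, recomb_rate_per_bp=1e-8):
    """Mean length (bp) of an unbroken introgressed tract.

    Assumes a constant recombination rate and that every crossover since
    admixture ends the tract (a deliberate simplification).
    """
    return 1.0 / (recomb_rate_per_bp * generations)


# Hypothetical inputs: 2,000 generations since admixture, r = 1e-8 per bp.
length_bp = expected_tract_length_bp(2000)
print(f"expected tract length: {length_bp / 1000:.0f} kb")
```

With these assumed inputs the expected length is roughly 50 kb; an older admixture event would give proportionally shorter tracts.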
In this study, Browning et al. developed an S*-like method that has increased power and is suitable for large-scale genome-wide data. They applied the method to large sets of sequence data from Eurasia and Oceania and identified putative archaic-specific alleles. In order to gain insight into the history of the admixture events and their impact on the genomes of modern humans, they examined the rate at which these alleles matched the sequenced archaic genomes and the role of the genes that contained these alleles.
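The two signals that S*-like methods rely on, alleles that are rare or absent in an African outgroup and strong LD among such alleles in non-Africans, can be illustrated with a small sketch. This is not the S* statistic or the method of Browning et al.; the function names and the data layout (one list of 0/1 haplotype calls per variant) are hypothetical, and the frequency cut-off of 0.01 is used here only for illustration.

```python
def allele_frequency(haplotypes):
    """Frequency of the 1-allele in a list of 0/1 haplotype calls."""
    return sum(haplotypes) / len(haplotypes)


def candidate_variants(outgroup, max_outgroup_freq=0.01):
    """Indices of variants whose allele is rare or absent in the outgroup."""
    return [i for i, haps in enumerate(outgroup)
            if allele_frequency(haps) < max_outgroup_freq]


def r_squared(a, b):
    """Squared correlation (LD) between two variants on the same haplotypes."""
    n = len(a)
    p_a = sum(a) / n
    p_b = sum(b) / n
    p_ab = sum(x * y for x, y in zip(a, b)) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # LD is undefined when either variant is monomorphic; report 0 in that case.
    return d * d / denom if denom > 0 else 0.0
```

Pairs of candidate variants that lie close together and have high r² in the target population are the kind of pattern that an S*-like score rewards.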
Detection of putative archaic introgression in human populations
Each non-African population from the 1000 Genomes Project (1000 Genomes Project Consortium, 2015) was analysed. Across the 19 European, Asian and Native American populations, 1.36 Gb of the genome is covered by putative introgressed segments. Coverage in individual populations ranges from 382 Mb in Peruvians (PEL) to 665 Mb in Bengalis (BEB), and the average proportion of haplotypes that carried a detected segment at a position ranges from 0.80% in Puerto Ricans (PUR) to 1.23% in Han Chinese (CHB and CHS). These rates of detection are about half of the introgression proportions that have been estimated by the use of f4 statistics (Prüfer et al., 2017). This is in line with the results of simulations, in which about half of the introgressed material can be detected, whereas the remaining introgressed segments are too short to be detected confidently. Rates of detected introgression are higher in East Asian populations than in European populations, which is consistent with previous reports of higher rates of Neanderthal introgression in East Asians than in Europeans (Meyer et al., 2012; Sankararaman et al., 2014; Vernot & Akey, 2014; Wall et al., 2013). Populations from South Asia and Europe have similar rates of introgression, as has been reported previously (Vernot et al., 2016).
In the UK10K study (UK10K Consortium et al., 2015), Browning et al. found that 304 Mb of the genome is covered by 1 or more detected segments, and the average proportion of haplotypes that carried a detected segment at a position is 0.63%. This is lower than was found in the 1000 Genomes European populations. Browning et al. suggest that the lower rate of detection in the UK10K may reflect characteristics of the methods used to generate that data set. Papuans have significant amounts of Denisovan ancestry, as well as Neanderthal ancestry. In Papuans from the Simons Genome Diversity Project (SGDP) (Mallick et al., 2016), 239 Mb of the genome is covered by 1 or more detected segments, and the average proportion of haplotypes that carried a detected segment at a position is 1.48%.
In the Eurasian 1000 Genomes populations, the putative introgressed haplotypes have median lengths that range from 59 kb in Bengalis (BEB) to 71 kb in Finns (FIN). The full segments reported by this method can be much longer because of tiling across individuals. The median length of segments ranges from 205 kb in Iberians (IBS) to 239 kb in Telugus (ITU) in the Eurasian 1000 Genomes populations. The longest segment that was detected was 7.9 Mb.
Comparisons to sequenced archaic genomes
The method of Browning et al. infers putative archaic-specific alleles. If an archaic reference genome exists, the proportion of putative archaic-specific alleles that match the reference sequence can be determined. Some putative archaic-specific alleles cannot be compared to the archaic genome because of masking filters (see the STAR Methods) that are applied in order to eliminate questionable regions that result from factors such as low coverage or poor mappability. The match rate reported in this paper is the proportion of alleles that match the archaic genome among those that are not masked.
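The definition of the match rate can be written out as a short sketch. The representation of each allele as one of the strings "match", "mismatch" or "masked" is a hypothetical encoding chosen for illustration; it is not the data format used by Browning et al.

```python
def match_rate(calls):
    """Proportion of matching alleles among those that are not masked.

    `calls` holds one label per putative archaic-specific allele:
    "match", "mismatch" or "masked" (hypothetical encoding).
    Returns None when every allele is masked, since no comparison is possible.
    """
    compared = [c for c in calls if c != "masked"]
    if not compared:
        return None
    return sum(c == "match" for c in compared) / len(compared)


# Toy example: 4 alleles, 1 masked, 2 of the remaining 3 match.
print(match_rate(["match", "match", "mismatch", "masked"]))
```

Masked alleles are dropped from both the numerator and the denominator, so heavy masking reduces the precision of the rate without biasing it towards zero.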
The overall match rate to the sequenced Altai Neanderthal genome is 0.719 in the 1000 Genomes European populations. The effect of allele frequency can be investigated in detail by considering the larger UK10K sample. In the UK10K analysis, the rate at which the detected alleles match the Altai Neanderthal is fairly constant across the full range of allele frequencies, with an overall rate of 0.743. In contrast, comparison alleles that, like the archaic-specific alleles, are at frequency <0.01 in the outgroup from West Africa have a very low rate (0.034) of matching to the Altai Neanderthal. This demonstrates that the match rate achieved by the method of Browning et al. is much higher than would be found if a high proportion of the putative archaic-specific alleles were false positives. In the American populations the match rate to the Altai Neanderthal and Altai Denisovan genomes is lower than in the other 1000 Genomes populations. This may be because the American populations are admixed and therefore have higher background levels of LD, which could cause false-positive results. In order to look more closely at the Neanderthal and Denisovan ancestry present in modern humans, Browning et al. plotted the 2-way density of match rates to the Altai Neanderthal and Altai Denisovan genomes for segments that had at least 10 positions that can be compared to the Altai Neanderthal and at least 10 positions that can be compared to the Altai Denisovan. In each population they found a large cluster of segments that had high matching to the Altai Neanderthal and low matching to the Altai Denisovan. This cluster corresponds to segments that had been introgressed from Neanderthals. For this cluster, the mode of matching to the Altai Neanderthal in each population is about 0.8, whereas the mode of matching to the Altai Denisovan genome is approximately 0.2.
Therefore, about 80% of the archaic-specific variants introgressed from Neanderthals are present in the Altai Neanderthal, whereas about 20% are also carried by the Altai Denisovan, which is due to the relatedness of the Neanderthal and Denisovan populations. In each population there was also a small cluster of segments that had almost no matching to the Altai Neanderthal or the Altai Denisovan; Browning et al. suggest that these are likely to be false-positive results that do not correspond to archaic introgression. A 3rd cluster of segments is present in the Asian and Papuan populations. The segments in the 3rd cluster have high matching to the Altai Denisovan and low matching to the Altai Neanderthal. This cluster corresponds to segments that have been introgressed from Denisovans, which confirms the earlier findings of Denisovan admixture in Papuans and in Asians (Prüfer et al., 2014; Qin & Stoneking, 2015; Sankararaman et al., 2016; Skoglund & Jakobsson, 2011). Browning et al. suggest that other populations may also carry a small proportion of segments that were introgressed from Denisovans. Included among these are the Finns, about 7% of whose ancestry derives from East Asia (Sikora et al., 2014), and admixed Native American populations whose ancestors are related to East Asians (Gutenkunst et al., 2009).
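The three clusters described above can be expressed as a simple rule on the pair of match rates. The sketch below is only an illustration: the requirement of at least 10 comparable positions comes from the text, but the numerical cut-offs (0.6, 0.4, 0.3 and 0.1) and the function name are hypothetical and are not the criteria given in the STAR Methods.

```python
def classify_segment(neand_rate, deni_rate, n_neand, n_deni, min_sites=10):
    """Assign a segment to a cluster from its two match rates.

    neand_rate / deni_rate: match rates to the Altai Neanderthal / Denisovan.
    n_neand / n_deni: number of unmasked positions behind each rate.
    All cut-offs are illustrative assumptions.
    """
    if n_neand < min_sites or n_deni < min_sites:
        return "too few comparable positions"
    if neand_rate >= 0.6 and deni_rate <= 0.4:
        return "Neanderthal"
    if deni_rate >= 0.4 and neand_rate <= 0.3:
        return "Denisovan"
    if neand_rate <= 0.1 and deni_rate <= 0.1:
        return "likely false positive"
    return "ambiguous"


# A segment near the Neanderthal mode (about 0.8 and 0.2) reported in the text.
print(classify_segment(0.8, 0.2, 25, 25))
```

Under these assumed cut-offs a Denisovan segment with a match rate of about 0.5 is still classified as Denisovan, which matters for the moderate-affinity component discussed next.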
In the Japanese and Chinese (Dai, Beijing, and Southern Han) populations, the Denisovan cluster of segments has a wide bimodal distribution of match rates to the Altai Denisovan genome. A test for 2 distinct components of Denisovan ancestry (see the STAR Methods) is statistically significant (p < 0.05 after adjustment for multiple testing) in each of these 4 populations, though it is not significant in the other 1000 Genomes populations. About ⅓ of the Denisovan segments in the populations in China and Japan come from the component with higher affinity to the Altai Denisovan genome. In the high-affinity component the putative archaic-specific alleles have a match rate of about 80% to the Altai Denisovan genome, which is similar to the match rate to the Altai Neanderthal genome of putative archaic-specific alleles in segments that have been introgressed from Neanderthals, whereas in the other (moderate-affinity) component the putative archaic-specific alleles have a match rate of about 50% to the Altai Denisovan genome.
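The idea of 2 components with different match rates can be illustrated by fitting a 2-component mixture to a set of per-segment match rates. The sketch below uses expectation-maximisation for a mixture of two normal distributions; this choice of model, the starting values and the function names are assumptions made for illustration, and the code fits the mixture only. It is not the significance test described in the STAR Methods.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(values, n_iter=200):
    """Fit a two-component 1-D normal mixture by EM.

    Returns (weight_of_high_component, low_mean, high_mean).
    """
    ordered = sorted(values)
    n = len(ordered)
    # Start the two means at the lower and upper quartiles.
    mu = [ordered[n // 4], ordered[(3 * n) // 4]]
    sd = [0.1, 0.1]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: probability that each value belongs to component 1.
        resp = []
        for x in values:
            p0 = w[0] * normal_pdf(x, mu[0], sd[0])
            p1 = w[1] * normal_pdf(x, mu[1], sd[1])
            total_p = p0 + p1
            resp.append(p1 / total_p if total_p > 0 else 0.5)
        # M-step: update weight, mean and spread of each component.
        for k in (0, 1):
            r = [g if k == 1 else 1.0 - g for g in resp]
            total = sum(r)
            mu[k] = sum(ri * x for ri, x in zip(r, values)) / total
            var = sum(ri * (x - mu[k]) ** 2 for ri, x in zip(r, values)) / total
            sd[k] = max(math.sqrt(var), 1e-3)  # floor avoids a collapsed component
            w[k] = total / n
    if mu[0] > mu[1]:
        mu.reverse()
        w.reverse()
    return w[1], mu[0], mu[1]
```

Applied to simulated match rates drawn around 0.5 and 0.8 in a 2:1 ratio, a fit of this kind should recover a high-affinity weight close to ⅓, the pattern reported for the East Asian populations.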
The 2-component mixture test was rerun after excluding any segments that contained any Neanderthal-specific alleles (putative archaic-specific alleles that matched the Neanderthal genome but not the Denisovan genome), in order to check that the moderate-affinity component is not due to segments that are a mosaic of Neanderthal and Denisovan ancestry. Browning et al. found that the same 4 populations (the 3 Chinese populations and the Japanese population) still have statistically significant p values for a 2-component mixture after adjusting for multiple testing (p < 0.0026), and the estimated mixture parameters are essentially unchanged.
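The threshold of 0.0026 is what a Bonferroni correction gives for 19 tests at a nominal level of 0.05, and 19 is the number of 1000 Genomes populations analysed. This summary does not state which correction was used, so the sketch below is an assumption about how such a threshold could arise rather than a description of the authors' procedure.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Assumed inputs: nominal level 0.05 and one test per population (19).
print(f"{bonferroni_threshold(0.05, 19):.4f}")
```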
Based on the mode of matching to the Denisovan genome, most of the Denisovan admixture in South Asian and Papuan populations is from the archaic component with moderate affinity to the Altai Denisovan. According to Browning et al. this is consistent with previous work, which noted that the Altai Denisovan is more distantly related to the introgressing Denisovans than the Altai Neanderthal is to the introgressing Neanderthals (Prüfer et al., 2014).
Browning et al. extracted subsets of segments based on their affinity to the Altai Neanderthal and to the Altai Denisovan in order to facilitate further analyses (see the STAR Methods). They carried out several analyses to check for possible confounders of match rate to the Denisovan genome. They checked whether the divergence between the Altai Neanderthal and the Altai Denisovan differs between regions covered by moderate-affinity Denisovan introgression and regions covered by high-affinity Denisovan introgression, in case such differences could account for the 2 components. In the data from East Asia the mean relative divergence (the number of homozygous discordances between the Altai Neanderthal and the Altai Denisovan divided by the number of variants from the 1000 Genomes) was 1.65 (SE 0.26) for high-affinity Denisovan segments and 2.51 (SE 1.00) for moderate-affinity Denisovan segments. The difference was not statistically significant (p > 0.05). As the power to detect segments depends on the density of archaic-specific variants as well as on segment length, they also compared this density, adjusting for the length of the detected segments. For the data from East Asia the adjusted mean inverse density (bp per archaic-specific variant) was 103 (SE 440) for the moderate-affinity Denisovan segments and 1,164 (SE 72) for the high-affinity Denisovan segments. The difference was not statistically significant (p > 0.05). Therefore they did not find confounding by divergence or by density of archaic-specific alleles.
The lengths of haplotypes within segments attributed to each component were examined in order to investigate potential differences in admixture time between components. Haplotype lengths were analysed in units of centimorgans (cM) instead of base pairs, as centimorgan distances reflect recombination and are therefore less variable. Adjustments were made for frequency and overall segment length, as high frequency and long segment length increase the power to detect a segment and are correlated with haplotype length. In the data from East Asia the mean haplotype length was:
· 0.066 (SE 0.014) cM for Neanderthal segments,
· 0.19 (SE 0.13) cM for high affinity Denisovan segments,
· 0.072 (SE 0.13) cM for moderate affinity Denisovan segments, and
· 0.13 (SE 0.06) cM for Denisovan segments overall.
These are not significantly different. Browning et al. also checked for differences in Europeans, in South Asians, in Asians overall (East and South), and in Papuans, again finding no significant differences. Though it is probable that the Neanderthal admixture and the 2 waves of Denisovan admixture occurred at different times, there is not enough information in the data to determine the ordering of these events.
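The lack of a significant difference can be seen from the reported means and standard errors alone. The sketch below applies a simple two-sided z-test to a pair of estimates, assuming that they are independent and approximately normal; this is an illustrative check, not the test used by Browning et al., and the function name is hypothetical.

```python
import math


def two_sample_z(mean_a, se_a, mean_b, se_b):
    """Two-sided z-test for the difference between two independent estimates.

    Returns (z, p_value).
    """
    z = (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Mean haplotype lengths (cM) and SEs reported for the East Asian data.
z, p = two_sample_z(0.19, 0.13, 0.072, 0.13)   # high vs moderate affinity
print(f"high vs moderate affinity: z = {z:.2f}, p = {p:.2f}")
z, p = two_sample_z(0.066, 0.014, 0.19, 0.13)  # Neanderthal vs high affinity
print(f"Neanderthal vs high affinity: z = {z:.2f}, p = {p:.2f}")
```

Both comparisons give p values well above 0.05 under these assumptions, because the standard errors of the Denisovan estimates are as large as the differences themselves.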
Overall, similar amounts of detected Denisovan ancestry are carried by East Asians and South Asians, though Papuans carry much more detected Denisovan ancestry. In East Asians about ⅓ of the Denisovan ancestry segments are from the high-affinity component, whereas very little of the Denisovan ancestry in South Asians and Papuans is from the high-affinity component.
A possible scenario that is consistent with this pattern would have the high-affinity component introgressing into East Asia following the split between South and East Asia. Browning et al. suggest that the moderate-affinity component may have been primarily introgressed into the ancestors of Papuans after their split from Asians, and arrived in Asia via the ancestors of Papuans; however, other scenarios are also possible (Prüfer et al., 2014; Sankararaman et al., 2016).
Lack of evidence for multiple waves of Neanderthal ancestry
In East Asians the frequency of introgression from Neanderthals is substantially higher (~30%) than in Europeans (Meyer et al., 2012; Wall et al., 2013). The effects of differential selection cannot explain this difference, though it could be the result of an additional Neanderthal admixture event into the ancestors of East Asians following the Europe-Asia split (Kim & Lohmueller, 2015; Vernot & Akey, 2015). Another possible explanation that has been suggested is dilution of Neanderthal ancestry in Europeans by admixture with a migrating population that had not received any Neanderthal admixture (Meyer et al., 2012; Vernot & Akey, 2015).
In the results obtained by Browning et al., Neanderthal-introgressed segments in East Asians and in Europeans showed indistinguishable levels of similarity to the Altai Neanderthal individual. Also, there is no clear difference between East Asians and Europeans in the similarity of their Neanderthal-introgressed segments to the Vindija 33.19 Neanderthal. Therefore, if East Asians received a large pulse of Neanderthal admixture following the split from Europeans, then the original (shared Eurasian) and the additional (East-Asian-specific) admixing populations must have been closely related.
Signals of positive selection
Browning et al. looked in the 1000 Genomes populations for the introgressed segments with the highest frequency. Specifically, in each population they found the 2 regions of highest frequency that had high matching to the Altai Neanderthal or Altai Denisovan genome (see the STAR Methods). All of these regions appear to have been introgressed from Neanderthals rather than Denisovans. Several of the positively selected regions have been described previously, including BNC2, POU2F3, and KRT71, which are involved in skin and hair traits (Sankararaman et al., 2014; Vernot & Akey, 2014). Genomic regions that were introgressed from Neanderthals and are under positive selection have been shown to be enriched for genes that are involved in pigmentation and immunity (Deschamps et al., 2016; Gittelman et al., 2016; Racimo et al., 2015; Sankararaman et al., 2014, 2016; Vernot & Akey, 2014; Vernot et al., 2016).
As well as positively selected regions that have been described in previous studies of archaic introgression, the results of this study include 2 immunity-related regions, which are highlighted here. The first of these immunity-related regions is on chromosome 3p21.31. This region was included in a supplementary table of high-frequency introgressed haplotypes in Gittelman et al. (2016), though it was not discussed in that work. At this locus the introgressed alleles are at high frequency in South Asia (0.38). The region contains CCR9 (C-C motif chemokine receptor 9) and CXCR6 (C-X-C motif chemokine receptor 6), chemokine receptors that are involved in immunity (Papadakis et al., 2000; Paust et al., 2010; Zlotnik & Yoshie, 2000).
The second of these immunity-related regions is on chromosome 14q32.33. In this region the introgressed alleles are at very high frequency throughout Eurasia. This region is located in the immunoglobulin heavy locus, which contains multiple genes coding for antibodies (Schroeder & Cavacini, 2010). IGHA1, IGHG2, and IGHG3 are immunoglobulin heavy genes located within the high-frequency region. The most highly conserved introgressed position is rs10144746 (PhyloP score 4.1), which is an expression quantitative trait locus (eQTL) for IGHG4 and several other immunoglobulin heavy genes in various tissues, including oesophagus and liver. The high-frequency introgression is in a region with significant masking of the Altai Neanderthal and Altai Denisovan genomes due to poor-quality sequence. For example, for the segment found in the Southern Han Chinese (CHS) population, 119 of the 145 putatively introgressed alleles are filtered in the Altai Neanderthal genome (see the STAR Methods), and 22 of the 26 unfiltered alleles match the Altai Neanderthal genome. Therefore the region appears to be derived from the Neanderthal admixture, though it would be difficult to find by use of a reference-based approach.
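The counts for the CHS segment can be turned into a match rate using the definition given earlier (matching alleles divided by unmasked alleles). The rounded value below is computed here from the reported counts and is not a figure quoted from Browning et al.

```latex
\[
\text{match rate}
  = \frac{\text{matching alleles}}{\text{total alleles} - \text{masked alleles}}
  = \frac{22}{145 - 119}
  = \frac{22}{26}
  \approx 0.85
\]
```

This is close to the mode of about 0.8 seen for Neanderthal-introgressed segments, although it rests on only 26 comparable alleles.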
Browning et al. used a new method to detect archaic introgressed segments in non-African populations worldwide: populations from the 1000 Genomes Project, Papuans from the SGDP, and individuals from the UK10K project. Their method is reference-free, so it can detect introgression from archaic admixing populations for which there is no reference sequence. They showed that, when a reference sequence is available, comparison of the detected segments to the reference genome can lead to new insights into population history.
Browning et al. found evidence that Asians have introgression from Denisovans, which confirms earlier reports that used alternative methods (Prüfer et al., 2014; Qin & Stoneking, 2015; Sankararaman et al., 2016; Skoglund & Jakobsson, 2011). They also found 2 waves of admixture with Denisovans, 1 from a population that was closely related to the Altai Denisovan individual, and 1 from a population that was more distantly related to the Altai Denisovan. The component that is closely related to the Altai Denisovan is mainly present in East Asians, while the component that is more distantly related to the Altai Denisovan forms a major part of the Denisovan ancestry in Papuans and South Asians. The populations from East Asia are the only populations that have relatively equal and non-negligible contributions from both Denisovan populations, and it is in these populations that the 2 waves of Denisovan admixture are most evident.
In contrast, they did not find evidence of 2 or more waves of Neanderthal admixture from diverged Neanderthal populations. Browning et al. suggest that the higher rate of Neanderthal introgression in East Asians relative to Europeans may be due to dilution of Neanderthal admixture in Europeans by admixture with a population that had no Neanderthal admixture (Meyer et al., 2012; Vernot & Akey, 2015). If there was an additional pulse of Neanderthal admixture into East Asians following the Europe-Asia split, then it was from a population that was closely related to the main admixing Neanderthals.
Browning et al. found a number of high-frequency introgressed haplotypes that appear to have been subject to positive selection. 2 of these regions are involved in immunity, and contain the immunoglobulin heavy locus and a cluster of chemokine receptors. These findings, together with earlier reports of positively selected introgressed haplotypes in histocompatibility leucocyte antigen (HLA) genes (Abi-Rached et al., 2011), Toll-like receptors (Deschamps et al., 2016), and other genes associated with immunity (Abi-Rached et al., 2011; Deschamps et al., 2016; Racimo et al., 2015), underscore the crucial role played by Neanderthal introgression in the adaptation of the human immune system to the pathogenic landscape of Eurasia.
The results of Browning et al. were obtained by the use of a new S*-like algorithm for genome-wide, reference-free detection of introgression. Their method is implemented in a freely available software package, Sprime, and is computationally efficient for the analysis of large sequence datasets. For example, a genome-wide analysis of almost 4,000 UK10K individuals required only 4 hours of computing time on a 2.6 GHz CPU. Computationally efficient methods, such as the one described in this study, will be needed for the construction of a map that contains all surviving archaic hominin sequences in present-day human populations as the number of sequenced genomes continues to grow.
The method of Browning et al. reports the set of putative archaic-specific alleles in each introgressed segment. Direct identification of the putative archaic-specific alleles is useful for downstream analyses. The rates at which these alleles match a reference archaic genome indicate the degree of divergence between the introgressing and the sequenced archaic individuals. The usefulness of these match rates is shown in this study, where they revealed 2 pulses of Denisovan admixture.
Browning, S. R., et al. (2018). "Analysis of human sequence data reveals two pulses of archaic Denisovan admixture." Cell 173.
Author: M.H.Monroe