
Population genomics of the Alpine ibex (Capra ibex)


Leigh, Deborah Marie. Population genomics of the Alpine ibex (Capra ibex). 2018, University of Zurich, Faculty of Science.

Abstract

This thesis examined whether signals of selection are present in the bottlenecked and reintroduced populations of the Alpine ibex (Capra ibex). Using single nucleotide polymorphism (SNP) data, I identified weak signals of purifying selection and positive selection. Furthermore, I quantified the detection accuracy achievable for studies that use genetic outliers or associations between allele frequencies and environmental variables to detect positive selection after a bottleneck. Additionally, I discussed the biases that batch effects can introduce into a high-throughput sequencing dataset in long-term studies where sequencing data are added incrementally. Finally, I discussed the importance of considering genetics when planning the reintroduction of a species.
Population bottlenecks can have profound and long-lasting genetic consequences. Due to their reduced effective population size, bottlenecked species experience strong genetic drift, loss of genetic variation, and increased inbreeding. They are therefore at risk of maladaptation and extinction. Despite these risks, examples of thriving bottlenecked populations are known. Outside of conservation biology, population bottlenecks are important evolutionary forces because they can affect the rate and direction of adaptation. As a result, identifying examples of selection after a bottleneck is of importance to both evolutionary and conservation biology.
In this thesis, I utilised genome-wide SNP data for 27 populations of Alpine ibex, including the remnant population in the Gran Paradiso National Park. These data were generated with restriction site associated DNA sequencing (RADseq), a high-throughput sequencing method that sequences genomic DNA around a restriction enzyme site. I identified over 6000 SNPs in the Alpine ibex genome.
With these data, I identified putative signals of purifying selection by comparing exonic SNPs, which are likely under selection, with intronic SNPs and SNPs in intergenic regions, which are expected to be largely neutral. Furthermore, I examined the ratio of non-synonymous to synonymous sites. The heterozygosity of exonic SNPs was significantly below that of intronic and intergenic SNPs. In addition, the ratio of non-synonymous to synonymous sites was below one. While this suggests purifying selection, the results are not conclusive because of marker and test limitations, and the presence of purifying selection should be viewed with caution.
I then searched for signals of positive selection by scanning for large differences in allele frequencies among populations and for correlations between allele frequencies and an environmental variable. The high rates of genetic drift in bottlenecked populations can create false signals of positive selection when using such methods. Therefore, I used a forward-time population genetic simulation approach that followed Alpine ibex demography to generate a simulated set of SNPs that included neutral loci and loci under selection. I then used these loci to quantify the accuracy of three selection detection methods. To this end, I examined the number of false positive neutral SNPs identified by each method, as well as the number of true positive and false negative simulated selected SNPs. I found that a true discovery rate of over 70% can be achieved by combining three selection detection methods to identify “triple positive” SNPs, and an environmental correlation detection approach. When I applied the selection detection methods to the empirical Alpine ibex RADseq dataset, no triple positive SNPs were identified by the triple positive environmental correlation approach. Thus there are no SNPs that I confidently identified as under selection, though the lower accuracy methods (30-50% true discovery rate) found weak candidates that may be suitable for further examination.
High-throughput sequencing is maturing, and an increasing number of studies use data obtained over time or generated by different investigators. In this thesis, I also discuss the biases and errors, so-called ‘batch effects’, that can be introduced into a study if subsets of data differ in how they were obtained and contain different technical artefacts. I present a case study in the Alpine ibex where batch effects led to a misleading biological conclusion. Finally, in an additional co-authored publication, the importance of considering the long-term genetics of a population during a reintroduction was presented and discussed.
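The accuracy measure used in the abstract can be illustrated with a small Python sketch. It is not the thesis code: the SNP identifiers, the three method result sets, and the function names are hypothetical, and the data are invented toy values. The sketch assumes that the true discovery rate is the share of flagged SNPs that are truly under selection, TP / (TP + FP), and that a "triple positive" SNP is one flagged by all three methods.

```python
def true_discovery_rate(flagged, truly_selected):
    """Share of flagged SNPs that are truly selected: TP / (TP + FP)."""
    flagged = set(flagged)
    if not flagged:
        return None  # undefined when no SNP is flagged
    true_positives = flagged & set(truly_selected)
    return len(true_positives) / len(flagged)


def consensus(*method_hits):
    """SNPs flagged by every method (triple positives for three methods)."""
    sets = [set(hits) for hits in method_hits]
    return set.intersection(*sets) if sets else set()


# Toy simulation output (hypothetical): SNPs simulated as under selection.
selected = {"snp1", "snp2", "snp3"}

# Toy outlier lists from three hypothetical detection methods.
method_a = {"snp1", "snp2", "snp7", "snp8"}
method_b = {"snp1", "snp2", "snp3", "snp9"}
method_c = {"snp1", "snp2", "snp7"}

triple_positive = consensus(method_a, method_b, method_c)
false_negatives = selected - triple_positive

print("TDR, method A alone:", true_discovery_rate(method_a, selected))
print("TDR, triple positives:", true_discovery_rate(triple_positive, selected))
print("Selected SNPs missed by the consensus:", sorted(false_negatives))
```

In this toy case, requiring agreement among all methods removes the false positives that single methods report, at the cost of missing one selected SNP. That is the kind of trade-off such a consensus approach accepts, though the numbers here say nothing about the actual ibex results.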



Additional indexing

Item Type: Dissertation
Referees: Wagner Andreas, Keller Lukas F, Ozgul Arpat, Wicker Thomas, Aeschbacher Simon
Communities & Collections: 07 Faculty of Science > Institute of Evolutionary Biology and Environmental Studies
Dewey Decimal Classification: 570 Life sciences; biology
590 Animals (Zoology)
Language: English
Date: 2018
Deposited On: 26 Jan 2018 14:06
Last Modified: 30 Aug 2018 10:45
OA Status: Green

Download

Download PDF 'Population genomics of the Alpine ibex (Capra ibex)'.
Content: Published Version
Filetype: PDF
Size: 3MB

Content: Published Version
Filetype: PDF - Repository staff only until 7 June 2019
Size: 2MB
Embargo till: 2019-06-07