Abstract
This thesis investigated the genetic structure of the Walser people to understand the genetic consequences of their local range expansion in the Middle Ages. Additionally, it assessed the performance of forensic panels for biogeographical ancestry (BGA) inference and showcased the classification performance of a newly developed BGA panel. The Walser are descendants of the Alemanni, who established their presence in Switzerland in the 8th and 9th centuries. Later, in the 12th and 13th centuries, the Walser expanded from their homeland in Upper Valais into various regions of the European Alps. Despite the historical significance of this migration, it remains unclear whether this range expansion has left detectable genetic signals in their descendants. To explore this question, we analyzed forensic autosomal STRs, Y-STRs, Y-SNPs, and whole mtDNA across four Walser-homeland, eight Walser, and four non-Walser communities to assess their genetic diversity and differentiation. The inbreeding coefficient did not indicate inbreeding at the community level in any of the communities. Furthermore, we found overall low genetic differentiation among Walser-homeland, Walser, and non-Walser communities, as well as relative to a simulated Swiss reference population (Ref-Pop). Stronger differentiation was observed only in more isolated areas such as Lötschental, Vals, and Gressoney, and moderate differentiation in Avers and Törbel. Notably, the mitochondrial haplogroup W6, which is largely absent in central Europe, was prevalent in the Walser community of Vals, further highlighting the distinctive evolutionary history of this community. This study is the first of its kind in Switzerland. While it provides insights into the genetic diversity and differentiation of the Walser people, it also underscores the importance of more comprehensive analyses using genome-wide data to better understand their evolutionary history.
Further, this thesis addresses the application of BGA inference in forensics, which can provide investigative leads. Central to this application is the use of ancestry informative markers (AIMs), typically in panels of fewer than 200 markers. Given the large number of available panels and the scarcity of comparative studies, it is challenging for forensic laboratories to select one panel for validation and subsequent accreditation. To address this challenge, we compared the performance of three forensic panels (MAPlex, Precision ID Ancestry Panel [PIDAP], and VISAGE Basic Tool [VISAGE BT]) designed for massively parallel sequencing (MPS). Our goal was to identify the best-performing panel among the three. We did so by comparing the STRUCTURE clustering patterns of each panel with those of a larger reference set comprising 10k SNPs, using 3,957 individuals. We employed two measures: i) the G′ similarity score to evaluate how well each panel’s clustering patterns matched those of the reference set, and ii) the area under the precision-recall curve (AUC-PR) to assess the classification performance of each panel. While the VISAGE BT panel performed best of the three (G′ ≈ 90% and AUC-PR ≈ 97% at K = 6, i.e., assuming six ancestral populations), we noticed room for improvement in the resolution of BGA inference. In particular, at higher model complexities (K = 7 and K = 8), all three panels consistently failed to recover the STRUCTURE clustering patterns obtained with the 10k reference set, resulting in suboptimal performance. In response to these limitations of existing BGA panels, chapter three of this thesis focuses on the development of a larger SNP panel for BGA inference.
By compiling data from over 6,500 individuals from studies that used the Human Origins (HO) array, we selected an expanded panel of 1,900 SNPs and evaluated its performance. We selected the most informative SNPs with sparse K-means with feature ranking (SKFR), an unsupervised method implemented in OpenADMIXTURE. Our evaluations, using G′ similarity scores, showed that the new panel not only significantly outperformed a randomly selected SNP set of equal size but also achieved G′ similarity scores above 88% at K = 8, 9, and 10. Furthermore, analyses with the GENOGEOGRAPHER tool demonstrated high classification accuracy (92.33%) in assigning individuals to their respective geographic regions, defined as 21 metapopulations. In contrast, panels with fewer than 200 AIMs typically allow classification into only 6-8 continental regions. Consequently, this panel represents a substantial advance in the resolution of BGA inference at an affordable cost. In conclusion, this thesis showcases the utility of genetic information for both population genetic studies and contemporary forensic applications, emphasizing the ongoing relevance of genetic research not only for reconstructing human evolutionary history but also for solving real-world problems.