ZORA (Zurich Open Repository and Archive)

Wiring the Microbial Web of Earth – New Approaches for Scalable Computational Prediction of Interpretable Microbial Ecosystem Structure

Tackmann, Janko. Wiring the Microbial Web of Earth – New Approaches for Scalable Computational Prediction of Interpretable Microbial Ecosystem Structure. 2019, University of Zurich, Faculty of Science.

Abstract

Microorganisms are the principal biotic driver of life on Earth. They shape virtually every aspect of the planet’s biosphere, both by maintaining global biogeochemical cycles and through essential symbiotic relationships with multicellular organisms. How individual microbial species form interacting communities (ecosystems), with a drastic impact on their hosts and environments, is being studied with rapidly accelerating intensity in the field of microbial ecology. Enabled by recent innovations in sequencing technologies, staggering amounts of knowledge are being generated and have already yielded fascinating insights: microorganisms have now been identified in even the most extreme environments across the globe, and intriguing connections between hosts and their microbiota are continuously being discovered, including links to disease development and host behavior.
So far, insights have mainly been gained through comparatively straightforward, descriptive analyses of static microbial community snapshots. Such workflows employ, for instance, diversity-based comparisons of community profiles or the identification of community members that are strongly associated with a condition of interest (e.g. a disease or lifestyle factor). Less research, however, has focused on disentangling the underlying interaction structures, which dictate ecosystem dynamics and thus ultimately mold the observed community patterns. Elucidating these complex relationships would allow a system-level understanding of microbial communities and inform experiments aimed at mechanistic understanding.
Unfortunately, experimental validation of ecological interactions is currently impossible for all but the smallest communities and is furthermore restricted to microbes culturable under laboratory conditions, which constitute only a tiny fraction of the known diversity. Nonetheless, the large volume of culture-independent microbial sequencing data now available offers a wealth of information to fuel computational prediction approaches and alleviate these shortcomings. In particular, such data can be mined for statistical co-occurrence or co-avoidance patterns, which are indicative of positive (mutualist, commensal) or negative (competitive, parasitic, predatory or amensal) ecological interactions, to enable the prediction of microbial interaction network models. Applying this approach to globally distributed sequencing data, covering diverse habitats and conditions, would result in a model of the microbial web of Earth that could allow a first glimpse at global microbial interaction patterns.
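The idea of mining co-occurrence and co-avoidance patterns can be illustrated with a minimal sketch. It assumes toy presence/absence vectors for three hypothetical taxa and uses plain Pearson correlation with an arbitrary threshold of 0.5; none of these names, values, or choices come from the thesis, and real analyses would use far more samples and proper significance testing.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def classify(r, threshold=0.5):
    """Label a correlation as a candidate pattern (threshold is an assumption)."""
    if r >= threshold:
        return "co-occurrence"
    if r <= -threshold:
        return "co-avoidance"
    return "no clear pattern"

# Hypothetical presence (1) / absence (0) of three taxa across 8 samples.
taxon_a = [1, 1, 1, 0, 0, 1, 0, 1]
taxon_b = [1, 1, 1, 0, 0, 1, 0, 0]  # mostly found together with taxon_a
taxon_c = [0, 0, 0, 1, 1, 0, 1, 1]  # mostly found where taxon_a is absent

label_ab = classify(pearson(taxon_a, taxon_b))
label_ac = classify(pearson(taxon_a, taxon_c))
print("a vs b:", label_ab)
print("a vs c:", label_ac)
```

A positive pattern only suggests a candidate positive interaction and a negative pattern a candidate negative one; such pairwise signals do not by themselves distinguish direct interactions from shared dependencies.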
Throughout the last decade, many methods have been proposed for the statistical prediction of ecological interactions. However, these approaches typically do not account for a variety of artifacts, such as shared ecological and environmental dependencies. Such artifacts are particularly widespread in global, heterogeneous data sets and thus severely hamper the ecological interpretation of networks inferred from such data. Moreover, current methods generally do not scale to the volume of modern (cross-study) sequencing data, which seriously limits the comprehensiveness of predicted models.
In this thesis, I present a new approach to address these shortcomings: FlashWeave. The method uses a flexible Probabilistic Graphical Modeling (PGM) framework to infer direct associations. These predictions are depleted of indirect (i.e. spurious) associations and thus enable sparser and more interpretable ecosystem models. In contrast to the majority of current methods, FlashWeave furthermore scales to data sets with hundreds of thousands of samples and can explicitly integrate environmental and technical factors into model inference. We found that FlashWeave outperformed other approaches in recovering the structure of simulated microbial ecosystems and, additionally, surpassed them in detecting verified interactions within a real-world data set of marine sequencing samples. We furthermore used FlashWeave to predict the largest microbial interaction network of the human gastrointestinal tract to date, which revealed striking signals of potential biological relevance. These include unusually pronounced phylogenetic assortativity, extensive interactions within the rare biosphere and novel mutualist hub species. Moreover, FlashWeave allowed us to infer a global cross-biome interaction network, based on more than half a million sequencing samples that cover highly diverse habitats. In-depth analysis of this network in future studies promises interesting ecological insights.
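The difference between a direct and an indirect association can be made concrete with a small simulation. The sketch below is not FlashWeave's algorithm; it swaps in first-order partial correlation as the simplest stand-in for a conditional independence check. The variable names, noise levels, and sample size are all assumptions chosen for illustration: two hypothetical taxa both depend on a shared environmental factor but do not influence each other.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def partial_corr(x, y, z):
    """Correlation of x and y after linearly controlling for z."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

random.seed(0)
n_samples = 2000

# Shared environmental driver (hypothetical, e.g. a standardized pH value).
env = [random.gauss(0, 1) for _ in range(n_samples)]
# Both taxa respond to the environment, but not to each other.
taxon_a = [e + random.gauss(0, 0.5) for e in env]
taxon_b = [e + random.gauss(0, 0.5) for e in env]

marginal = pearson(taxon_a, taxon_b)                 # strong, but indirect
conditional = partial_corr(taxon_a, taxon_b, env)    # close to zero

print(f"marginal correlation:    {marginal:.3f}")
print(f"partial correlation|env: {conditional:.3f}")
```

Under these assumptions the marginal correlation is strong (0.8 in expectation), while the partial correlation given the environmental factor is close to zero, so a method that conditions on the shared driver would not draw an edge between the two taxa.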
In the second part of this thesis, I present a parallel line of work that demonstrates how biomarker discovery can also strongly benefit from the removal of indirect associations. In this context, spurious associations may appear between microbes and non-microbial variables of interest (driven, for instance, by ecological microbe-microbe interactions) and can result in numerous redundant biomarkers, which negatively impact prediction quality and complicate biological interpretation. We found that FlashWeave, applied to the exemplary task of identifying microbes directly associated with a selection of human body sites, generated highly parsimonious and interpretable biomarker sets. The resulting biomarkers furthermore yielded outstanding predictive performance on both pure and mixed body site microbiota.
The work presented in this thesis is a major step towards a better understanding of global microbial interaction trends, with potential applications in, for instance, probiotics development, next-generation culturing efforts and ecosystem engineering. It furthermore highlights approaches for more parsimonious and interpretable biomarker discovery, which can be crucial in clinical or forensic applications.

Additional indexing

Item Type: Dissertation (cumulative)
Referees: von Mering Christian, Raes Jeroen, Furrer Reinhard
Communities & Collections: 07 Faculty of Science > Institute of Molecular Life Sciences; UZH Dissertations
Dewey Decimal Classification: 570 Life sciences; biology
Uncontrolled Keywords: metagenomics, microbiome, ecological network, probabilistic graphical models, compositional-data, confounders, co-occurrence, bioinformatics, high performance computing
Language: English
Date: 10 May 2019
Deposited On: 08 Sep 2020 12:34
Last Modified: 21 Apr 2022 16:36
Number of Pages: 247
OA Status: Green
Full text (PDF):
  • Content: Published Version
  • Language: English
  • Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)
