ZORA (Zurich Open Repository and Archive)

Identification of Relevant Features in Complex Biomedical Datasets Using Artificial Intelligence

Zhakparov, Damir. Identification of Relevant Features in Complex Biomedical Datasets Using Artificial Intelligence. 2024, University of Zurich, Faculty of Science.

Abstract

The exponential growth of large-scale datasets in biomedical research, driven by technological advancements, presents significant challenges in data analysis. This thesis explores the application of machine learning (ML) and advanced computational techniques to high-dimensional biomedical data, demonstrating their potential in various domains. The research focuses on four main areas: (1) using ML to combat the COVID-19 pandemic through early detection of SARS-CoV-2 infection and COVID-19 disease outcome prediction, (2) monitoring the spread of SARS-CoV-2 variants at major events based on wastewater sequencing data, (3) developing a robust feature selection pipeline for bulk RNA sequencing (RNA-seq) datasets, and (4) integrating multi-modal data for biomarker discovery in atopic dermatitis (AD).

The first set of studies highlighted the effectiveness of ML models, particularly gradient boosting, in the early detection of SARS-CoV-2 infection and in predicting severe outcomes in COVID-19 patients. These models identified critical predictive features and provided practical decision-support tools for clinical use. The second study addressed the management of the COVID-19 pandemic through wastewater-based epidemiology and showed an effective strategy for unbiased estimation of viral spread in communities.

In the third area, the thesis introduces GeneSelectR, an R package designed for feature selection in large-scale RNA-seq datasets. GeneSelectR combines several ML feature selection algorithms, optionally alongside the results of a traditional differential gene expression analysis, with an assessment of the biological relevance of the selected feature lists, offering a comprehensive framework. Its application to large-scale cancer and AD RNA-seq datasets revealed important transcript subsets, showcasing its utility in the analysis of complex datasets.

The fourth focus area demonstrated the power of ML in integrating multi-modal data to discover biomarkers for AD. By combining RNA-seq, clinical questionnaire, and cytokine profiling data, the study identified robust biomarkers, providing deeper insights into protective and susceptibility features associated with AD.

In conclusion, this thesis demonstrates the potential of ML techniques, offering robust solutions for complex data analysis in the biomedical field.

Additional indexing

Item Type: Dissertation (monographical)
Referees: Akdis Cezmi A, Bärenfaller Katja, Wenger Roland H, Beerenwinkel Niko
Communities & Collections:
  • 04 Faculty of Medicine > Department of Biochemistry
  • 07 Faculty of Science > Department of Biochemistry
  • 04 Faculty of Medicine > Institute of Physiology
  • 07 Faculty of Science > Institute of Physiology
  • 07 Faculty of Science > Institute of Mathematics
  • 05 Vetsuisse Faculty > Veterinärwissenschaftliches Institut > Department of Molecular Mechanisms of Disease
  • 07 Faculty of Science > Department of Molecular Mechanisms of Disease
  • UZH Dissertations
Dewey Decimal Classification:
  • 610 Medicine & health
  • 510 Mathematics
  • 570 Life sciences; biology
Language: English
Place of Publication: Zürich
Date: 16 October 2024
Deposited On: 16 Oct 2024 13:49
Last Modified: 19 Oct 2024 03:18
Number of Pages: 265
OA Status: Green
Full Text: Published Version (PDF)
