Characterization of cancer genomes through systematic analyses of oncogenomic data assemblies


Cai, Haoyang. Characterization of cancer genomes through systematic analyses of oncogenomic data assemblies. 2013, University of Zurich, Faculty of Science.

Abstract

Cancer is the most common genetic disease in humans. More than 10 million new cancer cases are estimated to be diagnosed worldwide each year. Over the past decades, the research community has made considerable efforts to fight cancer, and this work has greatly expanded our understanding of the disease. However, the exact mechanisms of cancer initiation and progression remain elusive. Research on cancer genomes has focused on identifying DNA sequence mutations and chromosomal rearrangements. Some of these somatic alterations can confer a growth advantage on cancer cells and promote cancer development. Genes mutated in cancer genomes are potential new drug targets and may serve as biomarkers for improving diagnostics and therapy.
Today, high-throughput genome-wide profiling technologies allow us to characterize the molecular profiles of cancer samples at various levels, including copy number alterations, gene expression, point mutations and epigenetic marks. Cancer research has gradually shifted from single experiments to large-scale “omics” data analysis approaches. This work is exciting but challenging. Our group aims to develop reliable and robust methods to characterize cancer genomes by analyzing large-scale oncogenomic datasets.
During the last four years, I have focused on using systems biology and statistical methods to model and annotate genomic array data in human cancer. My research is based on a data collection and re-analysis project that generates very large amounts of microarray data. Computational biology approaches were applied to this dataset for data mining. We collected more than 40,000 arrays, including comparative genomic hybridization (CGH) and single nucleotide polymorphism (SNP) arrays, from several public databases. A pipeline was developed to process the raw data and determine copy number aberrations (CNAs). All data were converted to a unified, structured format and stored in our arrayMap database, together with the available clinical information. We also set up a website to make this resource available to the research community.
Based on the large-scale CNA data in our database, the second project explored the correlation between CNAs and local gene density across cancer genomes. Using a genome binning method, I found that focal CNAs are significantly enriched in gene-rich regions. Moreover, this positive correlation is not driven by cancer genes alone. Because this result is derived from more than 16,000 cancer samples, it provides a global view of the relationship between cancer genome instability and genome structure. The enrichment suggests that CNA regions across the genome may be subject to non-neutral selection pressure. The observed positive correlation may help to elucidate the mechanisms by which CNAs contribute to tumor development and promote a more systematic understanding of cancer.
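To illustrate the kind of analysis a genome binning method makes possible, the following is a minimal sketch, not the procedure used in the thesis. It assumes fixed 1 Mb bins, counts how many genes and how many focal CNAs overlap each bin, and compares the two with a Spearman rank correlation. The bin size, the function names and the toy intervals are all hypothetical and chosen only for illustration.

```python
from collections import Counter

BIN_SIZE = 1_000_000  # assumed 1 Mb bins; the thesis may use a different size


def bin_counts(intervals, bin_size=BIN_SIZE):
    """Count how many intervals overlap each (chromosome, bin index)."""
    counts = Counter()
    for chrom, start, end in intervals:
        # Intervals are half-open: start inclusive, end exclusive.
        for b in range(start // bin_size, (end - 1) // bin_size + 1):
            counts[(chrom, b)] += 1
    return counts


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation, computed as Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data: (chromosome, start, end) in base pairs.
genes = [
    ("chr1", 100_000, 150_000), ("chr1", 300_000, 320_000),
    ("chr1", 700_000, 760_000), ("chr1", 1_200_000, 1_250_000),
    ("chr1", 3_100_000, 3_150_000), ("chr1", 3_500_000, 3_600_000),
]
focal_cnas = [
    ("chr1", 50_000, 400_000), ("chr1", 200_000, 900_000),
    ("chr1", 600_000, 950_000), ("chr1", 1_100_000, 1_400_000),
    ("chr1", 3_000_000, 3_700_000), ("chr1", 3_400_000, 3_900_000),
]

bins = [("chr1", b) for b in range(4)]  # include bins with no genes or CNAs
gene_counts = bin_counts(genes)
cna_counts = bin_counts(focal_cnas)
gene_density = [gene_counts[b] for b in bins]
cna_frequency = [cna_counts[b] for b in bins]
rho = spearman(gene_density, cna_frequency)
print(gene_density, cna_frequency, rho)
```

A rank correlation is used here because bin counts are typically skewed. A real analysis would also need to handle bins of constant value, chromosome ends and regions that the arrays do not cover.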
The third project concerns a new phenomenon in cancer development, termed “chromothripsis”. In this type of event, contiguous chromosomal regions are fragmented into many pieces, and the cell’s DNA repair machinery randomly fuses these segments together to rescue the genome. This differs markedly from the classical step-by-step model of cancer development. We developed an algorithm based on scan statistics to automatically detect chromothripsis-like patterns and to identify both the size and the location of the involved regions. From an input of 22,347 high-quality arrays, we identified 918 chromothripsis cases, representing 132 cancer types. The results provide several new insights into the distribution of chromothripsis-like patterns and a comprehensive estimate of chromothripsis incidence across a wide range of cancer entities. Importantly, our work partly overcomes a limitation of individual research projects, namely the relatively low incidence of chromothripsis in the cancer samples available to them. An investigation of the affected chromosomal regions supports breakage-fusion-bridge cycles as one of the potential underlying mechanisms. Finally, we evaluated the clinical associations of chromothripsis and found that this event may be associated with a poor outcome. The chromothripsis events observed in our project may reflect heterogeneous biological phenomena and probably vary in their specific impact on oncogenesis. Taken together, the results presented in this thesis characterize the cancer genome using large-scale oncogenomic array data and further elucidate the potential mechanisms underlying cancer development.
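The general idea of a scan statistic can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm developed in the thesis: it slides a fixed-size window along one chromosome, finds the window containing the most copy number breakpoints, and estimates by Monte Carlo simulation how often uniformly scattered breakpoints would cluster as strongly. The window size, chromosome length, function names and breakpoint positions are hypothetical.

```python
import random
from bisect import bisect_left


def max_window_count(positions, window):
    """Return (count, start) for the window [start, start + window)
    that contains the most positions."""
    pts = sorted(positions)
    best_count, best_start = 0, None
    for i, start in enumerate(pts):
        # Index of the first position at or beyond the window end.
        j = bisect_left(pts, start + window)
        if j - i > best_count:
            best_count, best_start = j - i, start
    return best_count, best_start


def scan_p_value(positions, chrom_length, window, n_sim=999, seed=0):
    """Monte Carlo p-value for the observed maximum window count,
    assuming breakpoints are uniformly distributed under the null."""
    rng = random.Random(seed)
    observed = max_window_count(positions, window)[0]
    exceed = 0
    for _sim in range(n_sim):
        simulated = [rng.randrange(chrom_length) for _pos in positions]
        if max_window_count(simulated, window)[0] >= observed:
            exceed += 1
    # Add-one correction so the estimate is never exactly zero.
    return (exceed + 1) / (n_sim + 1)


# Hypothetical breakpoints on a 100 Mb chromosome: ten clustered between
# 40 Mb and 44.5 Mb, plus three scattered elsewhere.
chrom_length = 100_000_000
breakpoints = [40_000_000 + k * 500_000 for k in range(10)]
breakpoints += [5_000_000, 70_000_000, 90_000_000]

window = 5_000_000  # assumed 5 Mb scanning window
count, start = max_window_count(breakpoints, window)
p_value = scan_p_value(breakpoints, chrom_length, window)
print(count, start, p_value)
```

Only windows that begin at a breakpoint need to be checked, because any window with the maximum count can be shifted right until its start coincides with a breakpoint without losing points. The reported start together with the window size gives a rough location and extent of the clustered region. A full method would typically scan several window sizes and consider copy number state changes, not just breakpoint positions.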

Additional indexing

Item Type: Dissertation (monographical)
Referees: von Mering Christian, Baudis Michael
Communities & Collections: UZH Dissertations
Dewey Decimal Classification: Unspecified
Language: English
Place of Publication: Zürich
Date: 2013
Deposited On: 10 Apr 2019 12:21
Last Modified: 07 Apr 2020 07:17
Number of Pages: 175
OA Status: Green
Related URLs: https://www.recherche-portal.ch/primo-explore/fulldisplay?docid=ebi01_prod010046099&context=L&vid=ZAD&search_scope=default_scope&tab=default_tab&lang=de_DE (Library Catalogue)

Download

Green Open Access

Content: Published Version
Language: English
Filetype: PDF
Size: 20MB