Header

UZH-Logo

Maintenance Infos

CNARA: reliability assessment for genomic copy number profiles


Ai, Ni; Cai, Haoyang; Solovan, Caius; Baudis, Michael (2016). CNARA: reliability assessment for genomic copy number profiles. BMC Genomics, 17:799.

Abstract

BACKGROUND

DNA copy number profiles from microarray and sequencing experiments sometimes contain wave artefacts which may be introduced during sample preparation and cannot be removed completely by existing preprocessing methods. Besides, large derivative log ratio spread (DLRS) of the probes correlating with poor DNA quality is sometimes observed in genome screening experiments and may lead to unreliable copy number profiles. Depending on the extent of these artefacts and the resulting misidentification of copy number alterations/variations (CNA/CNV), it may be desirable to exclude such samples from analyses or to adapt the downstream data analysis strategy accordingly.

RESULTS

Here, we propose a method to distinguish reliable genomic copy number profiles from those containing heavy wave artefacts and/or large DLRS. We define four features that adequately summarize the copy number profiles for reliability assessment, and train a classifier on a dataset of 1522 copy number profiles from various microarray platforms. The method can be applied to predict the reliability of copy number profiles irrespective of the underlying microarray platform and may be adapted for those sequencing platforms from which copy number estimates could be computed as a piecewise constant signal. Further details can be found at https://github.com/baudisgroup/CNARA .

CONCLUSIONS

We have developed a method for the assessment of genomic copy number profiling data, and suggest to apply the method in addition to and after other state-of-the-art noise correction and quality control procedures. CNARA could be instrumental in improving the assessment of data used for genomic data mining experiments and support the reliable functional attribution of copy number aberrations especially in cancer research.

Abstract

BACKGROUND

DNA copy number profiles from microarray and sequencing experiments sometimes contain wave artefacts which may be introduced during sample preparation and cannot be removed completely by existing preprocessing methods. Besides, large derivative log ratio spread (DLRS) of the probes correlating with poor DNA quality is sometimes observed in genome screening experiments and may lead to unreliable copy number profiles. Depending on the extent of these artefacts and the resulting misidentification of copy number alterations/variations (CNA/CNV), it may be desirable to exclude such samples from analyses or to adapt the downstream data analysis strategy accordingly.

RESULTS

Here, we propose a method to distinguish reliable genomic copy number profiles from those containing heavy wave artefacts and/or large DLRS. We define four features that adequately summarize the copy number profiles for reliability assessment, and train a classifier on a dataset of 1522 copy number profiles from various microarray platforms. The method can be applied to predict the reliability of copy number profiles irrespective of the underlying microarray platform and may be adapted for those sequencing platforms from which copy number estimates could be computed as a piecewise constant signal. Further details can be found at https://github.com/baudisgroup/CNARA .

CONCLUSIONS

We have developed a method for the assessment of genomic copy number profiling data, and suggest to apply the method in addition to and after other state-of-the-art noise correction and quality control procedures. CNARA could be instrumental in improving the assessment of data used for genomic data mining experiments and support the reliable functional attribution of copy number aberrations especially in cancer research.

Statistics

Citations

Dimensions.ai Metrics
1 citation in Web of Science®
1 citation in Scopus®
Google Scholar™

Altmetrics

Downloads

11 downloads since deposited on 21 Feb 2020
6 downloads since 12 months
Detailed statistics

Additional indexing

Item Type:Journal Article, refereed, original work
Communities & Collections:07 Faculty of Science > Institute of Molecular Life Sciences
Dewey Decimal Classification:570 Life sciences; biology
Scopus Subject Areas:Life Sciences > Biotechnology
Life Sciences > Genetics
Language:English
Date:12 October 2016
Deposited On:21 Feb 2020 06:44
Last Modified:11 May 2020 16:44
Publisher:BioMed Central
ISSN:1471-2164
Funders:Swiss Federation through the Swiss Contribution to the enlarged European Union
OA Status:Gold
Free access at:PubMed ID. An embargo period may apply.
Publisher DOI:https://doi.org/10.1186/s12864-016-3074-7
PubMed ID:27733115

Download

Gold Open Access

Download PDF  'CNARA: reliability assessment for genomic copy number profiles'.
Preview
Content: Published Version
Filetype: PDF
Size: 5MB
View at publisher
Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)