Header

UZH-Logo

Maintenance Infos

Vy-PER: eliminating false positive detection of virus integration events in next generation sequencing data


Forster, Michael; Szymczak, Silke; Ellinghaus, David; Hemmrich, Georg; Rühlemann, Malte; Kraemer, Lars; Mucha, Sören; Wienbrandt, Lars; Stanulla, Martin; Franke, Andre (2015). Vy-PER: eliminating false positive detection of virus integration events in next generation sequencing data. Scientific Reports, 5:11534.

Abstract

Several pathogenic viruses such as hepatitis B and human immunodeficiency viruses may integrate into the host genome. These virus/host integrations are detectable using paired-end next generation sequencing. However, the low number of expected true virus integrations may be difficult to distinguish from the noise of many false positive candidates. Here, we propose a novel filtering approach that increases specificity without compromising sensitivity for virus/host chimera detection. Our detection pipeline termed Vy-PER (Virus integration detection bY Paired End Reads) outperforms existing similar tools in speed and accuracy. We analysed whole genome data from childhood acute lymphoblastic leukemia (ALL), which is characterised by genomic rearrangements and usually associated with radiation exposure. This analysis was motivated by the recently reported virus integrations at genomic rearrangement sites and association with chromosomal instability in liver cancer. However, as expected, our analysis of 20 tumour and matched germline genomes from ALL patients finds no significant evidence for integrations by known viruses. Nevertheless, our method eliminates 12,800 false positives per genome (80× coverage) and only our method detects singleton human-phiX174-chimeras caused by optical errors of the Illumina HiSeq platform. This high accuracy is useful for detecting low virus integration levels as well as non-integrated viruses.

Abstract

Several pathogenic viruses such as hepatitis B and human immunodeficiency viruses may integrate into the host genome. These virus/host integrations are detectable using paired-end next generation sequencing. However, the low number of expected true virus integrations may be difficult to distinguish from the noise of many false positive candidates. Here, we propose a novel filtering approach that increases specificity without compromising sensitivity for virus/host chimera detection. Our detection pipeline termed Vy-PER (Virus integration detection bY Paired End Reads) outperforms existing similar tools in speed and accuracy. We analysed whole genome data from childhood acute lymphoblastic leukemia (ALL), which is characterised by genomic rearrangements and usually associated with radiation exposure. This analysis was motivated by the recently reported virus integrations at genomic rearrangement sites and association with chromosomal instability in liver cancer. However, as expected, our analysis of 20 tumour and matched germline genomes from ALL patients finds no significant evidence for integrations by known viruses. Nevertheless, our method eliminates 12,800 false positives per genome (80× coverage) and only our method detects singleton human-phiX174-chimeras caused by optical errors of the Illumina HiSeq platform. This high accuracy is useful for detecting low virus integration levels as well as non-integrated viruses.

Statistics

Citations

3 citations in Web of Science®
5 citations in Scopus®
Google Scholar™

Altmetrics

Downloads

20 downloads since deposited on 26 Nov 2015
14 downloads since 12 months
Detailed statistics

Additional indexing

Item Type:Journal Article, refereed, original work
Communities & Collections:04 Faculty of Medicine > University Children's Hospital Zurich > Medical Clinic
Dewey Decimal Classification:610 Medicine & health
Language:English
Date:July 2015
Deposited On:26 Nov 2015 14:39
Last Modified:04 Aug 2017 12:59
Publisher:Nature Publishing Group
ISSN:2045-2322
Free access at:PubMed ID. An embargo period may apply.
Publisher DOI:https://doi.org/10.1038/srep11534
PubMed ID:26166306

Download

Preview Icon on Download
Preview
Content: Published Version
Filetype: PDF
Size: 603kB
View at publisher

Article Networks

TrendTerms

TrendTerms displays relevant terms of the abstract of this publication and related documents on a map. The terms and their relations were extracted from ZORA using word statistics. Their timelines are taken from ZORA as well. The bubble size of a term is proportional to the number of documents where the term occurs. Red, orange, yellow and green colors are used for terms that occur in the current document; red indicates high interlinkedness of a term with other terms, orange, yellow and green decreasing interlinkedness. Blue is used for terms that have a relation with the terms in this document, but occur in other documents.
You can navigate and zoom the map. Mouse-hovering a term displays its timeline, clicking it yields the associated documents.

Author Collaborations