Count-based differential expression analysis of RNA sequencing data using R and Bioconductor


Anders, Simon; McCarthy, Davis J; Chen, Yunshun; Okoniewski, Michal; Smyth, Gordon K; Huber, Wolfgang; Robinson, Mark D (2013). Count-based differential expression analysis of RNA sequencing data using R and Bioconductor. Nature Protocols, 8(9):1765-1786.

Abstract

RNA sequencing (RNA-seq) has been rapidly adopted for the profiling of transcriptomes in many areas of biology, including studies into gene regulation, development and disease. Of particular interest is the discovery of differentially expressed genes across different conditions (e.g., tissues, perturbations) while optionally adjusting for other systematic factors that affect the data-collection process. There are a number of subtle yet crucial aspects of these analyses, such as read counting, appropriate treatment of biological variability, quality control checks and appropriate setup of statistical modeling. Several variations have been presented in the literature, and there is a need for guidance on current best practices. This protocol presents a state-of-the-art computational and statistical RNA-seq differential expression analysis workflow largely based on the free open-source R language and Bioconductor software and, in particular, on two widely used tools, DESeq and edgeR. Hands-on time for typical small experiments (e.g., 4-10 samples) can be <1 h, with computation time <1 d using a standard desktop PC.

Statistics

Citations

289 citations in Web of Science®
287 citations in Scopus®

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 04 Faculty of Medicine > Functional Genomics Center Zurich
    07 Faculty of Science > Institute of Molecular Life Sciences
Dewey Decimal Classification: 570 Life sciences; biology
    610 Medicine & health
Language: English
Date: 24 April 2013
Deposited On: 16 Sep 2013 06:43
Last Modified: 05 Apr 2016 16:58
Publisher: Nature Publishing Group
ISSN: 1750-2799
Publisher DOI: https://doi.org/10.1038/nprot.2013.099
PubMed ID: 23975260

Download

Content: Published Version
Filetype: PDF - Registered users only
Size: 1MB