Header

UZH-Logo

Maintenance Infos

Comparison of clustering methods for high-dimensional single-cell flow and mass cytometry data


Weber, Lukas M; Robinson, Mark D (2016). Comparison of clustering methods for high-dimensional single-cell flow and mass cytometry data. Cytometry. Part A, 89(12):1084-1096.

Abstract

Recent technological developments in high-dimensional flow cytometry and mass cytometry (CyTOF) have made it possible to detect expression levels of dozens of protein markers in thousands of cells per second, allowing cell populations to be characterized in unprecedented detail. Traditional data analysis by "manual gating" can be inefficient and unreliable in these high-dimensional settings, which has led to the development of a large number of automated analysis methods. Methods designed for unsupervised analysis use specialized clustering algorithms to detect and define cell populations for further downstream analysis. Here, we have performed an up-to-date, extensible performance comparison of clustering methods for high-dimensional flow and mass cytometry data. We evaluated methods using several publicly available data sets from experiments in immunology, containing both major and rare cell populations, with cell population identities from expert manual gating as the reference standard. Several methods performed well, including FlowSOM, X-shift, PhenoGraph, Rclusterpp, and flowMeans. Among these, FlowSOM had extremely fast runtimes, making this method well-suited for interactive, exploratory analysis of large, high-dimensional data sets on a standard laptop or desktop computer. These results extend previously published comparisons by focusing on high-dimensional data and including new methods developed for CyTOF data. R scripts to reproduce all analyses are available from GitHub (https://github.com/lmweber/cytometry-clustering-comparison), and pre-processed data files are available from FlowRepository (FR-FCM-ZZPH), allowing our comparisons to be extended to include new clustering methods and reference data sets.

Abstract

Recent technological developments in high-dimensional flow cytometry and mass cytometry (CyTOF) have made it possible to detect expression levels of dozens of protein markers in thousands of cells per second, allowing cell populations to be characterized in unprecedented detail. Traditional data analysis by "manual gating" can be inefficient and unreliable in these high-dimensional settings, which has led to the development of a large number of automated analysis methods. Methods designed for unsupervised analysis use specialized clustering algorithms to detect and define cell populations for further downstream analysis. Here, we have performed an up-to-date, extensible performance comparison of clustering methods for high-dimensional flow and mass cytometry data. We evaluated methods using several publicly available data sets from experiments in immunology, containing both major and rare cell populations, with cell population identities from expert manual gating as the reference standard. Several methods performed well, including FlowSOM, X-shift, PhenoGraph, Rclusterpp, and flowMeans. Among these, FlowSOM had extremely fast runtimes, making this method well-suited for interactive, exploratory analysis of large, high-dimensional data sets on a standard laptop or desktop computer. These results extend previously published comparisons by focusing on high-dimensional data and including new methods developed for CyTOF data. R scripts to reproduce all analyses are available from GitHub (https://github.com/lmweber/cytometry-clustering-comparison), and pre-processed data files are available from FlowRepository (FR-FCM-ZZPH), allowing our comparisons to be extended to include new clustering methods and reference data sets.

Statistics

Citations

Dimensions.ai Metrics
237 citations in Web of Science®
252 citations in Scopus®
Google Scholar™

Altmetrics

Downloads

347 downloads since deposited on 07 Feb 2017
24 downloads since 12 months
Detailed statistics

Additional indexing

Item Type:Journal Article, refereed, original work
Communities & Collections:07 Faculty of Science > Institute of Molecular Life Sciences
Dewey Decimal Classification:570 Life sciences; biology
Scopus Subject Areas:Health Sciences > Pathology and Forensic Medicine
Health Sciences > Histology
Life Sciences > Cell Biology
Uncontrolled Keywords:Pathology and Forensic Medicine, Cell Biology, Histology
Language:English
Date:December 2016
Deposited On:07 Feb 2017 12:13
Last Modified:16 Jul 2024 01:40
Publisher:Wiley-Blackwell Publishing, Inc.
ISSN:1552-4922
OA Status:Hybrid
Free access at:Publisher DOI. An embargo period may apply.
Publisher DOI:https://doi.org/10.1002/cyto.a.23030
PubMed ID:27992111
  • Content: Published Version
  • Language: English
  • Licence: Creative Commons: Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)