ZORA (Zurich Open Repository and Archive)

ARPEGGIO: Automated Reproducible Polyploid EpiGenetic GuIdance workflOw

Milosavljevic, Stefan; Kuo, Tony; Decarli, Samuele; Mohn, Lucas; Sese, Jun; Shimizu, Kentaro K; Shimizu-Inatsugi, Rie; Robinson, Mark D (2021). ARPEGGIO: Automated Reproducible Polyploid EpiGenetic GuIdance workflOw. BMC Genomics, 22:547.


Background: Whole genome duplication (WGD) events are common in the evolutionary history of many living organisms. For decades, researchers have been trying to understand the genetic and epigenetic impact of WGD and its underlying molecular mechanisms. Particular attention was given to allopolyploid study systems, species resulting from a hybridization event accompanied by WGD. Investigating the mechanisms behind the survival of a newly formed allopolyploid highlighted the key role of DNA methylation. With the improvement of high-throughput methods, such as whole genome bisulfite sequencing (WGBS), an opportunity opened to further understand the role of DNA methylation at a larger scale and higher resolution. However, only a few studies have applied WGBS to allopolyploids, which might be due to a lack of genomic resources combined with a burdensome data analysis process. To overcome these problems, we developed the Automated Reproducible Polyploid EpiGenetic GuIdance workflOw (ARPEGGIO): the first workflow for the analysis of epigenetic data in polyploids. This workflow analyzes WGBS data from allopolyploid species via the genome assemblies of the allopolyploid's parent species. ARPEGGIO utilizes an updated read classification algorithm (EAGLE-RC) to tackle the challenge of sequence similarity amongst parental genomes. ARPEGGIO offers automation, but more importantly, a complete set of analyses including spot checks starting from raw WGBS data: quality checks, trimming, alignment, methylation extraction, statistical analyses and downstream analyses. A full run of ARPEGGIO outputs a list of genes showing differential methylation. ARPEGGIO was made simple to set up, run and interpret, and its implementation ensures reproducibility by including both package management and containerization.

Results: We evaluated ARPEGGIO in two ways. First, we tested EAGLE-RC's performance with publicly available datasets given a ground truth, and we show that EAGLE-RC decreases the error rate by 3 to 4 times compared to standard approaches. Second, using the same initial dataset, we show agreement between ARPEGGIO's output and published results. Compared to other similar workflows, ARPEGGIO is the only one supporting polyploid data.

Conclusions: The goal of ARPEGGIO is to promote, support and improve polyploid research with a reproducible and automated set of analyses in a convenient implementation. ARPEGGIO is available at .

Keywords: Allopolyploids; Automation; Bisulfite-sequencing; DNA-methylation; Epigenetics; Polyploidy; Reproducibility; Snakemake; Whole-genome-bisulfite-sequencing; Workflow.

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 07 Faculty of Science > Institute of Molecular Life Sciences
08 Research Priority Programs > Evolution in Action: From Genomes to Ecosystems
Dewey Decimal Classification: 570 Life sciences; biology
Scopus Subject Areas: Life Sciences > Biotechnology
Life Sciences > Genetics
Uncontrolled Keywords: Genetics, Biotechnology
Date: 1 December 2021
Deposited On: 27 Aug 2021 15:20
Last Modified: 26 Aug 2024 01:35
Publisher: BioMed Central
OA Status: Gold
Free access at: PubMed ID. An embargo period may apply.
Publisher DOI:
PubMed ID: 34273949
Project Information:
  • Funder: University Research Priority Program (URPP) Evolution in Action of the University of Zurich
  • Grant ID:
  • Project Title:
Download PDF  'ARPEGGIO: Automated Reproducible Polyploid EpiGenetic GuIdance workflOw'.
  • Content: Published Version
  • Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)

Citations Metrics
4 citations in Web of Science®
4 citations in Scopus®

22 downloads since deposited on 27 Aug 2021
3 downloads in the past 12 months