Publication:

Tree reconciliation combined with subsampling improves large scale inference of orthologous group hierarchies

Date
2019
Journal Article
Published version
cris.lastimport.scopus: 2025-06-03T03:30:55Z
cris.lastimport.wos: 2025-07-22T01:31:40Z
cris.virtual.orcid: https://orcid.org/0000-0001-7734-9102
cris.virtualsource.orcid: 2555623e-228d-47f5-9e95-a606b4b3a224
dc.contributor.institution: University of Zurich
dc.date.accessioned: 2020-02-14T10:27:30Z
dc.date.available: 2020-02-14T10:27:30Z
dc.date.issued: 2019-12-01
dc.description.abstract

Background: An orthologous group (OG) comprises a set of orthologous and paralogous genes that share a last common ancestor (LCA). OGs are defined with respect to a chosen taxonomic level, which delimits the position of the LCA in time to a specified speciation event. A hierarchy of OGs expands on this notion, connecting more general OGs, distant in time, to more recent, fine-grained OGs, thereby spanning multiple levels of the tree of life. Large-scale inference of OG hierarchies with independently computed taxonomic levels can suffer from inconsistencies between successive levels, such as the position in time of a duplication event. These inconsistencies can be due to confounding genetic signal or algorithmic limitations. Importantly, they limit the potential use of OGs for functional annotation and third-party applications.

Results: Here we present a new methodology to ensure hierarchical consistency of OGs across taxonomic levels. To resolve an inconsistency, we subsample the protein space of the OG members and perform gene tree-species tree reconciliation for each sample. Unlike previous approaches, subsampling the protein space avoids the notoriously difficult task of accurately building and reconciling very large phylogenies. We implement the method in a high-throughput pipeline and apply it to the eggNOG database. We use independent protein domain definitions to validate its performance.

Conclusion: The presented consistency pipeline shows that, despite previously reported limitations, tree reconciliation can be a useful instrument for the construction of OG hierarchies. The key lies in combining the sampling of smaller trees with the aggregation of their reconciliations for robustness. Results show performance comparable to or better than that of previous pipelines. The code is available on GitHub at: https://github.com/meringlab/og_consistency_pipeline.
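
The subsample-and-aggregate idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the (protein_id, species) data layout, the sample counts, and the majority-vote aggregation are all assumptions made for illustration, and toy_reconcile is only a stand-in for the real tree building and gene tree-species tree reconciliation step.

```python
import random
from collections import Counter


def resolve_by_subsampling(members, reconcile, n_samples=25, sample_size=4, seed=0):
    """Majority vote over reconciliation calls on random subsets of OG members.

    members   : list of (protein_id, species) tuples (hypothetical layout)
    reconcile : callable that takes one subsample and returns a label
    Returns the winning label and the fraction of samples that voted for it.
    """
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    size = min(sample_size, len(members))
    votes = Counter()
    for _ in range(n_samples):
        sample = rng.sample(members, size)  # one small draw from the protein space
        votes[reconcile(sample)] += 1
    label, count = votes.most_common(1)[0]
    return label, count / n_samples


def toy_reconcile(sample):
    """Placeholder for tree building plus reconciliation (NOT the real method).

    Calls 'duplication' when any species contributes more than one protein
    to the sample, and 'speciation' otherwise.
    """
    species = [sp for _, sp in sample]
    return "duplication" if len(set(species)) < len(species) else "speciation"


if __name__ == "__main__":
    # Five proteins from five different species: every sample votes 'speciation'.
    og_members = [("p1", "human"), ("p2", "mouse"), ("p3", "fly"),
                  ("p4", "worm"), ("p5", "yeast")]
    print(resolve_by_subsampling(og_members, toy_reconcile))  # ('speciation', 1.0)
```

In a real setting, the reconcile callable would build a small phylogeny from each sample and reconcile it against the species tree; the returned support fraction gives a rough indication of how robust the aggregated decision is.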

dc.identifier.doi: 10.1186/s12859-019-2828-z
dc.identifier.issn: 1471-2105
dc.identifier.scopus: 2-s2.0-85065656024
dc.identifier.uri: https://www.zora.uzh.ch/handle/20.500.14742/168508
dc.identifier.wos: 000467049700001
dc.language.iso: eng
dc.subject: Biochemistry
dc.subject: Applied Mathematics
dc.subject: Molecular Biology
dc.subject: Structural Biology
dc.subject: Computer Science Applications
dc.subject.ddc: 570 Life sciences; biology
dc.title

Tree reconciliation combined with subsampling improves large scale inference of orthologous group hierarchies

dc.type: article
dcterms.accessRights: info:eu-repo/semantics/openAccess
dcterms.bibliographicCitation.journaltitle: BMC Bioinformatics
dcterms.bibliographicCitation.number: 1
dcterms.bibliographicCitation.originalpublishername: BioMed Central
dcterms.bibliographicCitation.pagestart: 228
dcterms.bibliographicCitation.pmid: 31060495
dcterms.bibliographicCitation.volume: 20
dspace.entity.type: Publication (en)
uzh.contributor.affiliation: University of Zurich, Swiss Institute of Bioinformatics
uzh.contributor.affiliation: University of Zurich, Swiss Institute of Bioinformatics
uzh.contributor.affiliation: University of Zurich, Swiss Institute of Bioinformatics
uzh.contributor.author: Heller, Davide
uzh.contributor.author: Szklarczyk, Damian
uzh.contributor.author: von Mering, Christian
uzh.contributor.correspondence: No
uzh.contributor.correspondence: No
uzh.contributor.correspondence: Yes
uzh.document.availability: published_version
uzh.eprint.datestamp: 2020-02-14 10:27:30
uzh.eprint.lastmod: 2025-07-22 01:37:15
uzh.eprint.statusChange: 2020-02-14 10:27:30
uzh.harvester.eth: Yes
uzh.harvester.nb: No
uzh.identifier.doi: 10.5167/uzh-185060
uzh.jdb.eprintsId: 13783
uzh.oastatus.unpaywall: gold
uzh.oastatus.zora: Gold
uzh.publication.citation: Heller, D., Szklarczyk, D., & von Mering, C. (2019). Tree reconciliation combined with subsampling improves large scale inference of orthologous group hierarchies. BMC Bioinformatics, 20, 228. https://doi.org/10.1186/s12859-019-2828-z
uzh.publication.freeAccessAt: pubmedid
uzh.publication.originalwork: original
uzh.publication.publishedStatus: final
uzh.scopus.impact: 1
uzh.scopus.subjects: Structural Biology
uzh.scopus.subjects: Biochemistry
uzh.scopus.subjects: Molecular Biology
uzh.scopus.subjects: Computer Science Applications
uzh.scopus.subjects: Applied Mathematics
uzh.workflow.doaj: uzh.workflow.doaj.true
uzh.workflow.eprintid: 185060
uzh.workflow.fulltextStatus: public
uzh.workflow.revisions: 46
uzh.workflow.rightsCheck: keininfo
uzh.workflow.source: CrossRef:10.1186/s12859-019-2828-z
uzh.workflow.status: archive
uzh.wos.impact: 1
Files

Original bundle

Name:
ZORA185060.pdf
Size:
1.12 MB
Format:
Adobe Portable Document Format
Publication available in collections: