Graph-based modeling of tandem repeats improves global multiple sequence alignment


Szalkowski, Adam M; Anisimova, Maria (2013). Graph-based modeling of tandem repeats improves global multiple sequence alignment. Nucleic Acids Research, 41(17):e162.

Abstract

Tandem repeats (TRs) are often present in proteins with crucial functions, responsible for resistance, pathogenicity and associated with infectious or neurodegenerative diseases. This motivates numerous studies of TRs and their evolution, requiring accurate multiple sequence alignment. TRs may be lost or inserted at any position of a TR region by replication slippage or recombination, but current methods assume fixed unit boundaries, and yet are of high complexity. We present a new global graph-based alignment method that does not restrict TR unit indels by unit boundaries. TR indels are modeled separately and penalized using the phylogeny-aware alignment algorithm. This ensures enhanced accuracy of reconstructed alignments, disentangling TRs and measuring indel events and rates in a biologically meaningful way. Our method detects not only duplication events but also all changes in TR regions owing to recombination, strand slippage and other events inserting or deleting TR units. We evaluate our method by simulation incorporating TR evolution, by either sampling TRs from a profile hidden Markov model or by mimicking strand slippage with duplications. The new method is illustrated on a family of type III effectors, a pathogenicity determinant in agriculturally important bacteria Ralstonia solanacearum. We show that TR indel rate variation contributes to the diversification of this protein family.

Statistics

Citations

19 citations in Web of Science®
17 citations in Scopus®

Downloads

17 downloads since deposited on 19 Nov 2018
4 downloads since 12 months

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: National licences > 142-005
Dewey Decimal Classification: 570 Life sciences; biology; 610 Medicine & health
Scopus Subject Areas: Life Sciences > Genetics
Uncontrolled Keywords: Genetics
Language: English
Date: 1 September 2013
Deposited On: 19 Nov 2018 18:16
Last Modified: 15 Apr 2021 14:51
Publisher: Oxford University Press
ISSN: 0305-1048
OA Status: Gold
Free access at: PubMed ID. An embargo period may apply.
Publisher DOI: https://doi.org/10.1093/nar/gkt628
PubMed ID: 23877246
