Gene duplication detection with frequent pattern analysis


Konaka, Yoshiko; Fukuzaki, Mutsumi; Yoshida, Masaaki; Shimizu, Kentaro K; Ogura, Atsushi; Sese, Jun (2010). Gene duplication detection with frequent pattern analysis. IPSJ SIG Technical Reports, 23:1-8.

Abstract

Gene duplication is an important event for gain of function: a mutation in one copy of a duplicated gene does not impair cellular function, because the other copy continues to work. However, duplicated genes are difficult to identify from gene sequences in non-model species because the copies are highly similar to one another. As a result, most known duplicated genes have been found in species whose whole genome sequences are available. In this study, to avoid costly and time-consuming whole-genome sequencing, we propose techniques for identifying duplicated genes from large amounts of mRNA sequence data obtained with next-generation sequencers, together with their mutation positions. We apply a frequent pattern mining technique to detect mutated regions; the method allows us to compute the sequences of the duplicated genes and the mutated positions from closely related species. We applied the algorithm to data from four different mollusks obtained with next-generation sequencers and successfully predicted more than one hundred duplicated genes, including a zinc finger protein whose sequence and function have both diverged from those of related species.

Statistics

Downloads

2 downloads since deposited on 18 Feb 2013
1 download in the past 12 months

Additional indexing

Other titles: 頻出パターン解析による重複遺伝子の同定手法
Item Type: Journal Article, refereed, original work
Communities & Collections: 07 Faculty of Science > Department of Plant and Microbial Biology
07 Faculty of Science > Institute of Evolutionary Biology and Environmental Studies
Dewey Decimal Classification: 570 Life sciences; biology
590 Animals (Zoology)
580 Plants (Botany)
Language: Japanese
Date: 9 December 2010
Deposited On: 18 Feb 2013 15:45
Last Modified: 05 Apr 2016 16:22
Publisher: Information Processing Society of Japan
ISSN: 0919-6072
Related URLs: http://ci.nii.ac.jp/vol_issue/nels/AA1221543X_en.html

Download

Content: Published Version
Filetype: PDF - Registered users only
Size: 815kB