Abstract
Gene duplication is an important source of gain-of-function events. When one copy of a duplicated gene mutates, the other copy continues to work and maintains cellular function, so the mutation does not impair the cell. However, duplicated genes are difficult to identify from gene sequences in non-model species because the duplicated copies are highly similar in sequence. As a result, most known duplicated genes have been found in species whose whole genome sequences are available. In this study, to avoid costly and time-consuming whole-genome sequencing, we propose techniques for identifying duplicated genes using large amounts of mRNA sequences obtained by next-generation sequencers, together with their mutation positions. We apply a frequent pattern mining technique to detect mutated regions; the method allows us to compute the sequences of the duplicated genes and their mutated positions from closely related species. In this paper, we apply the algorithm to data from four different mollusks obtained by next-generation sequencers and successfully predict more than one hundred duplicated genes, including a zinc finger protein whose sequence and function have both diverged from those of related species.
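To illustrate the general idea of mining frequent mutation patterns from reads, the following is a minimal sketch and not the authors' algorithm. It assumes reads are already aligned without gaps to a reference transcript (for example, one from a closely related species) and counts pairs of mismatches that co-occur in many reads; such recurring pairs could hint at a second, diverged gene copy rather than sequencing noise. The function names, the toy data, and the `min_support` threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def mismatches(reference, read, offset):
    """Return the (position, base) differences between a read and the reference.

    Assumes a gap-free alignment starting at `offset` (a simplification).
    """
    return frozenset(
        (offset + i, base)
        for i, base in enumerate(read)
        if reference[offset + i] != base
    )


def frequent_mutation_pairs(reference, aligned_reads, min_support):
    """Return pairs of mismatches that co-occur in at least `min_support` reads."""
    pair_counts = Counter()
    for offset, read in aligned_reads:
        # Sort so that each pair is always counted under the same key.
        items = sorted(mismatches(reference, read, offset))
        pair_counts.update(combinations(items, 2))
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# Toy data (hypothetical): three reads share the same two variants.
reference = "ACGTACGTAC"
aligned_reads = [
    (0, "ACGTACGTAC"),  # identical to the reference
    (0, "ACTTACGAAC"),  # variants at positions 2 and 7
    (0, "ACTTACGAAC"),  # same two variants
    (2, "TTACGAAC"),    # shorter read carrying the same two variants
    (0, "ACGTACCTAC"),  # isolated mismatch at position 6
]

result = frequent_mutation_pairs(reference, aligned_reads, min_support=2)
print(result)
```

In this toy input, only the pair of variants at positions 2 and 7 passes the support threshold, while the isolated mismatch at position 6 is ignored. A real pipeline would need proper read alignment, larger itemsets, and error modeling, none of which is shown here.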
More From: IPSJ SIG technical reports