Abstract

Bioinformatics and computational biology are rooted in the life sciences as well as in computer and information sciences and technologies. Bioinformatics applies principles of information science and technology to make vast, diverse, and complex life sciences data more understandable and useful. Computational biology uses mathematical and computational approaches to address theoretical and experimental questions in biology. Short-read sequence assembly is one of the most important steps in the analysis of biological data. Many open-source software packages are available for short-read sequence assembly, and MAQ is one that is widely used by the research community. Data sets generated by next-generation sequencers are typically very large and require a tremendous amount of computational resources. The short-read sequence assembly problem is NP-hard, which makes it computationally expensive and time-consuming. In addition, MAQ is single-threaded software: it does not exploit multi-core or distributed computing, and it does not scale. In this paper we report HPC-MAQ, which addresses the NP-hard-related challenges of genome reference assembly and makes MAQ parallel and scalable through Hadoop, a software framework for distributed computing. We also explore thread-level parallelism using OpenMP, which can reduce computation time.
