Abstract

The pre-processing phase is a crucial step in preparing any data for in-depth analysis. Genome-wide data are considered big data, and handling them remains a significant challenge. Genome-wide association studies (GWAS) are based on enormous, high-density, high-throughput data. This paper illustrates the main pre-processing steps applied to data from the North American Rheumatoid Arthritis Consortium (NARAC) to prepare it for haplotype block partitioning using different methods and on different platforms. Its main objective is to summarize the steps for pre-processing the raw genotyped dataset in preparation for haplotype block partitioning and further analyses. We present each practical step in clear tables for better visualization, elucidation, and interpretation of the workflow. We also aim to handle missing data and to normalize the output into a standardized format. Ultimately, this should improve the understanding of such data formats and lay the foundation for critical genome-wide experiments and studies. Thus, this work could serve as a guide for other researchers who use similar data. The pre-processed data will be applied to imputation, to BigLD block partitioning in R, and to Haploview. Our sequence of pre-processing steps begins by preparing the characters in a form suitable for imputation. The next step is recoding the data in 0, 1, 2 format, as required by BigLD. Finally, the data are prepared for Haploview to provide clear haplotype block partitioning, association analysis, and further analyses.
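As an illustration of the 0, 1, 2 format mentioned above, the following is a minimal sketch of additive recoding for a single SNP. It is not the procedure used in the paper. It assumes that each genotype call is a two-character string such as "AG", that missing calls are marked with tokens such as "??", and that the code counts copies of the minor allele. The function name `recode_additive` and the missing-value tokens are hypothetical.

```python
from collections import Counter


def recode_additive(calls, missing=("??", "NA", "")):
    """Recode one SNP's genotype calls as minor-allele counts (0, 1, 2).

    Assumptions (not taken from the paper): each call is a two-character
    string such as "AG"; missing calls use one of the tokens in `missing`
    and are returned as None.
    """
    # Count every allele observed across the non-missing calls.
    counts = Counter()
    for call in calls:
        if call in missing:
            continue
        if len(call) != 2:
            raise ValueError(f"unexpected genotype call: {call!r}")
        counts.update(call)

    if len(counts) > 2:
        raise ValueError("more than two alleles observed for this SNP")

    if len(counts) < 2:
        # Monomorphic (or entirely missing) SNP: there is no minor allele.
        return [None if call in missing else 0 for call in calls]

    # Minor allele = least frequent allele; ties are broken alphabetically
    # so that the result is deterministic.
    minor = min(counts, key=lambda allele: (counts[allele], allele))

    # Each genotype is coded as its number of minor-allele copies.
    return [None if call in missing else call.count(minor) for call in calls]


example = recode_additive(["AA", "AG", "GG", "??", "AA"])
print(example)
```

In this example the allele G is observed three times and A five times, so G is treated as the minor allele and the missing call is kept as None for later imputation or filtering.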
