Abstract

Association mapping is the process of linking phenotypes with genotypes. In genome-wide association studies (GWAS), individuals are first genotyped using microarrays or by aligning sequenced reads to reference genomes. However, both approaches rely on reference genomes, which limits their application to organisms with no or incomplete reference genomes. To address this, reference-free association mapping methods have been developed. Here we present the protocol of an alignment-free method for association studies that is based on counting k-mers in sequenced reads, testing for associations between k-mers and the phenotype of interest, and locally assembling the statistically significant k-mers. The method can map associations of categorical phenotypes to sequence and structural variations without requiring prior sequencing of reference genomes.
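
The first two steps named above, counting k-mers and testing each k-mer for association with a categorical phenotype, can be illustrated with a minimal sketch. This is not the protocol's implementation. The function names are hypothetical, and the Poisson likelihood ratio test is an assumption chosen for illustration, since the abstract does not state which test is used. Local assembly of significant k-mers is not covered.

```python
import math
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k):
    """Count canonical k-mers over all reads of one sample (hypothetical helper).

    K-mers containing characters other than A, C, G, T are skipped.
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def poisson_lrt_pvalue(k_case, n_case, k_ctrl, n_ctrl):
    """P-value of a likelihood ratio test for one k-mer (illustrative assumption).

    k_case, k_ctrl: total count of the k-mer in cases and in controls.
    n_case, n_ctrl: total k-mer count in each group, used as the normalizer.
    Null: one shared Poisson rate. Alternative: a separate rate per group.
    """
    def loglik(k, n, rate):
        # Poisson log-likelihood without the log(k!) term, which cancels in the ratio.
        return k * math.log(rate * n) - rate * n if k > 0 else -rate * n

    pooled = (k_case + k_ctrl) / (n_case + n_ctrl)
    if pooled == 0:
        return 1.0  # k-mer never observed, so there is no evidence either way
    ll_null = loglik(k_case, n_case, pooled) + loglik(k_ctrl, n_ctrl, pooled)
    ll_alt = (loglik(k_case, n_case, k_case / n_case)
              + loglik(k_ctrl, n_ctrl, k_ctrl / n_ctrl))
    statistic = max(0.0, 2.0 * (ll_alt - ll_null))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(statistic / 2.0))
```

In a real study, every k-mer would be tested, so the resulting p-values would need correction for multiple testing before any k-mer is called significant.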

Highlights

  • Association mapping, i.e., the process of associating genotypes with phenotypes, is most frequently performed in the form of genome-wide association studies (GWAS) with single nucleotide polymorphisms (SNPs)

  • Microarrays are used to genotype individuals at a large number of known SNP locations and each SNP is tested for association with the phenotype of interest

  • We present the protocol of the reference-free association mapping tool HAWK, which was developed by Rahman et al. (2018) and extended by Mehrab et al. (2020)

Summary

Introduction

[Background] Association mapping, i.e., the process of associating genotypes with phenotypes, is most frequently performed in the form of genome-wide association studies (GWAS) with single nucleotide polymorphisms (SNPs). Microarrays are used to genotype individuals at a large number of known SNP locations, and each SNP is tested for association with the phenotype of interest. This approach requires prior sequencing of a reference genome and determination of the SNP locations. This precludes mapping associations to structural variations such as insertion-deletions (indels) and copy number variations, and to variations outside of the reference genome. With advances in sequencing technologies, the use of whole-genome sequencing reads for association mapping is becoming increasingly widespread.
