Abstract

Copy number variants (CNVs) are the most common form of structural genetic variation, reflecting the gain or loss of DNA segments compared with a reference genome. Studies have identified CNV associations with different diseases. However, to our knowledge, the association between the sequential order of CNVs and disease-related traits has not been studied, and it remains unclear whether CNVs function individually or in coordination with other CNVs to manifest a disease or trait. We therefore propose the first method to test the association between the sequential order of CNVs and diseases. Our sequential multi-dimensional CNV kernel-based association test (SMCKAT) consists of three parts: (1) a single CNV group kernel measuring the similarity between two groups of CNVs; (2) a whole genome group kernel that aggregates several single group kernels to summarize the similarity between CNV groups in a single chromosome or the whole genome; and (3) an association test between the CNV sequential order and disease-related traits using a random effect model. We evaluate SMCKAT on CNV data sets exhibiting rare or common CNVs, demonstrating that it can detect specific biologically relevant chromosomal regions supported by the biomedical literature. We also compare the performance of SMCKAT with that of MCKAT, a multi-dimensional kernel association test. The results show that SMCKAT detects more specific chromosomal regions than MCKAT: regions where not only the CNV characteristics but also the CNV order is significantly associated with the disease-related trait.
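The three-part structure described above can be illustrated with a minimal sketch. It is not the SMCKAT implementation: the per-group kernel matrices are taken as given, the aggregation rule (element-wise sum) is an assumption, and the p-value comes from a simple permutation procedure swapped in for the random effect model's analytic test. All function names are hypothetical.

```python
import random

def aggregate_kernels(kernels):
    """Sum per-group n x n kernel matrices element-wise (assumed aggregation rule)."""
    n = len(kernels[0])
    return [[sum(k[i][j] for k in kernels) for j in range(n)] for i in range(n)]

def score_statistic(kernel, trait):
    """Variance-component style statistic Q = r^T K r, with r the mean-centred trait."""
    mean = sum(trait) / len(trait)
    r = [t - mean for t in trait]
    n = len(r)
    return sum(r[i] * kernel[i][j] * r[j] for i in range(n) for j in range(n))

def permutation_p_value(kernel, trait, n_perm=2000, seed=0):
    """Estimate a p-value by shuffling the trait (a stand-in for the analytic test)."""
    rng = random.Random(seed)
    observed = score_statistic(kernel, trait)
    shuffled = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if score_statistic(kernel, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)

# Toy data: six subjects; subjects 0-2 share one CNV pattern, subjects 3-5 another.
block = [[1.0 if (i < 3) == (j < 3) else 0.0 for j in range(6)] for i in range(6)]
identity = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
trait = [1, 1, 1, 0, 0, 0]

genome_kernel = aggregate_kernels([block, identity])
q = score_statistic(genome_kernel, trait)
p = permutation_p_value(genome_kernel, trait)
print(q, p)
```

In this toy setting the statistic is large when subjects with similar CNV patterns also have similar trait values, which is the intuition behind kernel-based association tests in general.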

Highlights

  • All humans are 99.9 percent the same, and the 0.1 percent that makes each of us unique is called genetic variation [1]

  • This indicates that both the sequential multi-dimensional CNV kernel-based association test (SMCKAT) and the multi-dimensional copy number variant kernel association test (MCKAT) can properly control the type I and type II error rates at nominal significance levels as low as 10⁻⁵ on the rare copy number variant (CNV) dataset

  • This paper presents a sequential multi-dimensional CNV association test that identifies associations between CNVs and disease-related traits using all multi-dimensional CNV characteristics and the CNV sequential order


Summary

Introduction

All humans are 99.9 percent the same, and the 0.1 percent that makes each of us unique is called genetic variation [1]. Genetic variation has two main forms: structural alteration and sequence variation. Copy number variants (CNVs) and single nucleotide polymorphisms (SNPs) are the most common forms of structural alteration and sequence variation in the human genome, respectively [2]. A SNP represents a difference in a single nucleotide. SNPs within a coding sequence do not necessarily change the amino acid sequence of the protein that is produced, because the genetic code is degenerate. SNPs in the coding region are of two types: nonsynonymous SNPs, which change the amino acid sequence of the protein, and synonymous SNPs, which do not. Many sequence studies have tested the association between SNPs and diseases or traits.
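The distinction between synonymous and nonsynonymous SNPs can be made concrete with a short sketch. It assumes the standard genetic code, DNA codons on the coding strand, and a hypothetical helper named `classify_coding_snp`; a change to or from a stop codon is simply reported as nonsynonymous here.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def classify_coding_snp(codon, position, alt_base):
    """Classify a single-base change in a codon (position is 0, 1, or 2)."""
    mutated = codon[:position] + alt_base + codon[position + 1:]
    if CODON_TABLE[codon] == CODON_TABLE[mutated]:
        return "synonymous"
    return "nonsynonymous"

# GAA -> GAG: both codons encode glutamate (E), so the protein is unchanged.
print(classify_coding_snp("GAA", 2, "G"))
# GAG -> GTG: glutamate (E) becomes valine (V), so the protein changes.
print(classify_coding_snp("GAG", 1, "T"))
```

The first call illustrates degeneracy: several codons map to the same amino acid, so a change in the third base often leaves the protein sequence intact.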

