Abstract

Rapid developments in high-throughput sequencing technologies have allowed researchers to analyze the full genomic sequences of organisms faster and more cheaply than ever before. An important application of these advances is to identify the impact of single nucleotide polymorphisms (SNPs) on phenotypes and genotypes within a species by discovering the factors that affect the occurrence of SNPs. This study examines whether climate factors affecting a geographical area, such as the main climate, precipitation, and temperature, are associated with specific variations in certain ecotypes of the plant Arabidopsis thaliana. To test our hypothesis, we analyzed 18 genes that encode Forkhead-Associated (FHA) domain-containing proteins, extracted from 80 genomic sequences gathered from within 8 Eurasian regions. We used k-means clustering to separate the plants into distinct groups and evaluated the clusters using a novel scoring system based on the Köppen-Geiger climate classification system. These methods allow the selection of candidate clusters most likely to contain samples with similar polymorphisms. The resulting clusters show a correlation between genomic variations and the geographic distribution of those ecotypes.

Highlights

  • We aim to determine whether single nucleotide polymorphisms (SNPs) appearing in the 18 FHA domain gene sequences are related to climate properties at the locations from which the samples were collected

  • The mean, standard deviation (SD), median, and coefficient of variation (CV) for each individual K-score all assist in finding a stable k-value for the clustering
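The summary statistics named in the last highlight can be sketched as follows. This is a minimal illustration, not the authors' implementation: the K-score itself is not defined in this excerpt, so the score values, the `summarize_scores` helper, and the rule of preferring the k with the lowest CV are all assumptions made for the example.

```python
import statistics


def summarize_scores(scores):
    """Return mean, sample SD, median, and CV for one k's scores."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1)
    median = statistics.median(scores)
    # CV is the SD relative to the mean; guard against a zero mean.
    cv = sd / mean if mean != 0 else float("inf")
    return {"mean": mean, "sd": sd, "median": median, "cv": cv}


# Hypothetical scores from repeated clustering runs, keyed by k.
# These numbers are invented for illustration only.
scores_by_k = {
    2: [0.50, 0.70, 0.60, 0.40],
    3: [0.62, 0.60, 0.61, 0.63],
    4: [0.55, 0.75, 0.45, 0.65],
}

summaries = {k: summarize_scores(v) for k, v in scores_by_k.items()}

# Assumed selection rule: the k whose scores vary least relative
# to their mean (lowest CV) is treated as the most stable.
most_stable_k = min(summaries, key=lambda k: summaries[k]["cv"])

for k, s in sorted(summaries.items()):
    print(k, {name: round(value, 4) for name, value in s.items()})
print("most stable k:", most_stable_k)
```

Comparing the CV rather than the SD alone keeps the comparison fair when the mean score differs between k-values; the median is reported alongside the mean so that a single outlying run is easy to spot.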

Summary

Introduction

The potential use of Arabidopsis thaliana (L.) Heynh. (Brassicaceae) as a model system for genetic studies was first reported by Titova in 1935 [1]. Arabidopsis offers many advantages as a model in research studies [2] that aim to understand the genetic, cellular, and molecular biology of plants. The goal of many association studies is to find genotype differences between, and in some cases within, certain species and to examine how these differences are reflected in the phenotypic characteristics of those species. In the case of Arabidopsis, some studies have concentrated on finding phenotype and genotype associations related to alternative splicing and transposable element effects [13,14,15,16]. We seek to relate certain genotypic characteristics, such as SNPs within FHA domain genes, to the distribution of plant ecotypes collected from different climate regions. Our clustering-based method, combined with climatic scoring, represents a unique approach in A. thaliana research.
