Articles published on Exploiting Linkage Disequilibrium
- Research Article
- 10.1093/biomethods/bpag004
- Feb 5, 2026
- Biology Methods & Protocols
- Duc-Hau Le
Identifying disease-associated single-nucleotide polymorphisms (SNPs) is fundamental to understanding complex disease genetics, yet genome-wide association studies (GWAS) remain costly and data-intensive. Network-based approaches provide a complementary strategy by exploiting linkage disequilibrium (LD) structure and disease relatedness to prioritize candidate variants. We present DisSNPNet, a heterogeneous network-based framework that integrates chromosome-specific SNP LD networks derived from 1000 Genomes Project Phase 1 and Phase 3 data, a MeSH-based disease similarity network, and known disease–SNP associations from CAUSALdb. Random walk with restart was applied to rank SNPs for each disease. Predictive performance was evaluated using disease-wise 3-fold cross-validation with AUROC and AUPR. Biological plausibility was assessed by querying top-ranked SNPs in GWAS resources and by disease-specific KEGG pathway enrichment. A chromosome-matched random baseline was constructed to contextualize external GWAS evidence. DisSNPNet consistently outperformed SNP-only LD networks, with heterogeneous networks yielding higher AUROC and AUPR across chromosomes. Strong LD networks (r² ≥ 0.8) improved precision, particularly in imbalanced settings. Top-ranked SNPs showed significantly greater GWAS evidence than random expectation across all chromosomes, indicating nonrandom enrichment. Disease-specific pathway enrichment revealed biologically coherent mechanisms across immune, metabolic, cardiovascular, and structural diseases. DisSNPNet provides a robust and interpretable framework for prioritizing disease-associated SNPs. While not a substitute for GWAS, it offers a scalable, evidence-supported approach for SNP prioritization and hypothesis generation, complementing experimental and population-based studies.
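To make the ranking step concrete, the following is a minimal sketch of random walk with restart on a toy network. It is not the DisSNPNet implementation: the node names (`disease_A`, `rs1`, `rs2`, `rs3`), the edge weights, and the restart probability of 0.5 are all hypothetical, and the toy graph omits the disease similarity layer described in the abstract.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Return a steady-state visiting probability for every node.

    adj:   dict mapping node -> {neighbour: edge weight} (symmetric, no isolated nodes)
    seeds: set of nodes the walker restarts from
    """
    nodes = sorted(adj)
    out_weight = {u: sum(adj[u].values()) for u in nodes}
    # Restart distribution: uniform over the seed nodes.
    p0 = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {u: restart * p0[u] for u in nodes}
        for u in nodes:
            for v, w in adj[u].items():
                # Move from u to v in proportion to the edge weight.
                nxt[v] += (1.0 - restart) * p[u] * w / out_weight[u]
        delta = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: one known disease-SNP link plus two LD edges.
network = {
    "disease_A": {"rs1": 1.0},
    "rs1": {"disease_A": 1.0, "rs2": 0.9},
    "rs2": {"rs1": 0.9, "rs3": 0.3},
    "rs3": {"rs2": 0.3},
}
scores = random_walk_with_restart(network, {"disease_A"})
ranking = sorted((n for n in scores if n.startswith("rs")),
                 key=scores.get, reverse=True)
print(ranking)
```

In this toy chain, SNPs closer to the seed disease through stronger edges receive higher scores, which is the intuition behind using LD edges to propagate known associations.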
- Research Article
- 10.14719/pst.4170
- Mar 2, 2025
- Plant Science Today
- Y L Devi + 2 more
Genome-wide association studies (GWAS) identify associations between naturally occurring genetic variation and targeted traits or genes. By exploiting linkage disequilibrium, putative candidate genes can be identified that have the potential to improve quality and resistance to biotic and abiotic stress. Plants of the Amaranthaceae family, such as spinach, Amaranthus, Chenopodium, and sugar beet, are rich in essential nutritional components and are resistant to several biotic and abiotic stresses. Several candidate genes have been identified for the improvement of floral development, early flowering, late flowering, bolting, and resistance to several biotic and abiotic stresses. Through GWAS, the genetic basis of several complex trait phenotypes can be deciphered for important agricultural crop plants. Applying GWAS to these plants will allow the putative candidate genes present in them to be identified and used for further improvement of the crops.
- Research Article
2
- 10.1111/jbg.12384
- Jun 27, 2019
- Journal of Animal Breeding and Genetics
- Peter M Visscher + 2 more
Through his own research contributions on the modelling and genetic analysis of quantitative traits and through his former students and postdocs, Robin Thompson has indirectly left a major legacy in human genetics. In this short note, we highlight examples of the long-lasting relevance and impact of Robin's work in human genetics. A lone early study of marker-assisted selection developed many of the tools and approaches later exploited (often after reinvention) by the human genetics community in GWAS studies and for prediction. Furthermore, a particularly clear example of the pervasive impact of Robin's work is that REML has become the default method to estimate variance components and that genetic predictions exploiting linkage disequilibrium in the population are starting to become used in precision medicine applications.
- Research Article
16
- 10.1093/nar/gkx505
- Jun 9, 2017
- Nucleic Acids Research
- Tyler Cowman + 1 more
Epistasis is defined as a statistical interaction between two or more genomic loci in terms of their association with a phenotype of interest. Epistatic loci that are identified using data from Genome-Wide Association Studies (GWAS) provide insights into the interplay among multiple genetic factors, with applications including assessment of susceptibility to complex diseases, decision making in precision medicine, and gaining insights into disease mechanisms. Since the number of genomic loci assayed by GWAS is extremely large (usually in the order of millions), identification of epistatic loci is a statistically difficult and computationally intensive problem. Even when only pairwise interactions are considered, the size of the search space ranges from hundreds of millions to billions of locus pairs. The large number of statistical tests performed also makes sufficient type one error correction imperative. Consequently, efficient algorithms are required to filter the tests that are performed and evaluate large GWAS data sets in a reasonable amount of computation time. It has been observed that many pairwise tests are redundant due to correlations in their genotype values across samples, known as linkage disequilibrium. However, algorithms that have been developed for efficient identification of epistatic loci do not systematically exploit linkage disequilibrium. Here, we propose a new algorithm for fast epistasis detection based on hierarchical representation of linkage disequilibrium (LinDen). We utilize redundancies in genotype patterns between neighboring loci to generate a hierarchical structure and execute a branch-and-bound search to prioritize loci testing based on approximations of a test statistic for pairs of locus groups. The hierarchical organization of tests performed by LinDen allows for efficient scaling based on the screened loci. 
We test LinDen comprehensively on three data sets obtained from the Wellcome Trust Case Control Consortium: type two diabetes, psoriasis, and hypertension. Our results show that, compared with other state-of-the-art tools for fast epistasis detection, LinDen drastically reduces the number of tests performed while discovering statistically significant locus pairs. LinDen is implemented in C++ and is available as open source at http://compbio.case.edu/linden/.
- Research Article
6
- 10.1534/genetics.115.179507
- Dec 9, 2015
- Genetics
- Michelle Carlsen + 3 more
Genome-wide data with millions of single-nucleotide polymorphisms (SNPs) can be highly correlated due to linkage disequilibrium (LD). The ultrahigh dimensionality of big data brings unprecedented challenges to statistical modeling such as noise accumulation, the curse of dimensionality, computational burden, spurious correlations, and a processing and storing bottleneck. The traditional statistical approaches lose their power due to p ≫ n (n is the number of observations and p is the number of SNPs) and the complex correlation structure among SNPs. In this article, we propose an integrated distance correlation ridge regression (DCRR) approach to accommodate the ultrahigh dimensionality, joint polygenic effects of multiple loci, and the complex LD structures. Initially, a distance correlation (DC) screening approach is used to extensively remove noise, after which LD structure is addressed using a ridge penalized multiple logistic regression (LRR) model. The false discovery rate, true positive discovery rate, and computational cost were simultaneously assessed through a large number of simulations. A binary trait of Arabidopsis thaliana, the hypersensitive response to the bacterial elicitor AvrRpm1, was analyzed in 84 inbred lines (28 susceptible and 56 resistant) with 216,130 SNPs. Compared to previous SNP discovery methods implemented on the same data set, the DCRR approach successfully detected the causative SNP while dramatically reducing spurious associations and computational time.
- Research Article
27
- 10.1371/journal.pone.0130497
- Jul 7, 2015
- PLOS ONE
- Mahmood Gholami + 7 more
There is increasing interest in the detection of genes, or genomic regions, that have been targeted by selection because identifying signatures of selection can lead to a better understanding of genotype-phenotype relationships. A common strategy for the detection of selection signatures is to compare samples from distinct populations and to search for genomic regions with outstanding genetic differentiation. The aim of this study was to detect selective signatures in layer chicken populations using a recently proposed approach, hapFLK, which exploits linkage disequilibrium information while accounting appropriately for the hierarchical structure of populations. We performed the analysis on 70 individuals from three commercial layer breeds (White Leghorn, White Rock and Rhode Island Red), genotyped for approximately 1 million SNPs. We found a total of 41 and 107 regions with outstanding differentiation or similarity using hapFLK and its single SNP counterpart FLK respectively. Annotation of selection signature regions revealed various genes and QTL corresponding to production traits, for which layer breeds were selected. A number of the detected genes were associated with growth and carcass traits, including IGF-1R, AGRP and STAT5B. We also annotated an interesting gene associated with the dark brown feather color mutational phenotype in chickens (SOX10). We compared FST, FLK and hapFLK and demonstrated that exploiting linkage disequilibrium information and accounting for hierarchical population structure decreased the false detection rate.
- Research Article
2
- 10.1186/s12859-015-0479-2
- Feb 22, 2015
- BMC Bioinformatics
- Juan P Steibel + 2 more
Background: Allelic specific expression (ASE) increases our understanding of the genetic control of gene expression and its links to phenotypic variation. ASE testing is implemented through binomial or beta-binomial tests of sequence read counts of alternative alleles at a cSNP of interest in heterozygous individuals. This requires prior ascertainment of the cSNP genotypes for all individuals. To meet this need, we propose hidden Markov methods to call SNPs from next generation RNA sequence data when ASE possibly exists. Results: We propose two hidden Markov models (HMMs), HMM-ASE and HMM-NASE, that consider or do not consider ASE, respectively, in order to improve genotyping accuracy. Both HMMs call the genotypes of several SNPs simultaneously and allow for mapping error, which respectively exploit the dependence among SNPs and correct the bias due to mapping error. In addition, HMM-ASE exploits ASE information to further improve genotype accuracy when ASE is likely to be present. Simulation results indicate that the proposed HMMs achieve very good prediction accuracy in terms of controlling both the false discovery rate (FDR) and the false negative rate (FNR). When ASE is present, HMM-ASE had a lower FNR than HMM-NASE, while both can control the FDR at a similar level. By exploiting linkage disequilibrium (LD), a real data application demonstrates that the proposed methods have better sensitivity and similar FDR in calling heterozygous SNPs than the VarScan method. Sensitivity and FDR are similar to those of the BCFtools and Beagle methods. The resulting genotypes show good properties for the estimation of the genetic parameters and ASE ratios. Conclusions: We introduce HMMs, which are able to exploit LD and account for ASE and mapping errors, to simultaneously call SNPs from next generation RNA sequence data. The method introduced can reliably call cSNP genotypes even in the presence of ASE and under low sequencing coverage. As a byproduct, the proposed method is able to provide predictions of ASE ratios for the heterozygous genotypes, which can then be used for ASE testing.
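The binomial ASE test mentioned in the background can be illustrated with a short sketch. It assumes a null hypothesis of equal expression of the two alleles (reference-read probability 0.5) and uses hypothetical read counts; it is an exact two-sided binomial test, not the authors' HMM, and it does not cover the beta-binomial variant.

```python
from math import comb


def binomial_two_sided_p(ref_reads, alt_reads, p_ref=0.5):
    """Exact two-sided binomial p-value for allelic imbalance at one heterozygous cSNP."""
    n = ref_reads + alt_reads
    probs = [comb(n, k) * p_ref**k * (1.0 - p_ref) ** (n - k) for k in range(n + 1)]
    observed = probs[ref_reads]
    # Sum the probability of every outcome at least as unlikely as the observed one
    # (small relative tolerance so that ties are included despite rounding).
    total = sum(pr for pr in probs if pr <= observed * (1.0 + 1e-12))
    return min(1.0, total)


# Hypothetical read counts at two heterozygous sites.
print(binomial_two_sided_p(10, 10))  # balanced counts: no evidence of ASE
print(binomial_two_sided_p(20, 0))   # all reads from one allele: strong evidence
```

A test like this presupposes that the individual is truly heterozygous at the site, which is why the abstract stresses accurate genotype calling first.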
- Research Article
49
- 10.1007/s12041-015-0469-1
- Feb 20, 2015
- Journal of Genetics
- Mohsen Gholizadeh + 2 more
Body weight is an economically important trait in sheep. We performed a genomewide association study using the Ovine 50 K SNP chip to identify the genes and chromosome regions associated with body weight in Baluchi sheep. A total of 96 blood samples from two herds along with data on weight at birth (BW), weaning (WW), six months (SMW) and yearling (YW) were collected. Markers were tested for association based on linear regression using the PLINK software. Thirteen different SNP markers reached 5% Bonferroni chromosome-wide significance levels. In this study we detected one SNP with a genomewide significant effect on yearling weight on chromosome 8. All significant SNPs at chromosome-wide significance level were within or close to known ovine genes. The SNP at genomewide significance level was within the gene SYNE1. Thus, we suggest more investigation to confirm these genes as candidate genes for body weight traits in sheep. Growth traits are economically important traits for sheep. Mapping of quantitative trait loci (QTL) is an appropriate approach and provides useful information for marker assisted selection (MAS) and gene-based selection in sheep breeding strategies. Relatively few QTLs have been reported in sheep. The current release (February 2014) of the Sheep QTLdb (http://www.animalgenome.org/cgi-bin/QTLdb) contains 129 QTLs for growth traits reported from a genomic study based on marker-QTL linkage analysis. QTL mapping using linkage analysis maps QTLs to large confidence intervals on the genome. As a result, use of QTL in MAS is complicated. It would be possible to exploit linkage disequilibrium (LD) to map QTL if dense markers were available. With the advent of next-generation sequencing
- Research Article
- 10.26076/3140-596d
- Jan 1, 2015
- Digital Commons - USU (Utah State University)
- Michelle Carlsen
This paper presents improved methods for analysis of genome-wide association studies in contemporary genetic research. Thanks to current sequencing methods, half to one million single-nucleotide polymorphisms (SNPs) can be feasibly generated within any given population, and there are often correlations among SNPs that cause truly causative loci to be confounded by correlated neighboring loci. Additionally, complex traits are often jointly affected by multiple genetic variants with each having small or moderate individual effects. To address these issues in genome-wide association studies, we propose a novel statistical approach, DCRR, to detect significant associations between large numbers of SNPs and phenotypes. We applied DCRR to simulations that varied in marker allele frequencies, linkage disequilibrium, and the numbers of SNPs considered; and we analyzed a previously published Arabidopsis thaliana dataset of an AvrRpm1 binary trait. Our distance correlation was effective in ranking SNPs while the logistic ridge regression detected causative SNPs without including spurious correlated neighbors. Our results indicate that DCRR is an effective and reliable method that can improve the accuracy and efficiency of analyses of large association datasets.
- Research Article
- 10.1158/1940-6207.prev-11-ss01-01
- Oct 1, 2011
- Cancer Prevention Research
- Kenneth Offit
The heritable fraction of human cancers, estimated from genetic and epidemiologic approaches, is 21–42%. Heritable cancers include common adult onset malignancies (e.g. prostate, colon, breast cancers) as well as rarer cancers. During past decades, genetic approaches (e.g. linkage analysis and positional cloning) identified rare genetic mutations associated with markedly elevated cancer risk, with subsequent translation of these findings into effective preventive interventions. Recently genomic approaches (e.g. genome wide association studies [GWAS] exploiting linkage disequilibrium between single nucleotide polymorphisms in the human genome) have identified common variants of lower disease risk, as well as common variants which modify the risk of high-penetrance rare alleles. Ongoing studies are utilizing next-generation massively parallel sequencing (NGS) of whole exomes and genomes to identify the “missing heritability” of human cancer. The status of GWAS and NGS approaches to cancer risk assessment will be reviewed. The large sample sizes required for GWAS will be emphasized; a case example of our recent consortium effort to identify genomic modifiers of BRCA2 penetrance will be cited. While the methods of identification of genomic variants are distinct, the principles of biomarker identification and clinical translation to preventive oncology are shared. These similarities include: the use of clinically validated variants that may not be functionally characterized; the segregation of these variants in non-Mendelian as well as Mendelian patterns; the role of gene–environment interactions; the dependence on evidence for clinical utility; the critical translational role of behavioral science; and common ethical considerations.
It will be emphasized that during the current period of transition from investigation to practice, consumers must also be protected from harms of premature translation of research findings, while encouraging the innovative and cost-effective application of genomic discoveries that improve personalized oncologic care. Citation Information: Cancer Prev Res 2011;4(10 Suppl):SS01-01.
- Research Article
20
- 10.1186/1471-2164-10-338
- Jan 1, 2009
- BMC Genomics
- Elena Bosch + 12 more
Background: It is well known that the pattern of linkage disequilibrium varies between human populations, with remarkable geographical stratification. Indirect association studies routinely exploit linkage disequilibrium around genes, particularly in isolated populations where it is assumed to be higher. Here, we explore both the amount and the decay of linkage disequilibrium with physical distance along 211 gene regions, most of them related to complex diseases, across 39 HGDP-CEPH population samples, focusing particularly on the populations defined as isolates. Within each gene region and population we use r² between all possible single nucleotide polymorphism (SNP) pairs as a measure of linkage disequilibrium and focus on the proportion of SNP pairs with r² greater than 0.8. Results: Although the average r² was found to be significantly different both between and within continental regions, a much higher proportion of r² variance could be attributed to differences between continental regions (2.8% vs. 0.5%, respectively). Similarly, while the proportion of SNP pairs with r² > 0.8 was significantly different across continents for all distance classes, it was generally much more homogenous within continents, except in the case of Africa and the Americas. The only isolated populations with consistently higher LD in all distance classes with respect to their continent are the Kalash (Central South Asia) and the Surui (America). Moreover, isolated populations showed only slightly higher proportions of SNP pairs with r² > 0.8 per gene region than non-isolated populations in the same continent. Thus, the number of SNPs in isolated populations that need to be genotyped may be only slightly less than in non-isolates. Conclusion: The "isolated population" label by itself does not guarantee a greater genotyping efficiency in association studies, and properties other than increased linkage disequilibrium may make these populations interesting in genetic epidemiology.
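As an illustration of the LD measure used above, the sketch below computes r² for one SNP pair and the proportion of all SNP pairs in a region with r² above 0.8. It assumes phased haplotypes with alleles coded 0/1, which the abstract does not specify, and the four-haplotype data set is hypothetical.

```python
from itertools import combinations


def ld_r2(alleles_a, alleles_b):
    """r^2 between two biallelic SNPs given 0/1 alleles on the same haplotypes."""
    n = len(alleles_a)
    p_a = sum(alleles_a) / n
    p_b = sum(alleles_b) / n
    p_ab = sum(1 for a, b in zip(alleles_a, alleles_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # classical disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


def proportion_high_ld(haplotypes, threshold=0.8):
    """Share of SNP pairs whose r^2 exceeds the threshold.

    haplotypes: list of rows, one row per haplotype, one column per SNP.
    """
    columns = list(zip(*haplotypes))
    pairs = list(combinations(range(len(columns)), 2))
    high = sum(1 for i, j in pairs if ld_r2(columns[i], columns[j]) > threshold)
    return high / len(pairs)


# Hypothetical region: SNPs 0 and 1 are in perfect LD, SNP 2 is independent of both.
toy_haplotypes = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 1],
]
print(proportion_high_ld(toy_haplotypes))
```

A higher proportion means more SNPs are redundant with one another, so fewer need to be genotyped to cover the region.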
- Research Article
141
- 10.1007/s10709-008-9307-1
- Aug 10, 2008
- Genetica
- Frédéric Hospital
The basic principle of Marker-Assisted Selection (MAS) is to exploit Linkage Disequilibrium (LD) between markers and QTLs. With strong enough LD, MAS should in theory be easier, faster, cheaper, or more efficient than classical (phenotypic) selection. I briefly review the major MAS methods, describing some 'success stories' where MAS was applied successfully in the context of plant breeding, and detailing other cases where efficiency was not as high as expected. I discuss the possible causes explaining the difference between theoretical expectations and practical observations. Finally, I review the principal challenges and issues that must be tackled to make marker-assisted selection in plants more effective in the future, namely: managing and controlling QTL stability to apply MAS to complex traits, and integrating MAS in traditional breeding practices to make it more economically attractive and applicable in developing countries.
- Research Article
- 10.1111/j.1466-9218.2000.00024.pp.x
- Jun 28, 2008
- GeneScreen
- Andrew Collins
One strategy for detection of disease genes is to exploit linkage disequilibrium in the hope that in candidate regions there will be detectable association between disease and marker alleles. Maps of single nucleotide polymorphisms (SNPs) will be used for this purpose but a recent simulation suggests that a useful level of linkage disequilibrium is unlikely to extend beyond an average distance of 3 Kb in the general population. This implies that very high marker densities will be required to detect disease–SNP associations. The evidence from published data comprising 877 SNP pairs is presented. For comparison, associations between other pairs of markers, principally microsatellites, are examined in a large sample of haplotypes from the fragile X (FRAX) region in Xq27–28. Association ρ is estimated from haplotype frequencies and the decline in linkage disequilibrium with distance is described using the model originally described by Malecot. The evidence from SNP pairs suggests that linkage disequilibrium extends to at least 263 Kb in random haplotypes, but with a considerable amount of variation particularly at small distances. For microsatellites in the FRAX region disequilibrium extends to at least 435 Kb. This suggests that a genome scan with markers spaced every 100 Kb would be powerful (30 000 markers per genome). Higher densities might be required in some genomic regions and presumably will be required to determine causal SNPs.
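The decay model and the marker-count arithmetic can be sketched as follows. The Malecot model is written here in its commonly used form, ρ(d) = (1 − L)·M·exp(−εd) + L, and the parameter values are hypothetical rather than estimates from the paper. The genome length of 3,000,000 Kb is an assumption, chosen because it is what the abstract's figure of 30 000 markers at 100 Kb spacing implies.

```python
from math import exp


def malecot_rho(distance_kb, epsilon, M=1.0, L=0.0):
    """Expected association rho at a given distance under the Malecot model.

    epsilon: exponential decline per Kb
    M:       association at zero distance (before adding the asymptote)
    L:       residual association at large distance
    """
    return (1.0 - L) * M * exp(-epsilon * distance_kb) + L


def markers_for_spacing(genome_kb, spacing_kb):
    """Number of evenly spaced markers needed to cover a genome."""
    return genome_kb // spacing_kb


# Hypothetical parameters, for illustration only.
for d in (0, 100, 263, 435):
    print(d, round(malecot_rho(d, epsilon=0.005, M=0.9, L=0.1), 3))

# Assumed genome length of about 3,000,000 Kb.
print(markers_for_spacing(3_000_000, 100))
```

The exponential term captures how quickly useful association is lost with distance, which in turn sets the marker spacing a scan needs.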
- Research Article
1,600
- 10.1038/nrg2361
- Jun 1, 2008
- Nature Reviews Genetics
- Montgomery Slatkin
Linkage disequilibrium--the nonrandom association of alleles at different loci--is a sensitive indicator of the population genetic forces that structure a genome. Because of the explosive growth of methods for assessing genetic variation at a fine scale, evolutionary biologists and human geneticists are increasingly exploiting linkage disequilibrium in order to understand past evolutionary and demographic events, to map genes that are associated with quantitative characters and inherited diseases, and to understand the joint evolution of linked sets of genes. This article introduces linkage disequilibrium, reviews the population genetic processes that affect it and describes some of its uses. At present, linkage disequilibrium is used much more extensively in the study of humans than in non-humans, but that is changing as technological advances make extensive genomic studies feasible in other species.
- Research Article
2
- 10.5713/ajas.2008.70474
- May 6, 2008
- Asian-Australasian Journal of Animal Sciences
- Jong-Joo Kim
A fine-mapping method exploiting linkage disequilibrium was used to detect quantitative trait loci (QTL) on the X chromosome affecting milk production, body conformation and productivity traits. The pedigree comprised 22 paternal half-sib families of Black-and-White Holstein bulls in the Netherlands in a grand-daughter design for a total of 955 sons. Twenty-five microsatellite markers were genotyped to construct a linkage map on chromosome X spanning 170 Haldane cM with an average inter-marker distance of 7.1 cM. A covariance matrix including elements about identical-by-descent probabilities between haplotypes regarding QTL allele effects was incorporated into the animal model, and a restricted maximum-likelihood method was applied to test for the presence of QTL using the LDVCM program. Significance thresholds were obtained by permuting haplotypes to phenotypes and by using a false discovery rate procedure. Seven QTL responsible for conformation types (teat length, rump width, rear leg set, angularity and fore udder attachment), behavior (temperament) and a mixture of production and health (durable prestation) were detected at the suggestive level. Some QTL affecting teat length, rump width, durable prestation and rear leg set had small numbers of haplotype clusters, which may indicate good classification of alleles for causal genes or markers that are tightly associated with the causal mutation. However, higher marker density is required to better refine the QTL position and to better characterize functionally distinct haplotypes which will provide information to find causal genes for the traits.
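The map figures above can be checked with a small worked example. The sketch uses the standard Haldane map function, r = ½(1 − e^(−2d)) with d in Morgans, to convert the average inter-marker distance into a recombination fraction; the function names are illustrative and nothing here reproduces the LDVCM analysis.

```python
from math import exp


def haldane_recombination_fraction(distance_cm):
    """Recombination fraction for a map distance in Haldane centimorgans."""
    d_morgans = distance_cm / 100.0
    return 0.5 * (1.0 - exp(-2.0 * d_morgans))


def average_spacing(map_length_cm, n_markers):
    """Mean distance between adjacent markers spread over the map."""
    return map_length_cm / (n_markers - 1)


# 25 markers over 170 cM give 24 intervals of roughly 7.1 cM each.
print(round(average_spacing(170, 25), 2))
# Recombination fraction between adjacent markers at that spacing.
print(round(haldane_recombination_fraction(7.1), 4))
```

At about 7 cM the recombination fraction is still close to the map distance itself, because double crossovers are rare over such short intervals.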
- Research Article
- 10.1111/j.1469-1809.2007.00365.x
- May 28, 2007
- Annals of Human Genetics
European Mathematical Genetics Meeting, Heidelberg, Germany, 12th–13th April 2007
- Research Article
18
- 10.1186/1471-2156-6-s1-s31
- Dec 1, 2005
- BMC Genetics
- Xiaoyun Zhong + 1 more
Complex disease mapping usually involves a combination of linkage and association techniques. Linkage analysis can scan the entire genome in a few hundred tests. Association tests may involve an even greater number of tests. However, association tests can localize the susceptibility genes more accurately. Using a recently developed combined linkage and association strategy, we analyzed a subset of the Collaborative Study on the Genetics of Alcoholism (COGA) data for the Genetic Analysis Workshop 14 (GAW14). In this analysis, we first employed linkage analysis based on frailty models that take into account age of onset information to establish which regions along the chromosome are likely to harbor disease susceptibility genes for alcohol dependence. Second, we used an association analysis by exploiting linkage disequilibrium to narrow down the peak regions. We also compare the methods with mean identity-by-descent tests and transmission/disequilibrium tests that do not use age of onset information.
- Research Article
9
- 10.1186/1471-2156-6-s1-s10
- Dec 1, 2005
- BMC Genetics
- Jérémie Nsengimana + 2 more
The efficacy of linkage studies using microsatellites and single-nucleotide polymorphisms (SNPs) was evaluated. Analyzed data were supplied by the Collaborative Study on the Genetics of Alcoholism (COGA). Alcoholism was analyzed together with a simulated trait caused by a gene of known position, through a nonparametric linkage test (NPL). For the alcoholism trait, four densities of SNPs (1 SNP per 0.2 cM, 0.5 cM, 1 cM and 2 cM) showed higher peaks of NPL z scores and smaller significant p-values than the usual 10-cM density of microsatellites. However, the two highest densities of SNPs had unstable z score signals, and therefore were difficult to interpret. Analyzing a simulated trait with the same markers in the same pedigrees, we confirmed the higher power of all four densities of SNPs compared to the 10-cM microsatellites panel, although the existence of other confounding peaks was confirmed for maps that are denser than 1 SNP/cM. We further showed that estimating the gene position using SNPs is far less biased than using the usual panel of microsatellites (biases of 0–2 cM for SNPs vs. 8.9 cM for microsatellites). We conclude that using dense maps of SNPs in linkage analysis is more powerful and less biased than using the 10-cM maps of microsatellites. However, linkage signals can be unstable and difficult to interpret when several SNPs are genotyped per centimorgan. The power and accuracy of 1 SNP/cM or 1 SNP/2 cM may be sufficient in a genome-wide linkage scan while denser maps may be most useful in fine-gene mapping studies exploiting linkage disequilibrium.
- Research Article
67
- 10.1534/genetics.103.013227
- Jun 1, 2004
- Genetics
- Thomas Moen + 3 more
A multistage testing strategy to detect QTL for resistance to infectious salmon anemia (ISA) in Atlantic salmon is proposed. First, genotyping of amplified fragment length polymorphisms (AFLP) and a transmission disequilibrium test (TDT) were carried out using dead offspring from a disease resistance challenge test. Second, AFLP genotyping among survivors followed by a Mendelian segregation test was performed. Third, within-family survival analyses using all offspring were developed and applied to significant TDT markers with Mendelian inheritance. Maximum-likelihood methodology was developed for TDT with dominant markers to exploit linkage disequilibrium within families. The strategy was tested with two full-sib families of Atlantic salmon sired by the same male and consisting of 79 offspring in total. All dead offspring from the two families were typed for 64 primer combinations, resulting in 340 scored markers. There were 26 significant results out of 401 TDTs using dead offspring. In the second stage, only 17 marker families showed Mendelian segregation and were tested in survival analysis. A permutation test was performed for all survival analyses to compute experimentwise P-values. Two markers, aaccac356 and agccta150, were significant at P < 0.05 when accounting for multiple testing in the survival analyses. The proposed strategy might be more powerful than current mapping strategies because it reduces the number of tests to be performed in the last testing stage.
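For readers unfamiliar with the transmission disequilibrium test, the sketch below shows the classical TDT statistic for a co-dominant marker, (b − c)² / (b + c), compared against a chi-square distribution with one degree of freedom. This is not the maximum-likelihood TDT for dominant AFLP markers developed in the paper, and the transmission counts are hypothetical.

```python
from math import erfc, sqrt


def classical_tdt(transmitted, untransmitted):
    """Classical TDT for one marker allele from heterozygous parents.

    transmitted:   times the allele was passed to affected offspring
    untransmitted: times the allele was not passed to affected offspring
    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    stat = (transmitted - untransmitted) ** 2 / total
    # Upper tail of a chi-square distribution with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts: the allele is transmitted 30 times and withheld 10 times.
stat, p_value = classical_tdt(30, 10)
print(stat, p_value)
```

Under Mendelian segregation with no linkage, the two counts should be roughly equal, so a large imbalance is evidence that the marker is linked to a locus affecting survival.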
- Research Article
430
- 10.1093/genetics/163.1.253
- Jan 1, 2003
- Genetics
- Sarah Blott + 18 more
We herein report on our efforts to improve the mapping resolution of a QTL with major effect on milk yield and composition that was previously mapped to bovine chromosome 20. By using a denser chromosome 20 marker map and by exploiting linkage disequilibrium using two distinct approaches, we provide strong evidence that a chromosome segment including the gene coding for the growth hormone receptor accounts for at least part of the chromosome 20 QTL effect. By sequencing individuals with known QTL genotype, we identify an F to Y substitution in the transmembrane domain of the growth hormone receptor gene that is associated with a strong effect on milk yield and composition in the general population.