Abstract

Background: Small insertions and deletions (indels) have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome.

Results: We present FATHMM-indel, an integrative approach to predict the functional effect, pathogenic or neutral, of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state-of-the-art models in assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk.

Conclusions: FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome.

Highlights

  • Small insertions and deletions have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations

  • We considered variants recorded in individuals of African ancestry since European and Asian populations have been subject to bottlenecks which might have resulted in pathogenic indels with relatively high minor allele frequencies (MAFs) – see e.g. [7]

  • Although the vast majority of genetic alterations lie outside the exome, there is a lack of methods designed to predict the impact of indels throughout the whole non-coding genome


Summary

Introduction

The advent of next-generation sequencing technologies has led to a rapid increase in identified genetic variation, including single nucleotide variants (SNVs), copy number variants, and insertions and deletions (indels), in addition to larger-scale DNA rearrangements. Interpretation of the functional impact of identified variants is of increasing importance. This has led to the development of accurate methods for assessing genomic tolerance and predictive techniques for discriminating between harmful (pathogenic) and neutral mutations [1,2,3,4]. The vast majority of models for predicting the functional impact of indels have been restricted to their effect in the human exome (see e.g. [5,6,7]).

