Abstract

Background

A quantitative trait locus (QTL) is a genomic region that correlates with a phenotype. Most of the experimental information about QTL mapping studies is described in tables of scientific publications. Traditional text mining techniques aim to extract information from unstructured text rather than from tables. We present QTLTableMiner++ (QTM), a table mining tool that extracts and semantically annotates QTL information buried in (heterogeneous) tables of plant science literature.

QTM is a command line tool written in the Java programming language. It takes scientific articles from the Europe PMC repository as input and extracts QTL tables using keyword matching and ontology-based concept identification. The tables are further normalized using rules derived from table properties such as captions, column headers and table footers. Table columns are then classified into three categories, namely column descriptors, properties and values, based on column headers and the data types of cell entries. Abbreviations found in the tables are expanded using the Schwartz and Hearst algorithm. Finally, the content of QTL tables is semantically enriched with domain-specific ontologies (e.g. Crop Ontology, Plant Ontology and Trait Ontology) using the Apache Solr search platform, and the results are stored in a relational database and a text file.

Results

The performance of the QTM tool was assessed by precision and recall based on the information retrieved from two manually annotated corpora of open access articles, i.e. QTL mapping studies in tomato (Solanum lycopersicum) and in potato (S. tuberosum). In summary, QTM detected QTL statements in tomato with 74.53% precision and 92.56% recall and in potato with 82.82% precision and 98.94% recall.

Conclusion

QTM is a unique tool that aids in providing QTL information in machine-readable and semantically interoperable formats.
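The abbreviation expansion step mentioned above can be illustrated with a small sketch. The following is a simplified Python rendering of the long-form matching idea from the Schwartz and Hearst algorithm, not QTM's own code (QTM is written in Java, and its implementation may differ). The function names `best_long_form` and `expand_abbreviations`, the parenthesis pattern and the length limits are assumptions made for this illustration.

```python
import re


def best_long_form(short_form, candidate):
    """Match the short form against the candidate text from right to left.

    Returns the shortest suffix of `candidate` that contains the characters
    of `short_form` in order, with the first character at a word start,
    or None if no such suffix exists.
    """
    s_index = len(short_form) - 1
    l_index = len(candidate) - 1
    while s_index >= 0:
        curr = short_form[s_index].lower()
        if not curr.isalnum():
            # Skip punctuation inside the short form.
            s_index -= 1
            continue
        # Move left until the character matches; the first character of the
        # short form must additionally sit at the beginning of a word.
        while (l_index >= 0 and candidate[l_index].lower() != curr) or (
            s_index == 0 and l_index > 0 and candidate[l_index - 1].isalnum()
        ):
            l_index -= 1
        if l_index < 0:
            return None
        l_index -= 1
        s_index -= 1
    # Start the long form at the beginning of the word that was matched last.
    start = candidate.rfind(" ", 0, l_index + 1) + 1
    return candidate[start:]


def expand_abbreviations(text):
    """Collect {short form: long form} pairs of the shape 'long form (SF)'."""
    pairs = {}
    # Assumed pattern: a short parenthesised token of 2 to 10 characters.
    for match in re.finditer(r"\(([^()]{2,10})\)", text):
        short_form = match.group(1).strip()
        if not short_form or not short_form[0].isalnum():
            continue
        if not any(ch.isalpha() for ch in short_form):
            continue
        words = text[: match.start()].split()
        # Search window: at most min(|SF| + 5, 2 * |SF|) preceding words.
        max_words = min(len(short_form) + 5, len(short_form) * 2)
        candidate = " ".join(words[-max_words:])
        long_form = best_long_form(short_form, candidate)
        if long_form:
            pairs[short_form] = long_form
    return pairs
```

Applied to the opening sentence of the abstract, `expand_abbreviations` pairs `QTL` with `quantitative trait locus`. The sketch omits the additional validity checks of the published algorithm, so it should be read as an outline of the technique only.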

Highlights

  • A quantitative trait locus (QTL) is a genomic region that correlates with a phenotype

  • It is possible to detect genomic regions that are statistically associated with variation in non-Mendelian phenotypic traits, termed quantitative trait loci (QTL) [1]

  • Leveraging knowledge available in both scientific literature and molecular biology databases can help in narrowing down the QTL regions to candidate genes

  • The results stored in the QTL table are written into a text file (CSV)


Summary

Introduction

Singh et al. BMC Bioinformatics (2018) 19:183

A quantitative trait locus (QTL) is a genomic region that correlates with a phenotype. It is possible to detect genomic regions that are statistically associated with variation in non-Mendelian phenotypic traits, termed quantitative trait loci (QTL) [1]. Leveraging knowledge available in both scientific literature and molecular biology databases can help in narrowing down the QTL regions to candidate genes. Manually curated databases with QTL information have been created, for example AnimalQTLdb [3], MaizeGDB [4], Gramene QTL database [5] and SGN/solQTL [6]. There is a need to retrieve QTL data from publications efficiently, which can further reduce the cost of QTL database curation and of the QTL knowledge discovery process. QTM is a command line tool written in the Java programming language. It takes scientific articles from the Europe PMC repository as input and extracts QTL tables using keyword matching and ontology-based concept identification.
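As a rough illustration of the keyword matching idea, the sketch below flags a table as a candidate QTL table when its caption or column headers contain a QTL-related term. It is written in Python for brevity, whereas QTM itself is a Java tool. The keyword list and the function name `is_qtl_table` are hypothetical, since the terms QTM actually matches are not given in this text, and the ontology-based concept identification step is not covered.

```python
# Hypothetical keyword list for illustration; not taken from QTM.
QTL_KEYWORDS = ("qtl", "quantitative trait loc")


def is_qtl_table(caption, headers, keywords=QTL_KEYWORDS):
    """Return True if the caption or any column header mentions a keyword."""
    # Join caption and headers into one lowercase string for a simple
    # case-insensitive substring search.
    text = " ".join([caption, *headers]).lower()
    return any(keyword in text for keyword in keywords)


# Usage example with made-up captions and headers.
print(is_qtl_table("Table 2. QTLs detected for fruit weight",
                   ["Trait", "Chromosome", "LOD"]))
print(is_qtl_table("Table 1. Primer sequences", ["Primer", "Sequence"]))
```

A substring search keeps the sketch short and matches both "locus" and "loci", but it would also match unrelated tokens that happen to contain "qtl"; a real pipeline would likely tokenize the text first.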

