Abstract

The rapid development of high-throughput genomic, transcriptomic, proteomic and metabolomic technologies has led to an information explosion in plant biology and agrobiology. To date, the number of scientific publications on just one of the most important agricultural crops, Solanum tuberosum L. (potato), has exceeded 1.5 million. Effective access to knowledge distributed over so many non-formalized natural-language texts requires dedicated computer-assisted methods of intelligent data analysis (text mining). However, the literature contains no reports on the application of intelligent methods for automatic knowledge extraction from publications on agricultural crops such as potato. We previously developed a pilot version of the SOLANUM TUBEROSUM knowledge base. SOLANUM TUBEROSUM is a computer platform for the comprehensive intelligent processing of large bodies of data, including (1) automatic analysis of scientific publications and databases to extract information on potato genetics, markers, breeding, diagnostics, protection and storage technologies, (2) formalized representation of the extracted information in the knowledge base, (3) user access to these data, and (4) analysis and visualization of query results. The ontology of the SOLANUM TUBEROSUM knowledge base contains dictionaries of molecular genetic objects (proteins, genes, metabolites, microRNAs, biomarkers), phenotypic characteristics of potato varieties, potato diseases and pests, biotic and abiotic environmental factors, and potato agrobiotechnologies. This article describes the current version of the SOLANUM TUBEROSUM knowledge base, developed from an extensive analysis of scientific publications on the molecular-genetic regulation of metabolic pathways in potato and in model plant organisms (maize, rice, Arabidopsis thaliana). In total, about 9,000 full-text articles and more than 130,000 PubMed abstracts were analyzed.
Automatic analysis of the scientific publications identified more than 59,000 facts on molecular genetic interactions and genetic regulation, and the analysis of factual databases revealed more than 380,000 such interactions in the examined organisms. Only about 3% of the extracted facts on molecular genetic interactions and genetic regulation related to Solanum tuberosum L. Including information on well-studied model species is therefore important when extracting information on the molecular-genetic regulation of metabolic processes: it allows orthologous genes to be predicted in potato and then identified and analyzed on the basis of homology. An associative network of the genetic regulation of starch biosynthesis in potato was reconstructed. It includes 33 metabolites, 36 proteins, 6 metabolic pathways and 132 interactions between them, of which 86 describe catalytic reactions and the rest describe regulatory events. The reconstructed network is a basis for the search for target genes for directed mutagenesis and marker-oriented selection of potato varieties with specified starch properties. The trial version of the SOLANUM TUBEROSUM knowledge base is available at http://www-bionet.sysbio.cytogen.ru/and/plant/.
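
The composition of such a network can be pictured as a graph with typed nodes and typed edges. The Python below is a minimal sketch under that assumption; the class names, field names, and placeholder entities are hypothetical and are not taken from ANDSystem or from the knowledge base itself. It also makes the arithmetic in the abstract explicit: with 132 interactions, 86 of them catalytic, the remainder is 132 - 86 = 46 regulatory events.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    name: str
    kind: str  # e.g. "metabolite", "protein", "pathway"


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: str  # e.g. "catalysis", "regulation"


@dataclass
class AssociativeNetwork:
    """Hypothetical container for a typed interaction graph."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, name, kind):
        self.nodes[name] = Node(name, kind)

    def add_edge(self, source, target, kind):
        # Refuse edges whose endpoints are unknown, so counts stay consistent.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append(Edge(source, target, kind))

    def node_counts(self):
        return Counter(node.kind for node in self.nodes.values())

    def edge_counts(self):
        return Counter(edge.kind for edge in self.edges)


def regulatory_count(total_interactions, catalytic):
    """Interactions that are not catalytic reactions."""
    return total_interactions - catalytic


# Placeholder entities only; these are not facts from the knowledge base.
net = AssociativeNetwork()
net.add_node("protein_A", "protein")
net.add_node("metabolite_X", "metabolite")
net.add_node("pathway_P", "pathway")
net.add_edge("protein_A", "metabolite_X", "catalysis")
net.add_edge("protein_A", "pathway_P", "regulation")

print(net.node_counts())
print(net.edge_counts())
print(regulatory_count(132, 86))
```

Keeping the interaction type on the edge rather than on the node lets the same protein take part in both catalytic and regulatory events, which matches how the abstract counts interactions.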

Highlights

  • Keywords: potato; Solanum tuberosum L.; knowledge base; ANDSystem; starch biosynthesis; associative gene networks; genetic regulation

  • The SOLANUM TUBEROSUM knowledge base: the section on molecular-genetic regulation of metabolic pathways

  • Interactive reconstruction of associative gene networks on the basis of data extracted from the knowledge base


Summary

Potato genetics and breeding

The SOLANUM TUBEROSUM knowledge base: the section on molecular-genetic regulation of metabolic pathways. This article describes the current version of the SOLANUM TUBEROSUM knowledge base, obtained from an extended analysis of scientific publications on the molecular-genetic regulation of metabolic pathways in potato and in model plant organisms (maize, rice, Arabidopsis). How to cite this article: Ivanisenko T.V., Saik O.V., Demenkov P.S., Khlestkin V.K., Khlestkina E.K., Kolchanov N.A., Ivanisenko V.A. The SOLANUM TUBEROSUM knowledge base: the section on molecular-genetic regulation of metabolic pathways. We previously developed the SOLANUM TUBEROSUM knowledge base, the first computer platform in the world for the comprehensive intelligent analysis of scientific publications on potato growing (Saik et al., 2017). 1. The module for automatic analysis (text mining) of scientific publications and factual databases is designed to automatically extract information on relationships between objects according to the ontology of the knowledge base.
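
The idea of extracting relationships between ontology objects can be illustrated with a deliberately simplified Python sketch: dictionary lookup followed by sentence-level co-occurrence. This is an assumption made for illustration and is not the method used by the text-mining module; the dictionary entries, function names, and example text are all hypothetical.

```python
import re
from itertools import combinations

# Toy dictionaries standing in for the ontology; entries are illustrative only.
ONTOLOGY = {
    "gene": {"GBSS", "SBE1"},
    "metabolite": {"starch", "amylose"},
}


def find_mentions(sentence, ontology):
    """Return sorted (term, kind) pairs for dictionary terms found in a sentence."""
    mentions = []
    for kind, terms in ontology.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            if re.search(pattern, sentence, flags=re.IGNORECASE):
                mentions.append((term, kind))
    return sorted(mentions)


def extract_cooccurrences(text, ontology):
    """Pair up terms of different kinds that occur in the same sentence."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    facts = []
    for sentence in sentences:
        mentions = find_mentions(sentence, ontology)
        for (a, kind_a), (b, kind_b) in combinations(mentions, 2):
            if kind_a != kind_b:
                facts.append((a, b, sentence))
    return facts


# Made-up example text, used only to exercise the functions.
example = "GBSS is required for amylose synthesis. Starch accumulates in tubers."
for first, second, sentence in extract_cooccurrences(example, ONTOLOGY):
    print(first, "<->", second, "|", sentence)
```

Co-occurrence alone does not say what kind of relationship holds between two objects, so a real extraction system would need further rules or models to assign an interaction type to each pair.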

