Abstract

A very large number of scientific articles is currently available, covering a wide variety of topics such as medicine, technology, economics, and finance. Scientific papers report results of scientific interest and also present the evaluation and interpretation of relevant arguments. Because these papers are produced at a high rate, it is feasible to analyze how people write in a given domain. Within natural language processing there are different approaches to analyzing large text corpora. Identifying patterns with semantic elements in a text lets us classify and examine the corpus, which facilitates the interpretation and management of information by computers. At present, a semiautomatic or automatic way to generate natural language patterns is either not available or quite complicated. This paper shows how a tool developed for this research is tested in the domain of public health. The results, obtained with the tool and supported by graphs, include the groups of words that are used (to determine whether they come from a specific vocabulary), the most common grammatical categories, the most repeated words in the domain, the patterns found, and the frequency of those patterns. The selected public health domain contains 800 papers on different topics in genetics, including mutations, genetic deafness, DNA, trinucleotide, and suppressor genes, among others. An ontology of public health was used as the basis of the study.
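
To make the kinds of output listed in the abstract more concrete, the following is a minimal sketch of how word frequencies, grammatical-category frequencies, and category patterns could be counted over a small corpus. It is not the tool described in the paper. The lexicon `TOY_LEXICON`, the two sample sentences, the function names, and the choice of three-category patterns are all assumptions made for illustration; a real system would presumably rely on a trained part-of-speech tagger and on the public health ontology rather than on a hand-written word list.

```python
import re
from collections import Counter

# Hypothetical toy lexicon mapping words to grammatical categories.
# This is an assumption for illustration; a real tool would use a
# trained part-of-speech tagger instead of a fixed dictionary.
TOY_LEXICON = {
    "the": "DET", "a": "DET",
    "gene": "NOUN", "mutation": "NOUN", "deafness": "NOUN", "dna": "NOUN",
    "causes": "VERB", "suppresses": "VERB",
    "genetic": "ADJ",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def tag(tokens):
    """Assign each token a category from the toy lexicon ('X' if unknown)."""
    return [TOY_LEXICON.get(tok, "X") for tok in tokens]

def pattern_counts(tags, n=3):
    """Count every run of n consecutive grammatical categories."""
    return Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))

def analyze(documents, n=3):
    """Return word, category, and category-pattern frequencies for a corpus."""
    words, categories, patterns = Counter(), Counter(), Counter()
    for doc in documents:
        tokens = tokenize(doc)
        tags = tag(tokens)
        words.update(tokens)
        categories.update(tags)
        # Patterns are counted per document so they never span two documents.
        patterns.update(pattern_counts(tags, n))
    return words, categories, patterns

# Made-up sample sentences, not taken from the 800-paper corpus.
corpus = [
    "The mutation causes genetic deafness.",
    "A gene suppresses the mutation.",
]
words, categories, patterns = analyze(corpus)

print(words.most_common(3))       # most repeated words
print(categories.most_common(3))  # most common grammatical categories
print(patterns.most_common(3))    # most frequent category patterns
```

Counting patterns over category sequences rather than over raw words is one possible design choice: it groups sentences that share a grammatical shape even when their vocabulary differs, which is the sense in which the frequency of a pattern can be compared across a domain.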

