RAKE Algorithm Research Articles

Background: While clinical medicine has exploited electronic health records for Natural Language Processing (NLP) analyses, public health and health policy research have not yet adopted these algorithms. We aimed to dissect the health chapters of the government plans of the 2016 and 2021 Peruvian presidential elections, and to compare different NLP algorithms.

Methods: From the government plans (18 in 2016; 19 in 2021) we extracted each sentence from the health chapters. We used five NLP algorithms to extract keywords and phrases from each plan: Term Frequency–Inverse Document Frequency (TF-IDF), Latent Dirichlet Allocation (LDA), TextRank, Keywords Bidirectional Encoder Representations from Transformers (KeyBERT), and Rapid Automatic Keyword Extraction (RAKE).

Results: We analysed 630 sentences in 2016 and 1,685 sentences in 2021. The TF-IDF algorithm showed that in 2016, 26 terms appeared with a frequency of 0.08 or greater, while in 2021, 27 terms met this criterion. The LDA algorithm defined two groups. The first included terms related to things the population would receive (e.g., ‘insurance’), while the second included terms about the health system (e.g., ‘capacity’). In 2021, most of the government plans belonged to the second group. The TextRank analysis provided keywords showing that ‘universal health coverage’ appeared frequently in 2016, while in 2021 keywords about the COVID-19 pandemic were often found. The KeyBERT algorithm provided keywords based on the context of the text. These keywords identified some underlying characteristics of the political party (e.g., its position on the political spectrum, such as left-wing). The RAKE algorithm delivered phrases, among which we found ‘universal health coverage’ in both 2016 and 2021.

Conclusion: NLP analysis could be used to reveal the underlying priorities in each government plan. It could also be included in research on health policies and politics during general elections, and provide informative summaries for the general population.
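The abstract notes that RAKE returns phrases rather than single keywords. As a rough illustration of why, here is a minimal pure-Python sketch of the usual RAKE scoring scheme: text is split into candidate phrases at punctuation and stopwords, each word is scored as degree divided by frequency, and each phrase is scored as the sum of its word scores. This is not the authors' code. The stopword list, the sample sentence, and the function names `candidate_phrases` and `rake` are assumptions made for the example.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; real analyses use a fuller,
# language-appropriate list).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "are",
             "we", "will", "all", "with", "on", "by"}


def candidate_phrases(text, stopwords=STOPWORDS):
    """Split text into candidate phrases at punctuation and stopwords."""
    phrases = []
    for fragment in re.split(r"[.,;:!?()\n]+", text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9'-]+", fragment):
            if word in stopwords:
                # A stopword closes the phrase collected so far.
                if current:
                    phrases.append(tuple(current))
                    current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake(text, stopwords=STOPWORDS):
    """Return (phrase, score) pairs, highest score first."""
    phrases = candidate_phrases(text, stopwords)
    freq = defaultdict(int)    # how often each word occurs
    degree = defaultdict(int)  # summed length of the phrases containing it
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)
    word_score = {w: degree[w] / freq[w] for w in freq}
    scores = {" ".join(p): sum(word_score[w] for w in p) for p in set(phrases)}
    # Sort by descending score, then alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


sample = ("The goal is universal health coverage for all. "
          "We will strengthen the health system.")
for phrase, score in rake(sample):
    print(f"{score:4.1f}  {phrase}")
```

On this sample sentence the longest phrase, "universal health coverage", ranks first, because words that appear inside long phrases accumulate a high degree. That bias towards multi-word phrases is the behaviour the abstract describes.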

Articles published on RAKE Algorithm

Suicidal Ideation Detection and Influential Keyword Extraction from Twitter using Deep Learning (SID)

Government plans in the 2016 and 2021 Peruvian presidential elections: A natural language processing analysis of the health chapters

And the Rest is History: Measuring the Scope and Recall of Wikipedia’s Coverage of Three Women’s Movement Subgroups

An assortment of Query Based Summarization technique (QBS) – A Study

Predicting Bug Priority Using Topic Modelling in Imbalanced Learning Environments

Automatic scoring method of English composition based on language depth perception

Investigating the Relevant Agro Food Keyword in Malaysian Online Newspapers

The Development of IoT Compression Technique To Cloud

Efficiently Identification of Misrepresentation in Social Media Based on Rake Algorithm
