Recall-Oriented Understudy For Gisting Evaluation Research Articles

Objective: Automatic text summarization offers an efficient way to access the ever-growing body of scientific and clinical literature in the biomedical domain by condensing source documents while preserving their most informative content. In this paper, we propose a novel graph-based summarization method that takes advantage of domain-specific knowledge and a well-established data mining technique, frequent itemset mining.

Methods: Our summarizer uses the Unified Medical Language System (UMLS) to map the source document to biomedical concepts and build a concept-based model of it. It then discovers frequent itemsets to take the correlations among multiple concepts into account. The method uses these correlations to define a similarity function, from which a graph representation of the document is constructed. The summarizer then employs a minimum spanning tree based clustering algorithm to discover the various subthemes of the document. Finally, it generates the summary by selecting the most informative and relevant sentences from all subthemes within the text.

Results: We perform an automatic evaluation over a large number of summaries using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics. The results demonstrate that the proposed summarization system outperforms various baselines and benchmark approaches.

Conclusion: The research suggests that incorporating domain-specific knowledge and frequent itemset mining better equips the summarization system to measure the informativeness of sentences. Moreover, clustering the graph nodes (sentences) enables the summarizer to target the different main subthemes of a source document efficiently. The evaluation results show that the proposed approach can significantly improve the performance of summarization systems in the biomedical domain.
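The abstract describes the pipeline only at a high level, so the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes each sentence has already been mapped to a set of concept identifiers (the strings below are made up, not real UMLS codes), mines frequent itemsets by brute force, uses a hypothetical similarity (the total support of the frequent itemsets two sentences share), and forms clusters by building a minimum spanning tree with Kruskal's algorithm and dropping its longest edges. The function names, the distance `1 / (1 + similarity)`, and the parameters `min_support`, `max_size`, and `k` are all illustrative choices.

```python
from itertools import combinations

def frequent_itemsets(sentences, min_support, max_size=3):
    """Return {itemset: support} for concept sets that occur in at least
    a min_support fraction of the sentences (brute force, illustration only)."""
    n = len(sentences)
    counts = {}
    for concepts in sentences:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(concepts), size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}

def similarity(a, b, itemsets):
    """Assumed similarity: total support of frequent itemsets present in both sentences."""
    return sum(sup for s, sup in itemsets.items() if s <= a and s <= b)

def mst_clusters(sentences, itemsets, k):
    """Cluster sentence indices: build the MST of the complete sentence graph
    (Kruskal), then keep only its n - k shortest edges to obtain k groups."""
    n = len(sentences)
    edges = sorted(
        (1.0 / (1.0 + similarity(sentences[i], sentences[j], itemsets)), i, j)
        for i, j in combinations(range(n), 2)
    )
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            mst.append((w, i, j))

    # Removing the k - 1 longest MST edges leaves k connected components.
    kept = sorted(mst)[: max(n - k, 0)]
    parent = list(range(n))
    for _, i, j in kept:
        parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy document: four sentences as sets of made-up concept identifiers.
sentences = [
    {"C_diabetes", "C_insulin"},
    {"C_diabetes", "C_insulin", "C_glucose"},
    {"C_asthma", "C_inhaler"},
    {"C_asthma", "C_inhaler", "C_steroid"},
]
itemsets = frequent_itemsets(sentences, min_support=0.5)
clusters = mst_clusters(sentences, itemsets, k=2)
print(clusters)
```

On this toy input the two diabetes sentences and the two asthma sentences share frequent itemsets only within their own pair, so the tree edge joining the pairs is the longest and is the one removed. A summarizer along these lines would then pick sentences from each cluster; how the paper ranks sentences within a subtheme is not stated in the abstract.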

Objective: Automatic text summarization tools can help users in the biomedical domain access information efficiently from a large volume of scientific literature and other text sources. In this paper, we propose a summarization method that combines itemset mining and domain knowledge to construct a concept-based model and extract the main subtopics from an input document. Our summarizer quantifies the informativeness of each sentence using the support values of the itemsets appearing in the sentence.

Methods: To analyze the text at the concept level, our method first maps the original document to biomedical concepts using the Unified Medical Language System (UMLS). It then discovers the essential subtopics of the text using a data mining technique, namely itemset mining, and constructs the summarization model. The itemset mining algorithm extracts a set of frequent itemsets containing correlated and recurrent concepts of the input document. The summarizer selects the most relevant and informative sentences and generates the final summary.

Results: We evaluate the performance of our itemset-based summarizer in a set of experiments using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics. We compare the proposed method with GraphSum, TexLexAn, SweSum, SUMMA, AutoSummarize, the term-based version of the itemset-based summarizer, and two baselines. The itemset-based summarizer performs better than the compared methods and achieves the best scores for all the assessed ROUGE metrics (R-1: 0.7583, R-2: 0.3381, R-W-1.2: 0.0934, and R-SU4: 0.3889). We also perform a set of preliminary experiments to determine the best value for the minimum support threshold used in the itemset mining algorithm. The results demonstrate that the value of this threshold directly affects the accuracy of the summarization model: summarization performance decreases significantly when the threshold is set to an extreme value.

Conclusion: Compared to the statistical, similarity, and word frequency methods, the proposed method demonstrates that the summarization model obtained from concept extraction and itemset mining provides the summarizer with an effective metric for measuring the informative content of sentences. This can lead to an improvement in the performance of biomedical literature summarization.
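To make the idea of scoring sentences by itemset support concrete, here is a small sketch under stated assumptions rather than the paper's actual scoring formula. It assumes sentences are already mapped to sets of concept identifiers (the names below are invented, not UMLS codes), defines support as the fraction of sentences containing an itemset, and scores a sentence as the sum of the supports of the frequent itemsets it contains. The helper names and the parameters `min_support`, `max_size`, and `n_sentences` are hypothetical.

```python
from itertools import combinations

def mine_itemsets(concept_sets, min_support, max_size=2):
    """Return {itemset: support} for itemsets found in at least a
    min_support fraction of the sentences (brute force, illustration only)."""
    n = len(concept_sets)
    counts = {}
    for concepts in concept_sets:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(concepts), size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}

def score_sentence(concepts, itemsets):
    """Assumed informativeness: sum of supports of frequent itemsets in the sentence."""
    return sum(sup for items, sup in itemsets.items() if items <= concepts)

def summarize(sentences, concept_sets, min_support, n_sentences):
    """Pick the n_sentences highest-scoring sentences, returned in document order."""
    itemsets = mine_itemsets(concept_sets, min_support)
    scores = [score_sentence(c, itemsets) for c in concept_sets]
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    return [sentences[i] for i in sorted(ranked[:n_sentences])]

# Toy document with made-up concept identifiers for each sentence.
sentences = [
    "Metformin is a first-line drug for type 2 diabetes.",
    "In type 2 diabetes, metformin lowers HbA1c.",
    "The weather was mild during the study.",
    "HbA1c is monitored in type 2 diabetes.",
]
concept_sets = [
    {"metformin", "type2_diabetes"},
    {"metformin", "type2_diabetes", "hba1c"},
    {"weather"},
    {"hba1c", "type2_diabetes"},
]
summary = summarize(sentences, concept_sets, min_support=0.5, n_sentences=2)
print(summary)
```

The `min_support` parameter plays the role of the minimum support threshold mentioned in the abstract. In this sketch, a threshold of 1.0 leaves no frequent itemsets at all, so every sentence scores zero, while a very low threshold admits every concept combination, including one-off noise; both extremes make the scores less useful for ranking.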

Articles published on Recall-Oriented Understudy For Gisting Evaluation

Graph-based biomedical text summarization: An itemset mining and sentence clustering approach

A Hybrid Approach for Arabic Text Summarization Using Domain Knowledge and Genetic Algorithms

Natural Language Description of Video Streams Using Task-Specific Feature Encoding

Different approaches for identifying important concepts in probabilistic biomedical text summarization

Extractive multi-document text summarization using a multi-objective artificial bee colony optimization approach

Quantifying the informativeness for biomedical literature summarization: An itemset mining method

An Ontology-based Summarization System for Arabic Documents (OSSAD)

A semantic graph-based approach to biomedical summarisation

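Metodologia de acesso a dissertações de mestrado de tradução por estrangeiros, uma abordagem preliminar [Methodology for access by foreigners to master's dissertations in translation: a preliminary approach]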
