Abstract

The vast availability of information sources has created a need for research on automatic summarization. Current methods work either by extraction or by abstraction. Extraction methods are of particular interest because they are robust and largely independent of the language used. An extractive summary is obtained by selecting sentences from the original source according to their information content. This selection can be automated with a classification function induced by a machine learning algorithm, which classifies sentences into two groups, important and non-important; the important sentences then form the summary. However, the quality of this function depends directly on the training set used to induce it. This paper proposes an original way of optimizing this training set by inserting lexemes obtained from ontological knowledge bases, so that the optimized training set is reinforced with ontological knowledge. An experiment with four machine learning algorithms was carried out to validate this proposal, and the improvement achieved is clearly significant for each of them.
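
As a rough illustration of this pipeline, the sketch below (in Python) enriches labeled training sentences with related lexemes and induces a binary sentence classifier. WordNet accessed through NLTK stands in for the ontological knowledge base, and TF-IDF features with Naive Bayes stand in for one of the learning algorithms; the paper's actual resources, features, and classifiers are not specified in this excerpt, so all of these choices are assumptions.

```python
# Minimal sketch, not the paper's implementation: enrich labeled sentences
# with lexemes from an ontological resource (WordNet here, as an assumption),
# then induce a binary "important / non-important" sentence classifier.
# Assumes nltk.download("wordnet") has already been run.
from nltk.corpus import wordnet as wn
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

def enrich_with_lexemes(sentence):
    """Append lexemes related to each word (first WordNet sense only)."""
    extra = []
    for word in sentence.split():
        for synset in wn.synsets(word)[:1]:
            extra.extend(name.replace("_", " ") for name in synset.lemma_names())
    return sentence + " " + " ".join(extra)

# Toy training set: 1 = important sentence, 0 = non-important sentence.
train_sentences = ["The study proposes a new summarization method",
                   "The weather was pleasant that day"]
train_labels = [1, 0]

vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(enrich_with_lexemes(s) for s in train_sentences)
classifier = MultinomialNB().fit(X_train, train_labels)

# The extractive summary keeps the sentences predicted as important.
candidates = ["The new method improves classification accuracy",
              "Lunch was served at noon"]
X_cand = vectorizer.transform(enrich_with_lexemes(s) for s in candidates)
summary = [s for s, label in zip(candidates, classifier.predict(X_cand)) if label == 1]
```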

Highlights

  • Research on automatic summarization has greatly increased in recent years

  • This paper proposes an original way of optimizing the training set used to induce the classification function, by inserting lexemes obtained from ontological knowledge bases

  • When the classification process is finished, every analyzed sentence falls into one of four categories: True Positive (TP), the function correctly predicts an important sentence as important; True Negative (TN), the function correctly predicts a non-important sentence as non-important; False Positive (FP), the function incorrectly predicts a non-important sentence as important; False Negative (FN), the function incorrectly predicts an important sentence as non-important (these counts drive the evaluation sketch below)
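
Since classifier quality is later summarized with ROC curves (see the section outline below), the short Python sketch that follows counts these four categories from hypothetical label vectors and derives the two ROC-curve axes; the variable names and example labels are illustrative only.

```python
# Count the four categories from the highlight above, assuming label 1
# marks an important sentence and label 0 a non-important one.
def confusion_counts(true_labels, predicted_labels):
    tp = sum(t == 1 and p == 1 for t, p in zip(true_labels, predicted_labels))
    tn = sum(t == 0 and p == 0 for t, p in zip(true_labels, predicted_labels))
    fp = sum(t == 0 and p == 1 for t, p in zip(true_labels, predicted_labels))
    fn = sum(t == 1 and p == 0 for t, p in zip(true_labels, predicted_labels))
    return tp, tn, fp, fn

# Hypothetical gold labels and classifier predictions for five sentences.
tp, tn, fp, fn = confusion_counts([1, 0, 1, 0, 1], [1, 1, 0, 0, 1])

true_positive_rate = tp / (tp + fn)    # y-axis of a ROC curve (recall)
false_positive_rate = fp / (fp + tn)   # x-axis of a ROC curve
```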

Summary

Introduction

Research on automatic summarization has greatly increased in recent years, as digital sources of information have become ever more widely available. A summary obtained by extraction is composed of a set of sentences selected from the source document(s) using statistical or heuristic methods based on the information entropy of the sentences. Automatic summarization by abstraction is usually decomposed into three steps: interpretation of the source document(s) to obtain a representation, transformation of this representation, and production of a textual synthesis [5]. Both approaches have their advantages and drawbacks. Data that are too scattered make it difficult to obtain a good estimate or good classification models. This problem is tackled with heuristic methods based on linear approximations, which optimize the training set by reducing it or by constructing a new, smaller set from another series of attributes [9].
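
To make the extraction idea concrete, the sketch below scores sentences with a simple information-content heuristic (average negative log word frequency) and keeps the top-ranked ones. This is an illustrative stand-in for the statistical methods mentioned above, not the scoring used in the paper.

```python
# Illustrative entropy-style sentence scoring: rare words carry more
# information (-log p), so a sentence's score is the average information
# content of its words, and the top-scoring sentences form the summary.
import math
from collections import Counter

def extractive_summary(sentences, k=2):
    words = [w.lower() for s in sentences for w in s.split()]
    freq = Counter(words)
    total = sum(freq.values())

    def score(sentence):
        tokens = [w.lower() for w in sentence.split()]
        return sum(-math.log(freq[w] / total) for w in tokens) / len(tokens)

    return sorted(sentences, key=score, reverse=True)[:k]

docs = ["Automatic summarization selects informative sentences.",
        "It was a sunny afternoon.",
        "Extractive methods rank sentences by information content."]
print(extractive_summary(docs, k=2))
```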

Inserting Ontological Knowledge in the Summary Extraction Process
Summarization Process Considered
Insertion of Ontological Knowledge
Evaluation Method
Experiment and Results
Results for ROC Curves
Conclusions