Using AdaBoost Meta-Learning Algorithm for Medical News Multi-Document Summarization

Mahdi Gholami Mehr

doi:10.4236/iim.2013.56020

Abstract

Automatic text summarization reduces a text document, or a larger corpus of multiple documents, to a short set of sentences or paragraphs that conveys the main meaning of the text. In this paper, we discuss multi-document summarization, which differs from single-document summarization in that the issues of compression, speed, redundancy and passage selection are critical to forming useful summaries. Because the number and variety of online medical news items make it difficult for experts in the medical field to read all of them, automatic multi-document summarization can make information on the web easier to study. We therefore propose a new summarization approach based on AdaBoost, a machine learning meta-learning algorithm. We treat a document as a set of sentences, and the learning algorithm must learn to classify sentences as positive or negative examples based on their scores. For this learning task, we apply the AdaBoost meta-learning algorithm with a C4.5 decision tree as the base learner. In our experiment, we use 450 news items downloaded from different medical websites. We then compare our results with some existing approaches.

Highlights

A large amount of medical news is now published online, and experts in the medical field cannot study this volume of information [1].
We present a machine-learning-based model for extractive (sentence-extraction-based), multi-document, informative text summarization in the medical domain. This work improves on the study proposed in [5].
We treat a document as a set of sentences, each of which must be classified as a positive or negative example according to its summary worthiness. Each sentence is represented by a feature set that includes a number of features used in the summarization literature and some other features specific to the medical domain.

Summary

Introduction

A large amount of medical news is now published online, and experts in the medical field cannot study this volume of information [1]. We present a machine-learning-based model for extractive (sentence-extraction-based), multi-document, informative text summarization in the medical domain. This work improves on the study proposed in [5]. We treat a document as a set of sentences, each of which must be classified as a positive or negative example according to its summary worthiness. Each sentence is represented by a feature set that includes a number of features used in the summarization literature and some other features specific to the medical domain. The first and better understood effect of boosting is that it generates a hypothesis whose error on the training set is small by combining many hypotheses whose error may be large (but still better than random guessing). It seems that boosting may be helpful to learning problems having either of the following two properties.
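The boosting effect described above, where several weak hypotheses combine into one with a small training error, can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the base learner is a one-level decision stump standing in for the C4.5 tree, each sentence is reduced to a single hypothetical numeric feature, and the toy labels (+1 for summary-worthy, -1 otherwise) are invented for illustration.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump.

    A stump predicts `polarity` when row[feature] >= threshold, else -polarity.
    """
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] >= thr else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best

def stump_predict(stump, row):
    _, j, thr, polarity = stump
    return polarity if row[j] >= thr else -polarity

def adaboost(X, y, rounds):
    """Train up to `rounds` stumps, reweighting misclassified sentences."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sentence weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)         # avoid division by zero
        if err >= 0.5:                     # no better than random guessing
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified sentences, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def classify(ensemble, row):
    """Weighted vote of all stumps: +1 (summary-worthy) or -1."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Toy data (hypothetical): one feature value per sentence, not from the paper.
X = [[0], [1], [2], [3], [4], [5], [6], [7], [8], [9]]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

def training_errors(ensemble):
    return sum(classify(ensemble, row) != yi for row, yi in zip(X, y))

one_round = adaboost(X, y, rounds=1)
three_rounds = adaboost(X, y, rounds=3)
print("errors after 1 round:", training_errors(one_round))
print("errors after 3 rounds:", training_errors(three_rounds))
```

On this toy set no single stump separates the two classes, yet the weighted vote of three stumps does, which is the behaviour the paragraph attributes to boosting. A real system would replace the single feature with the full sentence feature set and the stump with a decision tree learner.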

Related Work

Summarization Method

Using AdaBoost for Sentence Extraction

Summary generation

Experimental Results

Proposed Method Mead

Journal: Intelligent Information Management	Publication Date: Jan 1, 2013
Citations: 21	License type: CC BY 4.0
