Automatic Structured Abstract for Research Papers Supported by Tabular Format using NLP

Zainab Almugbel, Neda Bugshan, Nahla El

doi:10.14569/ijacsa.2019.0100231

Abstract

The abstract is a comprehensive summary of a scientific paper that helps readers decide quickly whether to read it. A structured abstract is useful for representing the major components of the paper, which in turn makes it easier to extract information about the study. Despite the importance of the structured abstract, many computer science research papers do not use one, which may lead to weak abstracts. This paper aims to apply natural language processing (NLP) techniques and machine learning to conventional abstracts in order to automatically generate structured abstracts in the IMRaD (Introduction, Methods, Results, and Discussion) format, which is predominant in medical and scientific writing. The effectiveness of such sentence classification, that is, the ability of a method to produce the expected outcome when classifying the sentences of unstructured computer science abstracts into IMRaD sections, depends on both feature selection and the classification algorithm. This is achieved with the IMRaD Classifier, which measures the similarity between sentences of structured and unstructured abstracts from different research papers and then classifies each sentence into one of the IMRaD format tags based on the measured similarity value. Finally, the IMRaD Classifier is evaluated by applying Naïve Bayes (NB) and Support Vector Machine (SVM) classifiers to the same dataset. To conduct this work, we use a dataset of 250 conventional computer science abstracts from the period 2015 to 2018, collected from two main websites: DBLP and the IOS Press content library. In this paper, 200 XML-based files are used for training and 50 XML-based files are used for testing. Thus, the dataset is 4x250 files, where each file contains a set of sentences that belong to different abstracts but to the same IMRaD section.
The experimental results show that Naïve Bayes (NB) predicts better outcomes for each class (Introduction, Methods, Results, Discussion and Conclusion) than the Support Vector Machine (SVM). Furthermore, the performance of the classifier depends on selecting an appropriate number of representative features from the text.
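
The abstract does not give implementation details, so the following is only a minimal sketch of one of the two baselines it names: a multinomial Naïve Bayes sentence tagger whose vocabulary is limited to the k most frequent training words, as a simple stand-in for feature selection. The class name `NaiveBayesTagger`, the frequency-based selection rule, add-one smoothing, and the toy training sentences are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def tokenize(sentence):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    words = (w.strip(".,;:()").lower() for w in sentence.split())
    return [w for w in words if w]


class NaiveBayesTagger:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self, k):
        self.k = k  # number of most frequent words kept as features

    def fit(self, sentences, tags):
        counts = Counter(w for s in sentences for w in tokenize(s))
        self.vocab = {w for w, _ in counts.most_common(self.k)}
        self.tag_counts = Counter(tags)
        self.word_counts = defaultdict(Counter)
        for s, t in zip(sentences, tags):
            self.word_counts[t].update(w for w in tokenize(s) if w in self.vocab)
        return self

    def predict(self, sentence):
        # Words outside the selected vocabulary are ignored.
        words = [w for w in tokenize(sentence) if w in self.vocab]
        total = sum(self.tag_counts.values())
        best_tag, best_score = None, -math.inf
        for tag, n in self.tag_counts.items():
            denom = sum(self.word_counts[tag].values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for w in words:
                score += math.log((self.word_counts[tag][w] + 1) / denom)
            if score > best_score:
                best_tag, best_score = tag, score
        return best_tag


# Hypothetical training sentences, invented for this sketch.
TRAIN = [
    ("Structured abstracts are rarely used in computer science papers", "Introduction"),
    ("This paper aims to generate structured abstracts automatically", "Introduction"),
    ("We use a dataset of abstracts collected from two websites", "Methods"),
    ("Sentences are classified using a support vector machine", "Methods"),
    ("The experimental results show that accuracy improved", "Results"),
    ("Results show the classifier predicts better outcomes", "Results"),
    ("We conclude that feature selection affects performance", "Discussion"),
    ("Future work will extend the approach to full papers", "Discussion"),
]
SENTENCES = [s for s, _ in TRAIN]
TAGS = [t for _, t in TRAIN]

for k in (5, 100):
    model = NaiveBayesTagger(k).fit(SENTENCES, TAGS)
    print(k, len(model.vocab), model.predict("The results show better accuracy"))
```

Varying `k` changes how many words the tagger can see, which is one simple way to explore the abstract's remark that performance depends on the number of representative features.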

Highlights

The abstract is crucial for authors to state the aim and content of their papers
A new technique is proposed that uses natural language processing (NLP) techniques and machine learning to automatically structure unstructured abstracts according to the IMRaD (Introduction, Methods, Results, and Discussion) format
This approach is applied to short-text classification: it classifies unstructured abstracts by measuring the similarity between their sentences and the sentences of structured abstracts found in other research papers
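
The similarity-based tagging described in the last highlight can be sketched as follows. The page does not say which similarity measure the IMRaD Classifier uses, so cosine similarity over bag-of-words counts is an assumption here, as are the function names and the labeled reference sentences, which are invented for illustration.

```python
import math
from collections import Counter


def bag_of_words(sentence):
    """Count lowercase words after stripping basic punctuation."""
    words = (w.strip(".,;:()").lower() for w in sentence.split())
    return Counter(w for w in words if w)


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def tag_sentence(sentence, reference):
    """Return (tag, similarity) of the most similar labeled reference sentence."""
    query = bag_of_words(sentence)
    scored = ((tag, cosine(query, bag_of_words(ref))) for ref, tag in reference)
    return max(scored, key=lambda pair: pair[1])


# Hypothetical sentences standing in for sentences from structured abstracts.
REFERENCE = [
    ("Structured abstracts help readers decide quickly whether to read a paper", "Introduction"),
    ("We trained the classifier on sentences taken from structured abstracts", "Methods"),
    ("The results show that Naive Bayes outperforms the support vector machine", "Results"),
    ("These findings suggest that feature selection matters for short text", "Discussion"),
]

tag, similarity = tag_sentence(
    "Experimental results show that Naive Bayes predicts better outcomes", REFERENCE
)
print(tag, round(similarity, 3))
```

Taking the single nearest reference sentence is the simplest decision rule. Averaging the similarity per IMRaD section, or requiring a minimum similarity before assigning a tag, would be equally plausible readings of the description.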

Summary

Objectives

This paper aims to apply natural language processing (NLP) techniques and machine learning to conventional abstracts to automatically generate structured abstracts in the IMRaD (Introduction, Methods, Results, and Discussion) format, which is predominant in medical and scientific writing.
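
Once each sentence has an IMRaD tag, assembling the structured abstract amounts to grouping sentences by tag. The sketch below shows one possible tabular rendering. The two-column layout, the function names, and the example sentences are assumptions for illustration, since the page does not describe the paper's actual table format.

```python
IMRAD_ORDER = ["Introduction", "Methods", "Results", "Discussion"]


def to_structured_abstract(tagged_sentences):
    """Group (sentence, tag) pairs by tag, keeping IMRaD and sentence order.

    A tag outside IMRAD_ORDER raises KeyError.
    """
    sections = {tag: [] for tag in IMRAD_ORDER}
    for sentence, tag in tagged_sentences:
        sections[tag].append(sentence)
    return sections


def render_table(sections):
    """Render one row per section as 'Tag | text' with aligned tag column."""
    width = max(len(tag) for tag in sections)
    rows = []
    for tag, sentences in sections.items():
        text = " ".join(sentences) if sentences else "(none)"
        rows.append(f"{tag.ljust(width)} | {text}")
    return "\n".join(rows)


# Hypothetical classifier output for a short conventional abstract.
TAGGED = [
    ("Many computer science papers lack structured abstracts.", "Introduction"),
    ("We classify each sentence by similarity.", "Methods"),
    ("Naive Bayes performs better than SVM.", "Results"),
    ("We train on 200 files and test on 50.", "Methods"),
]

print(render_table(to_structured_abstract(TAGGED)))
```

Sections with no assigned sentences are shown as `(none)` in this sketch so that gaps in a conventional abstract remain visible in the table.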

Journal: International Journal of Advanced Computer Science and Applications
Publication Date: Jan 1, 2019
Citations: 2
License type: cc-by
