Abstract
Various approaches to text simplification have been proposed in an attempt to increase text readability, but rephrasing syntactically and semantically complex structures remains challenging. A pedagogically motivated simplified version of a text can have both positive and negative side effects. On the one hand, it can facilitate reading comprehension because of much shorter sentences and a limited vocabulary; on the other hand, the simplified text often lacks coherence, unity and style. Reasonable trade-offs among linguistic simplicity, naturalness and informativeness are therefore needed. This survey paper discusses state-of-the-art approaches to sentence and text simplification and their evaluation methods, and presents an empirical evaluation of our approach. The quality of sentence splitting produced by the knowledge extraction tool SAAT was compared with that of state-of-the-art syntactic simplification systems. The research was carried out on the WikiSplit, HSplit and MinWikiSplit simplification corpora. Automatic metrics for HSplit showed that SAAT outperformed other TS systems in all categories. For the WikiSplit dataset, automatic metric scores were slightly lower than those of the baseline system DisSim. However, the human evaluation showed that DisSim outperformed SAAT in terms of simplicity and grammar. The quality of AG18copy output corresponded to that of SAAT. Inter-annotator agreement was also calculated. Research limitations and suggestions for future research are also provided.