Making It Simplext

Abstract

The way in which a text is written can be a barrier for many people. Automatic text simplification is a natural language processing technology that, when mature, could be used to produce texts that are adapted to the specific needs of particular users. Most research in the area of automatic text simplification has dealt with the English language. In this article, we present results from the Simplext project, which is dedicated to automatic text simplification for Spanish. We present a modular system with dedicated procedures for syntactic and lexical simplification that are grounded in the analysis of a corpus manually simplified for people with special needs. We carried out an automatic evaluation of the system’s output, taking into account the interaction between three different modules dedicated to different simplification aspects. One evaluation is based on readability metrics for Spanish and shows that the system is able to reduce the lexical and syntactic complexity of the texts. We also show, by means of a human evaluation, that sentence meaning is preserved in most cases. Although our work represents the first automatic text simplification system for Spanish that addresses different linguistic aspects, our results are comparable to the state of the art in English automatic text simplification.
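The idea of scoring a text before and after simplification can be sketched as follows. This is a minimal illustration only: the two surface proxies used here (words per sentence and characters per word) are assumptions chosen for simplicity, not the readability metrics for Spanish used in the Simplext evaluation, and the function name and example sentences are hypothetical.

```python
import re


def readability_proxies(text: str) -> dict:
    """Return two crude surface proxies for syntactic and lexical complexity."""
    # Split on runs of sentence-final punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # \w is Unicode-aware for str patterns, so accented Spanish letters count.
    words = re.findall(r"\w+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")
    return {
        "words_per_sentence": len(words) / len(sentences),
        "chars_per_word": sum(len(w) for w in words) / len(words),
    }


# Hypothetical example: a relative clause split into two short sentences.
original = "El ministro, que llegó ayer, anunció medidas."
simplified = "El ministro llegó ayer. El ministro anunció medidas."

print(readability_proxies(original))
print(readability_proxies(simplified))
```

In this toy pair the simplified version has shorter sentences on average, which is the kind of reduction a syntactic simplification module would be expected to produce; real readability formulas combine more features than these two.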

Similar Papers
  • Conference Article
  • Cite Count: 3
  • 10.1109/iaeac50856.2021.9390937
Chinese Automatic Text Simplification Based on Unsupervised Learning
  • Mar 12, 2021
  • Yang Sen + 1 more

In this paper, a Chinese automatic text simplification (ATS) method based on unsupervised learning is introduced. Automatic text simplification is a research field of natural language processing. For Chinese texts, relying on a hand-made simplified corpus or dictionary is not practical because of the large number of texts. Chinese is a diverse language, and numerous factors need to be taken into consideration. An automatic simplification method for Chinese text and a readability formula based on linear regression are proposed in this paper. With our method, a set of Chinese sentences is given as input, and more comprehensible sentences are obtained through syntactic simplification and lexical simplification. In the automatic evaluation on the hand-made simplified corpus, the readability score of our system increased by 3.68 compared with that of the original text, and the SARI score reached 36.02.

  • Conference Article
  • Cite Count: 44
  • 10.1145/3313831.3376563
Automatic Text Simplification Tools for Deaf and Hard of Hearing Adults: Benefits of Lexical Simplification and Providing Users with Autonomy
  • Apr 21, 2020
  • Oliver Alonzo + 3 more

Automatic Text Simplification (ATS), which replaces text with simpler equivalents, is rapidly improving. While some research has examined ATS reading-assistance tools, little has examined preferences of adults who are deaf or hard-of-hearing (DHH), and none empirically evaluated lexical simplification technology (replacement of individual words) with these users. Prior research has revealed that U.S. DHH adults have lower reading literacy on average than their hearing peers, with unique characteristics to their literacy profile. We investigate whether DHH adults perceive a benefit from lexical simplification applied automatically or when users are provided with greater autonomy, with on-demand control and visibility as to which words are replaced. Formative interviews guided the design of an experimental study, in which DHH participants read English texts in their original form and with lexical simplification applied automatically or on-demand. Participants indicated that they perceived a benefit from lexical simplification, and they preferred a system with on-demand simplification.

  • Research Article
  • Cite Count: 5
  • 10.1109/access.2022.3174846
Pattern-Based Syntactic Simplification of Compound and Complex Sentences
  • Jan 1, 2022
  • IEEE Access
  • Archana Praveen Kumar + 4 more

With the advent of new technologies, simplifying text automatically has been very popular and of high importance among natural language researchers during the last decade. The predominant research done in the area of Automatic Sentence Simplification (ASS) is inclined to either lexical or syntactical simplification of sentences. From the literature survey, it is observed that existing research in lexical simplification makes use of word substitution techniques. This causes word sense ambiguity in cases where the word synonyms are not appropriate for a sentence in the given context. In contrast, syntactical simplification, though accurate and applicable to Natural Language Processing (NLP) tasks, requires tremendous effort to construct rules for a given domain. The research proposes a framework called Pattern-based Automatic Syntactic Simplification (PASS), which identifies sentences and applies rules based on grammatical patterns to simplify the sentences, thereby making it more generic for NLP tasks. PASS is evaluated by human experts to rate the usefulness of the framework based on fluency, adequacy and simplicity of the sentences. Furthermore, the framework is automatically evaluated with the available online corpus using the automatic metrics SARI, BLEU, and FKGL. The proposed approach generates promising results in the field of ASS and could be used as a preliminary module for NLP tasks as well as other natural language-related applications like summarization, anaphora resolution, question-answering, and many more.
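Of the automatic metrics named in the abstract above, FKGL (Flesch-Kincaid Grade Level) is simple enough to sketch directly. The snippet below applies the standard published formula to pre-computed counts. The function name and the example counts are illustrative assumptions and are not taken from PASS; syllable counting, which depends on the language and tool used, is left to the caller.

```python
def fkgl(total_words: int, total_sentences: int, total_syllables: int) -> float:
    """Flesch-Kincaid Grade Level computed from raw counts."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    # Standard FKGL coefficients.
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical counts: the same 100 words, split into 4 versus 8 sentences.
before = fkgl(total_words=100, total_sentences=4, total_syllables=150)
after = fkgl(total_words=100, total_sentences=8, total_syllables=150)
print(f"before: {before:.2f}, after: {after:.2f}")
```

Because the sentence-length term falls when long sentences are split, purely syntactic simplification lowers FKGL even if no word is replaced. SARI and BLEU need reference simplifications and are not sketched here.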

  • Conference Article
  • Cite Count: 1
  • 10.5121/csit.2022.121518
GRASS: A Syntactic Text Simplification System based on Semantic Representations
  • Sep 17, 2022
  • Rita Hijazi + 2 more

Automatic Text Simplification (ATS) is the process of reducing a text's linguistic complexity to improve its understandability and readability while maintaining its original information, content, and meaning. Several text transformation operations can be performed such as splitting a sentence into several shorter sentences, substitution of complex elements, and reorganization. It has been shown that the implementation of these operations essentially at a syntactic level causes several problems that could be solved by using semantic representations. In this paper, we present GRASS (GRAph-based Semantic representation for syntactic Simplification), a rule-based automatic syntactic simplification system that uses semantic representations. The system allows the syntactic transformation of complex constructions, such as subordination clauses, appositive clauses, coordination clauses, and passive forms into simpler sentences. It is based on graph-based meaning representation of the text expressed in DMRS (Dependency Minimal Recursion Semantics) notation and it uses rewriting rules. The experimental results obtained on a reference corpus and according to specific metrics outperform the results obtained by other state-of-the-art systems on the same reference corpus.

  • Conference Article
  • Cite Count: 8
  • 10.26615/978-954-452-056-4_131
Automated Text Simplification as a Preprocessing Step for Machine Translation into an Under-resourced Language
  • Oct 22, 2019
  • Sanja Štajner + 1 more

In this work, we investigate the possibility of using a fully automatic text simplification system on the English source in machine translation (MT) for improving its translation into an under-resourced language. We use a state-of-the-art automatic text simplification (ATS) system for lexically and syntactically simplifying source sentences, which are then translated with two state-of-the-art English-to-Serbian MT systems, the phrase-based MT (PBMT) and the neural MT (NMT). We explore three different scenarios for using the ATS in MT: (1) using the raw output of the ATS; (2) automatically filtering out the sentences with low grammaticality and meaning preservation scores; and (3) performing a minimal manual correction of the ATS output. Our results show improvement in fluency of the translation regardless of the chosen scenario, and a difference in success of the three scenarios depending on the MT approach used (PBMT or NMT) with regard to improving translation fluency and post-editing effort.
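Scenario (2) above, automatic filtering by quality scores, can be sketched as a simple gate in front of the MT system. Everything concrete here is an assumption made for illustration: the `Candidate` type, the score scale, the threshold values, and in particular the choice to fall back to the unsimplified source sentence when a simplification is rejected, which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    source: str                  # original English sentence
    simplified: str              # ATS output for that sentence
    grammaticality: float        # hypothetical automatic score, higher is better
    meaning_preservation: float  # hypothetical automatic score, higher is better


def select_mt_input(c: Candidate, min_gram: float = 3.0, min_meaning: float = 3.0) -> str:
    """Pass the simplified sentence to MT only if both scores clear their thresholds."""
    if c.grammaticality >= min_gram and c.meaning_preservation >= min_meaning:
        return c.simplified
    # Assumed fallback: translate the original sentence instead.
    return c.source


# Hypothetical candidates with made-up scores.
accepted = Candidate("The cat, which was hungry, ate.", "The cat was hungry. It ate.", 4.5, 4.0)
rejected = Candidate("The cat, which was hungry, ate.", "The cat hungry ate was.", 1.5, 3.5)

print(select_mt_input(accepted))
print(select_mt_input(rejected))
```

Scenario (1) corresponds to always returning `c.simplified`, and scenario (3) replaces the automatic gate with a human correction step.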

  • Research Article
  • Cite Count: 10
  • 10.22099/jtls.2017.26325.2324
The Effect of Reducing Lexical and Syntactic Complexity of Texts on Reading Comprehension
  • Oct 1, 2017
  • Journal of Teaching Language Skills
  • Mahmood Safari + 1 more

The present study investigated the effect of different types of text simplification (i.e., reducing the lexical and syntactic complexity of texts) on the reading comprehension of English as a Foreign Language (EFL) learners. Sixty female intermediate EFL learners from three intact classes in Tabarestan Language Institute in Tehran participated in the study. The intact classes were assigned to three experimental groups. Moreover, to homogenize the groups, the researchers administered a general proficiency test (TOEFL, 2003) to the participants. The results revealed no significant difference among the groups in general proficiency and reading ability. Then four reading comprehension texts from a TOEFL test (2005) were simplified through lexical simplification, syntactic simplification or lexical-syntactic simplification techniques. The simplified texts, along with their reading comprehension (RC) questions, formed the three versions of the post-test; each version contained either lexically, syntactically or lexical-syntactically simplified texts. Each group took one version of the post-test. The scores were analyzed through one-way ANOVA. The results revealed a significant difference among the groups. The post hoc test indicated that the lexical-syntactic simplification group significantly outperformed the lexical simplification group and performed considerably better than the syntactic simplification group. There was no significant difference between the lexical and syntactic simplification groups, although the latter showed better results.
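The one-way ANOVA mentioned above compares the variation between group means with the variation inside the groups. A minimal sketch of the F statistic, using only the Python standard library, is shown below. The scores are invented for illustration and are not the study's data; the function name is likewise hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups: list[list[float]]) -> float:
    """Return the F statistic of a one-way ANOVA over the given groups."""
    all_scores = [x for g in groups for x in g]
    k, n = len(groups), len(all_scores)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more scores than groups")
    grand_mean = mean(all_scores)
    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variance; F is undefined")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Invented post-test scores for three groups (not the study's data).
lexical = [11, 12, 13, 12]
syntactic = [12, 13, 14, 13]
lexical_syntactic = [15, 16, 17, 16]

print(one_way_anova_f([lexical, syntactic, lexical_syntactic]))
```

A large F only indicates that at least one group mean differs. Turning F into a p-value requires the F distribution, and identifying which pairs of groups differ requires a post hoc test such as the one the study reports; neither is computed here.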

  • Book Chapter
  • Cite Count: 4
  • 10.1093/oxfordhb/9780199573691.013.52
Text Simplification
  • Feb 5, 2018
  • Horacio Saggion

Over the past decades, information has been made available to a broad audience thanks to the availability of texts on the Web. However, understanding the wealth of information contained in texts can pose difficulties for a number of people including those with poor literacy, cognitive or linguistic impairment, or those with limited knowledge of the language of the text. Text simplification was initially conceived as a technology to simplify sentences so that they would be easier to process by natural-language processing components such as parsers. However, nowadays automatic text simplification is conceived as a technology to transform a text into an equivalent which is easier to read and to understand by a target user. Text simplification concerns both the modification of the vocabulary of the text (lexical simplification) and the modification of the structure of the sentences (syntactic simplification). In this chapter, after briefly introducing the topic of text readability, we give an overview of past and recent methods to address these two problems. We also describe simplification applications and full systems, and outline language resources and evaluation approaches.

  • Research Article
  • Cite Count: 4
  • 10.1162/tacl_a_00653
Do Text Simplification Systems Preserve Meaning? A Human Evaluation via Reading Comprehension
  • Apr 16, 2024
  • Transactions of the Association for Computational Linguistics
  • Sweta Agrawal + 1 more

Automatic text simplification (TS) aims to automate the process of rewriting text to make it easier for people to read. A pre-requisite for TS to be useful is that it should convey information that is consistent with the meaning of the original text. However, current TS evaluation protocols assess system outputs for simplicity and meaning preservation without regard for the document context in which output sentences occur and for how people understand them. In this work, we introduce a human evaluation framework to assess whether simplified texts preserve meaning using reading comprehension questions. With this framework, we conduct a thorough human evaluation of texts by humans and by nine automatic systems. Supervised systems that leverage pre-training knowledge achieve the highest scores on the reading comprehension tasks among the automatic controllable TS systems. However, even the best-performing supervised system struggles with at least 14% of the questions, marking them as “unanswerable” based on simplified content. We further investigate how existing TS evaluation metrics and automatic question-answering systems approximate the human judgments we obtained.

  • Book Chapter
  • 10.1007/978-3-031-02166-4_9
Conclusion
  • Jan 1, 2017
  • Synthesis lectures on human language technologies
  • Horacio Saggion

In recent years, automatic text simplification has attracted the attention of researchers in natural language processing. Research is improving steadily. It is a difficult task for human editors to produce a text that will match the reading abilities of a target population. Therefore, it is an even more difficult task for machines, which are, for the time being, deprived of the necessary linguistic and world knowledge. However, by addressing such an important societal challenge, researchers have created new methods and repurposed old ones. In this book, we have partially covered three relevant simplification topics: text readability, lexical simplification, and syntactic simplification.

  • Conference Article
  • Cite Count: 1
  • 10.1109/bip56202.2022.10032482
Towards Text Simplification in Spanish: A Brief Overview of Deep Learning Approaches for Text Simplification
  • Nov 15, 2022
  • Mario Romero + 5 more

Text simplification refers to the transformation of a specific source text into a target text aiming to increase understanding and readability for one or more specific audiences. This task demands large human efforts and specialized knowledge, which makes the usage of automated or semi-automated computational approaches appealing. The rise of deep learning as a unifying paradigm between seemingly different fields such as image analysis, sound processing and natural language processing has considerably influenced the current state-of-the-art approaches for automatic text simplification. Therefore, in this work, we focus on the study of deep learning-based state-of-the-art methods for automatic text simplification in the Spanish language. To this end, we first disentangle the different tasks which can be addressed in order to yield a simplified text in general. Later we review the latest deep learning-based approaches, along with the main datasets and performance metrics used in the field. We also describe approaches to deal with small datasets and technical words. Finally, we describe some lessons for building accurate automatic text simplification systems in Spanish, as in this language there is a noticeable shortage of work on text simplification.

  • Conference Article
  • Cite Count: 2
  • 10.5167/uzh-192839
A Corpus for Automatic Readability Assessment and Text Simplification of German
  • May 16, 2020
  • Alessia Battisti + 4 more

In this paper, we present a corpus for use in automatic readability assessment and automatic text simplification for German, the first of its kind for this language. The corpus is compiled from web sources and consists of parallel as well as monolingual-only (simplified German) data amounting to approximately 6,200 documents (nearly 211,000 sentences). As a unique feature, the corpus contains information on text structure (e.g., paragraphs, lines), typography (e.g., font type, font style), and images (content, position, and dimensions). While the importance of considering such information in machine learning tasks involving simplified language, such as readability assessment, has repeatedly been stressed in the literature, we provide empirical evidence for its benefit. We also demonstrate the added value of leveraging monolingual-only data for automatic text simplification via machine translation through applying back-translation, a data augmentation technique.

  • Book Chapter
  • Cite Count: 26
  • 10.1007/978-3-642-37256-8_40
Automatic Text Simplification in Spanish: A Comparative Evaluation of Complementing Modules
  • Jan 1, 2013
  • Biljana Drndarević + 4 more

In this paper we present two components of an automatic text simplification system for Spanish, aimed at making news articles more accessible to readers with cognitive disabilities. Our system in its current state consists of a rule-based lexical transformation component and a module for syntactic simplification. We evaluate the two components separately and as a whole, with a view to determining the level of simplification and the preservation of meaning and grammaticality. In order to test the readability level pre- and post-simplification, we apply seven readability measures for Spanish to three sets of randomly chosen news articles: the original texts, the output obtained after lexical transformations, the syntactic simplification output, and the output of both system components. To test whether the simplification output is grammatically correct and semantically adequate, we ask human annotators to grade pairs of original and simplified sentences according to these two criteria. Our results suggest that both components of our system produce simpler output when compared to the original, and that grammaticality and meaning preservation are positively rated by the annotators.

  • Conference Article
  • Cite Count: 16
  • 10.1145/3613904.3642772
ARTiST: Automated Text Simplification for Task Guidance in Augmented Reality
  • May 11, 2024
  • Guande Wu + 5 more

Text presented in augmented reality provides in-situ, real-time information for users. However, this content can be challenging to apprehend quickly when engaging in cognitively demanding AR tasks, especially when it is presented on a head-mounted display. We propose ARTiST, an automatic text simplification system that uses a few-shot prompt and GPT-3 models to specifically optimize the text length and semantic content for augmented reality. Developed out of a formative study that included seven users and three experts, our system combines a customized error calibration model with a few-shot prompt to integrate the syntactic, lexical, elaborative, and content simplification techniques, and generate simplified AR text for head-worn displays. Results from a 16-user empirical study showed that ARTiST lightens the cognitive load and improves performance significantly over both unmodified text and text modified via traditional methods. Our work constitutes a step towards automating the optimization of batch text data for readability and performance in augmented reality.

  • Conference Article
  • 10.18653/v1/2024.tsar-1.10
SciGisPy: a Novel Metric for Biomedical Text Simplification via Gist Inference Score
  • Jan 1, 2024
  • Chen Lyu + 1 more

Biomedical literature is often written in highly specialized language, posing significant comprehension challenges for non-experts. Automatic text simplification (ATS) offers a solution by making such texts more accessible while preserving critical information. However, evaluating ATS for biomedical texts is still challenging due to the limitations of existing evaluation metrics. General-domain metrics like SARI, BLEU, and ROUGE focus on surface-level text features, and readability metrics like FKGL and ARI fail to account for domain-specific terminology or assess how well the simplified text conveys core meanings (gist). To address this, we introduce SciGisPy, a novel evaluation metric inspired by Gist Inference Score (GIS) from Fuzzy-Trace Theory (FTT). SciGisPy measures how well a simplified text facilitates the formation of abstract inferences (gist) necessary for comprehension, especially in the biomedical domain. We revise GIS for this purpose by introducing domain-specific enhancements, including semantic chunking, Information Content (IC) theory, and specialized embeddings, while removing unsuitable indices. Our experimental evaluation on the Cochrane biomedical text simplification dataset demonstrates that SciGisPy outperforms the original GIS formulation, with a significant increase in correctly identified simplified texts (84% versus 44.8%). The results and a thorough ablation study confirm that SciGisPy better captures the essential meaning of biomedical content, outperforming existing approaches.

Example, plain-language summary (PLS): The studies showed that neither gabapentin nor gabapentin enacarbil was more effective than placebo at reducing the frequency of migraine headaches. Gabapentin commonly caused side effects, especially dizziness and somnolence (sleepiness). No studies of pregabalin were identified, and research on this drug is desirable. GisPy score: 0.348. SciGisPy score: 3.599.

Example, technical abstract: The pooled evidence derived from trials of gabapentin suggests that it is not efficacious for the prophylaxis of episodic migraine in adults. Since adverse events were common among the gabapentin-treated patients, it is advocated that gabapentin should not be used in routine clinical practice. Gabapentin enacarbil is not efficacious for the prophylaxis of episodic migraine in adults. There is no published evidence from controlled trials of pregabalin for the prophylaxis of episodic migraine in adults. GisPy score: -0.417. SciGisPy score: -5.

  • Conference Article
  • 10.5121/csit.2024.141412
A Smart Mobile Platform to Assist with Reading Comprehension using Machine Learning and Lexical Simplification
  • Jul 30, 2024
  • Jake Jin + 2 more

Our research tackles the pressing issue of making news articles accessible and understandable to diverse audiences, particularly those with low literacy levels or cognitive disabilities such as dyslexia or autism [1]. We introduce an innovative AI-driven news application that employs advanced text simplification techniques alongside dynamic user feedback loops to significantly enhance readability and comprehension. At the heart of our solution is the integration of cutting-edge natural language processing (NLP) and machine learning technologies, including BERT text simplification models for parsing and restructuring complex sentences, coupled with sentiment analysis to gauge the emotional tone of content [2][3]. Addressing challenges such as maintaining accuracy in text simplification and fine-tuning the user feedback mechanism was pivotal in our development process [4]. Through rigorous experimentation, including controlled tests and user trials, we observed marked improvements in the accessibility of news content, with enhanced readability scores and positive user feedback. Our application stands out by offering a scalable, user-centered approach to news consumption, adapting to individual preferences and reading abilities. This ensures a more inclusive, informed public discourse, making our app an indispensable resource for bridging the information divide and empowering all users to stay informed, regardless of their literacy level or cognitive capabilities.
