RoBERTa-BiLSTM-Conv1D Deep Learning Model for Detecting Persuasive Content in News
The use of persuasive language is one of the defining features of native advertisements. Because native ads often appear disguised as legitimate news articles, detecting persuasive content in news is essential for maintaining objectivity and improving the user experience. This study aims to detect news with persuasive content, i.e., persuasive news, in the English language using a natural language processing (NLP) approach. The proposed method incorporates text summarization methods, pre-trained word embeddings, and deep learning models, with an additional Conv1D layer added to improve the model's performance. The model was trained on an Indonesian news dataset translated into English using the Google Translate API. Experimental results show that our proposed RoBERTa-BiLSTM-Conv1D model outperformed the other models, achieving 92% accuracy in identifying persuasive news in English. These findings indicate that the persuasive content detection model can be applied in mainstream media environments to detect native ads in English. In the future, the model can incorporate both Indonesian and English news as training data to develop a cross-lingual native ads detection model.
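The abstract names a RoBERTa-BiLSTM-Conv1D architecture but does not give its hyperparameters. The PyTorch sketch below shows one plausible shape of such a classification head, assuming RoBERTa's 768-dimensional token embeddings as input (replaced here by a random tensor) and layer sizes chosen for illustration only:

```python
import torch
import torch.nn as nn

class BiLSTMConv1DHead(nn.Module):
    """Sketch of a BiLSTM + Conv1D classification head over
    RoBERTa-style contextual token embeddings (hyperparameters assumed)."""
    def __init__(self, embed_dim=768, lstm_hidden=128, conv_filters=64,
                 kernel_size=3, num_classes=2):
        super().__init__()
        self.bilstm = nn.LSTM(embed_dim, lstm_hidden,
                              batch_first=True, bidirectional=True)
        # Conv1D runs over the sequence axis; channels = 2 * lstm_hidden
        self.conv = nn.Conv1d(2 * lstm_hidden, conv_filters, kernel_size)
        self.classifier = nn.Linear(conv_filters, num_classes)

    def forward(self, embeddings):
        # embeddings: (batch, seq_len, embed_dim), e.g. RoBERTa last hidden states
        out, _ = self.bilstm(embeddings)        # (batch, seq_len, 2*lstm_hidden)
        out = self.conv(out.transpose(1, 2))    # (batch, filters, seq_len - k + 1)
        out = torch.amax(out, dim=2)            # global max pooling over time
        return self.classifier(out)             # (batch, num_classes)

head = BiLSTMConv1DHead()
dummy = torch.randn(4, 32, 768)  # stand-in for RoBERTa token embeddings
logits = head(dummy)
```

In a full pipeline, `dummy` would be replaced by the last hidden states of a pre-trained RoBERTa encoder, and the head trained end-to-end on the persuasive/non-persuasive labels.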
- Conference Article
51
- 10.18653/v1/p19-2041
- Jan 1, 2019
Using pre-trained word embeddings in conjunction with Deep Learning models has become the “de facto” approach in Natural Language Processing (NLP). While this usually yields satisfactory results, off-the-shelf word embeddings tend to perform poorly on texts from specialized domains such as clinical reports. Moreover, training specialized word representations from scratch is often either impossible or ineffective due to the lack of large enough in-domain data. In this work, we focus on the clinical domain for which we study embedding strategies that rely on general-domain resources only. We show that by combining off-the-shelf contextual embeddings (ELMo) with static word2vec embeddings trained on a small in-domain corpus built from the task data, we manage to reach and sometimes outperform representations learned from a large corpus in the medical domain.
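The combination strategy described above, concatenating off-the-shelf contextual embeddings with static in-domain vectors, can be sketched as follows. Both encoders below are random stand-ins (assumptions for illustration): a real pipeline would call an ELMo model and a word2vec model trained on the task corpus, but the concatenation step is the same:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for an ELMo-style contextual encoder (1024-d per token).
def contextual_embed(tokens):
    return rng.normal(size=(len(tokens), 1024))

# Stand-in for a word2vec-style static lookup (200-d per word).
static_vectors = {}
def static_embed(token):
    if token not in static_vectors:
        static_vectors[token] = rng.normal(size=200)
    return static_vectors[token]

def combined_embed(tokens):
    """Concatenate contextual and static vectors per token."""
    ctx = contextual_embed(tokens)                      # (n, 1024)
    stat = np.stack([static_embed(t) for t in tokens])  # (n, 200)
    return np.concatenate([ctx, stat], axis=1)          # (n, 1224)

emb = combined_embed(["patient", "reports", "nausea"])
```

The appeal of this design is that the static component can be retrained cheaply on a small in-domain corpus while the contextual component stays frozen and general-domain.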
- Research Article
1
- 10.3390/informatics10040086
- Nov 21, 2023
- Informatics
Persuasive content in online news contains elements that aim to persuade readers and may not necessarily include factual information. Since only some sentences in a news article indicate persuasiveness, differentiating news with and without persuasive content is quite challenging. Recognizing persuasive sentences through a combined text summarization and classification approach is important for understanding persuasive messages effectively. Text summarization identifies arguments and key points, while classification separates persuasive sentences based on the linguistic and semantic features used. Our proposed architecture first applies text summarization to discard sentences without persuasive content and then uses classifier models to detect those with persuasive indications. In this paper, we compare the performance of latent semantic analysis (LSA) and TextRank as text summarization methods, the latter of which outperformed the former in all trials, as well as two classifiers: a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network. We prepared a dataset of approximately 1700 manually persuasiveness-labeled news articles written in the Indonesian language, collected from a nationwide electronic news portal. Comparative studies in our experimental results show that the TextRank-BERT-BiLSTM model achieved the highest accuracy of 95% in detecting persuasive news. The text summarization methods generated detailed and precise summaries of the news articles, and the deep learning models effectively differentiated between persuasive news and real news.
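The TextRank step used in the pipeline above ranks sentences by centrality in a similarity graph. A minimal dependency-free sketch, using the standard word-overlap similarity normalized by sentence length and an iterative PageRank-style update (the example sentences are invented for illustration):

```python
import math

def sentence_similarity(a, b):
    """TextRank similarity: |overlap| / (log|A| + log|B|)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    overlap = len(wa & wb)
    if overlap == 0 or len(wa) <= 1 or len(wb) <= 1:
        return 0.0
    return overlap / (math.log(len(wa)) + math.log(len(wb)))

def textrank(sentences, d=0.85, iters=50):
    """PageRank-style scoring over the sentence-similarity graph."""
    n = len(sentences)
    sim = [[sentence_similarity(sentences[i], sentences[j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    scores = [1.0] * n
    for _ in range(iters):
        new = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                out_weight = sum(sim[j])
                if sim[j][i] and out_weight:
                    rank += sim[j][i] * scores[j] / out_weight
            new.append((1 - d) + d * rank)
        scores = new
    return scores

sents = [
    "Persuasive news articles urge readers to act on claims.",
    "Text summarization selects the most central sentences.",
    "Summarization and classification together detect persuasive content.",
    "The weather was pleasant on the day of the announcement.",
]
scores = textrank(sents)
```

The top-scoring sentences form the extractive summary that is then passed to the BERT-BiLSTM classifier.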
- Research Article
- 10.25130/lang.8.9.7
- Sep 30, 2024
- JOURNAL OF LANGUAGE STUDIES
The current study investigates how private schools target native and non-native English speakers through YouTube advertisements using a multi-modal discourse analysis approach. The study examines persuasive strategies, how the ads appeal to viewers, and how they influence decision-making in native and non-native private school ads. Native school ads may emphasize communicating in English, the prestige of private schooling, and high academic standards, whereas non-native ads may emphasize student diversity and resources for learners. Native school ads may use formal, dramatic language, while non-native ads use inclusive, accessible language. The ads reflect assumptions about language, education, values, and status: native schools emphasize fluency, while non-native schools emphasize inclusive learning. The two types also use humor and emotion differently: native ads employ culture-specific humor, while non-native ads appeal to shared values such as diversity. The study analyzed the ads using models from Kress and van Leeuwen (2006) and Halliday (1979), then drew conclusions that verify the hypothesis. Both ad types aim to present a nurturing school, but native ads use clearer language while non-native ads use aspirational, indirect language. Native ads value honesty, respect, and collaboration, while non-native ads value inclusiveness and cultural diversity. The study concludes that the ads share similar visual and textual strategies, but their values and language use vary depending on whether the audience is native or non-native.
- Book Chapter
1
- 10.1007/978-981-19-1076-0_13
- Jan 1, 2022
The significance of integrating Natural Language Processing (NLP) approaches in healthcare research has become more prominent in recent years and has had a transformational impact on the state-of-the-art. In healthcare, NLP systems are developed and assessed on the basis of word-, phrase-, or record-level annotations derived from patient reports, such as side-effects of medications, medicines prescribed for illnesses, or semantic characteristics (nullification, seriousness), etc. While some NLP projects take customer expectations into account at the level of an individual or a group, such projects are still in the minority. A special focus is placed on psychological wellness research, which currently receives little attention in healthcare NLP research networks but in which NLP approaches are widely used. Although there have been significant advancements in healthcare NLP strategy development, we believe that for the profession to grow further, more emphasis should be placed on comprehensive evaluation. To help with this, we offer some helpful ideas, including a brief protocol that may be used when reporting clinical NLP strategy development and assessment.
Keywords: Natural language processing; Big Data; Health care; Semantic similarities; Electronic health records (EHRs); Classification; Mental health; Kawasaki disease; Huntsman Cancer Institute; Linguamatics NLP platform; Genomic; Bio-specimen; Morphology
- Research Article
16
- 10.1016/j.jadr.2022.100430
- Dec 1, 2022
- Journal of affective disorders reports
Portability of natural language processing methods to detect suicidality from clinical text in US and UK electronic health records.
- Research Article
- 10.1093/jamia/ocaf141
- Sep 22, 2025
- Journal of the American Medical Informatics Association: JAMIA
Objective: Rule-based structured data algorithms and natural language processing (NLP) approaches applied to unstructured clinical notes have limited accuracy and poor generalizability for identifying immunosuppression. Large language models (LLMs) may effectively identify patients with heterogenous types of immunosuppression from unstructured clinical notes. We compared the performance of LLMs applied to unstructured notes for identifying patients with immunosuppressive conditions or immunosuppressive medication use against 2 baselines: (1) structured data algorithms using diagnosis codes and medication orders and (2) NLP approaches applied to unstructured notes.
Materials and Methods: We used hospital admission notes from a primary cohort of 827 intensive care unit (ICU) patients at Northwestern Memorial Hospital and a validation cohort of 200 ICU patients at Beth Israel Deaconess Medical Center, along with diagnosis codes and medication orders from the primary cohort. We evaluated the performance of structured data algorithms, NLP approaches, and LLMs in identifying 7 immunosuppressive conditions and 6 immunosuppressive medications.
Results: In the primary cohort, structured data algorithms achieved peak F1 scores ranging from 0.30 to 0.97 for identifying immunosuppressive conditions and medications. NLP approaches achieved peak F1 scores ranging from 0 to 1. GPT-4o outperformed or matched structured data algorithms and NLP approaches across all conditions and medications, with F1 scores ranging from 0.51 to 1. GPT-4o also performed impressively in our validation cohort (F1 = 1 for 8/13 variables).
Discussion: LLMs, particularly GPT-4o, outperformed structured data algorithms and NLP approaches in identifying immunosuppressive conditions and medications with robust external validation.
Conclusion: LLMs can be applied for improved cohort identification for research purposes.
- Conference Article
39
- 10.1109/icoac48765.2019.246843
- Dec 1, 2019
Document classification is a prevalent task in Natural Language Processing (NLP), which has an extensive range of applications in the biomedical domains such as biomedical literature indexing, automatic diagnosis code assignment, tweet classification for public health topics, and patient safety report classification. Nevertheless, manual classification of the biomedical articles published every year into specific predefined categories is a cumbersome task. Hence, building an automatic document classifier for biomedical databases emerges as a significant task among the scientific community. In recent years, Deep Learning (DL) models like Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and ensemble deep learning models are widely used in the area of text document classification for better classification performance compared to Machine Learning (ML) algorithms. The major advantage of using DL models in document classification is that they provide rich semantic and grammatical information for document representation through pre-trained word embeddings. Hence, this paper investigates the deployment of various state-of-the-art DL-based classification models in automatic classification of benchmark biomedical datasets. Finally, the performance of all the aforementioned classifiers is compared and evaluated through well-defined performance evaluation metrics such as accuracy, precision, recall, and f1-measure.
- Research Article
49
- 10.1111/bjet.12875
- Aug 26, 2019
- British Journal of Educational Technology
In this study, we explore the potential of a natural language processing (NLP) approach to support discourse analysis of in-situ, small group learning conversations. The theoretical basis of this work derives from Bakhtin's notion of speech genres as bounded by educational robotics activity. Our goal is to leverage computational linguistics methods to advance and improve educational research methods. We used a parts-of-speech (POS) tagging program to automatically parse a transcript of spoken dialogue collected from a small group of middle school students involved in solving a robotics challenge. We grammatically parsed the dialogue at the level of the trigram. Then, through a deliberative process, we mapped the POS trigrams to our theoretically derived problem solving in computational environments coding system. Next, we developed a stacked histogram visualization to identify rich interactional segments in the data. Seven segments of the transcript were thus identified for closer analysis. Our NLP-based approach partially replicated prior findings. Here, we present the theoretical basis for the work, our analytical approach in exploring this NLP-based method, and our research findings.
Practitioner Notes
What is already known about this topic:
- Over the last 10 years, several educational research papers indicate that natural language processing (NLP) techniques can be used to help interpret well-structured, written dialogue, e.g., conversations in online class discussions.
- Two recent papers indicate that NLP techniques can also be used to help interpret well-structured, spoken dialogue, e.g., replies to interview questions and/or comments made during think-aloud protocols.
- Multimodal learning analytic (MMLA) techniques are being used to investigate collaborative learning. These studies use non-verbal features of data (gaze, gesture, physical actions), prosodic features of verbal data (pitch and tone), and/or turn-taking and duration of talk per speaker as means of predicting group success. None of the MMLA studies attempt semantic analysis of student talk in collaborative settings.
What this paper adds:
- A theoretical framework for why and how an automated NLP approach can support discourse analysis research on co-located, computer-based, collaborative problem solving interactions. This framework, entitled the Problem Solving in Computational Environment (PSCE) Speech Genre, links children's physical interactions with computational devices to their verbal exchanges and presents a theoretical rationale for the use of NLP methods in educational research.
- Description of an interdisciplinary method that combines NLP techniques with qualitative coding approaches to support analysis of student collaborative learning with educational robotics.
- Identification of student learning outcomes derived from the semantic, PSCE Speech Genre and NLP approach.
Implications for practice and/or policy:
- Educational researchers will be able to expand upon our findings towards the goal of using computation and automation to support microgenetic analysis of large datasets.
- Robust microgenetic learning findings will provide curriculum developers, educational technology developers and teachers with guidance on how to construct and/or create learning materials and environments.
- From an interdisciplinary perspective, this research can support more interdisciplinary exploration of conversational dialogues that are ill-structured, indexical and referential.
- This research will support the further development of machine learning techniques and neural network models by computational linguists.
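The POS-trigram step described in this abstract can be sketched simply. The toy tag lexicon below stands in for a real POS tagger (an assumption for illustration; the study used an automatic POS tagging program), but the trigram extraction and counting logic is the same:

```python
from collections import Counter

# Toy word -> Penn-style tag lookup; a real pipeline would use a trained tagger.
TAGS = {
    "move": "VB", "the": "DT", "robot": "NN", "forward": "RB",
    "it": "PRP", "turns": "VBZ", "left": "RB", "then": "RB",
    "we": "PRP", "press": "VB", "run": "VB",
}

def pos_trigrams(utterance):
    """Tag each word, then slide a window of three tags over the turn."""
    tags = [TAGS.get(w.lower(), "UNK") for w in utterance.split()]
    return list(zip(tags, tags[1:], tags[2:]))

dialogue = ["move the robot forward", "then we press run"]
counts = Counter(tri for turn in dialogue for tri in pos_trigrams(turn))
```

Trigram frequencies like these are what the study then mapped, through deliberation, onto its problem-solving coding system and visualized as stacked histograms.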
- Conference Article
15
- 10.1109/icsct53883.2021.9642565
- Aug 5, 2021
Fake news is false and misleading information that is conveyed as accurate news. Fake news detection has become indispensable in modern society because of the extreme propagation of false news on social platforms and news portals. Several prior studies have relied on social-platform signals about fake news rather than the news content itself for decision-making. Therefore, this paper introduces an automated model for detecting fake news relying on Deep Learning (DL) and Natural Language Processing (NLP) for a low-resource language like Bangla, utilizing news content and headline features. We propose an ensemble of a Convolutional Neural Network (CNN) and a Gated Recurrent Unit (GRU) with a pre-trained GloVe embedding method that achieved an accuracy of 98.71% on the test data. For comparison, a combination of Long Short-Term Memory (LSTM) and CNN with GloVe was trained using the same dataset and parameters. We also experimented on a benchmark dataset containing English news with our suggested model and achieved an accuracy of 98.94%. Our model's performance is evaluated using diverse evaluation metrics, including accuracy, recall, precision, f1-score, etc.
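The abstract does not specify how the CNN and GRU branches are combined, so the PyTorch sketch below assumes a simple logit-averaging ensemble over GloVe-embedded input (replaced here by a random tensor), with all dimensions chosen for illustration:

```python
import torch
import torch.nn as nn

class CNNBranch(nn.Module):
    """Conv1D branch over embedded text, global-max-pooled."""
    def __init__(self, embed_dim=100, filters=64, kernel=3, classes=2):
        super().__init__()
        self.conv = nn.Conv1d(embed_dim, filters, kernel)
        self.fc = nn.Linear(filters, classes)

    def forward(self, x):                       # x: (batch, seq, embed_dim)
        h = torch.relu(self.conv(x.transpose(1, 2)))
        return self.fc(torch.amax(h, dim=2))

class GRUBranch(nn.Module):
    """GRU branch; classifies from the final hidden state."""
    def __init__(self, embed_dim=100, hidden=64, classes=2):
        super().__init__()
        self.gru = nn.GRU(embed_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, classes)

    def forward(self, x):
        _, h = self.gru(x)                      # h: (1, batch, hidden)
        return self.fc(h.squeeze(0))

def ensemble_logits(x, cnn, gru):
    # One common ensembling choice: average the two branches' logits.
    return (cnn(x) + gru(x)) / 2

cnn, gru = CNNBranch(), GRUBranch()
x = torch.randn(8, 40, 100)  # stand-in for GloVe-embedded news + headline text
logits = ensemble_logits(x, cnn, gru)
```

In practice, `x` would come from a GloVe embedding layer over tokenized Bangla or English news text, and both branches would be trained jointly or separately before ensembling.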
- Book Chapter
- 10.1007/978-3-030-97532-6_11
- Jan 1, 2022
Social media fuels the spread of fake news across the world. English news has dominated existing fake news research, and how fake news in different languages compares remains severely understudied. To address this scarcity of literature, this research examines the content and linguistic behaviors of fake news in relation to COVID-19. The comparisons reveal both differences and similarities between English and Spanish fake news. The findings have implications for global collaboration in combating fake news.
Keywords: Fake news; Language; Topic modeling; Content-based linguistic behavior
- Research Article
- 10.55041/ijsrem47848
- May 14, 2025
- INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT
ABSTRACT—The linguistic diversity of India is both a cultural asset and a major communication challenge. With 22 official languages and hundreds of dialects, the need for efficient translation tools is critical. While global solutions like Google Translate provide multilingual translation capabilities, there is still a shortage of localized, accessible, and user-friendly applications specifically for Indian languages. This work presents the Indian Language Translator, a desktop application based on Python that translates text in real time between English and prominent Indian languages through a simple graphical user interface (GUI) developed using Tkinter and the Google Translate API. The paper discusses the project's motivation, system design, implementation approaches, evaluation, and potential impacts. Future work directions for offline translation, speech integration, and further language support are also explored. Keywords—Indian Languages, Language Translation, GUI Development, Machine Translation, Tkinter, Google Translate API, Python Applications, Natural Language Processing (NLP)
- Research Article
1
- 10.48001/jocnv.2024.221-5
- Mar 15, 2024
- Journal of Computer Networks and Virtualization
The purpose of this study is to investigate the enhancement of Financial Sentiment Analysis by conducting an in-depth investigation of Natural Language Processing (NLP) approaches for the purpose of improving market prediction. The purpose of this research is to investigate the potential of natural language processing (NLP) to improve the accuracy and efficiency of sentiment analysis. This is in response to the complex structure of financial markets and the crucial role that sentiment plays. The examination of the relevant literature highlights the limits of traditional methods and the urgent need for creative solutions in the field of financial sentiment research. The approach that we use entails the careful collecting of data from social media and financial news, with a particular emphasis on the utilization of strong pre-processing tools. The research assesses the performance parameters of accuracy, precision, recall, and correlation with market trends by using natural language processing (NLP) technologies such as algorithms for sentiment analysis, Named Entity Recognition, and deep learning models. The findings include a comparative examination of conventional methods and those based on natural language processing (NLP), therefore revealing insights into the significant influence that sentiment has on market patterns. The results not only provide a contribution to the theoretical knowledge of sentiment research, but they also offer real consequences for financial analysts who are looking to make market forecasts that are more accurate and timelier. The research suggests ways for refinement, with an emphasis on enhanced pre-processing and Explainable AI integration. These tactics are being proposed to address issues in data quality and bias. 
When looking to the future, the study outlines potential paths forward, including the investigation of external influences and the development of deep learning models for accurate market forecasting. To summarize, the findings of this research establish natural language processing (NLP) as a revolutionary force in redefining financial sentiment analysis and offer a path for future developments in the ever-changing world of market prediction.
- Research Article
265
- 10.1016/j.eswa.2018.08.044
- Sep 19, 2018
- Expert Systems with Applications
Sentiment analysis based on improved pre-trained word embeddings
- Research Article
7
- 10.1016/j.cirpj.2021.06.005
- Jul 27, 2021
- CIRP Journal of Manufacturing Science and Technology
Reprint of: Where and how to find bio-inspiration?: A comparison of search approaches for bio-inspired design
- Research Article
21
- 10.1016/j.cirpj.2020.09.013
- Oct 26, 2020
- CIRP Journal of Manufacturing Science and Technology
Where and how to find bio-inspiration?: A comparison of search approaches for bio-inspired design