Integrated Named Entity Recognition and Identical-Entity Detection for Extracting Unique Information Sources in News Articles
Native advertising is often difficult to detect because it resembles regular news articles. One indicator is the absence of diverse information sources or the reliance on a single perspective. Therefore, it is necessary to employ an extraction technique capable of consolidating various forms of identical entity mentions. This study integrates an NER model based on XLNet+BiLSTM+CRF with identical entity classification using Levenshtein distance features and static and contextual vector representations. The results show an F1-score of 93.71% at the entity level and 92.84% for identical entity identification, along with a list of unique citation sources. These findings demonstrate that this unique list can be an additional feature in detecting native advertising, which often relies on a single source. With an average unique entity coverage of 97.40%, the proposed architecture can extract unique entities within news articles.
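To make the idea of consolidating identical entity mentions concrete, here is a minimal sketch that uses only a normalized Levenshtein similarity. It is not the study's method: the abstract describes a classifier that combines Levenshtein distance features with static and contextual vector representations, whereas this example uses edit distance alone. The function names, the greedy grouping strategy, the 0.8 threshold, and the sample mentions are all assumptions chosen for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; 1.0 means identical strings."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def group_mentions(mentions, threshold=0.8):
    """Greedily attach each mention to the first group whose first member
    is similar enough; otherwise start a new group.
    The threshold is a hypothetical value, not one taken from the study."""
    groups = []
    for mention in mentions:
        for group in groups:
            if similarity(mention, group[0]) >= threshold:
                group.append(mention)
                break
        else:
            groups.append([mention])
    return groups


if __name__ == "__main__":
    # Hypothetical source mentions extracted from one article.
    print(group_mentions(["Joko Widodo", "Joko Widdo", "Sri Mulyani"]))
```

Each resulting group stands for one unique source, so the number of groups gives a rough count of distinct sources in an article. Pure edit distance cannot link mentions such as a full name and a title-only reference, which is presumably why the study adds vector representations.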
- Book Chapter
3
- 10.1007/978-3-030-96878-6_15
- Jan 1, 2022
In recent years, native advertising has become increasingly prevalent in online spaces. Differentiating between genuine content and native advertisements using Natural Language Processing is therefore becoming an interesting research topic. In this paper, we examine the possibilities of using deep textual representation for the Slovak language to distinguish "PR (Public relations) articles" (which serve as native advertisements in this context) from authentic news articles on popular Slovak news websites. We show that BERT (Bidirectional Encoder Representations from Transformers) embeddings as a text representation are sufficient for this task (achieving accuracy over 80% even with a statistical model, Logistic Regression) and that the models generally perform better without prior lemmatization. We have scraped three Slovak news websites (for a total of 5455 news articles containing both paid-for content and a wide variety of genuine categories), and we have evaluated multiple binary classification methods (Logistic Regression, Random Forest classifier and Support Vector Machines) trained on top of generated RoBERTa sentence embeddings. On our testing set, we were able to achieve an accuracy of 85.13%. Keywords: NLP, Slovak language, Native advertisement, Text classification
- News Article
21
- 10.1007/s10586-021-03361-w
- Aug 4, 2021
- Cluster Computing
The world is diving deeper into the digital age, and the primary sources of information are moving towards social media and online news portals. The chances of being misinformed increase manifold as the sources of information we rely on become more ambiguous. Traditional news sources followed strict codes of practice to verify stories, whereas today, users can upload news items on social media and unverified portals without proving their veracity. The absence of any determinants of such news articles' truthfulness on the Internet calls for a novel approach to determine the realness quotient of unverified news items by leveraging technology. This study presents a dynamic model with a secure voting system, where news reviewers can provide feedback on news, and a probabilistic mathematical model is used for predicting the truthfulness of the news item based on the feedback received. A blockchain-based model, ProBlock, is proposed so that the correctness of the propagated information is ensured.
- Research Article
- 10.14738/assrj.1209.19429
- Oct 2, 2025
- Advances in Social Sciences Research Journal
Native advertising is most often defined as a hybrid form that merges journalism and advertising, but this paper explores the thesis that it may also be considered a hybrid of literature. By examining the similarities between native content and literary texts — particularly their fictional basis, appellative function, objectivity or subjectivity, language and writing style, form, authorship, reader involvement, ethics and public interest — this study shows that native advertising shares several core elements with literature. While it lacks artistic intent, native advertising blurs boundaries between genres and uses literary writing and narrative techniques to enhance its persuasive effect. Through a review of recent literature and selected examples of native advertising in lifestyle media, this study demonstrates that native advertising is a hybrid of literature. It presents fictional events, uses literary descriptions and artistic style, occasionally adopts the structure of a short story, is subjective, conveys impressions and emotions, and employs narration. Unlike news articles, it does not report on real events nor answer fundamental journalistic questions, thus distancing itself significantly from journalistic form. As audiences shift from literary to digital media consumption, native advertising may be evolving into a modern, commercially driven replacement for fiction.
- Research Article
11
- 10.3390/math11020485
- Jan 16, 2023
- Mathematics
Ontology is the kernel technique of the Semantic Web (SW), which enables interaction and cooperation among different intelligent applications. However, with the rapid development of ontologies, their heterogeneity issue becomes more and more serious, which hampers communication among the intelligent systems built upon them. Finding the heterogeneous entities between two ontologies, i.e., ontology matching, is an effective method of solving ontology heterogeneity problems. When matching two ontologies, it is critical to construct the entity pair's similarity feature by comprehensively taking into consideration various similarity features, so that the identical entities can be distinguished. Owing to their ability to learn complex computational models, Artificial Neural Networks (ANNs) have recently become a popular method of constructing similarity features for matching ontologies. Existing ANNs construct the similarity feature from a single perspective, which cannot ensure its effectiveness under diverse heterogeneous contexts. To construct an accurate similarity feature for each entity pair, in this work, we propose an adaptive aggregating method of combining different ANNs. In particular, we first propose a context-based ANN and a syntax-based ANN to respectively construct two similarity feature matrices, which are then adaptively integrated to obtain a final similarity feature matrix through Ordered Weighted Averaging (OWA) and the Analytic Hierarchy Process (AHP). The Ontology Alignment Evaluation Initiative (OAEI)'s benchmark and anatomy tracks are used to verify the effectiveness of our method. The experimental results show that our approach outperforms single ANN-based ontology matching techniques and state-of-the-art ontology matching techniques.
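As a rough illustration of the aggregation step mentioned above, the sketch below applies Ordered Weighted Averaging element-wise to two similarity matrices. OWA sorts the input values in descending order and then takes a weighted sum, so the weights attach to rank positions rather than to a particular source. The fixed weights, the function names, and the toy matrices are assumptions; the abstract says the weights are derived adaptively with AHP, which is not reproduced here.

```python
def owa(values, weights):
    """Ordered Weighted Averaging: weights apply to the values after they
    are sorted in descending order. Weights must sum to 1."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def aggregate(matrix_a, matrix_b, weights):
    """Combine two equally shaped similarity matrices cell by cell with OWA."""
    return [
        [owa([a, b], weights) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(matrix_a, matrix_b)
    ]


if __name__ == "__main__":
    # Hypothetical similarity scores from a context-based and a
    # syntax-based matcher for the same 2 x 2 grid of entity pairs.
    context_sim = [[0.9, 0.1], [0.3, 0.8]]
    syntax_sim = [[0.2, 0.4], [0.3, 0.6]]
    # Assumed weights: 0.7 on the larger score, 0.3 on the smaller one.
    print(aggregate(context_sim, syntax_sim, [0.7, 0.3]))
```

With weights `[1, 0]` OWA reduces to the maximum, with `[0, 1]` to the minimum, and with equal weights to the arithmetic mean, which is why it is a convenient family for blending matchers of differing reliability.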
- Research Article
- 10.69528/jkmla.2023.50.1_2.4
- Dec 1, 2023
- Journal of Korean Medical Library Association
Publishing a Japan Medical Library Association (JMLA) official journal 「IGAKU TOSHOKAN」 is an important business of JMLA. The first issue was published in 1954. The content consists of articles on information activities and services in medicine and related fields, as well as regular and news articles that emphasize communication among readers. There are 3 main editorial policies: 1) media for Japanese medical librarians to discuss and inform each other, 2) something useful for medical librarians as well as users, and 3) accumulation of knowledge. The editorial board has 9 members. Editorial work is online, but editorial board members meet for final proofreading. There are 7 regular articles: 1) From the Member Libraries, 2) Forum, 3) Letter to the Editor, 4) Reference Cases, 5) Book Reviews, 6) News, and 7) Journal Club. The COVID-19 pandemic has changed the shape of communication. However, the raison d'etre of the journal remained the same. We will continue our activities as an entity that supports the professional development of our members.
- Research Article
24
- 10.1016/j.jbi.2020.103374
- Jan 3, 2020
- Journal of Biomedical Informatics
Disease surveillance using online news: Dengue and zika in tropical countries.
- Research Article
17
- 10.1080/21670811.2020.1836980
- Nov 6, 2020
- Digital Journalism
This study investigates the visual objects that are used to either disclose or disguise the commercial nature of native advertising as news articles. We adopt a “material object” approach to explore the potential implications for journalism regarding transparency, trust, and credibility. Methodologically, this study used content analysis covering 21 publications in five countries: Germany, Israel, Norway, Spain, and Sweden. We analysed 373 individual native ads. The findings show that news outlets do not follow a consistent way to disclose native ads visually, negotiating the balance between transparency and deception. In this balance, news organizations do not boldly push for transparency and instead remain ambiguous. Our analyses show that both national and organizational characteristics matter when shaping the visual boundaries of journalism.
- Book Chapter
2
- 10.4324/9781003140399-4
- May 14, 2022
Native advertising – a form of sponsored content that mimics news articles – has penetrated the news space online and blurred the news-advertising boundary. This chapter traces the historical root of native advertising, summarizes the formats and models of integrating native advertising into news websites, and identifies the limitations and challenges of differentiating native advertising from news in practice, despite prominent research on the effects of disclosure on ad recognition. The disappearing news-advertising boundary highlights a cultural transition from transparency and independence to integration and collaboration within the news organizations. The transition symbolizes a continual power struggle between the autonomous and heteronomous forces in the field of journalism, dominated by the latter. To defend the news-advertising boundary, publishers should disclose native ads explicitly in words and visuals. Audiences should also acquire media literacy and digital literacy to help them differentiate the unique formats, content features, and purposes of native advertising from news.
- Research Article
11
- 10.1080/21670811.2019.1571931
- Feb 13, 2019
- Digital Journalism
This study compares the role performance of native advertising on the websites of legacy news media and digital-only news media in the United States. Instead of analyzing discourses and rhetorics about native advertising, this study concentrates on the content characteristics of native advertising to infer its roles oriented to audiences. The study finds that native ads on the selected media sites have more emphasis on the service role than the civic and infotainment roles. The digital-only media sites have native ads with more emphasis on the service and infotainment roles than those in the legacy media sites, whereas the legacy media sites have native ads with more emphasis on the civic role than their counterparts in the digital-only media sites. However, native advertising lacks independent and transparent sources of information. Nearly half of the native ads on the legacy media sites rely on sponsor-affiliated sources, and more than half of the native ads on the digital-only media sites lack attributions for sources of information.
- Research Article
9
- 10.17485/ijst/2018/v11i47/130980
- Dec 1, 2017
- Indian Journal of Science and Technology
Objectives: Territorial disputes over the West Philippine Sea are an emerging issue between the Philippines and other ASEAN members and China. The only source of information is the media, such as television, online news portals, and social media. In line with this, this study aimed to unveil netizens' emotions and common issues found in the different news articles. Methods/Statistical Analysis: This study scraped conversations from Twitter and online news articles published from January 2018 to April 2018, using a Plutchik algorithm for emotions, the Valence Aware Dictionary and sEntiment Reasoner (VADER) algorithm for sentiment analysis, and Latent Dirichlet Allocation (LDA) using Gibbs Sampling – Monte Carlo Markov Chain (MCMC) for topic modeling. 2,500 simulation runs were performed and validated before developing the hidden themes that are prevalent in the different news articles and supported by netizens' emotions. Findings: Results revealed five (5) underlying themes in the different news articles: (1) The Assertion of Philippine Sovereignty over West Philippine Sea Claim; (2) Philippine-China Co-Ownership and Joint Exploitation of Natural Resources in the Disputed Islands; (3) Intensive Military Presence, Freedom of Navigation, and Military Facilities Buildup in the Disputed Island; (4) China-Philippine Relations through Bilateral Agreement, Trade, Military Support, and Security; and (5) One China Policy through Territorial and Defense Power. Also, netizens' sentiments showed that the majority of those who tweeted were neutral (95%), with only a few negative (2.67%) and positive (2.33%), and that the majority of the emotions in their tweets were classified as Joy, Surprise, and Sadness. Only a few tweets were classified as anger, disgust, anticipation, and fear. The findings are unique since they focus on the soft evidence and the actual opinions, views, and emotions relative to the issues on the disputed islands. Moreover, this result could give insights to the appropriate authorities in dealing with the situations in the West Philippine Sea. Application/Improvements: Further studies may be conducted using other social media, such as Facebook, to understand the emotions and views of netizens. Keywords: Asia, Content Analysis, Data Mining, Deciphering, Gibbs Sampling, Latent Dirichlet Allocation, Plutchik Wheel of Emotion, Simulation, Topic Modeling, VADER Sentiment Analysis
- Conference Article
13
- 10.1109/icosc.2019.8665610
- Jan 1, 2019
In this era of fake news and political polarization, it is desirable to have a system that enables users to access balanced news content. Current solutions focus on top-down, server-based approaches to decide whether a news article is fake or biased, and display only trusted news to the end users. In this paper, we follow a different approach to help users make informed choices about which news they want to read, making users aware in real time of the bias in the news articles they are browsing and recommending news articles from other sources on the same topic with different levels of bias. We use a recent Pew research report to collect news sources that readers with varying political inclinations prefer to read. We then scrape news articles on a variety of topics from these varied news sources. After this, we perform clustering to find similar topics of the articles, as well as calculate a bias score for each article. For a news article the user is currently reading, we display the bias score and also display other articles on the same topic, out of the previously collected articles, from different news sources. This approach, we hope, would make it possible for users to access more balanced articles on given news topics. We present the implementation details of the system along with some preliminary results on news articles.
- Abstract
1
- 10.1016/s0140-6736(19)32870-3
- Nov 1, 2019
- The Lancet
Readers’ comments of UK online news media about the Zika virus: a qualitative content analysis
- Research Article
- 10.1080/10410236.2025.2587893
- Nov 24, 2025
- Health Communication
This study examined the information landscape on cervical cancer by employing the extended cancer infodemiology framework and integrated analysis. We analyzed and compared news articles and online platform posts as the two primary sources of health information regarding cancer. Data were collected from BigKinds, a news article database, and blogs and café posts on Naver, which dominates the health information search market in Korea. We analyzed 68,977 documents (news n = 1,043, blog posts n = 56,455, café posts n = 11,479) using traditional manual coding, frequency analysis, and topic modeling to address our research questions. Results show that the key topics differ between news articles and online platform posts on cervical cancer. News covered the “development and clinical trials of treatment” prominently, while recommended hospitals for checkups (blogs) and seeking recommendations for OB/GYN clinics (cafés) were prominent in online platforms. The Cervical Cancer Awareness Month could not influence the frequency or topic of news articles or online platform posts on cervical cancer in Korea. Moreover, the findings show that the dominant engagers were economic daily newspapers (news articles), hospital accounts (blogs), and the cafés on family and childrearing. This study offers empirical evidence of differing pictures across digital platforms. Furthermore, it discusses the possibility that dominant engagers or their interests could have influenced different pictures in the information landscape on cervical cancer.
- Research Article
2
- 10.7233/jksc.2021.71.2.093
- Apr 30, 2021
- Journal of the Korean Society of Costume
Recently, many brands and advertising companies have used native advertising as an alternative way to reduce consumers' ad avoidance. Therefore, this study intended to empirically verify the effect of fashion-related native advertising on purchase intention by comparing advertising information sources, content types, and activation of persuasion knowledge. The study's design consisted of three components. It examined two social media fashion information sources: brand source versus consumer source; two content types: factual message versus evaluative message; and two types of activation of persuasion knowledge: low activation versus high activation. This study was conducted on 532 females in their twenties and thirties in Seoul and Incheon and used a total of 529 samples in the final analysis, excluding three compromised or incomplete samples. Frequency analysis, credibility analysis, independent sample t-test, three-way ANOVA, and simple main effect analysis were conducted using the SPSS 25.0 statistical package for data analysis. The results of this study were as follows. First, a brand source combined with a factual message was demonstrated to cause high purchase intention. Second, positive purchase intention was also demonstrated when a consumer source provided a factual message. Third, when a brand source provided native advertising, the consumer group with high persuasion knowledge demonstrated high purchase intention. Fourth, both low and high activation of persuasion knowledge groups demonstrated positive purchase intention when a consumer source provided a factual message.
- Research Article
48
- 10.1108/intr-08-2019-0328
- Mar 2, 2021
- Internet Research
Purpose: Drawing upon attribution theory, this study aims to examine how different types of product information sources (mainstream celebrities vs micro-celebrities) interact with content type (experiential vs promotional) to influence consumer response toward native posts on social media (causal attributions and click intention). Design/methodology/approach: A total of 134 adult Twitter users participated in a 2 (source type: mainstream celebrity vs micro-celebrity) × 2 (content type: experiential vs promotional) between-subjects online experimental design. Findings: Results showed that for experiential native advertising, messages from a micro-celebrity generated more information-sharing attributions and less monetary gain attributions than those from a mainstream celebrity on social media. Moreover, the experiential native ads from a micro-celebrity elicited greater intention to click the URL than those from a mainstream celebrity. However, consumer response was similar for promotional native advertising regardless of message source. This study demonstrates that information-sharing attributions mediate the interaction effects of source type and content types on click intention. Originality/value: This study contributes to the literature on native advertising by providing empirical evidence to highlight the effect of message source and content type on consumer response. This study shows that the success of native advertising depends on how consumers perceive the messages and content creators' intention to communicate.