Information Preparation with the Human in the Loop

Abstract

With the advent of the World Wide Web (WWW) and the rise of digital media consumption, abundant information is now available on any topic. Yet users often suffer from information overload, which makes finding relevant and important information a great challenge. To alleviate this information overload and provide significant value to users, there is a need for automatic information preparation methods. Such methods need to support users by discovering and recommending important information while filtering out redundant and irrelevant information. They need to ensure that users do not drown in, but rather benefit from, the prepared information. However, the definition of what is relevant and important is subjective and highly specific to the user’s information need and the task at hand. Therefore, a method must continually learn from the feedback of its users.

In this thesis, we propose new approaches to put the human in the loop in order to interactively prepare information along three major lines of research: information aggregation, condensation, and recommendation. For multiple well-studied tasks in natural language processing, we point out the limitations of existing methods and discuss how our approach can successfully close the gap to the human upper bound by considering user feedback and adapting to the user’s information need. We put a particular focus on applications in digital journalism and introduce the new task of live blog summarization. We show that the corpora we create for this task are highly heterogeneous compared to the standard summarization datasets, which poses new challenges to previously proposed non-interactive methods.

One way to alleviate information overload is information aggregation. We focus on the corresponding task of multi-document summarization and argue that previously proposed methods are of limited usefulness in real-world applications, as they do not take the users’ goal into account. To address these drawbacks, we propose an interactive summarization loop to iteratively create and refine multi-document summaries based on the users’ feedback. We investigate sampling strategies based on active machine learning and joint optimization to reduce the number of iterations and the amount of user feedback required. Our approach significantly improves the quality of the summaries and reaches a performance near the human upper bound. We present a system demonstration implementing the interactive summarization loop, study its scalability, and highlight its use cases in exploring document collections and creating focused summaries in journalism.

For information condensation, we investigate a text compression setup. We address the problem of neural models requiring huge amounts of training data and propose a new interactive text compression method to reduce the need for large-scale annotated data. We employ state-of-the-art Seq2Seq text compression methods as our base models and propose an active learning setup with multiple sampling strategies to efficiently use minimal training data. We find that our method significantly reduces the amount of data needed for training and that it adapts well to new datasets and domains.

Finally, we focus on information recommendation and discuss the need for explainable models in machine learning. We propose a new joint recommendation system for rating prediction and review summarization, which shows major improvements over state-of-the-art systems in both the rating prediction and the review summarization task. By solving these tasks jointly with multi-task learning techniques, we furthermore obtain explanations for a rating by showing the generated review summary, highlighted according to the model’s attention, and a histogram of user preferences learned from the users’ reviews.

We conclude the thesis with a summary of how human-in-the-loop approaches improve information preparation systems and envision the use of interactive machine learning methods in other areas of natural language processing as well.
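
The abstract describes interactive loops in which a model repeatedly selects what to ask the user next through active learning. The following is a minimal sketch of one generic strategy, pool-based uncertainty sampling, and not the thesis's actual method. The one-feature logistic model, the simulated user, and all names (`train`, `most_uncertain`, `interactive_loop`, `simulated_user`) are assumptions made purely for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=200, lr=0.5):
    """Fit a one-feature logistic regression (weight, bias) by stochastic gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in examples:
            p = sigmoid(w * x + b)
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return w, b


def most_uncertain(unlabeled, w, b):
    """Uncertainty sampling: pick the item whose predicted probability is closest to 0.5."""
    return min(unlabeled, key=lambda x: abs(sigmoid(w * x + b) - 0.5))


def interactive_loop(pool, ask_user, seed_labels, rounds):
    """Alternate between retraining and asking the user about one selected item."""
    labeled = list(seed_labels)
    seen = {x for x, _ in labeled}
    unlabeled = [x for x in pool if x not in seen]
    for _ in range(rounds):
        if not unlabeled:
            break
        w, b = train(labeled)
        x = most_uncertain(unlabeled, w, b)
        unlabeled.remove(x)
        labeled.append((x, ask_user(x)))  # user feedback enters the loop here
    return train(labeled), labeled


def simulated_user(x):
    """Hypothetical stand-in for a human: items scoring at least 0.6 count as relevant."""
    return 1 if x >= 0.6 else 0


pool = [i / 20 for i in range(21)]
(w, b), labeled = interactive_loop(pool, simulated_user, [(0.0, 0), (1.0, 1)], rounds=5)
print("queried items:", [x for x, _ in labeled[2:]])
```

In this toy setting the queries tend to concentrate near the user's decision boundary, which is the intuition behind reducing the amount of feedback needed. A real system would replace the one-feature model with a summarization or compression model and the simulated user with actual feedback.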

Similar Papers
  • Research Article
  • Cite Count: 22
  • 10.1016/j.csl.2015.11.004
Weighted hierarchical archetypal analysis for multi-document summarization
  • Nov 23, 2015
  • Computer Speech & Language
  • Ercan Canhasi + 1 more

  • Research Article
  • Cite Count: 88
  • 10.1017/s1351324908004968
Adapting SVM for data sparseness and imbalance: a case study in information extraction
  • Apr 1, 2009
  • Natural Language Engineering
  • Yaoyong Li + 2 more

Support Vector Machines (SVM) have been used successfully in many Natural Language Processing (NLP) tasks. The novel contribution of this paper is in investigating two techniques for making SVM more suitable for language learning tasks. Firstly, we propose an SVM with uneven margins (SVMUM) model to deal with the problem of imbalanced training data. Secondly, SVM active learning is employed in order to alleviate the difficulty in obtaining labelled training data. The algorithms are presented and evaluated on several Information Extraction (IE) tasks, where they achieved better performance than the standard SVM and the SVM with passive learning, respectively. Moreover, by combining SVMUM with the active learning algorithm, we achieve the best reported results on the seminars and jobs corpora, which are benchmark data sets used for evaluation and comparison of machine learning algorithms for IE. In addition, we also evaluate the token based classification framework for IE with three different entity tagging schemes. In comparison to previous methods dealing with the same problems, our methods are both effective and efficient, which are valuable features for real-world applications. Due to the similarity in the formulation of the learning problem for IE and for other NLP tasks, the two techniques are likely to be beneficial in a wide range of applications.

  • Cite Count: 346
  • 10.2196/17984
Clinical Text Data in Machine Learning: Systematic Review
  • Mar 31, 2020
  • JMIR Medical Informatics
  • Irena Spasic + 1 more

Background: Clinical narratives represent the main form of communication within health care, providing a personalized account of patient history and assessments, and offering rich information for clinical decision making. Natural language processing (NLP) has repeatedly demonstrated its feasibility to unlock evidence buried in clinical narratives. Machine learning can facilitate rapid development of NLP tools by leveraging large amounts of text data.

Objective: The main aim of this study was to provide systematic evidence on the properties of text data used to train machine learning approaches to clinical NLP. We also investigated the types of NLP tasks that have been supported by machine learning and how they can be applied in clinical practice.

Methods: Our methodology was based on the guidelines for performing systematic reviews. In August 2018, we used PubMed, a multifaceted interface, to perform a literature search against MEDLINE. We identified 110 relevant studies and extracted information about text data used to support machine learning, NLP tasks supported, and their clinical applications. The data properties considered included their size, provenance, collection methods, annotation, and any relevant statistics.

Results: The majority of datasets used to train machine learning models included only hundreds or thousands of documents. Only 10 studies used tens of thousands of documents, with a handful of studies utilizing more. Relatively small datasets were utilized for training even when much larger datasets were available. The main reason for such poor data utilization is the annotation bottleneck faced by supervised machine learning algorithms. Active learning was explored to iteratively sample a subset of data for manual annotation as a strategy for minimizing the annotation effort while maximizing the predictive performance of the model. Supervised learning was successfully used where clinical codes integrated with free-text notes into electronic health records were utilized as class labels. Similarly, distant supervision was used to utilize an existing knowledge base to automatically annotate raw text. Where manual annotation was unavoidable, crowdsourcing was explored, but it remains unsuitable because of the sensitive nature of data considered. Besides the small volume, training data were typically sourced from a small number of institutions, thus offering no hard evidence about the transferability of machine learning models. The majority of studies focused on text classification. Most commonly, the classification results were used to support phenotyping, prognosis, care improvement, resource management, and surveillance.

Conclusions: We identified the data annotation bottleneck as one of the key obstacles to machine learning approaches in clinical NLP. Active learning and distant supervision were explored as a way of saving the annotation efforts. Future research in this field would benefit from alternatives such as data augmentation and transfer learning, or unsupervised learning, which do not require data annotation.

  • Research Article
  • Cite Count: 21
  • 10.1162/coli_a_00420
Natural Language Processing and Computational Linguistics
  • Dec 7, 2021
  • Computational Linguistics
  • Junichi Tsujii

  • Conference Article
  • 10.1109/trustcom56396.2022.00205
Pre-training Fine-tuning data Enhancement method based on active learning
  • Dec 1, 2022
  • Deqi Cao + 3 more

With the development of Internet technology, the number of Internet users is increasing rapidly, and a very large amount of data is generated on the Internet every day. At the same time, with the development of storage and query technology, it is very easy to collect massive data, but the information value contained in these data is uneven, and most of them are unlabeled. However, traditional supervised learning has a great demand for labeled samples. Faced with a large number of unlabeled samples, effective automatic labeling methods are lacking, and manual labeling costs are high. If simple random sampling is used to select data for annotation, it may lead to the selection of noisy information and a waste of resources, and low-quality training data can also affect the prediction accuracy of the model. Meanwhile, the training effect of traditional deep learning methods is very limited for small labeled training sets. This paper takes the text emotion analysis task in natural language processing as its background and uses IMDB film review data as the training and test sets. Starting with the design of an active learning algorithm based on clustering analysis, combined with an appropriate pre-training fine-tuning model, it constructs a data enhancement method based on active learning. The experiments show that when the labeled training set is reduced by 90%, the prediction accuracy of the pre-training model is reduced by no more than 2%, which verifies the effectiveness of the data enhancement method combining active learning with the pre-training model.

  • Research Article
  • Cite Count: 121
  • 10.1145/3593042
A Survey of Adversarial Defenses and Robustness in NLP
  • Jul 17, 2023
  • ACM Computing Surveys
  • Shreya Goyal + 3 more

In the past few years, it has become increasingly evident that deep neural networks are not resilient enough to withstand adversarial perturbations in input data, leaving them vulnerable to attack. Various authors have proposed strong adversarial attacks for computer vision and Natural Language Processing (NLP) tasks. As a response, many defense mechanisms have also been proposed to prevent these networks from failing. The significance of defending neural networks against adversarial attacks lies in ensuring that the model’s predictions remain unchanged even if the input data is perturbed. Several methods for adversarial defense in NLP have been proposed, catering to different NLP tasks such as text classification, named entity recognition, and natural language inference. Some of these methods not only defend neural networks against adversarial attacks but also act as a regularization mechanism during training, saving the model from overfitting. This survey aims to review the various methods proposed for adversarial defenses in NLP over the past few years by introducing a novel taxonomy. The survey also highlights the fragility of advanced deep neural networks in NLP and the challenges involved in defending them.

  • Dissertation
  • 10.32657/10356/54827
Harnessing online social media to deal with information overload.
  • Jan 1, 2013
  • Chenliang Li

In online social media, users become information creators and disseminators through the active interplay between information items and other users, instead of just being information consumers, as they were a decade ago. This kind of information production and dissemination in a collaborative and active manner further aggravates the problem of information overload on the World Wide Web (WWW). The existing approaches for information retrieval (IR) and natural language processing (NLP) tasks often offer an intolerable response time for Web users. Moreover, given the numerous interactions between users and information items, new kinds of information needs are emerging, such as opinion mining, event detection, and summarization. However, the existing IR technologies (based on the bag-of-words model) and NLP technologies (based on linguistic features) often fail to satisfy Web users in these emerging information needs. On the other hand, people participate in online social media to share stories and photos with their friends, vote and leave opinions, tag web pages, and so on. The digital footprints of these behaviors make online social media semantic resources which we can exploit to better understand and organize the astronomical amount of information.

  • Research Article
  • Cite Count: 82
  • 10.1007/s11633-022-1331-6
Paradigm Shift in Natural Language Processing
  • May 28, 2022
  • Machine Intelligence Research
  • Tian-Xiang Sun + 3 more

In the era of deep learning, modeling for most natural language processing (NLP) tasks has converged into several mainstream paradigms. For example, we usually adopt the sequence labeling paradigm to solve a bundle of tasks such as POS-tagging, named entity recognition (NER), and chunking, and adopt the classification paradigm to solve tasks like sentiment analysis. With the rapid progress of pre-trained language models, recent years have witnessed a rising trend of paradigm shift, that is, solving one NLP task in a new paradigm by reformulating the task. The paradigm shift has achieved great success on many tasks and is becoming a promising way to improve model performance. Moreover, some of these paradigms have shown great potential to unify a large number of NLP tasks, making it possible to build a single model to handle diverse tasks. In this paper, we review this phenomenon of paradigm shifts in recent years, highlighting several paradigms that have the potential to solve different NLP tasks.

  • Dissertation
  • Cite Count: 1
  • 10.32657/10356/168487
Natural language processing as autoregressive generation
  • Jan 1, 2023
  • Xiang Lin

The advances in deep learning have led to great achievements in many Natural Language Processing (NLP) tasks. With the nature of language, i.e., sequential data, most NLP tasks can be framed into the sequence learning framework, such as text generation. As one of the most important foundations for modern NLP techniques, autoregressive generation models have achieved dominant performance in a great deal of NLP tasks. Therefore, this thesis emphasizes improving the autoregressive generation model for different NLP tasks. While many tasks can naturally fit into the sequence learning framework, some of them, e.g., building discourse parsing tree, require sophisticated designs to fit into neural models. Therefore, this thesis firstly emphasizes a novel unified framework for discourse parsing, which builds a discourse tree in a top-down depth-first manner, and it frames the task as an autoregressive generation task with the goal of each step being the prediction of the node position given a piece of text. The proposed approach is proven effective with extensive empirical experiments. In addition, I extend the above framework by proposing a hierarchical decoder, which leverages the information from parents and siblings of the nodes that are currently processed. The proposed decoder utilizes the nature of the tree structure and further improves the experiment performance on both discourse parsing and dependency parsing tasks. On the other hand, the de facto strategies, i.e., cross entropy loss and teacher forcing, for training the autoregressive generation models have been shown problematic in certain aspects. For example, cross entropy loss, which is one of the widely leveraged training objective functions, often leads to text degeneration in text generation, and teacher forcing suffers from the exposure bias problem, where there exists a mismatch between the training and testing setup. 
For text degeneration, I introduce a class of diminishing attentions, which enforces the submodularity of the coverage calculated by cross attention in the sequence-to-sequence model. The proposed diminishing attentions achieve notable improvement on several neural text generation tasks, including text summarization, machine translation, and image paragraph generation. Further, I propose a novel training objective, ScaleGrad, to replace cross entropy, which significantly reduces the degeneration problem in different text generation tasks. In fact, ScaleGrad can be extended to problems beyond text degeneration. It provides wide flexibility to inject different inductive biases into the text generation model by directly modifying the gradient information in the output layer. Next, for the exposure bias problem, this thesis introduces a novel type of scheduled sampling based on training accuracy, which requires only minimal hyper-parameter tuning compared to existing scheduled sampling methods. Additionally, a novel imitation loss is proposed to further enforce the model’s generative behavior to match the teacher-forced behavior. Moreover, this thesis demonstrates that reducing exposure bias can improve the robustness of language models against repetition and toxic errors.

  • Research Article
  • 10.1162/coli_r_00388
Statistical Significance Testing for Natural Language Processing. By Rotem Dror, Lotem Peled-Cohen, Segev Shlomov, and Roi Reichart (Technion Israel Institute of Technology). Morgan & Claypool (Synthesis Lectures on Human Language Technologies, edited by Graeme Hirst, volume 45), 2020, xx+98 pp; paperback, ISBN 978-1-68173-795-9, $49.95; ebook, ISBN 978-1-68173-796-6, $39.96; hardcover, $69.95; doi:10.2200/S00994ED1V01Y202002HLT045
  • Oct 29, 2020
  • Computational Linguistics
  • Edwin D Simpson

Like any other science, research in natural language processing (NLP) depends on the ability to draw correct conclusions from experiments. A key tool for this is statistical significance testing: We use it to judge whether a result provides meaningful, generalizable findings or should be taken with a pinch of salt. When comparing new methods against others, performance metrics often differ by only small amounts, so researchers turn to significance tests to show that improved models are genuinely better. Unfortunately, this reasoning often fails because we choose inappropriate significance tests or carry them out incorrectly, making their outcomes meaningless. Or, the test we use may fail to indicate a significant result when a more appropriate test would find one. NLP researchers must avoid these pitfalls to ensure that their evaluations are sound and ultimately avoid wasting time and money through incorrect conclusions.

This book guides NLP researchers through the whole process of significance testing, making it easy to select the right kind of test by matching canonical NLP tasks to specific significance testing procedures. As well as being a handbook for researchers, the book provides theoretical background on significance testing, includes new methods that solve problems with significance tests in the world of deep learning and multidataset benchmarks, and describes the open research problems of significance testing for NLP.

The book focuses on the task of comparing one algorithm with another. At the core of this is the p-value, the probability that a difference at least as extreme as the one we observed could occur by chance. If the p-value falls below a predetermined threshold, the result is declared significant. Leaving aside the fundamental limitation of turning the validity of results into a binary question with an arbitrary threshold, to be a valid statistical significance test, the p-value must be computed in the right way. The book describes the two crucial properties of an appropriate significance test: The test must be both valid and powerful. Validity refers to the avoidance of type 1 errors, in which the result is incorrectly declared significant. Common mistakes that lead to type 1 errors include deploying tests that make incorrect assumptions, such as independence between data points. The power of a test refers to its ability to detect a significant result and therefore to avoid type 2 errors. Here, knowledge of the data and experiment must be used to choose a test that makes the correct assumptions. There is a trade-off between validity and power, but for the most common NLP tasks (language modeling, sequence labeling, translation, etc.), there are clear choices of tests that provide a good balance.

Beginning with a detailed background on significance testing, the book then shows the reader how to carry out tests for specific NLP tasks. There is a mix of styles, with the first four chapters providing reference material that will be extremely useful to both new and experienced researchers. Here, it is easy to find the material related to a given NLP task. The next two chapters discuss more recent research into the application of significance tests to deep neural networks and for testing across multiple datasets. Alongside open research questions, these later chapters provide clear guidelines on how to apply the proposed methods. It is this mix of background material and reference guidelines that I believe makes this book so compelling and nicely self-contained.

The introduction in Chapter 1 motivates the need for a comprehensive textbook and outlines challenges that the later chapters address more deeply. The theoretical background material begins in Chapter 2, which introduces core concepts, including hypothesis testing, type 1 and type 2 errors, validity and power, and p-values. The reader does not need to have any prior knowledge of statistical significance tests to follow this part. However, experienced readers could still benefit from reading this chapter, as concepts such as p-values are widely misunderstood and misused (Amrhein, Greenland, and McShane 2019).

The significance tests themselves are introduced in Chapter 3, categorized into parametric and nonparametric tests. The chapter begins with the intuitively simple paired z-test, then builds up to more commonly applied techniques, showing the connections and assumptions that each test makes. Step-by-step algorithms help the reader to implement each test. Although the chapter does cite uses of tests in NLP research, the main purpose is to present the theory behind each test and point out their differences.

Chapter 4 provides perhaps the most handy part of the book for reference: a correspondence between common NLP tasks and statistical tests. Each task is discussed in terms of the evaluation metrics used, then a decision tree is introduced to guide the reader toward a choice between a parametric test, bootstrap or randomization test, or sampling-free nonparametric test. Section 4.3 then links each NLP evaluation measure to a specific significance test, presenting a large table that helps readers identify which test they need for a specific task. Particular considerations for each task are also pointed out to provide more detail about the appropriate options. The final part of this chapter describes the issue of p-hacking, in which dataset sizes are increased until a significance threshold is reached without consideration for biases in the data (discussed, for example, in Hofmann [2015]). The chapter proposes a simple solution to ensure robust significance testing with large datasets.

Where Chapter 4 presents well-established methods, Chapter 5 introduces the current research question of how best to apply statistical significance testing to deep learning. Non-convex loss functions, stochastic optimization, random initialization, and a multitude of hyperparameters limit the conclusions we can draw from a single test run of a deep neural network (DNN). This chapter, which is based on the authors’ ACL paper (Dror, Shlomov, and Reichart 2019), explains how the comparison process can be overhauled to provide more meaningful evaluations. Beginning by explaining the difficulties of evaluating DNNs, the chapter then introduces criteria for a comparison framework, then discusses the limitations of current methods. Reimers and Gurevych (2018) have previously tackled this problem, but their approach has limited power and does not provide a confidence score. Other works, such as Clark et al. (2011), compare DNNs using a collection of statistics, such as the mean or standard deviation of performance across runs. This book shows how such an approach violates the assumptions of the significance tests. The authors propose almost stochastic dominance as the basis for a better alternative. The chapter explains how to use the proposed method, evaluates it in an empirical case study, and finally analyzes the errors made by each testing approach.

Large NLP models are often tested across a range of datasets, which presents another problem for standard significance testing. Chapter 6 discusses the challenges of assessing two questions: (1) On how many datasets does algorithm A outperform algorithm B? (2) On which datasets does A outperform B? Applying standard significance tests individually to each dataset and counting the number of significant results is likely to overestimate the total number of significant results, as this chapter explains. The authors then present a new framework for replicability analysis, based on partial conjunction testing, and discuss two variants (Bonferroni and Fisher) for when the datasets are independent or dependent. They introduce a method based on Benjamini and Heller (2008) to count the number of datasets where one method outperforms another, then show how to use the Holm procedure (Holm 1979) to identify which datasets these are. Chapter 6 provides a lot of detailed background on the proposed replicability analysis framework, and the later sections again link the process to specific NLP case studies, and step-by-step summaries help the reader to apply the methodology. Extensive empirical results illustrate the very substantial differences in outcomes between the proposed approach and standard procedures.

The final two chapters present open problems and conclude, showing that the topic has many interesting research questions of its own, such as problems when performing cross-validation, and the limited statistical power of replicability analysis.

Overall, I highly recommend this book to a wide range of NLP researchers, from new students to seasoned experts who wish to ensure that they compare methods effectively. The book is excellent as both an introduction to the topic of significance testing and as a reference to use when evaluating your results. For anyone with further interest in the topic, it also points the way to future work. If one could level any criticism at this book at all, it is that it does not deeply discuss the basic flaws of significance testing or what the alternatives might be. For now, though, significance testing is an integral part of NLP research and this book provides a great resource for researchers who wish to perform it correctly and painlessly.
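
Two of the ideas the review mentions, a randomization test for comparing two systems and the Holm procedure for deciding on which datasets a difference holds, can be sketched in a few lines. This is a generic textbook-style illustration and not the book's step-by-step algorithms. The function names, the two-sided sign-flip formulation, and the example p-values are assumptions chosen for demonstration, and the partial conjunction counting step is not shown.

```python
import random


def paired_randomization_test(scores_a, scores_b, trials=10000, seed=0):
    """Two-sided paired sign-flip test on the mean per-example score difference.

    Estimates how often a mean difference at least as large as the observed one
    arises when the sign of each paired difference is flipped at random.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs) / len(diffs))
    hits = 0
    for _ in range(trials):
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped / len(diffs)) >= observed:
            hits += 1
    # The +1 terms keep the estimate strictly positive.
    return (hits + 1) / (trials + 1)


def holm(p_values, alpha=0.05):
    """Holm step-down procedure: True marks hypotheses that are rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            rejected[i] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return rejected


# Hypothetical p-values from comparing two systems on four datasets.
print(holm([0.01, 0.04, 0.03, 0.005]))  # [True, False, False, True]
```

Testing each dataset at the unadjusted 0.05 level would have declared all four results significant, whereas the Holm procedure keeps only two, which illustrates the overestimation the review warns about.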

  • Research Article
  • Cite Count: 1
  • 10.56941/odutip.1413597
A Developed Graphical User Interface-Based on Different Generative Pre-trained Transformers Models
  • Apr 30, 2024
  • ODÜ Tıp Dergisi
  • Ekrem Küçük + 4 more

Objective: The article investigates the integration of advanced Generative Pretrained Transformers (GPT) models into a user-friendly Graphical User Interface (GUI). The primary objective of this work is to simplify access to complex Natural Language Processing (NLP) tasks for a diverse range of users, including those with limited technical background. Method: The development process of the GUI was comprehensive and systematic: Needs Assessment: This stage involved understanding the requirements and expectations of potential users to ensure the GUI effectively addresses their needs. Preliminary Design and Development: The initial designs were created and developed into a functional GUI, emphasizing the integration of features supporting various NLP tasks like text summarization, translation, and question-answering. Iterative Refinement: Continuous improvements were made based on user feedback, focusing on enhancing user experience, ease of navigation, and customization capabilities. Results: The developed GUI successfully integrated GPT models, including GPT-4 Turbo and GPT-3.5, resulting in an intuitive and adaptable interface. It demonstrated efficiency in performing various NLP tasks, thereby making these advanced language processing tools accessible to a broader audience. The GUI's design, emphasizing user-friendliness and adaptability, was particularly noted for its ability to cater to both technical and non-technical users. Conclusion: In conclusion, the article illustrates the significant impact of combining advanced GPT models with a Graphical User Interface to democratize the use of NLP tools. This integration not only makes complex language processing more accessible but also marks a pivotal step in the inclusive application of AI technology across various domains. The successful implementation of the GUI highlights the potential of AI in enhancing user interaction and broadening the scope of technology usage in everyday tasks.

  • Conference Article
  • 10.1109/iccit54785.2021.9689900
Multitask Learning as Question Answering with BERT
  • Dec 18, 2021
  • Shishir Roy + 3 more

Question Answering demands a deep understanding of semantic relations among question, answer, and context. Multi-Task Learning (MTL) and Meta Learning with deep neural networks have recently shown impressive performance in many Natural Language Processing (NLP) tasks, particularly when there is inadequate data for training. But little work has been done on a general NLP architecture that spans many NLP tasks. In this paper, we present a model that can generalize to ten different NLP tasks. We demonstrate that a multi-pointer-generator decoder and a pre-trained language model are key to success and surpass all previous state-of-the-art baselines by 74 decaScore, which is more than 12% absolute improvement over all of the datasets.

  • Conference Article
  • Citation count: 14
  • 10.1109/ijcnn55064.2022.9892105
Multi-Task Text Classification using Graph Convolutional Networks for Large-Scale Low Resource Language
  • Jul 18, 2022
  • Mounika Marreddy + 4 more

Graph Convolutional Networks (GCN) have achieved state-of-the-art results on single text classification tasks like sentiment analysis, emotion detection, etc. However, this performance has been tested and reported on resource-rich languages like English. Applying GCN to multi-task text classification is an unexplored area. Moreover, training a GCN or adapting an English GCN for Indian languages is often limited by data availability, rich morphological variation, and syntactic and semantic differences. In this paper, we study the use of GCN for the Telugu language in single and multi-task settings for four natural language processing (NLP) tasks, viz. sentiment analysis (SA), emotion identification (EI), hate-speech (HS), and sarcasm detection (SAR). In order to evaluate the performance of GCN with one of the Indian languages, Telugu, we analyze the GCN-based models with extensive experiments on four downstream tasks. In addition, we created an annotated Telugu dataset, TEL-NLP, for the four NLP tasks. Further, we propose a supervised graph reconstruction method for Telugu, Multi-Task Text GCN (MT-Text GCN), that simultaneously (i) learns low-dimensional word and sentence graph embeddings from word-sentence graph reconstruction using a graph autoencoder (GAE) and (ii) performs multi-task text classification using these latent sentence graph embeddings. We argue that our proposed MT-Text GCN achieves significant improvements on TEL-NLP over existing Telugu pretrained word embeddings [1] and multilingual pretrained Transformer models: mBERT [2] and XLM-R [3]. On TEL-NLP, we achieve high F1-scores for the four NLP tasks: SA (0.84), EI (0.55), HS (0.83) and SAR (0.66). Finally, we show our model's quantitative and qualitative analysis on the four NLP tasks in Telugu.
We open-source our TEL-NLP dataset, pretrained models, and code at https://github.com/scsmuhio/MTGCN_Resources.

  • Conference Article
  • Citation count: 2
  • 10.1109/icict57646.2023.10134311
Word Embedding for Bengali Language using Domain-related Corpus
  • Apr 26, 2023
  • Ashutosh Bandyopadhyay + 1 more

Many natural language processing (NLP) tasks, including machine translation, document classification, information retrieval, news category classification, document clustering, news category clustering, and question answering, require distributional word vector representations: low-dimensional vector representations of words, called word embeddings, which may be contextual or non-contextual. In addition to several contextual word-embedding approaches like BERT, IndicBERT, and SahojBERT (the Bengali equivalent of BERT), this work also covers numerous embedding strategies (such as Word2Vec, GloVe, and FastText) with multiple hyper-parameters. Of late, word embeddings from general corpora such as the Wikipedia dump and the Common Crawl corpus have been widely used to build word vectors, and such corpora are also used for Indian languages. But these embeddings struggle to give good accuracy in general NLP tasks like Sentiment Analysis, News Category Classification, and News Category Clustering, especially for the Bengali language. This research demonstrates, most importantly, that building word embeddings from a domain-related corpus enhances their quality: they work far better than general pre-trained word embeddings, and obtain better accuracy, in various NLP tasks like News Category Classification, Word Similarity, and Document Clustering (such as News Category Clustering using clustering techniques like the KMeans and KMedoid algorithms). This work proves that building word vectors from a domain-related corpus for a particular NLP task yields better results and higher-quality embeddings. Throughout this work, a Deep Neural Network model was used, and the same model was used in all the tests and findings.
This work clearly states that if word embeddings can be derived from a corpus related to the domain on which NLP tasks will be performed, those embeddings will surely outperform embeddings obtained from a common corpus or publicly available embeddings for the Bengali language. So, for a task like News Category Classification or Sentiment Analysis, word embeddings created from a domain-related news corpus or sentiment-related corpus will perform far better. Nowadays, labeled data sets are hard to come by. This work demonstrates how unsupervised approaches, such as clustering, may be used effectively and produce excellent outcomes, for example in news category clustering. This work records the performance of our embeddings in several scenarios, such as News Category Classification and Document Clustering tasks. Word embeddings created from a domain-related corpus show promising results over those from a common corpus like the Wikipedia dump.

  • Research Article
  • Citation count: 9
  • 10.1007/s40264-023-01323-2
Automatic Extraction of Comprehensive Drug Safety Information from Adverse Drug Event Narratives in the Korea Adverse Event Reporting System Using Natural Language Processing Techniques.
  • Jun 17, 2023
  • Drug Safety
  • Siun Kim + 6 more

Concerns have been raised over the quality of drug safety information, particularly data completeness, collected through spontaneous reporting systems (SRS), although regulatory agencies routinely use SRS data to guide their pharmacovigilance programs. We expected that collecting additional drug safety information from adverse drug event (ADE) narratives and incorporating it into the SRS database would improve data completeness. The aims of this study were to define the extraction of comprehensive drug safety information from ADE narratives reported through the Korea Adverse Event Reporting System (KAERS) as natural language processing (NLP) tasks and to provide baseline models for the defined tasks. This study used ADE narratives and structured drug safety information from individual case safety reports (ICSRs) reported through KAERS between 1 January 2015 and 31 December 2019. We developed an annotation guideline for the extraction of comprehensive drug safety information from ADE narratives based on the International Conference on Harmonisation (ICH) E2B(R3) guideline and manually annotated 3723 ADE narratives. Then, we developed a domain-specific Korean Bidirectional Encoder Representations from Transformers (KAERS-BERT) model using 1.2 million ADE narratives in KAERS and provided baseline models for the tasks we defined. In addition, we performed an ablation experiment to investigate whether named entity recognition (NER) models were improved when a training dataset contained more diverse ADE narratives. We defined 21 types of word entities, six types of entity labels, and 49 types of relations to formulate the extraction of comprehensive drug safety information as NLP tasks. We obtained a total of 86,750 entities, 81,828 entity labels, and 45,107 relations from manually annotated ADE narratives.
The KAERS-BERT model achieved F1-scores of 83.81% and 76.62% on the NER and sentence extraction tasks, respectively, while outperforming other baseline models on all the NLP tasks we defined except the sentence extraction task. Finally, utilizing the NER model for extracting drug safety information from ADE narratives resulted in an average increase of 3.24% in data completeness for KAERS structured data fields. We formulated the extraction of comprehensive drug safety information from ADE narratives as NLP tasks and developed the annotated corpus and strong baseline models for the tasks. The annotated corpus and models for extracting comprehensive drug safety information can improve the data quality of an SRS database.
