Use Of Word Embeddings Research Articles

Recent initiatives in psychiatry emphasize the utility of characterizing psychiatric symptoms in a multidimensional manner. However, strategies for applying standard self-report scales for multiaxial assessment have not been well-studied, particularly where the aim is to support both categorical and dimensional phenotypes. We propose a method for applying natural language processing to derive dimensional measures of psychiatric symptoms from questionnaire data. We utilized nine self-report symptom measures drawn from a large cellular biobanking study that enrolled individuals with mood and psychotic disorders, as well as healthy controls. To summarize questionnaire results we used word embeddings, a technique to represent words as numeric vectors preserving semantic and syntactic meaning. A low-dimensional approximation to the embedding space was used to derive the proposed succinct summary of symptom profiles. To validate our embedding-based disease profiles, these were compared to presence or absence of axis I diagnoses derived from structured clinical interview, and to objective neurocognitive testing. Unsupervised and supervised classification to distinguish presence/absence of axis I disorders using survey-level embeddings remained discriminative, with area under the receiver operating characteristic curve up to 0.85, 95% confidence interval (CI) (0.74,0.91) using Gaussian mixture modeling, and cross-validated area under the receiver operating characteristic curve 0.91, 95% CI (0.88,0.94) using logistic regression. Derived symptom measures and estimated Research Domain Criteria scores also associated significantly with performance on neurocognitive tests. Our results support the potential utility of deriving dimensional phenotypic measures in psychiatric illness through the use of word embeddings, while illustrating the challenges in identifying truly orthogonal dimensions.
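The abstract above does not spell out how survey-level embeddings or the low-dimensional approximation were computed, so the following is only a minimal sketch under stated assumptions. It assumes each questionnaire item can be reduced to a keyword with a word vector, that a respondent's survey embedding is the response-weighted mean of those vectors, and that the low-dimensional summary is a projection onto the first principal axis (found here by power iteration). The toy vectors, the item keywords, and the function names are all hypothetical.

```python
import math

# Hypothetical 3-dimensional word vectors; a real study would load trained embeddings.
TOY_VECTORS = {
    "sad": [0.9, 0.1, 0.0],
    "hopeless": [0.8, 0.2, 0.1],
    "restless": [0.1, 0.9, 0.0],
    "worried": [0.2, 0.8, 0.1],
}


def survey_embedding(responses, vectors):
    """Response-weighted mean of item keyword vectors (an assumed aggregation)."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for word, score in responses.items():
        for i, value in enumerate(vectors[word]):
            total[i] += score * value
        weight_sum += score
    if weight_sum == 0:
        return total
    return [t / weight_sum for t in total]


def first_principal_axis(rows, iterations=200):
    """Return (mean, unit axis): leading eigenvector of the sample covariance."""
    n, dim = len(rows), len(rows[0])
    mean = [sum(r[i] for r in rows) / n for i in range(dim)]
    centered = [[r[i] - mean[i] for i in range(dim)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]
    # Asymmetric start vector, so it is unlikely to be orthogonal to the leading axis.
    axis = [1.0 / (i + 1) for i in range(dim)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * axis[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break
        axis = [x / norm for x in nxt]
    return mean, axis


def project(row, mean, axis):
    """One-dimensional score of a survey embedding along the principal axis."""
    return sum((row[i] - mean[i]) * axis[i] for i in range(len(row)))


# Four hypothetical respondents: two mood-dominant, two anxiety-dominant.
respondents = [
    {"sad": 3, "hopeless": 3, "restless": 0, "worried": 1},
    {"sad": 2, "hopeless": 3, "restless": 1, "worried": 0},
    {"sad": 0, "hopeless": 1, "restless": 3, "worried": 3},
    {"sad": 1, "hopeless": 0, "restless": 3, "worried": 2},
]
embeddings = [survey_embedding(r, TOY_VECTORS) for r in respondents]
mean, axis = first_principal_axis(embeddings)
scores = [project(e, mean, axis) for e in embeddings]
```

In this toy setup the two mood-dominant respondents fall on one side of the axis and the two anxiety-dominant respondents on the other. The sign of the axis is arbitrary, so only the relative ordering of scores is meaningful. Scores like these could then feed a classifier such as the Gaussian mixture or logistic regression models mentioned in the abstract.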

Background: Drug Package Leaflets (DPLs) provide information for patients on how to use medicines safely. Pharmaceutical companies are responsible for producing these documents. However, several studies have shown that patients often have difficulty understanding the sections that describe posology (dosage quantity and prescription), contraindications and adverse drug reactions. The ultimate goal of this work is to provide an automatic approach that helps these companies write drug package leaflets in easy-to-understand language. Natural language processing has become a powerful tool for improving patient care and advancing medicine because it makes it possible to automatically process the large amount of unstructured information needed for patient care. However, to the best of our knowledge, no research has been done on the automatic simplification of drug package leaflets. In previous work, we proposed using domain terminological resources to gather a set of synonyms for a given target term. A potential drawback of this approach is that it depends heavily on the existence of dictionaries; these are not available for every domain and language, and where they do exist, their coverage is often limited. To overcome this limitation, we propose the use of word embeddings to identify the simplest synonym for a given term. Word embedding models represent each word in a corpus as a vector in a semantic space. Our approach is based on the assumption that synonyms should have close vectors because they occur in similar contexts.

Results: In our evaluation, we used the EasyDPL (Easy Drug Package Leaflets) corpus, a collection of 306 leaflets written in Spanish and manually annotated with 1400 adverse drug effects and their simplest synonyms. We focus on leaflets written in Spanish because it is the second most widely spoken language in the world, yet terminological resources are usually scarcer for Spanish than for English. Our experiments show an accuracy of 38.5% using word embeddings.

Conclusions: This work provides a promising approach to simplifying DPLs without using terminological resources or parallel corpora. Moreover, it could be easily adapted to different domains and languages. However, more research is needed to improve our word-embedding-based approach, because it does not yet outperform our previous dictionary-based work.
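The idea that synonyms have nearby vectors can be illustrated with a small nearest-neighbour search. This is a sketch under stated assumptions, not the authors' implementation: the two-dimensional vectors and the frequency counts are invented, and "simplest" is approximated by the highest corpus frequency, a heuristic the abstract does not confirm. The function names are hypothetical as well.

```python
import math

# Hypothetical 2-dimensional vectors; a real system would use embeddings trained on a corpus.
VECTORS = {
    "cefalea": [0.90, 0.10],
    "jaqueca": [0.85, 0.20],
    "migraña": [0.80, 0.30],
    "náuseas": [0.10, 0.90],
}

# Hypothetical corpus frequencies, used here as a stand-in for simplicity.
FREQUENCY = {"cefalea": 5, "jaqueca": 40, "migraña": 25, "náuseas": 60}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_neighbors(term, vectors, k=2):
    """The k words whose vectors are closest to the term's vector."""
    target = vectors[term]
    scored = [(cosine(target, vec), word) for word, vec in vectors.items() if word != term]
    scored.sort(reverse=True)
    return [word for _, word in scored[:k]]


def simplest_synonym(term, vectors, frequency, k=2):
    """Among the term and its k nearest neighbours, pick the most frequent word."""
    candidates = nearest_neighbors(term, vectors, k) + [term]
    return max(candidates, key=lambda word: frequency.get(word, 0))


neighbors = nearest_neighbors("cefalea", VECTORS)
choice = simplest_synonym("cefalea", VECTORS, FREQUENCY)
```

With these toy values, the neighbours of "cefalea" are "jaqueca" and "migraña", and the more frequent of the candidates, "jaqueca", is selected, while the unrelated but frequent "náuseas" is never considered. Nearest neighbours in an embedding space are not guaranteed to be true synonyms, which is consistent with the modest accuracy reported above.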

Articles published on Use Of Word Embeddings

Integrating questionnaire measures for transdiagnostic psychiatric phenotyping using word2vec.

A dynamic Windows malware detection and prediction method based on contextual understanding of API call sequence

Characterizing Word Embeddings for Zero-Shot Sensor-Based Human Activity Recognition.

A word-embedding-based approach for accurate identification of corresponding activities

Extending sparse text with induced domain-specific lexicons and embeddings: A case study on predicting donations

On the use of word embedding for cross language plagiarism detection

Word embeddings and external resources for answer processing in biomedical factoid question answering.

Semantic Web Annotation using Deep Learning with Arabic Morphology

Using Communities of Words Derived from Multilingual Word Vectors for Cross-Language Information Retrieval in Indian Languages

Evaluating deep learning models for sentiment classification

Exploring the impact of word embeddings for disjoint semisupervised Spanish verb sense disambiguation

A novel sentence similarity model with word embedding based on convolutional neural network

GraphDBLP: a system for analysing networks of computer scientists through graph databases

Practical Text Phylogeny for Real-World Settings

Deep Learning Algorithm for Cyberbullying Detection

Simplifying drug package leaflets written in Spanish by using word embedding

AraVec: A set of Arabic Word Embedding Models for use in Arabic NLP

An approach to the use of word embeddings in an opinion classification task

Improved CCG Parsing with Semi-supervised Supertagging
