Macro-averaged F1-score Research Articles

Background: In silico prediction of potential drug side-effects is of crucial importance for drug development, since wet-lab identification of drug side-effects is expensive and time-consuming. Existing computational methods mainly focus on leveraging validated drug side-effect relations for the prediction. Their performance is severely impeded by the lack of reliable negative training data, so a method for selecting reliable negative samples is vital to improving performance.

Methods: Most existing computational prediction methods rest on the assumption that similar drugs are inclined to share the same side-effects, an assumption that has yielded remarkable performance. It is also rational to assume the inverse proposition: dissimilar drugs are less likely to share the same side-effects. Based on this inverse similarity hypothesis, we proposed a novel method to select highly reliable negative samples for side-effect prediction. The first step of our method is to build a drug similarity integration framework that measures the similarity between drugs from different perspectives. This step integrates drug chemical structures, drug target proteins, drug substituents, and drug therapeutic information as features into a unified framework. Then, a similarity score between each candidate negative drug and the validated positive drugs is calculated using the similarity integration framework. Candidate negative drugs with lower similarity scores are preferentially selected as negative samples. Finally, both the validated positive drugs and the selected highly reliable negative samples are used for prediction.

Results: The performance of the proposed method was evaluated on simulated side-effect prediction for 917 DrugBank drugs, with comparisons across four machine-learning algorithms. Extensive experiments show that the drug similarity integration framework has superior capability in capturing drug features, achieving much better performance than frameworks based on a single type of drug property. In addition, the four machine-learning algorithms achieved significant improvements in macro-averaged F1-score (e.g., SVM from 0.655 to 0.898), macro-averaged precision (e.g., RBF from 0.592 to 0.828) and macro-averaged recall (e.g., KNN from 0.651 to 0.772), attributable to the highly reliable negative samples selected by the proposed method.

Conclusions: The results suggest that the inverse similarity hypothesis and the integration of different drug properties are valuable for side-effect prediction. The selection of highly reliable negative samples can also contribute significantly to the performance improvement.
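The negative-sample selection step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the four similarity views are combined or how a candidate's score against the positive drugs is aggregated, so the plain averaging used here, the function names, and the toy matrices are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def integrate_similarities(sim_matrices):
    """Combine several drug-by-drug similarity matrices into one.

    Assumption: an unweighted mean; the paper's integration scheme may differ.
    """
    return np.mean(np.stack(sim_matrices), axis=0)

def select_reliable_negatives(sim, positive_idx, candidate_idx, n_select):
    """Return the n_select candidate drugs least similar to the positive drugs."""
    # Score each candidate by its mean similarity to all validated positives
    # (assumed aggregation; a max or weighted score would also be plausible).
    scores = sim[np.ix_(candidate_idx, positive_idx)].mean(axis=1)
    order = np.argsort(scores, kind="stable")  # ascending: least similar first
    return [candidate_idx[i] for i in order[:n_select]]

# Toy data: 5 drugs, two hypothetical similarity views (e.g. structure, targets).
chem = np.array([
    [1.0, 0.8, 0.9, 0.1, 0.5],
    [0.8, 1.0, 0.8, 0.2, 0.4],
    [0.9, 0.8, 1.0, 0.3, 0.6],
    [0.1, 0.2, 0.3, 1.0, 0.2],
    [0.5, 0.4, 0.6, 0.2, 1.0],
])
target = np.array([
    [1.0, 0.6, 0.7, 0.3, 0.5],
    [0.6, 1.0, 0.8, 0.0, 0.4],
    [0.7, 0.8, 1.0, 0.1, 0.4],
    [0.3, 0.0, 0.1, 1.0, 0.2],
    [0.5, 0.4, 0.4, 0.2, 1.0],
])

sim = integrate_similarities([chem, target])
# Drugs 0 and 1 are validated positives for some side-effect; 2, 3, 4 are unlabeled.
negatives = select_reliable_negatives(sim, positive_idx=[0, 1],
                                      candidate_idx=[2, 3, 4], n_select=2)
print(negatives)
```

In this toy case drug 2 closely resembles both positives, so it is the candidate left out of the negative set; drugs 3 and 4 are kept as the more reliable negatives.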

The aim was to create a system to aid in the analysis of art history by classifying and grouping digitized paintings based on stylistic features automatically learned without prior knowledge.

6,776 digitized paintings from 8 different artistic styles (Art Nouveau, Baroque, Expressionism, Impressionism, Realism, Romanticism, Renaissance, and Post-Impressionism) were utilized to classify (predict) and cluster (group) paintings according to style. The method of unsupervised feature learning with K-means (UFLK), inspired by deep learning, was utilized to extract features from the paintings. These features were then used in: (1) a support vector machine algorithm to classify the style of new test paintings based on a training set of paintings having known style labels; and (2) a spectral clustering algorithm to group the paintings into distinct style groups (anonymously, without employing any known style labels). Classification performance was determined by accuracy and F-score. Clustering performance was determined by: (1) the ability to recover the original stylistic groupings (using a cost analysis of all possible combinations of 8 group label assignments); (2) F-score; and (3) a reliability analysis. The latter analysis used two novel ways to determine the distribution of the null hypothesis: (a) a uniform distribution projected onto the principal components of the original data; and (b) a randomized, weighted adjacency matrix. The ability to gain insights into art was tested by a semantic analysis of the clustering results. For this purpose, we represented the featural characteristics of each painting by an N-dimensional feature vector, and plotted the distance between vector endpoints (i.e., similarity between paintings). Then, we color-coded the endpoints with the assigned lowest-cost style labels. The scatterplot was visually inspected for separation of the paintings, where the amount of separation between color clusters provides semantic information on the interrelatedness between styles.

The UFLK-extracted features resembled the edges, lines, and colors in the paintings. For feature-based classification of paintings, the macro-averaged F-score was 0.469. Classification accuracy and F-score were similar to or higher than those of other classification methods using more complex feature learning models (e.g., convolutional neural networks, a supervised algorithm). The clustering via UFLK-extracted features yielded 8 unlabeled style groupings. In 6 of 8 clusters, the most common true painting style matched the cluster style assigned by cost analysis. The clustering had an F-score of 0.212. (There are no comparison methods for clustering paintings.) For the semantic analysis, the featural characteristics of Baroque and Art Nouveau were found to be similar, indicating a relationship between these styles.

The UFLK method can extract features from digitized paintings. We were able to extract characteristics of art without any prior information about the nature of the features or the stylistic designation of the paintings. The methods herein may provide art researchers with the latest computational techniques for the documentation, interpretation, and forensics of art. The tools could assist the preservation of culturally sensitive works of art for future generations, and provide new insights into works of art and the artists who created them.
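The two evaluation ideas in this abstract, assigning style labels to unlabeled clusters by a cost analysis over all label combinations and then reporting a macro-averaged F-score, can be sketched as follows. The abstract does not define the cost, so counting mismatched paintings is an assumption here, as are the function names and the three-style toy data. With 8 styles the exhaustive search covers 8! = 40,320 mappings, which is still tractable.

```python
from itertools import permutations

def macro_f1(y_true, y_pred, labels):
    """Macro-averaged F1: the unweighted mean of the per-class F1 scores."""
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1s.append(2 * tp / denom if denom else 0.0)
    return sum(f1s) / len(f1s)

def lowest_cost_assignment(y_true, clusters, labels):
    """Try every cluster-to-label mapping and keep the one with fewest mismatches.

    Assumption: cost = number of paintings whose mapped label differs from
    the true style; the paper's cost function may be defined differently.
    """
    cluster_ids = sorted(set(clusters))
    best_map, best_cost = None, None
    for perm in permutations(labels, len(cluster_ids)):
        mapping = dict(zip(cluster_ids, perm))
        cost = sum(mapping[c] != t for c, t in zip(clusters, y_true))
        if best_cost is None or cost < best_cost:
            best_map, best_cost = mapping, cost
    return best_map, best_cost

# Toy data: six paintings, three styles, three unlabeled clusters.
labels = ["Baroque", "Impressionism", "Realism"]
y_true = ["Baroque", "Baroque", "Realism", "Realism",
          "Impressionism", "Impressionism"]
clusters = [0, 0, 1, 2, 2, 2]

mapping, cost = lowest_cost_assignment(y_true, clusters, labels)
y_pred = [mapping[c] for c in clusters]
score = macro_f1(y_true, y_pred, labels)
print(mapping, cost, round(score, 3))
```

Because the macro average weights every style equally, a style with few paintings affects the score as much as a large one, which is why it is often reported alongside accuracy for imbalanced class sets.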

Articles published on Macro-averaged F1-score

Inverse similarity and reliable negative samples for drug side-effect prediction

A multitask bi-directional RNN model for named entity recognition on Chinese electronic medical records

A Text Mining Pipeline Using Active and Deep Learning Aimed at Curating Information in Computational Neuroscience

A Metrical Analysis of Medieval German Poetry Using Supervised Learning

On Hierarchical Text Language-Identification Algorithms

Extensive Experimental Evaluation of Self-Organizing Maps for Automatic Classification of a Multi-Class Multi-Label Corpus

Predicting and Grouping Digitized Paintings by Style using Unsupervised Feature Learning.

An automated and robust image processing algorithm for glaucoma diagnosis from fundus images using novel blood vessel tracking and bend point detection

A Semi-automatic Method for Constructing a Wikipedia-Based Named Entity Dictionary

Sparse Modeling of Magnitude and Phase-Derived Spectra for Playing Technique Classification

Aara’– a system for mining the polarity of Saudi public opinion through e-newspaper comments

Minimally Supervised Novel Relation Extraction Using a Latent Relational Mapping

Clustering Generalised Instances Set Approaches for Text Classification

An interactive and user-centered computer system to predict physician’s disease judgments in discharge summaries
