Kinds Of Outliers Research Articles

Community detection, which aims to cluster $N$ nodes in a given graph into $r$ distinct groups based on the observed undirected edges, is an important problem in network data analysis. In this paper, the popular stochastic block model (SBM) is extended to the generalized stochastic block model (GSBM), which allows for adversarial outlier nodes that are connected with the other nodes in the graph in an arbitrary way. Under this model, we introduce a procedure that uses convex optimization followed by the $k$-means algorithm with $k=r$. Both theoretical and numerical properties of the method are analyzed. A theoretical guarantee is given for the procedure to accurately detect the communities with a small misclassification rate in the setting where the number of clusters can grow with $N$. This theoretical result matches the best-known result in the literature on computationally feasible community detection in the SBM without outliers. Numerical results show that our method is both computationally fast and robust to different kinds of outliers, while some popular computationally fast community detection algorithms, such as spectral clustering applied to adjacency matrices or graph Laplacians, may fail to retrieve the major clusters because of a small portion of outliers. We apply a slight modification of our method to a political blogs data set, showing that our method is competent in practice and comparable to existing computationally feasible methods in the literature. To the best of the authors’ knowledge, our result is the first in the literature on clustering a fast-growing number of communities under the GSBM, where a portion of arbitrary outlier nodes exists.
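For orientation, the sketch below illustrates the spectral clustering baseline that the abstract mentions (applied to an adjacency matrix, followed by $k$-means with $k=r$), not the authors' convex optimization procedure, which the abstract does not specify. The block sizes, the edge probabilities `p_in` and `p_out`, and all function names are assumptions chosen for illustration, and the graph is sampled from a plain SBM without outlier nodes.

```python
import numpy as np

def sample_sbm(sizes, p_in, p_out, rng):
    """Sample a symmetric 0/1 adjacency matrix from a stochastic block model."""
    labels = np.repeat(np.arange(len(sizes)), sizes)
    n = labels.size
    same_block = labels[:, None] == labels[None, :]
    probs = np.where(same_block, p_in, p_out)
    # Draw each undirected edge once (strict upper triangle), then mirror it.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(float)
    return adj, labels

def kmeans(points, k, rng, n_iter=50):
    """Plain Lloyd iterations with farthest-point initialization."""
    centers = [points[rng.integers(len(points))]]
    for _ in range(k - 1):
        # Squared distance of every point to its nearest chosen center.
        d = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = points[assign == j].mean(axis=0)
    return assign

def spectral_clustering(adj, r, rng):
    """Embed nodes with the r eigenvectors of largest |eigenvalue|, then k-means."""
    vals, vecs = np.linalg.eigh(adj)
    top = np.argsort(np.abs(vals))[-r:]
    return kmeans(vecs[:, top], r, rng)

rng = np.random.default_rng(0)
adj, labels = sample_sbm([40, 40, 40], p_in=0.6, p_out=0.05, rng=rng)
pred = spectral_clustering(adj, r=3, rng=rng)
```

On a clean, well-separated SBM like this one, the predicted labels should agree with the true blocks up to a relabeling. The abstract's point is that adding even a few arbitrarily connected nodes can break this kind of baseline.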

With the provision of any source of real-time information, the timeliness and accuracy of the data provided are paramount to the effectiveness and success of the system and its acceptance by users. To improve the accuracy and reliability of parking guidance systems (PGSs), the technique of outlier mining has been introduced for detecting and analysing outliers in available parking space (APS) datasets. To distinguish outlier features from the overall periodic tendency of the APS data, and to simultaneously identify the two types of outliers that naturally exist in APS datasets with intrinsically distinct statistical features, a two-phase detection method is proposed that incorporates an improved density-based detection algorithm named “local entropy based weighted outlier detection” (EWOD). Real-world data from parking facilities in the City of Newcastle upon Tyne was used to test the hypothesis. Experimental tests were then carried out for a comparative study of the outlier detection performance of the two-phase detection method, a statistics-based method, and a traditional density-based method. The results showed that the proposed method can identify the two kinds of outliers simultaneously, with identification accuracies of 100% and 92.7% for the first and second types of outliers, respectively.

Articles published on Kinds Of Outliers

How to Classify, Detect, and Manage Univariate and Multivariate Outliers, With Emphasis on Pre-Registration

Spurious PIV vector detection and correction using a penalized least-squares method with adaptive order differentials

Robust and computationally feasible community detection in the presence of arbitrary outlier nodes

Detection of Outliers in a Time Series of Available Parking Spaces

Interventions in log-linear Poisson autoregression

A continuous optimization framework for hybrid system identification

ARFIMA processes and outliers: a weighted likelihood approach

Outlier detection in ARMA models

Trimming Tools in Exploratory Data Analysis

On Robustness in the Logistic Regression Model
