Bilevel Feature Extraction-Based Text Mining for Fault Diagnosis of Railway Systems

Feng Wang, Tianhua Xu, Mengchu Zhou, Haifeng Wang, Tao Tang

doi:10.1109/tits.2016.2521866

Abstract

A vast amount of text data is recorded in the form of repair verbatim in railway maintenance sectors. Efficient text mining of such maintenance data plays an important role in detecting anomalies and improving fault diagnosis efficiency. However, unstructured verbatim, high-dimensional data, and an imbalanced fault class distribution pose challenges for feature selection and fault diagnosis. We propose a bilevel feature extraction-based text mining method that integrates features extracted at both the syntax and semantic levels with the aim of improving fault classification performance. We first perform an improved $\chi^{2}$ statistics-based feature selection at the syntax level to overcome the learning difficulty caused by an imbalanced data set. Then, we perform a prior latent Dirichlet allocation-based feature selection at the semantic level to reduce the data set into a low-dimensional topic space. Finally, we fuse the fault features derived from both levels via serial fusion. The proposed method uses fault features at different levels and enhances the precision of fault diagnosis for all fault classes, particularly minority ones. Its performance has been validated on a railway maintenance data set collected from 2008 to 2014 by a railway corporation, where it outperforms traditional approaches.
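The three steps in the abstract (term selection at the syntax level, topic features at the semantic level, serial fusion) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: it uses the standard $\chi^{2}$ term-class statistic rather than the authors' improved variant, the topic proportions are a hard-coded placeholder standing in for the output of a latent Dirichlet allocation model, serial fusion is taken to mean plain concatenation of the two feature vectors, and the toy documents, labels, and function names are all hypothetical.

```python
def chi2_score(docs, labels, term, cls):
    """Standard chi-square statistic between a term and a class.

    Assumption: this is the textbook statistic, not the paper's improved variant.
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == cls
        if has_term and in_class:
            a += 1  # in class, contains term
        elif has_term:
            b += 1  # other class, contains term
        elif in_class:
            c += 1  # in class, lacks term
        else:
            d += 1  # other class, lacks term
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - c * b) ** 2 / denom


def select_terms(docs, labels, k):
    """Keep the k terms whose best per-class chi-square score is highest."""
    vocab = sorted({t for tokens in docs for t in tokens})
    classes = sorted(set(labels))
    scores = {t: max(chi2_score(docs, labels, t, c) for c in classes) for t in vocab}
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]


def syntax_features(tokens, terms):
    """Binary presence vector over the selected terms."""
    return [1.0 if t in tokens else 0.0 for t in terms]


def serial_fusion(syntax_vec, topic_vec):
    """Serial fusion, assumed here to be simple concatenation."""
    return list(syntax_vec) + list(topic_vec)


# Hypothetical toy maintenance records, each reduced to a set of tokens.
docs = [
    {"signal", "lamp", "failure"},
    {"signal", "cable", "broken"},
    {"track", "circuit", "short"},
    {"track", "circuit", "failure"},
]
labels = ["signal", "signal", "track", "track"]

terms = select_terms(docs, labels, k=3)

# Placeholder topic proportions for the first record; in practice these
# would come from a fitted topic model such as LDA.
topic_vec = [0.9, 0.1]
fused = serial_fusion(syntax_features(docs[0], terms), topic_vec)
print(terms, fused)
```

In this toy data the word "failure" appears equally often in both classes, so its score is zero and it is not selected, while class-specific words such as "signal" and "track" score highest.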

Journal: IEEE Transactions on Intelligent Transportation Systems
Publication Date: Jan 1, 2017
Citations: 133

Similar Papers

Fault diagnosis of planetary gearbox using multi-criteria feature selection and heterogeneous ensemble learning classification
Zirui Wang ... Youren Wang
Measurement | VOL. 173
01 Nov 2020

Bi-TLLDA and CSSVM based fault diagnosis of vehicle on-board equipment for high speed railway
Wei Wei ... Xiaoqiang Zhao
Measurement Science and Technology | VOL. 32
18 May 2021

Feature Selection for High-Dimensional and Imbalanced Biomedical Data Based on Robust Correlation Based Redundancy and Binary Grasshopper Optimization Algorithm
Garba Abdulrauf Sharifai ... Zurinahni Zainol
Genes | VOL. 11
27 Jun 2020

A Modified Adaptive Chaotic Binary Ant System and Its Application in Chemical Process Fault Diagnosis
Ling Wang ... Jinshou Yu
01 Jan 2006
