A Comparative Study of Two State-of-the-Art Feature Selection Algorithms for Texture-Based Pixel-Labeling Task of Ancient Documents

Maroua Mehri, Ramzi Chaieb, Pierre Héroux, Rémy Mullot, Karim Kalti, Najoua Essoukri Ben Amara

doi:10.3390/jimaging4080097

Abstract

Recently, texture features have been widely used for historical document image analysis. However, few studies have focused exclusively on feature selection algorithms for this task. Feature selection has become an important need in data mining and machine learning, since it reduces data dimensionality and can improve the performance of a downstream algorithm such as a pixel classifier. Therefore, in this paper we propose a comparative study of two conventional feature selection algorithms, the genetic algorithm and the ReliefF algorithm, using a classical pixel-labeling scheme based on analyzing and selecting texture features. Both algorithms have been applied to a training set of the HBR dataset in order to deduce the most often selected texture features of each analyzed texture-based feature set. The evaluated feature sets consist of numerous state-of-the-art texture features (Tamura, local binary patterns, gray-level run-length matrix, auto-correlation function, gray-level co-occurrence matrix, Gabor filters, three-level Haar wavelet transform, three-level wavelet transform using a 3-tap Daubechies filter, and three-level wavelet transform using a 4-tap Daubechies filter). Our experiments use a public corpus of historical document images provided in the context of the historical book recognition contest (HBR2013 dataset: PRImA, Salford, UK). Qualitative and numerical experiments are reported in order to provide a set of comprehensive guidelines on the strengths and weaknesses of each assessed feature selection algorithm according to the texture feature set used.

Highlights

Providing reliable computer-based access and analysis of cultural heritage documents has been flagged as a very important need for the library and the information science community, spanning educationalists, students, practitioners, researchers in book history, computer scientists, historians, librarians, end-users and decision makers.
After describing the experimental corpus, its associated ground truth, and the accuracy metrics used for performance evaluation, the paper discusses the performance of each texture feature set in three configurations: the full texture feature set, a subset of texture features selected by the genetic algorithm (GA), and a subset selected by the ReliefF algorithm (RA).
The study aims at analyzing and comparing the performance of each texture feature set across these three configurations: the full set, the GA-selected subset, and the RA-selected subset.

Summary

Introduction

Providing reliable computer-based access and analysis of cultural heritage documents has been flagged as a very important need for the library and the information science community, spanning educationalists, students, practitioners, researchers in book history, computer scientists, historians, librarians, end-users and decision makers. Wei et al. [3] proposed a layout analysis method for historical document images that uses the sequential forward selection algorithm for feature selection and an autoencoder, as a deep neural network, for feature learning. Methods based on deep architectures are hindered, on the one hand, by their computational cost in terms of memory consumption, processing time and computational complexity and, on the other hand, by the need for large datasets.

Texture Features

Feature Selection Algorithms

Genetic Algorithm
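
As a rough illustration of how a genetic algorithm can pick a feature subset, the sketch below evolves boolean feature masks using binary tournament selection, one-point crossover, bit-flip mutation and elitism. It is not the authors' implementation: the nearest-centroid fitness function, the function names `nearest_centroid_accuracy` and `ga_select`, and every parameter value are assumptions chosen only to keep the example self-contained.

```python
import numpy as np


def nearest_centroid_accuracy(X_train, y_train, X_val, y_val):
    """Accuracy of a nearest-centroid classifier (assumed stand-in fitness)."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every validation row to every centroid.
    dists = ((X_val[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dists.argmin(axis=1)] == y_val).mean())


def ga_select(X_train, y_train, X_val, y_val, pop_size=20, generations=15,
              crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Return (best boolean feature mask, its validation accuracy)."""
    rng = np.random.default_rng(seed)
    d = X_train.shape[1]

    def fitness(mask):
        if not mask.any():  # an empty subset cannot classify anything
            return 0.0
        return nearest_centroid_accuracy(
            X_train[:, mask], y_train, X_val[:, mask], y_val)

    def tournament(pop, scores):
        # Binary tournament: draw two individuals, keep the fitter one.
        a, b = rng.integers(pop_size, size=2)
        return pop[a] if scores[a] >= scores[b] else pop[b]

    # Each individual is a boolean mask over the d feature columns.
    pop = rng.random((pop_size, d)) < 0.5
    scores = np.array([fitness(ind) for ind in pop])
    for _ in range(generations):
        children = [pop[scores.argmax()].copy()]  # elitism: keep the best
        while len(children) < pop_size:
            child = tournament(pop, scores).copy()
            other = tournament(pop, scores)
            if d > 1 and rng.random() < crossover_rate:
                cut = rng.integers(1, d)  # one-point crossover
                child[cut:] = other[cut:]
            child ^= rng.random(d) < mutation_rate  # bit-flip mutation
            children.append(child)
        pop = np.array(children)
        scores = np.array([fitness(ind) for ind in pop])
    best = scores.argmax()
    return pop[best], float(scores[best])
```

Calling `ga_select(X_train, y_train, X_val, y_val)` returns a boolean mask over the feature columns together with its validation accuracy. In a texture setting, each column would hold one texture descriptor and each row one labeled pixel.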

ReliefF Algorithm
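
For comparison with a wrapper approach, ReliefF is a filter method: it scores each feature by how well it separates an instance from its nearest neighbors of other classes (misses) while staying close to its nearest neighbors of the same class (hits). The sketch below follows the commonly described ReliefF update for numeric features. The function name `relieff_weights`, the Manhattan distance, the min-max scaling and the default parameter values are assumptions for illustration, not details taken from the paper.

```python
import numpy as np


def relieff_weights(X, y, n_neighbors=5, n_samples=None, seed=0):
    """Return one ReliefF relevance weight per feature (higher = more relevant)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape
    rng = np.random.default_rng(seed)

    # Min-max scale every feature so per-feature differences lie in [0, 1].
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # constant features: avoid division by zero
    Xs = (X - lo) / span

    classes, counts = np.unique(y, return_counts=True)
    prior = dict(zip(classes, counts / n))
    m = n if n_samples is None else min(n_samples, n)

    weights = np.zeros(d)
    for i in rng.choice(n, size=m, replace=False):
        diffs = np.abs(Xs - Xs[i])   # per-feature difference to every instance
        dist = diffs.sum(axis=1)     # Manhattan distance to every instance
        for c in classes:
            idx = np.flatnonzero(y == c)
            idx = idx[idx != i]      # never use the instance as its own neighbor
            k = min(n_neighbors, idx.size)
            if k == 0:
                continue
            nearest = idx[np.argsort(dist[idx])[:k]]
            mean_diff = diffs[nearest].mean(axis=0)
            if c == y[i]:
                # Nearest hits: differing from same-class neighbors is penalized.
                weights -= mean_diff / m
            else:
                # Nearest misses: rewarded, weighted by the class prior.
                weights += prior[c] / (1.0 - prior[y[i]]) * mean_diff / m
    return weights
```

Features can then be ranked with `np.argsort(weights)[::-1]` and the top-ranked ones kept. How many to keep is a separate design choice that the weights alone do not settle.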

Evaluation and Results

Pixel-Labeling Scheme

Corpus and Preparation of Ground Truth

Qualitative Results

Benchmarking and Performance Evaluation

Conclusions and Further Work


Journal: Journal of Imaging	Publication Date: Aug 1, 2018
Citations: 3	License type: CC BY 4.0
