Abstract

The first step in recovering information from a chart image is chart type classification. Many approaches have been proposed over the years, with varying success. Recent work combines a Support Vector Machine (SVM) with a Convolutional Neural Network (CNN) and achieves almost perfect results on datasets of a few thousand images per class. However, datasets of chart images are primarily synthetic and lack real-world examples. To overcome the problem of small datasets, we use a Siamese CNN architecture for chart type classification; to our knowledge, this is the first report of doing so. Multiple network architectures are tested, and results for different dataset sizes are compared. The network is verified using few-shot learning (FSL). Many of the described advantages of Siamese CNNs are illustrated with examples. Finally, we show that the Siamese CNN can work with as little as one image per class, and that it achieves 100% average classification accuracy with 50 images per class, whereas the CNN achieves an average classification accuracy of only 43% on the same dataset.

Highlights

  • Which model achieves the highest average classification accuracy on chart images

  • This paper focuses on the classification of chart images using the Siamese Convolutional Neural Network (CNN), which, to our knowledge, has not been done before

  • The research shows that a Siamese CNN can be used for chart type classification

Summary

Introduction

The majority of data visualizations in use are “locked” inside documents, which can be digitized. These documents contain graphical and textual information linked together in one visual unit. The first challenge in retrieving information from digitized data visualization images is classifying the image into one of many existing chart classes. A CNN can achieve state-of-the-art results in many computer vision tasks. This method requires many images (often 1000 or more per class) to be successful. To deal with this problem, we introduce the Siamese CNN for chart type classification; to our knowledge, we are the first to do so.
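
To make the idea of classifying with one image per class concrete, the following is a minimal sketch of one-shot classification by pairwise comparison, which is the principle a Siamese network relies on. It is not the network from the paper: the `embed` function is a hand-written toy stand-in for the shared CNN branch, and the names `embed`, `distance`, `classify_one_shot`, `support`, and `query`, as well as the tiny 4x4 "images", are assumptions made only for illustration.

```python
import math


def embed(image):
    """Toy stand-in for the shared CNN branch (an assumption, not the paper's model).

    Maps an image, given as a list of equal-length rows of floats,
    to a feature vector of row means followed by column means.
    """
    row_means = [sum(row) / len(row) for row in image]
    col_means = [sum(col) / len(image) for col in zip(*image)]
    return row_means + col_means


def distance(image_a, image_b):
    """Euclidean distance between the embeddings of two images.

    Both images pass through the same embed function, mirroring the
    shared weights of the two branches of a Siamese network.
    """
    vec_a, vec_b = embed(image_a), embed(image_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(vec_a, vec_b)))


def classify_one_shot(query_image, support_set):
    """Return the label of the support image closest to the query.

    support_set maps each class label to a single example image.
    """
    return min(support_set, key=lambda label: distance(query_image, support_set[label]))


# Hypothetical 4x4 binary "chart images": one example per class.
support = {
    "bar": [[0, 0, 0, 1], [0, 0, 1, 1], [0, 1, 1, 1], [1, 1, 1, 1]],
    "line": [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]],
}

# A slightly different bar-like image that was not in the support set.
query = [[0, 0, 0, 1], [0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 1, 1]]

predicted = classify_one_shot(query, support)
```

In an actual Siamese CNN the embedding would be learned from image pairs so that images of the same chart type end up close together; the comparison step, however, would look much like `classify_one_shot` above, which is why a single reference image per class can be enough.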

Related Work
Results
The Model
The Dataset
The Architecture
Experiment Setup
Conclusions