Abstract

A book cover can convey a lot about the content of the book. Despite the adage to not evaluate something based on outward appearances, we apply machine learning to see if we can, in fact, judge a book by its cover, or more specifically by its cover art and text. Classification was performed in three settings: cover image only, cover text only, and both image and text in a multimodal approach. Image classification was done using transfer learning with Inception-v3. To detect text in the cover image, images were first converted to greyscale and different thresholds were applied to detect as much text as possible. This text was then vectorized and used to train a Multinomial Naïve Bayes model. We also trained custom CNNs for the image and text modalities. For multimodal classification, we examine a late fusion model, where the modalities are combined at the decision level, and an early fusion model, where the modalities are combined at the feature level. Our results show that the late fusion model performs best in our setting. We also observe that text is more informative for genre prediction and that significant further effort is needed to solve the image-based classification task to a satisfactory level. This research can be used to aid the product design process by revealing underlying information. It could also be used in recommender systems and, through automatic genre suggestion, to support promotion and sales processes.
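The difference between the two fusion strategies named above can be illustrated with a minimal sketch. It is not the authors' implementation: the genre names, the probability values, the `text_weight` parameter, and the weighted-average rule are all assumptions made for illustration, since the abstract does not state how the decisions or features are actually combined.

```python
from typing import Dict, List


def late_fusion(
    image_probs: Dict[str, float],
    text_probs: Dict[str, float],
    text_weight: float = 0.5,  # hypothetical weighting, not from the paper
) -> Dict[str, float]:
    """Decision-level fusion: combine the class probabilities produced by
    two separately trained models (one per modality)."""
    genres = image_probs.keys() | text_probs.keys()
    fused = {
        g: (1.0 - text_weight) * image_probs.get(g, 0.0)
        + text_weight * text_probs.get(g, 0.0)
        for g in genres
    }
    total = sum(fused.values())
    # Renormalise so the fused scores again sum to 1.
    return {g: p / total for g, p in fused.items()} if total else fused


def early_fusion(image_features: List[float], text_features: List[float]) -> List[float]:
    """Feature-level fusion: concatenate the two feature vectors so that a
    single downstream classifier sees both modalities at once."""
    return list(image_features) + list(text_features)


def predict(probs: Dict[str, float]) -> str:
    """Return the genre with the highest fused probability."""
    return max(probs, key=probs.get)


# Hypothetical outputs of an image model and a text model for one cover.
image_probs = {"romance": 0.5, "thriller": 0.3, "cookbook": 0.2}
text_probs = {"romance": 0.2, "thriller": 0.7, "cookbook": 0.1}

fused = late_fusion(image_probs, text_probs)
print(predict(fused))
```

In late fusion each modality keeps its own model and only the outputs are merged, so a weak image model can be down-weighted by raising `text_weight`; in early fusion the concatenated vector would be passed to one classifier trained on both modalities jointly.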

Highlights

  • We examine the outcomes of unimodal classification (Multinomial Naïve Bayes and a custom CNN for text classification; Inception-v3 [6] transfer learning and a custom-designed CNN for image classification) and compare the performance of different multimodal techniques

  • We review prior work on genre classification of comic books and of movies based on their posters, since these tasks are analogous to ours

  • We trained models to predict the genre of a book from the visual and textual modalities of its cover and show that automatic recognition can relate book cover images to genre


Summary

Introduction

Since the invention of the printing press in the 15th century, books have become a widespread method of retaining and sharing information. Even with the recent trend towards electronic devices, reading continues to be practised and encouraged. Thousands of books are published worldwide, which makes it a daunting task for a new book to be noticed and acquire a significant readership. Visual design gives significant impressions to transmit.

Revised Manuscript Received on July 22, 2020.

