Abstract

Background: Instagram, with millions of posts per day, can be used to inform public health surveillance targets and policies. However, current research using image-based data often relies on hand coding of images, which is time-consuming and costly, ultimately limiting the scope of the study. Current best practices in automated image classification (eg, support vector machine (SVM), backpropagation neural network, and artificial neural network) are limited in their capacity to accurately distinguish between objects within images.

Objective: This study aimed to demonstrate how a convolutional neural network (CNN) can be used to extract unique features within an image and how SVM can then be used to classify the image.

Methods: Images of waterpipes or hookah (an emerging tobacco product with harms similar to those of cigarettes) were collected from Instagram and used in the analyses (N=840). A CNN was used to extract unique features from images identified to contain waterpipes. An SVM classifier was built to distinguish between images with and without waterpipes. Methods for image classification were then compared to show how a CNN+SVM classifier could improve accuracy.

Results: As the number of validated training images increased, the total number of extracted features increased. In addition, as the number of features learned by the SVM classifier increased, the average level of accuracy increased. Overall, 99.5% (418/420) of images classified were correctly identified as either hookah or nonhookah images. This level of accuracy was an improvement over earlier methods that used SVM, CNN, or bag-of-features alone.

Conclusions: A CNN extracts more features of images, allowing an SVM classifier to be better informed, resulting in higher accuracy compared with methods that extract fewer features. Future research can use this method to grow the scope of image-based studies.
The methods presented here might help detect increases in the popularity of certain tobacco products over time on social media. By taking images of waterpipes from Instagram, we place our methods in a context that can be utilized to inform health researchers analyzing social media to understand user experience with emerging tobacco products and inform public health surveillance targets and policies.
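The two-stage idea described in the abstract (convolutional feature extraction followed by an SVM classifier) can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration and not the study's implementation: the two hand-written 3x3 filters stand in for kernels a real CNN would learn, the synthetic 6x6 "images" (vertical bar versus horizontal bar) stand in for hookah and nonhookah photos, and the linear SVM is fitted with plain subgradient descent on the hinge loss.

```python
import random

# Hypothetical fixed 3x3 filters standing in for kernels a real CNN would learn.
VERTICAL = [[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]]
HORIZONTAL = [[-1, -1, -1], [2, 2, 2], [-1, -1, -1]]


def conv_max(image, kernel):
    """Slide the kernel over the image, apply ReLU, and keep the global maximum."""
    n, k = len(image), len(kernel)
    best = 0.0  # starting at 0 acts as the ReLU floor
    for r in range(n - k + 1):
        for c in range(n - k + 1):
            response = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            best = max(best, response)
    return best


def extract_features(image):
    """Stage 1: map an image to a short feature vector (one value per filter)."""
    return [conv_max(image, VERTICAL), conv_max(image, HORIZONTAL)]


def make_image(label, rng, size=6):
    """Synthetic stand-in data: +1 images hold a vertical bar, -1 a horizontal bar."""
    img = [[rng.uniform(0.0, 0.1) for _ in range(size)] for _ in range(size)]
    pos = rng.randrange(1, size - 1)
    for t in range(size):
        if label == 1:
            img[t][pos] = 1.0
        else:
            img[pos][t] = 1.0
    return img


def train_linear_svm(X, y, epochs=50, lr=0.01, lam=0.01):
    """Stage 2: linear SVM fitted by subgradient descent on the regularized hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]  # L2 shrinkage
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Classify a feature vector by the sign of the linear decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def run_demo(seed=0, n_train=40, n_test=20):
    """Train on synthetic images and return accuracy on held-out synthetic images."""
    rng = random.Random(seed)
    y_train = [1 if i % 2 == 0 else -1 for i in range(n_train)]
    y_test = [1 if i % 2 == 0 else -1 for i in range(n_test)]
    X_train = [extract_features(make_image(lbl, rng)) for lbl in y_train]
    X_test = [extract_features(make_image(lbl, rng)) for lbl in y_test]
    w, b = train_linear_svm(X_train, y_train)
    correct = sum(predict(w, b, x) == lbl for x, lbl in zip(X_test, y_test))
    return correct / n_test


print(f"held-out accuracy: {run_demo():.3f}")
```

The point of the split is that the classifier never sees raw pixels: it only sees the feature vector, so richer features give the SVM more to separate on. On real photographs, stage 1 would presumably be a trained CNN and stage 2 a library SVM; the toy version above only shows how the two stages connect.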

Highlights

  • Instagram, with millions of posts per day,[1] can be used to inform public health surveillance targets and policies

  • Results demonstrated that hookah features could be extracted by a convolutional neural network (CNN), with image categories classified by the SVM, maintaining a high level of accuracy

  • Compared to earlier work using CNN, SVM, or Bag of Features (BOF) alone, our method improves accuracy as the number of training images increases, reaching 99.5%


Summary

Introduction

With millions of posts per day, Instagram[1] can be used to inform public health surveillance targets and policies. Images from social media may be more useful than findings from text-based platforms alone (e.g., Twitter, Reddit) when attempting to understand health behaviors, such as user experiences with emerging tobacco products.[4] While automated image classification is useful for large-scale tasks (e.g., processing and assigning labels to millions of images), current best practices are limited in their capacity to accurately distinguish between objects within images.[5][6][7] In this study, 99.5% of the 420 images classified were correctly identified as either hookah or non-hookah images. This level of accuracy was an improvement over earlier methods that used SVM, CNN, or Bag of Features (BOF) alone. By taking images of waterpipes from Instagram, we place our methods in a context that health researchers analyzing social media can use to understand user experience with emerging tobacco products and to inform public health surveillance targets and policies.

