Purpose: Social media data contains a wealth of content related to customers’ reactions to, and comments on, firms’ performance. Through the lens of signaling theory, this paper investigates the use of social media data as a knowledge resource for communicating firms’ noncompliance risk to regulatory agencies.

Design/methodology/approach: This paper proposes a two-step social media analytics framework to detect noncompliant firms. First, it creates a context-specific dictionary that contains keywords relevant to firms’ noncompliant behaviors. Next, it extracts those keywords, together with customer sentiment and emotions, from customer reviews to predict firm noncompliance. The framework is tested in the context of food safety regulations.

Findings: The study identified over 100 words that are related to restaurants’ hygiene deficiencies. Using the occurrence of these words in customer reviews, as well as the sentiments and emotions expressed within them, the author’s best-performing model can identify nearly 90% of the restaurants that severely violated regulations.

Practical implications: After being processed by appropriate machine learning algorithms, customer reviews serve as valuable knowledge resources, enabling regulatory agencies to identify noncompliant firms. Regulatory agencies can use this model to complement the current compliance monitoring scheme.

Originality/value: This research contributes a novel methodology for creating a context-specific dictionary that keeps only the relevant words customers use when discussing firms’ noncompliant acts. In the absence of such an approach, numerous irrelevant signals would be included in the modeling process, thereby increasing the cost of social media analytics.
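
The keyword-occurrence step of the framework can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not list the dictionary or name the model, so the word set `HYGIENE_TERMS`, the function `keyword_features`, and the sample reviews below are all hypothetical. The sketch only shows how counts of dictionary words in a restaurant's reviews might be turned into a fixed-length feature vector; sentiment and emotion scores would be appended to this vector by whatever tool is used for them.

```python
import re
from collections import Counter

# Hypothetical dictionary: the paper's actual word list (over 100 terms)
# is not given in the abstract, so these entries are placeholders.
HYGIENE_TERMS = {"dirty", "roach", "sick", "smell", "filthy"}


def keyword_features(reviews, dictionary=HYGIENE_TERMS):
    """Count how often each dictionary term occurs across a restaurant's reviews."""
    counts = Counter()
    for review in reviews:
        # Lowercase and split into simple word tokens.
        tokens = re.findall(r"[a-z']+", review.lower())
        counts.update(t for t in tokens if t in dictionary)
    # Sorted term order gives every restaurant a vector of the same shape.
    return [counts[term] for term in sorted(dictionary)]


# Made-up reviews for a single restaurant.
reviews = [
    "The table was dirty and I saw a roach near the kitchen.",
    "Great food, but the restroom was dirty.",
]
features = keyword_features(reviews)
print(dict(zip(sorted(HYGIENE_TERMS), features)))
```

Restricting the counts to a context-specific dictionary keeps the feature vector short, which matches the abstract's point that irrelevant words would otherwise enter the modeling process.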