Abstract

The widespread use of deception in online sources has motivated the need for methods to automatically profile and identify deceivers. This work explores deception, gender, and age detection in short texts using a machine learning approach. First, we collect a new open-domain deception dataset that also contains demographic data such as gender and age. Second, we extract feature sets including n-grams, shallow and deep syntactic features, semantic features, and syntactic complexity and readability metrics. Third, we build classifiers that aim to predict deception, gender, and age. Our findings show that while deception detection can be performed in short texts even in the absence of a predetermined domain, gender and age prediction in deceptive texts is a challenging task. We further explore the linguistic differences in deceptive content that relate to deceivers' gender and age, and find evidence that both age and gender play an important role in people’s word choices when fabricating lies.

Summary

Introduction

Given the potential ethical and security risks associated with deceitful interactions, it is important to build computational tools that can detect deceivers and provide insights into the nature of deceptive behaviors. Information about deceivers' demographics could be useful, as recent studies have shown that online users frequently lie about their appearance, gender, age, or even education level. We present a study on deception detection in an open domain, along with an analysis of deceptive behavior in association with gender and age. We also analyze the topics discussed by deceivers given their age and gender, based on the assumption that, when lying in an open-domain setting, deceivers will show a natural bias towards specific topics related to gender and age.

Related work
Open Domain Deception Dataset
Features
Analyzing Language Used by Deceivers Given Age and Gender
Findings
Conclusions