Abstract

An outlier detection method may be considered fair over specified sensitive attributes if the results of outlier detection are not skewed toward particular groups defined on such sensitive attributes. In this paper, we consider the task of fair outlier detection. Our focus is on fair outlier detection over multiple multi-valued sensitive attributes (e.g., gender, race, religion, nationality and marital status, among others), a task with broad applications across modern data scenarios. We propose a fair outlier detection method, FairLOF, that is inspired by the popular LOF formulation for neighborhood-based outlier detection. We outline ways in which unfairness could be induced within LOF and develop three heuristic principles to enhance fairness, which form the basis of the FairLOF method. Since fair outlier detection is a novel task, we develop an evaluation framework for it, and use that to benchmark FairLOF on the quality and fairness of results. Through an extensive empirical evaluation over real-world datasets, we illustrate that FairLOF is able to achieve significant improvements in fairness, at sometimes only marginal degradations in result quality, as measured against the fairness-agnostic LOF method. We also show that a generalization of our method, named FairLOF-Flex, opens possibilities of deepening fairness in outlier detection beyond what is offered by FairLOF.

Highlights

  • There has been much recent interest in incorporating fairness constructs into data analytics algorithms, within the broader theme of algorithmic fairness [12]

  • To assess the quality of FairLOF's results, we measure how well they align with those of the fairness-agnostic Local Outlier Factor (LOF)

  • Fairness is of immense importance in this day and age, when data analytics in general, and outlier detection in particular, is used to make and influence decisions that significantly affect human lives, especially within web data scenarios that operate at scale


Summary

Introduction

There has been much recent interest in incorporating fairness constructs into data analytics algorithms, within the broader theme of algorithmic fairness [12]. In this paper, we explore the task of fairness in outlier detection, an analytics task of wide applicability in myriad scenarios. Identification of non-mainstream behavior, the high-level task that outlier detection accomplishes, has a number of applications in new age data scenarios. Immigration officials at airports might want to carry out detailed checks on ‘suspicious’ people, while AI is likely used in proactive policing to identify ‘suspicious’ people for stop-and-frisk checks. In this age of pervasive digitization, ‘abnormality’ in health, income or mobility patterns may invite proactive checks from healthcare or taxation authorities. FairLOF builds on the popular Local Outlier Factor (LOF) formulation for neighborhood-based outlier detection. LOF comprises three phases, each computing a value for each object in X, progressively leading to the final score: (i) k-distance, (ii) local reachability density (LRD), and (iii) local outlier factor (LOF).
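As an illustration, the sketch below implements the three LOF phases in plain Python/NumPy. It follows the standard LOF definition rather than the FairLOF variant described in the paper; the function name lof_scores, the brute-force distance computation, and the handling of k-distance ties (ignored for simplicity) are all illustrative assumptions, not the authors' implementation.

    import numpy as np

    def lof_scores(X, k=5):
        """Three-phase LOF sketch: (i) k-distance, (ii) LRD, (iii) LOF."""
        n = X.shape[0]
        # Brute-force pairwise Euclidean distances (O(n^2); a real
        # implementation would use a spatial index instead).
        D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        np.fill_diagonal(D, np.inf)                 # exclude self-distance

        # Phase (i): k-nearest-neighbour sets and k-distance.
        knn = np.argsort(D, axis=1)[:, :k]          # indices of the k NNs
        k_dist = D[np.arange(n), knn[:, -1]]        # distance to the k-th NN

        # Phase (ii): local reachability density,
        # lrd(p) = 1 / mean over o in N_k(p) of max(k_dist(o), d(p, o)).
        lrd = np.empty(n)
        for p in range(n):
            reach = np.maximum(k_dist[knn[p]], D[p, knn[p]])
            lrd[p] = k / reach.sum()

        # Phase (iii): LOF(p) = mean ratio of neighbours' LRD to p's own.
        return np.array([lrd[knn[p]].mean() / lrd[p] for p in range(n)])

For an object p, a score near 1 indicates that its local density is comparable to that of its neighborhood, while a score substantially above 1 flags p as an outlier.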
