Abstract

Summary. We consider the problem of quantifying the degree of association between pairs of discrete event time series, with potential applications in forensic and cybersecurity settings. We focus in particular on the case where two associated event series exhibit temporal clustering such that the occurrence of one type of event at a particular time increases the likelihood that an event of the other type will also occur nearby in time. We pursue a non-parametric approach to the problem and investigate various score functions to quantify association, including characteristics of marked point processes and summary statistics of interevent times. Two techniques are proposed for assessing the significance of the measured degree of association: a population-based approach to calculating score-based likelihood ratios when a sample from a relevant population is available, and a resampling approach to computing coincidental match probabilities when only a single pair of event series is available. The methods are applied to simulated data and to two real-world data sets consisting of logs of computer activity and achieve accurate results across all data sets.

Highlights

  • Forensic analysis involves analysing observed evidence during a legal investigation

  • We focus in particular on the case where two associated event series exhibit temporal clustering such that the occurrence of one type of event at a particular time increases the likelihood that an event of the other type will occur nearby in time. We pursue a non-parametric approach to the problem and investigate various score functions to quantify association, including characteristics of marked point processes and summary statistics of interevent times

  • Should the relevant population be a sample from all individuals in general, or from everyone who matches the description of a suspect in a given region, or from some other group? (See Stern (2017) for additional discussion of this issue.) To address these potential problems we propose below a resampling approach that computes coincidental match probabilities (CMPs) by using only a single pair of event series

Summary

Introduction

Forensic analysis involves analysing observed evidence during a legal investigation, whether civil or criminal. In this general context we address the problem of developing methods to quantify the likelihood of observing a pair of event series (A, B) under different hypotheses regarding their source. Real-world pairs of event series M = (A, B) of user-generated event data can exhibit significant burstiness and inhomogeneity over time (e.g. as in Fig. 1), making it challenging to develop robust parametric models of association between A and B. For this reason we pursue non-parametric measures. We propose two techniques for assessing the significance of the measured association. The first is a population-based approach that calculates score-based likelihood ratios when a sample from a relevant population is available. The second is a resampling approach for when only a single pair M is available, i.e. we do not have access to a sample from a relevant population of realizations.
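
As a rough illustration of the single-pair setting, the Python sketch below scores a pair (A, B) by the mean time from each B event to its nearest A event, then estimates how often a score at least that small arises by chance. Everything here is an assumption made for illustration: the function names, the choice of score, and in particular the random circular time shift of B, which is a simple stand-in and not necessarily the resampling scheme used in the paper.

```python
import bisect
import random


def mean_nearest_gap(a_times, b_times):
    """Mean absolute time from each B event to its nearest A event.

    Smaller values suggest tighter temporal clustering of B around A.
    (Hypothetical score chosen for illustration only.)
    """
    if not a_times or not b_times:
        raise ValueError("both event series must be non-empty")
    a_sorted = sorted(a_times)
    total = 0.0
    for t in b_times:
        i = bisect.bisect_left(a_sorted, t)
        candidates = []
        if i < len(a_sorted):
            candidates.append(a_sorted[i] - t)   # nearest A event at or after t
        if i > 0:
            candidates.append(t - a_sorted[i - 1])  # nearest A event before t
        total += min(candidates)
    return total / len(b_times)


def coincidental_match_probability(a_times, b_times, period,
                                   n_resamples=1000, seed=0):
    """Estimate how often a shifted copy of B scores as well as the observed B.

    Assumption: B is resampled by a uniform circular shift within
    [0, period); this is a placeholder null model, not the paper's method.
    """
    rng = random.Random(seed)
    observed = mean_nearest_gap(a_times, b_times)
    hits = 0
    for _ in range(n_resamples):
        shift = rng.uniform(0, period)
        shifted = [(t + shift) % period for t in b_times]
        if mean_nearest_gap(a_times, shifted) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_resamples + 1)


if __name__ == "__main__":
    a = [10.0, 50.0, 90.0]
    b = [11.0, 52.0, 89.0]
    print(mean_nearest_gap(a, b))
    print(coincidental_match_probability(a, b, period=100.0))
```

A small estimated probability indicates that the observed closeness of B to A would rarely arise under the shifted copies. A circular shift preserves the burstiness within B while breaking its alignment with A, which is why it is used here as a simple null model.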

Background on approaches to assessing the strength of association
Measures of association
Assessing the degree of association
Population-based approach
Results
Conclusion