A novel privacy preserving user identification approach for network traffic

N Clarke, F Li, S Furnell

doi:10.1016/j.cose.2017.06.012


Open Access

https://doi.org/10.1016/j.cose.2017.06.012


Abstract

The prevalence of the Internet and cloud-based applications, alongside the technological evolution of smartphones, tablets and smartwatches, has resulted in users relying upon network connectivity more than ever before. This results in an increasingly voluminous footprint with respect to the network traffic that is created as a consequence. For network forensic examiners, this traffic represents a vital source of independent evidence in an environment where anti-forensics is increasingly challenging the validity of computer-based forensics. Performing network forensics today largely focuses upon an analysis based upon the Internet Protocol (IP) address – as this is the only characteristic available. More typically, however, investigators are not actually interested in the IP address but rather the associated user (whose account might have been compromised). Moreover, given the range of devices (e.g., laptop, mobile, and tablet) that a user might be using and the widespread use of DHCP, the IP address is not a reliable and consistent means of understanding the traffic from a user. This paper presents a novel approach to the identification of users from network traffic using only the meta-data of the traffic (i.e., rather than the payload) and the creation of application-level user interactions, which are proven to provide a far richer discriminatory feature set to enable more reliable identity verification. A study involving data collected from 46 users over a two-month period, which generated over 112 GB of meta-data traffic, was undertaken to examine the novel user-interaction based feature extraction algorithm. On an individual application basis, the approach can achieve recognition rates of 90%, with some users experiencing recognition performance of 100%. The consequence of this recognition is an enormous reduction in the volume of traffic an investigator has to analyse, allowing them to focus upon a particular suspect or enabling them to disregard traffic and focus upon what is left.

Highlights

During the past 15 years, Internet usage has experienced explosive growth and technological evolution – from a simple data network with around 500 million users to a multipurpose and multiservice platform with almost 3.2 billion users (Internetlivestats, 2015).
To provide scientific rigour and statistical reliability, the following criteria were established: (a) the dataset must contain a sufficient number of participants to provide a basis for identifying them; (b) the dataset must contain sufficient samples across a prolonged period in order to ensure identification performance can be maintained; (c) all network traffic meta-data from all participants is to be collected; (d) the Internet Protocol (IP) address and user must be fixed for the complete duration in order to provide a ground truth against which to label the interactions and calculate the performance.
This paper has presented and evaluated a novel feature extraction approach for network traffic that provides robust user identification.

Summary

Introduction

During the past 15 years, Internet usage has experienced explosive growth and technological evolution – from a simple data network with around 500 million users to a multipurpose and multiservice platform with almost 3.2 billion users (Internetlivestats, 2015). Studies into behavioural profiling on desktop and mobile platforms have demonstrated the ability to verify an individual; deriving application-level interactions (such as which websites users visit and more importantly what they do whilst visiting – posting, chatting, listening to music or watching video) from low-level encrypted packet-based data has proven challenging. Using these application-based interactions for identification rather than verification introduces a need for stronger discriminative information.

Prior art in network and behavioural profiling

Packet based network analysis method

Flow based network analysis approach

Biometric-based behavioural profiling

Deriving user interactions from network metadata

Network data collection dataset

Data collection

Data pre-processing

User identification via network interactions

Preliminary experiment: classification configuration

Experiment

Findings

Discussion

Conclusion and future work

Journal: Computers & Security	Publication Date: Jul 10, 2017
Citations: 19	License type: cc-by
