DNS tunneling is an attempt to create a hidden communication channel through the Domain Name System (DNS). Such a tunnel jeopardizes the targeted network and opens the door to unauthorized access, remote control, and data exfiltration. The information security research community has proposed a variety of techniques to detect such tunnels. Most of these efforts rely on machine learning over features of tunneling traffic, such as the length, size, and entropy of the DNS query. More recently, analysis of the lexical content of the DNS query has been introduced and has shown remarkable performance. This paper examines the role of the Long Short-Term Memory (LSTM) model in DNS lexical analysis. Two benchmark DNS datasets are used. A character mapping mechanism replaces every possible character with an integer, and the mapped representation is then fed into an LSTM model for DNS tunneling detection. The proposed method obtained a weighted average F1-score of 98% on both datasets. These results are competitive with the state of the art and demonstrate the efficacy of lexical analysis for the DNS tunneling detection task.
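The character mapping step described above could be sketched roughly as follows. This is an illustration only, not the paper's implementation: the alphabet, the reserved padding and unknown-character indices, the function name `encode_query`, and the fixed sequence length are all assumptions introduced here.

```python
import string
from typing import List

# Assumed alphabet: lowercase letters, digits, and a few symbols seen in DNS names.
ALPHABET = string.ascii_lowercase + string.digits + "-._"

# Assumed reserved indices: 0 for padding, 1 for any character outside the alphabet.
PAD, UNK = 0, 1
CHAR_TO_INT = {ch: i + 2 for i, ch in enumerate(ALPHABET)}


def encode_query(query: str, max_len: int = 64) -> List[int]:
    """Map each character of a DNS query to an integer, then pad or truncate."""
    # Lowercase first (DNS names are case-insensitive), then cut to max_len.
    ids = [CHAR_TO_INT.get(ch, UNK) for ch in query.lower()[:max_len]]
    # Right-pad with PAD so every sequence has the same length.
    return ids + [PAD] * (max_len - len(ids))


encoded = encode_query("mail.example.com", max_len=20)
print(encoded)
```

Fixed-length integer sequences of this kind are the usual input to an embedding layer placed in front of an LSTM; whether the paper uses an embedding layer or another encoding is not stated in the abstract.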