Abstract

Relational databases store and organize sensitive data for organizations and individuals. Although database security mechanisms and network intrusion detection systems (IDSs) exist, they have been found inadequate or unsuitable for detecting threats directed specifically at the database application layer. An IDS designed specifically for the database is therefore needed. In this paper, we propose a random forest with weighted voting (WRF), combined with principal components analysis (PCA) as a feature selection technique, for detecting database access anomalies, assuming that the database has a role-based access control (RBAC) model in place. PCA produces uncorrelated and relevant features and, at the same time, reduces dimensionality for easier integration with large databases. The random forest (RF) exploits the inherent tree-structured syntax of SQL queries, and its weighted voting scheme further reduces false alarms. Experiments showed that WRF not only improves false-positive and false-negative rates but is also fast in both model building and anomaly detection. Moreover, for a given query, RF classification accuracy was found to be significantly affected by the type of command and the tables accessed, which in turn explains the confusion between some role classes. Lastly, both RF and PCA outperform other state-of-the-art data mining techniques for database anomaly detection, and WRF achieved the best performance, even on highly skewed data.
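The weighted voting idea can be illustrated with a minimal sketch. The abstract does not say how the per-tree weights are derived or how the predicted role is turned into an alarm, so both are assumptions here: each tree is given a hypothetical non-negative weight (for example, its accuracy on held-out queries), and a query is flagged when the weighted-majority role differs from the role the user claims. The names `weighted_vote` and `is_anomalous` are illustrative, not taken from the paper.

```python
from collections import defaultdict


def weighted_vote(tree_predictions, tree_weights):
    """Return the role label whose supporting trees carry the most total weight.

    tree_predictions: one predicted role label per tree.
    tree_weights: one non-negative weight per tree (assumed, e.g. the
    tree's accuracy on held-out queries).
    """
    if len(tree_predictions) != len(tree_weights):
        raise ValueError("one weight per tree is required")
    totals = defaultdict(float)
    for label, weight in zip(tree_predictions, tree_weights):
        totals[label] += weight
    # Iterate labels in sorted order so ties resolve deterministically.
    return max(sorted(totals), key=lambda label: totals[label])


def is_anomalous(claimed_role, tree_predictions, tree_weights):
    """Flag a query when the forest's weighted vote disagrees with the claimed role."""
    return weighted_vote(tree_predictions, tree_weights) != claimed_role
```

With unweighted majority voting, two weak trees can outvote one reliable tree; weighting lets the reliable tree dominate, which is one plausible way a weighting scheme could reduce false alarms.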
