Web security has become one of the most prominent concerns in cybersecurity. Traditional rule-based methods for detecting web attacks rely on manually defined rules and pattern matching, so they often fail to identify new and intricate attack patterns. Machine learning techniques offer a promising way to address these limitations. This paper presents a web attack detection method based on honeypots and the logistic regression algorithm. Web logs captured by honeypots are cleaned, filtered, and analyzed, and the textual data they contain is then vectorized. A logistic regression classifier is trained and tested on the resulting text vectors to produce a detection model. This model is then used to classify newly generated web logs, enabling dynamic web attack detection. Experiments on the collected datasets compare the proposed method with a support vector machine approach. The results show that the method detects and recognizes web attack behaviors quickly and accurately without sacrificing performance.
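The pipeline summarized above (vectorize log text, train a logistic regression classifier, then score new log entries) can be illustrated with a minimal sketch. It is not the paper's implementation: the request lines are hypothetical hand-written examples rather than honeypot data, the binary bag-of-tokens vectorization and full-batch gradient descent are assumed choices, and all names (`tokenize`, `vectorize`, `fit`, `classify`) are illustrative.

```python
import math
import re

# Hypothetical request lines standing in for cleaned honeypot logs
# (label 1 = attack, 0 = benign). Illustrative only, not the paper's dataset.
TRAIN = [
    ("/index.php?id=1' OR '1'='1", 1),
    ("/search?q=<script>alert(1)</script>", 1),
    ("/download?file=../../../../etc/passwd", 1),
    ("/item.php?id=1 UNION SELECT username,password FROM users", 1),
    ("/index.php?id=42", 0),
    ("/search?q=running+shoes", 0),
    ("/download?file=report.pdf", 0),
    ("/item.php?id=7&sort=price", 0),
]


def tokenize(request):
    """Lowercase the request and split it into words and single symbols."""
    return re.findall(r"[a-z0-9]+|[^a-z0-9\s]", request.lower())


def build_vocab(requests):
    """Sorted list of every token seen in the training requests."""
    return sorted({tok for r in requests for tok in tokenize(r)})


def vectorize(request, vocab):
    """Binary bag-of-tokens vector; tokens outside the vocabulary are ignored."""
    present = set(tokenize(request))
    return [1.0 if tok in present else 0.0 for tok in vocab]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(x, w, b):
    """Estimated probability that feature vector x is an attack."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit(X, y, lr=0.2, epochs=2000):
    """Full-batch gradient descent on the mean logistic loss (assumed settings)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        errs = [predict_proba(xi, w, b) - yi for xi, yi in zip(X, y)]
        for j in range(d):
            w[j] -= lr * sum(e * xi[j] for e, xi in zip(errs, X)) / n
        b -= lr * sum(errs) / n
    return w, b


def classify(request, vocab, w, b, threshold=0.5):
    """Return 1 (attack) or 0 (benign) for a raw request line."""
    return int(predict_proba(vectorize(request, vocab), w, b) >= threshold)


requests = [r for r, _ in TRAIN]
labels = [lab for _, lab in TRAIN]
VOCAB = build_vocab(requests)
W, B = fit([vectorize(r, VOCAB) for r in requests], labels)

# Score previously unseen (also hypothetical) requests with the trained model.
for new_request in ["/login.php?user=admin' OR '1'='1", "/search?q=blue+jacket"]:
    p = predict_proba(vectorize(new_request, VOCAB), W, B)
    print(f"{new_request!r}: attack probability {p:.2f}")
```

With only eight training lines the model mostly memorizes which symbols and keywords appear in each class, so the probabilities it prints for unseen requests are not meaningful as detection results. A realistic setup would train on a large labeled log corpus and would likely use richer features, such as character n-grams or TF-IDF weights.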