Abstract

Protecting critical infrastructure such as water treatment and distribution systems is crucial for a functioning economy. The cyber-physical systems used in this infrastructure expose numerous vulnerabilities to attackers. Intrusion detection systems play a key role in limiting the damage from successful attacks. Machine learning can further enhance security by analysing data patterns, but several attributes of the data can degrade model performance. Data from critical water infrastructure are difficult to work with because of their complexity, variability, irregularity, and sensitivity. They comprise many kinds of measurements and vary over time with environmental conditions and operational changes. Irregular patterns and small changes can significantly affect analysis and decision making, so effective data preprocessing is needed to handle these complexities and ensure accurate analysis. This paper explores data preprocessing techniques using a water treatment system dataset as a case study and presents techniques specific to industrial control system data that yield a more informative dataset. The results show significant improvements in accuracy, F1 score, and time to detection when using the preprocessed dataset.
