Combining physical and network data for attack detection in water distribution networks
1 July 2024
Water distribution infrastructures increasingly incorporate IoT, in the form of sensing and computing power, to improve control over the system and adapt better to water demand. This evolution from physical to cyber-physical systems comes with an attack perimeter that extends into cyberspace. Detecting this new kind of attack is gaining traction in the scientific community. However, machine learning detection algorithms, which are showing encouraging results in cybersecurity applications, need training data as close as possible to real-world data in order to perform well in production environments. The availability of such data, with complexity on par with real-world infrastructures and with acquisitions from both the physical and cyber spaces, is a bottleneck for the development of machine learning algorithms. This paper addresses the problem by analyzing the currently available cyber-physical datasets in the water distribution field and by proposing a multi-layer comparison methodology to assess their complexity. The methodology evaluates dataset complexity along three major axes, namely attack scenarios, network topology, and network communications, allowing a precise look at the strengths and weaknesses of available datasets across a wide spectrum. The results show that currently available datasets each emphasize one aspect of real-world complexity but fall short on the others, highlighting the need for a more comprehensive approach in future work: datasets that increase complexity along several of these axes at the same time.