Abstract
Outlier detection is an important task in data-driven geotechnics. Many existing outlier detection methods (e.g. Bayesian learning, neural networks) may pose computational challenges for conventional geotechnical practitioners. To address this, this study proposes a simple, fast, and explainable method for detecting outliers in geotechnical databases. The principle of the method is that, for a data point to be labelled as a potential outlier, the probability of observing a data point as extreme as that value should be low. To account for outliers in both the left and right tails, the skewness of the dataset is incorporated, and an indicator referred to as the outlier score (> 0) is assigned to each data point. The method also provides another indicator (referred to as the dimensional outlier score) to identify which dimensions/soil properties contribute to the outlierness; hence, it is explainable. The method does not require any time-consuming learning or sampling procedures; hence, it is fast and practitioner-friendly. Multiple numerical examples are first used to demonstrate the capability of the method. Finally, four publicly available geotechnical databases are used to demonstrate the outlier detection task. The results suggest that the outliers identified using the proposed method can be meaningful from a geotechnical point of view.
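The principle stated in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation, which the abstract does not give; it is one plausible reading under stated assumptions: tail probabilities are estimated from the empirical distribution of each soil property, the sign of the sample skewness selects which tail is used for that property, the dimensional score is the negative log of the selected tail probability, and the outlier score is the sum of the dimensional scores. The function name `outlier_scores` and the toy data are hypothetical. Note that under these assumptions a score of exactly zero is possible, whereas the abstract describes the score as strictly positive.

```python
import numpy as np

def outlier_scores(X):
    """Illustrative tail-probability outlier scores (assumed formulation).

    X is an (n_samples, n_dims) array with no constant columns.
    Returns (scores, dim_scores): the per-point total and per-dimension parts.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Empirical tail probabilities per column:
    # left[i, j]  = fraction of samples whose value is <= X[i, j]
    # right[i, j] = fraction of samples whose value is >= X[i, j]
    left = (X[None, :, :] <= X[:, None, :]).sum(axis=1) / n
    right = (X[None, :, :] >= X[:, None, :]).sum(axis=1) / n

    # Sample skewness of each column (third standardised moment).
    centred = X - X.mean(axis=0)
    skew = (centred ** 3).mean(axis=0) / X.std(axis=0) ** 3

    # Assumption: a negatively skewed column has its long tail on the left,
    # so the left-tail probability is used there; otherwise the right tail.
    tail = np.where(skew < 0, left, right)

    # A low tail probability gives a high dimensional score.
    dim_scores = -np.log(tail)
    scores = dim_scores.sum(axis=1)
    return scores, dim_scores

# Hypothetical toy data: 5 records, 2 soil properties.
X = np.array([[1.0, 2.0],
              [2.0, 5.0],
              [3.0, 4.0],
              [4.0, 1.0],
              [100.0, 3.0]])
scores, dim_scores = outlier_scores(X)
most_suspect = int(np.argmax(scores))               # record with highest score
driver = int(np.argmax(dim_scores[most_suspect]))   # property driving it
print(most_suspect, driver)
```

Because the sketch only counts and ranks values, it needs no training or sampling step, which is consistent with the speed claim in the abstract. The per-dimension scores are what make the result explainable: for a flagged record, the column with the largest dimensional score indicates which soil property contributes most.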
More From: Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards