Abstract

The aim of the paper was to present a methodology for imputing missing sound level data, covering a period of several months, at many noise monitoring stations located along thoroughfares, using a single model that describes the variability of sound level within the tested period. To build the model, a suitable set of input attributes was first developed, and a training dataset was prepared from equivalent sound levels recorded at one of the thoroughfares. Sound level values in the training data were calculated separately for the following sub-intervals of the 24-hour day: day (6–18), evening (18–22) and night (22–6). Next, a computational intelligence approach called Random Forest was applied to build the model with the aid of Weka software. Scaling functions were then developed, and the resulting Random Forest model was used, together with these scaling functions, to impute data at two other locations in the same city. A statistical analysis of the sound levels at these locations over the whole year, before and after imputation, was carried out.
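The split into day, evening and night sub-intervals can be illustrated with a minimal Python sketch. It assumes equal-duration readings (for example hourly values) and uses the standard energetic average for equivalent sound levels, L_eq = 10·log10(mean(10^(L/10))). The function names and the input layout are hypothetical and are not taken from the paper, which may aggregate its recordings differently.

```python
import math

def sub_interval(hour):
    """Map an hour of day (0-23) to the sub-intervals named in the abstract."""
    if 6 <= hour < 18:
        return "day"
    if 18 <= hour < 22:
        return "evening"
    return "night"  # 22-6, spanning midnight

def equivalent_level(levels_db):
    """Energetic average of equal-duration sound levels given in dB."""
    energies = [10 ** (level / 10) for level in levels_db]
    return 10 * math.log10(sum(energies) / len(energies))

def levels_by_sub_interval(readings):
    """readings: iterable of (hour, level_db) pairs; level_db is None if missing.

    Returns one equivalent level per sub-interval, or None when a
    sub-interval has no valid readings (assumed convention).
    """
    groups = {"day": [], "evening": [], "night": []}
    for hour, level in readings:
        if level is not None:
            groups[sub_interval(hour)].append(level)
    return {
        name: equivalent_level(values) if values else None
        for name, values in groups.items()
    }
```

A sub-interval that comes back as `None` is a gap that an imputation model would then have to fill.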



Summary

Introduction

Missing values in measurement data always hamper the interpretation of results, regardless of the area of research [1]. For time series imputation specifically, models can be built with autoregressive and computational intelligence (CI) methods [8]. Examples of machine learning and computational intelligence methods used for missing data imputation include [12] K-nearest neighbor (KNN), fuzzy K-means (FKM), singular value decomposition (SVD), and Bayesian principal component analysis (BPCA), as well as regression trees [1] such as classification and regression trees (CART) [13] or Cubist [14].
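To make one of the listed techniques concrete, the following is a minimal sketch of K-nearest neighbor imputation in plain Python. It is not the Random Forest model developed in the paper. The record layout (a feature tuple paired with a value) and the function name `knn_impute` are assumptions made for illustration, and the neighbors are combined with a simple arithmetic mean.

```python
import math

def knn_impute(records, k=3):
    """Fill missing values with the mean of the k nearest known records.

    records: list of (features, value) pairs, where features is a tuple of
    numbers and value is None when missing (hypothetical layout).
    Returns the list of values with the gaps filled.
    """
    known = [(f, v) for f, v in records if v is not None]
    if not known:
        raise ValueError("at least one record with a known value is required")
    filled = []
    for features, value in records:
        if value is not None:
            filled.append(value)
            continue
        # Rank known records by Euclidean distance in feature space.
        nearest = sorted(known, key=lambda fv: math.dist(fv[0], features))[:k]
        filled.append(sum(v for _, v in nearest) / len(nearest))
    return filled
```

In a noise monitoring setting the features might be attributes such as hour of day or day of week. Which attributes are actually informative is a modeling decision that this sketch does not address.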

Measurement data used for building the model
Elaborated model
Generalization of the model by using scaling functions
Conclusions
