Abstract
Rapid urbanization presents significant challenges in energy consumption, noise control, and environmental sustainability. Smart cities aim to address these issues by leveraging information technologies to enhance operational efficiency and urban liveability. In this context, urban sound recognition supports environmental monitoring and public safety. This study provides a comparative evaluation of three machine learning model types for classifying urban sounds: convolutional neural networks (CNNs), long short-term memory (LSTM) networks, and dense neural networks (Dense). The analysis used the UrbanSound8K dataset, a static benchmark designed for environmental sound classification, with mel-frequency cepstral coefficients (MFCCs) extracted as the core acoustic features. The models were evaluated within a fog computing architecture simulated on AWS, an approach chosen for its potential to reduce latency and optimize bandwidth in future real-time sound-recognition applications. Although real-time data were not used, the simulated setup effectively assessed model performance under conditions relevant to smart city deployments. Measured by macro and weighted F1-scores, the CNN model achieved the best performance at 90%, followed by the Dense model at 84% and the LSTM model at 81%, with the LSTM showing limitations in distinguishing overlapping sound categories. These simulations demonstrated that the framework supports efficient urban sound recognition within a fog-enabled architecture, underscoring its potential for real-time environmental monitoring and public safety applications.
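The abstract outlines, but does not detail, the MFCC-based feature pipeline and classifier training. The sketch below is a minimal, hypothetical illustration of that workflow using librosa and Keras; the MFCC settings, the CNN architecture, and the evaluation code are assumptions made for illustration, not the paper's reported configuration. Only the overall pattern (MFCC features, a CNN classifier, and macro/weighted F1-scores) follows the abstract.

```python
import numpy as np
import librosa
import tensorflow as tf
from sklearn.metrics import f1_score

N_MFCC = 40      # number of MFCC coefficients per frame (assumed, not from the paper)
N_CLASSES = 10   # UrbanSound8K defines 10 urban sound classes

def extract_mfcc(path: str, n_mfcc: int = N_MFCC) -> np.ndarray:
    """Load one audio clip and return a fixed-length MFCC feature vector."""
    signal, sr = librosa.load(path, sr=22050)                  # resample to a common rate
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return np.mean(mfcc.T, axis=0)                             # average over time frames

def build_cnn(input_len: int = N_MFCC, n_classes: int = N_CLASSES) -> tf.keras.Model:
    """A small 1-D CNN over the MFCC vector; purely illustrative architecture."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(input_len, 1)),
        tf.keras.layers.Conv1D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling1D(2),
        tf.keras.layers.Conv1D(64, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# After training, the abstract's evaluation metrics can be computed as:
#   y_pred = model.predict(X_test).argmax(axis=1)
#   macro_f1 = f1_score(y_test, y_pred, average="macro")
#   weighted_f1 = f1_score(y_test, y_pred, average="weighted")
```

Averaging MFCC frames into a single vector is only one possible design choice; the paper's models (particularly the LSTM) may instead operate on the full time series of MFCC frames.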