Abstract

This study investigates how people perceive the visual and auditory landscapes of a mixed-use urban environment. Audio and visual data were collected on local streets at different times of day using an audio recorder and camera setup. High- and low-level features were extracted from the collected audio and visual datasets using custom deep learning (DL) models and other standard algorithms. The collected data were used in a perception survey of human subjects (n = 73), in which individual perception was rated on eight auditory and six visual perceptual attributes. The survey results were then analysed in relation to the algorithmically extracted features. Finally, a 10 km street within the study area was selected, along which spatiotemporal street-level visual and auditory data were collected. Statistical analysis and machine learning modelling were performed on the surveyed dataset to predict human perception of the audio and visual scenes on the chosen street. The results identify the specific audio and visual features associated with individual perceptions; these relationships were then used to build prediction models, which in turn produced spatiotemporal visual and auditory perception maps.
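The abstract does not specify which features or models were used, so the following is only a minimal sketch of the kind of pipeline it describes: clip-level low-level audio features feeding a regression model that predicts a perceptual rating. The file paths, the placeholder ratings, and the choice of a random forest regressor are illustrative assumptions, not the authors' method.

```python
# Illustrative sketch only: feature set, file paths, ratings, and model
# choice are hypothetical, standing in for the unspecified pipeline.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

def low_level_audio_features(path: str) -> np.ndarray:
    """Summarise a clip with clip-level means of common low-level features."""
    y, sr = librosa.load(path, sr=22050)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr).mean()
    rms = librosa.feature.rms(y=y).mean()
    return np.concatenate([mfcc, [centroid, rms]])

# Hypothetical survey data: one row per clip, target = mean rating on a
# single perceptual attribute (e.g. "pleasant"), on a 1-5 scale.
clip_paths = [f"clips/clip_{i:03d}.wav" for i in range(200)]  # placeholder paths
ratings = np.random.uniform(1, 5, size=len(clip_paths))       # placeholder ratings

X = np.stack([low_level_audio_features(p) for p in clip_paths])
X_train, X_test, y_train, y_test = train_test_split(X, ratings, random_state=0)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
print("R^2 on held-out clips:", r2_score(y_test, model.predict(X_test)))
```

Applied per attribute and per street segment, a model of this shape is one plausible way to generate the spatiotemporal perception maps the abstract mentions, with predictions aggregated over location and time of day.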
