Abstract
Human collaboration strongly affects the performance of multi-person activities. Analyzing who speaks and when makes it possible to extract detailed human collaboration data. Some studies have extracted such data by identifying the speaker with business-card-type sensors. However, low-cost, high-accuracy speaker identification is difficult with these sensors because of spikes in the measured sound pressure data, ambient noise picked up by non-speaker sensors, and synchronization errors across sensors. This study proposes a novel sound pressure sensor and a speaker identification algorithm to address these problems. The sensor extracts the user's speech at low cost and with high accuracy by employing a peak hold circuit for spike mitigation and a time synchronization module for precise time synchronization. The algorithm identifies the speaker with high accuracy by removing ambient noise. The evaluations show that the algorithm accurately identifies the speaker in multi-person activities across varying numbers of users, environmental noises, and reverberation conditions, for both long and short utterances. In addition, the peak hold circuit enables accurate extraction of speech, and the synchronization error between the sensors is always within ±30 μs, which is negligible.
Highlights
Human collaboration has a great impact on the effectiveness of multi-person activities; examples include collaborative work and learning
We found that 1) the peak hold circuit removes the spikes from the measured sound pressure data, 2) the experiments show the effectiveness of the proposed scheme under different numbers of users, environmental noises, and reverberation conditions as well as for long or short utterances, and 3) the synchronization error between the sensors is always within ±30 μs
We experimentally evaluated the accuracy of the speaker identification algorithm using the sound pressure data obtained from existing and proposed business-card-type sensors
Summary
Human collaboration has a great impact on the effectiveness of multi-person activities; examples include collaborative work and learning. Some studies have used speaker information in multi-person activities to estimate human collaboration [2]–[5]. Classification is difficult, even with the sound pressure data of the speakers. To resolve the three issues of spikes in the measured sound pressure data, ambient noise in non-speaker sensors, and synchronization errors across sensors, we propose the following: 1) a sound pressure sensor for business-card-type sensors and 2) a high-accuracy speaker identification algorithm using sound pressure data with a low sampling rate. We found that 1) the peak hold circuit removes the spikes from the measured sound pressure data, 2) the experiments show the effectiveness of the proposed scheme under different numbers of users, environmental noises, and reverberation conditions as well as for long or short utterances, and 3) the synchronization error between the sensors is always within ±30 μs.
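The summary does not spell out the identification algorithm, so the following is only a minimal sketch of the general idea of picking a speaker from time-synchronized, low-rate sound pressure data after discounting ambient noise. It is not the paper's method. The function name `identify_speaker`, the low-quantile noise-floor estimate, and the `margin` and `floor_quantile` parameters are all assumptions introduced for illustration.

```python
def identify_speaker(levels, margin=6.0, floor_quantile=0.2):
    """Pick the most likely speaker at each time step.

    levels: dict mapping a sensor ID to a list of sound pressure
            levels (assumed to be in dB and already time-aligned).
    Returns a list with one sensor ID (or None for silence) per step.

    Hypothetical sketch: the noise-floor estimate and the margin
    threshold are illustrative assumptions, not the paper's algorithm.
    """
    sensor_ids = sorted(levels)
    n = min(len(levels[s]) for s in sensor_ids)

    # Assumed ambient-noise estimate: a low quantile of each sensor's levels.
    floor = {}
    for s in sensor_ids:
        ordered = sorted(levels[s][:n])
        floor[s] = ordered[int(floor_quantile * (n - 1))]

    result = []
    for t in range(n):
        # Level above each sensor's own estimated noise floor.
        excess = {s: levels[s][t] - floor[s] for s in sensor_ids}
        best = max(sensor_ids, key=lambda s: excess[s])
        # A non-speaker sensor also picks up the speaker's voice, but more
        # weakly, so the sensor with the largest excess is taken as the
        # speaker, provided it clears the margin.
        result.append(best if excess[best] >= margin else None)
    return result


example = {
    "A": [40, 40, 60, 62, 40, 41],
    "B": [40, 41, 48, 47, 58, 60],
}
speakers = identify_speaker(example)
```

In this toy input, sensor B also registers A's speech at steps 2 and 3 (48 and 47), but A's own sensor rises further above its noise floor, so A is selected there. A sketch like this depends on the sensors being tightly synchronized, which is the role the summary assigns to the time synchronization module.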