Abstract

In this article, a novel pitch determination algorithm based on the harmonic differences method (HDM) is proposed. Most current algorithms rely on autocorrelation, cepstrum, and, most recently, convolutional neural networks, and they suffer from limitations (small datasets, wideband or narrowband, musical sounds, temporal smoothing, etc.) as well as accuracy and speed problems. Very few works exploit the spacing between the harmonics. HDM is designed for both wideband and exclusively narrowband (telephone) speech and tries to find the most frequently repeating difference between the harmonics of the speech signal. We use three vowel databases in our experiments, namely, the Hillenbrand Vowel Database, the Texas Vowel Database, and vowels from the TIMIT corpus. We compare HDM with the autocorrelation, cepstrum, YIN, YAAPT, CREPE, and FCN algorithms. Results show that harmonic differences are a reliable and fast choice for robust pitch detection, and that HDM is superior to the other algorithms in most cases.
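
The abstract describes HDM only as finding the most repeating difference between the harmonics of a speech signal. As a rough illustration of that idea, the following is a minimal sketch, not the authors' implementation. The function name, the FFT-based peak picker, the relative threshold, and the 60 to 400 Hz search range are all assumptions made for this example.

```python
import numpy as np


def estimate_pitch_from_harmonic_spacing(frame, sample_rate,
                                         fmin=60.0, fmax=400.0,
                                         rel_threshold=0.1):
    """Hypothetical sketch: take the most frequent spacing between
    adjacent spectral peaks as the pitch estimate (in Hz).

    Returns None when fewer than two usable peaks are found.
    """
    frame = np.asarray(frame, dtype=float)

    # Magnitude spectrum of the Hann-windowed frame.
    windowed = frame * np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    # Assumed peak picker: local maxima above a fraction of the largest value.
    interior = spectrum[1:-1]
    is_peak = ((interior > spectrum[:-2])
               & (interior >= spectrum[2:])
               & (interior > rel_threshold * spectrum.max()))
    peak_freqs = freqs[1:-1][is_peak]
    if len(peak_freqs) < 2:
        return None

    # Differences between neighbouring peaks, limited to a plausible pitch range.
    diffs = np.diff(peak_freqs)
    diffs = diffs[(diffs >= fmin) & (diffs <= fmax)]
    if len(diffs) == 0:
        return None

    # "Most repeating" difference: quantize to the FFT bin width, take the mode.
    resolution = sample_rate / len(frame)
    bins = np.round(diffs / resolution).astype(int)
    values, counts = np.unique(bins, return_counts=True)
    return float(values[np.argmax(counts)] * resolution)
```

Because the estimate comes from the spacing between peaks rather than from the lowest peak, the sketch does not need the fundamental itself to be present, which is the situation in telephone-band speech. For a synthetic frame sampled at 8 kHz that contains only harmonics 2 through 16 of 200 Hz, the function returns a value of about 200 Hz.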

Summary

Introduction

Pitch is an extraordinarily complicated and distinct feature of human speech and plays a major role in the perception of human conversations as well as in human-computer interactions. Pitch helps us to identify important cues about the speaker, such as identity, gender, and emotional state, as well as the tones of a musical instrument. It has a wide range of applications in emotion and gender recognition, speech synthesis, human-computer interaction, and detection of symptoms of pathological disorders at early stages. This article is organized as follows. In Section 1, we discuss the foundations and importance of pitch detection and tracking. Section 2 deals with the literature overview, historical background, algorithms, novelties of this article, datasets, ground truth methods, error measures, difficulties, application areas, and related algorithm domains. Section 3 describes the novel HDM algorithm. Section 4 delineates the datasets used in this article and the experimental setup. Section 5 presents the results of the wideband and narrowband experiments. Section 6 is devoted to gender detection results. In Section 7, we conclude this work with evaluations and future studies.

Literature Review
Datasets and Experimental Setup
Results
Gender Detection Implementations
