Audio Series Research Articles

The subject of this study is the influence of audio pre-processing stages on the accuracy of speaker language recognition. Audio pre-processing has grown significantly in importance in recent years because of its key role in applications such as data reduction, filtering, and denoising. Given the growing demand for accurate and efficient methods of classifying audio information, evaluating and comparing different audio pre-processing methods is an important part of determining optimal solutions. The goal of this study is to select the best sequence of audio pre-processing stages for subsequent training of a neural network, for two ways of converting signals into features: spectrograms and mel-frequency cepstral coefficients (MFCCs). To achieve this goal, the following tasks were solved. First, the ways of transforming the signal into features were analyzed, together with the mathematical models for analyzing an audio series by the selected features. Next, a generalized model of real-time translation of a speaker's speech was developed, and the experiment was planned according to the selected audio pre-processing stages. Finally, the neural network was trained and tested for the planned experiments. The following methods were used: mel-frequency cepstral coefficients, spectrograms, time masking, frequency masking, and normalization. The following results were obtained: depending on the selected pre-processing stages and the way the signal is converted into features, speech recognition accuracy of up to 93% can be achieved. The practical significance of this work is to increase the accuracy of the subsequent transcription of audio information and of the translation of the resulting text into a chosen language, including artificial languages.
Conclusions: The best sequence of audio pre-processing stages was selected for subsequent training of the neural network, for each way of converting signals into features. Mel-frequency cepstral coefficients proved better suited to the problem. Because the results depend strongly on the structure of the neural network, they may change as the volume of input data and the number of languages increase. At this stage, however, it was decided to use only mel-frequency cepstral coefficients with normalization.
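The pre-processing stages named in the abstract (time mask, frequency mask, normalization) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `features[band][frame]` layout, the use of zero as the mask value, per-band zero-mean/unit-variance normalization, and the order of the stages are all assumptions made for illustration. A real pipeline would more likely operate on arrays produced by an audio library.

```python
import statistics

# Assumed layout: features[band][frame], i.e. one row per frequency band
# (or cepstral coefficient) and one column per time frame.

def time_mask(features, start, width):
    """Zero out `width` consecutive time frames beginning at frame `start`."""
    return [
        [0.0 if start <= t < start + width else v for t, v in enumerate(row)]
        for row in features
    ]

def frequency_mask(features, start, width):
    """Zero out `width` consecutive bands (rows) beginning at band `start`."""
    return [
        [0.0] * len(row) if start <= f < start + width else list(row)
        for f, row in enumerate(features)
    ]

def normalize(features):
    """Scale each band to zero mean and unit variance across time.

    Bands with no variation are mapped to all zeros to avoid division by zero.
    """
    result = []
    for row in features:
        mean = statistics.fmean(row)
        std = statistics.pstdev(row)
        result.append([(v - mean) / std if std > 0 else 0.0 for v in row])
    return result

# Toy feature matrix: 3 bands x 5 frames (hypothetical values).
toy = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 2.0, 2.0, 2.0, 2.0],
    [5.0, 3.0, 1.0, 3.0, 5.0],
]

# One possible ordering of the stages; the study compares orderings,
# so this particular choice is only an example.
processed = frequency_mask(time_mask(normalize(toy), start=1, width=2), start=2, width=1)
```

Masking hides part of the input during training so the network cannot rely on any single time span or frequency band, while normalization puts all bands on a comparable scale.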

Reviews: Hidden Structure: Music Analysis Using Computers and Music21: A Toolkit for Computer-Aided Musicology

Hidden Structure: Music Analysis Using Computers, by David Cope. Computer Music and Digital Audio Series 23. Middleton, WI: A-R Editions, 2008. xxix, 344 pp. + 1 CD-ROM.

Music21: A Toolkit for Computer-Aided Musicology, by Michael Cuthbert. Version 1.5, last modified May 11, 2013. http://web.mit.edu/music21/.

Ian Quinn is Professor of Music at Yale University, where he teaches music theory, music cognition, and computational methods in music research.

Journal of the American Musicological Society, 1 April 2014; 67 (1): 295–307. https://doi.org/10.1525/jams.2014.67.1.295

The Age of Big Data is here, and we in music studies find ourselves, as usual, both ahead of the curve and behind it. Like our cognate cousins in art history, film studies, and performance studies, we have reasonable excuse for lagging behind computational literati like Franco Moretti and Matthew Jockers, whose Stanford Literary Lab has been opening debates across the humanities and making headlines in the popular press.
The excuse is technical: it is easy to transform literary texts into searchable data, and hard to do so for music. Insofar as the content of a novel is made out of words, it can be encoded as a long string of small numbers (representing letters and spaces) without losing any information. Even if we allow ourselves to believe that a musical score is an adequate representation of the content of a work, the encoding problem is vastly more complex...

Articles published on Audio Series

Conceptual model of the technology for calculating the similarity threshold of two audio sequences

Research of the impact of noise reduction methods on the quality of audio signal recovery

Representation of Popular Islamic Movements on Social Media: A Study of the Hashtag #LogIndiCloseTheDoor

Development of Teaching Materials Based on Differentiated Learning to Improve Critical Thinking Dimensions of The Pancasila Learner Profile

Analysis of the influence of selected audio pre-processing stages on accuracy of speaker language recognition

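Bez obrazu, czyli seriale audio. Problematyka kreacji świata przedstawionego i jego recepcji [Without the picture, or audio series: issues in the creation of the represented world and its reception]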

The Audio Series Production Team's Strategy for "Catatan Pembalasan Fajar" In Retaining "Noice" Application Listeners

The Field Guide audio series: mobile learning using place-based and inquiry-led approaches to promote adolescents’ interest in nature

Analysis of the of training and test data distribution for audio series classification

Antifascist Mothers and Folk Healers: Queer Reinterpretations of Polish and Regional Cultural Archetypes in Familia

Back to the Future: Maximizing Student Learning and Wellbeing in the Virtual Age.

«Звучащее слово» в медиапредпочтениях молодежи [The "spoken word" in the media preferences of young people]

USMLE Step-1 is Going to Pass/Fail, Now What Do We Do?

Audiobook, audio podcast, audio series – modern formats of the media space

Sonic Commentary: Audio Series Volume 29

Sonic Commentary Audio Series Volume 28

The Sun’ll Be Hotter Tomorrow: Growing Up with Climate Chaos

Reviews: Hidden Structure: Music Analysis Using Computers and Music21: A Toolkit for Computer-Aided Musicology

Unforgotten Landscapes: Radio and the Reconstruction of Germany's European Mission in the East in the 1950s

The great American songbooks: musical texts, modernism, and the value of popular culture
