Abstract

In making machines intelligent and enabling them to work like humans, speech recognition is one of the most essential requirements. Human speech conveys various types of information, such as energy, pitch, loudness, and rhythm in the sound signal, along with context such as the speaker's gender, age, and emotion. Identifying the emotion in a speech pattern is a challenging task, and an effective solution is especially useful in the era of rapidly developing speech recognition systems with digital assistants. Digital assistants such as Bixby and BlackBerry Assistant are building products that incorporate emotion identification and reply to the user in keeping with the user's point of view. The objective of this work is to improve the accuracy of speech emotion prediction using deep learning models. Our work experiments with MLP and CNN classification models on three benchmark datasets comprising 5700 speech files across 7 emotion categories. The proposed model showed improved accuracy.
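The acoustic cues the abstract names (energy, loudness, rhythm) are typically computed per frame of the waveform. As a minimal sketch, the NumPy snippet below computes two common framewise cues, log-energy and zero-crossing rate, on a synthetic tone standing in for a speech file; the frame and hop sizes are illustrative choices, not values taken from the paper.

```python
import numpy as np

def frame_features(signal, frame_len=400, hop=160):
    """Per-frame log-energy and zero-crossing rate, two simple SER cues."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energy = np.log(np.sum(frame ** 2) + 1e-10)       # log-energy (loudness cue)
        zcr = np.mean(np.abs(np.diff(np.sign(frame))) > 0)  # zero-crossing rate (pitch cue)
        feats.append((energy, zcr))
    return np.array(feats)

# Synthetic 1-second, 16 kHz tone as a stand-in for a real speech recording.
sr = 16000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 220 * t)
feats = frame_features(tone)
print(feats.shape)  # (number of frames, 2 features)
```

In a full pipeline, such features (or richer ones like MFCCs) would be computed for every file in the datasets and fed to the classifier.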


Introduction

In making machines intelligent and enabling them to work like humans, speech recognition is one of the essential requirements. Understanding a person's emotions and responding suitably in human-computer conversations makes machines more reliable. Determining efficient techniques to identify the emotions in a speech signal has a variety of applications. As we use many computer applications in our day-to-day lives, recognizing emotion has a significant influence and has become a demand in domains from marketing to medical management. Emotion detection is used in the medical field, where it helps in spotting mental issues by analyzing patients' speech patterns []. In business marketing, understanding customers' requirements enables customized promotion of products, and e-commerce sites such as Amazon or Flipkart need efficient speech emotion recognition systems to know customer feedback on a product. Identifying emotion is challenging work because emotions are subjective, and individuals express them differently. The complexity of SER also involves various other factors, such as language, pitch, energy, loudness, and rhythm in the sound signal, along with context such as gender, age, words, the time duration of a signal, and emotion; all of these influence the kind of emotion we are determining.
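One of the two classifiers the work experiments with, the MLP, can be sketched with scikit-learn. The snippet below is a minimal illustration, not the paper's actual configuration: synthetic Gaussian clusters stand in for real per-file acoustic feature vectors (e.g. 13 MFCCs), two classes stand in for the 7 emotion categories, and the hidden-layer size is an assumed value.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Hypothetical stand-in for acoustic feature vectors: two well-separated
# synthetic "emotion" clusters in a 13-dimensional feature space.
n_per_class, n_feats = 100, 13
class_a = rng.normal(loc=1.0, scale=0.5, size=(n_per_class, n_feats))
class_b = rng.normal(loc=-1.0, scale=0.5, size=(n_per_class, n_feats))
X = np.vstack([class_a, class_b])
y = np.array([0] * n_per_class + [1] * n_per_class)

# Small MLP classifier; hidden_layer_sizes is an illustrative choice.
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
clf.fit(X, y)
acc = clf.score(X, y)
print(f"training accuracy: {acc:.2f}")
```

A real SER experiment would replace the synthetic arrays with features extracted from the 5700 speech files and report accuracy on a held-out test split rather than on the training data.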

