Abstract

Using the Kinect sensor and its Microsoft software development kit (SDK), a Kinect-SDK speech recognition system can be established easily. However, such systems exhibit substandard recognition performance and unreliable recognition decisions because they rely on a single, arbitrarily placed Kinect sensor. For sensing and control in Industry 4.0, the correctness of the recognized command is essential for target control. To enhance conventional Kinect-SDK speech recognition, this paper presents a client–server Kinect-SDK speech recognition scheme in which sensor deployment strategies and sensor fusion calculations are implemented using a TCP/IP decision server and multiple TCP/IP Kinect sensor clients. For sensor deployment, three strategies are proposed: central, face-to-face, and diagonal-corner deployment. For sensor fusion, three data fusion algorithms are proposed: fusion by voting, fusion by voice energy comparisons, and fusion by voice energy comparisons with thresholds. The recognition performance of the conventional Kinect-SDK approach can be significantly improved by suitably combining sensor deployment with sensor data fusion. Experimental results showed that Kinect-SDK speech recognition using the diagonal-corner deployment strategy combined with sensor fusion by voice energy comparisons with thresholds had the highest average recognition accuracy, which was significantly higher than that of the conventional Kinect-SDK speech recognition approach (14.93%). In addition, we implemented this configuration for the operation control of a remote multimedia player and a two-wheeled car in a laboratory office space.
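The three fusion rules named above can be illustrated with a minimal Python sketch of what a decision server might compute once each sensor client has reported its result. This is not the paper's implementation: the `Report` record, the function names, the tie-handling rule in voting, and the interpretation of the threshold as a minimum voice energy a report must reach are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Report:
    """Hypothetical per-client result sent to the decision server."""
    sensor_id: str
    command: Optional[str]  # None when the client recognized nothing
    energy: float           # assumed scalar voice-energy measure


def fuse_by_voting(reports: Sequence[Report]) -> Optional[str]:
    """Return the command reported by the most clients.

    Assumption: a tie for first place is rejected (returns None).
    """
    votes = Counter(r.command for r in reports if r.command is not None)
    ranked = votes.most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def fuse_by_energy(reports: Sequence[Report],
                   threshold: Optional[float] = None) -> Optional[str]:
    """Return the command from the client with the highest voice energy.

    Assumption: when a threshold is given, reports whose energy falls
    below it are discarded before the comparison.
    """
    valid = [r for r in reports if r.command is not None]
    if threshold is not None:
        valid = [r for r in valid if r.energy >= threshold]
    if not valid:
        return None
    return max(valid, key=lambda r: r.energy).command


if __name__ == "__main__":
    reports = [
        Report("kinect-a", "play", 0.2),
        Report("kinect-b", "stop", 0.9),
        Report("kinect-c", "play", 0.4),
    ]
    print(fuse_by_voting(reports))
    print(fuse_by_energy(reports))
    print(fuse_by_energy(reports, threshold=0.95))
```

With the sample reports, voting favors the command heard by two of the three clients, whereas the energy comparison trusts the single client that received the strongest signal; the thresholded variant additionally allows the server to reject a command when no client heard it loudly enough.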
