Combined Application of Speech Recognition and Natural Language Processing Technologies in the Electric Power Industry
Abstract Applying speech recognition technology in the power industry can improve the collaborative efficiency of power grids at all levels and reduce the workload of dispatchers, making it one of the key technologies in the intelligent development of power grids. In this study, a power speech recognition model is designed by combining a Transformer-based out-of-set word model with an n-gram-based language error-checking model. A training set is first used to train the model and to test its candidate input features. A power speech dataset is then created and used for model comparison to validate the effectiveness of the proposed algorithms. Finally, a system built on the proposed algorithms is designed to process real-time speech, speech files, and speech from telephone terminals. The results show that the Spectrogram feature of the speech signal is the more suitable input feature for the model and reduces the word error rate of the speech recognition model. The proposed model performs best on all four metrics: Accuracy, Precision, Recall, and F1. The proposed method has a parameter count of 25, a word error rate (WER) of 8.21%, and a real-time factor (RTF) of 0.017, which indicates that the algorithm generalizes well on the power speech dataset.
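The two headline metrics in the abstract, WER and RTF, can be illustrated with a minimal sketch. The function names and the sample sentences are hypothetical, and the abstract does not say whether errors are counted over words or characters, so whitespace-separated tokens are assumed here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length.

    Assumption: tokens are whitespace-separated; for Chinese speech the
    same routine is often applied to characters instead.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_tok in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_tok == hyp_tok else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent decoding / duration of the audio decoded."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    # Hypothetical dispatcher command with one misrecognized word.
    print(word_error_rate("close breaker two now", "close breaker too now"))
    # Hypothetical timing: 0.17 s of compute for 10 s of audio.
    print(real_time_factor(0.17, 10.0))
```

An RTF below 1 means the recognizer runs faster than the audio plays, which is the precondition for the real-time use case the abstract describes.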
- Research Article
4
- 10.1088/1742-6596/1852/2/022086
- Apr 1, 2021
- Journal of Physics: Conference Series
In recent years, with the continuous deepening of college teaching reform and the rapid development of artificial intelligence, the teaching methods used in college classrooms have come to differ greatly from traditional ones. The original teaching programs and teaching methods need to change with the times. The main purpose of this article is to improve the correction rate of intelligent voice recognition technology for college piano timbre without violating the safety of artificial intelligence. The article conducts experiments by consulting relevant domestic and foreign literature, using observation and comparison methods, and applying neural network algorithms (the BP neural network algorithm and the convolutional neural network algorithm), with the experimental goal of studying the application of intelligent speech recognition technology to tone correction in college piano teaching. The test results show that using multiple algorithms can improve the application of intelligent speech recognition technology to tone correction in piano teaching in colleges and universities, mainly by improving the correction rate and enhancing its ability to guide teaching.
- Conference Article
3
- 10.1145/3482632.3483130
- Sep 24, 2021
With the rapid development of computer speech recognition technology, how to better apply it to oral English teaching has become a concern of many English teachers. This paper examines how speech recognition technology operates and how it is applied in oral English teaching, and introduces several main forms of that application: oral assessment, learning records, intelligent platforms, multimedia information retrieval, and others. It also expounds the practical significance of applying speech recognition technology in oral English teaching. An oral English teaching mode based on speech recognition can improve students' oral English ability more effectively, helping students build confidence, increase their interest in learning, and achieve good learning results.
- Research Article
- 10.54254/2755-2721/2025.ch23276
- May 19, 2025
- Applied and Computational Engineering
Speech recognition technology has developed from the 1950s to the present, evolving from template matching methods to Hidden Markov Model (HMM) statistical methods, then to machine learning techniques, and finally to the current use of Transformer technology for speech recognition tasks. However, the Transformer model has not yet been widely adopted in the field of speech recognition. This paper explores the characteristics of the Transformer model, relates them to the characteristics of speech recognition tasks, analyzes the challenges of using the Transformer model for these tasks, and suggests directions for future research, so as to facilitate the application of Transformer models in speech recognition. The paper finds that the main reasons for the limited application of Transformer models in speech recognition are their numerous parameters, complex structure, and high computational cost. Future efforts should focus on model compression and lightweight design, and on improving the attention mechanism, to make Transformer models more applicable to speech recognition.
- Research Article
2
- 10.46300/9106.2022.16.117
- Mar 30, 2022
- International Journal of Circuits, Systems and Signal Processing
Speech recognition is an important research field in natural language processing. For Chinese and English, which have rich data resources, the performance of end-to-end speech recognition models is close to that of the Hidden Markov Model-Deep Neural Network (HMM-DNN) model. However, for the low-resource task of mixed Chinese-English speech recognition, end-to-end systems do not yet achieve good performance. This paper studies end-to-end modeling methods for the case where mixed Chinese-English data is limited. It focuses on two end-to-end speech recognition models: connectionist temporal classification (CTC) and the attention-based encoder-decoder network. To improve mixed Chinese-English speech recognition, the paper studies how to improve the encoder used with the CTC model and the attention mechanism, and tries to combine the two models. On low-resource mixed Chinese-English data, the advantages of the different models are used to improve the performance of end-to-end models, and thereby the recognition accuracy of speech recognition technology in legal Chinese-English simultaneous interpretation.
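One widely used way to combine a CTC model with an attention-based encoder-decoder is to train a shared encoder under an interpolated objective. The abstract does not state how the two models are combined, so the formulation below is an assumption for illustration only, and the weight $\lambda$ is a hypothetical hyperparameter.

```latex
% Assumed joint objective (not confirmed by the abstract above):
% \lambda trades off the CTC loss against the attention decoder loss.
\mathcal{L}_{\text{joint}}
  = \lambda \, \mathcal{L}_{\text{CTC}}
  + (1 - \lambda) \, \mathcal{L}_{\text{att}},
  \qquad 0 \le \lambda \le 1
```

Under this assumption, $\lambda = 1$ recovers pure CTC training and $\lambda = 0$ recovers pure attention-based training.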
- Conference Article
3
- 10.1109/iccasit50869.2020.9368869
- Oct 14, 2020
Speech recognition technology, as part of human-computer interaction, is essential for machine intelligence. Using robots as co-pilots in civil aircraft is a major breakthrough and direction of innovation in the civil aviation industry. Applying speech recognition technology to the robot co-pilot lets the captain's commands pass directly to the co-pilot program, making cooperation between the captain and the robot co-pilot possible. Against this background, three CTC speech recognition models were built with end-to-end speech recognition methods on a standard callouts speech corpus. For one of these models, the CTC-based Bi-LSTM recurrent neural network, the training error rate is reduced to 1.2% and the test error rate to 3.2%. The Bi-LSTM speech recognition model is therefore used as the speech recognition system of the artificial intelligence co-pilot.
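The output stage of a CTC model can be sketched with greedy (best-path) decoding: pick the most likely label per frame, merge consecutive repeats, then drop the blank symbol. This is a generic illustration, not the decoder used in the paper; the function names, the blank index of 0, and the sample scores are assumptions.

```python
from itertools import groupby
from typing import List, Sequence


def argmax_labels(frame_scores: Sequence[Sequence[float]]) -> List[int]:
    """Pick the highest-scoring label index for every frame."""
    return [max(range(len(scores)), key=scores.__getitem__)
            for scores in frame_scores]


def ctc_greedy_decode(frame_labels: Sequence[int], blank: int = 0) -> List[int]:
    """Collapse a per-frame label path into an output label sequence.

    Step 1 merges runs of identical labels; step 2 removes blanks.
    The blank between two identical labels is what keeps them distinct.
    """
    collapsed = [label for label, _ in groupby(frame_labels)]
    return [label for label in collapsed if label != blank]


if __name__ == "__main__":
    # Hypothetical scores for 5 frames over 3 labels (index 0 = blank).
    scores = [
        [0.1, 0.8, 0.1],
        [0.2, 0.7, 0.1],
        [0.9, 0.05, 0.05],
        [0.1, 0.1, 0.8],
        [0.1, 0.1, 0.8],
    ]
    print(ctc_greedy_decode(argmax_labels(scores)))
```

Greedy decoding is the simplest choice; beam search with a language model typically gives lower error rates at higher cost.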
- Research Article
1
- 10.1051/matecconf/201822403014
- Jan 1, 2018
- MATEC Web of Conferences
The increase in the global consumption of marketed energy from all fuel sources (except coal) is regarded as a key factor driving the growth of the power engineering industry (PEI) market. The article states that the structure of investment in PEI will not change radically until 2030: investment in equipment for the thermal power industry dominates (with the exception of the year 2020), alongside substantial growth of investment in the nuclear power industry. The authors focus on the significant potential of developing and applying nanomaterials to support PEI growth through new technological solutions and optimized technologies. They examine the nanomaterials most widely used in the PEI worldwide, and the major fields and promising areas of nanomaterials application in the industry, which aim at improving the construction technology of the equipment’s fuel and structural elements, increasing the efficiency of existing equipment, and developing the renewable energy sector. Contemporary trends and prospects for selected PEI nanomaterials markets, their key players, and the positive and negative factors of market growth are identified.
- Conference Article
5
- 10.2991/iiicec-15.2015.25
- Jan 1, 2015
Applying speech recognition technology to a virtual reality system can not only extend the use of speech recognition to scene roaming, but also make up for the limited interaction of virtual reality software and improve the efficiency of interaction between users and the virtual environment. This paper aims to combine speech recognition technology with VR technology and to control the user's viewpoint in the VR system by speech. Using Microsoft Speech SDK 5.1, a speech recognition program is developed and an interface is designed to connect the speech recognition module with the VR software. Based on the EON SDK, EON nodes for speech recognition and scene roaming are programmed. These EON nodes and the built models are imported into EON. Running the speech recognition program then realizes scene roaming controlled by speech. An experiment on walking through a maze shows that the speech commands can accurately control the user's motion to avoid obstacles and pass through the maze successfully.
- Conference Article
- 10.1109/imcec46724.2019.8983966
- Oct 1, 2019
Traditional systems for speech-recognition-based interaction need to identify the exact semantics and grammar of the speech to be recognized. Such systems have problems such as being tied to a specific language and relying on preset voice templates, so they are not suitable for speech recognition interaction in augmented reality children’s books. For these reasons, this paper studies the pronunciation characteristics and interactive needs of children, and analyzes the differences among the template matching method, the random model method, and the artificial neural network method. It then uses the template matching method to design a small-vocabulary speech recognition system whose voice templates can be replaced, and applies it to speech recognition interaction in an augmented reality children's reading application. Finally, the effectiveness of the proposed method is verified through experiments, which provides a new direction for the application of speech recognition technology in augmented reality children’s books.
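Template matching for a small vocabulary can be sketched as follows. The abstract does not name the matching algorithm, so dynamic time warping (DTW), a classic choice for comparing utterances of different lengths, is assumed here; the function names and the toy one-dimensional features are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple

Frame = Tuple[float, ...]  # one feature vector per audio frame


def dtw_distance(a: Sequence[Frame], b: Sequence[Frame]) -> float:
    """Accumulated cost of the best time alignment between two sequences."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch b
                cost[i][j - 1],      # stretch a
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


def recognize(utterance: Sequence[Frame],
              templates: Dict[str, Sequence[Frame]]) -> str:
    """Return the vocabulary word whose template is closest to the utterance."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))


if __name__ == "__main__":
    # Toy templates; replacing a template is just a dictionary update.
    templates = {
        "up": [(0.0,), (1.0,), (2.0,)],
        "down": [(2.0,), (1.0,), (0.0,)],
    }
    spoken = [(0.0,), (0.0,), (1.0,), (2.0,)]  # "up", spoken more slowly
    print(recognize(spoken, templates))
```

Because the vocabulary lives in a plain dictionary, a template can be re-recorded or swapped at run time without retraining, which matches the replaceable-template requirement described above.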
- Conference Article
6
- 10.1109/vnis.1995.518827
- Jul 30, 1995
Speech recognition technology has potential for application to intelligent transportation systems (ITS) in the area of advanced traveler information systems (ATIS). ATIS areas which can greatly benefit from speech recognition are: pre-trip traffic information; en-route traffic information; route guidance; ride matching and reservation; transit information services; and traffic and transit management control centers. This paper concentrates on telephony applications of speech recognition to two ATIS applications, real-time traffic information and public transit information. We first discuss the general considerations involved in development of a multiline speech interface to a traffic or transit database, including system architecture and sizing, and the development of dialogs and grammars which are user friendly and efficient. We then discuss two prototype applications of this technology using BBN's HARK speaker-independent speech recognition product. The first is to SmartRoute System's SmarTraveler traveler information system, which was an FHWA Operational Test, and the second to Montgomery County Maryland's public transit system. These were small scale tests to help us develop the technology required for full-scale implementations, but showed that the speech interface was practical, and provided much improved ease of use and performance over conventional touch tone based interactive voice response (IVR) systems.
- Research Article
- 10.28925/2663-4023.2024.25.468486
- Jan 1, 2024
- Cybersecurity: Education, Science, Technique
The article provides a comprehensive comparative analysis of methods, technologies, and modern approaches to the use of speech recognition and natural language processing (NLP) technologies in the context of national security and information security. The key aspects of the use of technologies for monitoring communications, detecting suspicious activity and application in the field of intelligence and counterintelligence, the role in ensuring cybersecurity, the possibilities of biometric identification by voice, ethical and legal aspects, and technological challenges are considered. The problem statement focuses on the challenges associated with the widespread adoption of speech recognition and NLP technologies, in particular, the lack of accuracy of algorithms, which creates risks to the reliability of security systems. The author also emphasizes the importance of addressing ethical and legal issues related to the privacy of citizens and the possible misuse of technologies for mass surveillance. The paper provides examples of systems for cybersecurity purposes, such as mass listening and analysis systems, targeted monitoring systems, social media analysis platforms, biometric identification systems, and others. The results section of the study presents a high-level structure of threat protection systems that covers threat channels and levels of protection. The complexity of modern threats that can integrate into several channels simultaneously, in particular using voice information, is considered. The author details the place and role of voice information in the structure of threat protection, emphasizing the importance of integrating various systems and platforms to ensure comprehensive security. Two approaches to building a security system that works with voice information are considered: aggregation of the maximum possible information from existing systems and creation of a system for each specific problem. 
A comparative analysis of these approaches is carried out, their advantages and disadvantages are identified, and the limitations and risks of using voice recognition methods are described, including the reliability and accuracy of technologies, the availability of data for training models, the cost of implementation, issues of confidentiality and privacy, data security, use in military and intelligence activities, ethical issues, and the risks of voice fraud and artificial voices.
- Conference Article
- 10.1145/3584376.3584513
- Dec 16, 2022
Speech recognition technology is an important technology aimed at realizing free dialogue between artificial intelligence and human beings, and it retains significant research value in the 21st century. This paper describes the development history of speech recognition technology in chronological order, explains its classification in terms of principle and application, explains its principle and process in detail, introduces the relevant methods with emphasis, and compares the experimental results of different methods. Finally, the applications and future development of speech recognition technology are illustrated. The paper mainly introduces speech recognition methods based on the hidden Markov model, and also summarizes methods based on artificial neural networks and artificial intelligence. It also points out existing problems of speech recognition technology and offers views on its future development.
- Conference Article
- 10.1109/picmet.2016.7806576
- Sep 1, 2016
Electric power utilities, as important social infrastructure, should be operated stably without any failure in the supply of electricity. Stable operation requires a huge amount of resources and investment across power generation, transmission, and distribution facilities. In particular, constant inspection and maintenance of the facilities requires highly skilled manpower and advanced technologies. Despite continuing efforts, the electric power industry faces serious challenges from social, economic, and environmental problems. In this regard, a number of robotic systems have been tested and applied for inspection and maintenance in nuclear power plants and on high-voltage power transmission lines. The Electric Power Research Institute (EPRI), which conducts research, development and demonstration (RD&D) relating to the generation, delivery and use of electricity for the benefit of the public, also requires efficient technology management to provide a blueprint of robotics technologies in the electric power sector for the future. The organization wants to centralize its R&D capability in robotics technologies, which is currently dispersed across divisions, in order to prevent duplicated investments and manage that capability effectively. This research is a step towards assessing the robotics technology currently used in the power industry and identifying the technologies that would benefit the industry most, using the Technology Development Envelope (TDE) approach.
- Conference Article
7
- 10.1145/1496976.1496978
- Jul 2, 2008
As performance gains in automatic speech recognition systems plateau, improvements to existing applications of speech recognition technology seem more likely to come from better user interface design than from further progress in core recognition components. Among all applications of speech recognition, the usability of systems for transcription of spontaneous speech is particularly sensitive to high word error rates. This paper presents a series of approaches to improving the usability of such applications. We propose new mechanisms for error correction, use of contextual information, and use of 3D visualisation techniques to improve user interaction with a recogniser and maximise the impact of user feedback. These proposals are illustrated through several prototypes which target tasks such as: off-line transcript editing, dynamic transcript editing, and real-time visualisation of recognition paths. An evaluation of our dynamic transcript editing system demonstrates the gains that can be made by adding the corrected words to the recogniser's dictionary and then propagating the user's corrections.
- Research Article
- 10.1121/1.2023679
- Dec 1, 1986
- The Journal of the Acoustical Society of America
This paper presents an overview of several military/government programs in which SCI Technology, Inc. has implemented and tested its speech recognition system. Included are: (1) the Speckled Trout (U.S. Air Force), (2) LHX (Light Helicopter Experimental, U.S. Army), (3) Space Shuttle (NASA), (4) Space Station, (5) AFTI F‐16, and (6) Advanced Tactical Fighter (ATF). Some programs consist of technology demonstrations, while others involve flight testing, and one, Speckled Trout, operationally installing and utilizing a system on a continual basis. In some cases, the hardware consists of a SCI Voice Control Unit (VCU‐5137) and others, a Voice Development System (VDS‐7001). For example, the Space Station application consisted of implementing a VDS in a power system monitoring and switching network in which power switches and loads could be controlled by verbal commands. In another application, two VCUs have been delivered to NASA for future Shuttle flights in which the remote controlled cameras on board will be manipulated (switched, positioned, and focused) using speech, freeing the astronauts' hands for more manually oriented tasks such as controlling the mechanical arm in the cargo bay. The LHX applications included interrogating various on‐board systems (e.g., electrical, hydraulic, transmission) by speech commands and having the system status appear on video monitors. The Speckled Trout program is the first operational speech recognition system to be installed in an aircraft as an integral component of the aircraft's systems. Three VCUs are part of an integrated radio and navigational aides control system in which the system control can be either manual or by voice commands. However, since installation several months ago, the crew reports that radio control has been carried out almost exclusively by verbal commands. The AFTI F‐16 program includes integrating a VDS as part of a flight simulator for testing and evaluation. 
The ATF program has only recently started and will involve integration as part of an advanced cockpit. This paper will also discuss the evolutionary process that has proven essential to the successful application of speech recognition technology into military and governmental systems of the future.
- Research Article
- 10.12720/lnit.2.1.117-120
- Jan 1, 2014
- Lecture Notes on Information Theory
With the further advance of energy conservation and emissions reduction, energy performance contracting (EPC) has been widely used and promoted as an effective market-based way of saving energy. Because the utility industry is a basic industry and the main energy sector of the national economy, its low-carbon development is imperative. Therefore, to promote the application of EPC in the electric power industry and accelerate the industry's independent technology development, we analyze the advantages of EPC and the obstacles to its application and propose two suggestions: build an EPC e-commerce system for the power industry, and integrate EPC into enterprises' energy management systems through the link of e-commerce.