Intelligent Speech Technology (IST) is revolutionizing healthcare by enhancing transcription accuracy, disease diagnosis, and medical equipment control in smart hospital environments. This study introduces an innovative approach that combines federated learning with Multi-Layer Perceptron (MLP) and Gated Recurrent Unit (GRU) neural networks to improve IST performance. It leverages the “Medical Speech, Transcription, and Intent” dataset from Kaggle, which comprises a variety of speech recordings with corresponding medical symptom labels; a Wiener filter was applied for noise reduction to improve audio quality. Feature extraction through the MLP and sequence classification with the GRU highlighted the model’s robustness and capacity for detailed medical understanding. The federated learning framework enabled collaborative model training across multiple hospital sites, preserving patient privacy by avoiding the exchange of raw data. This distributed approach allowed the model to learn from diverse, real-world data while ensuring compliance with strict data protection standards. Under rigorous five-fold cross-validation, the proposed Fed MLP-GRU model achieved an accuracy of 98.6%, with consistently high sensitivity and specificity, indicating reliable generalization across multiple test conditions. In real-time applications, the model effectively performed medical transcription, provided symptom-based diagnostic insights, and facilitated hands-free control of healthcare equipment, reducing contamination risks and enhancing workflow efficiency. These findings indicate that IST, powered by federated neural networks, can significantly improve healthcare delivery, diagnostic accuracy, and operational efficiency in clinical settings. This research underscores the transformative potential of federated learning and advanced neural networks for addressing pressing challenges in modern healthcare and setting the stage for future innovations in intelligent medical technology.
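As a rough illustration of the architecture described above (not the authors' implementation), the sketch below shows an MLP feature extractor feeding a GRU sequence classifier, trained with simple FedAvg-style weight averaging across hospital sites. All names, layer sizes, the number of symptom classes, and hyperparameters are illustrative assumptions.

# Minimal sketch, assuming PyTorch and per-frame acoustic features (e.g. MFCCs)
# already cleaned with a Wiener filter. Not the authors' code; sizes are placeholders.
import copy
import torch
import torch.nn as nn

class MLPGRUClassifier(nn.Module):
    def __init__(self, n_features=40, hidden=128, n_classes=25):
        super().__init__()
        # Frame-level MLP feature extractor
        self.mlp = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU())
        # GRU models the temporal sequence of extracted features
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                  # x: (batch, time, n_features)
        h = self.mlp(x)                    # per-frame features
        _, last = self.gru(h)              # final hidden state summarizes the utterance
        return self.head(last.squeeze(0))  # symptom-class logits

def federated_round(global_model, client_loaders, epochs=1, lr=1e-3):
    """One FedAvg round: each site trains locally; only model weights are shared."""
    client_states = []
    for loader in client_loaders:
        local = copy.deepcopy(global_model)
        opt = torch.optim.Adam(local.parameters(), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        local.train()
        for _ in range(epochs):
            for x, y in loader:
                opt.zero_grad()
                loss_fn(local(x), y).backward()
                opt.step()
        client_states.append(local.state_dict())
    # Average parameters across clients (equal weighting for simplicity)
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = torch.stack([s[key].float() for s in client_states]).mean(dim=0)
    global_model.load_state_dict(avg)
    return global_model

In use, a central server would call federated_round repeatedly, passing one DataLoader of locally held, labeled recordings per hospital site; only the averaged weights leave each site, which is what keeps raw patient audio private in this scheme.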