It is wise to investigate past and present epidemics in the hope of learning from them and being better prepared for future ones. COVID-19 is one of the most recent and well-known pandemics, and its effects are still felt today. Nearly all governments have announced various measures to combat the virus, making it challenging to keep people aware of the most up-to-date and relevant information. As a result, many websites have created and maintained Frequently Asked Questions (FAQs) regarding the pandemic. People naturally tend to ask about multiple points in one question, leading to multi-label questions. Multi-label question classification is one of the most common and complicated tasks in Natural Language Processing (NLP). One of the most significant contributions of classification to advancing medical care and facilities is the development of automated question-answering systems. These systems can improve the efficiency of healthcare by reducing the burden on healthcare professionals and providing patients with timely and reliable answers to their questions. The intricate morphology and structure of the Arabic language make such a task more challenging when dealing with Arabic text. This study aims to build a multi-label classification model for Arabic medical questions. Pre-trained neural models have significantly improved NLP performance and have recently been applied to multi-label classification. Accordingly, this study proposes a deep learning model for classifying Arabic multi-label COVID-19 questions by combining the strengths of DeBERTa (Decoding-enhanced BERT with Disentangled Attention) and BiLSTM (Bidirectional Long Short-Term Memory) networks. Deep learning methods are prevalent because they generate dense feature representations automatically and implicitly capture hidden relationships. The DeBERTa model is fine-tuned to generate word vector representations.
The resulting word vectors are fed to the BiLSTM model to extract deeper feature representations. The proposed multi-label classification model assigns each question to one or more of ten available categories. The model is evaluated using Hamming loss, micro-precision, micro-recall, micro-F1, subset accuracy, AUC, and Jaccard index. It classified Arabic questions effectively, achieving 0.042 for Hamming loss, 0.84 for each of micro-precision, micro-recall, and micro-F1, 0.71 for subset accuracy, 0.89 for AUC, and 0.72 for Jaccard index. These results pave the way for adopting an automated multi-label classification model for medical questions in health facilities, which can help telehealth medical providers offer more reliable and effective consultations.
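
To make the reported evaluation measures concrete, the following minimal Python sketch computes Hamming loss, micro-averaged precision, recall and F1, subset accuracy, and the Jaccard index from binary label matrices. It is an illustration only, not the evaluation code used in the study: the function name `multilabel_metrics`, the toy data (three categories instead of ten), and the choice of a sample-averaged Jaccard index are all assumptions. AUC is omitted because it requires predicted probabilities rather than hard 0/1 labels.

```python
from typing import Dict, List

# One row per question, one 0/1 column per category (hypothetical layout).
Labels = List[List[int]]


def multilabel_metrics(y_true: Labels, y_pred: Labels) -> Dict[str, float]:
    """Compute common multi-label metrics from binary label matrices."""
    n_samples = len(y_true)
    n_labels = len(y_true[0])
    tp = fp = fn = 0      # counts pooled over all samples and labels (micro)
    exact_rows = 0        # rows where every label is predicted correctly
    jaccard_sum = 0.0     # running sum of per-sample Jaccard scores

    for t_row, p_row in zip(y_true, y_pred):
        row_tp = sum(1 for t, p in zip(t_row, p_row) if t == 1 and p == 1)
        row_fp = sum(1 for t, p in zip(t_row, p_row) if t == 0 and p == 1)
        row_fn = sum(1 for t, p in zip(t_row, p_row) if t == 1 and p == 0)
        tp += row_tp
        fp += row_fp
        fn += row_fn
        exact_rows += int(row_fp + row_fn == 0)
        union = row_tp + row_fp + row_fn
        # Assumption: a row with no true and no predicted labels scores 1.0.
        jaccard_sum += row_tp / union if union else 1.0

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    return {
        # Fraction of individual label cells that are wrong.
        "hamming_loss": (fp + fn) / (n_samples * n_labels),
        "micro_precision": precision,
        "micro_recall": recall,
        "micro_f1": f1,
        # Fraction of questions whose full label set is exactly right.
        "subset_accuracy": exact_rows / n_samples,
        # Sample-averaged intersection over union of label sets.
        "jaccard": jaccard_sum / n_samples,
    }


# Toy data: 4 questions, 3 categories (made up for illustration).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 1], [0, 1, 1], [1, 0, 0], [0, 0, 1]]

scores = multilabel_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

In this toy case two of the four questions are labelled exactly right, so subset accuracy is much lower than micro-F1. The same pattern appears in the reported results (0.71 versus 0.84), since subset accuracy gives no credit for partially correct label sets.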