Articles published on Dataset Creation
- Research Article
- 10.1016/j.ejrad.2026.112796
- Jun 1, 2026
- European journal of radiology
- Ali Alsalama + 4 more
Ethmoid sinus CBCT imaging as a biometric instrument: dataset creation for deep learning identification.
- Research Article
- 10.1016/j.cmpb.2026.109315
- Jun 1, 2026
- Computer methods and programs in biomedicine
- Alba Nogueira-Rodríguez + 15 more
PIBAdb: a public cohort of multimodal colonoscopy videos and images including polyps with histological information.
- Research Article
- 10.1016/j.compmedimag.2026.102775
- May 12, 2026
- Computerized medical imaging and graphics : the official journal of the Computerized Medical Imaging Society
- Pietro Leoncini + 10 more
Generative AI pipeline with model-guided filtering for sim-to-real transfer in surgical imaging.
- Research Article
- 10.1111/nicc.70478
- May 1, 2026
- Nursing in critical care
- Suzan Guven + 2 more
Pain is a multifaceted and subjective phenomenon frequently experienced by patients in intensive care units. In non-communicating populations, conventional assessment tools are often inadequate and susceptible to observer bias. Deep learning-based facial analysis has emerged as a promising approach for the objective quantification of observable pain-related behavioural indicators. This study aimed to evaluate the feasibility and diagnostic accuracy of deep-learning models in categorising pain severity in non-communicative adult intensive care patients, using expert-annotated facial images. Features were extracted via the DenseNet-169 architecture, dimensionally reduced with principal component analysis and classified using support vector machine, random forest and K-nearest neighbours. Data sets were independently annotated by a multidisciplinary team comprising an intensivist, intensive care nurses and a pain specialist. Model performance was assessed through accuracy, precision, sensitivity, the F1 score and the area under the receiver operating characteristic curve, and inter-rater reliability was quantified with Fleiss' kappa coefficient. A total of 636 facial images obtained from 120 adult intensive care unit patients were analysed. The support vector machine model achieved the highest overall performance, with an accuracy of 96.9% and an area under the receiver operating characteristic curve of 0.994, demonstrating exceptional sensitivity in severe pain classification. While K-nearest neighbours showed superior performance for moderate pain detection, random forest yielded the lowest accuracy across all data sets. Notably, inter-rater agreement was low (κ = 0.16), highlighting the significant variability in expert human judgements and the subjective nature of manual pain assessment. Deep learning-based facial analysis provides a valid, reproducible and standardised method for pain assessment in non-verbal intensive care patients. 
The creation of a multi-expert annotated data set and the systematic comparison of classifiers across diverse clinical perspectives represent the original contributions of this study. Automated facial expression analysis minimises inter-observer variability by providing an objective decision support mechanism for critical care nurses. This technology facilitates the standardisation of pain management protocols and bolsters patient safety by reducing the inherent risks of subjective assessment bias.
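The inter-rater statistic reported in this abstract, Fleiss' kappa, can be illustrated with a minimal sketch. The count-matrix layout and the toy ratings below are assumptions for illustration only; they are not the study's data or code.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a count matrix.

    ratings[i][j] is the number of raters who put item i into category j.
    Every item is assumed to be rated by the same number of raters.
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_categories = len(ratings[0])

    # Share of all assignments that fall into each category.
    p_j = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each item.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_bar = sum(p_i) / n_items          # mean observed agreement
    p_e = sum(p * p for p in p_j)       # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)


# Toy example: 2 images, 3 raters, 2 pain categories (made-up counts).
print(fleiss_kappa([[3, 0], [0, 3]]))  # full agreement
print(fleiss_kappa([[2, 1], [1, 2]]))  # split votes
```

Values near 1 indicate strong agreement and values near or below 0 indicate agreement no better than chance, which is how a figure such as 0.16 would be read.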
- Research Article
- 10.1016/j.iswa.2026.200648
- May 1, 2026
- Intelligent Systems with Applications
- Phudinan Singkhamfu + 5 more
Intelligent drone-based framework for autonomous inventory data inspection
- Research Article
- 10.3390/data11050091
- Apr 23, 2026
- Data
- Harbil Arregui + 3 more
This paper presents the toolset, methodology and procedure followed to create a dataset from battery electric vehicle trajectories, called DEVRT—Dataset of Electric Vehicle Real Trips. Understanding the behaviour of electric vehicles and their battery consumption under real-life conditions and journeys is required in the shift towards the electrification of transport of people and goods. This paper aims to contribute with the provision of real measurements in different types of routes and environmental contexts at the time of driving to support data analytics and modelling techniques, essential for extracting actionable insights from electric vehicle battery consumption. The preparation, on-route and post-processing steps of the followed methodology are depicted. The outcome dataset consists of probe data collected over 4 days following heterogeneous routes performed by four different drivers using two electric vehicles (one more suitable to city usage and the other one more suitable for longer trips). This probe data is complemented with associated road network characterisation information, traffic flow measurements and weather extracted from auxiliary data sources. The paper presents a comprehensive description of the geographical characteristics of the trajectories, qualitative and quantitative characterisation of planned routes to create these trajectories, and criteria used to select them.
- Research Article
- 10.1080/20964471.2026.2649432
- Apr 16, 2026
- Big Earth Data
- Mohammad H Vahidnia + 1 more
City landmarks play a pivotal role in fostering a deeper understanding of urban environments and in supporting navigation, wayfinding, and cultural education and engagement. One of the primary objectives of sustainable and smart city management is the design of pervasive computing infrastructures that enable the discovery and retrieval of spatial and non-spatial information from landmarks and points of interest (POIs) in any given context. A promising solution lies in location-based augmented reality (LBAR) platforms for smartphones. Previous research has typically adopted two separate approaches: first, location- and inertial-sensor-based LBAR systems, which face limitations such as GPS errors, narrow fields of view, and sensor malfunctions; and second, image-based deep learning (DL) systems, which typically demand large datasets, impose heavy computational loads, and may introduce latency in real-time applications. To address these gaps, we propose a switching-based pervasive AR design that combines LBAR, DL, and context-awareness. This framework is a complementary switching mechanism in which DL serves as a fallback when LBAR fails to identify a landmark. For this purpose, convolutional neural network (CNN) methods and their lightweight pre-trained variants, such as MobileNetV2, were employed. In addition, we propose a high level of context-awareness for cultural heritage engagement by incorporating parameters such as time of day, user age, ongoing events, and spatial distance. Importantly, our study adopts an end-to-end approach encompassing dataset creation, web-service development, mobile application implementation, and real-world field testing. Experiments conducted on ten well-known landmarks in Tehran, using images curated from social media and the Internet, confirm the effectiveness of the integrated system. MobileNetV2 achieved reliable real-time landmark detection with 0.9100 ± 0.0215 accuracy and 0.9010 ± 0.0249 macro-F1. 
The switching LBAR–DL framework improved detection accuracy by approximately 34% compared with LBAR alone, yielding meaningful statistical results for the ablation analysis. User-centered evaluations with volunteers indicated that 90% of respondents were satisfied with overcoming LBAR limitations, and 68% reported a positive experience in context-aware understanding from landmarks. Finally, a GIS-assisted sustainable development analysis in Tehran demonstrated the potential of the proposed system in smart and sustainable city planning.
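The switching idea described in this abstract (try LBAR first, fall back to the DL classifier when LBAR finds nothing) might be sketched as follows. The function names, the callable interfaces, and the confidence cut-off are all hypothetical; the abstract does not describe the actual implementation.

```python
from typing import Callable, Optional, Tuple


def identify_landmark(
    lbar_lookup: Callable[[], Optional[str]],
    dl_classify: Callable[[], Tuple[str, float]],
    min_confidence: float = 0.5,  # hypothetical cut-off, not from the paper
) -> Tuple[Optional[str], str]:
    """Return (landmark name or None, which path produced the answer)."""
    # Path 1: sensor-based lookup (GPS plus orientation) in the real system.
    name = lbar_lookup()
    if name is not None:
        return name, "LBAR"

    # Path 2: image classifier, used only when the sensor path fails.
    label, confidence = dl_classify()
    if confidence >= min_confidence:
        return label, "DL"
    return None, "none"


# Stand-in callables: the sensor path fails, the classifier is confident.
print(identify_landmark(lambda: None, lambda: ("landmark_B", 0.9)))
```

Keeping the classifier behind the cheaper sensor path means the heavier model runs only when needed, which is consistent with the latency concern raised in the abstract.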
- Research Article
- 10.1038/s41598-026-48169-z
- Apr 12, 2026
- Scientific reports
- Hao Wang + 11 more
Burn injuries are a common pediatric health threat with depth assessment relying heavily on subjective visual inspection. While objective techniques like laser Doppler imaging exist, their cost and portability limitations restrict use. We propose SAM-DR to address the challenge of scarce annotated burn data by repurposing pre-trained models with minimal fine-tuning. By replacing SAM's segmentation head with dense linear regression, our method not only identifies burn locations but also perceives burn depth through continuous depth prediction. Using 294 smartphone images from 94 patients annotated by 9 clinicians, we conducted a pixel-level comparison of human disagreement. SAM-DR achieved a 0.96 Dice score in wound segmentation, establishing state-of-the-art performance, and the use of interactive thresholding enabled segmentation of different burn depths comparable to human experts, suitable for assisted annotation. We developed an interactive tool based on SAM-DR that supports both clinical diagnosis and data annotation, offering a non-contact solution for burn assessment and dataset creation.
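For readers unfamiliar with the Dice score and with thresholding a continuous prediction into a mask, a minimal sketch follows. The flat-list mask layout, the function names, and the toy values are assumptions for illustration, not the SAM-DR code.

```python
def threshold_mask(depth_map, threshold):
    """Binarise a continuous per-pixel prediction at a chosen threshold."""
    return [1 if value >= threshold else 0 for value in depth_map]


def dice_score(pred, target):
    """Dice = 2 * |A and B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Two empty masks are treated as a perfect match by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


predicted = threshold_mask([0.1, 0.6, 0.9, 0.4], threshold=0.5)
reference = [0, 1, 0, 0]
print(predicted, dice_score(predicted, reference))
```

Moving the threshold up or down changes which pixels count as the deeper class, which is the general mechanism behind interactive thresholding.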
- Research Article
- 10.1016/j.dib.2026.112586
- Apr 1, 2026
- Data in brief
- James Weatherhead + 2 more
Hospitals and vendors now run HIPAA-compliant Business Associate Agreement (BAA) large language models (LLMs) for clinical work. These systems do not use input data for further training, so clinicians can enter Protected Health Information (PHI) into them. LLMs are trained on a fixed corpus with a historical cutoff; their answers therefore often need to be supplemented with more recent clinical evidence from external sources such as live web search or other tools that are often not covered by a BAA. This creates a "safe handoff" point where a clinician's PHI-containing query must be transformed into a HIPAA Safe Harbor compliant version before leaving the protected environment. However, publicly shareable datasets for this setting are scarce; this article describes PHI-rich clinician-style questions paired with HIPAA Safe Harbor annotations at the point where an external tool is called. Existing de-identification benchmarks are typically built from long electronic health record narratives such as discharge summaries and clinic notes, rather than from short, compressed search-style queries such as those that might be used in chat-based clinical LLM interfaces. ASQ-PHI (Adversarial Synthetic Queries for Protected Health Information de-identification) is a fully synthetic benchmark dataset designed for this safe handoff setting; no real patient data, electronic health records, or protected health information were accessed, used, or referenced during dataset creation. It contains 1051 single-turn clinical search queries that are designed to resemble prompts that clinicians might enter into HIPAA-compliant LLMs. Each record uses machine-parsable delimiters to separate the free text query from PHI annotations, which are provided as one JSON object per element specifying the HIPAA Safe Harbor identifier category and exact string value. 
The corpus includes 832 PHI-positive queries (79.2%) and 219 hard negatives (20.8%) engineered to mimic PHI-like syntax while containing only non-identifying clinical information such as ages under 90 years, diagnoses, medications, and symptoms. Across the dataset, there are 2973 PHI elements labeled from 13 textual HIPAA Safe Harbor identifier types that can be represented as short alphanumeric strings in single-line clinical questions, supporting the measurement of both PHI removal and over-redaction on PHI-free queries. All queries were generated with an adversarial few-shot prompting pipeline using Azure OpenAI GPT-4o. The associated Mendeley Data repository provides the complete dataset file, a Jupyter notebook that implements the generation pipeline, summary statistics, baseline metrics for a commercial PHI detection service, and six figures that describe the dataset. ASQ-PHI is released under an MIT license.
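The two measurements mentioned in this abstract (PHI removal on positive queries and over-redaction on PHI-free queries) can be sketched as below. The metric definitions, function names, and the made-up query are assumptions for illustration; they are not the dataset's official evaluation code. The last lines simply re-derive the reported split percentages from the stated counts.

```python
def phi_recall(gold_values, redacted_text):
    """Share of annotated PHI strings that no longer appear in the output."""
    if not gold_values:
        return 1.0
    removed = sum(1 for value in gold_values if value not in redacted_text)
    return removed / len(gold_values)


def over_redaction_rate(original_queries, redacted_queries):
    """Share of PHI-free queries that the de-identifier altered anyway."""
    changed = sum(
        1 for before, after in zip(original_queries, redacted_queries)
        if before != after
    )
    return changed / len(original_queries)


# Entirely made-up query: the name was removed, the record number was missed.
gold = ["John Doe", "1234567"]
output = "Pt [NAME], MRN 1234567, metformin dosing with reduced eGFR?"
print(phi_recall(gold, output))

# Split percentages recomputed from the counts given in the abstract.
total = 832 + 219
print(total, round(100 * 832 / total, 1), round(100 * 219 / total, 1))
```

Exact string matching is the simplest possible check; a real evaluation would likely need to handle partial spans and reformatted values.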
- Research Article
- 10.1016/j.atmosres.2026.108763
- Apr 1, 2026
- Atmospheric Research
- Iman Goudarzi + 4 more
Accurate knowledge of precipitation at high spatio-temporal resolution is essential for climate studies and hydrological applications, particularly in mountainous regions where traditional models often underperform due to coarse resolution and sparse observational networks. In this study, we present a machine learning-based approach to enhance ERA5 reanalysis precipitation estimates using the satellite-derived IMERG (Integrated Multi-satellite Retrievals for GPM) product as a reference. We focus on the Greater Alpine Region (GAR), using extreme gradient boosting combined with Shapley additive explanations to identify the most influential ERA5 variables. This method enables the creation of a new daily rainfall dataset, ML-IMEX-GAR (Machine Learning IMERG backward-EXtended precipitation dataset over GAR), at IMERG's spatial resolution for the historical period 1960–2000. Compared to ERA5, ML-IMEX-GAR reduces the spatiotemporal RMSD against IMERG by approximately 14%, and achieves strong agreement with in-situ observational monthly data, with an R² of 0.87. These findings demonstrate the potential of machine learning to correct reanalysis biases, improve historical precipitation reconstructions, and support climate change research in data-scarce, complex terrains.
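The two agreement measures quoted in this abstract, RMSD and R², can be written out in a few lines. This is a generic sketch with made-up numbers; whether the authors computed R² as a coefficient of determination (as here) or as a squared correlation is not stated in the abstract.

```python
import math


def rmsd(estimate, reference):
    """Root-mean-square difference between two equally long series."""
    squared = [(e - r) ** 2 for e, r in zip(estimate, reference)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def relative_reduction(baseline_error, new_error):
    """Fractional error reduction of a new product over a baseline."""
    return (baseline_error - new_error) / baseline_error


# Made-up daily precipitation values in mm, for illustration only.
reference = [1.0, 2.0, 3.0]
print(rmsd([1.0, 2.0, 5.0], reference))
print(r_squared(reference, [1.0, 2.0, 3.0]))
print(relative_reduction(1.0, 0.86))
```

A reduction of about 0.14 from `relative_reduction` corresponds to the "approximately 14%" style of statement used in the abstract.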
- Research Article
- 10.3384/nejlt.2000-1533.2026.5725
- Mar 27, 2026
- Northern European Journal of Language Technology
- Olha Kanishcheva + 3 more
This paper presents experiments on language identification for a Ukrainian-Russian code-switching dataset. Code-switching, a common phenomenon in multilingual societies, presents significant challenges for natural language processing. This study discusses various issues encountered during dataset creation, emphasizing the complexity of accurately annotating code-switching text. The study describes cases where identifying the language of individual tokens in sentences that switch between Ukrainian and Russian proves difficult even for human annotators. The relatedness of the languages and the use of Cyrillic in both orthographic systems complicate the task, leading to many cases where words are spelled identically despite clear phonetic differences between the languages that are not reflected in writing. The study explores different models and libraries for language identification on the token level. Experimental results suggest that BERT shows promising performance; however, other models, such as CRFs with n-grams, Char-level BiLSTM, and Word-level Neural Networks, are also promising for this task. This research contributes to the development of language processing technologies for multilingual contexts, with potential applications in sentiment analysis, information retrieval, and social media monitoring.
- Research Article
- 10.1038/s41598-026-44805-w
- Mar 25, 2026
- Scientific Reports
- Amir Azadi + 7 more
Youth ice hockey lacks scalable, automated tools to quantify collision exposure and efficiently identify candidate head-impact events from routine game video. This study presents a player-centric, video-based pipeline for detecting physical contact events in single-view full-game footage and demonstrates its utility as a pre-filter to accelerate head-impact dataset creation and support injury surveillance. For training and validation, we constructed labeled player-centric clips around manually annotated contact events; for full-game testing, we applied the same pipeline to continuous game footage. In both settings, the unit of analysis is a fixed-duration (1 s) player-centric clip, and the model outputs one binary label (contact vs. non-contact) per clip. Players were detected using a youth-tuned You Only Look Once (YOLO)v8 model and tracked using StrongSORT with an intersection over union (IoU)-assisted matching cost to improve identity continuity under temporary occlusions. Contact events were manually annotated in 20 youth games (Under-11, Under-13, and Under-15). For each event, annotators recorded the event frame and marked the impacted player (head/body location), which was used to associate the event with the corresponding tracklet (point-in-box). A 60-frame clip spanning a ±0.5 s window around the event (60 fps) was extracted and uniformly subsampled to 30 frames for classification. Non-contact clips were sampled from tracklet windows that did not overlap annotated events, yielding contact:non-contact ratios of ≈1:4, ≈1:6, and ≈1:9. A Temporal Shift Module (TSM) classifier processed each player-centric clip, and we evaluated the effects of crop scale and class imbalance on contact detection performance. The best configuration (15 segments, shift division = 4, 1.5× crop, ≈1:9 training ratio) achieved strong contact detection under evaluation conditions. 
The final model was then applied to two unseen full-length Under-13 games by partitioning each game into non-overlapping 1 s segments (60 frames at 60 fps), detecting and tracking players, post-processing tracklets, and classifying a player-centric clip for every tracked player. This full-game evaluation performed substantially above a random baseline and provides a practical operating point under realistic class imbalance. As a downstream demonstration, the contact detector served as an effective pre-filter for head-impact review: At the default decision threshold (0.5), 19 of 22 manually identified head impacts occurred in player-centric clips predicted as contact (86.4% head-impact recall), reducing manual review from over three hours to under 30 minutes per game. Overall, the proposed pipeline enables scalable contact-event monitoring in youth hockey and substantially reduces the burden of curating head-impact datasets from full-game video footage.
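The clip arithmetic in this abstract (a ±0.5 s window at 60 fps, subsampled to 30 frames) and the reported head-impact recall can be checked with a short sketch. The exact window boundaries and the subsampling rule are assumptions; the abstract only gives the clip length and frame counts.

```python
def clip_window(event_frame, fps=60, half_window_s=0.5):
    """Frame indices for a window centred on the annotated event frame.

    Assumes the window is [event - 30, event + 30), giving 60 frames at 60 fps.
    Events near the start of a video would need clamping, omitted here.
    """
    half = int(round(fps * half_window_s))
    start = event_frame - half
    return list(range(start, start + 2 * half))


def uniform_subsample(frames, n_keep=30):
    """Keep n_keep evenly spaced frames from the clip."""
    step = len(frames) / n_keep
    return [frames[int(i * step)] for i in range(n_keep)]


clip = clip_window(event_frame=100)
sampled = uniform_subsample(clip)
print(len(clip), len(sampled), sampled[:3])

# Head-impact recall recomputed from the counts given in the abstract.
print(round(100 * 19 / 22, 1))
```

With `event_frame=100` the window covers frames 70 to 129 and the subsample keeps every second frame, starting at frame 70.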
- Research Article
- 10.1609/aaai.v40i48.42275
- Mar 14, 2026
- Proceedings of the AAAI Conference on Artificial Intelligence
- Alexander Sachuk + 2 more
The management and annotation of complex, multi-modal scientific data remains a major obstacle for AI-driven research due to poor reusability and scalability of current solutions. We propose SciDataMAS, a novel LLM-powered multi-agent system (MAS), which automates scientific data management through a structured data lake with provenance-based organization and an adaptive metadata taxonomy. The system uses specialized workflows for automated dataset creation, data insertion and retrieval. Experiments show the system's proficiency, with modern LLMs like GPT-5 successfully generating rich metadata schemas and filling them with high accuracy. This work provides a foundational step towards fully automated, reusable, and scalable scientific data organization, which may lead to the generation and accumulation of well-annotated, AI-ready datasets by the scientific community.
- Research Article
- 10.1038/s41598-026-41862-z
- Mar 3, 2026
- Scientific reports
- Yihan Dong + 1 more
Fact-checking is crucial because rumours and misinformation negatively impact social networking services (SNS) and online discussions. Meanwhile, fact-checking with large language models (LLMs) is becoming increasingly popular as the performance of LLMs improves. However, previous work has issues, including overconfidence in the judgment results of LLMs and the insufficiency of binary fact-checking given the complexity of text. On the other hand, using multiple information sources to make judgments reveals another obstacle: the lack of proper scoring mechanisms. Thus, we propose a framework called multi-agent fact-checking (MAFC), which includes multiple agents with unique information sources to measure the text's credibility. Specifically, a new scoring mechanism is used to calculate credibility according to each agent's judgment results and confidence. We tested our proposed method through several comparative experiments. The results show that the proposed method performs better than other baselines in both the binary fact-checking task and the multi-label fact-checking task. Finally, the challenges and obstacles existing in fact-checking fields, such as the definition standards and dataset creation, are discussed.
- Research Article
- 10.1038/s41746-026-02473-0
- Feb 27, 2026
- NPJ digital medicine
- Zonghai Yao + 6 more
Eviction is a significant yet understudied social determinant of health (SDoH), linked to housing instability, unemployment, and mental health. While eviction appears in unstructured electronic health records (EHRs), it is rarely coded in structured fields, limiting downstream applications. We introduce SynthEHR-Eviction, a scalable pipeline that adapts and integrates human-in-the-loop annotation, automated prompt optimization (APO), and reasoning-augmented fine-tuning for low-resource eviction-related SDoH extraction from clinical notes. Using this pipeline, we created the largest public eviction-related SDoH dataset to date, comprising 14 fine-grained categories. Fine-tuned LLMs (e.g., Qwen2.5, LLaMA3) trained on SynthEHR-Eviction achieved Macro-F1 scores of 88.8% (eviction) and 90.3% (other SDoH) on human-validated data, outperforming GPT-4o-APO (87.8%, 87.3%), GPT-4o-mini-APO (69.1%, 78.1%), and BioBERT (60.7%, 68.3%), while enabling cost-effective deployment across various model sizes. The pipeline reduces annotation effort by over 80%, accelerates dataset creation, enables scalable eviction detection, and generalizes to other information extraction tasks.
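Macro-F1, the headline metric in this abstract, averages per-class F1 scores so that rare categories weigh as much as common ones. A minimal sketch follows; the class names and labels are placeholders, not categories or results from the SynthEHR-Eviction dataset.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        scores.append(f1)
    return sum(scores) / len(scores)


# Placeholder labels for two hypothetical categories.
truth = ["class_a", "class_a", "class_b", "class_b"]
guess = ["class_a", "class_b", "class_b", "class_b"]
print(macro_f1(truth, guess))
```

Because every class contributes equally, a model that ignores a rare category is penalised more under macro-F1 than under plain accuracy.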
- Research Article
- 10.1088/1361-6501/ae45c0
- Feb 27, 2026
- Measurement Science and Technology
- Zhonghan Li + 4 more
On-board sensor Fault Diagnosis and Detection (FDD) is a critical component of Unmanned Aerial Vehicle (UAV) health monitoring. Developing real-time FDD for UAVs requires overcoming the conflict between the high performance demanded by their dynamic characteristics and the limited on-board computational capabilities. Furthermore, the resulting lightweight FDD algorithms often face issues such as training instability and poor generalization, which are exacerbated by the imbalance and coupling of sensor fault data. To address these issues, this paper introduces a novel EKF-based, lightweight FDD method for UAV Inertial Measurement Units (IMUs). By integrating a multi-rate filtering algorithm with deep learning, our approach accurately diagnoses multiple fault modes, achieving more than 93% precision and accuracy in offline evaluations. Beyond the core model, we propose a complete deployment pipeline for real-time sensor FDD, covering the entire process from offline dataset creation and lightweight model training to final on-board deployment. Online validation experiments were conducted using an IMU with a 100 Hz update rate. The proposed approach demonstrated effective real-time detection of mixed-mode faults, achieving a low error rate of 6.24%, an improvement of 71.24% over the baseline model. The single-inference latency of this method is less than 11 ms, which largely meets the real-time and on-board generalization requirements for UAV applications.
- Research Article
- 10.2196/62734
- Feb 26, 2026
- JMIR research protocols
- António Sampaio Soares + 10 more
Complications following abdominal surgery have a very significant negative impact on the patient and the health care system. Despite the spread of minimally invasive surgery, there is no automated way to use intraoperative video to predict complications. New developments in data storage capacity and artificial intelligence (AI) algorithm creation now allow for this. This project aims to develop and validate deep learning models for accurately predicting postoperative complications, classified using the Clavien-Dindo scale. A key objective is to build and share an open-source dataset containing both intraoperative video data and postoperative outcomes. This prospective cohort study will collect data reflecting day-to-day surgical practice from 1200 patients, focusing on patient outcomes and intraoperative video. Data will be collected from patients undergoing minimally invasive appendectomy, cholecystectomy, and colorectal resection in the urgent and elective settings. Each video will be annotated at the temporal and semantic level by the study team. Comprehensive data collection will encompass three domains: (1) preoperative variables, including patient demographics, comorbidities, laboratory values, and imaging findings; (2) intraoperative data featuring complete surgical video recordings from laparoscopic or robotic monitors, procedure duration, surgical approach, intraoperative complications, and surgeon-defined technical factors; and (3) 30-day postoperative outcomes classified using the Clavien-Dindo scale (grades I-V). This dataset will be shared under a noncommercial CC BY-NC-SA use license to promote scientific collaboration and innovation, with complete anonymization including metadata removal and out-of-body image blurring. For analysis, the dataset will be split into training, validation, and testing sets. 
Deep learning algorithms will be developed through supervised learning methodology using 2 parallel approaches: data-derived predictors using fine-tuned surgical video foundational models based on vision transformer architectures and surgeon-defined predictors based on documented intraoperative strategies. Algorithms will be trained on the training set to predict the Clavien-Dindo postoperative complication grade and categorize postoperative outcomes in minimally invasive abdominal surgery. Model performance will be analyzed through sensitivity, specificity, positive and negative predictive values, and area under the receiver operating characteristic curve on the validation and testing sets. Data collection started in 2024 and is expected to extend throughout 2025. The planned outputs include the publication of a research protocol, main results, and the open-source dataset. Through this initiative, the project seeks to significantly advance the field of AI-assisted surgery, contributing to safer and more effective practice. Through the creation of an open dataset and the development of state-of-the-art deep learning models, this project seeks to transform the current paradigm in minimally invasive surgery. By providing the surgical AI community with robust, real-world data, the project aspires to catalyze innovations that will enhance surgical safety; refine predictive capabilities; and, ultimately, lead to better clinical outcomes.
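The evaluation measures listed in this protocol (sensitivity, specificity, positive and negative predictive values) all derive from one confusion matrix. The sketch below assumes the outcome has been reduced to a binary label, for example complication versus no complication; that binarisation and the toy labels are assumptions, since the protocol predicts Clavien-Dindo grades.

```python
def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV and NPV from binary labels (1 = event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(numerator, denominator):
        # Undefined ratios are reported as NaN rather than raising.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Made-up labels for eight hypothetical patients.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
guess = [1, 1, 0, 0, 0, 0, 0, 1]
print(binary_metrics(truth, guess))
```

Predictive values depend on how common complications are in the evaluated set, so they should be read alongside the split's event rate.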
- Research Article
- 10.1145/3795686
- Feb 19, 2026
- ACM Computing Surveys
- Heydar Soudani + 3 more
Recent advancements in conversational systems have significantly enhanced human-machine interactions across various domains. However, training these systems is challenging due to the scarcity of specialized dialogue data. Traditionally, conversational datasets were created through crowdsourcing, but this method has proven costly, limited in scale, and labor-intensive. As a solution, the development of synthetic dialogue data has emerged, utilizing techniques to augment existing datasets or convert textual resources into conversational formats, providing a more efficient and scalable approach to dataset creation. In this survey, we offer a systematic and comprehensive review of multi-turn conversational data generation, focusing on three types of dialogue systems: open domain, task-oriented, and information-seeking. We categorize the existing research based on key components like seed data creation, utterance generation, and quality filtering methods, and introduce a general framework that outlines the main principles of conversation data generation systems. Additionally, we examine the evaluation metrics and methods for assessing synthetic conversational data, address current challenges in the field, and explore potential directions for future research. Our goal is to accelerate progress for researchers and practitioners by presenting an overview of state-of-the-art methods and highlighting opportunities to further research in this area.
- Research Article
- 10.52436/1.jutif.2026.7.1.5315
- Feb 15, 2026
- Jurnal Teknik Informatika (Jutif)
- Asro Nasiri + 3 more
Software Defect Prediction (SDP) is a crucial component of software engineering aimed at improving quality and testing efficiency. However, the majority of SDP research often overlooks the fundamental influence of the programming paradigm on the nature and causes of defects. This study presents a comparative analysis to identify the most influential software metrics for predicting defects across two distinct paradigms: Object-Oriented (OOP) and Structured. To ensure modern relevance and reproducibility, we constructed two new datasets from large-scale, open-source projects: Apache Camel (Java) for OOP and Redis (C) for Structured, which exhibited realistic defect rates of 14.4% and 21.8%, respectively. The dataset creation process involved mining Git repositories for defect labeling and automated metric extraction using the CK and Lizard tools. Correlation analysis and baseline modeling using Random Forest revealed significant differences between the paradigms. In the OOP system, dominant defect predictors were related to the complexity of the class interface and features (e.g., uniqueWordsQty, totalMethodsQty, WMC, CBO). Conversely, defects in the structured system were strongly correlated with size and algorithmic complexity (e.g., file_tokens, file_loc, file_ccn_sum). Although the baseline models performed well (ROC–AUC = 0.82–0.87), the significant class imbalance resulted in low recall (44–50%). This motivates the need for more context-aware approaches. These findings underscore that effective SDP strategies must be tailored to the underlying programming paradigm.
- Research Article
- 10.1038/s41598-026-35247-5
- Feb 4, 2026
- Scientific reports
- Ebrahim Ghaith + 3 more
Hybrid beamforming is a promising approach to alleviate hardware complexity in multi-user multiple-input single-output (MU-MISO) systems while maintaining high data rate performance. Unfortunately, hybrid beamforming architecture design is a challenging non-convex optimization problem due to stringent hardware constraints. Traditional hybrid beamforming design methods, such as alternating minimization (AltMin) algorithms, rely on iterative optimization procedures that introduce heavy computational overhead and make them impractical for real-time applications. In this paper, we propose a deep learning (DL)-based hybrid beamforming method (DL-HBF) that aims to reduce computational latency while achieving acceptable sum-rate performance. Furthermore, we evaluate these methods on a realistic channel model to ensure practical significance and assess their performance under imperfect channel state information (CSI). Additionally, we propose dataset generation procedures that reduce the dataset creation and training overhead compared to existing DL-based hybrid beamforming methods, which helps rapid deployment and scalability. Simulation results show that the proposed DL-HBF achieves an acceptable sum rate compared to traditional methods while reducing the computational complexity and maintaining robustness against channel estimation errors, which provides a practical solution for real-time hybrid beamforming for next-generation wireless systems.