Articles published on Data Dictionary
- Research Article
- 10.1016/j.aei.2025.103670
- Nov 1, 2025
- Advanced Engineering Informatics
- Shihang Zhang + 6 more
Semantic enrichment of BIM models for construction cost estimation in pumped storage hydropower using industry foundation classes and interconnected data dictionaries
- Research Article
- 10.23889/ijpds.v10i2.2956
- Oct 13, 2025
- International Journal of Population Data Science
- Rebecca S Pepe + 1 more
Data transparency lays the groundwork for the ethical use of administrative data. This is particularly true for linked administrative data within integrated data systems (IDS). Data dictionaries, resources that maintain the metadata of the information housed in an IDS, offer a tool to ensure transparency throughout the data life cycle. The FAIR Principles, which assert that data be Findable, Accessible, Interoperable, and Reusable, provide a useful framework by which to measure the effectiveness of data dictionaries in the IDS context. This paper uses the FAIR Principles to discuss the ways in which data dictionaries serve as tools in the ethical and transparent use of integrated data, as well as the challenges that remain. Linked administrative data is a valuable source of information for programmatic and academic research. Data dictionaries facilitate the ethical handling of this sensitive information and maintain a commitment to transparency in data inquiry and research.
- Research Article
- 10.1287/msom.2025.0134
- Oct 13, 2025
- Manufacturing & Service Operations Management
- Vivek Astvansh + 1 more
Problem definition: A firm’s stakeholders would benefit from a quantitative measure of the managerial disclosure of its operational risk and the implications of this disclosure. However, a quantitative and scalable measure of a firm’s disclosed operational risk is unavailable. Thus, the firm’s stakeholders remain unaware of its disclosed operational risk and the risk’s implications. Methodology/results: U.S. law requires a public firm to textually disclose its operational and nonoperational risks in Item 1A of its annual report (i.e., Form 10-K). We train 64 transformer models on U.S. public firms’ Item 1A text to score 131,920 firm-years (16,959 firms, 2005 to 2024) on eight risk factors: (1) accounting, (2) finance, (3) international, (4) legal, (5) management, (6) marketing, (7) operations, and (8) technology. We measure each transformer’s performance on eight metrics. Next, our Python code retains each risk factor’s best-performing transformer (among the eight). Subsequently, our regression estimates report that a firm’s disclosed operational risk is positively associated with its operational cost and that its disclosed nonoperational risk strengthens this positive association, thus supporting our two hypotheses. Our OSF repository includes an Excel file that contains a data dictionary and count and probability scores of the eight risk factors for 131,920 firm-years (16,959 firms, 2005 to 2024). The repository also includes our Python code file and trained models’ files. Managerial implications: First, our empirical evidence informs managers and corporate stakeholders that a firm’s disclosed operational and nonoperational risks are associated with its operational cost, thus showcasing disclosure’s relevance. Second, our Excel data file provides stakeholders with eight risk factors for 131,920 firm-years (16,959 firms, 2005 to 2024). 
Third, one can also use our Python code files and trained transformers to measure risk reflected in other sources of firm-generated text (e.g., managers’ answers in earnings calls, CEO interviews, and press releases). Supplemental Material: The online appendix is available at https://doi.org/10.1287/msom.2025.0134.
- Research Article
- 10.1186/s12961-025-01397-7
- Oct 9, 2025
- Health Research Policy and Systems
- Tigest Tamrat + 27 more
Background: Despite the potential for digital tools to facilitate guideline uptake, translating paper-based narrative guidelines into digital formats is resource-intensive and may compromise the fidelity to the recommended content. The World Health Organization (WHO) launched the SMART Guidelines initiative, in which digital adaptation kits (DAKs) are a foundational component. DAKs comprise software requirements documentation, including a detailed data dictionary and algorithms derived from WHO guidelines, for encoding within digital systems. Methods: This implementation research consists of a formative assessment and impact evaluation on integrating DAKs within national digital systems to improve service delivery outcomes for antenatal care (ANC), family planning, and HIV in two countries (Ethiopia and Ghana). The formative phase will assess the requirements to customize the DAKs to align with the national protocols and subsequently incorporate the localized DAKs’ content into the respective nationally endorsed digital systems: Bahmni in Ethiopia and DHIS2 tracker in Ghana. The impact evaluation will assess the effect of using the DAK-upgraded digital systems using pre–post designs in Ethiopia and Ghana. Primary outcomes of adherence to guideline recommendations will be assessed when digital systems incorporate country-adapted DAK content in comparison with the existing practice. Guideline knowledge questionnaires and in-depth interviews with software developers, health workers and facility managers will supplement the impact evaluation. Discussion: This research represents one of the first impact evaluations focused on integrating DAKs into existing national digital systems and the effect on service delivery outcomes. The mixed-methods study design will provide learnings for future scale-up and replication across other countries.
We expect final results to be available in 2026, and preliminary findings will be shared at relevant fora. Trial registration: https://www.isrctn.com/ISRCTN18394724. Registration date: 21 December 2022. Supplementary Information: The online version contains supplementary material available at 10.1186/s12961-025-01397-7.
- Research Article
- 10.20396/rebpred.v6i00.20618
- Oct 9, 2025
- Revista Brasileira de Preservação Digital
- Raphael Figueiredo Xavier + 1 more
Introduction: Digital preservation in institutional repositories requires metadata capable of ensuring the integrity, authenticity, and longevity of digital objects. In this context, standards such as Dublin Core—widely adopted in platforms like DSpace—show limitations regarding the documentation of technical and preservation aspects, thus justifying the use of complementary schemas such as PREMIS. Objective: This article aims to propose a conceptual mapping between the elements of the PREMIS Object entity and the metadata available in the DSpace repository, focusing on Dublin Core fields, the original bundle, and file header data. Methodology: This is a theoretical-applied study based on document analysis of the DSpace structure, the PREMIS Data Dictionary, and technical metadata extraction practices, with exclusive focus on the Object entity. Results: The study presents a correspondence table between PREMIS Object entity elements and available DSpace data sources, demonstrating the feasibility of structuring preservation metadata packages using existing infrastructure. Conclusion: The proposed mapping provides guidance for institutions interested in incrementally adopting PREMIS, showing that much of the required data is already available within DSpace, requiring only conceptual reorganization.
- Research Article
- 10.1088/1742-6596/3109/1/012077
- Oct 1, 2025
- Journal of Physics: Conference Series
- Liu Ying + 5 more
This study addresses the long-term operational requirements of the International Lunar Research Station (ILRS) by investigating the development of an actionable resource repository framework. Through analyzing scheduling patterns and spatiotemporal distribution characteristics of resource entities in lunar station operations, a hierarchical resource system architecture is designed to enable systematic management. A formal descriptive framework for resource attributes is constructed using an entity-relationship (E-R) model, which explicitly characterizes resource types, functional capabilities, and service availability in operational contexts. By integrating dual constraint mechanisms (data dictionaries for standardized attribute domain definitions and knowledge graphs for semantic relationship modeling), the study achieves structured representation of resource attributes and their associative interactions, completing the digital modeling of both specific resource properties and interactive dynamics. Building on this foundation, a five-layer knowledge hypergraph architecture and a three-layer dynamic hypergraph architecture based on temporal scales are proposed to comprehensively capture multi-dimensional resource associations. This architecture differentiates between abstract/concrete resource representations, common/individual attribute features, and hierarchical levels of relationship abstraction, enabling systematic description of complex interdependencies. It also characterizes the time-varying attributes of resources and the dynamic characteristics of resource entity relationships across different temporal granularities. It incorporates design principles for sustainable scalability and modular customization, essential for adapting to evolving operational demands in lunar environments.
The research concludes with discussions on prospective applications of the operational resource repository in facilitating intelligent resource management, optimized task scheduling, and sustainable development of the ILRS. These contributions establish a theoretical framework to enhance the operational efficiency of lunar station resource systems, supporting long-term habitability and scientific productivity in extraterrestrial research infrastructure.
- Research Article
- 10.3390/heritage8100410
- Sep 30, 2025
- Heritage
- Karol Argasiński + 1 more
Preservation of Cultural Heritage (CH) demands precise and comprehensive information representation to document, analyse, and manage assets effectively. While Building Information Modelling (BIM) facilitates as-is state documentation, challenges in semantic interoperability of complex cultural data often limit its potential in heritage contexts. This study investigates the integration of BIM tools with the buildingSMART Data Dictionary (bSDD) platform to enhance semantic interoperability for heritage assets. Using a proof-of-concept approach, the research focuses on a historic tenement house in Tarnów, Poland, modelled with the IFC schema standard and enriched with the MIDAS heritage classification system. The methodology includes transforming the classification system into a bSDD data dictionary, publishing thesauri for components, materials, and monument types, and semantically enriching the model using the Bonsai (formerly BlenderBIM) plugin for Blender. Results demonstrate improved consistency, accuracy, and usability of BIM data for heritage preservation. The integration ensures detailed documentation and facilitates interoperability across platforms, addressing preservation challenges with enriched narratives of cultural significance. This method supports future predictive models for heritage asset conservation, emphasizing the importance of data quality and interoperability in safeguarding shared cultural heritage for future generations.
- Research Article
- 10.1016/j.jtho.2025.09.1756
- Sep 25, 2025
- Journal of thoracic oncology : official publication of the International Association for the Study of Lung Cancer
- Kimberly A Shoenbill + 12 more
Recommendations for Standardization of Tobacco Use Treatment Data.
- Research Article
- 10.3233/shti251400
- Sep 3, 2025
- Studies in health technology and informatics
- Leslie Diana Wamba Makem + 3 more
The heterogeneity of metadata continues to be a key challenge in the healthcare sector. The Data Dictionary Minimal Information Model (DDMIM) aims to meet the need for interoperability between different standards and data dictionaries to facilitate the exchange of metadata. This paper presents the conception and development of a metadata search portal based on the DDMIM specification, designed to improve the discoverability and accessibility of health datasets and enhance interoperability. We conducted a literature review of existing metadata repositories (MDRs) to select potentially relevant ones for further work. A mapping was created to transform metadata from different MDRs into the DDMIM format. In parallel, the requirements for a prototype search portal that integrates metadata from various public repositories are being evaluated. The results show that a DDMIM-based search portal can effectively integrate heterogeneous metadata sources and improve the finding of health datasets. Such a portal supports the integration of heterogeneous metadata sources and ensures compliance with FAIR principles to optimize the use of health data for research and clinical applications. It is therefore of great importance to address the existing challenges in the field of medical data integration and utilization.
- Research Article
- 10.32627/dimamu.v4i3.1619
- Aug 30, 2025
- Jurnal Dimamu
- Rahma Fitriani + 1 more
This study found that the system used for data processing was still manual and semi-computerized, relying on paper and Microsoft Excel. This makes it difficult for employees to look up which students have and have not made payments, to record expenses, and to compile reports. The method used in this research is the System Development Life Cycle (SDLC), which includes planning, needs analysis, design, coding, testing, maintenance, and hardware and software requirements. The author uses Flowmap analysis, Data Flow Diagrams (DFD), Entity Relationship Diagrams (ERD), a Data Dictionary, and Structure Charts. The school financial information system at SMK Ganesha Cimanggung is built using Microsoft Visual FoxPro 9.0. The resulting information system is designed to help employees find the balance of receipts, payments, and expenses and to produce reports.
- Research Article
- 10.1016/j.cmi.2025.08.019
- Aug 23, 2025
- Clinical microbiology and infection : the official publication of the European Society of Clinical Microbiology and Infectious Diseases
- Alessandro Visentin + 18 more
Lessons from the European mpox outbreak: strengthening cohort research for future pandemic preparedness.
- Research Article
- 10.1016/j.dib.2025.111996
- Aug 19, 2025
- Data in Brief
- Mogeeb A. A. Mosleh + 3 more
ArYSL: Arabic Yemeni sign language dataset
- Research Article
- 10.1016/j.jclinepi.2025.111920
- Aug 1, 2025
- Journal of clinical epidemiology
- Ka Hin Tai + 9 more
Key concepts in clinical epidemiology: FAIRification of biomedical research data.
- Research Article
- 10.1016/j.dib.2025.111959
- Aug 1, 2025
- Data in Brief
- Thomas J Trout + 3 more
Colorado sunflower water use, physiology and productivity dataset
- Research Article
- 10.1080/23744731.2025.2534311
- Jul 20, 2025
- Science and Technology for the Built Environment
- Jeffrey Ouellette + 2 more
The ASHRAE RP-1815 project centers on advancing the state of occupant behavior modeling (OBM) as part of building information modeling (BIM) and building energy modeling and simulation (BEM) workflows. Enhancements to the industry data standards IFC and gbXML for BIM and BEM to support occupant behavior modeling data were a specific area of focus and a project deliverable. The project emphasizes the crucial role of OBM in enhancing energy modeling accuracy and building performance analysis, identifying the need for more effective integration to capture the impact of occupant behavior on building energy use. As part of this initiative, a comprehensive review has been conducted of how existing technologies and standards support OBM. This work specifically anticipates use of gbXML and IFC in energy modeling workflows using the DOE’s EnergyPlus energy modeler, including the system’s Functional Mock-up Unit (FMU) co-simulation capabilities, to support dynamic modeling of occupant behavior impacts on energy performance simulation. To support these workflows, revisions to the gbXML and obXML schemas have been developed. IFC-targeted enhancements have led to the development of an online data dictionary service. Enhanced BIM-to-BEM workflows offering OBM are presented together with suggestions for future technical and process enhancements required to support commercial applications.
- Research Article
- 10.1177/18333583251352646
- Jul 17, 2025
- Health information management : journal of the Health Information Management Association of Australia
- Julie L Morrison + 14 more
Background: Clinical Quality Registries (CQRs) capture clinical practice data to monitor the performance of health services against agreed standards of care. Ensuring data timeliness, completeness and reliability are challenges for CQRs, as data are prospectively collected and time sensitive. The Australian Stroke Clinical Registry (AuSCR) commenced in 2009 and includes 67 hospitals voluntarily collecting data on patients with acute stroke (at December 2024). Objective: To describe the methods used to ensure data quality in a national CQR, using the AuSCR as a case study. Method: Methods from the AuSCR were described against The Australian Framework for CQRs (2024), focusing on three operating principles for data quality: "Data collection," "Data elements" and "Ensuring data quality." Results: The AuSCR meets these principles through: (1) an online data platform to import data from primary sources and perform logic checks; (2) provision of comprehensive training, a data dictionary and user manuals for contributors; (3) medical record audits; (4) bi-annual hospital data quality reports and near real-time dashboards including data discrepancies; (5) cross-referencing data against government admissions data. Our processes extend to patient-reported follow-up data collected within 90-180 days of admission. Conclusion: Managing health information in a national CQR involves multiple methods to ensure data quality and minimise clinician data entry time. The AuSCR is an exemplar program to guide the field. Implications for health information management practice: CQRs are rapidly adopting streamlined processes to collect, manage and validate data to maximise the quality of health information for clinical practice improvement.
- Research Article
- 10.1093/jamiaopen/ooaf052
- Jul 3, 2025
- JAMIA Open
- Aasiyah Rashan + 10 more
Objective: Federated analysis is a method that allows data analysis to be performed on similar datasets without exchanging any data, thus facilitating international research collaboration while adhering to strict privacy laws. This study aimed to evaluate the feasibility of using federated analysis to benchmark mortality in 2 critical care quality registry databases converted to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), describing challenges to and recommendations for performing federated analysis on data transformed to OMOP CDM. Materials and Methods: To identify as many challenges as possible and to be able to complete the benchmarking phase, a 2-step approach was taken during implementation. The first step was a naive implementation to allow challenges to surface naturally; the second step was developing solutions for the encountered challenges. Expected patient mortality risk was calculated by applying the Acute Physiology and Chronic Health Evaluation II (APACHE II) model to data from OMOP CDM databases containing adult ICU encounters between July 1, 2019 and December 31, 2022. An analysis script was developed to calculate comparable, registry-level standardized mortality ratios. Challenges were recorded and categorized into predefined categories: “data preparation,” “data analysis plan,” and “data interpretation.” Challenges specific to the OMOP CDM were further categorized using published steps from an existing generic harmonization process. Results: A total of 7 challenges were identified, 4 of which were related to data preparation, 1 to data analysis, and 1 to data interpretation. Out of all 7 challenges, 4 stemmed from decisions made during the implementation of OMOP CDM. Several recommended solutions were distilled from the naive approach. Discussion: Federated analysis facilitated by a CDM is a feasible option for critical care quality registries.
However, future analysis is influenced by decisions made during the CDM implementation process. Thus, prior publication of data dictionaries and the use of metadata to communicate data handling and data source classification during CDM implementation will improve the efficiency and accuracy of subsequent analysis.
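As a rough illustration of the benchmarking quantity named in this abstract, the sketch below computes a standardized mortality ratio (observed deaths divided by the sum of predicted mortality risks) from site-level totals only. It is not the authors' analysis script: the `SiteSummary` class, the function names, and the toy numbers are hypothetical, and the per-patient risks are assumed to have been produced already (for example, by an APACHE II model).

```python
from dataclasses import dataclass


@dataclass
class SiteSummary:
    """Aggregate totals a registry shares; no patient-level rows leave the site."""
    observed_deaths: int
    expected_deaths: float  # sum of per-patient predicted mortality risks


def summarise_site(outcomes, predicted_risks):
    """Run locally at each registry: reduce patient-level data to two totals."""
    if len(outcomes) != len(predicted_risks):
        raise ValueError("outcomes and predicted_risks must have equal length")
    return SiteSummary(sum(outcomes), sum(predicted_risks))


def standardized_mortality_ratio(summary):
    """SMR = observed deaths / expected deaths."""
    if summary.expected_deaths <= 0:
        raise ValueError("expected deaths must be positive")
    return summary.observed_deaths / summary.expected_deaths


# Hypothetical toy data: 1 = died, 0 = survived; risks are made-up probabilities.
site_a = summarise_site([1, 0, 0, 1], [0.5, 0.25, 0.25, 1.0])
smr_a = standardized_mortality_ratio(site_a)
```

Because each registry would return only the two totals, patient-level records stay local, which is the property federated analysis relies on.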
- Research Article
- 10.1108/jsocm-08-2023-0179
- Jun 19, 2025
- Journal of Social Marketing
- Yue Xi + 4 more
Purpose: This study responds to calls for increased theory use, higher levels of theory application and utilisation of approaches that extend beyond how individuals think and feel in social marketing. The capability, opportunity, motivation, behaviour theory was selected for this depth interview study that aimed to identify barriers and enablers towards e-waste reduction. Design/methodology/approach: Following ethical clearance, semi-structured in-depth interviews were conducted in Australia with 19 people, including experts, people working in the e-waste management industry and consumers. A data dictionary was developed and used by coders. High inter-coder reliability was achieved. Findings: A total of 18 influences were identified. Opportunity was the strongest category with eight environmental influences and a combined ten individual influences (capability and motivation), which demonstrates the importance of extending understanding beyond individual factors. Opportunities to support individuals to reduce e-waste include providing an e-waste management system, providing clear and transparent information, availability of and ease of access to e-waste recycling services, improved product design to support e-waste recovery, monetary support, regulation and policies, circular economy and social support. Research limitations/implications: In total, 18 influences offer an understanding of the many ways that a complex problem like e-waste can be alleviated in Queensland, Australia. Originality/value: This paper contributes a detailed application of theory demonstrating how a theory can be applied to identify influences to inform intervention planning.
- Research Article
- 10.59934/jaiea.v4i3.1134
- Jun 15, 2025
- Journal of Artificial Intelligence and Engineering Applications (JAIEA)
- Salsalina Sembiring + 4 more
Information technology is developing rapidly and now affects many fields; technology and information cannot be separated. Employee attendance and salary data are still processed manually, using notes in an attendance book kept by HRD. Proof of absence for employees is still issued as physical documents stored in a cupboard, which can affect the salary employees receive if a document is lost or damaged or if an error occurs when an attendance report is prepared. Computerization is also needed to keep up with competing companies in the market. This design aims to produce a simple, integrated, website-based information system that, if implemented, is expected to make it easier for employees to access attendance and payroll reports in a structured way. The methodology used is the System Development Life Cycle (SDLC). The tools used are the Fishbone Diagram, Data Flow Diagram, PIECES, and Data Dictionary. This research produces a website design for the company which, if continued to the system development stage, can make it easier for the company to manage attendance and payroll data, with accurate and automatic recording of employee attendance that reduces errors affecting salary calculations.
- Research Article
- 10.1016/j.msard.2025.106445
- Jun 1, 2025
- Multiple sclerosis and related disorders
- Juan I Rojas + 51 more
Neuromyelitis optica spectrum disorder in Latin America: a global data share initiative.