Swedish Generations and Gender Survey 2021

Abstract

The Swedish Generations and Gender Survey 2021 (GGS2021) was the second GGS that Sweden carried out. It was a web-based survey with a paper-based option. Like the first Swedish GGS (GGS2012), it was linked to register data that cover key dimensions of respondents’ life courses. The Swedish GGS2021 contains two new modules implemented to further research on the link between subjective perceptions and fertility. Both modules will be part of the second wave of the international GGS standard questionnaire. In this contribution, we first describe our motivation to carry out the Swedish GGS2021. We then present our two new modules and sketch their theoretical underpinnings. This is followed by a summary of the data collection process and an assessment of data quality. We conclude with some reflections on the implementation of new modules in future international GGSs and on our experience with register-linked surveys.

Similar Papers
  • Research Article
  • Citations: 42
  • 10.5334/egems.223
A Comparison of Data Quality Assessment Checks in Six Data Sharing Networks
  • Jun 12, 2017
  • eGEMs
  • Tiffany J Callahan + 8 more

Objective: To compare rule-based data quality (DQ) assessment approaches across multiple national clinical data sharing organizations. Methods: Six organizations with established data quality assessment (DQA) programs provided documentation or source code describing current DQ checks. DQ checks were mapped to the categories within the data verification context of the harmonized DQA terminology. To ensure all DQ checks were consistently mapped, conventions were developed and four iterations of mapping were performed. Difficult-to-map DQ checks were discussed with research team members until consensus was achieved. Results: Participating organizations provided 11,026 DQ checks, of which 99.97 percent were successfully mapped to a DQA category. Of the mapped DQ checks (N=11,023), 214 (1.94 percent) mapped to multiple DQA categories. The majority of DQ checks mapped to the Atemporal Plausibility (49.60 percent), Value Conformance (17.84 percent), and Atemporal Completeness (12.98 percent) categories. Discussion: Using the common DQA terminology, near-complete (99.97 percent) coverage across a wide range of DQA programs and specifications was reached. Comparing the distributions of mapped DQ checks revealed important differences between participating organizations. This variation may be related to an organization's stakeholder requirements, primary analytical focus, or the maturity of its DQA program. Although outside the scope of this study, mapping checks within the data validation context of the terminology may provide additional insights into differences in DQA practice. Conclusion: A common DQA terminology provides a means to help organizations and researchers understand the coverage of their current DQA efforts as well as highlight potential areas for additional DQA development. Sharing DQ checks between organizations could help expand the scope of DQA across clinical data networks.

  • Research Article
  • Citations: 73
  • 10.1093/jamia/ocaa245
Assessing the practice of data quality evaluation in a national clinical data research network through a systematic scoping review in the era of real-world data.
  • Nov 9, 2020
  • Journal of the American Medical Informatics Association : JAMIA
  • Jiang Bian + 11 more

Objective: To synthesize data quality (DQ) dimensions and assessment methods of real-world data, especially electronic health records, through a systematic scoping review, and to assess the practice of DQ assessment in the national Patient-centered Clinical Research Network (PCORnet). Materials and Methods: We started with 3 widely cited DQ publications (2 reviews from Chan et al (2010) and Weiskopf et al (2013a), and 1 DQ framework from Kahn et al (2016)) and expanded our review systematically to cover relevant articles published up to February 2020. We extracted DQ dimensions and assessment methods from these studies, mapped their relationships, and organized a synthesized summarization of existing DQ dimensions and assessment methods. We reviewed the data checks employed by PCORnet and mapped them to the synthesized DQ dimensions and methods. Results: We analyzed a total of 3 reviews, 20 DQ frameworks, and 226 DQ studies and extracted 14 DQ dimensions and 10 assessment methods. We found that completeness, concordance, and correctness/accuracy were commonly assessed. Element presence, validity check, and conformance were commonly used DQ assessment methods and were the main focuses of the PCORnet data checks. Discussion: Definitions of DQ dimensions and methods were not consistent in the literature, and DQ assessment practice was not evenly distributed (eg, usability and ease-of-use were rarely discussed). Given the complex and heterogeneous nature of real-world data, challenges in DQ assessment remain. Conclusion: The practice of DQ assessment is still limited in scope. Future work is warranted to generate understandable, executable, and reusable DQ measures.

  • Research Article
  • Citations: 20
  • 10.5334/egems.286
DataGauge: A Practical Process for Systematically Designing and Implementing Quality Assessments of Repurposed Clinical Data.
  • Jul 25, 2019
  • eGEMs (Generating Evidence & Methods to improve patient outcomes)
  • Jose-Franck Diaz-Garelli + 5 more

The well-known hazards of repurposing data make Data Quality (DQ) assessment a vital step towards ensuring valid results regardless of analytical methods. However, there is no systematic process to implement DQ assessments for secondary uses of clinical data. This paper presents DataGauge, a systematic process for designing and implementing DQ assessments to evaluate repurposed data for a specific secondary use. DataGauge is composed of five steps: (1) define information needs, (2) develop a formal Data Needs Model (DNM), (3) use the DNM and DQ theory to develop goal-specific DQ assessment requirements, (4) extract DNM-specified data, and (5) evaluate it according to the DQ requirements. DataGauge’s main contribution is integrating general DQ theory and DQ assessment methods into a systematic process. This process supports the integration and practical implementation of existing Electronic Health Record-specific DQ assessment guidelines. DataGauge also provides an initial theory-based guidance framework that ties the DNM to DQ testing methods for each DQ dimension to aid the design of DQ assessments. This framework can be augmented with existing DQ guidelines to enable systematic assessment. DataGauge sets the stage for future systematic DQ assessment research by defining an assessment process capable of adapting to a broad range of clinical datasets and secondary uses. It also opens new research directions such as DQ theory integration, DQ requirements portability, DQ assessment tool development, and DQ assessment tool usability.
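As a hedged illustration of how the five DataGauge steps might be wired together, the sketch below encodes a toy Data Needs Model and evaluates records against goal-specific requirements; the field names, types, and ranges are invented for the example and are not from the paper:

```python
# Steps 1-2 (hypothetical): information needs captured as a minimal Data
# Needs Model (DNM) - which fields are required, their type, and a
# plausible value range. All specifics here are illustrative assumptions.
dnm = {
    "age":   {"required": True, "type": int,   "range": (0, 120)},
    "hba1c": {"required": True, "type": float, "range": (3.0, 20.0)},
}

# Step 3: turn the DNM into goal-specific DQ assessment requirements.
def check_record(record, dnm):
    """Return a list of DQ findings for one record."""
    findings = []
    for field, spec in dnm.items():
        value = record.get(field)
        if value is None:
            if spec["required"]:
                findings.append((field, "missing"))        # completeness
            continue
        if not isinstance(value, spec["type"]):
            findings.append((field, "type mismatch"))      # conformance
        elif not spec["range"][0] <= value <= spec["range"][1]:
            findings.append((field, "out of range"))       # plausibility
    return findings

# Steps 4-5: extract the DNM-specified data and evaluate it.
records = [
    {"age": 54, "hba1c": 7.2},
    {"age": 54, "hba1c": 42.0},   # implausible lab value
    {"age": 54},                  # missing required field
]
report = [check_record(r, dnm) for r in records]
```

The point of the sketch is only the flow from a declarative data-needs description to per-record findings; a real DataGauge application would derive the DNM from documented information needs rather than hard-code it.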

  • Research Article
  • 10.54103/2282-0930/29202
A Framework to Improve Data Quality and Manage Dropout in Web-Based Medical Surveys: Insights from an Ai Awareness Study among Italian Physicians
  • Sep 8, 2025
  • Epidemiology, Biostatistics, and Public Health
  • Vincenza Cofini + 9 more

Background: Ensuring data quality in self-reported online surveys remains a critical challenge in digital health research, particularly when targeting healthcare professionals [1,2]. Self-reported data are susceptible to multiple biases, including careless responding, social desirability bias, and dropout-related attrition, all of which may compromise the validity of findings [3,4]. In web-based surveys where researcher oversight is limited, structured quality control measures are essential to detect low-quality responses, minimise sampling bias, and enhance data reliability [5]. Previous studies have demonstrated that inadequate quality checks can lead to inflated error rates, reduced statistical power, and misleading conclusions [6]. Objective: This study presents a comprehensive methodological framework for optimising data quality in web-based medical surveys, applied to a national study on AI awareness among Italian physicians. The framework integrates pre-survey validation, real-time dashboards, response-time filtering, and post-hoc careless-responding detection to address key challenges in digital research, while providing a replicable model for future studies. Methods: We conducted a national web-based survey using a validated instrument (doi:10.1101/2025.04.11.25325592) via the LimeSurvey platform. The survey incorporated two main sections: (1) a core module assessing knowledge, attitudes, and practices regarding AI in medicine; (2) clinical scenarios evaluating diagnostic agreement with AI-generated proposals. Multiple quality control strategies were implemented throughout the survey lifecycle. In terms of survey design and logic, the questionnaire employed an adaptive flow structure, whereby respondents were routed through clinical scenarios relevant to their medical speciality. To reduce the incidence of partial completions and missing data, key questions were marked as mandatory, and completion status was actively tracked. In the monitoring and recruitment phase, a real-time dashboard monitored participant distribution (gender, geographical area, speciality); referral links were rotated to minimise snowball bias [7]. Time-based data quality checks excluded outliers (completion time <1st or >99th percentile) [8]. Completion time for the first section was analysed for all completers to assess correlations between response speed and quality indicators. Dropout patterns were analysed using Kaplan-Meier survival analysis and logistic regression to identify systematic attrition predictors. Data quality assessments were performed on the outlier-cleaned dataset (n=587). Response quality was assessed using complementary careless-responding indicators applied specifically to opinion scale items (Likert 1-5). Two detection methods were used: low-response-variance analysis, identifying respondents with insufficient variability (SD < 0.5), and excessive same-response detection, flagging participants who used identical responses for >75% of items. Internal consistency analysis (Cronbach's α) evaluated scale reliability across quality levels. Results: A total of 736 accesses were recorded on the survey platform. As an initial inclusion criterion, only participants who indicated current registration with the Italian Medical Council were considered eligible: 79 (10.7%) were excluded, yielding a sample of 657 eligible participants (89.3%). Among eligible respondents, 597 completed the first section, yielding a dropout rate of 9.1% (n=60). A Kaplan-Meier survival analysis using total survey time revealed that most dropouts occurred early, with critical points at 45% after the demographic items, 51% after the personal AI knowledge items, 71% after the opinion items, and 100% before the clinical scenarios. Logistic regression showed no significant predictors of completion (LR χ²(6)=3.46, p=0.7497; pseudo-R²=0.014; AUC=0.60, 95% CI: 0.50-0.70). Completion time showed no correlation with response quality (Spearman's ρ = -0.019, p = 0.645). Following outlier removal, data quality assessment among the 587 respondents who completed the first section revealed two complementary patterns of careless responding: 8.52% (n=50) exhibited low response variance, while 32 (5.45%) demonstrated excessive same-response patterns. Cross-classification analysis showed 23 participants (3.92%) flagged by both indicators, with 71.88% of excessive same-responders also showing low variance. Overall, 59 participants (10.05%, 95% CI: 7.9%-12.8%) exhibited careless responding detectable by at least one indicator. Internal consistency analysis showed robust scale reliability (Cronbach's α = 0.754) that remained stable across quality levels. Conclusion: The integration of real-time monitoring, adaptive design, time-based validation, and systematic careless-responding detection provides a robust methodological framework for web-based medical surveys, particularly for complex topics like AI adoption. Comprehensive data quality assessment revealed a 10.05% careless-responding rate among completers, which aligns with the literature. The absence of correlation between completion time and response quality suggests that careless responding reflects attentional rather than temporal factors. Our findings suggest that both phenomena likely reflect situational or contextual factors rather than systematic participant characteristics or survey design flaws. This supports the validity and generalizability of the final dataset while providing a replicable quality control framework for future web-based medical research.
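The two careless-responding indicators described above reduce to a few lines of code. The thresholds (SD < 0.5 and more than 75% identical answers) are taken from the abstract; the function name and example data are illustrative:

```python
# A minimal sketch of the two careless-responding indicators on 1-5
# Likert items; thresholds come from the abstract, data are invented.
from collections import Counter
from statistics import pstdev

def careless_flags(responses, sd_cutoff=0.5, same_cutoff=0.75):
    """Return (low_variance, straightlining) flags for one respondent."""
    # Low response variance: too little spread across items.
    low_variance = pstdev(responses) < sd_cutoff
    # Excessive same-response: one answer dominates the item set.
    top_count = Counter(responses).most_common(1)[0][1]
    straightlining = top_count / len(responses) > same_cutoff
    return low_variance, straightlining

attentive     = [1, 4, 2, 5, 3, 4, 2, 1]  # varied answers
straightliner = [3, 3, 3, 3, 3, 3, 3, 4]  # 7 of 8 items identical
```

Here `careless_flags(attentive)` returns `(False, False)` and `careless_flags(straightliner)` returns `(True, True)`; in practice a respondent flagged by either indicator would be reviewed, as in the cross-classification the study reports.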

  • Research Article
  • Citations: 20
  • 10.7189/jogh.09.010806
Data quality assessments stimulate improvements to health management information systems: evidence from five African countries.
  • Jun 1, 2019
  • Journal of Global Health
  • Jennifer Yourkavitch + 2 more

Background: Health service data are used to inform decisions about planning and implementation, as well as to evaluate performance and outcomes, and the quality of those data is important. Data quality assessments (DQA) afford the opportunity to collect information about health service data. Through its Rapid Access Expansion Programme (RAcE), the World Health Organization (WHO) funded non-governmental organizations (NGO) to support Ministries of Health (MOH) in implementing integrated community case management (iCCM) programs in the Democratic Republic of Congo, Malawi, Mozambique, Niger and Nigeria. WHO contracted ICF to support grantee monitoring and evaluation efforts, part of which was to conduct DQAs to enhance program monitoring and decision making. The contribution of DQAs to data-driven decision making has been documented, and the purpose of this paper is to describe how DQAs contributed to health management information system (HMIS) strengthening and the findings of subsequent DQAs in those areas. Methods: ICF created a mixed-methods DQA for iCCM data, comprising a review of the data collection and management system, a data tracing component, and key informant interviews. The DQA was applied twice in each RAcE site, which enables a general comparison of system-level attributes before and after the first DQA application. For this qualitative assessment, we reviewed DQA reports to collate information about DQA recommendations and how they were addressed before a subsequent DQA, along with the findings of the second DQA. Results: Findings from the first DQA in each RAcE site stimulated NGO and MOH efforts to strengthen different aspects of the HMIS in each country, including modifying data collection tools in the Democratic Republic of Congo; training community health workers (CHWs) and supervisors in Malawi; strengthening supervision in Mozambique; improving CHW registers and strengthening staff capacity at all levels to report data in Niger; establishing a data review system in Abia State, Nigeria; and establishing processes to improve data use and quality in Niger State, Nigeria. Conclusion: Data quality assessments stimulated context-specific efforts by NGOs and MOHs to improve iCCM data quality. DQAs can serve as a collaborative and evidence-based activity to influence discussions of data quality and stimulate HMIS strengthening efforts.

  • Research Article
  • Citations: 417
  • 10.13063/2327-9214.1244
A Harmonized Data Quality Assessment Terminology and Framework for the Secondary Use of Electronic Health Record Data.
  • Sep 11, 2016
  • eGEMs (Generating Evidence & Methods to improve patient outcomes)
  • Michael G Kahn + 19 more

Objective: Harmonized data quality (DQ) assessment terms, methods, and reporting practices can establish a common understanding of the strengths and limitations of electronic health record (EHR) data for operational analytics, quality improvement, and research. Existing published DQ terms were harmonized to a comprehensive unified terminology with definitions and examples and organized into a conceptual framework to support a common approach to defining whether EHR data is ‘fit’ for specific uses. Materials and Methods: DQ publications, informatics and analytics experts, managers of established DQ programs, and operational manuals from several mature EHR-based research networks were reviewed to identify potential DQ terms and categories. Two face-to-face stakeholder meetings were used to vet an initial set of DQ terms and definitions that were grouped into an overall conceptual framework. Feedback received from data producers and users was used to construct a draft set of harmonized DQ terms and categories. Multiple rounds of iterative refinement resulted in a set of terms and an organizing framework consisting of DQ categories, subcategories, terms, definitions, and examples. The harmonized terminology and logical framework’s inclusiveness was evaluated against ten published DQ terminologies. Results: Existing DQ terms were harmonized and organized into a framework by defining three DQ categories: (1) Conformance, (2) Completeness, and (3) Plausibility, and two DQ assessment contexts: (1) Verification and (2) Validation. The Conformance and Plausibility categories were further divided into subcategories. Each category and subcategory was defined with respect to whether the data may be verified with organizational data or validated against an accepted gold standard, depending on the proposed context and uses. The coverage of the harmonized DQ terminology was validated by successfully aligning it with multiple published DQ terminologies. Discussion: Existing DQ concepts, community input, and expert review informed the development of a distinct set of terms, organized into categories and subcategories. The resulting DQ terms successfully encompassed a wide range of disparate DQ terminologies. Operational definitions were developed to provide guidance for implementing DQ assessment procedures. The resulting structure is an inclusive DQ framework for standardizing DQ assessment and reporting. While our analysis focused on the DQ issues often found in EHR data, the new terminology may be applicable to a wide range of electronic health data such as administrative, research, and patient-reported data. Conclusion: A consistent, common DQ terminology, organized into a logical framework, is an initial step in enabling data owners and users, patients, and policy makers to evaluate and communicate data quality findings in a well-defined manner with a shared vocabulary. Future work will leverage the framework and terminology to develop reusable data quality assessment and reporting methods.
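The three harmonized categories can be illustrated as minimal verification checks on a single EHR-like record. The category names follow the framework above; the field names, code sets, and ranges below are invented for the example and are not part of the published terminology:

```python
# Toy verification checks, one per harmonized DQ category.
# All field names and rules are illustrative assumptions.

def conformance(record):
    """Value conformance: sex must come from the allowed code set."""
    return record.get("sex") in {"M", "F", "U"}

def completeness(record):
    """Atemporal completeness: required fields present and non-empty."""
    return all(record.get(f) not in (None, "") for f in ("patient_id", "sex"))

def plausibility(record):
    """Atemporal plausibility: birth year in a believable range."""
    return 1900 <= record.get("birth_year", 0) <= 2025

record = {"patient_id": "P001", "sex": "F", "birth_year": 1984}
verification = {
    "conformance": conformance(record),
    "completeness": completeness(record),
    "plausibility": plausibility(record),
}
```

In the framework's terms these are all verification-context checks (compared against internal expectations); validation would instead compare the same fields against an external gold standard.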

  • Research Article
  • Citations: 5
  • 10.1177/1833358320908957
Collaborative data familiarisation and quality assessment: Reflections from use of a national dataset to investigate palliative care for Indigenous Australians.
  • Mar 27, 2020
  • Health information management : journal of the Health Information Management Association of Australia
  • John A Woods + 5 more

Data quality is fundamental to the integrity of quantitative research. The role of external researchers in data quality assessment (DQA) remains ill-defined in the context of secondary use for research of large, centrally curated health datasets. In order to investigate equity of palliative care provided to Indigenous Australian patients, researchers accessed a now-historical version of a national palliative care dataset developed primarily for the purpose of continuous quality improvement. The aims were (i) to apply a generic DQA framework to the dataset and (ii) to report the process and results of this assessment and examine the consequences for conducting the research. The data were systematically examined for completeness, consistency and credibility. Data quality issues relevant to the Indigenous identifier and the framing of research questions were of particular interest. The dataset comprised 477,518 records of 144,951 patients (Indigenous N = 1515; missing Indigenous identifier N = 4998) collected from participating specialist palliative care services during a period (1 January 2010-30 June 2015) in which data-checking systems underwent substantial upgrades. Progressive improvement in completeness of data over the study period was evident. The data were error-free with respect to many credibility and consistency checks, and detected anomalies were reported to data managers. As the proportion of missing values remained substantial for some clinical care variables, multiple imputation procedures were used in subsequent analyses. In secondary use of large curated datasets, DQA by external researchers may both influence proposed analytical methods and contribute to improvement of data curation processes through feedback to data managers.

  • Book Chapter
  • Citations: 6
  • 10.1007/978-3-030-37453-2_30
DMN for Data Quality Measurement and Assessment
  • Jan 1, 2019
  • Álvaro Valencia-Parra + 4 more

Data quality assessment is aimed at evaluating the suitability of a dataset for an intended task. The extensive literature on data quality describes various methodologies for assessing data quality by means of data profiling techniques applied to whole datasets. Our investigations aim to provide solutions to the need to automatically assess the level of quality of the individual records of a dataset, where data profiling tools do not provide an adequate level of information. Because it is often easier to describe when a record has sufficient quality than to calculate a qualitative indicator, we propose a semi-automatic, business-rule-guided data quality assessment methodology for every record. This involves first listing the business rules that describe the data (data requirements), then those describing how to produce measures (business rules for data quality measurements), and finally those defining how to assess the level of data quality of a dataset (business rules for data quality assessment). The main contribution of this paper is the adoption of the OMG standard DMN (Decision Model and Notation) to support the description of data quality requirements and their automatic assessment using existing DMN engines.
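A rough sketch of the per-record, rule-guided idea, with plain Python predicates standing in for a DMN decision table and engine; the rules, fields, and sample records are illustrative assumptions, not from the paper:

```python
# Business rules as a (very reduced) decision table: each entry names a
# data requirement and a predicate that checks it on one record.
# In the paper these would be modelled in DMN and run by a DMN engine.
rules = [
    ("email_present", lambda r: bool(r.get("email"))),
    ("age_plausible", lambda r: 0 <= r.get("age", -1) <= 120),
]

def assess(record):
    """Assess the data quality of a single record against all rules."""
    failures = [name for name, rule in rules if not rule(record)]
    return {"quality_ok": not failures, "failed_rules": failures}

good = assess({"email": "a@b.se", "age": 41})
bad  = assess({"email": "", "age": 200})
```

The record-level verdict (`quality_ok`) corresponds to the paper's notion of a record "having enough quality", while the failure list plays the role of the measurement rules feeding an overall assessment.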

  • Research Article
  • Citations: 1
  • 10.1016/j.procs.2022.01.218
An ERP Data Quality Assessment Framework for the Implementation of an APS system using Bayesian Networks
  • Jan 1, 2022
  • Procedia Computer Science
  • Jan-Phillip Herrmann + 6 more


  • Conference Article
  • Citations: 5
  • 10.1109/aiccsa.2017.178
Data Quality Assessment in the Integration Process of Linked Open Data (LOD)
  • Oct 1, 2017
  • Hana Haj Ahmed

Linked Open Data (LOD) entails a set of best practices for publishing and connecting structured data on the Web, which allows sharing and exchanging information in an interoperable and reusable manner. The increasing adoption of these principles has led to the creation of a globally distributed and vast information space that covers various domains such as government, libraries, life sciences, and media. This offers end-users a great opportunity to build semantic applications by exploring and consuming heterogeneous, dispersed, and possibly interlinked data. Thus, consuming linked data can be considered a typical scenario of linked data integration, in which a user needs to combine data residing in large LOD datasets of varying quality. In this paper, we examine the specifics of linked data integration and focus on three key challenges, namely data quality profiling and assessment, conflict resolution, and quality improvement. We postulate that data quality assessment can act both as a deciding factor for conflict resolution and as an indicator of low-quality data that need to be improved.

  • Research Article
  • Citations: 5
  • 10.1007/s12145-014-0196-9
A general framework for spatial data inspection and assessment
  • Jan 25, 2015
  • Earth Science Informatics
  • Yiliang Wan + 4 more

The quality aspects of spatial data are very important in the decision-making process. However, the quality inspection of spatial data is still dependent on manual checking, and there is an urgent need to develop an automatic or semi-automatic generic system for spatial data quality inspection. In this paper, we present a general framework that automatically copes with spatial data quality inspection based on various spatial data quality standards and specifications. The framework involves all descriptions of given spatial data, a data quality model characterized by quality elements, scheme batch checking and spatial data quality assessment based on quality control and assessment procedures. It is implemented in Unified Modeling Language with four main sets of classes: data dictionary, quality model, scheme checking and quality assessment. Accordingly, we have designed four structured Extensible Markup Language files for the framework to organize and describe the data dictionary, quality model, scheme check and quality assessment. It is very easy for users to describe the data requirements using the data dictionary file, and to extend the quality elements or check rules using the quality model file. Users can design the specified checks and quality assessment schemes without coding by configuring the scheme check files and quality assessment scheme files. The framework also incorporates a checking tool capable of solving the difficulties inherent in the diversity of spatial data quality standards and specifications. The proposed framework and its implementation, as a quality inspection system, will facilitate automatic multiple spatial data quality inspection and acceptance. As a result, the quality of diversified spatial data can be ensured and improved, which is extremely important in the era of spatial big data.

  • Research Article
  • Citations: 2
  • 10.1186/s13049-023-01145-2
Database quality assessment in research in paramedicine: a scoping review
  • Nov 11, 2023
  • Scandinavian journal of trauma, resuscitation and emergency medicine
  • Neil Mcdonald + 5 more

Background: Research in paramedicine faces challenges in developing research capacity, including access to high-quality data. A variety of unique factors in the paramedic work environment influence data quality. In other fields of healthcare, data quality assessment (DQA) frameworks provide common methods of quality assessment as well as standards of transparent reporting. No similar DQA frameworks exist for paramedicine, and practices related to DQA are sporadically reported. This scoping review aims to describe the range, extent, and nature of DQA practices within research in paramedicine. Methods: This review followed a registered and published protocol. In consultation with a professional librarian, a search strategy was developed and applied to MEDLINE (National Library of Medicine), EMBASE (Elsevier), Scopus (Elsevier), and CINAHL (EBSCO) to identify studies published from 2011 through 2021 that assess paramedic data quality as a stated goal. Studies that reported quantitative results of DQA using data that relate primarily to the paramedic practice environment were included. Protocols, commentaries, and similar study types were excluded. Title/abstract screening was conducted by two reviewers; full-text screening was conducted by two, with a third participating to resolve disagreements. Data were extracted using a piloted data-charting form. Results: Searching yielded 10,105 unique articles. After title and abstract screening, 199 remained for full-text review; 97 were included in the analysis. Included studies varied widely in many characteristics. Majorities were conducted in the United States (51%), assessed data containing between 100 and 9,999 records (61%), or assessed one of three topic areas: data, trauma, or out-of-hospital cardiac arrest (61%). All data-quality domains assessed could be grouped under 5 summary domains: completeness, linkage, accuracy, reliability, and representativeness. Conclusions: There are few common standards in terms of variables, domains, methods, or quality thresholds for DQA in paramedic research. Terminology used to describe quality domains varied among included studies and frequently overlapped. The included studies showed no evidence of assessing some domains and emerging topics seen in other areas of healthcare. Research in paramedicine would benefit from a standardized framework for DQA that allows for local variation while establishing common methods, terminology, and reporting standards.

  • Research Article
  • Citations: 5
  • 10.1016/j.cmpb.2021.106147
Robust estimation of infant feeding indicators by data quality assessment of longitudinal electronic health records from birth up to 18 months of life
  • May 2, 2021
  • Computer Methods and Programs in Biomedicine
  • Ricardo García-De-León-Chocano + 5 more


  • Research Article
  • Citations: 8
  • 10.7189/jogh.09.010805
ICCM data quality: an approach to assessing iCCM reporting systems and data quality in 5 African countries.
  • Jun 1, 2019
  • Journal of Global Health
  • Lwendo Moonzwe Davis + 5 more

Background: Ensuring the quality of health service data is critical for data-driven decision-making. Data quality assessments (DQAs) are used to determine if data are of sufficient quality to support their intended use. However, guidance on how to conduct DQAs specifically for community-based interventions, such as integrated community case management (iCCM) programs, is limited. As part of the World Health Organization’s (WHO) Rapid Access Expansion (RAcE) Programme, ICF conducted DQAs in a unique effort to characterize the quality of community health worker-generated data and to use DQA findings to strengthen reporting systems and decision-making. Methods: We present our experience implementing assessments using standardized DQA tools in the six RAcE project sites in the Democratic Republic of Congo, Malawi, Mozambique, Niger, and Nigeria. We describe the process used to create the RAcE DQA tools, adapt the tools to country contexts, and develop the iCCM DQA Toolkit, which enables countries to carry out regular and rapid DQAs. We provide examples of how we used results to generate recommendations. Results: The DQA tools were customized for each RAcE project to assess the iCCM data reporting system, trace iCCM indicators through this system, and ensure that DQAs were efficient and generated useful recommendations. This experience led to the creation of an iCCM DQA Toolkit comprising simplified versions of the RAcE DQA tools and a guidance document. It includes system assessment questions that elicit actionable responses and a simplified data tracing tool focused on one treatment indicator for each iCCM focus illness: diarrhea, malaria, and pneumonia. The toolkit is intended for use at the national or sub-national level for periodic data quality checks. Conclusions: The iCCM DQA Toolkit was designed to be easily tailored to different data reporting system structures because iCCM data reporting tools and data flow vary substantially.
The toolkit enables countries to identify points in the reporting system where data quality is compromised and areas of the reporting system that require strengthening, so that countries can make informed adjustments that improve data quality, strengthen reporting systems, and inform decision-making.

  • Research Article
  • Citations: 27
  • 10.1504/ijiq.2014.068656
A classification of data quality assessment and improvement methods
  • Jan 1, 2014
  • International Journal of Information Quality
  • Philip Woodall + 2 more

Data quality (DQ) assessment and improvement in larger information systems would often not be feasible without suitable ‘DQ methods’: algorithms that can be executed automatically by computer systems to detect and/or correct problems in datasets. These methods are already essential, and they will be of even greater importance as the quantity of data in organisational systems grows. This paper provides a review of existing methods for both DQ assessment and improvement and classifies them according to the DQ problem and problem context. Six gaps have been identified in the classification, where no current DQ methods exist; these show where new methods are required and serve as a guide for future research and DQ tool development.
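One toy instance of an automatically executable "DQ method" in the sense above, detecting a problem (formatting noise producing near-duplicate records) and correcting it; the data and normalization rules are invented for the example:

```python
# A tiny detect-and-correct DQ method: normalize formatting noise,
# then drop the exact duplicates the normalization reveals.
# Data and normalization rules are illustrative assumptions.

def normalize(record):
    """Correct casing and whitespace problems in string fields."""
    return {k: " ".join(v.split()).title() for k, v in record.items()}

def dedupe(records):
    """Detect and drop duplicates that only differ in formatting."""
    seen, clean = set(), []
    for record in map(normalize, records):
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            clean.append(record)
    return clean

raw = [
    {"name": "ada  lovelace", "city": "london"},
    {"name": "Ada Lovelace",  "city": "London"},  # same entry, messy input
]
```

Running `dedupe(raw)` yields a single cleaned record; in the paper's classification this one method spans both assessment (detecting the duplicate) and improvement (correcting the dataset).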
