Abstract One of the main purposes of a public health institute is to track the health status of citizens and to understand how it evolves over time. Given the absence of a single, comprehensive data source, we developed a framework to select databases specific to each disease and outcome of interest. This process includes a thorough evaluation of the operational case definitions, as well as the strengths and weaknesses of each database, with a particular focus on their sensitivity and specificity. These qualities are appraised with a dedicated scoring form and rated as high, medium, or low. The selection criteria for data sources hinge on several key questions: Is the database exhaustive or merely a sample? Does the case definition rely on direct medical diagnoses or on proxies? Does the source capture all cases? Are the data collected at regional or national level? Are there mechanisms for yearly or periodic updates? To establish the most accurate national prevalence estimate for each outcome, we often need to adjust for data misclassification, for example by correcting for biases in self-reported data and by interpolating missing entries. In some instances, a pooled estimate is derived from multiple data sources. A specific case will be presented to illustrate our approach in practice. This critical evaluation is pivotal, as the quality of data underpins the entire health information pyramid. The integrity of our data directly influences our ability to convert data into actionable health wisdom, ultimately affecting public health decision-making and policy formulation.
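The abstract does not say which correction or pooling method is used, so the following is only an illustrative sketch under stated assumptions. It assumes a Rogan-Gladen style correction of an observed prevalence for imperfect sensitivity and specificity, and a fixed-effect inverse-variance pooling of estimates from several sources. The function names and the example figures are hypothetical and are not taken from the study.

```python
from math import sqrt


def correct_for_misclassification(apparent_prevalence, sensitivity, specificity):
    """Rogan-Gladen correction (assumed here, not confirmed by the source).

    Converts the prevalence observed in a data source into an estimate of
    the true prevalence, given the sensitivity and specificity of the
    source's case definition.
    """
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    corrected = (apparent_prevalence + specificity - 1.0) / denominator
    # A prevalence is a proportion, so clamp the result to [0, 1].
    return min(1.0, max(0.0, corrected))


def pool_inverse_variance(estimates, standard_errors):
    """Fixed-effect inverse-variance pooling (assumed pooling method).

    Returns the pooled estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    return pooled, sqrt(1.0 / total_weight)


if __name__ == "__main__":
    # Hypothetical figures: 10% self-reported prevalence from a source whose
    # case definition has 80% sensitivity and 95% specificity.
    corrected = correct_for_misclassification(0.10, 0.80, 0.95)
    # Hypothetical second source with its own estimate and standard error.
    pooled, pooled_se = pool_inverse_variance([corrected, 0.08], [0.01, 0.01])
    print(f"corrected: {corrected:.4f}, pooled: {pooled:.4f} (SE {pooled_se:.4f})")
```

In this sketch, a source rated low on sensitivity pulls the corrected prevalence upward, while one rated low on specificity pulls it downward, which is why both properties matter when a data source is selected.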