Abstract

Background
In Italy, regional governments maintain data warehouses of administrative health data (AHD) linked to demographic information, including mortality. Legal reporting pathways for death certificate details bypass the Region, so regional AHD warehouses are missing substantial amounts of data on cause and location of death, which limits epidemiological research on mortality and incidence. In this study, we determine for the first time in Italy the degree of agreement between last hospital diagnosis and primary cause of death, and how that agreement changes over time. Previous studies have found agreement ranging from 40% to 60%.

Methods
The COV-CVD cohort comprises 7.3 million adults aged 30 years or older from the Lombardy region of northwest Italy, representing all Lombardy residents registered with the health system as of 31 December 2019. Concordance was assessed by comparing the first three characters of the reported cause of death (ICD-10 codes) with those of the primary diagnosis at last hospitalization (converted from ICD-9-CM).

Results
In the COV-CVD cohort there were 401,085 deaths between 1 January 2020 and 23 June 2023. Cause and location of death are missing for 50% and 54% of these deaths, respectively. Of the deceased individuals, 60% had a hospitalization in the year before their death. Agreement between primary diagnosis at last hospitalization and cause of death was 27% overall and 40.5% for deaths in hospital. The proportion of deaths with missing location was reduced from 54% to 26%.

Conclusions
When using AHD from Italy, primary diagnosis at last hospitalization prior to death is not an appropriate proxy for cause of death, even when the death occurred in hospital. The use of different coding systems (ICD-10, ICD-9-CM) further complicates comparisons and definitions in observational studies.

Key messages
• In Italy, primary diagnosis at last hospitalization before death is not an appropriate proxy for cause of death.
• Poor linkage of mortality and health data limits epidemiological research.
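
The three-character concordance measure described in the Methods can be illustrated with a minimal Python sketch. It is not the study's code: the function names and the example records are hypothetical, the conversion from ICD-9-CM to ICD-10 is assumed to have been done beforehand, and restricting the denominator to deaths with both codes available is an assumption, since the abstract does not state how missing values were handled.

```python
from typing import Iterable, Optional, Tuple


def icd10_category(code: Optional[str]) -> Optional[str]:
    """Return the three-character ICD-10 category, or None if unavailable."""
    if code is None:
        return None
    # Normalize formatting so that "i21.9", "I219" and " I21.9 " all compare equal.
    cleaned = code.strip().upper().replace(".", "")
    return cleaned[:3] if len(cleaned) >= 3 else None


def concordance(pairs: Iterable[Tuple[Optional[str], Optional[str]]]) -> Optional[float]:
    """Share of (cause of death, last primary diagnosis) pairs whose
    three-character categories match.

    Assumption: pairs with either code missing are excluded from the
    denominator.
    """
    compared = 0
    agreed = 0
    for cause, diagnosis in pairs:
        cause_cat = icd10_category(cause)
        diag_cat = icd10_category(diagnosis)
        if cause_cat is None or diag_cat is None:
            continue
        compared += 1
        if cause_cat == diag_cat:
            agreed += 1
    return agreed / compared if compared else None


# Hypothetical records: (cause of death, primary diagnosis at last hospitalization)
records = [
    ("I21.9", "I214"),   # same category I21: agreement
    ("C34.1", "J18.9"),  # different categories: disagreement
    (None, "I50.0"),     # cause of death missing: excluded
    ("J18.1", "J189"),   # same category J18: agreement
]

print(concordance(records))
```

Comparing at the three-character level ignores differences in the fourth and later characters, so two codes that name different subtypes of the same condition still count as agreement.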