Abstract
Background
Dementia is one of the leading causes of death in elderly people. In Korea, the government supports dementia screening for people aged 60 and older, but it is not a population-based organized screening program. As a result, early diagnosis and early treatment are difficult to achieve. The objective of this study was to detect unrecognized dementia using Korean national claims data, to support early diagnosis.
Methods
This study used National Health Insurance claims data in Korea. The case group comprised people with a new dementia visit, treated as the unrecognized group, while the control group was randomly selected from the general population without dementia. The predictors included health utilization, socioeconomic and demographic variables, procedure codes, diagnostic codes and health screening data. The health utilization variables included length of stay, the number of outpatient visits, health expenditures and other encounter information. All medical utilization data were aggregated monthly. In the case group, 12 months of data from 6 months or 2 years before the onset of dementia were used. In the control group, 12 months of data from a randomly selected time point were included. A Transformer, embedding methods, and Time2Vec were used as the deep learning methods. This study included 453,306 incident dementia cases, representing almost all cases reported from 2010 to 2019 in Korea. The control group consisted of 669,873 people.
Results
After training the deep learning prediction model, the AUC on the test dataset was 0.87 for 6-month dementia prediction and 0.76 for 2-year dementia prediction. The most important variables were health utilization variables, especially health expenditures. The health utilization trajectories of the case group were significantly different from those of the control group.
Conclusions
Based on the claims data, we could detect unrecognized dementia cases.
The model can be extended to other chronic diseases such as cancer and stroke, which carry a significant burden of disease.
Key messages
• Screening in healthcare facilities or using lab test results would be more accurate, but it would be difficult to apply to a large population because of cost-effectiveness and timeliness constraints.
• On the other hand, a detection model using existing claims data would help to develop a cost-effective mass screening program. It can compensate for the weaknesses of the current screening program.
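The input construction described in the Methods (monthly aggregation, a 12-month observation window, and a Time2Vec time encoding) can be illustrated with a minimal sketch. It is not the study's implementation. It assumes that months are integer indices, that the window ends 6 or 24 months before the onset month, and that months without claims count as zero. The function names (`observation_window`, `time2vec`, `build_sequence`) are hypothetical, and the Time2Vec frequencies and phases, which would be learned in a real model, are passed in as fixed values.

```python
import math


def observation_window(onset_month, lead_months, length=12):
    """Month indices of a `length`-month window ending `lead_months` before onset.

    Assumption: months are integer indices and the onset month itself
    is never part of the window.
    """
    end = onset_month - lead_months  # first month excluded from the window
    return list(range(end - length, end))


def time2vec(tau, omegas, phis):
    """Time2Vec encoding: one linear component followed by sine components."""
    linear = omegas[0] * tau + phis[0]
    periodic = [math.sin(w * tau + p) for w, p in zip(omegas[1:], phis[1:])]
    return [linear] + periodic


def build_sequence(monthly, onset_month, lead_months, omegas, phis):
    """Pair each month's aggregated value with its time encoding.

    `monthly` maps month index -> aggregated utilization (e.g. expenditure);
    months with no claims are assumed to contribute 0.0.
    """
    return [
        (monthly.get(m, 0.0), time2vec(m, omegas, phis))
        for m in observation_window(onset_month, lead_months)
    ]
```

For example, `observation_window(100, 6)` covers months 82 to 93 and `observation_window(100, 24)` covers months 64 to 75. In a full model, each (value, encoding) pair would be combined with embeddings of the diagnostic and procedure codes before being passed to the Transformer.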