This study aimed to assess agreement on Lung CT Screening Reporting and Data System (Lung-RADS) 4X categorization between radiologists and an expert-adjudicated reference standard, and to investigate whether training improved the agreement measures and the diagnostic potential for lung cancer.

Category 4 nodules in the Korean Lung Cancer Screening Project were identified retrospectively, and each 4X nodule was matched with one 4A or 4B nodule. An expert panel re-evaluated the categories and determined the reference standard. Nineteen radiologists were asked to determine the presence of CT features of malignancy and the 4X categorization for each nodule. The review was performed in two sessions, and training material was provided after session 1. Agreement on 4X categorization between radiologists and the expert-adjudicated reference standard, and agreement between radiologist-assessed 4X categorization and lung cancer diagnosis, were evaluated.

The 48 expert-adjudicated 4X nodules and 64 non-4X nodules were evenly distributed between the two sessions. The mean proportion of nodules categorized as 4X decreased after training (56.4% ± 16.9% vs. 33.4% ± 8.0%; p < 0.001). Cohen's κ indicated poor agreement in session 1 (0.39 ± 0.16), but agreement improved in session 2 (0.47 ± 0.09; p = 0.03). The increase in agreement in session 2 was observed among inexperienced radiologists (p < 0.05), and experienced and inexperienced reviewers showed comparable agreement in session 2 (p > 0.05). All agreement measures between radiologist-assessed 4X categorization and lung cancer diagnosis increased in session 2 (p < 0.05).

Radiologist training can improve reader agreement on 4X categorization, leading to enhanced diagnostic performance for lung cancer.

• Agreement on 4X categorization between radiologists and an expert-adjudicated reference standard was initially poor, but improved significantly after training.
• The mean proportion of 4X categorization by 19 radiologists decreased from 56.4% ± 16.9% in session 1 to 33.4% ± 8.0% in session 2.
• All agreement measures between the 4X categorization and lung cancer diagnosis increased significantly in session 2, implying that appropriate training and guidance increased the diagnostic potential of category 4X.
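
For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of Cohen's κ for one reader against a reference standard, assuming each nodule receives a binary label (1 = 4X, 0 = non-4X). The function name `cohen_kappa` and the toy label lists are hypothetical and purely illustrative; they are not study data, and the study's own statistical procedure is not described here.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("rating sequences must be non-empty and of equal length")
    n = len(ratings_a)

    # Observed agreement: fraction of items on which both raters give the same label.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal proportions, summed over all labels.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    labels = set(count_a) | set(count_b)
    p_e = sum(count_a[label] * count_b[label] for label in labels) / (n * n)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


# Illustrative labels only (NOT study data): 1 = 4X, 0 = non-4X.
toy_reader = [1, 1, 1, 0, 0, 0, 1, 0]
toy_reference = [1, 1, 0, 0, 0, 0, 1, 1]
toy_kappa = cohen_kappa(toy_reader, toy_reference)  # observed 0.75, chance 0.50
print(toy_kappa)  # 0.5
```

In this toy case the reader matches the reference on 6 of 8 nodules (observed agreement 0.75), while the marginals imply a chance agreement of 0.50, giving κ = 0.5. A summary such as 0.39 ± 0.16 would then correspond to the mean and spread of such per-reader values, under the assumption that κ was computed separately for each radiologist.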