Early identification of at-risk infants is imperative for proper referral to intervention programs. The Alarm Distress Baby Scale (ADBB) is an eight-item observer-rated screening tool for detecting social withdrawal in infants. A shortened five-item version of the scale (m-ADBB) has previously been proposed. To date, few studies have examined the validity of the two scales, and none has examined the validity of the ADBB after its implementation as a universal screening tool in primary care. The aim of this study is to use Item Response Theory (IRT) to examine the construct validity of the ADBB when used by public health visitors in primary care. Participants were 24,752 infants (aged 2-12.9 months) screened by public health visitors using the ADBB. Screenings were categorized into three waves according to the infant's age at the time of screening (2-3.9 months, 4-7.9 months, and 8-12.9 months). Analyses were conducted separately for each wave. We checked four IRT assumptions: (a) unidimensionality, (b) monotonicity, (c) local independence, and (d) absence of differential item functioning (DIF) in relation to infant sex and gestational age. The two-parameter logistic model (2PLM) was used to assess model fit and estimate model parameters. The items fulfilled the assumptions of unidimensionality, monotonicity, and no clinically and statistically significant DIF. Local independence did not hold for all items: items 2, 7, and 8 showed local dependence. The items showed moderate to good discriminatory abilities (alpha values ≥ 1.11) and discriminated best at above-average levels of social withdrawal (theta values ≥ 1.33). Items 7 and 8 showed nearly identical item characteristic curves (ICCs), suggesting that the two items discriminate equally well at the same level of social withdrawal. In addition, items 4 and 6 discriminated best at very high levels of social withdrawal, which might be of limited interest for screening purposes. Finally, the items showed similar patterns of discrimination and location parameters across the three waves.
The ADBB shows several psychometric strengths when used by public health visitors in primary care, and the items show good discriminatory abilities at the levels of social withdrawal that are of interest for screening purposes. Yet the results also suggest that, for first-line screening, the validity of the scale might be improved by removing items 4, 6, and 8, as proposed in the m-ADBB. However, before implementation of the m-ADBB can be recommended, studies comparing the criterion-related validity of the two scales are needed.
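
For readers less familiar with the 2PLM named in the analysis, the following is a minimal sketch of its item characteristic curve and item information function for a dichotomously scored item. It assumes the standard parameterization P(theta) = 1 / (1 + exp(-a(theta - b))), where a is the discrimination parameter and b is the location parameter. The function names and the example parameter values are hypothetical illustrations, not the estimation code or the item estimates of this study.

```python
import math


def icc_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a positive item score under the standard 2PL model.

    theta: latent trait level (here, social withdrawal)
    a:     discrimination parameter (slope of the curve at theta == b)
    b:     location parameter (trait level where the probability is 0.5)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item; it peaks at theta == b."""
    p = icc_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


if __name__ == "__main__":
    # Hypothetical item: values chosen for illustration only.
    a_example, b_example = 1.11, 1.33
    for theta in (-1.0, 0.0, 1.33, 3.0):
        p = icc_2pl(theta, a_example, b_example)
        info = item_information(theta, a_example, b_example)
        print(f"theta={theta:5.2f}  P={p:.3f}  information={info:.3f}")
```

Under this parameterization an item is most informative at its location parameter, which is why items located at very high trait levels contribute little at the levels targeted by first-line screening.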