Abstract
Computer-assisted coding of job descriptions to standardized occupational classification codes facilitates evaluating occupational risk factors in epidemiologic studies by reducing the number of jobs needing expert coding. We evaluated the performance of the second version of SOCcer, a computerized algorithm designed to code jobs to the US SOC-2010 system based on free-text job titles and work tasks. SOCcer v2 was updated by expanding the training data to include jobs from several epidemiologic studies and revising the algorithm to account for nonlinearity and incorporate interactions. We evaluated the agreement between codes assigned by experts and the highest scoring code from SOCcer v1 and v2 (the score is a measure of confidence in the algorithm-predicted assignment) in 14,714 jobs from three epidemiologic studies. We also linked exposure estimates for 258 agents in the job-exposure matrix CANJEM to the expert and SOCcer v2-assigned codes and compared those estimates using kappa and intraclass correlation coefficients (ICCs). Analyses were stratified by SOCcer score, score distance between the top two scoring codes from SOCcer, and features from CANJEM. SOCcer v2's agreement at the 6-digit level was 50%, compared to 44% in v1, and was similar for the three studies (38%-45%). Overall agreement for v2 at the 2-, 3-, and 5-digit levels was 73%, 63%, and 56%, respectively. For v2, median ICCs for the probability and intensity metrics were 0.67 (IQR 0.59-0.74) and 0.56 (IQR 0.50-0.60), respectively. The agreement between the expert and SOCcer-assigned codes increased linearly with SOCcer score. The agreement also improved when the top two scoring codes had larger differences in score. Overall agreement with SOCcer v2 applied to job descriptions from North American epidemiologic studies was similar to the agreement usually observed between two experts. SOCcer's score predicted agreement with experts and can be used to prioritize jobs for expert review.
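The agreement measures described above can be illustrated with a minimal sketch. It is not SOCcer's code or the authors' analysis pipeline. It assumes that SOC-2010 codes are written as "NN-NNNN" and that agreement at the 2-, 3-, 5-, and 6-digit levels means the leading digits match. The function names, the example codes, and the score threshold are all hypothetical.

```python
# Assumption: each aggregation level is defined by this many leading digits.
LEVELS = (2, 3, 5, 6)


def soc_prefix(code: str, digits: int) -> str:
    """Return the first `digits` digits of a SOC code such as '11-1011'."""
    return code.replace("-", "")[:digits]


def agreement_by_level(expert: list[str], predicted: list[str]) -> dict[int, float]:
    """Proportion of jobs whose expert and predicted codes match at each level."""
    if len(expert) != len(predicted) or not expert:
        raise ValueError("expert and predicted must be non-empty and equal in length")
    n = len(expert)
    return {
        d: sum(soc_prefix(e, d) == soc_prefix(p, d) for e, p in zip(expert, predicted)) / n
        for d in LEVELS
    }


def review_queue(job_ids: list[str], scores: list[float], threshold: float) -> list[str]:
    """IDs of jobs whose top score is below `threshold`, lowest score first."""
    low = [(score, job_id) for job_id, score in zip(job_ids, scores) if score < threshold]
    return [job_id for _, job_id in sorted(low)]


# Illustrative data only; these are not jobs or scores from the study.
expert = ["11-1011", "29-1141", "47-2031", "53-3032"]
predicted = ["11-1021", "29-1141", "47-2061", "41-2031"]
agreement = agreement_by_level(expert, predicted)
queue = review_queue(["a", "b", "c"], [0.9, 0.2, 0.4], threshold=0.5)
```

Sorting the low-scoring jobs by ascending score reflects the abstract's finding that agreement with experts rises with score, so the least confident assignments would be reviewed first. The threshold of 0.5 is arbitrary here.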