Background: Analyses of gender in academic authorship are key to characterizing representation in surgical fields, but current methods of manual data collection are time-consuming and error-prone. The purpose of this study was to design a program that automatically extracts publication data and to verify its accuracy against manually collected data in a pilot study of three orthopaedic surgery journals. Methods: Publications from three orthopaedic subspecialty journals between January 2019 and June 2021 were identified via PubMed search. For each publication, online publication date, journal issue month, first author name, and senior author name were collected from PubMed listings both by hand and programmatically with a Python script (JournalADE). Gender was determined using Gender API. Results: For all journals, more than 95% of publications had manually and programmatically collected online publication dates within 14 days of each other. Agreement for online publication date was 98.3% (95% CI=97.84-98.76%), with a mean difference of 6.43 (SD 0.87) days. Agreement for journal issue month was 99.6% (95% CI=99.37-99.83%). Agreement for first author gender was 97.33% (95% CI=96.75-97.91%) and for senior author gender was 96.77% (95% CI=96.14-97.40%). Estimated labor time was 100 hr for manual collection, compared to 15 min for JournalADE. Conclusions: Agreement between JournalADE-collected and manually collected data was high, and JournalADE required a fraction of the labor time. This supports the efficacy of JournalADE and sets the stage for its use in future studies of gender in authorship.
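To make the agreement statistics concrete, the following is a minimal Python sketch of how percent agreement and a 95% confidence interval could be computed for paired manual and programmatic dates. It is not the JournalADE code. It assumes that two dates count as agreeing when they fall within 14 days of each other and that the interval is a normal-approximation (Wald) interval; the abstract does not state which interval method was used. The function names and the sample records are hypothetical.

```python
from datetime import date
from math import sqrt


def dates_agree(manual, program, tolerance_days=14):
    """True when two dates differ by at most tolerance_days (assumed criterion)."""
    return abs((manual - program).days) <= tolerance_days


def agreement_with_ci(matches, z=1.96):
    """Return (proportion, lower, upper) for a list of booleans.

    Uses a Wald (normal-approximation) interval, clipped to [0, 1].
    This interval method is an assumption, not taken from the study.
    """
    n = len(matches)
    if n == 0:
        raise ValueError("no paired records to compare")
    p = sum(matches) / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical paired records: (manually collected date, program-collected date).
pairs = [
    (date(2020, 1, 10), date(2020, 1, 12)),
    (date(2020, 3, 1), date(2020, 3, 1)),
    (date(2020, 5, 20), date(2020, 7, 1)),
    (date(2021, 2, 14), date(2021, 2, 20)),
]

matches = [dates_agree(manual, program) for manual, program in pairs]
proportion, lower, upper = agreement_with_ci(matches)
print(f"agreement = {proportion:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

With only four illustrative records the interval is very wide; on a sample the size of a multi-journal pilot it would narrow considerably. The same `agreement_with_ci` helper could be applied to exact-match comparisons such as journal issue month or inferred author gender.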