Abstract

AbstractIn this article, it is shown how item text can be represented by (a) 113 features quantifying the text's linguistic characteristics, (b) 16 measures of the extent to which an information‐retrieval‐based automatic question‐answering system finds an item challenging, and (c) through dense word representations (word embeddings). Using a random forests algorithm, these data then are used to train a prediction model for item response times and predicted response times then are used to assemble test forms. Using empirical data from the United States Medical Licensing Examination, we show that timing demands are more consistent across these specially assembled forms than across forms comprising randomly‐selected items. Because an exam's timing conditions affect examinee performance, this result has implications for exam fairness whenever examinees are compared with each other or against a common standard.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.