Measurement is at the heart of scientific research. As many-perhaps most-psychological constructs cannot be directly observed, there is a steady demand for reliable self-report scales to assess latent constructs. However, scale development is a tedious process that requires researchers to produce good items in large quantities. In this tutorial, we introduce, explain, and apply the Psychometric Item Generator (PIG), an open-source, free-to-use, self-sufficient natural language processing algorithm that produces large-scale, human-like, customized text output within a few mouse clicks. The PIG is based on the GPT-2, a powerful generative language model, and runs on Google Colaboratory-an interactive virtual notebook environment that executes code on state-of-the-art virtual machines at no cost. Across two demonstrations and a preregistered five-pronged empirical validation with two Canadian samples (NSample 1 = 501, NSample 2 = 773), we show that the PIG is equally well-suited to generate large pools of face-valid items for novel constructs (i.e., wanderlust) and create parsimonious short scales of existing constructs (i.e., Big Five personality traits) that yield strong performances when tested in the wild and benchmarked against current gold standards for assessment. The PIG does not require any prior coding skills or access to computational resources and can easily be tailored to any desired context by simply switching out short linguistic prompts in a single line of code. In short, we present an effective, novel machine learning solution to an old psychological challenge. As such, the PIG will not require you to learn a new language-but instead, speak yours. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
Read full abstract