Researchers often combine positively and negatively worded items when constructing Likert scales. This combination, however, may introduce method effects that stem from differences in item wording. Although previous studies have tried to quantify these effects by applying factor analysis to scales with different content, the impact of item wording on participants' choices among specific response options remains unexplored. To address this gap, we used four versions of the Undergraduate Learning Burnout (ULB) scale, each characterized by a unique valence of item wording. After collecting responses from 1,131 college students, we analyzed the data with unidimensional, multidimensional, and bi-factor Graded Response Models. The results suggested a unidimensional structure for the learning burnout trait measured by the ULB scale. However, including different valences of wording within items introduced additional method factors that explained a considerable share of the variance. Notably, positively worded items showed greater discriminative power and more effectively counteracted the biased outcomes associated with negatively worded items, especially between the "Strongly Disagree" and "Disagree" options. Although overall learning burnout trait levels did not differ substantially among respondents to the different scale versions, their distributions varied slightly. Mixing positive and negative wordings reduced the reliability of the learning burnout trait measurement. Consequently, we recommend using exclusively positively worded items and avoiding mixed item wording during scale construction. If a combination is essential, the bi-factor IRT model might help separate the method effects that result from wording valence.