Background
Vocational education and training (VET) aims to enable young adults or trainees to participate in the workplace and to promote their vocational capacities. Appropriate instruments are needed to examine trainees’ competencies at the end of VET. This contribution aims (1) to outline such an instrument, designed to evaluate vocational competencies in the field of economics, and (2) to present the results of an empirical comparison of two possible test modes: computer-based assessment (CBA) versus paper-based assessment (PBA). The use of new technologies offers various opportunities for competence measurement: in particular, the computer as an assessment medium is itself an authentic work tool drawn from professional life and promises novel ways of designing assessments. However, current assessment practice in Germany is dominated by traditional PBA, and there is little evidence about the possible effects of CBA. This study addresses the question of whether the two modes differ significantly in how they represent and measure commercial competence with respect to specific content, item format, and motivational aspects.

Methods
A sample of 387 trainees from the German VET system was used to compare the two kinds of assessment. The analyses were based on Item Response Theory and, in particular, Differential Item Functioning to detect differences between PBA and CBA at the item level. In addition to the performance data, motivational aspects such as emotional state and test attractiveness were captured with a pre-/post-questionnaire.

Results
The study demonstrates that both test formats (CBA and PBA) can represent commercial competence in a valid and reliable way, but differences were found for certain items in the number of correct responses. The PBA shows a slight advantage in overall item and model fit. Another key finding of the comparative study at the item level is instructive: (domain-)specific items are easier to solve in CBA than in PBA, whereas more general items are answered correctly more frequently in PBA. Contrary to expectations, we could not confirm an overall advantage of CBA over PBA in terms of test takers’ motivation, although the values from CBA were more stable over time.

Conclusions
The study made the strengths and weaknesses of both test formats evident and thereby points to opportunities for further development in assessment practice and test design. Selected design criteria and aspects of test administration are discussed, with the aim of optimizing test development so as to provide the best possible estimates of young adults’ competence and capacity to participate in the world of work.