ABSTRACT Although research has studied argumentation in science learning, it remains necessary to understand the criteria teachers and students apply to evaluate arguments as they are produced in the classroom. Researchers have favored Toulmin's argument pattern as the model for analyzing arguments. However, the model is impractical for people who must evaluate arguments while arguing or immediately after a discussion. This study explored the criteria two student judges applied to evaluate the arguments that their classmates, pre-service chemistry teachers, developed in a remote simulated jury. We applied a multi-level method for discourse analysis, which yielded three key findings: (1) the student judges developed a symmetrical, jointly built explanation, an unprecedented finding given that discourse analysts consider asymmetry an inherent feature of explanations; (2) they applied diverse criteria to evaluate the arguments; (3) they accomplished the evaluation through discursive procedures related to interactive explanation, such as informing about issues related to the evaluations, asking for confirmation and clarification, and presenting arguments developed and silenced in the simulated jury. Implications include a set of exemplary criteria that teachers can use to evaluate students' arguments and to instruct students on how to evaluate arguments. In conclusion, we discuss the relevance of the findings and of the multi-level method for the science education research community, providing new ways to analyze instructional discourse.