Objective: We report on a comparative study of the language used by middle school students in their answers to a constructed-response test of science inquiry knowledge.

Background: Text analyses using statistical models have been conducted across a number of disciplines to identify topics in a journal, to extract topics from Twitter messages, and to investigate political preferences. In education, relatively few studies have analyzed the text of students' written answers to investigate the topics underlying those answers.

Methodology: Two types of linguistic analysis were compared to investigate their utility for understanding students' learning of scientific investigation practices. First, a statistical method, latent Dirichlet allocation (LDA), was used to extract topics from the texts of student responses. In the LDA model, topics are viewed as multinomial distributions over the vocabulary of the documents. These topics were examined for content and used to characterize student responses on the constructed-response items, and the change from pre-test to post-test in the proportion of each topic was related to students' learning. Next, a qualitative method, systemic functional linguistics (SFL) analysis, was used to analyze the text of student responses on the same test of science inquiry knowledge. Student assessments were analyzed for two linguistic features that are important for convincing scientific communication: technical vocabulary usage and high lexical density. In this way, we investigated whether human judgement of the changes observed in the texts under the SFL framework agreed with the inferences about those changes drawn through LDA.

Research questions: Two research questions were investigated in this study: (1) What do the LDA and SFL analyses tell us about students' answers? (2) What are the similarities and differences between the two analyses?
Data: The data for this study were taken from an NSF-funded host study on teaching science inquiry skills to middle school students who were a mix of native English speakers and English-language learners. The primary objective was to enable participants to take ownership of scientific language through the use of language-rich science investigation practices. The LDA analysis used a sample of 252 students' pre- and post-assessments. The SFL analysis used a second sample of 90 students' pre- and post-assessments.

Results: In the LDA analysis, three topics were detected in student responses: "preponderance of everyday language (Topic 1)," "preponderance of general academic language (Topic 2)," and "preponderance of discipline-specific language (Topic 3)." Students' use of topics changed from pre-test to post-test: responses on the post-test tended to have higher proportions of Topic 3 than responses on the pre-test. In the SFL analysis, students tended to use more technical vocabulary and to have higher lexical density in their written responses on the post-test than on the pre-test.

Discussion: Results from the LDA and SFL analyses suggest that students responded using more discipline-specific language on the post-test than on the pre-test. In addition, the results for the two linguistic features from the SFL analysis, technical vocabulary usage and lexical density, were compared with the results from the LDA analysis.

Conclusion: Results of the LDA and SFL analyses were consistent with each other and clearly showed that students improved in their ability to use the discipline-specific and academic terminology of the language of scientific communication.
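The lexical-density feature from the SFL analysis can be sketched concretely: lexical density is the share of content (lexical) words among all words in a text. The sketch below approximates function words with a small hand-picked stop list purely for illustration; a real SFL analysis would identify content words by part of speech. The example sentences are hypothetical, not taken from the study's data.

```python
# Minimal sketch of lexical density: content words / total words.
# Function words are approximated here by a small hand-picked list
# (a simplification; SFL analysis would use part-of-speech tagging).
FUNCTION_WORDS = {
    "the", "a", "an", "and", "or", "but", "of", "to", "in", "on",
    "it", "we", "was", "is", "are", "be", "because", "that", "this",
}

def lexical_density(text: str) -> float:
    """Return the fraction of words that are content (lexical) words."""
    words = text.lower().split()
    if not words:
        return 0.0
    content = [w for w in words if w not in FUNCTION_WORDS]
    return len(content) / len(words)

# Hypothetical pre- and post-test style answers: the denser, more
# technical phrasing scores higher.
pre = "we put it in the water and it got hot"
post = "increasing light intensity raised the measured water temperature"
print(lexical_density(pre))   # 4 content words of 10
print(lexical_density(post))  # 7 content words of 8
```

A post-test shift toward higher values of this ratio is the kind of change the SFL analysis reports alongside increased technical vocabulary.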