Introduction: In academic writing, writers often combine linguistic and rhetorical features to construct texts that accomplish specific communicative purposes. However, integrating these resources effectively may pose challenges for developing writers.
 Purpose: This study used a corpus-based genre analysis to investigate phrasal complexity features and rhetorical functions in data commentaries written by Iranian undergraduate and graduate students. We aimed to examine the relatively unexplored genre of data commentary in terms of its phrasal complexity features, its rhetorical functions, and the relationships between the two, and thereby to provide insights into these students' writing practices.
 Method: Convenience sampling was used to select 76 university students: 47 undergraduates and 29 graduates. The participants produced a corpus of 380 data commentaries, which were then examined and compared. To identify instances of phrasal complexity features, the researchers used the AntConc software tool, applying regular expressions (regex) to extract potential occurrences. A Python program was then developed to calculate the frequencies of the identified features. The rhetorical functions in the data commentaries were annotated manually.
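 As an illustration of the extraction-and-counting step described in the Method, the following is a minimal sketch and not the study's actual program. The two regex patterns, the function names, and the normalization to a rate per 1,000 words are all assumptions introduced here. The patterns are rough surface cues that return candidate hits only, so they would still require manual checking.

```python
import re
from collections import Counter

# Hypothetical patterns (assumptions, NOT the study's regexes): surface cues only.
PATTERNS = {
    # Words ending in common nominalizing suffixes; over-matches (e.g. "city").
    "nominalization": re.compile(r"\b\w+(?:tion|ment|ness|ity)s?\b", re.IGNORECASE),
    # Every occurrence of "of" as a candidate of-phrase.
    "of_phrase": re.compile(r"\bof\b", re.IGNORECASE),
}

def count_features(text):
    """Return raw counts of each candidate feature in one text."""
    return Counter({name: len(p.findall(text)) for name, p in PATTERNS.items()})

def per_thousand_words(text):
    """Normalize counts to a rate per 1,000 words (an assumed normalization)."""
    n_words = len(re.findall(r"\b\w+\b", text))
    if n_words == 0:
        return {name: 0.0 for name in PATTERNS}
    counts = count_features(text)
    return {name: counts[name] * 1000 / n_words for name in PATTERNS}

# Invented example sentence, not taken from the corpus.
sample = ("The chart shows an increase in the consumption of energy "
          "and the development of industry.")
print(count_features(sample))
print(per_thousand_words(sample))
```

 Normalizing by text length is one common way to make commentaries of different lengths comparable; the abstract does not state which normalization, if any, was applied.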
 Results: Statistical analyses, including the Mann-Whitney U test and Spearman's rank correlation, revealed that graduate students used significantly more phrasal complexity features, including attributive adjectives, nominalizations, and prepositional phrases (of), than undergraduate students. However, a qualitative analysis showed that the use of these linguistic features was influenced by the writing topic. Regarding rhetorical functions, graduate students used more moves and/or steps related to presenting and commenting on data, while undergraduate students produced more moves and/or steps involving personal asides. Moreover, certain phrasal complexity features were correlated with particular moves and/or steps, in line with recent corpus-based studies.
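 For readers unfamiliar with the two nonparametric statistics named in the Results, the sketch below computes the Mann-Whitney U statistic and Spearman's rho in plain Python. It is illustrative only: the function names and the toy numbers are invented here and are not the study's data or code. It returns the statistics without p-values, which in practice would usually come from a statistics library.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: its rank sum minus the smallest possible rank sum."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    return sum(ranks[:n_a]) - n_a * (n_a + 1) / 2

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5  # undefined if either input is constant

# Invented toy numbers (assumptions), e.g. feature rates per 1,000 words.
graduate = [12.0, 15.5, 14.0, 18.2]
undergraduate = [9.0, 11.5, 12.0, 8.4]
u_graduate = mann_whitney_u(graduate, undergraduate)

# Invented toy numbers: a feature rate and a move count for five texts.
feature_rate = [1, 2, 3, 4, 5]
move_count = [2, 1, 4, 3, 5]
rho = spearman_rho(feature_rate, move_count)
print(u_graduate, rho)
```

 A U value close to its maximum (the product of the two group sizes) indicates that values in the first group tend to exceed those in the second; rho ranges from -1 to 1.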
 Conclusion: The study concludes with pedagogical implications.