With the prevalence of generative AI tools such as ChatGPT, automated detectors of AI-generated text are increasingly used in education to detect misuse of these tools (e.g., cheating in assessments). Recently, the responsible use of these detectors has attracted considerable attention. Research has shown that publicly available detectors are more likely to misclassify essays written by non-native English speakers as AI-generated than essays written by native English speakers. In this study, we address these concerns using carefully sampled, large-scale data from the Graduate Record Examinations (GRE) writing assessment. We developed multiple detectors of ChatGPT-generated essays based on linguistic features from the ETS e-rater engine and text perplexity features, and investigated their performance and potential bias. Results showed that our carefully constructed detectors not only achieved near-perfect detection accuracy but also showed no evidence of bias against non-native English speakers. The findings of this study contribute to the ongoing debate over policies for using AI-generated content detectors in education.