The development of artificial intelligence makes people’s life and work easier and more effective, and computer-based online exams and marking not only improve students’ learning efficiency but also reduce the pressure of teachers’ marking work. For objective questions, marking has gone from manual marking to cursor reader marking to computerized character matching, and the correct rate of marking has soared to 100%; for subjective questions, foreign systems such as PEG and E-rater have been used, and domestic systems such as those using English large corpus similarity matching and those based on natural language understanding using intelligent algorithms have been used for marking. Most of these systems are based on some shallow linguistic features such as rules and LSAs for marking, and there is no deep perception of English language sense. Although the current intelligent marking systems have made a lot of achievements, they do not fundamentally solve the problem of the rationality of intelligent marking of subjective questions. In this article, we propose a regularized discriminant analysis algorithm with good estimation of the mean, and a dimensionality reduction algorithm for high-dimensional missing data by using the relevant research results of random matrix theory to address the problems of traditional machine learning methods in high-dimensional data analysis. Although the linear discriminant analysis algorithm performs well in solving many practical problems, it works poorly in dealing with high-dimensional data. The specific analysis is as follows: in terms of age characteristics, the mobile population under the age of 35 has a significant preference for urban consumer comfort, and it increases with age, peaking at the stage of 30–35 years old and then decreasing rapidly. For this reason, a regularized discriminant analysis algorithm based on random matrix theory is proposed. First, a good estimate of the high-dimensional covariance matrix is made by the nonlinear shrinkage method or the eigenvalue interception method, respectively; then, the estimated high-dimensional covariance matrix is used to calculate the discriminant function values and perform the classification. The classification experiments conducted on simulated and real datasets show that the proposed algorithm is not only more widely applicable but also has a high correct classification rate.