Objective: To construct a YOLOv5l-based deep learning model for diagnosing superficial esophageal squamous cell carcinoma (ESCC) and precancerous lesions in endoscopic images, with the aim of improving the endoscopic diagnosis of early ESCC and precancerous lesions.
Methods: A total of 13,009 endoscopic esophageal images acquired with white light imaging (WLI), narrow band imaging (NBI) and Lugol chromoendoscopy (LCE) were collected from 1,126 patients at the Cancer Hospital, Chinese Academy of Medical Sciences, between June 2019 and July 2021. The images covered low-grade intraepithelial neoplasia, high-grade intraepithelial neoplasia, ESCC limited to the mucosal layer, benign esophageal lesions and normal esophagus. Using a computerized random function, the images were divided into a training set (11,547 images from 1,025 patients) and a validation set (1,462 images from 101 patients). The YOLOv5l model was trained on the training set and evaluated on the validation set. The validation set was also diagnosed separately by two senior and two junior endoscopists, and their results were compared with those of the YOLOv5l model.
Results: In the validation set, the accuracy, sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of the YOLOv5l model for diagnosing early ESCC and precancerous lesions were 96.9%, 87.9%, 98.3%, 88.8% and 98.1% in the WLI mode; 98.6%, 89.3%, 99.5%, 94.4% and 98.2% in the NBI mode; and 93.0%, 77.5%, 98.0%, 92.6% and 93.1% in the LCE mode. Accuracy was higher in the NBI mode than in the WLI mode (P<0.05) and lowest in the LCE mode (P<0.05). The diagnostic accuracies of the YOLOv5l model in the WLI, NBI and LCE modes were similar to those of the two senior endoscopists (96.9%, 98.8%, 94.3% and 97.5%, 99.6%, 91.9%, respectively; P>0.05), but significantly higher than those of the two junior endoscopists (84.7%, 92.9%, 81.6% and 88.3%, 91.9%, 81.2%, respectively; P<0.05).
Conclusion: The YOLOv5l model achieves high accuracy in diagnosing early ESCC and precancerous lesions in the endoscopic WLI, NBI and LCE modes, and can help junior endoscopists improve their diagnostic performance and reduce missed diagnoses.
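
For readers less familiar with the five performance measures reported in the Results, the following is a minimal sketch of how they are conventionally derived from binary confusion-matrix counts. It assumes an image-level, lesion versus non-lesion tally; the abstract does not state how the study itself tallied predictions. The names `ConfusionCounts` and `diagnostic_metrics` are hypothetical, and the counts are made up for illustration, not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for binary confusion-matrix counts."""
    tp: int  # lesion images called positive
    fp: int  # non-lesion images called positive
    tn: int  # non-lesion images called negative
    fn: int  # lesion images called negative


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return the five standard measures as fractions in [0, 1].

    Assumes every denominator is non-zero, i.e. the evaluation set
    contains both classes and both kinds of prediction.
    """
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# Made-up counts for illustration only; they are not data from the study.
example = ConfusionCounts(tp=80, fp=10, tn=400, fn=20)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because non-lesion images typically outnumber lesion images in such a set, accuracy and NPV can stay high even when sensitivity is noticeably lower, which is one way to read the pattern of the LCE figures above.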