Computer vision education is increasingly important in modern technology curricula, yet it often lacks a systematic approach that integrates theoretical concepts with practical applications. This study proposes a four-stage framework for computer vision education that progressively builds learners’ competencies, from basic image recognition to advanced video analysis. Two rounds of validity assessment were conducted with 25 experts in AI education and curricula, and the results indicated high validity for the framework. Additionally, a pilot program applying computer vision to acid–base titration activities was implemented with 40 upper secondary school students to evaluate the framework’s effectiveness. The pilot program showed significant improvements in students’ understanding of and interest in both computer vision and scientific inquiry. This research contributes to the field of AI education by offering a structured, adaptable approach to computer vision education that integrates AI, data science, and computational thinking. It provides educators with a guide for implementing progressive, hands-on learning experiences in computer vision, and it highlights areas for future research and improvement in educational methodologies.