Abstract: The ability to recognize text in images is of great importance in a range of applications, including document analysis, image captioning, and augmented reality. Text recognition models based on Optical Character Recognition (OCR) and Maximally Stable Extremal Regions (MSER) algorithms have substantially improved the reliability and accuracy of text extraction from images. In this study, we propose a text recognition model that combines the strengths of OCR and MSER to further improve the reliability and accuracy of the extraction process. OCR algorithms form the basis of text recognition by separating and identifying individual characters. Nonetheless, these algorithms can struggle with complex backgrounds, poor-quality images, or text arranged in irregular layouts. To address these limitations, we integrate the MSER algorithm, which detects text regions well by identifying regions that remain maximally stable across different scales and intensities. The proposed model follows a multi-stage approach. First, the MSER method is applied to the input image to extract probable text locations. These regions are then refined using pre-processing techniques, such as noise removal and image enhancement, to improve OCR performance. Next, the refined regions are passed to the OCR algorithm, which uses machine learning and pattern recognition techniques to recognize the text within each region. Finally, the recognized text is post-processed to refine the results and improve overall accuracy. The model is implemented using a CNN-based OCR component together with the MSER algorithm.
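The detection stage rests on the idea of region stability across intensity thresholds. As a rough illustration only, and not the implementation used in this study, the following pure-Python sketch grows a dark region around a seed pixel at increasing thresholds and scores each threshold by how little the region's area changes nearby. The function names, the toy image, and the parameters `delta` and `max_area` are assumptions made for this example.

```python
from collections import deque


def component_area(img, seed, t):
    """Area of the 4-connected region of pixels <= t containing seed (0 if seed is brighter than t)."""
    rows, cols = len(img), len(img[0])
    if img[seed[0]][seed[1]] > t:
        return 0
    seen = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in seen and img[nr][nc] <= t:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return len(seen)


def stability_scores(img, seed, thresholds, delta, max_area):
    """Map each threshold to (area(t+delta) - area(t-delta)) / area(t); lower means more stable."""
    areas = {t: component_area(img, seed, t) for t in thresholds}
    scores = {}
    for i in range(delta, len(thresholds) - delta):
        t = thresholds[i]
        # Skip empty regions and regions too large to be a text candidate.
        if areas[t] == 0 or areas[t] > max_area:
            continue
        lower, upper = thresholds[i - delta], thresholds[i + delta]
        scores[t] = (areas[upper] - areas[lower]) / areas[t]
    return scores


def most_stable_threshold(img, seed, thresholds, delta=1, max_area=12):
    """Threshold with the lowest stability score, or None if no region qualifies."""
    scores = stability_scores(img, seed, thresholds, delta, max_area)
    return min(scores, key=scores.get) if scores else None


# Toy 5x5 grayscale image: a dark 2x2 blob (a stand-in for a character) on a light background.
toy_image = [
    [200, 200, 200, 200, 200],
    [200, 20, 20, 200, 200],
    [200, 20, 20, 200, 200],
    [200, 200, 200, 200, 200],
    [200, 200, 200, 200, 200],
]
toy_thresholds = list(range(0, 256, 20))

best = most_stable_threshold(toy_image, (1, 1), toy_thresholds)
print(best, component_area(toy_image, (1, 1), best))
```

In this toy case the blob keeps the same area over a wide range of thresholds, which is what makes it a stable candidate region. A full MSER detector tracks all regions at once rather than a single seed, and the candidate regions would then be cropped, pre-processed, and passed to the OCR stage.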