Abstract—In this paper, we present a robust algorithm to recognize extracted text from grocery product images captured by a cellular phone.

F. Einsele is with Bern University of Applied Sciences, 3005 Bern, Switzerland (e-mail: farshideh.einsele@bfh.ch). H. Foroosh is with the Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA (e-mail: foroosh@cs.ucf.edu).

I. INTRODUCTION

The use of digital cameras to capture text from natural scenes has motivated a rich body of work on invariant character recognition. One line of such work transforms shape signatures into the Fourier domain and ignores the phase information to gain rotation-invariant features; scale and translation invariance are obtained by normalization and by taking the centroid of the shape as the origin of the coordinate system (a minimal code sketch of such descriptors is given at the end of this section). The reported experimental results show that the centroid-distance and complex-coordinate signatures achieve high precision and recall rates, whereas the curvature and cumulative angular functions deliver unreliable results.

Dionisio et al. [11] also report a contour-based shape classification technique based on polygon approximation that is invariant under rotation and scaling. The vertices of the polygon approximation are formed by high-curvature points of the profile and are selected via the Fourier transform of the object contour. A series of features is computed from the polygonal approximation, and a minimum distance classifier is used for object recognition.

Although such contour-based invariants deliver promising results when Fourier descriptors are used for character recognition, and are also reported in the survey of Trier et al. [2], the reported test and training databases consist of synthetically deformed patterns, and the features are invariant only with respect to translation, scaling, and rotation; they do not account for other transformations occurring in real-world captured images (e.g., shearing, shadowing, bad illumination, and perspective distortion). Moreover, Trier et al. state in [2] that a statistical classification system should take the so-called curse of dimensionality into account, meaning that it should be training-based with a minimum number of training patterns that is 8-10 times larger than the number of chosen features (with 20 features, for example, at least 160-200 training patterns would be needed). As already stated, when the database must contain real-world character images, generating such a training set is expensive and time-consuming.

To sum up, the works mentioned above either rely on synthetically degraded databases or are training-based approaches. In this paper, we present a method for camera-based character recognition that uses a small real-world database extracted from images of grocery products captured by a cellular phone with a resolution of 5 megapixels. The presented method does not require a training set of the size dictated by the curse of dimensionality described above. It can therefore be applied directly to the extracted text, without cost-intensive image enhancement algorithms, and delivers promising results.

The remainder of this paper is organized as follows: Section II introduces the specificities of text in product images; Section III describes our proposed character recognition algorithm, including the feature extraction and classification methods used; Section IV presents our evaluation results; and Section V draws our conclusions and gives a short sketch of our future work.
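As a concrete illustration of the contour-based Fourier descriptors discussed above, the following minimal Python sketch computes translation-, scale-, and rotation-invariant features from a closed character contour. It is our own sketch, not the implementation of the cited works; the function name, the NumPy-based formulation, and the number of retained harmonics are our assumptions.

import numpy as np

def fourier_descriptors(contour, n_harmonics=8):
    # contour: (N, 2) array of ordered (x, y) points on a closed
    # character contour.
    # Complex coordinate signature z(t) = x(t) + j*y(t).
    z = contour[:, 0] + 1j * contour[:, 1]
    # Taking the centroid as the origin gives translation invariance.
    z = z - z.mean()
    coeffs = np.fft.fft(z)
    # Ignoring the phase (keeping only magnitudes) gives invariance
    # to rotation and to the choice of the contour starting point.
    mag = np.abs(coeffs)
    # Normalizing by the first harmonic gives scale invariance.
    mag = mag / mag[1]
    # Keep the lowest positive and negative frequencies as features
    # (index 0 is ~0 after centroid subtraction, index 1 is always 1).
    return np.concatenate((mag[2:2 + n_harmonics],
                           mag[-n_harmonics:]))

Two contours that differ only by translation, scaling, or rotation then map to nearly identical feature vectors.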
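The minimum distance classification step used by such approaches can be sketched equally briefly; here each class is represented by a prototype (e.g., mean) feature vector. Again, the names and the Euclidean metric are our assumptions, not the exact procedure of [11].

import numpy as np

def minimum_distance_classify(features, prototypes):
    # prototypes: dict mapping each class label to its prototype
    # feature vector; the label whose prototype is nearest to the
    # input in Euclidean distance wins.
    return min(prototypes,
               key=lambda label: np.linalg.norm(features - prototypes[label]))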
II. CHALLENGES OF PRODUCT TEXT RECOGNITION

We extract text from images of grocery products taken with cell phones. Text extraction from camera-based images is a relatively well-researched area with plenty of existing works in the literature [12]. However, text extraction from camera-based images is tightly coupled to the specific application, and no generic method is valid across different camera-based scenarios. We therefore use a text extraction algorithm that has been developed for the specificities of text on grocery products and is explained in detail in [13]. The resolution of the cell phone camera used is 5 megapixels, and the images are taken from different angles, with the camera at roughly the distance a common grocery shopper would have from the products when walking along the aisles. The extracted text mostly has a height between 20 and 50 pixels, and the characters can mostly be labeled and segmented using connected component algorithms (a minimal sketch of this step is given below). Table I shows some extracted words in our database.
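As an illustration of this segmentation step, the following sketch labels connected components in a binarized text image and returns candidate character bounding boxes. It is a minimal sketch assuming SciPy; the function name, the speckle-noise threshold, and the left-to-right sorting are our choices, not the exact procedure of [13].

from scipy import ndimage

def segment_characters(binary_text_image, min_area=20):
    # binary_text_image: 2-D array, nonzero where ink is present.
    labels, num = ndimage.label(binary_text_image)
    boxes = []
    for sl in ndimage.find_objects(labels):
        height = sl[0].stop - sl[0].start
        width = sl[1].stop - sl[1].start
        if height * width >= min_area:  # drop speckle noise
            # Store the box as (x, y, width, height).
            boxes.append((sl[1].start, sl[0].start, width, height))
    # Sorting by x approximates reading order within a text line.
    return sorted(boxes, key=lambda box: box[0])

Each returned box can then be cropped out of the image and passed to the character recognizer described in Section III.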