Document analysis and understanding (DAU) systems aim not only to recognise text but also to extract relevant information from a scanned document. Numerous studies have introduced efficient algorithms for document analysis, and some of them have proposed an articulation (speech output) stage, a translation stage, or both. However, little work has been done in the area of English/Arabic (E/A) translation. The main objective of this paper is to combine these three trends: document understanding, E/A translation, and delivery of the output as an articulated voice. The paper focuses on the realisation of a bilingual articulated E/A system based on optical character recognition (OCR). The input to the proposed system is English or Arabic text captured by a scanner or video camera; the output is an articulated voice. The proposed scheme consists of two phases. The first phase comprises several processes, beginning with the conversion of the scanned document into an electronically processable form. Then, in the segmentation step, the essential problem in Arabic OCR is addressed, namely how to cope with the various shapes that the same character can take, and a new methodology for segmenting Arabic characters is presented. At the end of this phase, an efficient text recognition method based on a hybrid description (ANN, FFT) is used. To verify the performance of this phase, experiments with printed text were performed. The error rates were less than 0.1%, and the results showed that the scheme proposed for this phase is very robust. In the second phase, a database of 3000 audio files is used to convert each word of the input text (the output of the first phase) to its corresponding audio. This research can help in many real-time applications such as immediate translation, a machine reader for blind people, and learning.
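
To make the second phase more concrete, the word-to-audio lookup can be sketched roughly as follows. This is only an illustrative sketch, not the system described in the paper: the directory name `audio`, the one-file-per-word naming scheme, the `.wav` extension, and the helper names `tokenize`, `build_index`, and `to_playlist` are all assumptions introduced here.

```python
import re
from pathlib import Path

# Assumption: one audio clip per word, stored as <word>.wav in this folder.
AUDIO_DIR = Path("audio")


def tokenize(text):
    """Split recognised text into lowercase word tokens.

    The pattern keeps runs of Unicode letters, so it covers both
    English words and undiacritised Arabic words.
    """
    return re.findall(r"[^\W\d_]+", text.lower())


def build_index(words):
    """Map each known word to the (hypothetical) path of its audio clip."""
    return {word: AUDIO_DIR / f"{word}.wav" for word in words}


def to_playlist(text, index):
    """Return the clips to play, in reading order, plus the words with no clip."""
    playlist, missing = [], []
    for word in tokenize(text):
        clip = index.get(word)
        if clip is None:
            missing.append(word)
        else:
            playlist.append(clip)
    return playlist, missing


# Usage: a three-word vocabulary standing in for the 3000-file database.
index = build_index(["the", "book", "كتاب"])
playlist, missing = to_playlist("The book, كتاب and pen", index)
```

Returning the unmatched words separately is a design choice made for this sketch; with a fixed vocabulary, some recognised words will have no clip, and the caller needs a way to decide how to handle them.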