OCR Engine Research Articles

We have built a suite of tools in Python to analyze text reuse and intertextuality in a specific kind of medieval Arabic text, commentaries, that is available in print. We take these printed editions, scan them, pre-process the images, pass them to an OCR engine, clean the results, store them in a data structure that mirrors the explicit intertextual relations between the texts, and then perform data analysis on them. Digital approaches to medieval Arabic texts have worked either at the micro-level, in what has become known as a ‘digital edition’, i.e. the digital representation of one text, densely annotated, most commonly in TEI-XML, or at the macro-level, in what is called a ‘digital corpus’, consisting of thousands of loosely encoded and sparsely annotated plain text files, accompanied by an entire infrastructure and high-performing software for broadly scoped queries. The micro-level generally involves tens of thousands of words, while the macro-level can involve over a billion words. The micro-level is explicitly designed to be human readable first, while the macro-level is built to be machine readable first. At the micro-level, every detail needs to be correct and in order, while at the macro-level even a fairly large margin of error is negligible, a mere rounding error. Between these levels we have been seeking a meso-level of digital analysis: neither edition nor corpus, but rather a group of texts amounting to hundreds of thousands to millions of words, with a small but perceptible margin of error and a light but noticeable level of annotation, principally geared towards machine readability but with ample opportunity for visual inspection and manual correction. In this paper we explain the rationale for our approach, the technical achievements it has led to, and the results we have obtained so far.
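The idea of a data structure that mirrors the relation between a base text and its commentaries can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Work` class, its fields, and the `clean_ocr` helper are hypothetical names introduced here, and the cleaning step only normalises Unicode and whitespace as a stand-in for whatever post-OCR cleanup the real pipeline performs.

```python
from dataclasses import dataclass, field
from typing import Optional
import re
import unicodedata


@dataclass(eq=False)
class Work:
    """One text in a commentary tradition (hypothetical structure)."""
    title: str
    text: str = ""
    # The text this work comments on; None for a base text.
    comments_on: Optional["Work"] = field(default=None, repr=False)
    # Works that comment directly on this one.
    commentaries: list["Work"] = field(default_factory=list, repr=False)

    def add_commentary(self, title: str, text: str = "") -> "Work":
        """Attach a new commentary to this work and return it."""
        child = Work(title=title, text=text, comments_on=self)
        self.commentaries.append(child)
        return child

    def lineage(self) -> list[str]:
        """Titles from the base text down to this work."""
        chain = []
        node: Optional["Work"] = self
        while node is not None:
            chain.append(node.title)
            node = node.comments_on
        return chain[::-1]


def clean_ocr(raw: str) -> str:
    """Illustrative cleanup only: normalise Unicode and collapse whitespace."""
    text = unicodedata.normalize("NFC", raw)
    return re.sub(r"\s+", " ", text).strip()


# Build a three-level chain: base text, commentary, gloss on the commentary.
base = Work("Base text", clean_ocr("  matn   line \n one "))
commentary = base.add_commentary("Commentary")
gloss = commentary.add_commentary("Gloss")
print(gloss.lineage())
```

Storing the parent link on each work keeps the chain from a gloss back to its base text explicit, so an analysis can start from any level of the tradition.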

The development of Automatic License Plate Recognition (ALPR) systems has received much attention for English license plates. However, although they are home to the sixth-largest population in the world, Bengali-speaking countries and states have seen no significant progress on ALPR systems that address their more alarming traffic management problems and inadequate road-safety measures. This paper reports a computationally efficient and reasonably accurate ALPR system for Bengali characters with a new end-to-end DNN model that we call Bengali License Plate Network (BLPnet). A cascaded architecture that detects vehicle regions before the vehicle license plate (VLP) is proposed to eliminate false positives, resulting in higher VLP detection accuracy. In addition, a smaller set of trainable parameters is used to reduce the computational cost, making the system faster and better suited to real-time applications. With a new Convolutional Neural Network (CNN) based Bengali OCR engine and a word-mapping process, the model is invariant to character rotation and can readily extract, detect and output the complete license plate number of a vehicle. Fed real-time video footage at 17 frames per second (fps), the model detects a vehicle with a Mean Squared Error (MSE) of 0.0152 and achieves a mean license-plate-character recognition accuracy of 95%. Compared with other models, improvements of 5% and 20% were recorded for BLPnet over the prominent YOLO-based ALPR model and the Tesseract model in number-plate detection accuracy and time requirement, respectively.
• We propose an end-to-end Deep Neural Network (DNN) model called BLPnet.
• BLPnet operates in two separate detection phases, minimizing false-positive detection of number plates.
• With a smaller set of trainable parameters, BLPnet offers impressive computational efficiency.
• BLPnet’s new CNN-based OCR engine offers rotation-invariant character recognition with notably higher accuracy.
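
The two-phase cascade described above (vehicle region first, plate second) can be sketched in plain Python. This is only an illustration of the control flow under stated assumptions, not BLPnet itself: the `cascade_detect` function, the `Box` and `Frame` types, and the two detector callables are hypothetical, and in a real system the detectors would be trained neural networks operating on image arrays.

```python
from typing import Callable, Optional

Box = tuple[int, int, int, int]   # (x, y, width, height) in pixels
Frame = list[list[int]]           # toy stand-in for an image: rows of pixel values


def crop(frame: Frame, box: Box) -> Frame:
    """Cut the region described by box out of the frame."""
    x, y, w, h = box
    return [row[x:x + w] for row in frame[y:y + h]]


def cascade_detect(
    frame: Frame,
    detect_vehicles: Callable[[Frame], list[Box]],
    detect_plate: Callable[[Frame], Optional[Box]],
) -> list[Box]:
    """Return plate boxes in full-frame coordinates.

    Stage 1 finds vehicles; stage 2 searches for a plate only inside each
    vehicle crop, so plate-like regions outside vehicles are never considered.
    """
    plates = []
    for vx, vy, vw, vh in detect_vehicles(frame):
        plate = detect_plate(crop(frame, (vx, vy, vw, vh)))
        if plate is not None:
            px, py, pw, ph = plate
            # Plate coordinates are relative to the crop; shift them back.
            plates.append((vx + px, vy + py, pw, ph))
    return plates
```

Passing the detectors in as callables keeps the cascade logic independent of any particular model, so each stage can be swapped or tested with simple stubs.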

Articles published on OCR Engine

Optical character recognition system using artificial intelligence

Efficient Vehicle Registration Recognition System: Enhancing Accuracy and Power Efficiency through Digital Image Processing

Neither Corpus Nor Edition: Building a Pipeline to Make Data Analysis Possible on Medieval Arabic Commentary Traditions

Build a Trained Data of Tesseract OCR engine for Tifinagh Script Recognition

Applicability of OCR Engines for Text Recognition in Vehicle Number Plates, Receipts and Handwriting

Design and Implementation for BIC Code Recognition System of Containers using OCR and CRAFT in Smart Logistics

Automatic Car Number Plate Detection using Morphological Image Processing

Are Searches in OCR-generated Archives Trustworthy?

Vehicle Number Plate Recognition Using Raspberry Pi

Adaptive dewarping of severely warped camera-captured document images based on document map generation.

Enhancing Optical Character Recognition on Images with Mixed Text Using Semantic Segmentation

Handwritten Documents Validation using Pattern Recognition and Transfer Learning

BLPnet: A new DNN model and Bengali OCR engine for Automatic Licence Plate Recognition

A Novel Memory and Time-Efficient ALPR System Based on YOLOv5.

Resonate: Website on Text to Speech

A Machine Learning and NLP Approach for Analyzing Eligibility Based on Resume and CV

DETECTION AND RECOGNITION OF HINDI TEXT FROM NATURAL SCENES AND ITS TRANSLITERATION TO ENGLISH

Traffic Violation Data Security System

In-depth analysis of the impact of OCR errors on named entity recognition and linking

Blpnet: A New Dnn Model and Bengali Ocr Engine for Automatic License Plate Recognition
