Deciphering complex text-based CAPTCHAs with deep learning

Asadullah Kehar

doi:10.17485/ijst/v13i13.126

Abstract

Background: CAPTCHA is a mechanism for distinguishing humans from bots. It has become a standard means of protecting resources on the World Wide Web from misuse. Different types of CAPTCHAs are in use, but text-based schemes are the most widely used because of their ease of use and robustness. The user is asked to type in the text shown in an image, which is intentionally distorted to defeat bots. Recognizing the text is easy for humans but very hard for computers.

Method/Findings: In this work, a text-based CAPTCHA scheme with background clutter and partially connected characters is decoded. The main steps are preprocessing, segmentation, and recognition. Several digital image processing techniques were applied during the preprocessing and segmentation steps, and a convolutional neural network (CNN) was used for recognition. Because a CNN requires a large amount of training data, the data was generated synthetically. A complex text-based CAPTCHA scheme with a varying number of letters (3, 4, and 5) is decoded with overall precision of 77.5%, 64.2%, and 51.9%, respectively.

Keywords: CAPTCHAs; HIPs; image processing; machine learning; CNN
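The abstract names preprocessing and segmentation steps but does not say which image processing techniques were used. As a rough illustration only, the sketch below assumes one common baseline: a fixed-threshold binarization followed by a vertical projection that splits the image wherever a column contains no ink. The function names, the threshold of 128, and the toy image are all assumptions made for this example and are not taken from the paper. Note that this baseline cannot separate partially connected characters, which is exactly the difficulty the paper targets.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale rows, 0 (black) to 255 (white)


def binarize(gray: Image, threshold: int = 128) -> Image:
    """Map dark pixels (ink) to 1 and light pixels (background) to 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def column_profile(binary: Image) -> List[int]:
    """Count ink pixels in each column (the vertical projection)."""
    return [sum(col) for col in zip(*binary)]


def segment_columns(profile: List[int], min_ink: int = 1,
                    min_width: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, that contain ink.

    Runs narrower than min_width are treated as clutter and dropped.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    for x, count in enumerate(profile):
        if count >= min_ink and start is None:
            start = x  # a run of inked columns begins
        elif count < min_ink and start is not None:
            if x - start >= min_width:
                segments.append((start, x))
            start = None  # the run has ended
    if start is not None and len(profile) - start >= min_width:
        segments.append((start, len(profile)))
    return segments


def crop(binary: Image, segment: Tuple[int, int]) -> Image:
    """Cut one candidate character out of the binary image."""
    start, end = segment
    return [row[start:end] for row in binary]


# Hypothetical 3x10 toy image: two "characters" plus one speck of clutter.
W, B = 255, 0
toy = [
    [W, B, B, W, W, W, B, B, B, W],
    [W, B, B, W, B, W, B, W, B, W],
    [W, B, B, W, W, W, B, B, B, W],
]

binary = binarize(toy)
profile = column_profile(binary)
segments = segment_columns(profile)
characters = [crop(binary, seg) for seg in segments]
print(segments)
```

In a full pipeline each cropped region would be resized and passed to the recognizer (a CNN in the paper). The one-pixel-wide speck in column 4 is discarded by the `min_width` filter, which stands in for the clutter removal that a real preprocessing stage would perform.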

Highlights

CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a computer-administered test meant to distinguish between a computer and a human.
Text-based CAPTCHAs are the most popular because they are easy for most users and provide better security.
In a text-based CAPTCHA, the user is asked to type in a noisy, distorted string of random characters.

Summary

Introduction

CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a computer-administered test meant to distinguish between a computer and a human. It has become a standard means of protecting resources on the World Wide Web from misuse. Bots may send junk e-mail, post unauthorized advertisements, and flood servers with heavy traffic; these abuses can degrade the performance of internet servers. Different types of CAPTCHAs are in use, but text-based schemes are the most widely used because of their ease of use and robustness. In a text-based CAPTCHA, the user is asked to type in a noisy, distorted string of random characters. The distortions are introduced into the text string intentionally to protect against bots. In this work, a text-based CAPTCHA scheme with background clutter and partially connected characters is decoded. Several digital image processing techniques were applied during the preprocessing and segmentation steps, and a convolutional neural network (CNN) was used for recognition. A complex text-based CAPTCHA scheme with a varying number of letters (3, 4, and 5) is decoded with overall precision of 77.5%, 64.2%, and 51.9%, respectively.
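The reported figures fall as the string gets longer, which is what one would expect if a CAPTCHA counts as solved only when every character is recognized. As a back-of-the-envelope reading, the sketch below assumes that each character is recognized independently with the same probability p, so that a string of n characters is solved with probability p^n. This independence model is an assumption made here for illustration and is not stated in the paper; segmentation failures, for instance, would violate it.

```python
def implied_per_char_accuracy(whole_string_accuracy: float, length: int) -> float:
    """Solve p**length == whole_string_accuracy for p.

    Assumes independent, equally likely per-character successes,
    which is a simplification introduced for this example.
    """
    if not 0.0 < whole_string_accuracy <= 1.0:
        raise ValueError("accuracy must be in (0, 1]")
    if length < 1:
        raise ValueError("length must be at least 1")
    return whole_string_accuracy ** (1.0 / length)


# Whole-string figures reported in the text, keyed by number of letters.
reported = {3: 0.775, 4: 0.642, 5: 0.519}

implied = {n: implied_per_char_accuracy(acc, n) for n, acc in reported.items()}

for n, p in implied.items():
    print(f"{n} letters: implied per-character accuracy of about {p:.3f}")
```

Under this simplified model the implied per-character rate is roughly 0.92, 0.90, and 0.88 for 3, 4, and 5 letters. The fact that it drifts downward rather than staying constant suggests that longer strings are harder per character as well, though the page gives no data to confirm the cause.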



Journal: Indian Journal of Science and Technology
Publication Date: Apr 2, 2020
Citations: 1
License type: CC BY
