A Study to Recognize Printed Gujarati Characters Using Tesseract OCR
@article{Audichya2017AST, title={A Study to Recognize Printed Gujarati Characters Using Tesseract OCR}, author={Milind Kumar Audichya}, journal={International Journal for Research in Applied Science and Engineering Technology}, year={2017}, pages={1505-1510} }
Optical Character Recognition (OCR) is a widely known technique for recognizing printed text using a computer with the help of various peripheral devices. OCR research is in progress for the scripts of many languages, while for many others it is still far off. Gujarati is one of the least studied scripts in OCR research compared to other scripts. A well-known open-source OCR engine called Tesseract, which is already used for the recognition of numerous scripts, can be used to…
4 Citations
Implementation of Words and Characters Segmentation of Gujarati Script Using MATLAB
- Computer Science
- 2019
A novel algorithm is proposed that takes input from a text file containing Gujarati script and segments it into words and characters; it is implemented in MATLAB and validated on 10 numbers written out in words.
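Ekstraksi Karakter Citra Menggunakan Optical Character Recognition Untuk Pencetakan Nomor Kendaraan Pada Struk Parkir [Image Character Extraction Using Optical Character Recognition for Printing Vehicle Numbers on Parking Receipts]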
- Computer Science
- 2020
The license plate number can be printed on the parking receipt by extracting its characters from the vehicle image, which is generally captured at the parking entrance gate, using Optical Character Recognition with the Tesseract library.
Deep Learning Approach for Spoken Digit Recognition in Gujarati Language
- Computer Science
- International Journal of Advanced Computer Science and Applications
- 2022
This paper uses a deep learning approach to recognize the ten spoken Gujarati digits, zero to nine, and achieves a maximum accuracy of 98.7%.
Shot Boundary Detection for Gujarati News Video
- Computer Science
- 2018
This paper presents an efficient video shot boundary detection method based on visual information, which uses histogram difference and rank to determine shot boundaries.
References
An OCR for separation and identification of mixed English — Gujarati digits using kNN classifier
- Computer Science
- 2013 International Conference on Intelligent Systems and Signal Processing (ISSP)
- 2013
An OCR system that separates and identifies mixed English-Gujarati digits, with an average accuracy of 99.26% for Gujarati digits, 99.20% for English digits, and 99.23% overall.
Classification of offline gujarati handwritten characters
- Computer Science
- 2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI)
- 2015
A new combination of structural and statistical methods (Freeman chain code, Hu's invariant moments, and center of mass) for extracting feature vectors yields good accuracy.
Zone identification in the printed Gujarati text
- Computer Science
- Eighth International Conference on Document Analysis and Recognition (ICDAR'05)
- 2005
A sophisticated method for accurate zone detection in images of printed Gujarati is proposed; the approach is expected to smooth the way for the design and development of Gujarati OCR systems for complete character sets.
Optical Character Recognition by Open source OCR Tool Tesseract: A Case Study
- Computer Science
- 2012
A comparative study of Tesseract against the commercial OCR tool Transym OCR, using vehicle number plates as input and comparing the two tools on various parameters.
Integrating Bangla script recognition support in tesseract OCR
- Computer Science
- 2009
This paper presents a complete methodology to integrate Bangla script recognition support in Tesseract, and shows how this support can be integrated into existing OCR engines.
Gujarati Handwritten Character Recognition Using Hybrid Method Based On Binary Tree-Classifier And K-Nearest Neighbour
- Computer Science
- 2013
A hybrid approach based on a tree classifier and k-Nearest Neighbor is used to recognize handwritten Gujarati characters; the achieved success rate of 63% is considered acceptable, as this is one of the few attempts to recognize the whole set of Gujarati handwritten characters.
Shirorekha Chopping Integrated Tesseract OCR Engine for Enhanced Hindi Language Recognition
- Computer Science
- 2012
This paper presents a complete methodology to improve Hindi language recognition accuracy, and compares the result with other available Devanagari OCR engines on the basis of recognition accuracy, processing time, font variations, and database size.
An Overview of the Tesseract OCR Engine
- Computer Science
- Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)
- 2007
The Tesseract OCR engine, as was the HP Research Prototype in the UNLV Fourth Annual Test of OCR Accuracy, is described in a comprehensive overview. Emphasis is placed on aspects that are novel or at…
How to train Tesseract 3.01 - Cédric Verstraeten
- 16 02 2017. [Online]. Available: https://blog.cedric.ws/how-to-train-tesseract-301.
- 2017
Gujarati language - Wikipedia