Error Detection in Indic OCRs

@inproceedings{Vinitha2016ErrorDI,
  title={Error Detection in Indic OCRs},
  author={V. S. Vinitha and C. V. Jawahar},
  booktitle={2016 12th IAPR Workshop on Document Analysis Systems (DAS)},
  year={2016},
  pages={180--185}
}
A good post-processing module is an indispensable part of an OCR pipeline. In this paper, we propose a novel method for error detection in Indian language OCR output. Our solution uses a recurrent neural network (RNN) to classify each word as an error or not. We propose a generic error detection method and demonstrate its effectiveness on four popular Indian languages. We divide the words into their constituent aksharas and use their bigram and trigram level information to build a…
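The akshara-level bigram and trigram step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated and does not say how aksharas are segmented or how the n-grams feed the RNN. The function names `split_aksharas` and `akshara_ngrams` are hypothetical, and the segmentation is a simplified Unicode heuristic (combining marks attach to the preceding unit, and a consonant following a virama joins the conjunct) that ignores special cases such as ZWJ/ZWNJ.

```python
import unicodedata


def is_virama(ch):
    # Assumption: Indic viramas are identified by their Unicode character
    # name, e.g. "DEVANAGARI SIGN VIRAMA" or "TELUGU SIGN VIRAMA".
    return unicodedata.name(ch, "").endswith("SIGN VIRAMA")


def split_aksharas(word):
    """Split a word into approximate aksharas (orthographic syllables)."""
    aksharas = []
    for ch in word:
        # Vowel signs, viramas, anusvara, etc. are combining marks (Mn/Mc).
        is_mark = unicodedata.category(ch) in ("Mn", "Mc")
        # A character right after a virama continues the same conjunct.
        joins_conjunct = bool(aksharas) and is_virama(aksharas[-1][-1])
        if aksharas and (is_mark or joins_conjunct):
            aksharas[-1] += ch
        else:
            aksharas.append(ch)
    return aksharas


def akshara_ngrams(word, n):
    """Return the list of akshara n-grams (as tuples) for a word."""
    units = split_aksharas(word)
    return [tuple(units[i:i + n]) for i in range(len(units) - n + 1)]


if __name__ == "__main__":
    word = "\u0935\u093f\u0926\u094d\u092f\u093e\u0932\u092f"  # Hindi "vidyalaya"
    print(split_aksharas(word))     # 4 aksharas: vi, dyaa, la, ya
    print(akshara_ngrams(word, 2))  # 3 bigrams
    print(akshara_ngrams(word, 3))  # 2 trigrams
```

In a setup like the one the abstract outlines, n-grams of this kind would be turned into features for a word-level classifier; how that is done in the paper is not recoverable from the text shown here.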

Citations

Publications citing this paper.

Error Detection and Corrections in Indic OCR Using LSTMs

  • 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)
  • 2017

A Framework for Document Specific Error Detection and Corrections in Indic OCR

  • 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)
  • 2017

An Empirical Study of Effectiveness of Post-Processing in Indic Scripts

  • 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)
  • 2017

References

Publications referenced by this paper.

Error Detection in Highly Inflectional Languages

  • 2013 12th International Conference on Document Analysis and Recognition
  • 2013

A shape based post processor for Gurmukhi OCR

  • Proceedings of Sixth International Conference on Document Analysis and Recognition
  • 2001

A Fast Alignment Scheme for Automatic OCR Evaluation of Books

  • 2011 International Conference on Document Analysis and Recognition
  • 2011

Experiences of integration and performance testing of multilingual OCR for printed Indian scripts

D. Arya, T. Patnaik, +5 authors G. S. Lehal
  • J-MOCR Workshop, ICDAR, 2011
  • 2011

Limits on the Application of Frequency-Based Language Models to OCR

  • 2011 International Conference on Document Analysis and Recognition
  • 2011

Recurrent neural networks for fuzzy data

  • Integrated Computer-Aided Engineering
  • 2011

A Novel Connectionist System for Unconstrained Handwriting Recognition

  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 2009
