A hierarchical, HMM-based automatic evaluation of OCR accuracy for a digital library of books

@inproceedings{Feng2006AHH,
  title={A hierarchical, HMM-based automatic evaluation of OCR accuracy for a digital library of books},
  author={Shaolei Feng and R. Manmatha},
  booktitle={Proceedings of the 6th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '06)},
  year={2006},
  pages={109--118}
}
A number of projects are creating searchable digital libraries of printed books. These include the Million Book Project, the Google Book project and similar efforts from Yahoo and Microsoft. Content-based online book retrieval usually requires first converting printed text into machine-readable (e.g., ASCII) text using an optical character recognition (OCR) engine and then running full-text search on the results. Many of these books are old and there are a variety of processing steps that are…
