An OCR System with OCRopus for Scientific Documents Containing Mathematical Formulas

Authors: Fumihiro Furukori, Shinpei Yamazaki, T. Miyagishi, Keiichiro Shirai, Masayuki Okamoto
Published in: 2013 12th International Conference on Document Analysis and Recognition
This paper describes the installation of a mathematical formula recognition module into an open source OCR system, OCRopus. In particular, we consider the identification of inline formulas utilizing existing modules. Text lines that include math formulas are first processed using an N-gram language model, which reduces the number of formula candidates by thresholding the conditional probability of words. The formula candidates are then classified into formulas and text by an SVM using geometric features…
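The candidate-reduction step in the abstract can be illustrated with a minimal sketch. It assumes a bigram model (N = 2), add-alpha smoothing, a toy training corpus, and a threshold of 0.15; none of these values come from the paper, and the function names are hypothetical. Words whose conditional probability given the preceding word falls below the threshold are kept as formula candidates; the later SVM stage is not shown.

```python
from collections import Counter


def train_bigram(sentences):
    """Count contexts and bigrams over tokenized sentences (toy training step)."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for tokens in sentences:
        padded = ["<s>"] + tokens           # "<s>" marks the start of a line
        contexts.update(padded[:-1])        # every token that has a successor
        bigrams.update(zip(padded, padded[1:]))
        vocab.update(tokens)
    return contexts, bigrams, vocab


def cond_prob(prev, word, contexts, bigrams, vocab_size, alpha=1.0):
    """P(word | prev) with add-alpha smoothing (an assumption, not from the paper)."""
    return (bigrams[(prev, word)] + alpha) / (contexts[prev] + alpha * vocab_size)


def formula_candidates(tokens, model, threshold):
    """Return the words whose conditional probability is below the threshold."""
    contexts, bigrams, vocab = model
    padded = ["<s>"] + tokens
    return [
        word
        for prev, word in zip(padded, padded[1:])
        if cond_prob(prev, word, contexts, bigrams, len(vocab)) < threshold
    ]


# Hypothetical toy corpus of ordinary text lines.
corpus = [
    ["the", "value", "of", "the", "function"],
    ["the", "function", "is", "smooth"],
]
model = train_bigram(corpus)

# "x2" never follows "value" in the corpus, so it is flagged as a candidate.
candidates = formula_candidates(["the", "value", "x2", "is"], model, threshold=0.15)
```

In a real system the threshold would be tuned on labeled lines, since a value that is too low misses formulas and one that is too high passes ordinary rare words on to the classifier.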
