- Xiaoyan Lin, Liangcai Gao, Xuan Hu, Zhi Tang, Yingnan Xiao, Xiaozhong Liu
- SIGIR
- 2014

The semantics of mathematical formulae depend on their spatial structure, and they usually exist in layout presentations such as PDF, LaTeX, and Presentation MathML, which challenges previous text…

- Xiaoyan Lin, Liangcai Gao, Zhi Tang, Xiaofan Lin, Xuan Hu
- 2011 International Conference on Document…
- 2011

Recognizing mathematical expressions in PDF documents is a new and important field in document analysis. It is quite different from extracting mathematical expressions in image-based documents. In…

- Liangcai Gao, Zhi Tang, Xiaoyan Lin, Yongtao Wang
- Proceedings of the 21st International Conference…
- 2012

The primary information units in a newspaper are the articles. Article reconstruction from newspapers, including article aggregation and reading order recovery, is known to be a quite challenging task…

- Xuan Hu, Liangcai Gao, Xiaoyan Lin, Zhi Tang, Xiaofan Lin, Josef B. Baker
- JCDL
- 2013

Mathematical formulae in structural formats such as MathML and LaTeX are becoming increasingly available. Moreover, repositories and websites, including ArXiv and Wikipedia, and growing numbers of…

- Liangcai Gao, Zhi Tang, Xiaofan Lin, Ying Liu, Ruiheng Qiu, Yongtao Wang
- JCDL
- 2011

Nowadays, PDF documents have become a dominant knowledge repository for both academia and industry, largely because they are very convenient to print and exchange. However, the methods of…

- Yuehan Wang, Liangcai Gao, Simeng Wang, Zhi Tang, Xiaozhong Liu, Ke Yuan
- JCDL
- 2015

Nowadays, mathematical information is increasingly available in websites and repositories, such as ArXiv, Wikipedia, and a growing number of digital libraries. Mathematical formulae are highly…

- Leipeng Hao, Liangcai Gao, Xiaohan Yi, Zhi Tang
- 2016 12th IAPR Workshop on Document Analysis…
- 2016

Because of the better performance of deep learning on many computer vision tasks, researchers in the area of document analysis and recognition have begun to adopt this technique in their work. In this…

- Xiaozhong Liu, Yingying Yu, Chun Guo, Yizhou Sun, Liangcai Gao
- IEEE/ACM Joint Conference on Digital Libraries
- 2014

Citation relationship between scientific publications has been successfully used for scholarly bibliometrics, information retrieval and data mining tasks, and citation-based recommendation algorithms…

- Xiaoyan Lin, Liangcai Gao, Zhi Tang, Josef B. Baker, Volker Sorge
- International Journal on Document Analysis and…
- 2013

An important initial step of mathematical formula recognition is to correctly identify the location of formulae within documents. Previous work in this area has traditionally focused on image-based…

- Xiaoyan Lin, Liangcai Gao, Zhi Tang, Xuan Hu, Xiaofan Lin
- DRR
- 2012

With the tremendous popularity of the PDF format, recognizing mathematical formulas in PDF documents has become a new and important problem in the document analysis field. In this paper, we present a method of…