Detecting figures and part labels in patents: competition-based development of graphics recognition algorithms

  • Christoph Riedl, Richard Zanibbi, Marti A. Hearst, Siyu Zhu, Michael Menietti, Jason Crusan, Ivan Metelsky, Karim R. Lakhani
    International Journal on Document Analysis and Recognition (IJDAR)
Most United States Patent and Trademark Office (USPTO) patent documents contain drawing pages which describe inventions graphically. By convention and by rule, these drawings contain figures and parts that are annotated with numbered labels but not with text. As a result, readers must scan the document to find the description of a given part label. To make progress toward automatic creation of ‘tool-tips’ and hyperlinks from part labels to their associated descriptions, the USPTO hosted a… 
Text Detection in Natural Scenes and Technical Diagrams with Convolutional Feature Learning and Cascaded Classification
A text detection system to analyze and utilize visual information in a data driven, automatic and intelligent way, including patch-based coarse-to-fine detection (Text-Conv), connected component extraction using region growing, and graph-based word segmentation (Word-Graph).
Diagram Image Retrieval and Analysis: Challenges and Opportunities
This paper investigates recent research on diagram image retrieval and analysis, with an emphasis on methods using content-based image retrieval (CBIR), textures, shapes, topology and geometry, and points out future research opportunities from technical and application perspectives.
Digital Art Feature Association Mining Based on the Machine Learning Algorithm
This study observes the heuristic algorithm of digital art feature association mining based on minimum confidence and carries out feature matching based on digital art features of data for the recommendation algorithm under different situation modes.
Machine learning approaches to facial and text analysis: Discovering CEO oral communication styles
We demonstrate how a novel synthesis of three methods — (1) unsupervised topic modeling of text data to generate new measures of textual variance, (2) sentiment analysis of text data, and (3)
Mining BIG Data: The Future of Exploration Targeting Using Machine Learning
  • Geology, Computer Science
  • 2017
A standard workflow for the effective application of supervised machine learning to exploration targeting is proposed, which requires high quality geoscientific data, solid interpretations, a good dose of common sense, and in most cases several iterations to understand what the software is predicting.
Learning from Mixed Signals in Online Innovation Communities
We study how contributors to innovation contests improve their performance through direct experience and by observing others as they synthesize learnable signals from different sources. Our researc...


Image search in patents: a review
  • Naeem Bhatti, A. Hanbury
  • Computer Science
    International Journal on Document Analysis and Recognition (IJDAR)
  • 2012
The importance, requirements, and challenges of a patent image retrieval system are introduced, and an overview is presented of the algorithms developed for the retrieval and analysis of CAD and technical drawings, diagrams, data flow diagrams, circuit diagrams, data charts, flowcharts, plots, and symbol recognition.
Text line extraction in graphical documents using background and foreground information
  • P. Roy, U. Pal, J. Lladós
  • Computer Science
    International Journal on Document Analysis and Recognition (IJDAR)
  • 2011
A novel method to segment text lines in graphical documents, based on the foreground and background information of the text components and using a water reservoir concept to effectively utilize the background information.
Evaluating structural pattern recognition for handwritten math via primitive label graphs
This work defines new metrics obtained by Hamming distances over label graphs, which allow classification, segmentation and parsing errors to be characterized separately, or using a single measure, at the primitive level.
Detection of dimension sets in engineering drawings
  • C. Lai, R. Kasturi
  • Computer Science
    Proceedings of 2nd International Conference on Document Analysis and Recognition (ICDAR '93)
  • 1993
A new rule-based text/graphics separation algorithm and a model-based procedure for detecting arrowheads in any orientation have been developed for detecting dimension sets in engineering drawings drawn to ANSI drafting standards.
Whole-Book Recognition
  • Pingping Xiu, H. Baird
  • Computer Science
    IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 2012
An algorithm which expects to be initialized with approximate iconic and linguistic models (derived from OCR results and dictionaries) and then, guided entirely by evidence internal to the test set, corrects the models which, in turn, yields higher recognition accuracy.
Patent Retrieval
This survey of work done on patent data in relation to Information Retrieval in the last 20–25 years covers the sources of difficulty and the existing document processing and retrieval methods of the domain, and provides a motivation for further research in the area.
DMOS, a generic document recognition method: application to table structure analysis in a general and in a specific way
  • Bertrand Coüasnon
  • Computer Science
    International Journal on Document Analysis and Recognition (IJDAR)
  • 2005
The Description and Modification of Segmentation (DMOS) method is proposed, which is made of a new grammatical language (Enhanced Position Formalism—EPF) and an associated parser able to deal with noise, which has been successfully used to produce recognition systems on musical scores, mathematical formulae and even tennis courts in videos.
Flowchart recognition for non-textual information retrieval in patent search
This paper presents a flowchart recognition method aimed at achieving a structured representation of flowchart images that can be further queried semantically, and reports the obtained results on this dataset.
A Survey of Methods and Strategies in Character Segmentation
Holistic approaches that avoid segmentation by recognizing entire character strings as units are described, including methods that partition the input image into subimages, which are then classified.
Machine printed text and handwriting identification in noisy document images
This paper addresses the problem of the identification of text in noisy document images by treating noise as a separate class and modeling noise based on selected features.