Ilya Waldstein

Many historical newspapers are being digitized. We aim to support access to them via text analysis of the OCR'd content. However, the OCR output contains many errors, so extracting meaningful content from it is difficult. We propose a pipeline of processing steps. Here, we describe the first two steps: segmentation and genre identification. The segmentation ...
This paper describes how case-based reasoning (CBR) can be used to compare, reuse, and adapt inductive models that represent complex systems. Complex systems are not well understood and therefore require models for their manipulation and understanding. We propose an approach to address the challenges of using CBR in this context, which relate to finding similar inductive models ...
