Automated publishing systems require large databases of document page layout templates, most of which are created manually. A lower-cost alternative is to extract page layouts from existing documents. To extract the layout from a scanned document image, it is necessary to perform Optical Font Recognition (OFR) …
Managing large document databases has become an important task. The ability to automatically compare document layouts, and to classify and search documents by their visual appearance, is desirable in many applications. We propose a new algorithm that approximates a metric function between documents based on their visual similarity. The …
We present a method for the automated composition of personalized newspapers. Traditional newsprint composition is a laborious and expensive manual process. We develop a two-level hierarchical page layout model that captures aesthetic design choices using local (within article region) and global (page level) prior probability distributions. Given content to …
Document database management; document visual similarity; similarity pyramid; Isomap
Managing large document databases has become an important task. Sorting documents with respect to their visual similarity and layout features, and visualization of the whole document database is a desirable application. A user may wish to search for documents in a database …