Iuliu Vasile Konya

An ever-growing amount of digitized content urges libraries and archives to integrate new media types from a large number of origins, such as publishers, record labels and film archives, into their existing collections. This is a challenging task, since the multimedia content itself as well as the associated metadata is inherently heterogeneous: the different…
Print media collections of considerable size are held by cultural heritage organizations and will soon be subject to digitization activities. However, technical content quality management in digitization workflows strongly relies on human monitoring. This heavy human intervention is cost-intensive and time-consuming, which makes automation mandatory. In…
Reliable and generic methods for skew detection are a necessity for any large-scale digitization project. As one of the first processing steps, skew detection and correction has a heavy influence on all further document analysis modules, such as geometric and logical layout analysis. This paper introduces a generic, scale-independent algorithm capable of…
This work introduces a practical method for performing logical layout analysis on heterogeneous periodical collections. The described module is incorporated into the Fraunhofer document image understanding system and has been successfully used as part of mass digitization projects on more than 500 000 scanned pages. Our primary targets are documents with…
Scanned document images are becoming available in increasingly high resolutions. Meanwhile, the variations in image quality within typical document collections increase because images come from different scan service providers, time periods or digitization methods. Binarization is a crucial first step for many document analysis algorithms. …
Document deskewing is a crucial pre-processing step in any document analysis system. In mass digitization projects, the ability to automatically assess the success of this step enables significant reductions in the amount of work performed by human operators. The current paper extends our generic skew and orientation detection algorithm with the ability to…