Historical Document Image Segmentation with LDA-Initialized Deep Neural Networks

@article{Alberti2017HistoricalDI,
  title={Historical Document Image Segmentation with LDA-Initialized Deep Neural Networks},
  author={Michele Alberti and Mathias Seuret and Vinaychandran Pondenkandath and R. Ingold and M. Liwicki},
  journal={Proceedings of the 4th International Workshop on Historical Document Imaging and Processing},
  year={2017}
}
In this paper, we present a novel approach to layer-wise weight initialization of deep neural networks using Linear Discriminant Analysis (LDA). Typically, the weights of a deep neural network are initialized with random values, with greedy layer-wise pre-training (usually as a Deep Belief Network or as an auto-encoder), or by re-using the layers from another network (transfer learning). Hence, many training epochs are needed before meaningful weights are learned, or a rather similar dataset is…
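The idea of initializing a layer from LDA can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `lda_init`, the ridge term `reg`, and the toy data are assumptions made for illustration only. The sketch computes the leading Fisher discriminant directions of labelled inputs and returns them as the weight matrix and bias of one fully connected layer.

```python
import numpy as np

def lda_init(X, y, n_units, reg=1e-6):
    """Hypothetical helper: derive (W, b) for one dense layer from LDA.

    X: (n_samples, n_features) inputs, y: (n_samples,) integer labels.
    Returns W with shape (n_units, n_features) and b with shape (n_units,).
    """
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    Sw += reg * np.eye(d)  # assumption: small ridge so Sw is positive definite

    # Solve Sb v = lambda Sw v by whitening with the Cholesky factor of Sw.
    L = np.linalg.cholesky(Sw)            # Sw = L L^T
    A = np.linalg.solve(L, Sb)            # L^{-1} Sb
    M = np.linalg.solve(L, A.T)           # L^{-1} Sb L^{-T}
    M = (M + M.T) / 2.0                   # symmetrize against round-off
    evals, U = np.linalg.eigh(M)          # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:n_units]
    V = np.linalg.solve(L.T, U[:, top])   # map back: v = L^{-T} u

    W = V.T
    b = -W @ mean                         # center projections on the data mean
    return W, b

# Toy data (an assumption): two classes separated along the first feature.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([-3.0, 0.0], 1.0, size=(200, 2)),
               rng.normal([3.0, 0.0], 1.0, size=(200, 2))])
y = np.repeat([0, 1], 200)

W, b = lda_init(X, y, n_units=1)
z = X @ W.T + b  # pre-activations of the initialized layer
```

With C classes, at most C - 1 discriminant directions carry class information, so a sketch like this can only fill that many units of a layer meaningfully; how the remaining units are set is left open here.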
A Comprehensive Study of ImageNet Pre-Training for Historical Document Image Analysis
TLDR
A comprehensive empirical survey on the effect of ImageNet pre-training for diverse historical document analysis tasks, including character recognition, style classification, manuscript dating, semantic segmentation, and content-based retrieval, finds a clear trend across different network architectures that ImageNet pre-training has a positive effect on classification as well as content-based retrieval.
A Pitfall of Unsupervised Pre-Training
TLDR
It is proved that even if a Stacked Convolutional Auto-Encoder is good at reconstructing pictures, it is not necessarily good at discriminating their classes, because it is biased by the decoder quality.
Labeling, Cutting, Grouping: An Efficient Text Line Segmentation Method for Medieval Manuscripts
TLDR
This work proposes a novel method which uses semantic segmentation at pixel level as an intermediate task, followed by a text-line extraction step, and demonstrates that semantic pixel segmentation can be used as a strong denoising pre-processing step before performing text line extraction.
Trainable Spectrally Initializable Matrix Transformations in Convolutional Neural Networks
TLDR
This work introduces a new architectural component for neural networks (NNs), i.e., trainable and spectrally initializable matrix transformations on feature maps, implemented as auto-differentiable PyTorch modules that can be incorporated into any neural network architecture.
Study of using hybrid deep neural networks in character extraction from images containing text
Study of ancient inscriptions is important and vital in reconstructing history. The scripts used in these inscriptions may belong to different eras and are classified based on the dynasty that ruled…
Document Layout Analysis
TLDR
This survey paper presents a critical study of different document layout analysis techniques and discusses comprehensively the different phases of the DLA algorithms based on a general framework that is formed as an outcome of reviewing the research in the field.
DeepDIVA: A Highly-Functional Python Framework for Reproducible Experiments
TLDR
DeepDIVA is introduced: an infrastructure designed to enable quick and intuitive setup of reproducible experiments with a large range of useful analysis functionality and case studies in the area of handwritten document analysis where researchers benefit from the integrated functionality.
HSMA_WOA: A hybrid novel Slime mould algorithm with whale optimization algorithm for tackling the image segmentation problem of chest X-ray images
TLDR
A new hybrid approach based on the thresholding technique to overcome the image segmentation problem (ISP) for COVID-19 chest X-ray images by integrating a novel meta-heuristic algorithm known as the slime mould algorithm (SMA) with the whale optimization algorithm to maximize Kapur’s entropy.
A Hybrid COVID-19 Detection Model Using an Improved Marine Predators Algorithm and a Ranking-Based Diversity Reduction Strategy
TLDR
A hybrid COVID-19 detection model based on an improved marine predators algorithm (IMPA) for X-Ray image segmentation is proposed and the ranking-based diversity reduction (RDR) strategy is used to enhance the performance of the IMPA to reach better solutions in fewer iterations.

References

PCA-Initialized Deep Neural Networks Applied to Document Image Analysis
TLDR
This paper describes how to turn a PCA into an auto-encoder, by generating an encoder layer from the PCA parameters and furthermore adding a decoding layer, and investigates the effectiveness of PCA-based initialization for the task of layout analysis.
Data-dependent Initializations of Convolutional Neural Networks
TLDR
This work presents a fast and simple data-dependent initialization procedure that sets the weights of a network such that all units in the network train at roughly the same rate, avoiding vanishing or exploding gradients.
Deep Linear Discriminant Analysis
TLDR
Deep Linear Discriminant Analysis is introduced, which learns linearly separable latent representations in an end-to-end fashion, produces competitive results on MNIST and CIFAR-10, and outperforms a network trained with categorical cross entropy on a supervised setting of STL-10.
A Fast Learning Algorithm for Deep Belief Nets
TLDR
A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
Deep Learning
TLDR
Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
What You Expect is NOT What You Get! Questioning Reconstruction/Classification Correlation of Stacked Convolutional Auto-Encoder Features
TLDR
It is concluded that reconstruction score and training error should not be used jointly to evaluate the quality of the features produced by a Stacked Convolutional Auto-Encoder for a classification task.
DIVA-HisDB: A Precisely Annotated Large Dataset of Challenging Medieval Manuscripts
TLDR
A publicly available historical manuscript database DIVA-HisDB is introduced for the evaluation of several Document Image Analysis (DIA) tasks and a layout analysis ground-truth which has been iterated on, reviewed, and refined by an expert in medieval studies is provided.
Deep learning in neural networks: An overview
TLDR
This historical survey compactly summarizes relevant work, much of it from the previous millennium, and reviews deep supervised learning, unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.
The optimised internal representation of multilayer classifier networks performs nonlinear discriminant analysis
This paper illustrates why a nonlinear adaptive feed-forward layered network with linear output units can perform well as a pattern classification device. The central result is that…
Determining Optimum Structure for Artificial Neural Networks
TLDR
Investigations of the relationship between the network structure and the accuracy of the classification are reported here, using a MATLAB tool-kit to take advantage of scientific visualisation.