We are not contortionists: Coupled adaptive learning for head and body orientation estimation in surveillance video
In this paper we focus on robust, real-time human head pose estimation in low-resolution RGB data without any smoothing motion priors such as direction of motion. Our main contributions lie in three major areas. First, we show that a generative Deep Belief Network model can be learned on human head data drawn from multiple types of data sources. These sources share similar underlying data but are not necessarily labelled, nor do they necessarily have the same kind of ground truth. Second, we perform discriminative training using multiple disparate supervisory labels to fine-tune the model for head pose estimation. Third, we present state-of-the-art results on two publicly available datasets using this new approach. Our implementation computes the head pose for a head image in 0.8 milliseconds, making it real-time and highly scalable.