In this paper, we propose a robust image alignment method based on heterogeneous features that uses points, lines, and regions in a unified framework. The image motion is decomposed into progressively more complex components, i.e., translation, similarity, affine, and projective motion models, and each stage is aligned using deliberately selected feature types and their associated descriptors. A large convergence range is obtained by gradually constraining the search range of features at each stage. Notably, point and line features are used jointly and formulated in a RANSAC (Random Sample Consensus) framework for robust estimation of a homography between low-textured images. The alignment is further refined with a region-based direct method. Experiments demonstrate that our approach yields better alignment results than both the gradient-based direct method and the traditional point-feature-based alignment method.
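To illustrate how point and line correspondences can constrain a single homography, the following is a minimal sketch of a joint direct linear transform (DLT) solver, the kind of model-fitting step a RANSAC loop could call on each sampled subset. It is not the paper's implementation. The function names (`skew`, `homography_from_points_and_lines`) and the test homography are hypothetical, and sampling, inlier scoring, and coordinate normalization are omitted. It relies on the standard relations that a point maps as x' ~ Hx and a line as l ~ Hᵀl'.

```python
import numpy as np


def skew(v):
    """Return the 3x3 matrix [v]_x such that skew(v) @ w == np.cross(v, w)."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])


def homography_from_points_and_lines(pts1, pts2, lines1, lines2):
    """Estimate H (with x2 ~ H @ x1) from homogeneous points and lines.

    pts1, pts2:     (N, 3) homogeneous points, pts2[i] ~ H @ pts1[i]
    lines1, lines2: (M, 3) homogeneous lines,  lines1[j] ~ H.T @ lines2[j]
    Each correspondence gives two independent linear constraints on H,
    so at least four correspondences in general position are required.
    """
    rows = []
    for x1, x2 in zip(pts1, pts2):
        # x2 x (H x1) = 0, where H x1 = kron(I, x1^T) @ h and h is H in row-major order
        rows.append(skew(x2) @ np.kron(np.eye(3), x1[None, :]))
    for l1, l2 in zip(lines1, lines2):
        # l1 x (H^T l2) = 0, where H^T l2 = kron(l2^T, I) @ h
        rows.append(skew(l1) @ np.kron(l2[None, :], np.eye(3)))
    A = np.vstack(rows)
    # The least-squares solution is the right singular vector
    # associated with the smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / np.linalg.norm(H)


if __name__ == "__main__":
    # Synthetic example with an arbitrary (hypothetical) homography.
    H_true = np.array([[1.0, 0.1, 0.3],
                       [-0.05, 0.9, -0.2],
                       [0.02, -0.01, 1.0]])
    pts1 = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
    lines1 = np.array([[1.0, 1.0, -3.0], [1.0, -2.0, 0.5]])
    pts2 = pts1 @ H_true.T                     # x2 = H x1
    lines2 = lines1 @ np.linalg.inv(H_true)    # l2 = H^{-T} l1
    H_est = homography_from_points_and_lines(pts1, pts2, lines1, lines2)
    print(H_est / H_est[2, 2])
```

Because both feature types contribute rows to the same linear system, any mix of four or more correspondences can be used, with one caveat: the combination of exactly two points and two lines is a known degenerate configuration. In practice, normalizing the coordinates before building the system would improve numerical conditioning.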