Micro aerial vehicles (MAVs) can serve as efficient image acquisition tools. However, aerial video sequences contain a large amount of redundant information, which slows the scene reconstruction process and reduces the accuracy of the recovered structure. Meanwhile, conventional frame decimation approaches typically rely on a time-consuming feature matching step. This paper proposes a hierarchical frame decimation approach for video-based structure from motion with aerial sequential frames. In the preliminary decimation stage, both the MAV's parameters and image quality are taken into account, which sharply reduces the number of frames. In the detailed decimation stage, both degenerate cases and baseline width are considered. The proposed approach is tested on several aerial sequential image datasets. The results show that it improves the efficiency and accuracy of three-dimensional outdoor scene reconstruction from aerial sequential frames.