In this paper, we propose an online scene reconstruction algorithm for a monocular camera. Reconstructing the physical scene, rather than resorting to sparse 3D points, offers many advantages for modeling and visualizing an environment. The goal of the algorithm is to simultaneously track the camera position and map the 3D environment, which is close in spirit to visual SLAM. Many visual SLAM algorithms in the current literature achieve high accuracy, but many of them rely on stereo cameras. Accomplishing this task with a monocular camera poses considerably more challenges; however, the cheaper and more easily deployable hardware makes the monocular approach more attractive. Specifically, we apply a maximum a posteriori (MAP) Bayesian approach, solved with an optimization technique, to simultaneously track the camera and build a dense point cloud. We also propose a feature expansion method that increases the density of points, and we then reconstruct the scene online with a delayed approach. Furthermore, we use the reconstructed model to perform visual localization without extracting features. Finally, we conduct a number of experiments to validate the proposed approach and observe promising performance.