In this paper, we present a view synthesis method named <i>Visto</i>, which generates seamless novel views from a single monocular input view. We formulate the problem as a joint optimization of inter-view texture similarity and geometry preservation, which differs significantly from traditional view synthesis frameworks. In this way, the image characteristics of the virtual view are inherited directly from the reference view without introducing any image prior or texture-modeling technique. The energy function is minimized with a Gauss-Seidel-like approach, and the quality of the virtual view is refined iteratively. The proposed approach also tolerates small errors in the depth map. Furthermore, the algorithm is parallel-friendly. Simulation results show that the proposed method outperforms several existing state-of-the-art monocular view synthesis systems.
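The Gauss-Seidel-style iterative refinement mentioned above can be illustrated with a minimal generic sketch. This is not the paper's actual energy function; it minimizes a simple quadratic data-plus-smoothness energy, where each pixel is updated in place using the most recent neighbor values, as in classical Gauss-Seidel sweeps:

```python
import numpy as np

def gauss_seidel_refine(data, lam=1.0, iters=50):
    """Gauss-Seidel minimization of an illustrative quadratic energy:
    E(x) = sum_i (x_i - d_i)^2 + lam * sum_{i~j} (x_i - x_j)^2.
    Pixels are updated in place, so improvements from already-updated
    neighbors propagate within a single sweep (Gauss-Seidel behavior).
    """
    x = data.astype(float).copy()
    h, w = x.shape
    for _ in range(iters):
        for i in range(1, h - 1):       # interior pixels only; borders fixed
            for j in range(1, w - 1):
                nbr = x[i-1, j] + x[i+1, j] + x[i, j-1] + x[i, j+1]
                # Closed-form minimizer of E w.r.t. x[i, j], neighbors held fixed
                x[i, j] = (data[i, j] + lam * nbr) / (1.0 + 4.0 * lam)
    return x
```

Because each pixel update touches only its four neighbors, a red-black (checkerboard) ordering of the same updates would make the sweep parallel-friendly, consistent with the parallelism claim in the abstract.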