Assessing the quality of experience (QoE) of three-dimensional (3-D) video is challenging. In this paper, we propose a new full-reference stereoscopic image quality metric that models monocular and binocular visual perception by simulating the properties of simple and complex receptive fields. Specifically, each stereoscopic image is first divided into noncorresponding and corresponding regions. Monocular energy responses are then generated for the noncorresponding regions from stimuli at different spatial frequencies and orientations, and binocular energy responses are generated for the corresponding regions from stimuli at different spatial frequencies, orientations, and disparities. Finally, gradient similarities between the energy responses of the original and distorted stereoscopic images are measured separately for the noncorresponding and corresponding regions, and the results are fused into an overall quality score. Experiments on three 3-D image quality assessment (IQA) databases demonstrate that, compared with the most closely related existing methods, the proposed metric achieves higher agreement with subjective assessment, making it better suited for the evaluation and optimization of stereoscopic image processing algorithms.
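To make the pipeline concrete, the following is a minimal illustrative sketch, not the formulation used in this paper. It assumes the textbook energy model: a quadrature pair of Gabor filters stands in for simple cells, and the sum of their squared outputs stands in for a complex cell. Binocular energy is assumed to follow a position-shift model with an integer horizontal disparity, and gradient similarity is assumed to take the common ratio form with a stabilizing constant. All function names, kernel sizes, and constants (`sigma`, `size`, `c`) are hypothetical choices made for illustration.

```python
import numpy as np

def gabor_pair(size, freq, theta, sigma):
    """Even (cosine) and odd (sine) Gabor kernels forming a quadrature pair."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)  # axis along orientation theta
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    even = envelope * np.cos(2.0 * np.pi * freq * xr)
    odd = envelope * np.sin(2.0 * np.pi * freq * xr)
    even -= even.mean()  # remove the DC component of the even kernel
    return even, odd

def filter_fft(img, kernel):
    """Circular 2-D convolution via FFT (kernel zero-padded to the image size)."""
    spectrum = np.fft.fft2(img) * np.fft.fft2(kernel, s=img.shape)
    return np.real(np.fft.ifft2(spectrum))

def monocular_energy(img, freq, theta, sigma=3.0, size=15):
    """Assumed complex-cell energy: squared even plus squared odd response."""
    even, odd = gabor_pair(size, freq, theta, sigma)
    return filter_fft(img, even) ** 2 + filter_fft(img, odd) ** 2

def binocular_energy(left, right, freq, theta, disparity, sigma=3.0, size=15):
    """Assumed position-shift model: sum left and shifted-right simple-cell
    responses before squaring. `disparity` is an integer pixel shift."""
    even, odd = gabor_pair(size, freq, theta, sigma)
    shifted = np.roll(right, disparity, axis=1)  # horizontal shift, wraps around
    e = filter_fft(left, even) + filter_fft(shifted, even)
    o = filter_fft(left, odd) + filter_fft(shifted, odd)
    return e ** 2 + o ** 2

def gradient_similarity(ref_map, dist_map, c=1e-3):
    """Mean of (2*g1*g2 + c) / (g1^2 + g2^2 + c) over gradient magnitudes."""
    gy1, gx1 = np.gradient(ref_map)
    gy2, gx2 = np.gradient(dist_map)
    g1, g2 = np.hypot(gx1, gy1), np.hypot(gx2, gy2)
    return float(np.mean((2.0 * g1 * g2 + c) / (g1 ** 2 + g2 ** 2 + c)))
```

In this sketch, a full metric would evaluate the energy maps over a bank of frequencies, orientations, and (for corresponding regions) disparities, compute `gradient_similarity` between the reference and distorted maps for each, and pool the results. The region classification and the fusion rule are not specified here and are left out rather than guessed.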