Query-by-example video retrieval has received increasing attention in recent years. One state-of-the-art approach is the Bag-of-Visual-Words (BoW) technique, in which images are described by a set of local features mapped to a discrete set of visual words. Such techniques, however, ignore the spatial relations between visual words. In this paper, we present a content-based video retrieval technique based on selected Words-of-Interest (WoI) that exploits a spatial proximity constraint on visual words identified from the query. Experiments carried out on a public video database show promising results: our approach outperforms the classical BoW approach.