Ramakrishna Vedantham

With recent advances in mobile computing, demand for visual localization and landmark identification on mobile devices is growing. We advance the state of the art in this area by fusing two popular representations of street-level image data (facade-aligned and viewpoint-aligned) and show that they contain complementary information that can be…
Mobile phones have evolved into powerful image and video processing devices equipped with high-resolution cameras, color displays, and hardware-accelerated graphics. They are also increasingly equipped with a global positioning system and connected to broadband wireless networks. All this enables a new class of applications that use the camera phone to…
We survey popular data sets used in the computer vision literature and point out their limitations for mobile visual search applications. To overcome many of these limitations, we propose the Stanford Mobile Visual Search data set. The data set contains camera-phone images of products, CDs, books, outdoor landmarks, business cards, text documents, museum…
Many mobile visual search (MVS) systems transmit query data from a mobile device to a remote server and search a database hosted on the server. In this paper, we present a new architecture for searching a large database directly on a mobile device, which can provide numerous benefits for network-independent, low-latency, and privacy-protected image…
Most mobile visual search (MVS) systems query a large database stored on a server. This paper presents a new architecture for searching a large database directly on a mobile device, which has numerous benefits for network-independent, low-latency, and privacy-protected image retrieval. A key challenge for on-device MVS is storing a memory-intensive database…
Mobile phones are an attractive platform for landmark-based pedestrian navigation systems. To be practical, such a system must be able to automatically generate lightweight directions that can be displayed on these mobile devices. We present a system that leverages an online collection of geotagged photographs to automatically generate navigational…
To perform fast image matching against large databases, a Vocabulary Tree (VT) uses an inverted index that maps each tree node to the database images that have visited that node. The inverted index can require gigabytes of memory, which significantly slows down the database server. In this paper, we design, develop, and compare techniques for inverted…
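The node-to-image mapping described in this abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the class name `InvertedIndex`, the integer node ids, the image ids, and the raw visit-count scoring are all assumptions made for illustration, and the memory-reduction techniques the abstract alludes to are not shown.

```python
from collections import defaultdict


class InvertedIndex:
    """Hypothetical inverted index: tree node id -> images that visited it."""

    def __init__(self):
        # postings[node_id] = {image_id: number of features that reached node}
        self.postings = defaultdict(dict)

    def add_image(self, image_id, node_ids):
        """Record every tree node visited by the features of one image."""
        for node in node_ids:
            counts = self.postings[node]
            counts[image_id] = counts.get(image_id, 0) + 1

    def query(self, node_ids):
        """Score database images by summed visit counts at the query's nodes.

        Raw counts are an illustrative assumption; real systems typically
        use weighted scores instead.
        """
        scores = defaultdict(int)
        for node in node_ids:
            for image_id, count in self.postings.get(node, {}).items():
                scores[image_id] += count
        # Highest score first; ties broken by image id for determinism.
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


index = InvertedIndex()
index.add_image("img_a", [3, 7, 7, 9])
index.add_image("img_b", [7, 12])
ranked = index.query([7, 9])
```

Only the postings of nodes that the query actually visits are read, which is why lookup is fast. The cost is one stored entry per (node, image) pair, which is how the index can grow to the sizes the abstract mentions.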
Continuous recognition and tracking of objects in live video captured on a mobile device enables real-time user interaction. We demonstrate a streaming mobile augmented reality system with 1-second latency. User interest is automatically inferred from camera movements, so the user never has to press a button. Our system is used to identify and track book…
We present a fast and efficient geometric re-ranking method that can be incorporated into a feature-based image retrieval system that uses a Vocabulary Tree (VT). We form feature pairs by comparing descriptor classification paths in the VT and calculate a geometric similarity score for these pairs. We propose a location geometric similarity scoring…
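The pairing step can be sketched as follows. Because the abstract is cut off before the scoring method is described, the score below is a stand-in and not the authors' method: it counts how many pairwise log distance ratios agree with the median ratio. The function names, the `(path, (x, y))` feature layout, and the tolerance `tol` are all hypothetical.

```python
import math
from itertools import combinations


def pair_by_path(query_feats, db_feats):
    """Pair features whose vocabulary-tree classification paths are identical.

    Each feature is (path, (x, y)); path is a tuple of node ids (assumed layout).
    """
    by_path = {}
    for path, xy in db_feats:
        by_path.setdefault(path, []).append(xy)
    pairs = []
    for path, q_xy in query_feats:
        for d_xy in by_path.get(path, []):
            pairs.append((q_xy, d_xy))
    return pairs


def geometric_score(pairs, tol=0.2):
    """Stand-in score: number of log distance ratios close to the median ratio."""
    ratios = []
    for (q1, d1), (q2, d2) in combinations(pairs, 2):
        dq = math.dist(q1, q2)  # distance between two query features
        dd = math.dist(d1, d2)  # distance between their database matches
        if dq > 0 and dd > 0:
            ratios.append(math.log(dq / dd))
    if not ratios:
        return 0
    ratios.sort()
    median = ratios[len(ratios) // 2]
    return sum(1 for r in ratios if abs(r - median) <= tol)


# The database image is the query scaled by 2, plus one unmatched feature.
query = [((1, 4), (0.0, 0.0)), ((1, 5), (1.0, 0.0)), ((2, 6), (0.0, 1.0))]
database = [((1, 4), (0.0, 0.0)), ((1, 5), (2.0, 0.0)),
            ((2, 6), (0.0, 2.0)), ((3, 8), (5.0, 5.0))]
pairs = pair_by_path(query, database)
score = geometric_score(pairs)
```

Distance ratios are unchanged by rotation and translation, and comparing each ratio with the median tolerates a uniform scale change, so a geometrically consistent database image scores higher than one whose matches are scattered.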
Computer vision techniques can enhance landmark-based navigation by better utilizing online photo collections. We use spatial reasoning to compute camera poses, which are then registered to the world using GPS information extracted from the image tags. The computed camera pose is used to augment the images with navigational arrows that fit the environment. We…