Prithwijit Guha

In this work, we tackle the problem of skeletal tracking of a human body using the Microsoft Kinect sensor. We use cues from the sensor's RGB and depth streams to fit a stick skeleton model to the human upper body. A variety of computer vision techniques are used in a bottom-up approach to estimate the candidate head and upper body...
Tracking multiple agents in a monocular visual surveillance system is often hampered by occlusions. Agents entering the field of view can undergo two forms of occlusion: those caused by crowding and those caused by background objects at finite distances from the camera. The agents are primarily detected as foreground...
Occlusions are a central phenomenon in multi-object computer vision. However, the formal analyses (LOS14, ROC20) proposed in the spatial reasoning literature ignore many distinctions crucial to computer vision; as a result, these algebras have been largely ignored in vision applications. Two distinctions of relevance to visual computation are (a) whether...
Commercial detection in news broadcast videos involves judicious selection of meaningful audio-visual feature combinations and efficient classifiers. This problem becomes much simpler if these combinations can be learned from the data. To this end, we propose a Multiple Kernel Learning based method for boosting successful kernel functions while...
In this paper, we describe a cost-effective multiple-camera vision system built from simple, low-cost FireWire web cameras. FireWire cameras, like other FireWire devices, operate on the high-speed FireWire bus. The currently supported bandwidth is 400 Mbps. Right from its introduction, the FireWire (also known as IEEE 1394) bus interface specification has...
Background subtraction is an essential task in several static-camera-based computer vision systems. Background modeling is often challenged by spatio-temporal changes caused by local motion and/or variations in illumination conditions. The background model is learned from an image sequence in a number of stages, viz. preprocessing, pixel/region...
We propose occlusion primitives to define a set of time-varying predicates on trackers for heterogeneous objects moving in unknown environments. Input pixel information is processed at interdependent levels to generate abstract categories for agents and actions. The scene background is learned online at the lowest layer, using feedback from the tracking...