Coarse-to-Fine Semantic Video Segmentation Using Supervoxel Trees


We propose an exact, general, and efficient coarse-to-fine energy minimization strategy for semantic video segmentation. Our strategy is based on a hierarchical abstraction of the supervoxel graph that allows us to minimize an energy defined at the finest level of the hierarchy by minimizing a series of simpler energies defined over coarser graphs. The strategy is exact, i.e., it produces the same solution as minimizing over the finest graph. It is general, i.e., it can be used to minimize any energy function (e.g., one with unary, pairwise, and higher-order terms) with any existing energy minimization algorithm (e.g., graph cuts or belief propagation). It also gives significant speedups in inference on several datasets with varying degrees of spatio-temporal continuity. We also discuss the strengths and weaknesses of our strategy relative to existing hierarchical approaches, and the kinds of image and video data that provide the best speedups.
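To make the idea of "simpler energies defined over coarser graphs" concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a toy energy with unary costs and a Potts pairwise penalty, a hand-picked two-level grouping (`parent`), and brute-force minimization; the names `energy`, `coarsen`, `expand`, and `brute_force_min` are hypothetical. It shows only that forcing all supervoxels in a group to share a label yields a smaller energy of the same form whose value agrees with the fine energy of the expanded labeling. The refinement step that the abstract says makes the strategy exact is not reproduced here.

```python
from itertools import product


def energy(labels, unary, edges, weight):
    """Unary costs plus a Potts penalty for every edge whose endpoints disagree."""
    unary_cost = sum(unary[i][l] for i, l in enumerate(labels))
    pairwise_cost = sum(weight for i, j in edges if labels[i] != labels[j])
    return unary_cost + pairwise_cost


def coarsen(unary, edges, parent):
    """Build a coarse energy by forcing all children of a coarse node to share a label.

    Coarse unary costs are sums of the children's unary costs. Fine edges inside
    a group vanish; edges between groups are kept with multiplicity.
    """
    n_coarse = max(parent) + 1
    n_labels = len(unary[0])
    coarse_unary = [[0] * n_labels for _ in range(n_coarse)]
    for i, costs in enumerate(unary):
        for l, c in enumerate(costs):
            coarse_unary[parent[i]][l] += c
    coarse_edges = [(parent[i], parent[j]) for i, j in edges if parent[i] != parent[j]]
    return coarse_unary, coarse_edges


def expand(coarse_labels, parent):
    """Copy each coarse node's label down to its children."""
    return tuple(coarse_labels[p] for p in parent)


def brute_force_min(n_nodes, n_labels, unary, edges, weight):
    """Exhaustive minimizer, standing in for graph cuts or belief propagation."""
    return min(
        product(range(n_labels), repeat=n_nodes),
        key=lambda lab: energy(lab, unary, edges, weight),
    )


# Toy data (assumed): four supervoxels in a chain, two labels, two coarse groups.
unary = [[0, 4], [1, 2], [4, 0], [2, 1]]
edges = [(0, 1), (1, 2), (2, 3)]
parent = [0, 0, 1, 1]
weight = 1

coarse_unary, coarse_edges = coarsen(unary, edges, parent)
coarse_best = brute_force_min(2, 2, coarse_unary, coarse_edges, weight)
fine_best = brute_force_min(4, 2, unary, edges, weight)

print("coarse solution expanded:", expand(coarse_best, parent))
print("fine solution:           ", fine_best)
```

In general the coarse minimum is only an upper bound on the fine minimum, since the coarse problem searches a restricted set of labelings. On this toy chain the two happen to coincide because the best fine labeling is constant within each group.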

DOI: 10.1109/ICCV.2013.234


Cite this paper

@article{Jain2013CoarsetoFineSV,
  title={Coarse-to-Fine Semantic Video Segmentation Using Supervoxel Trees},
  author={Aastha Jain and Shaunak Chatterjee and Ren{\'e} Vidal},
  journal={2013 IEEE International Conference on Computer Vision},
  year={2013},
  pages={1865-1872}
}