Optimized implementation of an MVC decoder

Abstract

3D video is becoming increasingly popular in a variety of applications. A 3D video experience requires at least two different views of the same scene. Despite serious interest from the consumer industry and open-source communities, the main challenge in designing a real-time 3D communication system is that several mandatory components of such a system are, to date, generally not available. In particular, to the best of our knowledge, no MVC-compatible software decoder currently achieves real-time performance. In this work, we address the challenge of implementing an open-source decoder for multi-view video (MVV) representations. We focus on the design and implementation of a real-time decoder, based on the FFmpeg framework, that is compliant with H.264/AVC Annex H (referred to as MVC). To this end, we implement the components that FFmpeg's H.264 implementation lacks with respect to MVC. First, we extend the parsing routines to handle MVC-compliant bitstreams by adding support for the new NAL unit types and parameter sets. We also implement new buffers and extended structs to store MVC-dependent data, such as the structs for the SPS and PPS and buffers for the Subset SPS and the inter-view reference lists. Second, we extend the decoding routines by extending the decoded picture buffer (DPB) and modifying the reference picture handling; this involves modifying existing code and implementing the additional functions required by MVC. Additionally, we investigate multi-threading capabilities and optimize the implementation so that it can decode selected views only. Finally, we add configuration options by extending the command line interface with additional parameters. We test our implementation on a commodity desktop computer and achieve an average decoding time of 18 ms per frame (for a single frame in all 8 views). Since 1000 ms / 18 ms ≈ 55, this corresponds to real-time performance for sequences with up to 50 frames per second.
In addition to the implementation, we perform experiments on different MVV sequences to optimize the coding of multi-view sequences in terms of quality-complexity trade-offs; we do this by varying the prediction schemes, the quantization parameters, and the scenes. We find that coding performance depends on both the scene characteristics and the prediction scheme. Finally, we analyze the impact of quantization on virtual view rendering and its relation to the prediction schemes. We observe that quantization has an impact on virtual view rendering, whereas the choice of prediction scheme does not.
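
To make the parsing extension and the selected-view decoding mentioned above more concrete, the following is a minimal sketch of how the MVC-specific NAL unit header fields can be read and used to skip unwanted views. It is not taken from the authors' FFmpeg code: the function names (`parse_nal_header`, `parse_mvc_extension`, `keep_nal_unit`) are hypothetical, and the sketch assumes that start codes and emulation prevention bytes have already been removed. The NAL unit type values (14, 15, 20) and the layout of the 3-byte header extension follow H.264/AVC Annex H.

```python
# Illustrative sketch only; names are hypothetical and not FFmpeg API.
# Assumes start codes and emulation prevention bytes were already removed.

# NAL unit types that are relevant for MVC (H.264/AVC Annex H).
NAL_PREFIX = 14        # prefix NAL unit
NAL_SUBSET_SPS = 15    # subset sequence parameter set
NAL_SLICE_EXT = 20     # coded slice extension (non-base views)


def parse_nal_header(first_byte: int) -> dict:
    """Split the one-byte NAL unit header into its three fields."""
    return {
        "forbidden_zero_bit": (first_byte >> 7) & 0x01,
        "nal_ref_idc": (first_byte >> 5) & 0x03,
        "nal_unit_type": first_byte & 0x1F,
    }


def parse_mvc_extension(ext: bytes) -> dict:
    """Parse the 3-byte header extension that follows types 14 and 20."""
    if len(ext) < 3:
        raise ValueError("MVC header extension needs 3 bytes")
    bits = int.from_bytes(ext[:3], "big")
    return {
        "svc_extension_flag": (bits >> 23) & 0x01,  # 0 means MVC
        "non_idr_flag": (bits >> 22) & 0x01,
        "priority_id": (bits >> 16) & 0x3F,
        "view_id": (bits >> 6) & 0x3FF,
        "temporal_id": (bits >> 3) & 0x07,
        "anchor_pic_flag": (bits >> 2) & 0x01,
        "inter_view_flag": (bits >> 1) & 0x01,
        "reserved_one_bit": bits & 0x01,
    }


def keep_nal_unit(nal: bytes, selected_views: set) -> bool:
    """Return False for MVC NAL units whose view_id is not selected.

    All other NAL units (base-view slices, parameter sets, ...) are kept.
    """
    header = parse_nal_header(nal[0])
    if header["nal_unit_type"] in (NAL_PREFIX, NAL_SLICE_EXT):
        ext = parse_mvc_extension(nal[1:4])
        if ext["svc_extension_flag"] == 0:
            return ext["view_id"] in selected_views
    return True
```

A filter of this kind only decides which NAL units to hand to the decoder. A real implementation would also have to keep every view that a selected view depends on for inter-view prediction, which is signalled in the Subset SPS and is not modelled here.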

Cite this paper

@inproceedings{Britz2013OptimizedIO, title={Optimized implementation of an MVC decoder}, author={Jochen Britz and Thorsten Herfet and Goran Petrovic}, year={2013} }