Vinay Melkote

This paper proposes a novel approach to jointly optimize spatial prediction and the choice of the subsequent transform in video and image compression. Under the assumption of a separable first-order Gauss-Markov model for the image signal, it is shown that the optimal Karhunen-Loeve Transform, given available partial boundary information, is well…
Predictive coding eliminates redundancy due to correlations between the current and past signal samples, so that only the innovation, or prediction residual, needs to be encoded. However, the decoder may, in principle, also exploit correlations with future samples. Prior decoder enhancement work mainly applied a non-causal filter to smooth the regular…
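As a rough illustration of the predictive coding loop this abstract starts from, the sketch below implements a textbook first-order DPCM codec: the encoder predicts each sample from the previous reconstructed sample and transmits only the quantized residual. This is not the paper's method. The function names (`dpcm_encode`, `dpcm_decode`), the uniform quantizer, and the `step` parameter are assumptions made for illustration.

```python
def dpcm_encode(samples, step):
    """Encode samples as quantized prediction residuals (hypothetical helper).

    The predictor is simply the previous reconstructed sample, so the
    encoder mirrors the decoder's state and quantization error does not drift.
    """
    indices = []
    prediction = 0.0
    for x in samples:
        residual = x - prediction          # the innovation to be encoded
        q = round(residual / step)         # uniform quantizer index
        indices.append(q)
        prediction = prediction + q * step # what the decoder will reconstruct
    return indices


def dpcm_decode(indices, step):
    """Rebuild samples by adding each dequantized residual to the prediction."""
    reconstruction = []
    prediction = 0.0
    for q in indices:
        prediction = prediction + q * step
        reconstruction.append(prediction)
    return reconstruction
```

Because the encoder predicts from its own reconstruction rather than from the original signal, each reconstructed sample differs from the input by at most half a quantizer step. This decoder is strictly causal; exploiting future samples, as the abstract suggests, would require additional processing on top of this loop.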
Current video coding schemes employ motion compensation to exploit the fact that the signal forms an auto-regressive process along the motion trajectory, and remove temporal redundancies with prior reconstructed samples via prediction. However, the decoder may, in principle, also exploit correlations with received encoding information of future frames. In…
Scalable video coding (SVC) employs inter-frame prediction at the base and/or the enhancement layers. Since the base layer can be encoded/decoded independently of the enhancement layers, we consider here the potential gains when prediction at the enhancement layers is delayed to accumulate and incorporate additional future information from the base layer. We…
Current scalable audio coders typically optimize performance at a particular layer without regard to impact on other layers, and are thus unable to provide a performance trade-off between different layers. In the particular case of MPEG Scalable Advanced Audio Coding (S-AAC) and Scalable-to-Lossless (SLS) coding, the base layer is optimized first followed…
Current audio coding standards employ the modified discrete cosine transform (MDCT) where overlapped frames of audio are windowed and transformed to the frequency domain. Encoding parameters are chosen so as to minimize a distortion measure subject to a rate constraint. At the decoder, inverse transformation involves additional windowing and overlap-add of…
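The windowing, transform, and overlap-add chain mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the paper's encoder: it omits quantization and rate-distortion optimization entirely, and it assumes a sine window (one common choice satisfying the Princen-Bradley condition), a hop size of `N` samples, and a 2/N inverse scaling. All function names are hypothetical.

```python
import numpy as np


def sine_window(N):
    """Length-2N sine window; satisfies w[n]^2 + w[n+N]^2 = 1."""
    n = np.arange(2 * N)
    return np.sin(np.pi * (n + 0.5) / (2 * N))


def mdct_basis(N):
    """N x 2N matrix of MDCT cosine basis functions."""
    n = np.arange(2 * N)
    k = np.arange(N)[:, None]
    return np.cos(np.pi / N * (n + 0.5 + N / 2) * (k + 0.5))


def mdct(frame, window, basis):
    """Window a 2N-sample frame and map it to N frequency coefficients."""
    return basis @ (window * frame)


def imdct(coeffs, window, basis):
    """Map N coefficients back to 2N aliased samples and window them again."""
    N = len(coeffs)
    return (2.0 / N) * window * (basis.T @ coeffs)


def analysis_synthesis(signal, N):
    """Transform 50%-overlapped frames and overlap-add the inverse outputs."""
    window, basis = sine_window(N), mdct_basis(N)
    out = np.zeros(len(signal))
    for start in range(0, len(signal) - 2 * N + 1, N):
        frame = signal[start:start + 2 * N]
        out[start:start + 2 * N] += imdct(mdct(frame, window, basis), window, basis)
    return out
```

Each inverse-transformed frame contains time-domain aliasing; it cancels only when adjacent frames are overlap-added, so in this sketch the first and last `N` samples (covered by a single frame) are not reconstructed. With unquantized coefficients, the interior samples match the input up to floating-point error.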
A novel scalable coding approach is proposed for video transmission over lossy networks, which builds on two estimation-theoretic (ET) paradigms previously developed by our group: (1) an ET approach to enhancement layer prediction in scalable video coding (ET-SVC) that optimally combines all available information from both the current base layer and prior…
Temporal prediction in standard video coding is performed in the spatial domain, where each pixel is predicted from a motion-compensated reconstructed pixel in a prior frame. This paper is premised on the realization that such standard prediction treats each pixel independently and ignores underlying spatial correlations, while transform-domain prediction…
This paper focuses on prediction optimality in spatially scalable video coding. It is inspired by the earlier estimation-theoretic prediction framework developed by our group for quality (SNR) scalability, which achieved optimality by fully accounting for relevant information from the current base layer (e.g., quantization intervals) and the enhancement…
End-to-end distortion estimation is critical to effective error-resilient video coding. The recursive optimal per-pixel estimate (ROPE) is a known approach to compute up to second moments of decoder-reconstructed pixels, and thereby optimally estimate the distortion. ROPE accurately accounts for encoding/decoding operations that are recursive in the pixel…