Nikolaos D. Zervas

In this paper, the three main hardware architectures for the two-dimensional discrete wavelet transform (2D-DWT) are reviewed. Optimization techniques applicable to all three architectures are also described. The main contribution of this work is the quantitative comparison among these design alternatives for the 2D-DWT. The comparison is performed in terms(More)
Exploitation of data re-use in combination with the use of a custom memory hierarchy that exploits the temporal locality of data accesses may introduce significant power savings, especially for data-intensive applications. The effect of the data-reuse decisions on the power dissipation but also on area and performance of multimedia applications realized on(More)
In this paper, two basic approaches for implementing the 9/7 Filtering Unit, used in the Discrete Wavelet Transform, are addressed. The first is the lifting scheme approach and the second is the conventional, convolutional filter approach. Two architectures are examined for each approach, a simple, straightforward one and an optimized one, substituting the(More)
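As a software illustration of the lifting scheme approach named above, the following is a minimal sketch of a one-level, one-dimensional 9/7 transform. It is not the hardware architecture from the paper. The lifting constants are the values commonly quoted for the CDF 9/7 wavelet and are an assumption here, as are the function names `lift_forward` and `lift_inverse`, the periodic boundary extension, and the final scaling convention (which differs between references).

```python
# Lifting constants commonly quoted for the CDF 9/7 wavelet
# (assumed for illustration; not taken from the paper).
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
K = 1.149604398  # scaling factor; the convention varies between references


def lift_forward(x):
    """One-level forward 9/7 transform of an even-length sequence.

    Uses periodic extension at the boundaries (an assumption made for
    simplicity). Returns (low, high) subbands of length len(x) // 2.
    """
    n = len(x)
    if n < 2 or n % 2 != 0:
        raise ValueError("input length must be even and at least 2")
    h = n // 2
    s = [float(v) for v in x[0::2]]  # even-indexed samples
    d = [float(v) for v in x[1::2]]  # odd-indexed samples
    # Two predict/update pairs; d[i - 1] wraps to d[-1] when i == 0.
    d = [d[i] + ALPHA * (s[i] + s[(i + 1) % h]) for i in range(h)]
    s = [s[i] + BETA * (d[i - 1] + d[i]) for i in range(h)]
    d = [d[i] + GAMMA * (s[i] + s[(i + 1) % h]) for i in range(h)]
    s = [s[i] + DELTA * (d[i - 1] + d[i]) for i in range(h)]
    return [v * K for v in s], [v / K for v in d]


def lift_inverse(low, high):
    """Invert lift_forward by undoing each lifting step in reverse order."""
    h = len(low)
    s = [v / K for v in low]
    d = [v * K for v in high]
    s = [s[i] - DELTA * (d[i - 1] + d[i]) for i in range(h)]
    d = [d[i] - GAMMA * (s[i] + s[(i + 1) % h]) for i in range(h)]
    s = [s[i] - BETA * (d[i - 1] + d[i]) for i in range(h)]
    d = [d[i] - ALPHA * (s[i] + s[(i + 1) % h]) for i in range(h)]
    x = [0.0] * (2 * h)
    x[0::2] = s
    x[1::2] = d
    return x
```

Each lifting step modifies one half of the samples using only the other half, so the inverse is obtained by replaying the steps backwards with the signs flipped. A convolutional implementation would instead apply the 9-tap and 7-tap filters directly to the input.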
A new method for the implementation of the binary-tree decomposition of the convolution-based wavelet transform, called the Local Wavelet Transform (LWT) has been recently proposed in the literature. While it produces exactly the same results as the classical row-column implementation of the transform, it has many implementation benefits. In this paper,(More)
This paper focuses on I-cache behaviour enhancement through the application of high-level code transformations. Specifically, a flow for the iterative application of the I-Cache performance-optimizing transformations is proposed. The procedure of applying transformations is driven by a set of analytical equations, which receive parameters related to code and(More)
This paper describes an efficient implementation of an image coding system based on the independent wavelet-tree coding concept. The system consists of a transform and a (de)coding engine that operate in a pipelined fashion. The main focus of this paper will be on the encoding part since, due to the system architecture, the decoder has identical memory(More)
Multimedia applications are characterized by an increased number of data transfer and storage operations due to real time requirements. Appropriate transformations can be applied at the algorithmic level to improve crucial implementation characteristics. In this paper, the effect of the data-reuse transformations on power consumption, area and performance(More)
Power savings that can be achieved by data-reuse decisions targeting a custom memory hierarchy for multimedia applications executing on embedded cores are examined in this paper. Exploiting the temporal locality of memory accesses in data-intensive applications, a set of data-reuse transformations on a typical motion estimation algorithm is determined.(More)
A methodology for power optimization of the data memory hierarchy and instruction memory is introduced. The effect of the methodology on a set of widely used multimedia application kernels, namely Full Search (FS), Hierarchical Search (HS), and Parallel Hierarchical One Dimension Search (PHODS), is demonstrated. Three different target architecture models(More)
Data memory hierarchy optimization and partitioning for a widely used multimedia application kernel known as the hierarchical motion estimation algorithm is undertaken, with the use of global loop and data-reuse transformations for three different embedded processor architecture models. Exhaustive exploration of the obtained results clarifies the effect of(More)