Michalis D. Galanis

In this paper we study the performance improvements and trade-offs derived from an optimized mapping approach applied to a parametric coarse-grained reconfigurable array architecture. The processing elements' local register files and interconnection network are exploited for caching memory data values with data reuse opportunities.
A high-performance data-path for implementing DSP kernels is introduced in this paper. The data-path is realized by a Flexible Computational Component (FCC), a purely combinational circuit that can implement any 2x2 template (cluster) of primitive resources. Thus, the data-path's performance benefits from the intra-component chaining of operations. Due…
A performance comparison of FPGA hardware implementations of 64-bit block ciphers (Triple-DES, IDEA, CAST-128, MISTY1, and KHAZAD) is given in this paper. All these ciphers are under consideration for the ISO/IEC 18033-3 standard, which aims to provide an international encryption standard for 64-bit block ciphers. Two basic architectures are implemented…
In this paper, the hardware implementations of five representative stream ciphers are compared in terms of performance and consumed area in an FPGA device. The ciphers used for the comparison are A5/1, W7, E0, RC4 and Helix. The first three have been used for the security part of well-known standards, especially wireless communication protocols.
The KASUMI block cipher is used for the security part of many synchronous wireless standards. In this paper, two architectures and efficient implementations of the 64-bit KASUMI block cipher are presented. The first uses pipelining (inner-round and outer-round) and achieves a throughput of 3584 Mbps at 56 MHz. The…
It is widely known that parallel operation execution in multiprocessor systems generates a corresponding increase in memory accesses. Since the memory and bus subsystems provide limited access bandwidth, application performance cannot be as high as the multiprocessor system's capabilities promise. This is the case for the 2-Dimensional coarse-grained…
In this paper, we present a software framework that implements a formalized methodology for partitioning Digital Signal Processing applications between reconfigurable hardware blocks of different granularity. A hybrid generic reconfigurable architecture is considered, so that the methodology is applicable to a large variety of hybrid reconfigurable systems.
Several mesh-like coarse-grained reconfigurable architectures have been devised in the last few years, accompanied by their corresponding mapping flows. One of the major bottlenecks in mapping algorithms onto these architectures is the limited memory access bandwidth. Only a few mapping methodologies have addressed the problem of the limited bandwidth while…