Giorgos Dimitrakopoulos

Learn More
Parallel-prefix adders offer a highly efficient solution to the binary addition problem and are well-suited for VLSI implementations. In this paper, a novel framework is introduced, which allows the design of parallel-prefix Ling adders. The proposed approach saves one-logic level of implementation compared to the parallel-prefix structures proposed for the(More)
The need for efficient implementation of simple crossbar schedulers has increased in the recent years due to the advent of on-chip interconnection networks that require low latency message delivery. The core function of any crossbar scheduler is arbitration that resolves conflicting requests for the same output. Since, the delay of the arbiters directly(More)
In this work, we propose a new algorithm for designing diminished-1 modulo 2 þ 1 multipliers. The implementation of the proposed algorithm requires nþ 3 partial products that are reduced by a tree architecture into two summands, which are finally added by a diminished-1 modulo 2 þ 1 adder. The proposed multipliers, compared to existing implementations,(More)
Large systems-on-chip (SoCs) and chip multiprocessors (CMPs), incorporating tens to hundreds of cores, create a significant integration challenge. Interconnecting a huge amount of architectural modules in an efficient manner, calls for scalable solutions that would offer both high throughput and low-latency communication. The switches are the basic building(More)
The efficiency of modern Networks-on-Chip (NoC) is no longer judged solely by their physical scalability, but also by their ability to deliver high performance, Quality-of-Service (QoS), and flow isolation at the minimum possible cost. Although traditional architectures supporting Virtual Channels (VC) offer the resources for flow partitioning and(More)
Two architectures for parallel-prefix modulo 2 − 1 adders are presented in this paper. For large wordlengths we introduce the sparse modulo 2 − 1 adders that achieve significant reduction of the wiring complexity without imposing any delay penalty. Then, the Ling-carry formulation of modulo 2 − 1 addition is presented. Ling modulo adders save one logic(More)
In this paper, a new leading-zero counter (or detector) is presented. New boolean relations for the bits of the leading-zero count are derived that allow their computation to be performed using standard carry-lookahead techniques. Using the proposed approach various design choices can be explored and different circuit topologies can be derived for the(More)
High-end embedded processors demand complex on-chip cache hierarchies satisfying several contradicting design requirements such as high-performance operation and low energy consumption. This paper introduces light-power (LP) nonuniform cache architecture (NUCA), a tiled-cache addressing both goals. LP-NUCA places a group of small and low-latency tiles(More)
Networks-on-FPGA consist of a network of switches connected with point-to-point links and can cover sufficiently the communication needs of complex systems implemented on FPGA platforms. The efficient implementation of such networks requires the appropriate tuning of their components to the characteristics of the FPGA's logic and memory resources. In this(More)