Computationally efficient locality-aware interconnection topology for multi-processor system-on-chip (MP-SoC)

Authors: Haroon-Ur-Rashid Khan, Feng Shi, Weixing Ji, Yujin Gao, Yizhuo Wang, Caixia Liu, Ning Deng, Jiaxin Li
Journal: Chinese Science Bulletin

This paper evaluates the Triplet Based Architecture (TriBA), a new idea in chip multiprocessor architectures and a class of Direct Interconnection Network (DIN). TriBA consists of a 2D grid of small, programmable processing units, each physically connected to its three neighbors so that the advantages of group locality can be fully and efficiently exploited. Any communication model can be well characterized by locality properties, and any topology has its intrinsic, structural, locality…

Design and Evaluation of Efficient Router Architecture for Triplet-Based Network-on-Chip Topology

Simulation results show that the X Router can not only decrease traffic latency and energy consumption, but also improve throughput over the baseline router architecture.

Express Router Microarchitecture for Triplet-based Hierarchical Interconnection Network

This paper proposes a topology-related express router microarchitecture for THIN, named E-THIN, which reduces packet latency by up to 55%, improves throughput by up to 12%, and reduces energy consumption per packet by 8% over a baseline router microarchitecture.

Improving Router Efficiency in Network on Chip Triplet-Based Hierarchical Interconnection Network with Shared Buffer Design

The cyclic queue is allowed simultaneous access to the shared buffer, which is one of the characteristics of TBHIN, and results show that the shared buffer design reduces packet latency by up to 29% compared with a conventional buffer design.

Exploring grouped coherence for clustered hierarchical cache

A novel vertical caching protocol combined with grouped coherence, in which the coherence domain expands on demand, is proposed to provide a ‘best-effort’ single-copy delivery that allows shared data only in the first common shared level.

3D floorplanning of low-power and area-efficient Network-on-Chip architecture

A Long-wire-connected and Multi-channel 3D Network-on-chip Design for Many-core System

To reduce traffic jams caused by different data competing for a channel, this work presents a low-delay, energy-efficient network-on-chip with three channels for different types of data that achieves better latency and energy performance.

A Novel Speedup Evaluation for Multicore Architecture Based Topology of On-Chip Memory

A novel method, ETOM (Evaluation on Topology of On-chip Memory), for evaluating multicore speedup is presented, and a novel multicore architecture, TriBA, is obtained whose performance is better than that of 2DMesh.

The Column-Partition and Row-Partition Turn Model

  • Yuan Cai, W. Luo, D. Xiang
  • Computer Science
    2018 IEEE 42nd Annual Computer Software and Applications Conference (COMPSAC)
  • 2018
A simple and intuitive method for judging whether a routing algorithm is deadlock-free, which is important for avoiding deadlock, is presented, and a proof of the method's correctness is given.



A New Hierarchical Interconnection Network for Multi-core Processor

The results show that THIN is a better candidate for constructing the NOC than 2-D mesh, when there are not too many cores.

The Y-architecture: yet another on-chip interconnect solution

A new on-chip interconnect scheme called Y-architecture is proposed, which can utilize the on-chip routing resources more efficiently than traditional Manhattan interconnect architecture by allowing wires to be routed in three directions.

Exploiting Wiring Resources on Interconnection Network: Increasing Path Diversity

The paper shows that SDM is a technique worth considering in on-chip networks, since it can greatly increase the traffic accepted by the network at the expense of a small latency increase or even no increase.

Performance evaluation and design trade-offs for network-on-chip interconnect architectures

This paper develops a consistent and meaningful evaluation methodology to compare the performance and characteristics of a variety of NoC architectures, explores design trade-offs that characterize the NoC approach, and obtains comparative results for a number of common NoC topologies.

Interconnections in multi-core architectures: understanding mechanisms, overheads and scaling

Examination of the area, power, performance, and design issues for the on-chip interconnects on a chip multiprocessor shows that designs that treat interconnect as an entity that can be independently architected and optimized would not arrive at the best multi-core design.

A Triplet-based Computer Architecture Supporting Parallel Object Computing

TriBA is an object-oriented chip multi-processor that supports truly parallel execution of objects from hardware, achieves the unification of software architecture and computer, and relieves the burden of parallel programming.

Structured interconnect architecture: a solution for the non-scalability of bus-based SoCs

The butterfly fat tree (BFT) can meet this objective when used as the overall MP-SoC interconnect architecture, thereby offering an attractive alternative for SoC interconnect that does not suffer from the non-scalability of buses with regard to the clock cycle.

A New Routing Algorithm in Triple-Based Hierarchical Interconnection Network

  • B. Qiao, S. Feng, Weixing Ji
  • Computer Science
    First International Conference on Innovative Computing, Information and Control - Volume I (ICICIC'06)
  • 2006
The analysis based on the simulation of DDRA (distributed deterministic routing algorithm) shows that it is not only simple and easy to implement in hardware, but also highly efficient.

Diagonal routing in high performance microprocessor design

This prototype chip proved that the diagonal routing method was effective in reducing the total net length and improving path delay in the microprocessor design.

Single-ISA heterogeneous multi-core architectures: the potential for processor power reduction

This paper proposes and evaluates single-ISA heterogeneous multi-core architectures as a mechanism to reduce processor power dissipation, and results indicate a 39% average energy reduction while sacrificing only 3% in performance.