Microarchitecture of the Godson-2 Processor

W. Hu, Fuxing Zhang, Zusong Li
Journal of Computer Science and Technology
The Godson project is the first attempt to design high-performance general-purpose microprocessors in China. This paper introduces the microarchitecture of the Godson-2 processor, which is a 64-bit, 4-issue, out-of-order execution RISC processor that implements a 64-bit MIPS-like instruction set. The adoption of aggressive out-of-order execution techniques (such as register mapping, branch prediction, and dynamic scheduling) and cache techniques (such as non-blocking cache, load…
Implementing a 1GHz Four-Issue Out-of-Order Execution Microprocessor in a Standard Cell ASIC Methodology
This paper introduces the microarchitecture and physical implementation of the Godson-2E processor, which is a four-issue superscalar RISC processor that supports the 64-bit MIPS instruction set. The…
Microarchitecture and Performance Analysis of Godson-2 SMT Processor
Experimental results indicate that the performance of the Godson-2 SMT processor is improved significantly by fully exploiting thread-level parallelism and optimizing the utilization of functional units.
Hardware/Software Interface Design of Godson-2 Simultaneous Multithreading Processor
The results show that the Godson-2 SMT processor designed with this interface achieves much better performance, and that the hardware/software interface and operating system design presented in the paper are also useful for multi-core processor design.
Code reordering on limited branch offset
This paper analyzes the effect of the limited branch offset of the MIPS-like instruction set, explores two simple methods to handle the exceeded branches, and proposes the bidirectional code layout (BCL) algorithm to reduce the number of branches exceeding the offset limit.
An ultra-fast hybrid simulation framework for ASIP
A hybrid simulation framework is proposed that improves on previous simulation methods by aggressively utilizing host machine resources. It uses binary instrumentation to categorize the instructions of an ASIP application into two types, custom and basic.
Design and Implementation of Floating Point Stack on General RISC Architecture
An optimized register renaming scheme is proposed to eliminate redundant micro-ops in FP programs, resulting in increased performance while reducing the burden on the register rename table.
Godson-3: A Scalable Multicore RISC Processor with x86 Emulation
The Godson-3 microprocessor aims at high-throughput server applications, high-performance scientific computing, and high-end embedded applications. It offers a scalable network on chip, hardware…
Exploiting idle register classes for fast spill destination
This paper develops a model, called the IRE model or IREM, to determine the static performance gains of IRE versus spilling to the stack, and finds that the IRE method speeds up execution of the SPECint benchmark suite by 1.7% to 10%.
SimOS-Goodson: A Goodson-Processor Based Multi-Core Full-System Simulator
A new multi-core full-system simulator of Goodson processors, SimOS-Goodson, has been designed and implemented, which decouples the simulation functionality and timing and adopts a new value-prediction approach to implement memory consistency in the simulation environment.
Simplified Multi-Ported Cache in High Performance Processor
  • H. Zhang, Dongrui Fan
  • Computer Science
  • 2007 International Conference on Networking, Architecture, and Storage (NAS 2007)
  • 2007
The authors' technique, which uses a simplified multi-ported banking cache, reduces the delay of the select logic in the LSQ by 16.1% and achieves 98.1% of the performance of an ideal dual-ported cache.


The PA-8000 RISC CPU is the first of a new generation of Hewlett-Packard microprocessors designed for high-end systems, and features an aggressive, four-way, superscalar implementation, combining speculative execution with on-the-fly instruction reordering.
The Mips R10000 superscalar microprocessor
The Mips R10000 is a dynamic, superscalar microprocessor that implements the 64-bit Mips 4 instruction set architecture. It fetches and decodes four instructions per cycle and dynamically issues them…
Introducing the IA-64 Architecture
The motivation, operation, and benefits of the major features of IA-64 are examined and it is found that instruction-level parallelism (ILP) can be exploited for further performance increases.
The Alpha 21264 microprocessor
A unique combination of high clock speeds and advanced microarchitectural techniques, including many forms of out-of-order and speculative execution, provides exceptional core computational performance in the 21264.
UltraSPARC-III: designing third-generation 64-bit performance
The UltraSPARC-III is the third generation of Sun Microsystems' most powerful microprocessors, which are at the heart of Sun's computer systems, and it ensures compatibility with all existing SPARC applications and the Solaris operating system.
POWER4 system microarchitecture
The processor microarchitecture as well as the interconnection architecture employed to form systems up to a 32-way symmetric multiprocessor are described.
Computer Architecture: A Quantitative Approach
This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important…