Godson-3: A Scalable Multicore RISC Processor with x86 Emulation

  • W. Hu, J. Wang, Xiang Gao, Yunji Chen, Qi Liu, Guojie Li
  • IEEE Micro
The Godson-3 microprocessor aims at high-throughput server applications, high-performance scientific computing, and high-end embedded applications. It offers a scalable network on chip, hardware support for x86 emulation, and a reconfigurable architecture. The four-core Godson-3 chip is fabricated with 65-nm CMOS technology. Eight- and 16-core Godson-3 chips are in development. 
System Architecture of Godson-3 Multi-Core Processors
The system architecture of Godson-3 is introduced from several aspects: system scalability, organization of the memory hierarchy, network on chip, inter-chip connection, and the I/O subsystem.
Physical Implementation of the 1GHz Godson-3 Quad-Core Microprocessor
The design methodology for the physical implementation of Godson-3A is described, with particular emphasis on design methods for high frequency, clock tree design, power management, and on-chip variation (OCV) issues.
Physical Implementation of the Eight-Core Godson-3B Microprocessor
  • Ru Wang, B. Fan, +7 authors W. Hu
  • Computer Science
  • Journal of Computer Science and Technology
  • 2011
The Godson-3B processor is designed for high-performance servers, including Dawning servers. It contains 582.6M transistors within a 300 mm² area in 65 nm technology and is implemented in parallel with full hierarchical design flows.
The implementation and design methodology of a quad-core version Godson-3 microprocessor
Godson-3A is a quad-core version of the Godson-3 series, a 174 mm², 425-million-transistor chip fabricated in 65 nm CMOS LP/GP process technology. It can run at 1 GHz with less than 15 W…
A Scalable Scan Architecture for Godson-3 Multicore Microprocessor
This paper describes the scan test challenges and techniques used in the Godson-3 microprocessor, which is a scalable multicore processor based on the SMOC (Scalable Mesh of Crossbar) on-chip network…
A Robust and Power-Efficient SoC Implementation in 65 nm
Godson2H is a complex SoC (system on chip) of the Godson series, a 117 mm², 152-million-transistor chip fabricated in 65 nm CMOS LP/GP process technology. It integrates a 1 GHz processor core…
Design for Testability Features of Godson-3 Multicore Microprocessor
This paper describes the design for testability (DFT) challenges and techniques of the Godson-3 microprocessor, which is a scalable multicore processor based on the scalable mesh of crossbar (SMOC)…
Design and Application of Instruction Set Simulator on Multi-Core Verification
The multi-core chip architecture is introduced first. A general methodology for expanding a single-core ISS to a multi-core ISS (MCISS) is then proposed, a real-time comparison environment is created for multi-core verification, and the problems of multi-core communication and synchronization are addressed.
A Processor-DMA-Based Memory Copy Hardware Accelerator
A processor-DMA-based memory copy hardware accelerator is proposed with the goal of reducing the instructions executed on the CPU and exploiting the parallelism between computing and data transfer in memory copy, by taking advantage of a proposed direct memory access (DMA) engine in the processor.


Niagara: a 32-way multithreaded Sparc processor
The Niagara processor implements a thread-rich architecture designed to provide a high-performance solution for commercial server applications that exploits the thread-level parallelism inherent to server applications, while targeting low levels of power consumption.
Microarchitecture of the Godson-2 Processor
The Godson project is the first attempt to design high-performance general-purpose microprocessors in China. This paper introduces the microarchitecture of the Godson-2 processor, which is a 64-bit,…
The PA-8000 RISC CPU is the first of a new generation of Hewlett-Packard microprocessors designed for high-end systems, and features an aggressive, four-way, superscalar implementation, combining speculative execution with on-the-fly instruction reordering.
UltraSPARC-III: designing third-generation 64-bit performance
The UltraSPARC-III is the third generation of Sun Microsystems' most powerful microprocessors, which are at the heart of Sun's computer systems. It ensures compatibility with all existing SPARC applications and the Solaris operating system.
IBM Power5 chip: a dual-core multithreaded processor
The approach to improving chip-level performance of the Power5 is described; the design specified increased performance and other functional enhancements of server virtualization, reliability, availability, and serviceability at both chip and system levels.
QEMU, a Fast and Portable Dynamic Translator
  • Fabrice Bellard
  • Computer Science
  • USENIX Annual Technical Conference, FREENIX Track
  • 2005
QEMU supports full system emulation, in which a complete and unmodified operating system is run in a virtual machine, and Linux user mode emulation, where a Linux process compiled for one target CPU can be run on another CPU.
The Mips R10000 superscalar microprocessor
The Mips R10000 is a dynamic, superscalar microprocessor that implements the 64-bit Mips 4 instruction set architecture. It fetches and decodes four instructions per cycle and dynamically issues them…
Exploiting ILP, TLP, and DLP with the Polymorphous TRIPS Architecture
The Tera-op reliable intelligently adaptive processing system (TRIPS) architecture seeks to deliver system-level configurability to applications and runtime systems. It does so by employing the…
The Alpha 21264 microprocessor
A unique combination of high clock speeds and advanced microarchitectural techniques, including many forms of out-of-order and speculative execution, provides exceptional core computational performance in the 21264.
Dynamic Binary Translation and Optimization
Different design trade-offs in the DAISY system and their impact on final system performance are reported, and the results show high degrees of instruction parallelism with reasonable translation overhead and memory usage.