A scalable algorithm for RTL insertion of gated clocks based on ODCs computation
This paper presents a branch prediction algorithm and a 4-way set-associative cache that improve the performance of a 32-bit RISC processor, together with a clock gating algorithm based on ODC (observability don't care) computation that lowers its power consumption. The branch prediction algorithm is built around a BTB (branch target buffer), and the 4-way set-associative cache uses a pseudo-LRU (least recently used) replacement algorithm. The proposed algorithms are applied to the OpenRISC1200 embedded processor and implemented on a Xilinx VIRTEX-4 XC4VLX80 FPGA device, where the design runs at a maximum frequency of 53.042 MHz. Performance and dynamic power estimates show that the proposed algorithms improve the performance of the OpenRISC1200 processor by about 5 to 9%, and that the dynamic power of the processor, estimated with a Samsung 0.18 µm technology library, is reduced by 13.9%.
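To illustrate the pseudo-LRU replacement mentioned above, the following is a minimal behavioural sketch of one 4-way set using the common tree-based variant with 3 state bits. The abstract does not say which pseudo-LRU variant the design uses, so the tree scheme, the class names `TreePLRU4` and `CacheSet`, and the fill-invalid-ways-first policy are all assumptions made for illustration, not the authors' implementation.

```python
class TreePLRU4:
    """Tree pseudo-LRU state for one 4-way set (3 bits). Hypothetical sketch."""

    def __init__(self):
        # Each bit points toward the side that should be evicted next.
        self.root = 0   # 0 -> victim is in ways 0/1, 1 -> victim is in ways 2/3
        self.left = 0   # 0 -> way 0, 1 -> way 1
        self.right = 0  # 0 -> way 2, 1 -> way 3

    def touch(self, way):
        """Record an access: make every bit on the path point away from `way`."""
        if way < 2:
            self.root = 1
            self.left = 1 - way
        else:
            self.root = 0
            self.right = 1 - (way - 2)

    def victim(self):
        """Follow the bits from the root to pick the way to replace."""
        if self.root == 0:
            return self.left
        return 2 + self.right


class CacheSet:
    """One set of a 4-way set-associative cache (tags only, no data)."""

    def __init__(self):
        self.tags = [None] * 4
        self.plru = TreePLRU4()

    def access(self, tag):
        """Return (hit, way) for an access to `tag`."""
        if tag in self.tags:
            way, hit = self.tags.index(tag), True
        else:
            # Assumption: invalid ways are filled before any eviction.
            way = self.tags.index(None) if None in self.tags else self.plru.victim()
            self.tags[way] = tag
            hit = False
        self.plru.touch(way)
        return hit, way


if __name__ == "__main__":
    s = CacheSet()
    for t in ["A", "B", "C", "D", "A", "E"]:
        print(t, s.access(t))
```

In this sketch, after filling the set with A to D and re-accessing A, the miss on E evicts C (way 2) rather than B, the true least recently used line. That difference is the approximation that lets pseudo-LRU keep 3 bits per set instead of a full access ordering.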