Youn-Long Lin

We propose a graph-theoretic approach for the data path allocation problem. We decompose the problem into three subproblems: (1) register allocation, (2) operation assignment, and (3) connection allocation. The first two subproblems are modeled as two bipartite weighted matching problems and solved using the Hungarian Method [Pap82]. The third subproblem is ...
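To make the matching step concrete, here is a minimal sketch of the Hungarian Method on a small square cost matrix. It is not the paper's formulation: the function name `hungarian`, the reading of rows as variables and columns as registers, and the `cost` values are all hypothetical, and the abstract does not say whether the weights are minimized or maximized (this sketch minimizes).

```python
def hungarian(cost):
    """Minimum-cost perfect matching on a square cost matrix.

    Returns (total_cost, assignment), where assignment[i] is the column
    matched to row i. Runs in O(n^3) using row/column potentials.
    """
    n = len(cost)
    INF = float("inf")
    u = [0] * (n + 1)      # potentials for rows (1-based)
    v = [0] * (n + 1)      # potentials for columns (1-based)
    p = [0] * (n + 1)      # p[j] = row currently matched to column j, 0 = none
    way = [0] * (n + 1)    # back-pointers along the augmenting path

    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = INF
            j1 = 0
            # Find the unused column with the smallest reduced cost.
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            # Shift potentials so that at least one new edge becomes tight.
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1

    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return total, assignment


# Hypothetical example: cost[i][j] = estimated interconnect cost of
# placing variable i in register j.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
total, assignment = hungarian(cost)
print(total, assignment)  # 5 [1, 0, 2]
```

A maximum-weight matching can be obtained from the same routine by negating the weights before the call.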
We propose a near-optimal hardware architecture for the deblocking filter in H.264/MPEG-4 AVC. We propose a novel filtering order and a data reuse strategy that result in significant savings in filtering time, local memory usage, and memory traffic. Every 16x16 macroblock requires 192 filtering operations. After a few initialization cycles, our 5-stage pipelined ...
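The figure of 192 filtering operations per macroblock can be reproduced with a short count. The sketch below rests on assumptions not stated in the abstract: 4:2:0 chroma sampling, edges on a 4-sample grid in each plane (including the left and top macroblock boundaries), and one "filtering operation" meaning one line of samples filtered across one edge. The helper name `edge_line_ops` is hypothetical.

```python
def edge_line_ops(block_size, edge_spacing):
    """Count line-filter operations for one square block of samples.

    Each edge is crossed by `block_size` lines of samples, and edges are
    filtered in both the vertical and the horizontal direction.
    """
    edges_per_direction = block_size // edge_spacing
    return 2 * edges_per_direction * block_size


# Luma: one 16x16 block with 4 edges per direction.
luma_ops = edge_line_ops(16, 4)

# Chroma (assumed 4:2:0): two 8x8 blocks (Cb, Cr) with 2 edges per direction.
chroma_ops = 2 * edge_line_ops(8, 4)

total_ops = luma_ops + chroma_ops
print(luma_ops, chroma_ops, total_ops)  # 128 64 192
```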
We present an algorithm for pipelining loop execution in the presence of loop-carried dependence. We optimize both the initiation interval and the turnaround time of a schedule. Given constraints on the number of functional units and buses, we first determine an initiation interval and then incrementally partition the operations into blocks to fit into the ...
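As an illustration of how resource constraints and loop-carried dependences bound the initiation interval, the sketch below computes the standard lower bounds used in loop pipelining. It is not the paper's procedure, which the abstract does not spell out; the function names, the resource kinds, and the example numbers are hypothetical.

```python
from math import ceil


def resource_min_ii(op_counts, unit_counts):
    """Lower bound from resources: each kind needs ceil(uses / units) cycles."""
    return max(ceil(op_counts[kind] / unit_counts[kind]) for kind in op_counts)


def recurrence_min_ii(cycles):
    """Lower bound from loop-carried dependence cycles.

    `cycles` is a list of (total_latency, iteration_distance) pairs, one per
    dependence cycle; a cycle forces II >= ceil(latency / distance).
    """
    return max((ceil(lat / dist) for lat, dist in cycles), default=1)


def min_initiation_interval(op_counts, unit_counts, cycles):
    """No schedule can start iterations more often than the larger bound."""
    return max(resource_min_ii(op_counts, unit_counts), recurrence_min_ii(cycles))


# Hypothetical loop body: 5 additions, 4 multiplications, 6 bus transfers,
# on 2 adders, 1 multiplier, and 2 buses.
ops = {"add": 5, "mul": 4, "bus": 6}
units = {"add": 2, "mul": 1, "bus": 2}
# Two dependence cycles: latency 3 over distance 1, latency 5 over distance 2.
dep_cycles = [(3, 1), (5, 2)]

print(resource_min_ii(ops, units))                      # 4
print(recurrence_min_ii(dep_cycles))                    # 3
print(min_initiation_interval(ops, units, dep_cycles))  # 4
```

A scheduler would typically start from this bound and increase the interval only if no feasible schedule is found.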
The test scheduling of memory cores can significantly affect the test time and power of system chips. We propose a test scheduling algorithm for BISTed memory cores to minimize the overall testing time under the test power constraint. The proposed algorithm combines several approaches for a near-optimal result, based on the properties of BISTed memory ...
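To illustrate the kind of problem being solved, here is a simple greedy baseline for power-constrained test scheduling. It is not the proposed algorithm, whose details the abstract does not give. It assumes a session-based model in which cores in the same session are tested in parallel, a session lasts as long as its slowest core, and the summed test power of a session must not exceed the limit. All names and numbers are hypothetical.

```python
def schedule_sessions(cores, power_limit):
    """Greedy first-fit packing of memory cores into test sessions.

    `cores` is a list of (name, test_time, test_power) tuples.
    Returns (sessions, total_time), where sessions is a list of name lists.
    """
    sessions = []  # each entry: {"names": [...], "power": int, "time": int}
    # Longest tests first, so the first core in a session fixes its length.
    for name, time, power in sorted(cores, key=lambda c: c[1], reverse=True):
        if power > power_limit:
            raise ValueError(f"core {name} alone exceeds the power limit")
        for session in sessions:
            if session["power"] + power <= power_limit:
                session["names"].append(name)
                session["power"] += power
                break
        else:
            # No existing session has enough power headroom: open a new one.
            sessions.append({"names": [name], "power": power, "time": time})
    total_time = sum(session["time"] for session in sessions)
    return [session["names"] for session in sessions], total_time


# Hypothetical memory cores: (name, test time in cycles, test power in mW).
cores = [("A", 100, 3), ("B", 80, 4), ("C", 60, 2), ("D", 40, 5)]
sessions, total_time = schedule_sessions(cores, power_limit=7)
print(sessions, total_time)  # [['A', 'B'], ['C', 'D']] 160
```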
Three-dimensional (3D) on-chip memory stacking has been proposed as a promising solution to the “memory wall” challenge with the benefits of low access latency, high data bandwidth, and low power consumption. The stacked memory tiers leverage through-silicon vias (TSVs) to communicate with logic tiers, and thus dramatically reduce the access latency and ...
We propose a fast algorithm for the transistor-chaining problem in CMOS functional cell layout based on Uehara and van Cleemput’s layout style [12]. Our algorithm takes a transistor-level circuit schematic and outputs a ... [Manuscript received January 4, 1989; revised May 31, 1989. This work was supported in part by ERSO under Contract SF-C-010-1 and by the ...]
We propose a performance-driven cell placement method based on a modified force-directed approach. A pseudo net is added to link the source and sink flip-flops of every critical path to enforce their closeness. Given user-specified I/O pad locations at the chip boundaries and starting with all core cells in the chip center, we iteratively move a cell to its ...