Learn More
This paper represents pioneering work in the area of technology mapping for FPGAs and presents a theoretical breakthrough by solving the LUT-­‐based FPGA technology mapping problem for depth minimization optimally in polynomial time. The key idea was to model the LUT mapping problem as the computation of a minimum-­‐height, K-­‐feasible cut. This was the(More)
Convolutional neural network (CNN) has been widely employed for image recognition because it can achieve high accuracy by emulating behavior of optic nerves in living creatures. Recently, rapid growth of modern applications based on deep learning algorithms has further improved research and implementations. Especially, various accelerators for deep CNN have(More)
As the technology progresses, interconnect delays have become bottlenecks of chip performance. 3D integrated circuits are proposed as one way to address this problem. However, thermal problem is a critical challenge for 3D IC circuit design. We propose a thermal-driven 3D floorplanning algorithm. Our contributions include: (1) a new 3D floorplan(More)
<italic>This paper studies buffer block planning for interconnect-driven floorplanning in deep submicron designs. We first introduce the concept of feasible region (FR) for buffer insertion, and derive closed-form formula for FR. We observe that the FR for a buffer is quite large in general even under fairly tight delay constraint. Therefore, FR gives us a(More)
In this paper, we explore the use of multi-band radio frequency interconnect (or RF-I) with signal propagation at the speed of light to provide shortcuts in a many core network-on-chip (NoC) mesh topology. We investigate the costs associated with this technology, and examine the latency and bandwidth benefits that it can provide. Assuming a 400(More)
Escalating system-on-chip design complexity is pushing the design community to raise the level of abstraction beyond register transfer level. Despite the unsuccessful adoptions of early generations of commercial high-level synthesis (HLS) systems, we believe that the tipping point for transitioning to HLS msystem-on-chip design complexityethodology is(More)
Cut enumeration is a common approach used in a number of FPGA synthesis and mapping algorithms for consideration of various possible LUT implementations at each node in a circuit. Such an approach is very general and flexible, but often suffers high computational complexity and poor scalability. In this paper, we develop several efficient and effective(More)
The multilevel placement package mPL6 combines improved implementations of the global placer mPL5 (ISPD05) and the XDP legalizer and detailed placer (ASPDAC06). It consistently produces robust, high-quality solutions to difficult instances of mixed-size placement in fast and scalable run time. Best-choice clustering (ISPD05) is used to construct a hierarchy(More)
We study the minimum-cost bounded-skew routing tree problem under the pathlength (linear) and Elmore delay models. This problem captures several engineering tradeoffs in the design of routing topologies with controlled skew. Our bounded-skew routing algorithm, called the BST/DME algorithm, extends the DME algorithm for exact zero-skew trees via the concept(More)
Designing an application-specific embedded system in nanometer technologies has become more difficult than ever due to the rapid increase in design complexity and manufacturing cost. Efficiency and flexibility must be carefully balanced to meet different application requirements. The recently emerged configurable and extensible processor architectures offer(More)