Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to…
Propagation delay across long on-chip buses is increasingly a limiting factor in high-speed designs. Crosstalk between adjacent wires on the bus may account for a significant portion of this delay. Placing a shield wire between each signal wire alleviates the crosstalk problem but doubles the area used by the bus, an unacceptable consequence when…
DSP architectures typically provide indirect addressing modes with autoincrement and decrement. In addition, indexing mode is generally not available, and there are usually few, if any, general-purpose registers. Hence, it is necessary to use address registers and perform address arithmetic to access automatic variables. Subsuming the address arithmetic…
System-level design issues become critical as implementation technology evolves towards increasingly complex integrated circuits and the time-to-market pressure continues relentlessly. To cope with these issues, new methodologies that emphasize re-use at all levels of abstraction are a "must", and this is a major focus of our work in the Gigascale…
Recent developments in programmable, highly parallel Graphics Processing Units (GPUs) have enabled high-performance implementations of machine learning algorithms. We describe a solver for Support Vector Machine training running on a GPU, using the Sequential Minimal Optimization algorithm and an adaptive first- and second-order working set selection…
In this tutorial, we take a fresh look at the problems posed by deep submicron (DSM) geometries and re-open the investigation into how DSM effects are most likely going to affect future design methodologies. We describe a comprehensive approach to accurately characterizing the device and interconnect characteristics of present and future process generations. This…
Communication-based design represents a formal-methods approach to system-on-a-chip design that considers communication between components as important as the computations they perform. Our “network-on-chip” approach partitions the communication into layers to maximize reuse and provide a programmer with an abstraction of the underlying…