Creating a high-throughput sparse matrix-vector multiplication (SpMxV) implementation depends on a balanced system design. In this paper, we introduce an innovative SpMxV Solver designed for FPGAs (SSF). Besides high computational throughput, system performance is optimized by reducing initialization time and overheads, minimizing and overlapping I/O …
This paper is concerned with the problem of controlling plants over communication channels, where the plant is subject to two types of unstructured uncertainty: additive uncertainty and stable coprime factor uncertainty. Necessary lower bounds on the rate of transmission (or channel capacity) C, for robust stabilization, are computed explicitly. …
This paper proposes a high-performance least squares solver on FPGAs using the Cholesky decomposition method. Our design can be realized by iteratively adopting a single triangular linear equation solver for modified Cholesky decomposition and forward/backward substitutions. Good performance is achieved by optimizing the Cholesky factorization algorithms, …
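To make the structure of such a solver concrete, here is a minimal software sketch in Python of least squares via the normal equations, a Cholesky factorization, and forward/backward substitution. It is an illustration only, not the FPGA design described in the abstract: it uses the standard Cholesky factorization rather than the modified variant the abstract mentions, and the function names (`cholesky`, `forward_sub`, `backward_sub`, `least_squares`) are hypothetical.

```python
import math


def cholesky(a):
    """Return lower-triangular L with a = L * L^T (a must be symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: remove the contribution of the columns already computed.
        s = a[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(s)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            t = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = t / L[j][j]
    return L


def forward_sub(L, b):
    """Solve L y = b for lower-triangular L."""
    y = [0.0] * len(b)
    for i in range(len(b)):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    return y


def backward_sub(L, y):
    """Solve L^T x = y, reading L^T implicitly from the lower-triangular L."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def least_squares(A, b):
    """Minimize ||A x - b||_2 by solving the normal equations (A^T A) x = A^T b."""
    m, n = len(A), len(A[0])
    AtA = [[sum(A[r][i] * A[r][j] for r in range(m)) for j in range(n)] for i in range(n)]
    Atb = [sum(A[r][i] * b[r] for r in range(m)) for i in range(n)]
    L = cholesky(AtA)
    return backward_sub(L, forward_sub(L, Atb))


# Fit y = c0 + c1 * x to the points (0, 1), (1, 3), (2, 5).
coeffs = least_squares([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]], [1.0, 3.0, 5.0])
```

Both substitution routines are triangular solves, and each column of the factorization has the same multiply-accumulate-then-divide shape, which is consistent with the abstract's point that one triangular solver can be reused across all three phases.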
In container terminal operations, the berth scheduling problem is one of the main bottlenecks that keep terminals from reducing ship turnaround time and operating costs. In this paper we describe a nonlinear model for the berth scheduling problem and solve this model with a genetic algorithm (GA) and the hybrid …
Sparse matrix-vector multiplication (SpMxV) accelerator designs have shown higher peak performance on Field Programmable Gate Arrays (FPGAs) than on microprocessors. However, due to the frequent memory movement in SpMxV, system performance in real applications is heavily affected by memory bandwidth and overheads. In this paper, we introduce an innovative …
Cholesky decomposition has been widely used to factorize symmetric positive definite matrices when solving least squares problems. Various parallel accelerators, including GPUs and FPGAs, have been explored to improve performance. In this paper, Cholesky decomposition is implemented on both FPGAs and GPUs by designing a dedicated architecture for FPGAs and …
In performance modeling of parallel synchronous iterative applications, the longest individual execution time among parallel processors determines the iteration time and often must be estimated for performance analysis. This involves the mean maximum calculation which has been a challenge in computer modeling for a long time. For large systems, numerical …