The high computational cost of complex engineering optimization problems has motivated the development of parallel optimization algorithms. A recent example is the parallel particle swarm optimization (PSO) algorithm, which is valuable due to its global search capabilities. Unfortunately, because existing parallel implementations are synchronous (PSPSO), …
Storage, memory, processor, and communications bandwidth are all relatively plentiful and inexpensive. However, a growing expense in the operation of computer networks is electricity usage. By some estimates, devices connected to the Internet consume about 2%, and growing, of the total electricity produced in the USA; much of this power consumption is …
Present-day engineering optimization problems often impose large computational demands, resulting in long solution times even on a modern high-end processor. To obtain enhanced computational throughput and global search capability, we detail the coarse-grained parallelization of an increasingly popular global search method, the particle swarm optimization …
Dynamic patient-specific musculoskeletal models have great potential for addressing clinical problems in orthopedics and rehabilitation. However, their predictive capability is limited by how well the underlying kinematic model matches the patient's structure. This study presents a general two-level optimization procedure for tuning any multi-joint …
The global address space (GAS) programming model provides important potential productivity advantages over traditional parallel programming models. Languages using the GAS model currently have insufficient support from existing performance analysis tools, due in part to their implementation complexity. We have designed the Global Address Space Performance …
Chip-multiprocessor (CMP) architectures present a challenge for efficient simulation, combining the requirements of a detailed microprocessor simulator with those of a tightly-coupled parallel system. In this paper, a distributed simulator for target CMPs is presented, based on the Message Passing Interface (MPI) and designed to run on a host cluster of …
System-level design presents special simulation modeling challenges. System-level models address the architectural and functional performance of complex systems. Systems are decomposed into a series of interacting subsystems. Architectures define subsystems, the interconnections between subsystems and contention for shared resources. Functions define the …
Given the complexity of parallel programs, developers often must rely on performance analysis tools to help them improve the performance of their code. While many tools support the analysis of message-passing programs, no tool exists that fully supports programs written in programming models that present a partitioned global address space (PGAS) to the …
Partial reconfiguration (PR) reveals many opportunities for integration into FPGA design for potential system optimizations such as reduced area, increased performance, and increased functionality. Even though recent advances in Xilinx's Virtex-4 and Virtex-5 FPGA devices and design tools significantly improve the practicality of incorporating PR, …