Optimizing assignment of threads to SPEs on the Cell BE processor
In the Cell BE, the SPEs communicate over the Element Interconnect Bus (EIB). Congestion created by simultaneous communications reduces bandwidth utilization on the EIB. We observed that the bandwidth actually obtained for inter-SPE communication is strongly influenced by the assignment of threads to SPEs (Thread-SPE affinity). The major contributions of this work are to explain the reasons for the reduction in bandwidth utilization and to develop strategies for building effective thread-SPE mapping schemes that optimize applications with inherent inter-thread communication. The default assignment scheme is somewhat random, which sometimes leads to poor affinities and sometimes to good ones. For some common communication patterns that we studied, we identified a particular affinity that yields performance close to twice the average performance of the default affinity. Applying these results to a communication-intensive Monte Carlo particle simulation application improved its performance by around 10%-12%. We expect that image and signal processing applications that follow a pipelined model of operation will benefit greatly from an optimal Thread-SPE affinity. We also discuss the optimization of affinity on a Cell Blade. We then describe a communication model tool, built on these observations, that aids in choosing a good affinity given the communication pattern of the application.
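To make the idea of scoring an affinity against a communication pattern concrete, the following is a minimal sketch in Python. It is not the communication model tool described here, and it does not reproduce the actual EIB. It assumes a hypothetical single bidirectional ring with eight SPE slots, routes every transfer along the shorter direction, and scores an affinity by the largest number of transfers sharing one directed link. The names `links_used`, `max_link_load`, and `best_affinity` are illustrative only.

```python
from itertools import permutations

# Assumption: 8 SPE slots on one bidirectional ring. The real EIB has several
# rings and other attached elements, which this toy model ignores.
RING_SIZE = 8


def links_used(src, dst, n=RING_SIZE):
    """Return the directed links crossed on the shorter path from slot src to slot dst."""
    cw = (dst - src) % n
    if cw <= n - cw:  # go clockwise; ties also go clockwise
        return [("cw", (src + i) % n) for i in range(cw)]
    return [("ccw", (src - i) % n) for i in range(n - cw)]


def max_link_load(pattern, affinity, n=RING_SIZE):
    """Score an affinity: the largest number of transfers sharing one directed link.

    pattern  -- list of (sender_thread, receiver_thread) pairs
    affinity -- affinity[thread] is the ring slot that thread is placed on
    """
    load = {}
    for sender, receiver in pattern:
        for link in links_used(affinity[sender], affinity[receiver], n):
            load[link] = load.get(link, 0) + 1
    return max(load.values(), default=0)


def best_affinity(pattern, n=RING_SIZE):
    """Exhaustively search all placements and return one with minimal link load."""
    return min(permutations(range(n)), key=lambda a: max_link_load(pattern, a, n))


# Ring pattern: thread i sends to thread (i + 1) mod 8.
ring_pattern = [(i, (i + 1) % RING_SIZE) for i in range(RING_SIZE)]
neighbour_affinity = (0, 1, 2, 3, 4, 5, 6, 7)   # communicating threads are adjacent
interleaved_affinity = (0, 4, 1, 5, 2, 6, 3, 7)  # communicating threads are far apart
```

Under this toy model, the neighbour placement gives a maximum link load of 1 for the ring pattern, while the interleaved placement gives 4, which illustrates how the same pattern can see very different contention depending on where threads land. Exhaustive search is only practical here because eight slots give 40320 placements.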