The performance and energy-efficiency advantages of customized architectures are well known and widely pursued. To date, however, there has been no systematic basis for balancing the benefits of specialization against general-purpose coverage, much less for assembling accelerators that collectively support a general-purpose workload. Our study is a first step toward such a basis for heterogeneous architectures. We analyze a collection of 34 programs drawn from five major benchmark suites and a few independent sources, producing clusters of critical loops based on operation and datatype as well as three dimensions of memory behavior. The computational characteristics of each cluster represent an opportunity for a customized architecture. The operation and datatype analysis produces 25 multi-loop clusters, accounting for over 90% of the computations. Memory behavior studies produce a similar number of clusters. The clusters can be exploited individually or in groups, enabling specialization to be traded systematically against generality as proposed in Borkar and Chien’s “10x10”, influencing choices of accelerators, accelerator breadth, and ensembles of accelerators (in an SoC). For several clusters, we discuss examples of how they might be exploited architecturally for improved performance or energy efficiency. The fact that there are a reasonable number of clusters, and that these clusters span multiple application domains, demonstrates that architecture design need not cater to each domain individually, and bodes well for 10x10 research. We also show that the clusters are coherent, distinct, and stable, and that the applications are a good representation of general-purpose workloads.