We describe the design of a C++ vector-manipulation substrate that allows first-order optimization algorithms to be expressed in a concise and readable manner, yet still achieve high performance in parallel computing environments. We use standard object-oriented techniques of encapsulation and operator overloading, combined with a novel “symbolic temporaries” delayed-evaluation system that greatly reduces the overhead induced by compiler temporaries and economizes on memory references. We also provide infrastructure to support line-search methods by transparently caching function values and gradients at previously visited points, so that this bookkeeping does not “clutter” the principal implementation. We demonstrate the usefulness of our vector-substrate tools by employing them to efficiently solve large-scale LASSO problems using hundreds of processor cores. We reformulate the LASSO problem as a bound-constrained quadratic optimization problem, and then solve it using the Spectral Projected Gradient (SPG) method implemented through our vector-manipulation substrate.

Acknowledgements. This research was supported in part by National Science Foundation grant CCF-1115638. The authors acknowledge the Texas Advanced Computing Center (TACC) at The University of Texas at Austin for providing HPC resources that have contributed to the research results reported within this paper. See http://www.tacc.utexas.edu.
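
To make the delayed-evaluation idea mentioned in the abstract concrete, the following is a minimal serial sketch of how operator overloading can defer a vector update until assignment, so that no temporary vector is allocated. It is not the paper's “symbolic temporaries” implementation: the names `Vec`, `Scaled`, and `AxpbyExpr` are hypothetical, only expressions of the form `a*x + b*y` are handled, and all operands are assumed to have the same length.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical illustration only; none of these names come from the paper.
struct Vec;

// Records "a * x" without computing it.
struct Scaled {
    double a;
    const Vec& x;
};

// Records "a * x + b * y" without computing it.
struct AxpbyExpr {
    double a;
    const Vec& x;
    double b;
    const Vec& y;
};

struct Vec {
    std::vector<double> data;

    explicit Vec(std::size_t n, double v = 0.0) : data(n, v) {}

    // The recorded expression is evaluated here, in a single fused loop.
    // Assumes e.x and e.y have the same length as *this.
    Vec& operator=(const AxpbyExpr& e);
};

// The operators below only build small expression records;
// they perform no arithmetic on vector elements.
inline Scaled operator*(double a, const Vec& x) { return {a, x}; }

inline AxpbyExpr operator+(const Scaled& l, const Scaled& r) {
    return {l.a, l.x, r.a, r.x};
}

inline AxpbyExpr operator-(const Scaled& l, const Scaled& r) {
    return {l.a, l.x, -r.a, r.x};
}

inline AxpbyExpr operator-(const Vec& x, const Scaled& r) {
    return {1.0, x, -r.a, r.x};
}

inline Vec& Vec::operator=(const AxpbyExpr& e) {
    // Each element is read before it is written, so the target may also
    // appear on the right-hand side (for example x = x - alpha * g).
    for (std::size_t i = 0; i < data.size(); ++i) {
        data[i] = e.a * e.x.data[i] + e.b * e.y.data[i];
    }
    return *this;
}
```

With these definitions, a gradient step can be written as `x = x - alpha * g;` and is carried out in one pass over the data. A naive overloading scheme would instead materialize `alpha * g` and then `x - (alpha * g)` as separate temporary vectors before copying the result into `x`.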