Floem: A Programming System for NIC-Accelerated Network Applications
TLDR
FLOEM, a language, compiler, and runtime for programming NIC-accelerated applications, is designed and used to explore NIC-offloading designs of real-world applications, including a key-value store and a distributed real-time data analytics system.
Portable performance on heterogeneous architectures
TLDR
A programming model in which the best mapping of programs to processors and memories is determined empirically, and the rich choice space allows the autotuner to construct poly-algorithms that combine many different algorithmic techniques, using both the CPU and the GPU, to obtain better performance than any one technique alone.
E3: Energy-Efficient Microservices on SmartNIC-Accelerated Servers
TLDR
This work presents E3, a microservice execution platform for SmartNIC-accelerated servers that follows the design philosophies of the Azure Service Fabric microservice platform and extends key system components to a SmartNIC to address the above-mentioned challenges.
Chlorophyll
Speaking generally, the plant kingdom is green. The green substance is usually spoken of as chlorophyll, which is really a mixture of two closely allied substances, chlorophyll a and chlorophyll b. The
Scaling up Superoptimization
TLDR
LENS is a search algorithm that increases the size of code a superoptimizer can synthesize by rapidly pruning away invalid candidate programs, and it exploits the stochastic search to make random jumps in a large candidate space and a symbolic search to synthesize arbitrary constants.
Communication-minimizing 2D convolution in GPU registers
TLDR
This work reorganizes the convolution algorithm to prefetch image regions into registers and to do more work per thread with fewer threads. To enable portability to future architectures, it implements a convolution autotuner that sweeps the design space of memory layouts and loop unrolling configurations.
A Comparison of Error Metrics for Learning Model Parameters in Bayesian Knowledge Tracing
TLDR
This work compares several metrics, including log-likelihood (LL), RMSE, and AUC, to evaluate which metric is most suited for learning parameters in the knowledge-tracing model and shows that RMSE is significantly better than LL and AUC.
Chlorophyll: synthesis-aided compiler for low-power spatial architectures
We developed Chlorophyll, a synthesis-aided programming model and compiler for the GreenArrays GA144, an extremely minimalist low-power spatial architecture that requires partitioning the program
High-Coverage Hint Generation for Massive Courses: Do Automated Hints Help CS1 Students?
TLDR
This work presents a robust hint generation system that extends the coverage of the mutation-based approach using two complementary techniques, and shows that hints contributed to students' progress while still encouraging the students to solve problems by themselves.