• Corpus ID: 246430305

Flashlight: Enabling Innovation in Tools for Machine Learning

Jacob Kahn, Vineel Pratap, Tatiana Likhomanenko, Qiantong Xu, Awni Y. Hannun, Jeff Cai, Paden Tomasello, Ann Lee, Edouard Grave, Gilad Avidov, Benoit Steiner, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert
Flashlight is an open-source library built to spur innovation in machine learning tools and systems by prioritizing open, modular, customizable internals and state-of-the-art, research-ready models and training setups across a variety of domains. Flashlight allows systems researchers to rapidly prototype and experiment with novel ideas in machine learning computation and has low overhead, competing with and often outperforming other popular machine learning frameworks. We see Flashlight as a tool enabling research that can benefit widely used libraries…


Pseudo-Labeling for Massively Multilingual Speech Recognition

This work proposes a simple pseudo-labeling recipe that works well even with low-resource languages, and can yield a model with better performance for many languages that also transfers well to LibriSpeech.



Chainer: A Deep Learning Framework for Accelerating the Research Cycle

The Chainer framework is introduced, which intends to provide a flexible, intuitive, and high performance means of implementing the full range of deep learning models needed by researchers and practitioners.

Torch7: A Matlab-like Environment for Machine Learning

Torch7 is a versatile numeric computing framework and machine learning library that extends Lua and can easily be interfaced with third-party software thanks to Lua’s light interface.

DeepSpeed: System Optimizations Enable Training Deep Learning Models with Over 100 Billion Parameters

Explore new techniques in Microsoft's open source library called DeepSpeed, which advances large model training by improving scale, speed, cost, and usability, unlocking the ability to train

Flux: Elegant machine learning with Julia

  • Mike Innes
  • Computer Science
    J. Open Source Softw.
  • 2018
Flux is a library for machine learning (ML), written in the numerical computing language Julia, that applies automatic differentiation (AD) to calculate derivatives and train models.

Machine Learning Systems are Stuck in a Rut

This paper explains how the evolution of hardware accelerators favors compiler back ends that hyper-optimize large monolithic kernels, and shows how this reliance on high-performance but inflexible kernels reinforces the dominant style of programming model.

MIOpen: An Open Source Library For Deep Learning Primitives

  • Jehandad Khan, Paul Fultz, Mayank Daga
  • Computer Science
    Proceedings of the 30th International Conference on Computer Graphics and Machine Vision (GraphiCon 2020). Part 2
  • 2020
This paper introduces MIOpen and details the library's internal workings and supported features, including fusion to optimize for memory bandwidth and GPU launch overheads, and different algorithms that optimize convolutions for different filter and input sizes.

Automatic differentiation in PyTorch

This paper describes the automatic differentiation module of PyTorch, a library designed to enable rapid research on machine learning models. The module focuses on differentiation of purely imperative programs, with an emphasis on extensibility and low overhead.

Beyond Data and Model Parallelism for Deep Neural Networks

This paper defines SOAP, a more comprehensive search space of parallelization strategies for DNNs that covers the Sample, Operation, Attribute, and Parameter dimensions, and proposes FlexFlow, a deep learning framework that uses guided randomized search of the SOAP space to find a fast parallelization strategy for a specific parallel machine.

Learning to Optimize Tensor Programs

A learning-based framework to optimize tensor programs for deep learning workloads that learns domain-specific statistical cost models to guide the search of tensor operator implementations over billions of possible program variants and accelerates the search by effective model transfer across workloads.

PyTorch: An Imperative Style, High-Performance Deep Learning Library

This paper details the principles that drove the implementation of PyTorch and how they are reflected in its architecture, and explains how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance.