Array programming with NumPy

  title={Array programming with NumPy},
  author={Charles R. Harris and K. Jarrod Millman and St{\'e}fan van der Walt and Ralf Gommers and Pauli Virtanen and David Cournapeau and Eric Wieser and Julian Taylor and Sebastian Berg and Nathaniel J. Smith and Robert Kern and Matti Picus and Stephan Hoyer and Marten Henric van Kerkwijk and Matthew Brett and Allan Haldane and Jaime Fern{\'a}ndez del R{\'i}o and Mark Wiebe and Pearu Peterson and Pierre G{\'e}rard-Marchant and Kevin Sheppard and Tyler Reddy and Warren Weckesser and Hameer Abbasi and Christoph Gohlke and Travis E. Oliphant},
  pages={357 - 362}
Array programming provides a powerful, compact and expressive syntax for accessing, manipulating and operating on data in vectors, matrices and higher-dimensional arrays. NumPy is the primary array programming library for the Python language. It has an essential role in research analysis pipelines in fields as diverse as physics, chemistry, astronomy, geoscience, biology, psychology, materials science, engineering, finance and economics. For example, in astronomy, NumPy was an important part of… 
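As a rough illustration of the "accessing, manipulating and operating" syntax the abstract refers to, here is a minimal sketch using NumPy. The array contents and variable names are made up for illustration and do not come from the paper.

```python
import numpy as np

# Hypothetical sample data: 3 measurements (rows) of 4 channels (columns).
data = np.arange(12, dtype=float).reshape(3, 4)

# Accessing: slicing selects the first two columns of every row.
first_two = data[:, :2]

# Manipulating: a boolean mask selects elements without an explicit loop.
large = data[data > 6]

# Operating: broadcasting subtracts the per-column mean from every row.
centered = data - data.mean(axis=0)

# Reducing along an axis: one sum per row.
row_sums = data.sum(axis=1)
```

Each statement acts on whole arrays at once, which is the compactness the abstract describes; the equivalent pure-Python code would need nested loops.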
Distributional Generalization: A New Kind of Generalization
We introduce a new notion of generalization -- Distributional Generalization -- which roughly states that outputs of a classifier at train and test time are close *as distributions*, as opposed to
Multiphase turbulence in galactic haloes: effect of the driving
Supernova explosions, active galactic nuclei jets, galaxy–galaxy interactions and cluster mergers can drive turbulence in the circumgalactic medium (CGM) and in the intracluster medium (ICM).
Walle: An End-to-End, General-Purpose, and Large-Scale Production System for Device-Cloud Collaborative Machine Learning
This work adopts device-cloud collaborative ML and builds the first end-to-end and general-purpose system, called Walle, which consists of a deployment platform, distributing ML tasks to billion-scale devices in time; a data pipeline, preparing task input; and a compute container, providing a cross-platform and high-performance execution environment, while facilitating daily task iteration.
Contour: A semi-automated segmentation and quantitation tool for cryo-soft-X-ray tomography
Contour is a new, easy-to-use, highly automated segmentation tool that accelerates the segmentation of tomograms into distinct cellular compartments. High-contrast compartments such as mitochondria, lipid droplets, and features at the cell surface can be easily segmented with this technique in the context of investigating herpes simplex virus 1 infection.
A New 27 Class Sign Language Dataset Collected from 173 Individuals
A new dataset was presented with this paper, created by processing American Sign Language-based photographs collected from 173 volunteers, published as "27 Class Sign Language Dataset" on the Kaggle Datasets web page.
Stress Concentration Factors in Excavation Repairs of Surface Defects in Forgings and Castings
This paper provides an analytical formula for the theoretical stress concentration factor in a common type of excavation repair for large forgings and castings. Mechanical components obtained with
Algorithmic Pulsar Timing
The Algorithmic Pulsar Timer (APT) is an algorithm that can accurately phase connect and time isolated pulsars. It is the first of its kind in pulsar timing and sets the foundation for automated fitting of binary pulsar systems.
Two useful Python tools - dimpy and tablefile for data analysis applications
This paper discusses a Python tool, ‘dimpy’, which can easily generate any multidimensional ‘list’ type array in Python.
Contour, a semi-automated segmentation and quantitation tool for cryo-soft-X-ray tomography
Contour is a new, easy-to-use, highly automated segmentation tool that accelerates the segmentation of tomograms into distinct cellular compartments. It can also extract geometric measurements from 3D segmented volumes, providing a new method to quantitate cryo-soft-X-ray tomography data.
Two useful Python tools and their application in Physics
Python has become a popular programming language among physicists and students/researchers of other fields as well. However, Python still needs improvement to provide ease of use in physical


numarray : A New Scientific Array Package for Python
Python has long had an array module (Numeric) for science and engineering applications; why a replacement? We explain the motivations for developing numarray, which are primarily, though not entirely
xarray: N-D labeled arrays and datasets in Python
This approach combines an application programming interface (API) inspired by pandas with the Common Data Model for self-described scientific data to provide a toolkit and data structures for N-dimensional labeled arrays.
SciPy 1.0: fundamental algorithms for scientific computing in Python
An overview of the capabilities and development practices of SciPy 1.0 is provided and some recent technical developments are highlighted.
Pythran: enabling static optimization of scientific Python programs
Pythran is an open source static compiler that turns modules written in a subset of the Python language into native ones that take advantage of modern C++11 features such as variadic templates, type inference, move semantics and perfect forwarding, as well as classical idioms such as expression templates.
PyTorch: An Imperative Style, High-Performance Deep Learning Library
This paper details the principles that drove the implementation of PyTorch and how they are reflected in its architecture, and explains how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance.
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
This paper describes the TensorFlow interface and an implementation of that interface built at Google, which has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields.
Julia: A Fresh Approach to Numerical Computing
The Julia programming language and its design is introduced---a dance between specialization and abstraction, which recognizes what remains the same after computation, and which is best left untouched as they have been built by the experts.
Numba: a LLVM-based Python JIT compiler
This paper presents Numba, a just-in-time compiler for Python that focuses on scientific and array-oriented computing. It compiles a subset of the language into efficient machine code that is comparable in performance to a traditional compiled language.
AUGEM: Automatically generate high performance Dense Linear Algebra kernels on x86 CPUs
A template-based optimization framework, AUGEM, is presented, which can automatically generate fully optimized assembly code for several dense linear algebra kernels, such as GEMM, GEMV, AXPY and DOT, on varying multi-core CPUs without requiring any manual interference from developers.
SunPy—Python for solar physics
Though still in active development, SunPy already provides important functionality for solar data analysis, and future releases will build upon and integrate with current work in the Astropy project and the rest of the scientific python community, to bring greater functionality to SunPy users.