Corpus ID: 233169038

DoubleML - An Object-Oriented Implementation of Double Machine Learning in Python

@article{Bach2022DoubleMLA,
  title={DoubleML - An Object-Oriented Implementation of Double Machine Learning in Python},
  author={Philipp Bach and Victor Chernozhukov and Malte S. Kurz and Martin Spindler},
  journal={J. Mach. Learn. Res.},
  year={2022},
  volume={23},
  pages={53:1-53:6}
}
DoubleML is an open-source Python library implementing the double machine learning framework of Chernozhukov et al. (2018) for a variety of causal models. It provides functionality for valid statistical inference on causal parameters when the nuisance parameters are estimated with machine learning methods. The object-oriented implementation of DoubleML provides high flexibility in model specification and makes the package easily extendable. The package is distributed under the MIT…
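
The idea summarized above can be illustrated with a minimal sketch of cross-fitted double machine learning for a partially linear regression model, Y = theta*D + g(X) + eps. This is not the DoubleML API: the function names (`fit_linear`, `dml_plr`, `simulate`), the simulated data, and the use of a one-covariate least-squares fit as a stand-in for a machine learning nuisance learner are all assumptions made for illustration.

```python
import random


def fit_linear(xs, ys):
    """Least-squares fit with one covariate; returns a prediction function.

    Hypothetical stand-in for an arbitrary machine learning nuisance learner.
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def dml_plr(y, d, x, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of theta and its standard error."""
    n = len(y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    res_y = [0.0] * n
    res_d = [0.0] * n
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        # Nuisance fits use only the training folds (cross-fitting).
        l_hat = fit_linear([x[i] for i in train], [y[i] for i in train])  # E[Y|X]
        m_hat = fit_linear([x[i] for i in train], [d[i] for i in train])  # E[D|X]
        for i in test:
            res_y[i] = y[i] - l_hat(x[i])
            res_d[i] = d[i] - m_hat(x[i])
    # Regress the outcome residuals on the treatment residuals.
    den = sum(v * v for v in res_d)
    theta = sum(u * v for u, v in zip(res_d, res_y)) / den
    # Standard error based on the score (res_y - theta * res_d) * res_d.
    psi2 = sum(((ry - theta * rd) * rd) ** 2 for ry, rd in zip(res_y, res_d)) / n
    j = den / n
    se = (psi2 / (j * j) / n) ** 0.5
    return theta, se


def simulate(n=2000, theta=0.5, seed=1):
    """Simulated data with a known effect theta (illustrative only)."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    d = [0.8 * xi + rng.gauss(0, 1) for xi in x]
    y = [theta * di + 1.5 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]
    return y, d, x


y, d, x = simulate()
theta_hat, se = dml_plr(y, d, x)
print(f"theta_hat = {theta_hat:.3f}, se = {se:.3f}")
```

In this sketch the nuisance functions E[Y|X] and E[D|X] are fitted on the training folds only and evaluated on the held-out fold, so each residual comes from a model that never saw that observation. Any other learner with the same fit-then-predict shape could be swapped in for `fit_linear`.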

Coordinated Double Machine Learning

TLDR
This paper argues that a carefully coordinated learning algorithm for deep neural networks may reduce estimation bias, and validates the improved empirical performance of the proposed method through numerical experiments on both simulated and real data.

HiPart: Hierarchical Divisive Clustering Toolbox

This paper presents the HiPart package, an open-source native Python library that provides efficient and interpretable implementations of divisive hierarchical clustering algorithms. HiPart supports

Necessary and sufficient graphical conditions for optimal adjustment sets in causal graphical models with hidden variables

TLDR
The problem of selecting optimal backdoor adjustment sets to estimate causal effects in graphical models with hidden and conditioned variables is addressed. Optimality is characterized as maximizing a certain adjustment information, which makes it possible to derive a necessary and sufficient graphical criterion for the existence of an optimal adjustment set.

DoubleML - An Object-Oriented Implementation of Double Machine Learning in R

TLDR
This paper serves as an introduction to the double machine learning framework and the R package DoubleML and demonstrates how DoubleML users can perform valid inference based on machine learning methods.

References

mlr3: A modern object-oriented machine learning framework in R

TLDR
The R (R Core Team, 2019) package mlr3 is a complete reimplementation of the mlr (Bischl et al., 2016) package that leverages many years of experience and learned best practices to provide a state-of-the-art system that is powerful, flexible, extensible, and maintainable.

Scikit-learn: Machine Learning in Python

Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing

CausalML: Python Package for Causal Machine Learning

TLDR
The key concepts, scope, and use cases of the CausalML package are introduced. The package tries to bridge the gap between theoretical work on methodology and practical applications by making a collection of methods in this field available in Python.

SciPy 1.0: fundamental algorithms for scientific computing in Python

TLDR
An overview of the capabilities and development practices of SciPy 1.0 is provided and some recent technical developments are highlighted.

Double/debiased machine learning for difference-in-differences models

This paper provides an orthogonal extension of the semiparametric difference-in-differences estimator proposed in earlier literature. The proposed estimator enjoys the so-called Neyman

Array programming with NumPy

TLDR
How a few fundamental array concepts lead to a simple and powerful programming paradigm for organizing, exploring and analysing scientific data is reviewed.

Distributed Double Machine Learning with a Serverless Architecture

TLDR
A prototype Python implementation, DoubleML-Serverless, is provided for estimating double machine learning models on the serverless computing platform AWS Lambda, and its utility is demonstrated with a case study analyzing estimation times and costs.

Data Structures for Statistical Computing in Python

TLDR
pandas is a new library which aims to facilitate working with data sets common to finance, statistics, and other related fields and to provide a set of fundamental building blocks for implementing statistical models.

Orthogonal Machine Learning: Power and Limitations

TLDR
It is shown that the $n^{-1/4}$ rate requirement on nuisance parameter estimation, under which double machine learning provides $\sqrt{n}$-consistent estimates of parameters of interest, can be relaxed to $n^{-1/(2k+2)}$ by employing a $k$-th order notion of orthogonality that grants robustness to more complex or higher-dimensional nuisance parameters.

Statsmodels: Econometric and Statistical Modeling with Python

TLDR
The current relationship between statistics and Python and open source more generally is discussed, outlining how the statsmodels package fills a gap in this relationship.