Seven Principles for Rapid-Response Data Science: Lessons Learned from Covid-19 Forecasting

  • Bin Yu and Chandan Singh
  • Statistical Science
In this article, we take a step back to distill seven principles out of our experience in the spring of 2020, when our 12-person rapid-response team used skills of data science and beyond to help distribute Covid PPE. This process included tapping into domain knowledge of epidemiology and medical logistics chains, curating a relevant data repository, developing models for short-term county-level death forecasting in the US, and building a website for sharing visualization (an automated AI… 

Data-Centric Epidemic Forecasting: A Survey

This survey delves into various data-driven methodological and practical advancements and introduces a conceptual framework to navigate through them, enumerates the large number of epidemiological datasets and novel data streams that are relevant to epidemic forecasting, and highlights some challenges and open problems found across the forecasting pipeline.

Statistics in times of increasing uncertainty

  • S. Richardson
  • Computer Science
    Journal of the Royal Statistical Society: Series A (Statistics in Society)
  • 2022
The statistical community mobilised vigorously from the start of the 2020 SARS‐CoV‐2 pandemic, and it is argued that these challenges gave impetus to fruitful new directions in the merging of statistical principles with constraints of agility, responsiveness and societal responsibilities.

Data Science in a Time of Crisis: Lessons from the Pandemic

The exceptional shock of the COVID-19 pandemic has brought about an equally exceptional scientific response, over a wide range of disciplines and with a spirit of collaboration and mutual support.

Statistical Challenges in Tracking the Evolution of SARS-CoV-2.

The models and methods currently used to monitor the spread of SARS-CoV-2 are described, long-standing and new statistical challenges are discussed, and a method for tracking the rise of novel variants during the epidemic is proposed.



Curating a COVID-19 data repository and forecasting county-level death counts in the United States

This paper presents the continuous curation of a large data repository containing COVID-19 information from a range of sources, and develops and combines multiple forecasts using ensembling techniques, resulting in an ensemble that achieves a coverage rate of more than 94% when averaged across counties for predicting cumulative recorded death counts two weeks in the future.
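
As a rough illustration of the coverage metric mentioned above, the sketch below computes, for each county, the fraction of days on which the observed count falls inside its prediction interval, and then averages those fractions across counties. The function name `mean_county_coverage`, the list-of-lists layout, and the toy numbers are all assumptions made for this example; this is not the paper's code or data, and the paper's exact definition of coverage may differ.

```python
def mean_county_coverage(lower, upper, observed):
    """Average, across counties, of the per-county interval hit rate.

    Each argument is a list of rows, one row per county and one entry
    per forecast day (a hypothetical layout chosen for this sketch).
    """
    per_county = []
    for lo_row, hi_row, obs_row in zip(lower, upper, observed):
        # True where the observed value lies inside [lower, upper].
        hits = [lo <= y <= hi for lo, hi, y in zip(lo_row, hi_row, obs_row)]
        per_county.append(sum(hits) / len(hits))
    return sum(per_county) / len(per_county)


# Made-up numbers for two counties over three days.
lower = [[0, 1, 2], [10, 12, 14]]
upper = [[2, 3, 4], [14, 16, 18]]
observed = [[1, 2, 5], [11, 13, 15]]

coverage = mean_county_coverage(lower, upper, observed)
print(f"mean coverage across counties: {coverage:.3f}")
```

Averaging per county first gives every county equal weight regardless of how many forecast days it contributes, which is one plausible reading of "averaged across counties".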

Imodels: a Python Package for Fitting Interpretable Models

This work provides users a simple interface for fitting and using state-of-the-art interpretable models, all compatible with scikit-learn, and provides a framework for developing custom tools and rule-based models for interpretability.

Scikit-learn: Machine Learning in Python

Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing

Statsmodels: Econometric and Statistical Modeling with Python

The current relationship between statistics and Python and open source more generally is discussed, outlining how the statsmodels package fills a gap in this relationship.

Algorithmic Learning in a Random World

A selection of books about type systems in programming languages, information theory, and machine learning that takes the randomness of the world into account, and verification of real time systems.

Agile Software Development: The People Factor

The effects of working in an agile style are described, and the problem it addresses and the way in which it addresses that problem are introduced.

Perceptual audio coding using adaptive pre- and post-filters and lossless compression

A subjective listening test of the combined pre-filter/lossless coder and a state-of-the-art perceptual audio coder (PAC) shows that the new method achieves a comparable compression ratio and audio quality with a lower delay.

An Interview with Bin Yu

  • Harvard Data Science Review
  • 2021