Teaching Data Science

Robert J. Brunner and Edward J. Kim


Data Science in 2020: Computing, Curricula, and Challenges for the Next 10 Years
In the past 10 years, new data science courses and programs have proliferated at the collegiate level. As faculty and administrators enter the race to provide data science training and
A Fresh Look at Introductory Data Science
A case study of an introductory undergraduate course in data science designed to address the needs of graduates trained in both the statistical and the computational set of skills required to effectively plan, acquire, manage, analyze, and communicate the findings of such data.
Programming Paradigms for Computational Science: Three Fundamental Models
The key mental models found to be essential to understanding solution design are discussed, along with some insights on additional elements found important in understanding the specifics of current practice in data analysis tasks.
Integrating Data Science into a General Education Information Technology Course: An Approach to Developing Data Savvy Undergraduates
A survey IT course can provide comprehensive introductory data science education by adding a data science module focused on modeling and evaluation, two key steps in the data science process.
What is a Data Science/Analytics Degree?
This panel will foster a debate about the key learning objectives of data science / analytics programs and whether there should be different types of data science related programs (such as an applied data science program or a business analytics program in addition to data science programs).
An Empirical Approach to Understanding Data Science and Engineering Education
This working group report shows an empirical and data-driven view of the data-related education landscape, and includes several recommendations for both academia and industry that are based on this analysis.
Cross-Disciplinary Faculty Development in Data Science Principles for Classroom Integration
This paper presents a cross-disciplinary instructional program model designed to narrow the data science instruction gap for faculty and provides individualized and group-based support structures to instill data science principles and transition them from learners to educators in data science.
Helping Data Science Students Develop Task Modularity
A mixed-methods study of a project-based data science class that evaluated how effectively students divided a project into appropriately sized modular tasks, suggesting that while data science students can appreciate the value of task modularity, they struggle to achieve it.
Towards Open-World Scenarios: Teaching the Social Side of Data Science
This article looks at how to make data science classes more relevant to real-world problems; student engagement with real problems has the potential to stimulate learning, exchange, and serendipity on all sides.


Data Science in Statistics Curricula: Preparing Students to “Think with Data”
Examples and resources for instructors to implement data science in their own statistics curricula are provided, along with examples of assignments designed for courses that foster engagement of undergraduates with data and data science.
Computing in the Statistics Curricula
An approach to teaching these topics in combination with scientific problems and modern statistical methods that focuses on ideas and skills for statistical inquiry and working with data is presented.
ASA 2009 Data Expo
The ASA Statistical Computing and Graphics Data Expo is a biannual data exploration challenge; the 2009 data set consisted of flight arrival and departure details for all commercial flights on major carriers within the USA, from October 1987 to April 2008.
50 Years of Data Science
A vision of data science is presented based on the activities of people who are “learning from data,” and an academic field dedicated to improving that activity in an evidence-based manner is described, being able to accommodate the same short-term goals.
Think Stats
This concise introduction shows you how to perform statistical analysis computationally, rather than mathematically, with programs written in Python, to help you learn the entire data analysis process.
Scikit-learn: Machine Learning in Python
Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing
Mere Renovation is Too Little Too Late: We Need to Rethink our Undergraduate Curriculum from the Ground Up
The last half-dozen years have seen The American Statistician publish well-argued and provocative calls to change our thinking about statistics and how we teach it, among them Brown and Kass, Nolan
Data scientist: the sexiest job of the 21st century.
Harvard Business School's Davenport and Greylock's Patil take a deep dive on what organizations need to know about data scientists: where to look for them, how to attract and develop them, and how to spot a great one.
IPython: A System for Interactive Scientific Computing
The IPython project provides an enhanced interactive environment that includes, among other features, support for data visualization and facilities for distributed and parallel computation.
The NumPy Array: A Structure for Efficient Numerical Computation
This effort shows that NumPy performance can be improved through three techniques: vectorizing calculations, avoiding copying data in memory, and minimizing operation counts.
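The three techniques named in this summary can be illustrated with a small sketch. It is not taken from the cited paper: the function names `rms_loop` and `rms_vectorized` and the sample array are hypothetical, chosen only to show each idea once with standard NumPy calls.

```python
import numpy as np


def rms_loop(values):
    """Root mean square with an explicit Python loop (slow baseline)."""
    total = 0.0
    for v in values:
        total += v * v
    return (total / len(values)) ** 0.5


def rms_vectorized(values):
    """Same quantity, computed with one vectorized dot product."""
    return float(np.sqrt(np.dot(values, values) / values.size))


# Hypothetical sample data: [1., 2., 3., 4., 5., 6.]
data = np.arange(1.0, 7.0)

# 1. Vectorizing calculations: the array expression replaces the loop.
loop_result = rms_loop(data)
vector_result = rms_vectorized(data)

# 2. Avoiding copies: basic slicing returns a view on the same memory,
#    while indexing with a list of positions allocates a new array.
view = data[::2]
copy = data[[0, 2, 4]]
view_shares_memory = np.shares_memory(data, view)
copy_shares_memory = np.shares_memory(data, copy)

# 3. Minimizing operation counts: a*x + b*x needs two array multiplies
#    and one array add; (a + b)*x needs a single array multiply.
a, b = 2.0, 3.0
many_ops = a * data + b * data
few_ops = (a + b) * data
```

On small arrays the difference is negligible; the rewrites matter when the arrays are large enough that each extra pass over memory is noticeable.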