Fixing Dockerfile Smells: An Empirical Study

Giovanni Rosa, Simone Scalabrino, Rocco Oliveto
Background. Containerization technologies are widely adopted in DevOps workflows. The most commonly used one is Docker, which requires developers to define a specification file (Dockerfile) to build the image used for creating containers. There are several best-practice rules for writing Dockerfiles, but developers do not always follow them. Violations of such practices, known as Dockerfile smells, can negatively impact the reliability and performance of Docker images. Previous studies…

Refactorings and Technical Debt in Docker Projects: An Empirical Study

The results indicate that developers refactor these Docker projects for a variety of reasons that are specific to the configuration, combination, and execution of containers, leading to several new technical debt categories and refactoring types compared to existing refactoring domains.

An empirical study on self-admitted technical debt in Dockerfiles

A manual classification of self-admitted technical debt (SATD) in Dockerfiles was conducted, finding that about 3.0% of the comments in Dockerfiles are SATD and that these SATD comments relate to lowered maintainability, testing, and defects.

Revisiting Dockerfiles in Open Source Software Over Time

  • Kalvin Eng, Abram Hindle
  • Computer Science
    2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR)
  • 2021
A historical view of the Dockerfile format is contributed by analyzing the Docker engine changelogs; this history is used to enhance the analysis of Dockerfiles and to reconfirm previous findings of a downward trend in using OS images and an upward trend in using language images.

Characterizing the Occurrence of Dockerfile Smells in Open-Source Software: An Empirical Study

An empirical study on a large dataset of 6,334 projects to help developers gain insight into the occurrence of Dockerfile smells, including their coverage, distribution, co-occurrence, and correlation with project characteristics.

Learning from, Understanding, and Supporting DevOps Artifacts for Docker

A toolset, binnacle, is introduced that enabled the ingestion of 900,000 GitHub repositories and the learning of rules, along with an analyzer that can be used to aid developers in the IDE when creating Dockerfiles and, in a post-hoc fashion, to identify issues in and improve existing Dockerfiles.

One size does not fit all: an empirical study of containerized continuous deployment workflows

A mixed-methods study sheds light on developers' experiences and expectations with containerized CD workflows and finds two prominent workflows, based on either the automated builds feature on Docker Hub or continuous integration services, with different trade-offs.

Configuration smells in continuous delivery pipelines: a linter and a six-month study on GitLab

CD-Linter is proposed, a semantic linter that automatically identifies four different smells in pipeline configuration files. It achieves a precision of 87% and a recall of 94%, and the smells it detects are frequently observed in the wild.

World of Code: An Infrastructure for Mining the Universe of Open Source VCS Data

World of Code (WoC) is a very large and frequently updated collection of version control data for FLOSS projects. It is capable of supporting trend evaluation, ecosystem measurement, and the determination of package usage, and is expected to spur investigation into global properties of OSS development, leading to increased resiliency of the entire OSS ecosystem.

On the Relation between Outdated Docker Containers, Severity Vulnerabilities, and Bugs

It is argued that Docker container scanning and security management tools should improve their platforms by adding data about other kinds of bugs and by including a measurement of technical lag, so that deployers know when to update.

A Large-scale Data Set and an Empirical Study of Docker Images Hosted on Docker Hub

The results demonstrate the maturity of the Docker ecosystem: more reliance on ready-to-use language and application base images as opposed to yet-to-be-configured OS images, a downward trend in Docker image sizes that reflects the adoption of best practices for keeping images small, and a declining trend in the number of smells that suggests a general improvement in quality.