A novel approach for estimating Truck Factors

  title={A novel approach for estimating Truck Factors},
  author={Guilherme Amaral Avelino and Leonardo Teixeira Passos and Andr{\'e} C. Hora and Marco T{\'u}lio Valente},
  journal={2016 IEEE 24th International Conference on Program Comprehension (ICPC)},
Truck Factor (TF) is a metric proposed by the agile community as a tool to identify concentration of knowledge in software development environments. It states the minimal number of developers that have to be hit by a truck (or quit) before a project is incapacitated. In other words, TF helps to measure how prepared a project is to deal with developer turnover. Despite its clear relevance, few studies explore this metric. Altogether there is no consensus about how to calculate it, and no…
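
The definition in the abstract can be made concrete with a small sketch. It assumes a hypothetical mapping from file paths to the set of developers considered authors of each file, and it treats a project as incapacitated once more than half of its files have no remaining author. Both the input format and the 50% threshold are illustrative assumptions, not details stated in the abstract, and the greedy strategy shown here is a simple heuristic that should not be read as the authors' algorithm.

```python
from collections import Counter
from typing import Dict, Set


def estimate_truck_factor(file_authors: Dict[str, Set[str]],
                          orphan_threshold: float = 0.5) -> int:
    """Greedy truck factor estimate (illustrative heuristic only).

    Repeatedly removes the developer who is an author of the most files
    until more than `orphan_threshold` of all files have no author left.
    `file_authors` and `orphan_threshold` are hypothetical names.
    """
    # Work on a copy so the caller's mapping is not mutated.
    remaining = {path: set(authors) for path, authors in file_authors.items()}
    total = len(remaining)
    if total == 0:
        return 0

    removed = 0
    while True:
        orphaned = sum(1 for authors in remaining.values() if not authors)
        if orphaned > orphan_threshold * total:
            return removed

        # Count how many files each remaining developer is an author of.
        coverage = Counter(dev for authors in remaining.values() for dev in authors)
        if not coverage:
            # Nobody left to remove (only reachable with a threshold >= 1).
            return removed

        # Ties are broken alphabetically so the result is deterministic.
        top_dev = max(sorted(coverage), key=coverage.__getitem__)
        for authors in remaining.values():
            authors.discard(top_dev)
        removed += 1


if __name__ == "__main__":
    example = {
        "core.py": {"alice"},
        "api.py": {"alice", "bob"},
        "cli.py": {"bob"},
        "docs.md": {"carol"},
    }
    print(estimate_truck_factor(example))
```

In this toy input, removing the two developers with the widest file coverage leaves three of the four files without an author, so the estimate is 2. A greedy removal order gives an upper-bound style estimate and does not guarantee the true minimum.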

Algorithms for estimating truck factors: a comparative study

The results indicate that truck factor developers are in most cases a subset of core developers, a related concept commonly used to denote the key developers of open-source projects.

Bus Factor in Practice

With a survey of 269 engineers, it is found that the bus factor is perceived as an important problem in collective development, and a multimodal bus factor estimation algorithm that uses data on code reviews and meetings together with the VCS data is proposed.

Concentration Of Knowledge In Software Projects: An Empirical Assessment

This dissertation carries out a comparative study of algorithms that estimate Truck Factor, showing that truck factor developers are in most cases a subset of core developers, and recommends that measures of Truck Factor should consider the relative importance of the classes in a software project.

Turnover in Open-Source Projects: The Case of Core Developers

The results show that projects classified as Unstable (High Number of Core Developers Leavers and Newcomers) take a longer time to fix issues and bugs and to implement enhancements than other groups.

Should I Bug You? Identifying Domain Experts in Software Projects Using Code Complexity Metrics

A framework to elicit the expertise of developers and recommend experts by analyzing complexity measures over time is proposed and it is shown that aggregated code metrics can be used to identify experts for different software components.

“There and Back Again?” On the Influence of Software Community Dispersion Over Productivity

Estimating and understanding productivity still represents a crucial task for researchers and practitioners. Researchers spent significant effort identifying the factors that influence software

Identifying Critical Projects via PageRank and Truck Factor

  • R. Pfeiffer
  • Computer Science
    2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR)
  • 2021
It is hypothesized that a combination of PageRank (PR) and Truck Factor (TF) can identify critical projects more accurately than Google’s current Criticality Score (CS); an experiment is conducted to verify this hypothesis.

External Factors in Sustainability of Open Source Software

This dissertation explores the effects of external factors on OSS sustainability and the mechanisms behind them, proposes tools to make certain risk factors more visible, and shows that in many cases simple tools can be used to detect less visible sustainability factors, such as competition and surrounding communities.

The Corrective Commit Probability Code Quality Metric

Analysis of project attributes shows that lower CCP (higher quality) is associated with smaller files, lower coupling, use of languages like JavaScript and C# as opposed to PHP and C++, fewer developers, lower developer churn, better onboarding, and better productivity.

Corrective commit probability: a measure of the effort invested in bug fixing

Lower CCP is associated with smaller files, lower coupling, use of languages like JavaScript and C# as opposed to PHP and C++, fewer code smells, lower project age, better perceived quality, fewer developers, lower developer churn, better onboarding, and better productivity.



On the Difficulty of Computing the Truck Factor

This paper explores an implementation of the only approach proposed in the literature that is able to compute the Truck Factor, and conducts an exploratory study with 37 open source projects to discover limitations and drawbacks that could prevent its usage.

Algorithmic Complexity of the Truck Factor Calculation

This paper proves that some specific variants of the TF are actually NP-hard to compute, including the promising worst-case metric TF_{min,c}.

Are Heroes common in FLOSS projects?

A tool is presented to compute the truck factor and identify Heroes in a project; the study found that Heroes are common in the considered set of FLOSS projects and that the truck factor is in general low.

Revisiting the applicability of the pareto principle to core development teams in open source software projects

The findings suggest that the Pareto principle is not compatible with the core teams of many GitHub projects, and several of the studied GitHub projects are susceptible to the “bus factor” where the impact of a core developer leaving would be quite harmful.

Are developers complying with the process: an XP study

The results show that the approach enabled researchers to formulate minimally intrusive methods to check for conformance and that, for the majority of the investigated XP practices, violations could be detected.

An exploratory study of the pull-based software development model

This work explores how pull-based software development works, first on the GHTorrent corpus and then on a carefully selected sample of 291 projects, finding that the pull request model offers fast turnaround, increased opportunities for community engagement and decreased time to incorporate contributions.

Mining the history of synchronous changes to refine code ownership

  • Lile Hattori, Michele Lanza
  • Computer Science
    2009 6th IEEE International Working Conference on Mining Software Repositories
  • 2009
This paper illustrates how the information mined by the Syde tool can help to provide a refined notion of code ownership and breaks new ground in terms of how such information can assist developers.

Who should fix this bug?

This paper applies a machine learning algorithm to the open bug repository to learn the kinds of reports each developer resolves and reaches precision levels of 57% and 64% on the Eclipse and Firefox development projects respectively.

A degree-of-knowledge model to capture source code familiarity

It is shown that the degree-of-knowledge model can provide better results than an existing expertise-finding approach; case studies of the use of the model to support knowledge transfer and to identify changes of interest are also reported.

Mining usage expertise from version archives

The concept of usage expertise, which manifests itself whenever developers use functionality, e.g., by calling API methods, is introduced, and preliminary results for the ECLIPSE project are presented.