Traceability in the Wild: Automatically Augmenting Incomplete Trace Links

Michael Rath, Jacob Rendall, Jin L. C. Guo, Jane Cleland-Huang, Patrick Mäder
2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)
Software and systems traceability is widely accepted as an essential element for supporting many software development tasks. Today's version control systems provide inbuilt features that allow developers to tag each commit with one or more issue IDs, thereby providing the building blocks from which project-wide traceability can be established between feature requests, bug fixes, commits, source code, and specific developers. However, our analysis of six open source projects showed that on…
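
The tagging mechanism the abstract describes can be illustrated with a minimal sketch. It assumes issue IDs follow a `PROJECT-123` pattern; that format, the function names, and the sample commits are hypothetical and not taken from the paper. The sketch builds issue-to-commit links from commit messages and lists commits that carry no issue ID, which are the ones left without a trace link.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) issue-key format: PROJECT-123.
ISSUE_ID = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def extract_issue_ids(message):
    """Return the distinct issue IDs in a commit message, in order of appearance."""
    seen = []
    for issue_id in ISSUE_ID.findall(message):
        if issue_id not in seen:
            seen.append(issue_id)
    return seen


def build_trace_links(commits):
    """Map each issue ID to the commit hashes whose messages mention it.

    `commits` is an iterable of (commit_hash, message) pairs.
    Returns (links, untagged), where `untagged` lists the hashes of
    commits whose messages contain no issue ID.
    """
    links = defaultdict(list)
    untagged = []
    for commit_hash, message in commits:
        ids = extract_issue_ids(message)
        if not ids:
            untagged.append(commit_hash)
        for issue_id in ids:
            links[issue_id].append(commit_hash)
    return dict(links), untagged
```

A pattern this simple can also match strings that are not issue IDs (for example `UTF-8`), so a real project would likely validate candidates against its issue tracker.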

SpojitR: Intelligently Link Development Artifacts
  • M. Rath, M. Tomova, Patrick Mäder
  • Computer Science
    2020 IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER)
  • 2020
The prototype spojitR is designed to assist developers in tagging commit messages and thus (semi-) automatically creating trace links between commits and an issue they are working on.
Increasing Precision of Automatically Generated Trace Links
This paper changed the interaction-based trace link creation approach so that interaction logs are associated with requirements based on the IDs in the commit messages, and shows that with this new approach and link improvement techniques, precision is above 90% and recall is almost 80%.
Analyzing Requirements and Traceability Information to Improve Bug Localization
  • M. Rath, D. Lo, Patrick Mäder
  • Computer Science
    2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR)
  • 2018
This paper proposes a novel approach, TraceScore, that also utilizes projects' requirements information and explicit dependency trace links to further close the gap in order to relate a new bug report to defective source code files.
Towards Semantically Guided Traceability
This work proposes an automated technique which leverages source code, project artifacts and an external domain corpus to generate a domain-specific concept model, and uses the generated concept model to improve traceability results and to provide explanations of the results.
Using Interaction Logs for Creation and Maintenance of Trace Links
We present an innovative, data-driven solution to an old requirements engineering problem [1]. Trace links between requirements and source code are beneficial for many software engineering tasks.
How Effective Is Automated Trace Link Recovery in Model-Driven Development?
An automated tool for recovering traces between JIRA issues (user stories and bugs) and revisions in a model-driven development (MDD) context is designed based on existing literature that uses process and text-based data to train a machine learning classifier.
Traceability Support for Multi-Lingual Software Projects
This paper analyzes and discusses patterns of intermingled language use across multiple projects, and evaluates several different tracing algorithms including the Vector Space Model, Latent Semantic Indexing (LSI), Latent Dirichlet Allocation (LDA), and various models that combine mono- and cross-lingual word embeddings with the Generative Vector space Model (GVSM).
Improving the Effectiveness of Traceability Link Recovery using Hierarchical Bayesian Networks
A HierarchiCal PrObabilistic Model for SoftwarE Traceability (Comet) is designed and implemented that is able to infer candidate trace links and is capable of modeling relationships between artifacts by combining the complementary observational prowess of multiple measures of textual similarity.
Traceability Network Analysis: A Case Study of Links in Issue Tracking Systems
This work explores various network analysis techniques in the issue tracking system of sixty-six open source projects and reveals two salient properties of the traceability network, i.e. scale free and triadic closure.


Estimating the number of remaining links in traceability recovery
Results from this study indicate that: (i) specific estimation models are able to provide accurate estimates of the number of remaining positive links; and (ii) univariate estimation models outperform multivariate ones.
Can method data dependencies support the assessment of traceability between requirements and source code?
It is found that data dependencies are as relevant as call dependencies for assessing requirements traceability and even more interesting, data dependencies complement call dependencies in the assessment and have strong implications on code understanding, including trace capture, maintenance, and validation techniques.
Recovering Traceability Links between Code and Documentation
A probabilistic and a vector space information retrieval model are applied in two case studies, tracing C++ source code onto manual pages and Java code to functional requirements, to recover traceability links between source code and free text documents.
Information Retrieval Methods for Automated Traceability Recovery
This chapter gives an overview of a general process of using IR-based methods for traceability link recovery and covers some of them in greater detail: probabilistic, vector space, and Latent Semantic Indexing models.
Discovering Loners and Phantoms in Commit and Issue Data
The results of the evaluation indicate that the proposed PaLiMod model and heuristics enable an automatic interlinking and can indeed reduce the residual of non-linked commits and issues in software projects.
Filling the Gaps of Development Logs and Bug Issue Data
This paper traces two sources of information relative to software bugs: the change logs of the actions of developers and the issues reported as bugs, and proposes an automatic approach to re-align the two databases so that the collected information is mirrored and in sync.
Cold-Start Software Analytics
The best-of-breed approach outperformed the profile-driven approach in all three areas of artifact connectivity, fault prediction, and finding the expert; however, while it delivered acceptable results for artifact connectivity and finding the expert, both techniques underperformed for cold-start fault prediction.
Towards feature-aware retrieval of refinement traces
Results show that graph clustering can improve the retrieval of refinement traces and is a step towards the overall goal of ubiquitous traceability.
Breaking the big-bang practice of traceability: Pushing timely trace recommendations to project stakeholders
This work presents a trace recommender system which pushes recommendations to project stakeholders as they create or modify traceable artifacts and introduces the novel concept of a trace obligation, which is used to track satisfaction relations between a target artifact and a set of source artifacts.
Do developers benefit from requirements traceability when evolving and maintaining a software system?
A controlled experiment with 71 subjects re-performing real maintenance tasks on two third-party development projects shows that subjects with traceability performed on average 24% faster on a given task and created on average 50% more correct solutions, suggesting that traceability not only saves effort but can profoundly improve software maintenance quality.