TNM: A Tool for Mining of Socio-Technical Data from Git Repositories
@article{Sviridov2021TNMAT,
  title={TNM: A Tool for Mining of Socio-Technical Data from Git Repositories},
  author={Nikolai Sviridov and Mikhail Evtikhiev and Vladimir Kovalenko},
  journal={2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR)},
  year={2021},
  pages={295--299}
}
Networks of collaboration between engineers are reflected in traces of developers’ activity in version control systems (VCSs). Extracting data from Git repositories is an essential task for researchers and practitioners working on socio-technical analysis, but it requires substantial engineering work. With increasing interest in analysing socio-technical data and applying it in practice, there are no flexible and easily reusable tools to retrieve socio-technical information from VCSs. With no…
2 Citations
GitDelver Enterprise Dataset (GDED): An Industrial Closed-source Dataset for Socio-Technical Research
- Computer Science
- 2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR)
- 2022
This work mined 101 repositories and produced the GDED dataset containing socio-technical information about 106,216 commits, 470,940 file modifications and 3,471,556 method modifications from 164 developers during the last 13 years, using various programming languages.
References
PyDriller: Python framework for mining software repositories
- Computer Science
- ESEC/SIGSOFT FSE
- 2018
This paper presents PyDriller, a Python framework that eases the process of mining Git, and compares it against the state-of-the-art Python framework GitPython, demonstrating that PyDriller can achieve the same results with, on average, 50% less LOC and significantly lower complexity.
Perceval: Software Project Data at Your Will
- Computer Science
- 2018 IEEE/ACM 40th International Conference on Software Engineering: Companion (ICSE-Companion)
- 2018
Perceval is an industry-strength free software tool that has been widely used at Bitergia, a company devoted to offering commercial analytics of software projects; it hides the technical complexities related to data acquisition and eases the definition of analytics.
Assessing the bus factor of Git repositories
- Computer Science
- 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER)
- 2015
A tool that, given a Git-based repository, automatically measures the bus factor for any file, directory, and branch in the repository and for the project itself, and can simulate what would happen to the project if one or more developers disappeared.
The promises and perils of mining git
- Computer Science
- 2009 6th IEEE International Working Conference on Mining Software Repositories
- 2009
This work focuses on git, a very popular DSCM used in high-profile projects and aims to help researchers interested in DSCMs avoid perils when mining and analyzing git data.
Socio-technical congruence: a framework for assessing the impact of technical and work dependencies on software development productivity
- Computer Science
- ESEM '08
- 2008
This paper argues that modularization, the traditional technique intended to reduce interdependencies among components of a system, has serious limitations in the context of software development and builds on the idea of congruence, proposed in prior work, to examine the relationship between the structure of technical and work dependencies.
Using Software Repositories to Investigate Socio-technical Congruence in Development Projects
- Computer Science
- Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007)
- 2007
It is shown how the information necessary to implement a quantitative measure of socio-technical congruence can be mined from commonly used software repositories, and how socio-technical congruence can be computed based on that information.
Empirical findings on team size and productivity in software development
- Computer Science
- J. Syst. Softw.
- 2012
Revisiting the applicability of the pareto principle to core development teams in open source software projects
- Economics, Computer Science
- IWPSE
- 2015
The findings suggest that the Pareto principle is not compatible with the core teams of many GitHub projects, and several of the studied GitHub projects are susceptible to the “bus factor” where the impact of a core developer leaving would be quite harmful.
CVS release history data for detecting logical couplings
- Computer Science
- Sixth International Workshop on Principles of Software Evolution, 2003. Proceedings.
- 2003
The software evolution analysis approach enabled us to detect shortcomings of PACS such as architectural weaknesses, poorly designed inheritance hierarchies, or blurred interfaces of modules.
A degree-of-knowledge model to capture source code familiarity
- Computer Science
- 2010 ACM/IEEE 32nd International Conference on Software Engineering
- 2010
It is shown that the degree-of-knowledge model can provide better results than an existing expertise-finding approach; case studies of using the model to support knowledge transfer and to identify changes of interest are also reported.