Building the Collaboration Graph of Open-Source Software Ecosystem

Authors: Elena Lyulina and Mahmoud Jahanshahi
Venue: 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR)
The Open-Source Software community has become the center of attention for many researchers, who are investigating various aspects of collaboration in this extremely large ecosystem. Due to its size, it is difficult to grasp whether or not it has structure, and if so, what it may be. Our hackathon project aims to facilitate the understanding of the developer collaboration structure and relationships among projects based on the bi-graph of what projects developers contribute to by providing an… 
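The abstract is truncated and does not say how the collaboration graph is derived from the developer-project bi-graph. As a rough illustration only, one common approach is a one-mode projection: two developers are linked when they contributed to at least one common project. The sketch below assumes a plain list of (developer, project) pairs; the function name `project_developers` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def project_developers(contributions):
    """Project a developer-project bipartite edge list onto developers.

    Two developers are linked when they share at least one project;
    the edge weight counts how many projects they share.
    """
    members = defaultdict(set)  # project -> set of developers
    for developer, project in contributions:
        members[project].add(developer)

    weights = defaultdict(int)  # (dev_a, dev_b) -> number of shared projects
    for developers in members.values():
        # Sort so each unordered pair always uses the same key.
        for a, b in combinations(sorted(developers), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical toy data for illustration.
edges = [
    ("alice", "proj-x"), ("bob", "proj-x"),
    ("alice", "proj-y"), ("bob", "proj-y"), ("carol", "proj-y"),
]
collaboration = project_developers(edges)
```

Swapping the roles of developer and project in the input pairs would give the analogous project-project graph. At ecosystem scale, very large projects would produce a quadratic number of pairs, so a real pipeline would likely need filtering or sampling.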


The Extent of Orphan Vulnerabilities from Code Reuse in Open Source Software
Presents VDiOS, a tool to help identify and fix vulnerabilities induced by white-box reuse that have already been patched in the original projects (orphan vulnerabilities). The authors hope that VDiOS will lead to further study and mitigation of risks from orphan vulnerabilities and other orphan code flaws.


Network Structure of Social Coding in GitHub
This paper collects 100,000 projects and 30,000 developers from GitHub, constructs developer-developer and project-project relationship graphs, and computes various characteristics of the graphs. It uses PageRank to identify influential developers and projects in this subnetwork of GitHub.
World of Code: An Infrastructure for Mining the Universe of Open Source VCS Data
A very large and frequently updated collection of version control data for FLOSS projects named World of Code (WoC), which is capable of supporting trend evaluation, ecosystem measurement, and the determination of package usage, and is expected to spur investigation into global properties of OSS development leading to increased resiliency of the entire OSS ecosystem.
Evolution patterns of open-source software systems and communities
A case study of four typical OSS projects is conducted. It finds that while collaborative development within a community is the essential characteristic of OSS, different collaboration models exist, and that the difference in collaboration model results in different evolution patterns of OSS systems and communities.
Gephi: An Open Source Software for Exploring and Manipulating Networks
This work presents several key features of Gephi in the context of interactive exploration and interpretation of networks, and highlights key aspects of dynamic network visualization.
ForceAtlas2, a Continuous Graph Layout Algorithm for Handy Network Visualization Designed for the Gephi Software
ForceAtlas2 is a force-directed layout close to other algorithms used for network spatialization, designed for the Gephi user experience, and proposed for the first time as a benchmark for the compromise between performance and quality.
Exploring the patterns of social behavior in GitHub
An explosive growth in the number of GitHub users is found, and the Diffusion of Innovation theory is introduced to illustrate the intrinsic sociological basis of this phenomenon.
A Dataset and an Approach for Identity Resolution of 38 Million Author IDs extracted from 2B Git Commits
This paper proposes a method that finds all author IDs belonging to a single developer in this entire dataset, using a machine learning model to predict which potentially related IDs belong to the same developer, and shares the list of all author IDs that were found to have aliases.
Visualizing knowledge domains
The author reviews the visualization techniques used to map the domain structure of scientific disciplines and to support research