An empirical comparison of developer retention in the RubyGems and npm software ecosystems
Context: Numerous open source software projects are based on volunteers’ collaboration and require a continuous influx of newcomers for their continuity. Newcomers face barriers that can lead them to give up. These barriers hinder both developers willing to make a single contribution and those willing to become a project member. Objective: This study aims to identify and classify the barriers that newcomers face when contributing to Open Source Software projects. Method: We conducted a systematic literature review of papers reporting empirical evidence regarding the barriers that newcomers face when contributing to Open Source Software (OSS) projects. We retrieved 291 studies by querying 4 digital libraries. Twenty studies were identified as primary. We performed a backward snowballing approach, and searched for other papers published by the authors of the selected papers to identify potential studies. Then, we used a coding approach inspired by open coding and axial coding procedures from Grounded Theory to categorize the barriers reported by the selected studies. Results: We identified 20 studies providing empirical evidence of barriers faced by newcomers to OSS projects while making a contribution. From the analysis, we identified 15 different barriers, which we grouped into five categories: social interaction, newcomers’ previous knowledge, finding a way to start, documentation, and technical hurdles. We also classified the problems with regard to their origin: newcomers, community, or product. Conclusion: The results are useful to researchers and OSS practitioners willing to investigate or to implement tools to support newcomers. We mapped technical and non-technical barriers that hinder newcomers’ first contributions. The most evidenced barriers are related to socialization, appearing in 75% (15 out of 20) of the studies analyzed, with a high focus on interactions in mailing lists (receiving answers and socialization with other members). There is a lack of in-depth studies on technical issues, such as code issues. We also noticed that the majority of the studies relied on historical data gathered from software repositories and that there was a lack of experiments and qualitative studies in this area.