Ethical Mining: A Case Study on MSR Mining Challenges

  • Nicolas E. Gold, Jens Krinke
  • Published 29 June 2020
  • Computer Science
  • Proceedings of the 17th International Conference on Mining Software Repositories
Research in Mining Software Repositories (MSR) is research involving human subjects, as the repositories usually contain data about developers' interactions with the repositories. Therefore, any research in the area needs to consider the ethics implications of the intended activity before starting. This paper presents a discussion of the ethics implications of MSR research, using the mining challenges from the years 2010 to 2019 as a case study to identify the kinds of data used. It highlights… 


"We do not appreciate being experimented on": Developer and Researcher Views on the Ethics of Experiments on Open-Source Projects
A survey of open-source developers and empirical software engineering researchers examines which behaviors they consider acceptable and suggests that open-source developers are largely open to research, provided it is done transparently.
Recruiting Software Engineers on Prolific
This experience report discusses conducting sample studies using Prolific, an academic crowdsourcing platform. Topics discussed are the type of studies, selection processes, and power computation.
TCTracer: Establishing test-to-code traceability links using dynamic and static techniques
TCTracer is presented, an approach and implementation for the automatic establishment of test-to-code traceability links that improves over existing techniques by combining an ensemble of new and existing techniques that utilise both dynamic and static information and exploiting a synergistic flow of information between the method and class levels.
Should I Get Involved? On the Privacy Perils of Mining Software Repositories for Research Participants
This position paper aims to start a discussion about indirect participation in MSR investigations, the dichotomy of ‘privacy vs. utility’ regarding sharing non-aggregated data, and its effects on privacy restrictions and ethical considerations for participant involvement.
How Empirical Research Supports Tool Development: A Retrospective Analysis and new Horizons
  • M. Di Penta
  • Computer Science
    Proceedings of the 15th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)
  • 2021
This keynote first gives an overview of how empirical research has been used over the past decades to evaluate tools, how this has changed over the years, and the importance of combining quantitative and qualitative evaluations.


Ethical Issues in Software Engineering Research: A Survey of Current Practice
  • T. Hall, V. Flynn
  • Computer Science, Business
    Empirical Software Engineering
  • 2004
An analysis of recently published work shows an increase in empirical software engineering research, and a survey of UK university department heads finds that whilst some UK universities have taken ethical issues very seriously, others have not considered the issues.
Reporting Ethics Considerations in Software Engineering Publications
  • Deepika Badampudi
  • Business
    2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)
  • 2017
It is important not only to state that consent was obtained; the procedure for obtaining consent should also be reported to improve accountability and trust.
A Practical Guide to Ethical Research Involving Humans
In this chapter, four ethics principles of primary importance for conducting ethical research are introduced and examples of applying these principles in the context of ethics review are provided.
Ethical Issues in Empirical Studies of Software Engineering
Through a review of the ethical codes of several fields that commonly employ humans and artifacts as research subjects, major ethical issues relevant to empirical studies of software engineering are identified.
Ethical issues in research using datasets of illicit origin
Existing advice and guidance is found not to address all of the problems that researchers have faced, and the papers are shown to tackle ethical issues inconsistently, and sometimes not at all.
Ethical Issues in Empirical Software Engineering: The Limits of Policy
This position paper describes some of the common approaches to encourage ethical behavior and their limits for enforcing ethical behavior.
Aspects of Data Ethics in a Changing World: Where Are We Now?
  • D. Hand
  • Computer Science
    Big Data
  • 2018
The nature of data, personal data, data ownership, consent and purpose of use, trustworthiness of data as well as of algorithms and of those using the data, and matters of privacy and confidentiality are explored.
Worse Than Spam: Issues In Sampling Software Developers
Contacting developers over public media proved to be the most effective and efficient sampling strategy, and one specific ethical guideline is presented to start a discussion in the software engineering research community about which sampling strategies should be considered ethical.
Ethical challenges in online research: Public/private perceptions
It is suggested that new ethical guidelines, particularly in relation to informed consent and participants’ own perceptions of what is public or private, are needed owing to the unique challenges of online research.
Research Ethics for Studying Open Source Projects
The public visibility of Free and Open Source Software development has sparked interest in the research communities of business, social and computer sciences to use the projects as research subjects.