A Best Practice for Research

Abstract

R is an extremely flexible statistics programming language and environment that is Open Source and freely available for all mainstream operating systems. R has recently experienced an "explosive growth in use and in user contributed software" (Tierney, 2005, p. 7). The "user-contributed software" is one of the most distinctive and beneficial aspects of R: a large number of users have contributed code implementing some of the most up-to-date statistical methods, and R itself implements essentially all standard statistical analyses. Because of R's Open Source structure and a community of users dedicated to making R of the highest quality, the computer code on which the methods are based is openly critiqued and improved. The flexibility of R is arguably unmatched by any other statistics program, as its object-oriented programming language allows for the creation of functions that perform customized procedures and/or automate commonly performed tasks. This flexibility, however, has also kept some researchers away from R, because there seems to be a misperception that learning to use R is a daunting challenge. The goals of this chapter include the following: (a) convey that the time spent learning R, which in many situations is relatively small, is a worthwhile investment; (b) illustrate that many commonly performed analyses are straightforward to implement; and (c) show that important methods not available elsewhere can be implemented in R (easily in many cases). In addition to these goals, we will show that an often unrealized benefit of R is that it helps to create "reproducible research," in the sense that a record will exist of the exact analyses performed (e.g., algorithm used, options specified, subsample selected) so that the results of analyses can be recovered at a later date by the original researcher or by others if necessary (and thus "How was this result obtained?" is never an issue).
Currently, R is maintained by the R Core Development Team. R consists of a base system with optional add-on packages for a wide variety of techniques that are contributed by users from around the world (currently, there are more than 1,100 packages available on the Comprehensive R Archive Network, http://cran.r-project.org/). An R package is a collection of functions and corresponding documentation that work seamlessly with R. R has been called the lingua franca of statistics by the editor of the …

Cite this paper

@inproceedings{Kelley2008ABP, title={A Best Practice for Research}, author={Ken Kelley and Keke Lai and Po-Ju Wu}, year={2008} }