Open science is the practice of making the elements of scientific research -- data & methods, code & software, and results & publications -- readily accessible to anyone. While this has great potential for advancing research as a whole (as well as education, public policy, & commercial innovation), both technical and social challenges prevent the practice from becoming more widespread. Social challenges stem largely from the tension between what is best for an individual researcher and what is best for the community. Technical challenges arise largely from issues of scale: putting free print copies of DNA sequencing data in a box in front of your office doesn't scale as well as depositing those sequences in repositories like GenBank.
Our goal is to provide open-source tools to help address both of these challenges. These are interesting times. The technology to facilitate access to and use of scientific data has never been better, yet it is only beginning to be employed. The internet -- firmly in its second generation, the read-write Web 2.0 culture in which users generate content as readily as they consume it -- has led to an explosion of mechanisms for sharing. Yet these tools are not widely leveraged in scientific communities (Coombes et al. 2007).
R is an open-source statistical environment that can be used not only for statistics but also for data acquisition, data manipulation, and modeling, among other uses. R is increasingly being used by scientists across all disciplines and has overtaken other popular scientific programming tools. Part of the reason behind R's explosive growth is the ease with which its ever-growing user base can add new functionality, as evidenced by the 3,000+ currently available R packages. The R framework is ideal for open science because: