The iNaturalist project is a really cool way to both engage people in citizen science and collect species occurrence data. The premise is pretty simple: users download an app for their smartphone, and can then easily georeference any specimen they see and upload it to the iNaturalist website. It lets users turn casual observations into meaningful crowdsourced species occurrence data. They also provide a nice, robust API to access almost all of their data. We’ve developed a package, rinat, that can easily access all of that data in R. Our package spocc uses iNaturalist data as one of its sources, while rinat provides an interface for all the features available in the API....
The rOpenSci project aims to provide programmatic access to scientific data repositories on the web. A vast majority of the packages in our current suite retrieve some form of biodiversity or taxonomic data. Since several of these datasets have been georeferenced, they provide numerous opportunities for visualizing species distributions, building species distribution maps, and running analyses such as species distribution models. In an effort to streamline access to these data, we have developed a package called spocc, which provides a unified API to all the biodiversity sources that we provide. The obvious advantage is that a user can interact with a common API and not worry about the nuances in syntax that differ between packages. As more data sources come online, users can access even more data without significant changes to their code. However, it is important to note that spocc will never replicate the full functionality that exists within specific packages. Therefore, users with a strong interest in one of the specific data sources listed below would benefit from familiarising themselves with the inner workings of the appropriate packages.
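To make the "common API" idea concrete, here is a minimal sketch of sending one query to more than one source through spocc. It assumes spocc is installed and that you have network access. The wrapper name `fetch_occurrences`, the default sources, the limit, and the example species are all illustrative choices, not something prescribed by the package.

```r
# Minimal sketch: query several occurrence sources through spocc's single interface.
# fetch_occurrences() is a hypothetical helper written for this example.
fetch_occurrences <- function(species, sources = c("gbif", "inat"), limit = 25) {
  # Validate the input before touching the network.
  stopifnot(is.character(species), length(species) >= 1, all(nzchar(species)))

  if (!requireNamespace("spocc", quietly = TRUE)) {
    stop("The spocc package is required: install.packages('spocc')")
  }

  # occ() sends the same query to every source named in `from`.
  res <- spocc::occ(query = species, from = sources, limit = limit)

  # occ2df() stacks the per-source results into a single data frame.
  spocc::occ2df(res)
}

# Only run the live query in an interactive session (it needs network access).
if (interactive()) {
  hawks <- fetch_occurrences("Accipiter striatus")
  head(hawks)
}
```

Adding another source should only mean adding another name to `sources`, provided spocc supports it; the rest of the code stays the same.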
...We recently pushed the first version of rnoaa to CRAN - version 0.1. NOAA has a lot of data, some of which is provided via the National Climatic Data Center, or NCDC. NOAA has provided access to NCDC climate data via a RESTful API - which is great because people like us can create clients for different programming languages to access their data programmatically. If you are inclined to write a bit of R code, this means you can get to NCDC data in the R environment where your workflow is reproducible, and you can connect data acquisition to a suite of tools for data manipulation (e.g., plyr), visualization (e.g., ggplot2), and statistics (e.g., lme4)....
Reproducible research involves the careful, annotated preservation of data, analysis code, and associated files, such that statistical procedures, output, and published results can be directly and fully replicated. As the push for reproducible research has grown, the R community has responded with an increasingly large set of tools for engaging in reproducible research practices (see, for example, the ReproducibleResearch Task View on CRAN). Most of these tools focus on improving one’s own workflow through closer integration of data analysis and report generation. But reproducible research also requires the persistent - and perhaps indefinite - storage of research files so that they can be used to recreate or modify future analyses and reports....
We just released a new version of taxize - version 0.2.0. This release contains a number of new features and bug fixes. Here is a rundown of some of the changes:
First, install and load taxize
install.packages("taxize")
library(taxize)
New things
New functions: class2tree
Sometimes you just want to have a visual of the taxonomic relationships among taxa. If you don’t know how to build a molecular phylogeny, don’t have time, or there just isn’t molecular data, you can sorta build one using taxonomy. Using our classification function, you can get a bunch of taxonomic hierarchies, then pass them to the new function class2tree. Like so: