rOpenSci | Blog

All posts (Page 122 of 130)

Tuesday, February 18, 2014

AntWeb - programmatic interface to ant biodiversity data

This post was updated on August 20, 2014, with AntWeb version 0.7.2.99. Please install an updated version to make sure the code works.

Data on more than 10,000 species of ants recorded worldwide are available through the California Academy of Sciences’ AntWeb, a repository that boasts a wealth of natural history data, digital images, and specimen records on ant species from a large community of museum curators.

...

By Karthik Ram

Monday, February 17, 2014

Changed and new things in the new version of rgbif, v0.5

rgbif is an R package to search and retrieve data from the Global Biodiversity Information Facility (GBIF). rgbif wraps R code around the GBIF API to allow you to talk to GBIF from R.

We just pushed a new version of rgbif, v0.5.0, to CRAN. Source and binary files are now available there.

There are a few new functions: count_facet, elevation, and installations. These are described, with examples, below.

...

By Scott Chamberlain

Wednesday, February 12, 2014

Caching Encyclopedia of Life API calls

In a recent blog post we discussed caching calls to the web offline, on your own computer. Just like you can cache data on your own computer, a data provider can do the same thing. Most of the data providers we work with do not provide caching. However, at least one does: EOL, or Encyclopedia of Life. EOL allows you to set the amount of time (in seconds) that the call is cached, within which time you can make the same call and get the data back faster. We have a number of functions to interface with EOL in our taxize package....

By Scott Chamberlain

Monday, February 10, 2014

rOpenSci developer meeting in March

Our team has been cranking out a large number of tools over the past several months. As regular readers are aware, our software packages provide programmatic access to a diverse and extensive trove of scientific data. More recently we’ve expanded our efforts to build more general-purpose and cross-domain tools. These include tools for reading, writing, integrating and publishing data, a unit testing platform for data, and a mapping engine that can visualize various kinds of spatial data. Many of our projects are inspired by ad hoc discussions with other scientists and software developers both online (often on Twitter and GitHub) and offline. Several of these folks are now regular contributors to the project. To foster more such collaborations and drive new software innovations, we are excited to announce our first developer meeting next month at GitHub’s headquarters in San Francisco. This meeting is made possible by support from the Alfred P. Sloan Foundation and GitHub....

By Karthik Ram

Monday, February 3, 2014

Caching API calls offline

I recently came across the idea of “offline first”, especially via Hood.ie. We of course don’t do web development, but primarily build R interfaces to data on the web. Internet availability is increasingly ubiquitous, but there are still times and places where you don’t have internet access but need to get work done.

In the R packages we write there are generally two steps to every workflow:

1. Make a call to the web to request and collect data
2. Rearrange the result as some sort of R object (e.g., an R data.frame), then visualize, analyze, etc.

The first step is not possible if you don’t have an internet connection, so the second step fails as a result.
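
As a rough illustration of how a local cache can decouple those two steps, here is a minimal sketch in base R. The helper cached_fetch() and the stand-in fake_request() are hypothetical names made up for this example, not functions from any rOpenSci package, and a real workflow would replace fake_request() with an actual web request.

```r
# Hypothetical helper (not part of any rOpenSci package): return a cached
# result from disk if one exists, otherwise run the request and store it.
cached_fetch <- function(key, fetch_fun, cache_dir = tempdir()) {
  path <- file.path(cache_dir, paste0(key, ".rds"))
  if (file.exists(path)) {
    # Cache hit: read the stored object; no internet connection needed.
    return(readRDS(path))
  }
  # Cache miss: run the (normally web-based) request and save the result.
  result <- fetch_fun()
  saveRDS(result, path)
  result
}

# Stand-in for step 1; a real workflow would call a web API here.
fake_request <- function() {
  data.frame(species = c("Formica rufa", "Lasius niger"),
             records = c(12L, 30L))
}

cache_dir <- file.path(tempdir(), "api-cache-demo")
dir.create(cache_dir, showWarnings = FALSE)

# First call runs the request and writes ants.rds into cache_dir.
first <- cached_fetch("ants", fake_request, cache_dir)

# Second call simulates being offline: the request function would fail,
# but it is never invoked because the result is served from disk.
second <- cached_fetch("ants", function() stop("no internet"), cache_dir)
```

With the result on disk, step 2 (rearranging, visualizing, analyzing) can proceed from `second` even when the request itself can no longer be made.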

...

By Scott Chamberlain