rOpenSci tech notes

ccafs - client for CCAFS General Circulation Models data

Scott Chamberlain — March 1, 2017
I've recently released the new package ccafs, which provides access to General Circulation Model (GCM) data from Climate Change, Agriculture and Food Security (CCAFS; http://ccafs-climate.org/). GCMs are a particular type of climate model, used for weather forecasting and climate change forecasting; read more at https://en.wikipedia.org/wiki/General_circulation_model. ccafs falls in the data client camp: its focus is on getting users data, and many rOpenSci packages fall into this area. These kinds of packages are...

Package evolution - changing stuff in your package

Scott Chamberlain — January 5, 2017
Making packages is a great way to organize R code, whether it’s a set of scripts for personal use, a set of functions for internal company use or a lab group, or to distribute your new cool framework foobar to the masses. There are a number of guides to writing packages, including http://r-pkgs.had.co.nz/. As you develop packages, there are a number of issues that don't often get much air time. I'll cover some of them here. Philosophy...

Update jsonlite 1.2

Jeroen Ooms — January 4, 2017
A new version of the jsonlite package has been released to CRAN. This is a maintenance release with enhancements and bug fixes. A summary of changes in v1.2 from the NEWS file: Add read_json and write_json convenience wrappers, #161; Update modp_numtoa from upstream, fixes a rounding issue in #148; Ensure asJSON.POSIXt does not use sci notation for negative values, #155; Tweak num_to_char to properly print large negative numbers; Performance optimization for simplifying data frames (see below); Use the GitHub...

finch - parse Darwin Core files

Scott Chamberlain — December 23, 2016
finch has just been released to CRAN (binaries should be up soon). finch is a package to parse Darwin Core files. Darwin Core (DwC) is: a body of standards. It includes a glossary of terms (in other contexts these might be called properties, elements, fields, columns, attributes, or concepts) intended to facilitate the sharing of information about biological diversity by providing reference definitions, examples, and commentaries. The Darwin Core is primarily based on taxa, their...

Announcing pdftools 1.0

Jeroen Ooms — December 9, 2016
This week we released version 1.0 of the ropensci pdftools package to CRAN. Pdftools provides utilities for extracting text, fonts, attachments and other data from PDF files. It also supports rendering of PDF files into bitmap images. This release has a few internal enhancements and fixes an annoying bug for landscape PDF pages. The version bump to 1.0 signifies that the package has undergone sufficient testing and the API is stable. Extracting Text: As described...

Tesseract Update: Options and Languages

Jeroen Ooms — December 8, 2016
A few weeks ago we announced the first release of the tesseract package: a high quality OCR engine in R. We have now released an update with extra features. Installing Training Data: As explained in the first post, the tesseract system is powered by language-specific training data. By default only English training data is installed. Version 1.3 adds utilities to make it easier to install additional training data.
    # Download French training data
    tesseract_download("fra")...

fauxpas - HTTP conditions package

Scott Chamberlain — November 18, 2016
HTTP, or Hypertext Transfer Protocol, is a protocol by which most of us interact with the web. When we make requests to a website in a browser on desktop or mobile, or get some data from a server in R, all of that is using HTTP. HTTP has a rich suite of status codes describing different HTTP conditions, ranging from success to various client errors to server errors. R has a few HTTP client libraries...

crul - an HTTP client

Scott Chamberlain — November 9, 2016
A new package crul is on CRAN. crul is another HTTP client for R, but is relatively simplified compared to httr, and is being built to link closely with webmockr and vcr. webmockr and vcr are packages ported from Ruby's webmock and vcr, respectively. They both make mocking HTTP requests really easy. A major use case for mocking HTTP requests is for unit tests. Nearly all the packages I work on personally make HTTP requests...

Parse NOAA Integrated Surface Data Files

Scott Chamberlain — November 3, 2016
A new package isdparser is on CRAN. isdparser was in part liberated from rnoaa, then improved. We'll use isdparser in rnoaa soon. isdparser does not download files for you from NOAA's FTP servers. The package focuses on parsing the files, which are variable-length ASCII strings stored line by line, where each line has some mandatory data and any amount of optional data. The data is great, and includes, for example, wind speed and direction,...

Encryption and Digital Signatures in R using GPG

Jeroen Ooms — October 19, 2016
A new package gpg has appeared on CRAN. From the package description: Bindings to GnuPG for working with OpenGPG (RFC4880) cryptographic methods. Includes utilities for public key encryption, creating and verifying digital signatures, and managing your local keyring. Note that some functionality depends on the version of GnuPG that is installed on the system. In particular GnuPG 2 mandates the use of 'gpg-agent' for entering passphrases, which only works if R runs in a terminal...

Get air quality data for the United Kingdom using the rdefra package

Claudia Vitolo — October 6, 2016
Whether you are an environmental scientist, a pollution expert or just concerned about the air you breathe when cycling in the United Kingdom, the ropensci rdefra package can help find the information you need. This package gives you access to the UK-AIR database, hosted by the Department for Environment, Food & Rural Affairs in the United Kingdom, directly from R. The database comprises hundreds of air quality monitoring sites and each provides time series of...

New package graphql: A GraphQL Query Parser

Jeroen Ooms — October 5, 2016
The new ropensci graphql package is now on CRAN. It implements R bindings to the libgraphqlparser C++ library to parse GraphQL syntax and export the syntax tree in JSON format: graphql2json("{ field(complex: { a: { b: [ $var ] } }) }") A syntax parser is perhaps not super useful to most end-users, but it can be used to validate GraphQL queries or implement a GraphQL API in R. We hope to add more related functionality...

Hunspell 2.0: High-Performance Stemmer, Tokenizer, and Spell Checker for R

Jeroen Ooms — September 12, 2016
A new version of the ropensci hunspell package has been released to CRAN. Hunspell is the spell checker library used by LibreOffice, OpenOffice, Mozilla Firefox, Google Chrome, Mac OS X, InDesign, Opera, RStudio and many others. It provides a system for tokenizing, stemming and spelling in almost any language or alphabet. The R package exposes both the high-level spell checker and the low-level stemmers and tokenizers, which analyze or extract individual words from various formats (text,...

New in Magick 0.3

Jeroen Ooms — September 8, 2016
A new version of the ropensci magick package has been released to CRAN. Magick is a package for Advanced Image-Processing in R. It wraps the ImageMagick STL, which is perhaps the most comprehensive open-source image processing library available today. Our original announcement has more details. New features: This new version now includes a beautiful vignette which gives an overview of the main functionality to get you started! It lists the various formats, transformations, effects, operations...