The package FedData has gone through software review and is now part of rOpenSci. FedData includes functions to automate downloading geospatial data available from several federated data sources (mainly sources maintained by the US Federal government).
Currently, the package enables extraction from six datasets:
FedData is designed with the large-scale geographic information system (GIS) use-case in mind: cases where the use of dynamic web-services is impractical due to the scale (spatial and/or temporal) of analysis. It functions primarily as a means of downloading tiled or otherwise spatially-defined datasets; additionally, it can preprocess those datasets by extracting data within an area of interest (AoI), defined spatially. It relies heavily on the sp, raster, and rgdal packages.
Contributing to an open-source community without contributing code is an oft-vaunted idea that can seem nebulous. Luckily, putting vague ideas into action is one of the strengths of the rOpenSci Community, and their package onboarding system offers a chance to do just that.
This was my first time reviewing a package, and, as with so many things in life, I went into it worried that I’d somehow ruin the package-reviewing process: not just the package itself, but the actual onboarding infrastructure… maybe even rOpenSci on the whole.
...
Take a look at the data
This is a phrase that comes up when you first get a dataset.
It is also ambiguous. Does it mean to do some exploratory modelling? Or make some histograms, scatterplots, and boxplots? Is it both?
Starting down either path, you often encounter the non-trivial growing pains of working with a new dataset. There are the mix-ups of data types: height in cm coded as a factor, categories stored as numerics with decimals, datetimes stored as strings, and somehow a datetime that is one long number. And let’s not forget everyone’s favourite: missing data.
...
Last week we released an update of the tesseract package to CRAN. This package provides R bindings to Google’s OCR library Tesseract.
install.packages("tesseract")
The new version ships with the latest libtesseract 3.05.01 on Windows and macOS. Furthermore, it includes enhancements for managing language data and using tesseract together with the magick package.
Installing Language Data
The new version has several improvements for installing additional language data. On Windows and macOS you use the tesseract_download() function to install additional languages:
Last week, version 1.0 of the magick package appeared on CRAN: an ambitious effort to modernize and simplify high-quality image processing in R. This R package builds upon the Magick++ STL, which exposes a powerful C++ API to the famous ImageMagick library.
The best place to start learning about magick is the vignette which gives a brief overview of the overwhelming amount of functionality in this package.
Towards Release 1.0
Last year around this time rOpenSci announced the first release of the magick package: a new powerful toolkit for image reading, writing, converting, editing, transformation, annotation, and animation in R. Since the initial release there have been several updates with additional functionality, and many useRs have started to discover the power of this package to take visualization in R to the next level.
...