Data from Public Bicycle Hire Systems

  Mark Padgham   | OCTOBER 17, 2017

A new rOpenSci package provides access to data to which users may already have directly contributed, and for which contribution is fun, keeps you fit, and helps make the world a better place. The data come from using public bicycle hire schemes, and the package is called bikedata. Public bicycle hire systems operate in many cities throughout the world, and most systems collect (generally anonymous) data, minimally consisting of the times and locations at which every single bicycle trip starts and ends.

Accessing patent data with the patentsview package

  Chris Baker   | SEPTEMBER 19, 2017

Why care about patents? 1. Patents play a critical role in incentivizing innovation, without which we wouldn’t have much of the technology we rely on every day. What do your iPhone, Google’s PageRank algorithm, and a butter substitute called Smart Balance all have in common? …They all probably wouldn’t be here if not for patents. A patent provides its owner with the ability to make money off of something that they invented, without having to worry about someone else copying their technology.

Onboarding visdat, a tool for preliminary visualisation of whole dataframes

  Nicholas Tierney   | AUGUST 22, 2017

“Take a look at the data.” This is a phrase that comes up when you first get a dataset. It is also ambiguous. Does it mean to do some exploratory modelling? Or make some histograms, scatterplots, and boxplots? Is it both? Starting down either path, you often encounter the non-trivial growing pains of working with a new dataset: the mix-ups of data types, where height in cm is coded as a factor, categories are numerics with decimals, strings are datetimes, and somehow datetime is one long number.

So you (don't) think you can review a package

  Mara Averick   | AUGUST 22, 2017

Contributing to an open-source community without contributing code is an oft-vaunted idea that can seem nebulous. Luckily, putting vague ideas into action is one of the strengths of the rOpenSci Community, and their package onboarding system offers a chance to do just that. This was my first time reviewing a package, and, as with so many things in life, I went into it worried that I’d somehow ruin the package-reviewing process: not just the package itself, but the actual onboarding infrastructure… maybe even rOpenSci as a whole.

New package tokenizers joins rOpenSci

  Lincoln Mullen   | AUGUST 23, 2016

The R package ecosystem for natural language processing has been flourishing in recent days. R packages for text analysis have usually been based on the classes provided by the NLP or tm packages. Many of them depend on Java. But recently there have been a number of new packages for text analysis in R, most notably text2vec, quanteda, and tidytext. These packages are built on top of Rcpp instead of rJava, which makes them much more reliable and portable.
