  Evan Odell   | MAY 8, 2018

I’m excited to announce a new package for accessing official statistics from the UK. nomisr is the R client for the Nomis database. Nomis is run by Durham University on behalf of the UK’s Office for National Statistics (ONS), and contains over a thousand datasets, primarily on the UK labour market, census data, benefit spending and general economic activity. Registration is optional, but registering and using an API key allows larger queries without the risk of being timed out or rate limited by the API.

Lessons Learned from rtika, a Digital Babel Fish

  Sasha Goodman   | APRIL 25, 2018

The Apache Tika parser is like the Babel fish in Douglas Adams’s book, “The Hitchhiker’s Guide to the Galaxy”. The Babel fish translates any natural language to any other. Although Tika does not yet translate natural language, it starts to tame the Tower of Babel of digital document formats. As the Babel fish allowed a person to understand Vogon poetry, Tika allows an analyst to extract text and objects from Microsoft Word.

Exploratory Data Analysis of Ancient Texts with rperseus

  David Ranzolin   | DECEMBER 5, 2017

When I was in grad school at Emory, I had a favorite desk in the library. The desk wasn’t particularly cozy or private, but what it lacked in comfort it made up for in real estate. My books and I needed room to operate. Students of the ancient world require many tools, and when jumping between commentaries, lexicons, and interlinears, additional clutter is additional “friction”, i.e., lapses in thought due to frustration.

googleLanguageR - Analysing language through the Google Cloud Machine Learning APIs

  Mark Edmondson   | OCTOBER 3, 2017

One of the greatest assets human beings possess is the power of speech and language, from which almost all our other accomplishments flow. To be able to analyse communication offers us a chance to gain a greater understanding of one another. To help you with this, googleLanguageR is an R package that allows you to perform speech-to-text transcription, neural net translation and natural language processing via the Google Cloud machine learning services.

rtimicropem: Using an *R* package as platform for harmonized cleaning of data from RTI MicroPEM air quality sensors

  Maëlle Salmon   | AUGUST 29, 2017

As you might remember from my blog post about ropenaq, I work as a data manager and statistician for an epidemiology project called CHAI, for Cardio-vascular health effects of air pollution in Telangana, India. One of our interests in CHAI is determining exposure, and sources of exposure, to PM2.5, which are very small particles in the air that have diverse adverse health effects. You can find more details about CHAI in our recently published protocol paper.

