Exploring European attitudes and behaviours using the European Social Survey

  Jorge Cimentada   | JUNE 14, 2018

Introduction I never thought that I’d be programming software in my career. I started using R a little over 2 years ago and it’s been one of the most important decisions in my career. Secluded in a small academic office with no one to discuss my new hobby with, I started searching the web for tutorials and packages. Getting to know how amazing and nurturing the R community is made me want to become a data scientist.

fulltext v1: text-mining scholarly works

  Scott Chamberlain   | JANUARY 17, 2018

The problem Text-mining - the art of answering questions by extracting patterns, data, etc. out of the published literature - is not easy. Publishers make it incredibly difficult. The vast majority of publicly funded research across the globe is published in paywalled journals. That is, taxpayers pay twice for research: once for the grant to fund the work, then again to be able to read it.

solrium 1.0: Working with Solr from R

  Scott Chamberlain   | NOVEMBER 8, 2017

Nearly 4 years ago I wrote on this blog about an R package, solr, for working with the database Solr. Since then we’ve created a refresh of that package in the solrium package. Since solrium first hit CRAN about two years ago, users have raised a number of issues that required breaking changes. Thus, this blog post is about a major version bump in solrium. What is Solr? Solr is a “search platform”, a NoSQL database in which data is organized into so-called documents, which are XML/JSON/etc. blobs of text.

elastic - Elasticsearch for R

  Scott Chamberlain   | AUGUST 2, 2017

elastic is an R client for Elasticsearch. elastic has been around since 2013, with the first commit in November 2013. Sidebar: ‘elastic’ was picked as the package name before the company now known as Elastic changed its name to Elastic. What is Elasticsearch? If you aren’t familiar with Elasticsearch, it is a distributed, RESTful search and analytics engine. It’s similar to Solr. It falls in the NoSQL bin of databases, holding data in JSON documents instead of rows and columns.

All the fake data that's fit to print

  Scott Chamberlain   | JUNE 22, 2017

charlatan makes fake data. Excited to announce a new package called charlatan. While perusing packages from other programming languages, I saw a neat Python library called faker. charlatan is inspired by, and ports many things from, Python’s https://github.com/joke2k/faker library. In turn, faker was inspired by PHP’s faker, Perl’s Faker, and Ruby’s faker. It appears that the PHP library was the original - nice work PHP. Use cases What could you do with this package?
