Exploratory Data Analysis of Ancient Texts with rperseus

  David Ranzolin   | DECEMBER 5, 2017

Introduction When I was in grad school at Emory, I had a favorite desk in the library. The desk wasn’t particularly cozy or private, but what it lacked in comfort it made up for in real estate. My books and I needed room to operate. Students of the ancient world require many tools, and when jumping between commentaries, lexicons, and interlinears, additional clutter is additional “friction”, i.e., lapses in thought due to frustration.

googleLanguageR - Analysing language through the Google Cloud Machine Learning APIs

  Mark Edmondson   | OCTOBER 3, 2017

One of the greatest assets human beings possess is the power of speech and language, from which almost all our other accomplishments flow. To be able to analyse communication offers us a chance to gain a greater understanding of one another. To help you with this, googleLanguageR is an R package that allows you to perform speech-to-text transcription, neural net translation and natural language processing via the Google Cloud machine learning services.

Release 'open' data from their PDF prisons using tabulizer

  Thomas J. Leeper   | APRIL 18, 2017

There is no problem in science quite as frustrating as other peoples’ data. Whether it’s malformed spreadsheets, disorganized documents, proprietary file formats, data without metadata, or any other data scenario created by someone else, scientists have taken to Twitter to complain about it. As a political scientist who regularly encounters so-called “open data” in PDFs, this problem is particularly irritating. PDFs may have “portable” in their name, making them display consistently on various platforms, but that portability means any information contained in a PDF is irritatingly difficult to extract computationally.

Page 1 of 1