Exploratory Data Analysis of Ancient Texts with rperseus

December 5, 2017

By: David Ranzolin

Introduction: When I was in grad school at Emory, I had a favorite desk in the library. The desk wasn’t particularly cozy or private, but what it lacked in comfort it made up for in real estate. My books and I needed room to operate. Students of the ancient world require many tools, and when jumping between commentaries, lexicons, and interlinears, additional clutter is additional “friction”, i.e., lapses in thought due to frustration.

googleLanguageR - Analysing language through the Google Cloud Machine Learning APIs

October 3, 2017

By: Mark Edmondson

One of the greatest assets human beings possess is the power of speech and language, from which almost all our other accomplishments flow. To be able to analyse communication offers us a chance to gain a greater understanding of one another. To help you with this, googleLanguageR is an R package that allows you to perform speech-to-text transcription, neural net translation and natural language processing via the Google Cloud machine learning services.

Text Analysis R Developers' Workshop 2017

May 3, 2017

By: Ken Benoit

On 21-22 April, the London School of Economics hosted the Text Analysis Package Developers’ Workshop, a two-day event that brought together developers of R packages for working with text and text-related data. The packages represented cover a wide range of applications: string handling (stringi) and tokenization (the rOpenSci-onboarded tokenizers, KoNLP), corpus and text processing (readtext, tm, quanteda, and qdap), natural language processing (NLP) such as part-of-speech and dependency tagging (cleanNLP, spacyr), and the statistical analysis of textual data (stm, text2vec, and koRpus), although this list is hardly complete.

New package tokenizers joins rOpenSci

August 23, 2016

By: Lincoln Mullen

The R package ecosystem for natural language processing has been flourishing lately. R packages for text analysis have usually been based on the classes provided by the NLP or tm packages, and many of them depend on Java. But recently there have been a number of new packages for text analysis in R, most notably text2vec, quanteda, and tidytext. These packages are built on top of Rcpp instead of rJava, which makes them much more reliable and portable.
