Parsing Metadata with R - A Package Story

October 9, 2018

By: Thomas Klebel

Every R package has its story. Some packages are written by experts, some by novices. Some are developed quickly, others were long in the making. This is the story of jstor, a package which I developed during my time as a student of sociology, while working on a research project on the scientific elite within sociology. Writing the package has taught me many things (more on that later), and it is deeply gratifying to see that others find the package useful.

outcomerate: Transparent Communication of Quality in Social Surveys

October 2, 2018

By: Rafael Pilliard Hellwig

Background: Surveys are ubiquitous in the social sciences, and the best of them are meticulously planned out. Statisticians often decide on a sample size based on a theoretical design, and then proceed to inflate this number to account for “sample losses”. This ensures that the desired sample size is achieved, even in the presence of non-response. Factors that reduce the pool of interviews include participant refusals, inability to contact respondents, deaths, and frame inaccuracies.

Mapping the 2018 East Africa floods from space with smapr

September 25, 2018

By: Max Joseph

Hundreds of thousands of people in East Africa have been displaced and hundreds have died as a result of torrential rains which ended a drought but saturated soils and engorged rivers, resulting in extreme flooding in 2018. This post will explore these events using the R package smapr, which provides access to global satellite-derived soil moisture data collected by the NASA Soil Moisture Active-Passive (SMAP) mission. The package abstracts away some of the complexity associated with finding, acquiring, and working with the HDF5 files that contain the observations. (Shout out to Laura DeCicco and Marco Sciaini for reviewing smapr, and Noam Ross for editing in the rOpenSci onboarding process.)

Building Reproducible Data Packages with DataPackageR

September 18, 2018

By: Greg Finak

Sharing data sets for collaboration or publication has always been challenging, but it’s become increasingly problematic as complex and high-dimensional data sets have become ubiquitous in the life sciences. Studies are large and time-consuming: data collection takes time, and data analysis is a moving target, as is the software used to carry it out. In the vaccine space (where I work) we analyze collections of high-dimensional immunological data sets from a variety of different technologies (RNA sequencing, cytometry, multiplexed antibody binding, and others).

phylotaR: Retrieve Orthologous Sequences from GenBank

August 8, 2018

By: Dom Bennett

In this technote I will outline what phylotaR was developed for, how to install it, and how to run it with some simple examples. What is phylotaR? In any phylogenetic analysis it is important to identify sequences that share the same orthology – homologous sequences separated by speciation events. This is often performed by simply searching an online sequence repository using sequence labels. Relying solely on sequence labels, however, can miss sequences that have not been labelled, have unanticipated names, or have been mislabelled.
