phylotaR: Retrieve Orthologous Sequences from GenBank

August 8, 2018

By: Dom Bennett

In this technote I will outline what phylotaR was developed for, how to install it and how to run it with some simple examples.

What is phylotaR? In any phylogenetic analysis it is important to identify orthologous sequences, that is, homologous sequences separated by speciation events. This is often done by simply searching an online sequence repository using sequence labels. Relying solely on sequence labels, however, can miss sequences that are unlabelled, have unanticipated names or have been mislabelled.

Extracting and Processing eBird Data

August 7, 2018

By: Matthew Strimas-Mackey

eBird is an online tool for recording bird observations. The eBird database currently contains over 500 million records of bird sightings, spanning every country and nearly every bird species, making it an extremely valuable resource for bird research and conservation. These data can be used to map the distribution and abundance of species, and assess how species’ ranges are changing over time. This dataset is available for download as a text file; however, this file is huge (over 180 GB!

treeio: Phylogenetic data integration

May 17, 2018

By: Guangchuang Yu

Phylogenetic trees are commonly used to present evolutionary relationships of species. Newick is the de facto format in phylogenetics for representing trees. The Nexus format incorporates Newick tree text with related information organized into separate units known as blocks. For the R community, the ape and phylobase packages import trees from Newick and Nexus formats. However, analysis results (tree + analysis findings) from widely used software packages in this field are not well supported.

Nomisr - Access 'Nomis' UK Labour Market Data

May 8, 2018

By: Evan Odell

I’m excited to announce a new package for accessing official statistics from the UK. nomisr is the R client for the Nomis database. Nomis is run by Durham University on behalf of the UK’s Office for National Statistics (ONS), and contains over a thousand datasets, primarily on the UK labour market, census data, benefit spending and general economic activity. Registration is optional, although registering and using an API key allow for larger queries without the risk of being timed out or rate limited by the API.

ὕδωρ + σκοπῶ = water + observe

April 3, 2018

By: Konstantinos Vantas

Hydrology is a concept to unify statistics, data analysis and numerical models in order to understand and analyze the endless circulation of water between the earth and its atmosphere. That’s a lot like Data Science, isn’t it? Hydrologic processes evolve in space and time and are extremely complex, and we may never fully comprehend them. For this reason hydrologists use models whose inputs and outputs are measurable variables: climatic and hydrologic data, land uses, vegetation coverage, soil type etc.
