rOpenSci | Blog

Thursday, February 8, 2018

Apply to attend rOpenSci unconf 2018!

For a fifth year running, we are excited to announce the rOpenSci unconference, our annual event loosely modeled on Foo Camp. rOpenSci unconferences have a rich history. You can get a feel for them by reading collected stories about people and projects from unconf17.

We’re organizing unconf18 to bring together scientists, developers, and open data enthusiasts from academia, industry, government, and non-profits for a couple of days to hack on various projects and generally enrich our community. The agenda is mostly decided during the unconference itself. Past projects have related to open data, data visualization, data publication, and open science using R. This event is unlike many other unconferences in that it is primarily invite-only, with a few spots set aside for self-nominations from the community at large. That’s you!

...

By Stefanie Butland

Tuesday, February 6, 2018

The prequel to the drake R package

The drake R package is a pipeline toolkit. It manages data science workflows, saves time, and adds more confidence to reproducibility. I hope it will impact the landscapes of reproducible research and high-performance computing, but I originally created it for different reasons. This post is the prequel to drake’s inception. There was struggle, and drake was the answer.

Dissertation frustration

My dissertation project was intense. The final computational challenge was to analyze multiple genomics datasets using an emerging method and its competitors. Even with GPU computing, which shrank days of runtime down to hours, the full battery of Markov chain Monte Carlo runs took several weeks from start to finish. I organized my workflow as an R package, and I worked in a loop:

...

By Will Landau

Monday, January 29, 2018

Introducing Maëlle Salmon, rOpenSci’s new Research Software Engineer

We’re very pleased to be introducing someone who needs no introduction in the R community. Join us in welcoming Maëlle Salmon to rOpenSci as a Research Software Engineer (part time, working from Nancy, France). We’d like to formally introduce her here and share a bit about the kinds of things she’ll be working on.

Maëlle did a B.Sc. in Biology with an emphasis on maths and quantitative work, two Master’s degrees (one in Ecology and one in Public Health), and a Ph.D. in epidemiological statistics at the Ludwig-Maximilian University in Germany. Her thesis dealt with statistical algorithms for aberration detection in time series of counts of reported cases of infectious diseases. Most recently, Maëlle worked as a data manager and statistician for the CHAI project. Maëlle has contributed six packages to rOpenSci to date, and has written about two of them, ropenaq and rtimicropem, for our guest blog series about onboarded software.

...

By Stefanie Butland, Scott Chamberlain, Maëlle Salmon

Thursday, January 25, 2018

nodbi: the NoSQL Database Connector

DBI

What is DBI? DBI is an R package. It defines an interface to relational database management systems (R/DBMS) that other R packages build upon to interact with a specific relational database, such as SQLite or PostgreSQL.

NoSQL

NoSQL databases are a very broad class of database that includes document databases such as CouchDB and MongoDB, key-value stores such as Redis, and more. They are generally not row-column relational stores, although the class can include those too. NoSQL is now often read as “not only SQL”.

...

By Scott Chamberlain

Wednesday, January 17, 2018

fulltext v1: text-mining scholarly works

The problem

Text-mining - the art of answering questions by extracting patterns, data, etc. out of the published literature - is not easy.

Publishers make it incredibly difficult. The vast majority of publicly funded research across the globe is published in paywalled journals. That is, taxpayers pay twice for research: once for the grant to fund the work, then again to be able to read it. These paywalls mean that each person who wants to text-mine will have different access: some have access through their university, some may have access through their company, and others may only have access to whatever happens to be open access. On top of that, access to paywalled journals often depends on your IP address, something that is not top of mind for most people.

...

By Scott Chamberlain