The 2013 rOpenSci challenge
At rOpenSci we’re very passionate about engaging with our community and getting more people on board with open science and open data. There are many challenges to overcome before this practice becomes mainstream. Even when researchers see the value in engaging more openly, the learning curve associated with various aspects of the workflow can seem daunting. To identify some of these challenges and barriers, we launched an open science challenge at the start of the year. If any researchers were interested in using the suite of tools we’ve built so far, we offered to help them through the technical challenges they might encounter. We’re excited to report that we’ll be working closely with Simon Queenborough and Julien Colomb on this effort. Below are brief summaries of their work. Stay tuned for updates.
I began my research career in the tropical forests of Central and South America and, having since worked as a post-doc in arable land, I have a deep appreciation of both the questions and the relative merits of working in hyper-diverse versus depauperate systems. I maintain an active interest in both. My current research program falls into three main areas: understanding diversity, demography of plant populations, and resource allocation.
My research is motivated by the striking differences between plants and animals. The sessile nature, modular construction and flexibility that plants exhibit lead to intriguing challenges in ‘thinking like a plant’. Such challenges include how over 600 species of tree can coexist within 1 ha of rain forest, how populations respond to biotic, abiotic and human factors, and how plants allocate limited resources to growth, defense and reproduction.
The resources available to an organism are finite and are usually allocated to one of three functions: reproduction, growth, or defense. Trade-offs among these functions are central to life history theory, but the ease of quantifying allocation to each varies among taxa. For example, comparing males and females of dioecious species isolates the resource axis.
One of the most curious and obscure problems (Darwin, 1877) in evolutionary biology is how dioecy evolves and is maintained. The seed shadow handicap of dioecious populations requires at least a twofold greater relative fitness than hermaphrodite competitors for coexistence. Thus, successful invasion into hermaphrodite populations must offset this handicap. Specifically, more resources must be invested per seed, producing offspring with higher fitness; but apart from seed size, other traits have received little attention. For the rOpenSci Challenge, I am addressing this deficit using demographic data on thousands of species and their reproductive traits, within a phylogenetic statistical framework.
I did a post-doc at the National Center for Ecological Analysis & Synthesis and analyzed a lot of large datasets, but became increasingly skeptical about the analyses that we run on such data and whether it was possible to replicate them, or even to know fully what had been done. Discovering that all manipulation of data and analyses could be scripted and documented (and therefore checked) along with the manuscript was very exciting. And I love being able to create a full professional-looking paper from only a text file and some data! I hope that participating in the rOpenSci Challenge and working with people who really know what they are doing will increase my efficiency and my understanding of reproducible and open research.
Apart from having founded a company (Drososhare) in 2012, Dr. Julien Colomb works part-time as a post-doc in Bjoern Brembs’ lab in Berlin. They are neurobiologists who use the fly Drosophila melanogaster as a model system. Their main scientific questions are how nervous systems produce spontaneous actions and how these actions are modulated by learning. In particular, they study a specific form of instrumental learning that shares some characteristics with language learning and skill learning in mammals. Using the genetic tools available in Drosophila, they could, for example, demonstrate that the inhibition of PKC prevents the formation of this peculiar form of learning, but not of another instrumental learning task. Dr. Colomb got his PhD in 2006 and learned to use R during his first post-doc in Paris. He was rapidly convinced of its power for data analysis and presentation. His first move into data reuse happened then: he convinced the lab members to use the same data format for their common experiments, allowing metadata analysis at the lab level. In 2009, he started to work with Bjoern Brembs, an avid supporter of open science. He continued to learn R while working on CeTrAn, an open source centroid trajectory analysis software that was in development in the lab. Since publishing the software in PLOS ONE in 2012, he has been working on importing data from different trackers into CeTrAn. Most recently, he tested new tools, again using R, to publish the data automatically and make them available online. In this project, Dr. Colomb and Prof. Brembs are seeking help with R, version control, and database design, and are eager to start collaborating with the rOpenSci team. The upcoming commercial and academic tools will make it possible to have science open by default, and this project is a case study for such developments.