rOpenSci | Blog

All posts (Page 67 of 120)

rcites - The story behind the package

The Ecology Hackathon

Almost one year ago, ecologists filled a room for the “Ecology Hackathon: Developing R Packages for Accessing, Synthesizing and Analyzing Ecological Data”, co-organised by rOpenSci Fellow Nick Golding and Methods in Ecology and Evolution. The hackathon was part of the “Ecology Across Borders” Joint Annual Meeting 2017 of BES, GfÖ, NecoV, and EEF in Ghent. Groups formed at separate tables, each working on a different idea to implement as an R package.

At our table, we were around ten people who knew next to nothing about what we were aiming for. We barely knew each other and nobody had clear expectations, just the desire to learn more about R packages. What we shared was an interest in an idea posted as a wishlist item in the rOpenSci community: building an R package to interact with CITES and its Speciesplus database. CITES (the Convention on International Trade in Endangered Species of Wild Fauna and Flora) is an international agreement between governments that provides key information to ensure that international trade in specimens of wild animals and plants does not threaten their survival.

At 10 am, nobody had a clear idea of where to start. By 6 pm, we had a functional prototype of the rcites package, which was really rewarding and motivated us to follow up on the package development. We worked well as a team, met new researchers, and learned a bunch of new stuff. This was definitely a successful hackathon!

...

Pdftools 2.0: powerful pdf text extraction tools

A new version of pdftools has been released to CRAN. Go get it while it’s hot:

install.packages("pdftools")

This version has two major improvements: low-level text extraction and encoding improvements.

About PDF textboxes

A PDF document may seem to contain paragraphs or tables in a viewer, but this is not actually true. PDF is a printing format: a page consists of a series of unrelated lines, bitmaps, and textboxes with a given size, position and content. Hence a table in a PDF file is really just a large unordered set of lines and words that are positioned to look like a table. This makes sense for printing, but makes extracting text or data from a PDF file extremely difficult.
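To make this concrete, here is a minimal sketch in base R. The `boxes` data frame and its values are made up for illustration; they stand in for the position-plus-text records that a PDF page stores, and they do not come from pdftools or from a real file. The sketch shows that a "table" only emerges after sorting and grouping the boxes by their coordinates.

```r
# Hypothetical textboxes from one page, in arbitrary storage order.
# x and y are the top-left position of each box (assumed units: points).
boxes <- data.frame(
  x    = c(200, 50, 50, 120, 200, 120),
  y    = c(100, 120, 100, 120, 120, 100),
  text = c("count", "wolf", "species", "Canis", "12", "genus"),
  stringsAsFactors = FALSE
)

# Sort top to bottom, then left to right.
ordered <- boxes[order(boxes$y, boxes$x), ]

# Treat boxes that share a y coordinate as one visual row.
rows <- split(ordered$text, ordered$y)
table_lines <- vapply(rows, paste, character(1), collapse = " | ")

writeLines(table_lines)
# species | genus | count
# wolf | Canis | 12
```

Real documents are rarely this tidy: boxes in the same visual row often differ by a point or two in `y`, so grouping on exact equality, as done here, is a simplification.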

...

Generating reasonable starting trees for complex phylogenetic analyses

I never really thought I would write an R package. I use R pretty casually. Then, this year, I was invited to participate during the last week of the Analytical Paleobiology short course, an intensive month-long experience in quantitative paleontology. I was thrilled to be invited. But I got a slight sinking feeling in my stomach when I realized all the materials were in R.

And so I, a Pythonista, decided I would spend some of my maternity leave writing R packages to try to blend in with students who had spent the month living and breathing R.

...

Community Call - Governance strategies for open source research software projects

🎤 Dan Sholler, rOpenSci Postdoctoral Fellow

🕘 Tuesday, December 18, 2018, 10-11 AM PST; 7-8 PM CET

☎️ Details for joining the Community Call. Everyone is welcome. No RSVP needed.

Researchers use open source software for the capabilities it provides, such as streamlined data access and analysis and interoperability with other pieces of the scientific computing ecosystem. For most complex software, generating these technical capabilities requires building and governing a community via sound management practices, activities that are often less visible than code contributions and other software development work. And unless the initial developers commit to doing all the needed work for a long time, a community needs to develop to sustain the software, and in many cases, to determine where the software should go. In this call, we’ll pull back the cover on some of the non-technical work that goes into building and sustaining a software project by highlighting the governance challenges projects face and the strategies they use to overcome them.

...

rnoaa: new data sources and NCDC units

We’ve just released a new version of rnoaa with A LOT of changes. Check out the release notes for a complete list of changes.

We’ll highlight a few things in this post:

  • New data sources in the package
  • NCDC units added to the output of ncdc()

Installation

Install the latest version from CRAN

install.packages("rnoaa")

Some binaries are not up yet on CRAN - you can also install from GitHub:

...
