While many people groan at the thought of participating in a group ice breaker activity, we’ve gotten consistent feedback from people who have been to recent rOpenSci unconferences:
“Best ice breaker ever!”
We’ve had lots of requests for a detailed description of how we do it. This post shares our recipe, including a script you can adapt, a reflection on its success, examples of how others have used it, and some tips to remember. Let us know in the comments if you’ve used or adapted it!
...rOpenSci’s software engineer / postdoc Jeroen Ooms will explain what images are, under the hood, and showcase several rOpenSci packages that form a modern toolkit for working with images in R, including opencv, av, tesseract, magick and pdftools.
🕘 Thursday, November 15, 2018, 10-11AM PST; 7-8PM CET (find your timezone)
☎️ Find all details for joining the call on our Community Calls page. Everyone is welcome. No RSVP needed.
Agenda
Abstract
Images in various forms are used for numerous applications across scientific disciplines. Whether you are observing through satellite or microscope, looking at MRI scans or petri dishes, trying to find patterns or abnormalities, the data is in the image. Unfortunately, the tools for working with images are traditionally highly fragmented by field, and often narrow in scope. At rOpenSci we are working on a suite of general-purpose packages based on powerful C/C++ libraries. These provide an extensible and interoperable foundation for working with images in R, which can be used to implement domain-specific methods. This talk gives a taste of things we can currently do with images in R, and highlights some of the ongoing developments and challenges.
...pubchunks is a package grown out of the fulltext package. fulltext provides a single interface to many sources of full text scholarly articles. As part of the user flow in fulltext there is an extraction step where fulltext::chunks() pulls parts of articles out of XML format article files.
As part of making fulltext more maintainable and focused on simply fetching articles, and realizing that pulling out bits of structured XML files is a more general problem, we broke out pubchunks into a separate package. fulltext::ft_chunks() and fulltext::ft_tabularize() will eventually be removed and we’ll point users to pubchunks.
Every R package has its story. Some packages are written by experts, some by novices. Some are developed quickly, others were long in the making. This is the story of jstor, a package which I developed during my time as a student of sociology, working on a research project on the scientific elite within sociology. Writing the package has taught me many things (more on that later) and it is deeply gratifying to see that others find the package useful....
Proper identification of individuals is crucial for acknowledging and studying their scientific work, be it journal articles or pieces of software. In this tech note, one year after CRAN started supporting ORCIDs, we shall explain why and how to use unique author identifiers in DESCRIPTION files.
Why use ORCIDs on CRAN?
When analyzing the authorship of CRAN packages, one can look at authors’ names and email addresses. But names can be written with or without quotes, and email addresses change, which makes it all tricky, as noted by David Smith when he looked for the most prolific CRAN authors (notice our very own Scott Chamberlain and Jeroen Ooms in that scoreboard, by the way?). Besides, several people can have the same name!
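As a minimal sketch of what such an identifier can look like in practice, the person() entry below carries an ORCID in its comment argument. In a DESCRIPTION file this call would be the value of the Authors@R field. The name and email address are placeholders, and the iD is assumed here to be the fictitious sample record that ORCID uses in its own documentation, so none of these values refer to a real package author.

```r
# Value of the Authors@R field in a DESCRIPTION file.
# Name, email and ORCID iD are placeholders for illustration only.
person(
  given   = "Jane",
  family  = "Doe",
  role    = c("aut", "cre"),                  # author and maintainer
  email   = "jane.doe@example.org",
  comment = c(ORCID = "0000-0002-1825-0097")  # unique author identifier
)
```

Because the identifier stays the same when a name is spelled differently or an email address changes, it gives anyone analyzing authorship a stable key to match on.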
...