Tuesday, April 7, 2020 From rOpenSci (https://ropensci.org/blog/2020/04/07/bookdown-learnings/). Except where otherwise noted, content on this site is licensed under the CC-BY license.
After soliciting, reviewing, and publishing over 100 blog posts and tech notes by rOpenSci community members, we have created the rOpenSci Blog Guide for Authors and Editors to address many frequently asked questions and frequently given suggestions.
Technically, we structured the content as a bookdown gitbook. It was Stef’s first foray into the glorious process of publishing a book with bookdown, and Maëlle’s second[^1]. And oh, we learned a lot.
There is a 9-hour time difference between Maëlle and me for most of the year. I’m in Kamloops, Canada and Maëlle is in Nancy, France. Since so much of this is new to me, text-based explanations from Maëlle via Slack usually boggled my mind. It helped immensely to have a weekly 30-minute meeting (8:30am for Stef and 5:30pm for Maëlle) with agenda and notes in a shared Google Doc. We would talk through our approaches and priorities and Maëlle would coach me in new-to-me tools.
At the end of a day, I would send a message to Maëlle on Slack to say “I’m finished for the day, can you please review and merge my pull requests?” or “Please review the structure but not the text yet”. Next morning, Maëlle would have done that plus her own work so I could update my local copy of our bookdown book and open new pull requests for new pieces of work.
Part way through the project, we got more strict about assigning issues to ourselves and tying them to specific milestones like “official release” or “nice to have one day”. This helped us work asynchronously toward a common goal, showed some light at the end of the tunnel, and helped us (try to) avoid scope creep.
I’m an R beginner. My blog posts have never contained any R code. I’ve always written them in Markdown using the Atom editor and I’ve always run git from the terminal, building on recipes that Scott Chamberlain and Karthik Ram have shared with me. After three years working in git and GitHub, I’m an intermediate user who still lacks a robust mental model and the vocabulary to explain it[^2]. (This is not self-deprecation.) This bookdown Blog Guide project was the first time I used RStudio hooked up to GitHub to collaborate. In the process, I have become much more comfortable with installing packages as needed and keeping R, RStudio, and package versions up to date with a weekly calendar reminder.
For setting up RStudio and knowing how to open a project from .Rproj I loved R-Ladies Sydney’s “opinionated tour of RStudio”.
I used happygitwithr’s “Connect RStudio to Git and GitHub” to do exactly that. Years ago I had already used happygitwithr to set up git/GitHub and my HTTPS credentials from square one.
Maëlle and I both worked on feature branches of master. On any given day we each might work on multiple chapters of the book. Because each book chapter is generated from one Rmd file (more on that in tip #4), and you don’t want one pull request to be too complex, I would create a separate feature branch for any chapter or file I worked on. I would periodically push my work to GitHub in a pull request (more in tip #3) with “Draft” status so Maëlle could see I was not yet seeking her review. At the end of a day’s commits, I would update the pull request status to “Ready for review” and assign Maëlle as reviewer. She would review, edit, and merge my pull requests and open new branches of her own for me to review.
Adding “Fix #54” in a pull request description automatically closes issue #54 when the pull request is merged into master.
At the start of the project, Maëlle asked me if I used the usethis package. I did not. But I bought into it because "usethis is a workflow package: it automates repetitive tasks that arise during project setup and development, both for R packages and non-package projects." The trickiest part, for me, was committing to setting up and forcing myself to use usethis pull request helpers despite not really understanding how it would help me.
From previous work I had already followed happygitwithr to set up my GitHub authentication via HTTPS and a GitHub Personal Access Token (PAT) in my `.Renviron`.
I followed usethis setup instructions to edit my `.Rprofile` so that its functions would be available without explicitly calling the package.
I also followed the rest of the setup instructions somewhat religiously (e.g. “Prepare your system to build packages from source”), because I wasn’t confident enough to make a judgement on what I would and would not need.
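A minimal sketch of that `.Rprofile` edit, assuming the snippet recommended in the usethis setup article (this is an illustration, not a copy of Stef’s actual file). The `interactive()` guard means usethis is only attached in console sessions, so scripts and automated book builds are unaffected.

```r
# ~/.Rprofile
# Attach usethis in interactive sessions only, so its functions can be
# called without the usethis:: prefix while working in the console.
if (interactive()) {
  suppressMessages(require(usethis))
}
```

With this in place, typing `pr_init("branchname")` in the console works without first calling `library(usethis)`.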
Practically, I kept the usethis pull request helpers article open in my browser and forced myself to use them.
At the start of the day, I would open our bookdown RStudio project by clicking my local `ropensci-blog-guidance.Rproj` file, pull to update my local master (because Maëlle would have reviewed, edited and merged my pull requests while I slept), create a new branch with `pr_init(branch = "branchname")`, make edits and commits, and `pr_push()` to push my local branch to GitHub.
usethis automatically opened a browser to the GitHub web interface prompting me to open a pull request.
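Put together, one day’s cycle might look like the sketch below. It assumes usethis is installed and GitHub credentials are already configured; the branch name is a placeholder, and `pr_finish()` is a usethis helper that is not part of the routine described above, so it is left commented out. The calls are meant to be typed one at a time in the console of the bookdown project, hence the `interactive()` guard.

```r
# A sketch of one pull request cycle with usethis (run line by line in the
# console; the guard stops this from running inside scripts).
if (interactive()) {
  # Create and switch to a new local branch for one chapter or file.
  usethis::pr_init(branch = "branchname")

  # ... edit the Rmd file(s) and commit, e.g. from the RStudio Git pane ...

  # Push the branch to GitHub; usethis then opens the browser so the
  # pull request can be created.
  usethis::pr_push()

  # Assumption, not described in the post: after the pull request is
  # merged, switch back to the default branch and delete the local branch.
  # usethis::pr_finish()
}
```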
There’s no magic conclusion here. This feels a bit better than working on pull requests on the command line. I expect by forcing myself to usethis I’ll discover some magic soon enough.
To answer the question, “how do I do this thing?” I often compared the GitHub file structure and contents of completed books with their corresponding live pages. This, for me, had the biggest payoff in learning how to bookdown.
Things I learned:
- One book chapter is made from one Rmd file.
- Chapters are woven together in the `_bookdown.yml` file that references those Rmd files.
- Create links inside the book with `{#anchor}`. In the `authorcontent.Rmd` file, the heading `# Content Guidelines {#content}` means that a markdown format `[link to that chapter](#content)` anywhere in the book will link to the Content Guidelines chapter.
- Appendices A to H (too many; we know) are created from a single `appendix.Rmd` file made up of groupings of a heading, some text, and sometimes a code chunk that points to a file, like a template or checklist that populates that appendix.
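To make the `_bookdown.yml` point concrete, here is a small sketch in R that writes a hypothetical config to a temporary directory and reads back the chapter order. `rmd_files` is the bookdown field that lists the Rmd files in the order they appear in the book. The `book_filename` value and the exact file list are assumptions for illustration (only `index.Rmd`, `authorcontent.Rmd`, and `appendix.Rmd` are names mentioned in this post), and the sketch needs the yaml package.

```r
# Hypothetical config: the real Blog Guide lists more files than this.
config <- list(
  book_filename = "blog-guide",   # assumed name, for illustration only
  rmd_files = c("index.Rmd", "authorcontent.Rmd", "appendix.Rmd")
)

# Write the config to a temporary _bookdown.yml and print it.
config_path <- file.path(tempdir(), "_bookdown.yml")
yaml::write_yaml(config, config_path)
cat(readLines(config_path), sep = "\n")

# Read it back: the order of rmd_files is the order of the chapters.
chapters <- unlist(yaml::read_yaml(config_path)$rmd_files)
print(chapters)
```

Reordering chapters is then a matter of reordering the entries under `rmd_files`, without renaming any file.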
I often felt quite euphoric about the things I was able to figure out comparing GitHub file structures to their books. Consistently, within days I would take this hard-earned knowledge for granted and feel inadequate in the face of my next technical challenge…until I felt the satisfaction of owning that next one too.
These bookdown books were helpful in exploring those comparisons:
rOpenSci Packages: Development, Maintenance, and Peer Review source on GitHub cf deployed book;
R for Data Science source on GitHub cf deployed book.
What doesn’t appear in this list of tips are all the things I’ve already forgotten that I had trouble with and learned to overcome by reading error messages and searching and poking at my setup until things worked. Looking back in Slack conversations with Maëlle I see that setting up usethis to use my GitHub credentials and serving the bookdown preview using the RStudio Addin were tricky. But errors had a lot to do with making sure the packages I was using were up to date (thus my calendar reminder in tip #2) and needing to install the development version of usethis. This humbling bookdown experience required me to figure out a whole new workflow and up my skills.
I’ve started both my bookdown projects using Sean Kross’ excellent primer, but whilst looking for a reference to show Stef[^3], I saw a tweet of Alison Hill’s about starting a bookdown project from RStudio, which looks handy.
(Image: slide of Alison’s showing how to start a bookdown project from RStudio.)
In the blog guidance, if you hover around the top-right corner of e.g. the Markdown post template you get a copy-paste button. For this to work, the chunk needs to have some language information i.e.
```
code
```
will not get a copy-paste button, but
```yaml
code
```
will! I’m glad I know that now. Chunks with language info are prettier anyway since they get adapted code highlighting.
I found how to write code that’ll be loaded before each chapter rendering thanks to Christophe Dervieux’s answer on an old RStudio community thread.
One simply needs to create an R script called `_common.R` at the root of the bookdown project. Here’s ours below. It contains chunk options, `magrittr` loading, and two helper functions for rendering templates.
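```r
# Default chunk options applied to every chapter
knitr::opts_chunk$set(
  cache = TRUE,
  echo = FALSE
)

library("magrittr")

# Read a template file and return its content either as a fenced block
# or, if details = TRUE, wrapped via details::details().
show_template <- function(filename,
                          lang = "markdown",
                          details = FALSE,
                          yaml_only = FALSE,
                          ...) {

  # Paths containing "roweb2" are read as given; anything else is looked
  # up in the templates/ folder.
  lines <- suppressWarnings(
    if (grepl("roweb2", filename)) {
      readLines(filename)
    } else {
      readLines(
        file.path("templates", filename)
      )
    }
  )

  # Optionally keep only the YAML header of the template
  if (yaml_only) {
    lines <- bookdown:::fetch_yaml(lines)
  }

  lines %>%
    glue::glue_collapse(sep = "\n") -> template

  if (details) {
    toshow <- details::details(template, summary = filename,
                               lang = lang,
                               ...)
  } else {
    toshow <- glue::glue("````{lang}\n{template}\n````")
  }

  return(toshow)
}

# Read one or more checklist files from checklists/ and render their
# lines as a Markdown task list inside a fenced block.
show_checklist <- function(filenames) {
  filenames <- file.path("checklists", filenames)
  purrr::map(filenames,
             readLines) %>%
    unlist() %>%
    gluedown::md_task() %>%
    glue::glue_collapse("\n") -> x

  glue::glue("````markdown\n{x}\n````") %>%
    knitr::asis_output()
}
```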
Our bookdown project uses `DESCRIPTION` to track dependencies. I suppose I could use the package infrastructure more and define the helper functions as functions of a package, but the approach above is pleasant too.
In R Markdown, the same chunk option, `fig.cap`, controls both the caption and the alternative text of images. We wanted alternative text but no caption. The header option `figure_caption` didn’t work. I opened an issue in the bookdown repo after not getting solutions on an RStudio community post.
So if bookdown doesn’t really support having alternative text but no captions for figures, what did we do? Thanks to a good tip by Romain Lesur, I wrote CSS to remove all captions.
```css
.caption {
  display: none;
}
```
The lines above live with other styling stuff in a file called `style.css`, which we refer to in the `_output.yml` config file.
I felt quite strongly about having some sort of CI/CD for the book: having each edit to the source automatically resulting in an updated deployed book is a much smoother – and lazier – workflow than having to remember to render the book ourselves. We achieved that using GitHub Actions, a new CI service by GitHub. If you’re curious about it, I’d recommend watching Jim Hester’s talk from the RStudio conference earlier this year, and having a look at the exploration book written by participants of the Oz UnConf 2019. Our GitHub Actions workflows make good use of Jim’s actions and examples.
Here’s what we now have. After each push to master, the book is built and deployed to the gh-pages branch:
```yaml
on:
  push:
    branches:
      master

name: Render-Book-from-master

jobs:
  bookdown:
    name: Render-Book
    runs-on: macOS-latest
    steps:
      - uses: actions/checkout@v1
      - uses: r-lib/actions/setup-r@v1
      - uses: r-lib/actions/setup-pandoc@v1
      - name: Query dependencies
        run:
          Rscript -e "install.packages('remotes')" -e "saveRDS(remotes::dev_package_deps(dependencies = TRUE), 'depends.Rds', version = 2)"
      - name: Cache R packages
        uses: actions/cache@v1
        with:
          path: ${{ env.R_LIBS_USER }}
          key: ${{ runner.os }}-r-${{ hashFiles('depends.Rds') }}
          restore-keys: ${{ runner.os }}-r-
      - name: Install dependencies
        run:
          Rscript -e "library(remotes)" -e "deps <- readRDS('depends.Rds')" -e "deps[['installed']] <- vapply(deps[['package']], remotes:::local_sha, character(1))" -e "update(deps)"
      - name: Render Book
        run: Rscript -e 'bookdown::render_book("index.Rmd")'
      - name: Commit results
        if: github.repository == 'ropensci-org/blog-guidance'
        run: |
          cp ghpagescname docs/CNAME
          cp -r favicon/ docs/
          cp images/logo.png docs/logo.png
          cd docs
          git config --global user.email "[email protected]"
          git config --global user.name "gh-pages committer"
          git init
          git add .
          git commit -m 'update book'
          git push https://${{github.actor}}:${{secrets.GITHUB_TOKEN}}@github.com/${{github.repository}}.git HEAD:gh-pages --force
```
After each commit in a pull request from a fork, the book is built so we’d notice something breaking the Rmd (log example).
After each commit in a pull request from the repo, the book is built and deployed to a Netlify preview whose URL is posted in the pull request checks (log example, direct link to the check with the preview link).
```yaml
on: pull_request

name: PR-workflow

jobs:
  bookdown:
    name: Render Book
    runs-on: macOS-latest
    steps:
      - name: Is this a fork
        run: |
          fork=$(jq --raw-output .pull_request.head.repo.fork "${GITHUB_EVENT_PATH}");echo "::set-env name=fork::$fork"
      - uses: actions/checkout@v1
      - uses: r-lib/actions/setup-r@v1
      - uses: r-lib/actions/setup-pandoc@v1
      - name: Query dependencies
        run:
          Rscript -e "install.packages('remotes')" -e "saveRDS(remotes::dev_package_deps(dependencies = TRUE), 'depends.Rds', version = 2)"
      - name: Cache R packages
        uses: actions/cache@v1
        with:
          path: ${{ env.R_LIBS_USER }}
          key: ${{ runner.os }}-r-${{ hashFiles('depends.Rds') }}
          restore-keys: ${{ runner.os }}-r-
      - name: Install dependencies
        run:
          Rscript -e "library(remotes)" -e "deps <- readRDS('depends.Rds')" -e "deps[['installed']] <- vapply(deps[['package']], remotes:::local_sha, character(1))" -e "update(deps)"
      - name: Render Book
        run: Rscript -e 'bookdown::render_book("index.Rmd")'
      - uses: actions/setup-node@v1
        if: env.fork == 'false'
        with:
          node-version: "12.x"
      - name: Install Netlify CLI
        if: env.fork == 'false'
        run: npm install netlify-cli -g
      - name: Deploy to Netlify (test)
        if: env.fork == 'false'
        run: DEPLOY_URL=$(netlify deploy --site ${{ secrets.NETLIFY_SITE_ID }} --auth ${{ secrets.NETLIFY_AUTH_TOKEN }} --dir=docs --json | jq '.deploy_url' --raw-output);echo "::set-env name=DEPLOY_URL::$DEPLOY_URL"
      - name: Create check
        if: env.fork == 'false'
        run: |
          curl --request POST \
          --url https://api.github.com/repos/${{ github.repository }}/check-runs \
          --header 'authorization: Bearer ${{ secrets.GITHUB_TOKEN }}' \
          --header 'Accept: application/vnd.github.antiope-preview+json' \
          --header 'content-type: application/json' \
          --data '{
            "name": "Preview Book",
            "head_sha": "${{ github.event.pull_request.head.sha }}",
            "conclusion": "success",
            "output": {
              "title": "Preview link",
              "summary": "[Preview link](${{ env.DEPLOY_URL }}) :rocket:"
              }
            }'
```
Highlights from the pull request workflow above:
The deploy step authenticates with two repository secrets[^5], extracts the preview URL from the Netlify CLI’s JSON output with `jq`[^4], and sets it as an environment variable that can be used by next steps. I got the idea from a thread on the Netlify forum.
```yaml
run: DEPLOY_URL=$(netlify deploy --site ${{ secrets.NETLIFY_SITE_ID }} --auth ${{ secrets.NETLIFY_AUTH_TOKEN }} --dir=docs --json | jq '.deploy_url' --raw-output);echo "::set-env name=DEPLOY_URL::$DEPLOY_URL"
```
After the successful deployment of a preview, one check called “Preview Book” appears in the pull request checks. When clicking on “Details” one gets to a check page where the preview link is prominent.
The first step determines whether the pull request comes from a fork by running `jq` on the raw info about the build, an idea I got from Vanessa Sochat.
```yaml
- name: Is this a fork
  run: |
    fork=$(jq --raw-output .pull_request.head.repo.fork "${GITHUB_EVENT_PATH}");echo "::set-env name=fork::$fork"
```
And some of the further steps are skipped based on `fork`.
```yaml
- name: Install Netlify CLI
  if: env.fork == 'false'
  run: npm install netlify-cli -g
```
If you’re feeling motivated to add GitHub Actions deployment to your bookdown project, on top of Jim’s video mentioned earlier and our workflows, be sure to note that Emil Hvitfeldt wrote an excellent guide to deploying your book on Netlify with GitHub Actions, with screenshots!
In this post, we shared tips and things we learned from novice and (more) experienced perspectives on bookdown and R project management whilst working on the new rOpenSci Blog Guide for Authors and Editors. We’re almost as happy with our new skills as we are with the Blog Guide itself! If you want to get started with bookdown, we’d recommend the bookdown book, this introductory slidedeck by Alison Hill and the bookdown gallery.
[^1]: The first bookdown book Maëlle worked on is the rOpenSci dev guide!
[^2]: Hat-tip to Julia Stewart Lowndes for using this phrasing to describe herself as if she read my mind.
[^3]: A tweet by Hadley Wickham about using the config file for ordering chapters rather than numbering their filenames.
[^4]: Maëlle discovered `jq` in a blog post by Carl Boettiger and reported on her use of `jq` with rOpenSci’s `jqr` R package in a blog post about getting data about Software Peer Review.
[^5]: The secrets are the site ID and your Netlify access tokens. Refer to Emil Hvitfeldt’s walk-through to see where to find your site ID, how to generate a token, and how to save both in the repo settings.