Tuesday, September 10, 2024 From rOpenSci (https://ropensci.org/blog/2024/09/10/script-screenshots/). Except where otherwise noted, content on this site is licensed under the CC-BY license.
As part of our work documenting R-universe, we’re adding screenshots of the interface to the documentation website. Taking screenshots manually could quickly become cumbersome, especially as we expect they’ll need updating in the future: we might want to change the universes we feature, or the interface might improve yet again and therefore look slightly different. We therefore opted for a programmatic approach. In this post we present what we learned from using the R packages chromote and magick to produce screenshots.
To take a screenshot programmatically, we need to somehow get control of a browser from a script. The chromote R package is an actively maintained wrapper for Chrome Remote Interface, authored by Winston Chang and Barrett Schloerke. With chromote, you can open a browser in the background, navigate to the page of your choice, interact with it, and capture screenshots. chromote powers the experimental live web-scraping in the rvest package and is also used in shinytest2.
Jeroen Ooms’ magick R package isn’t required to take screenshots, but we wanted to format the images by adding shadows, so we needed this image manipulation tool as well.
We start by initiating a chromote session of a standard screen size. We found the default screen size too narrow for our needs.
library("chromote")
screen_width <- 1920
screen_height <- 1080
b <- ChromoteSession$new(height = screen_height, width = screen_width)
We created a function that wraps b$screenshot() while still exposing the parameters we need to tweak for some of the pages. The chromote package, like Shiny, uses R6. Our function also calls the handy magick::image_shadow() for adding a shadow to the image.
screenshot <- function(b, img_path,
                       selector = "html",
                       cliprect = c(top = 0, left = 0, width = 1920, height = 1080),
                       expand = NULL) {
  # Take the screenshot, exposing the arguments we tweak for some pages
  b$screenshot(
    img_path,
    selector = selector,
    delay = 2,
    cliprect = cliprect,
    expand = expand
  )
  # Add a shadow and overwrite the image file
  magick::image_read(img_path) |>
    magick::image_shadow() |>
    magick::image_write(img_path, quality = 100)
}
We can now capture the search interface of R-universe. In our script, we’ve added a generous Sys.sleep() call to ensure pages are properly loaded.
b$Page$navigate("https://r-universe.dev/search/")
Sys.sleep(1)
screenshot(b, "search.png")
This example captures the screenshot below.
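A fixed Sys.sleep() can wait longer than needed or not long enough. One possible alternative, sketched here under assumptions, is to poll a condition until it holds or a timeout expires. The wait_until() helper is hypothetical (it is not part of chromote), and checking document.readyState through b$Runtime$evaluate() is only one candidate condition: it may not cover content that JavaScript renders after the page has loaded.

```r
# Hypothetical helper: call `condition()` repeatedly until it returns TRUE,
# or stop with an error once `timeout` seconds have elapsed.
wait_until <- function(condition, timeout = 10, interval = 0.25) {
  deadline <- Sys.time() + timeout
  repeat {
    if (isTRUE(condition())) {
      return(invisible(TRUE))
    }
    if (Sys.time() >= deadline) {
      stop("Timed out waiting for condition", call. = FALSE)
    }
    Sys.sleep(interval)
  }
}

# Possible use with a live chromote session `b` (not run here):
# b$Page$navigate("https://r-universe.dev/search/")
# wait_until(function() {
#   b$Runtime$evaluate("document.readyState")$result$value == "complete"
# })
# screenshot(b, "search.png")
```

Because the helper fails loudly on timeout, a page that never finishes loading stops the script instead of silently producing a half-rendered screenshot.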
We also wanted to demonstrate searching for something. We followed the example in a gist to enter text into the search navbar, using b$DOM$querySelector(), b$DOM$focus() and b$Input$insertText(). It was easier to reload the page than to try to delete the current text before doing a new search.
screenshot_search <- function(query, screen_width) {
  # Reload the search page so the search box starts empty
  b$Page$navigate("https://r-universe.dev/search/")
  Sys.sleep(2)
  # Find the search box, focus it, and type the query
  search_box <- b$DOM$querySelector(b$DOM$getDocument()$root$nodeId, "#search-input")
  b$DOM$focus(search_box$nodeId)
  b$Input$insertText(text = query)
  Sys.sleep(2)
  # Name the file after the query
  screenshot(
    b, sprintf("search-%s.png", snakecase::to_lower_camel_case(query))
  )
}
# Take one screenshot per example query
purrr::walk(
  c('"missing-data"', "author:jeroen json", "exports:tojson"),
  screenshot_search,
  screen_width = screen_width
)
The search interface of R-universe suggests some advanced search fields if one clicks on the arrow-down button near the search button. Through a GitHub code search, we found an example of clicking in the source code of rvest that we were able to adapt.
The code finds the arrow-down button through its class rather than its id, so it might be a bit brittle over time. Once the button is found, we retrieve its “box model”, which is a set of coordinates, and then calculate the centre coordinates. After that, we perform a “click”, which is actually a three-step action: moving the mouse, pressing the mouse button, and releasing it.
# Searching, advanced fields ----
b$Page$navigate("https://r-universe.dev/search/")
Sys.sleep(1)
# Find the arrow-down button through its classes
search_info <- b$DOM$querySelector(
  b$DOM$getDocument()$root$nodeId,
  "button.btn.btn-outline-secondary.dropdown-toggle.dropdown-toggle-split"
)
# The content quad holds four (x, y) corner pairs; average them for the centre
quads <- b$DOM$getBoxModel(search_info$nodeId)
content_quad <- as.numeric(quads$model$content)
center_x <- mean(content_quad[c(1, 3, 5, 7)])
center_y <- mean(content_quad[c(2, 4, 6, 8)])
# A click is three events: move, press, release
b$Input$dispatchMouseEvent(
  type = "mouseMoved",
  x = center_x,
  y = center_y
)
b$Input$dispatchMouseEvent(
  type = "mousePressed",
  x = center_x,
  y = center_y,
  button = "left",
  clickCount = 1
)
b$Input$dispatchMouseEvent(
  type = "mouseReleased",
  x = center_x,
  y = center_y,
  button = "left",
  clickCount = 1
)
Sys.sleep(2)
screenshot(b, "search-advanced.png")
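If several elements need clicking, the steps above could be folded into reusable helpers. The following is a minimal sketch, not the script's actual code: quad_center() and click_element() are hypothetical names, and the check on a node id of 0 assumes that this is what the protocol returns when no element matches the selector.

```r
# Centre of a quad given as c(x1, y1, x2, y2, x3, y3, x4, y4).
quad_center <- function(quad) {
  quad <- as.numeric(quad)
  stopifnot(length(quad) == 8)
  c(x = mean(quad[c(1, 3, 5, 7)]), y = mean(quad[c(2, 4, 6, 8)]))
}

# Hypothetical helper: click the first element matching `selector`
# in the chromote session `b`, using the same three mouse events as above.
click_element <- function(b, selector) {
  node <- b$DOM$querySelector(b$DOM$getDocument()$root$nodeId, selector)
  # Assumption: a nodeId of 0 means nothing matched the selector
  if (node$nodeId == 0) {
    stop("No element matches selector: ", selector, call. = FALSE)
  }
  center <- quad_center(b$DOM$getBoxModel(node$nodeId)$model$content)
  b$Input$dispatchMouseEvent(
    type = "mouseMoved", x = center[["x"]], y = center[["y"]]
  )
  for (type in c("mousePressed", "mouseReleased")) {
    b$Input$dispatchMouseEvent(
      type = type, x = center[["x"]], y = center[["y"]],
      button = "left", clickCount = 1
    )
  }
  invisible(center)
}

# Possible use with a live session (not run here):
# click_element(b, "button.dropdown-toggle-split")
```

Keeping the centre calculation in its own function means it can be checked on plain numbers, without a browser.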
The package pages on R-universe are quite exhaustive. We needed to capture screenshots of different sections. When using a URL with a fragment, for instance https://r-spatial.r-universe.dev/sf#citation, we got the exact same screenshot as for https://r-spatial.r-universe.dev/sf. We therefore had to use the cliprect argument of the screenshot method. Figuring out we needed it was a first gotcha. A second gotcha is that cliprect is “a (unnamed) vector/list with left, top, width and height”, and it’s the order, not the names, that matters.
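To make the second gotcha harder to trip over, one option is a small helper that accepts named arguments for readability and returns the unnamed vector in the order quoted above (left, top, width, height). This is a sketch only: make_cliprect() is a hypothetical name, not part of chromote, and the 2500-pixel offset in the usage comment is a made-up value for illustration.

```r
# Hypothetical helper: build a cliprect in the order quoted above
# (left, top, width, height), dropping names since only position matters.
make_cliprect <- function(left = 0, top = 0, width = 1920, height = 1080) {
  rect <- c(left, top, width, height)
  stopifnot(is.numeric(rect), length(rect) == 4, all(rect >= 0))
  unname(rect)
}

# Same width and height as the session, shifted down the page
rect <- make_cliprect(top = 2500)

# Possible use with the screenshot() wrapper defined earlier
# (needs a live chromote session `b`, so it is not run here):
# b$Page$navigate("https://r-spatial.r-universe.dev/sf")
# screenshot(b, "sf-citation.png", cliprect = rect)
```

Whatever names the caller uses, the value handed to the screenshot method is always positional, which matches how the argument is described.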
In this post we explained how we used the chromote and magick R packages to produce screenshots for the documentation website of R-universe. Find the current version of our script. Do you sometimes use chromote or a similar programmatic browser interface?