Deploying an R Markdown Jekyll site to GitHub Pages

GitHub Pages’ built-in Jekyll support makes it very easy for techie types to deploy static web sites. Simply push your plain text Markdown content to a repository and the server-side Jekyll engine will render it for the web.

Markdown is good, but R Markdown is even better, assuming we ever want to write anything involving plots or data analysis. How can we write and edit blog posts in R Markdown and serve them on GitHub Pages without having to build everything locally?

In this post I will explain how you can use Travis CI to knit R Markdown posts and deploy them to a GitHub Pages Jekyll site.

Yihui Xie [1] has added Jekyll support to the servr package and published a blog post and GitHub repository demonstrating how to use servr::jekyll to serve a Jekyll site locally with R Markdown.

This is a nice but incomplete solution, because it means we can only really write and edit posts from computers on which R, servr and Jekyll are installed. Jekyll is written in Ruby, which can be a pain to install on Windows and fiddly to configure for GitHub Pages.

Ideally, we want to be able to maintain our web site from anywhere we have access to a web browser, just as we would a WordPress or Blogger site.

What we are going to do instead is knit our R Markdown posts in the cloud and have the build automatically push the resulting plain Markdown files and images to GitHub, where they will be served by Jekyll like a regular site.

For reference, I have created a minimal working repository.

Linking GitHub to Travis

If you have not already done so, set up a regular GitHub Pages Jekyll site. There are plenty of good guides for this on the web, so I won’t go into the details here.

Once that is all working smoothly, sign up for a free Travis CI account. Travis is a service designed to run unit tests on software packages so that bugs are not introduced during development. However, rather than running tests on software, we are going to be using Travis’s infrastructure to build our web site for us.

In the Travis ‘Accounts’ screen, look for your site’s repository and switch it on (green tick). It doesn’t actually do anything just yet, but now Travis knows to watch for future commits to this repo.

You will also need to generate a personal access token on GitHub, which grants Travis permission to push to your repository. Copy it to your clipboard.

Back on Travis, on the settings page for your repository, create an Environment Variable with name GITHUB_PAT and paste the personal access token into the Value field.

Configuring the Travis build

Travis is controlled by a file called .travis.yml that lives in the root of your Git repository. As a baseline, I recommend creating a file that contains the following configuration.

language: r
cache: packages
pandoc_version: 1.17.2

branches:
  only: source

script:
  - Rscript -e 'servr:::knit_maybe(c(".", "_source", "_posts"), c(".", "_posts", "_posts"), "build.R", "jekyll")'
  
deploy:
  provider: pages
  skip_cleanup: true
  github_token: $GITHUB_PAT
  on:
    branch: source
  target_branch: master

Let’s walk through this, line by line. The first two lines are:

language: r
cache: packages

Since we will be knitting R Markdown files into Markdown, we want Travis to have a copy of R installed. R is natively supported in Travis thanks to work by the community.

A Travis R build comes with pandoc and LaTeX, ostensibly for building R package documentation. To generate standalone R Markdown documents or web sites with Travis, you need to hoodwink the system into thinking it is building a real R package.

The most minimal R package comprises a single file, called DESCRIPTION. More on that below.

pandoc_version: 1.17.2

Pandoc is a key piece of software that R Markdown uses to convert documents between various formats. By default, Travis seems to use an old version of pandoc (1.15 or so), which can cause unexpected errors when trying to render R Markdown documents. At the time of writing 1.17.2 seems to be the recommended version of pandoc for R Markdown, though I expect newer releases should be fine, too.

branches:
  only: source

Choose the branch of your repository to which you will submit your code. For a personal site—i.e. username.github.io—GitHub says the final rendered web site files have to be on the master branch, so we want to push our source code somewhere else. I have opted to use a branch called source but you can use whatever you like.

Whenever you push commits to the source branch, Travis will notice and start a build. The output will then be deployed to another branch. We choose only: source so that Travis doesn’t trigger itself when it pushes your site to the master branch; otherwise we would get an endless feedback loop.
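If the source branch does not exist yet, it can be created from the command line. Here is a minimal sketch that tries the command in a scratch repository (the temporary directory is purely for illustration and is not part of the original setup). In your real repository you would run only the checkout line and then publish the branch with git push -u origin source, assuming your remote is called origin.

```shell
#!/bin/sh
# Scratch repository so the command can be tried without touching a real site.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# Create the branch that will hold the R Markdown sources and switch to it.
git checkout -q -b source

# Show which branch HEAD now points at.
current=$(git symbolic-ref --short HEAD)
echo "$current"
```

The branch name only has to match whatever you put under branches: and on: in .travis.yml.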

script:
  - Rscript -e 'servr:::knit_maybe(c(".", "_source", "_posts"), c(".", "_posts", "_posts"), "build.R", "jekyll")'

When you push a new commit to the repository, the script above looks for .Rmd files, converts them into .md files and puts them in the root directory (for R Markdown pages) or the _posts directory (for R Markdown blog posts).
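To make that mapping concrete, here is a small sketch of where each file should end up, going by the two directory lists passed in the script line (".", "_source", "_posts" as inputs and ".", "_posts", "_posts" as outputs). The knit_target function is a hypothetical helper written for this illustration. It is not part of servr and only mimics the naming, without knitting anything.

```shell
#!/bin/sh
# Hypothetical helper: predict the output path for a given .Rmd file,
# assuming the input -> output directory pairs from the script line:
#   .        -> .
#   _source  -> _posts
#   _posts   -> _posts
knit_target() {
  src=$1
  dir=$(dirname "$src")
  base=$(basename "$src" .Rmd)
  case "$dir" in
    _source|./_source|_posts|./_posts) out_dir=_posts ;;
    .) out_dir=. ;;
    *) echo "not a watched directory: $dir" >&2; return 1 ;;
  esac
  echo "$out_dir/$base.md"
}

knit_target about.Rmd                      # a page in the root directory
knit_target _source/2017-03-01-hello.Rmd   # a blog post (file name is made up)
```

The first call prints ./about.md and the second prints _posts/2017-03-01-hello.md.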

Why not use servr::jekyll(serve = TRUE)? Because that command requires Jekyll to be installed—not available on Travis’s R environment—and we aren’t interested in building the whole site with Jekyll on Travis anyway. All we want is plain Markdown files and images, which GitHub Pages’ own Jekyll engine will build into an HTML site for us.

deploy:
  provider: pages
  skip_cleanup: true
  github_token: $GITHUB_PAT
  on:
    branch: source
  target_branch: master

Once the site is built, it needs to be published or deployed somewhere. The line provider: pages means we take advantage of Travis’s native GitHub Pages support and don’t have to write our own shell script to run all the complicated git commands.

Setting skip_cleanup: true stops Travis from deleting everything it has built before deploying. That cleanup might be what you want when testing an R package, but not when building a web site.

The GitHub personal access token gives Travis permission to push to your repository. Make sure the variable name (after the $ sign) matches the one you set in Travis settings.

The last few lines specify where Travis should look for your source code (R Markdown and Markdown files) and where it should deploy the generated Markdown files. If you are working on a Project page rather than a User page, then you probably want to change the settings to the following.

  on:
    branch: master
  target_branch: gh-pages

DESCRIPTION file

To convince Travis it is building a valid R package, include a DESCRIPTION file in the root directory of the repository with the following contents.

Package: placeholder
Title: Does not matter.
Version: 0.0.1
Imports: servr, rmarkdown

The Package, Title and Version are arbitrary, but Imports describes which R packages should be installed when building your site. You need servr and rmarkdown at least. If R code chunks in your blog posts make use of other R packages, you might want to include those here as well.
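To work out which extra packages belong in Imports, a rough scan of your posts can help. The sketch below uses a hypothetical helper, list_rmd_packages, that lists packages loaded with library(pkg) or require(pkg). It is deliberately simple: it will miss quoted names such as library("pkg") and pkg::fun() style calls. The demo file and its contents are made up.

```shell
#!/bin/sh
# Hypothetical helper: list packages that .Rmd files under a directory
# load via library(pkg) or require(pkg), one per line, without duplicates.
list_rmd_packages() {
  find "$1" -name '*.Rmd' -exec cat {} + |
    grep -oE '(library|require)\([A-Za-z][A-Za-z0-9.]*' |
    sed -E 's/^(library|require)\(//' |
    sort -u
}

# Demo on a throwaway post in a temporary directory.
demo=$(mktemp -d)
mkdir "$demo/_source"
printf '%s\n' 'library(ggplot2)' 'require(dplyr)' 'library(ggplot2)' \
  > "$demo/_source/2017-03-01-hello.Rmd"

list_rmd_packages "$demo"
```

For the demo post this prints dplyr and ggplot2, which you would then append to the Imports line alongside servr and rmarkdown.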

build.R

This script is run on each of your R Markdown files. It knits them to Markdown and makes sure plots get saved to the right directory.
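For illustration, here is a minimal sketch of what such a build.R might look like, written out from the shell so it can be dropped into the repository root. This is not the file from my repository. It assumes the script receives the input .Rmd path and the output .md path as its two command-line arguments, that figures should go in one figure/ subfolder per post, and that the site is served from the domain root (hence base.url = "/"). Adjust these to suit your own layout.

```shell
#!/bin/sh
# Write a minimal build.R into the current directory.
# Careful: this overwrites any existing build.R.
cat > build.R <<'EOF'
# Sketch only. Assumed calling convention: Rscript build.R <input.Rmd> <output.md>
local({
  args <- commandArgs(trailingOnly = TRUE)
  # One figure folder per post, e.g. figure/2017-03-01-hello/ (assumed layout)
  post <- tools::file_path_sans_ext(basename(args[1]))
  knitr::opts_chunk$set(fig.path = sprintf("figure/%s/", post))
  # Prefix image links with "/" so they resolve from any post URL
  # (assumes the site lives at the domain root)
  knitr::opts_knit$set(base.url = "/")
  # Use Jekyll-friendly Markdown output for code chunks
  knitr::render_jekyll()
  knitr::knit(args[1], args[2], quiet = TRUE, envir = globalenv())
})
EOF
```

The quoted 'EOF' delimiter keeps the shell from expanding the $ signs inside the R code.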

Push a new post

When you next push a commit to the branch named under on: in .travis.yml (source in this example), Travis will start building and deploying your site. If the build fails, you’ll receive an email about it and can look through the logs to find out why.

You should now have a system that automagically renders and deploys your R Markdown posts every time you push them to your site’s GitHub repository. If anything is unclear, have a look at my minimal working repository or a real example.

If you found this helpful or have any comments or questions, feel free to get in touch.


[1] Yihui has since turned his attention to the blogdown package, which is a much more fleshed-out project based on the Hugo static site generator, a rival to Jekyll. (I will explain how to set up blogdown with Travis in a future post.)