Deploying an R Markdown Jekyll site to GitHub Pages
GitHub Pages' built-in Jekyll support makes it very easy for techie types to deploy static web sites. Simply push your plain text Markdown content to a repository and the server-side Jekyll engine will render it for the web.
Markdown is good, but R Markdown is even better, assuming we ever want to write anything involving plots or data analysis. How can we write and edit blog posts in R Markdown and serve them on GitHub Pages without having to build everything locally?
In this post I will explain how you can use Travis CI to knit R Markdown posts and deploy them to a GitHub Pages Jekyll site.
Yihui Xie has added Jekyll support to the servr package and published a blog post and GitHub repository demonstrating how to use servr::jekyll to serve a Jekyll site locally with R Markdown.
This is a nice but incomplete solution, because it means we can only really write and edit posts on computers that have R, servr and Jekyll installed. Jekyll is written in Ruby, which can be a pain to install on Windows and fiddly to configure for GitHub Pages.
Ideally, we want to be able to maintain our web site from anywhere we have access to a web browser, just as we would a WordPress or Blogger site.
Instead, we are going to knit our R Markdown posts in the cloud and have the build automatically push the resulting plain Markdown files and images to GitHub, where they will be served by Jekyll like a regular site.
For reference, I have created a minimal working repository.
Linking GitHub to Travis
If you have not already done so, set up a regular GitHub Pages Jekyll site. There are plenty of good guides for this on the web, so I won’t go into the details here.
Once that is all working smoothly, sign up for a free Travis CI account. Travis is a service designed to run unit tests on software packages so that bugs are not introduced during development. However, rather than running tests on software, we are going to be using Travis’s infrastructure to build our web site for us.
In the Travis ‘Accounts’ screen, look for your site’s repository and switch it on (green tick). It doesn’t actually do anything just yet, but now Travis knows to watch for future commits to this repo.
You will also need to generate a personal access token on GitHub, which grants Travis permission to push to your repository. Copy it to your clipboard.
Back on Travis, on the settings page for your repository, create an environment variable named GITHUB_PAT and paste the personal access token into the Value field.
Configuring the Travis build
Travis is controlled by a file called .travis.yml that lives in the root of your Git repository.
As a baseline, I recommend creating a file that contains the following configuration.
language: r
cache: packages
pandoc_version: 1.17.2
branches:
  only: source
script:
  - Rscript -e 'servr:::knit_maybe(c(".", "_source", "_posts"), c(".", "_posts", "_posts"), "build.R", "jekyll")'
deploy:
  provider: pages
  skip_cleanup: true
  github_token: $GITHUB_PAT
  on:
    branch: source
  target_branch: master
Let’s walk through this, line by line. The first two lines are:
language: r
cache: packages
Since we will be knitting R Markdown files into Markdown, we want Travis to have a copy of R installed. R is natively supported in Travis thanks to work by the community.
A Travis R build comes with pandoc and LaTeX, ostensibly for building R package documentation. To generate standalone R Markdown documents or web sites with Travis, you need to hoodwink the system into thinking it is building a real R package.
The most minimal R package comprises a single file, called DESCRIPTION. More on that below.
pandoc_version: 1.17.2
Pandoc is a key piece of software that R Markdown uses to convert documents between various formats. By default, Travis seems to use an old version of pandoc (1.15 or so), which can cause unexpected errors when trying to render R Markdown documents. At the time of writing 1.17.2 seems to be the recommended version of pandoc for R Markdown, though I expect newer releases should be fine, too.
branches:
  only: source
Choose the branch of your repository to which you will submit your code.
For a personal site (i.e. username.github.io), GitHub says the final rendered web site files have to be on the master branch, so we want to push our source code somewhere else.
I have opted to use a branch called source, but you can use whatever you like.
Whenever you push commits to the source branch, Travis will notice and start a build.
The output will then be deployed to another branch.
We choose only: source so that Travis doesn’t trigger itself when it pushes your site to the master branch; otherwise we would get an endless feedback loop.
script:
  - Rscript -e 'servr:::knit_maybe(c(".", "_source", "_posts"), c(".", "_posts", "_posts"), "build.R", "jekyll")'
When you push a new commit to the repository, the script above looks for .Rmd files, converts them into .md files and puts them in the root directory (in the case of R Markdown pages) or the _posts directory (in the case of R Markdown blog posts).
Why not use servr::jekyll(serve = TRUE)? Because that command requires Jekyll to be installed, which it is not in Travis’s R environment, and we aren’t interested in building the whole site with Jekyll on Travis anyway.
All we want is plain Markdown files and images, which GitHub Pages' own Jekyll engine will build into an HTML site for us.
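To make the directory pairing in the knit_maybe call above a little more concrete, here is a rough sketch in plain R of how each input directory lines up with its output directory. The plan_outputs helper is hypothetical and is not how servr is implemented; it knits nothing and only lists which .md file each .Rmd file would map to.

```r
# Illustrative only: `plan_outputs` is a made-up helper, not part of servr.
# It pairs each input directory with its output directory and works out
# where every .Rmd file found there would end up as a .md file.
plan_outputs <- function(in_dirs, out_dirs) {
  stopifnot(length(in_dirs) == length(out_dirs))
  plans <- Map(function(in_dir, out_dir) {
    rmd <- list.files(in_dir, pattern = "\\.[Rr]md$")
    if (length(rmd) == 0) return(NULL)
    data.frame(
      input  = file.path(in_dir, rmd),
      output = file.path(out_dir, sub("\\.[Rr]md$", ".md", rmd)),
      stringsAsFactors = FALSE
    )
  }, in_dirs, out_dirs)
  # Drop directories with no .Rmd files; the result is NULL if none are found.
  plans <- Filter(Negate(is.null), unname(plans))
  do.call(rbind, plans)
}

# Same directory vectors as in the .travis.yml script line.
print(plan_outputs(c(".", "_source", "_posts"), c(".", "_posts", "_posts")))
```

Run from the root of the site repository, this prints one row per R Markdown file, so you can check that posts in _source are headed for _posts before letting Travis do the real work.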
deploy:
  provider: pages
  skip_cleanup: true
  github_token: $GITHUB_PAT
  on:
    branch: source
  target_branch: master
Once the site is built, it needs to be published or deployed somewhere.
The line provider: pages means we take advantage of Travis’s native GitHub Pages support and don’t have to write our own shell script to run all the complicated git commands.
Skipping cleanup means Travis doesn’t delete everything it has built before deploying. Deleting the build output is something you might want when testing an R package, but not when building a web site.
The GitHub personal access token gives Travis permission to push to your repository.
Make sure the variable name (after the $ sign) matches the one you set in Travis settings.
The last few lines specify where Travis should look for your source code (R Markdown and Markdown files) and where to deploy the generated Markdown files. If you are working on a Project page rather than a User page, then you probably want to change the settings to the following.
  on:
    branch: master
  target_branch: gh-pages
DESCRIPTION file
To convince Travis it is building a valid R package, include a DESCRIPTION file in the root directory of the repository with the following contents.
Package: placeholder
Title: Does not matter.
Version: 0.0.1
Imports: servr, rmarkdown
The Package, Title and Version are arbitrary, but Imports describes which R packages should be installed when building your site.
You need servr and rmarkdown at least.
If R code chunks in your blog posts make use of other R packages, you might want to include those here as well.
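If you are unsure which packages your posts rely on, a quick scan can give you a starting point for the Imports line. The sketch below is only an assumption about how you might do this: used_packages is a hypothetical helper, the sample lines stand in for the contents of a post, and ggplot2 and dplyr are just example package names.

```r
# Hypothetical helper: collect package names from library()/require() calls.
# It misses pkg::fun() usage and quoted names, so treat the result as a
# starting point rather than a complete list.
used_packages <- function(lines) {
  pattern <- "(library|require)\\(([A-Za-z0-9.]+)\\)"
  calls <- unlist(regmatches(lines, gregexpr(pattern, lines)))
  sort(unique(sub(pattern, "\\2", calls)))
}

# Stand-in for readLines() on one of your .Rmd posts.
post <- c("library(ggplot2)", "x <- 1", "require(dplyr)")

# servr and rmarkdown are always needed; add whatever the posts load.
imports <- unique(c("servr", "rmarkdown", used_packages(post)))
cat(paste0("Imports: ", paste(imports, collapse = ", "), "\n"))
```

The printed line can be pasted into DESCRIPTION in place of the existing Imports field.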
build.R
This file is called on your R Markdown files. It knits them to Markdown and makes sure plots get saved to the right directory.
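For orientation, a build.R along these lines might look roughly like the following. This is a minimal sketch rather than the exact script from my repository: it assumes the script receives the input and output paths as its two command-line arguments, and the figure_dir and knit_post helpers, the figure/ folder layout and the "/" base URL are all illustrative choices.

```r
# build.R: a sketch under the assumptions above, not a definitive script.
# Assumed invocation: Rscript build.R <input.Rmd> <output.md>

# Derive a per-post figure folder from the input file name, so that
# "_source/2016-01-01-hello.Rmd" gives "figure/2016-01-01-hello/".
figure_dir <- function(input) {
  slug <- tools::file_path_sans_ext(basename(input))
  sprintf("figure/%s/", slug)
}

knit_post <- function(input, output) {
  # Use knitr's Jekyll-oriented Markdown output hooks.
  knitr::render_jekyll()
  # Keep each post's plots in their own folder.
  knitr::opts_chunk$set(fig.path = figure_dir(input))
  # Prefix image links with the site root (assumes the site lives at "/").
  knitr::opts_knit$set(base.url = "/")
  knitr::knit(input, output, quiet = TRUE, envir = globalenv())
}

# Only knit when both paths were supplied on the command line.
args <- commandArgs(trailingOnly = TRUE)
if (length(args) >= 2) knit_post(args[1], args[2])
```

If your site is served from a sub-path, as Project pages are, the base URL would need to match that path instead of "/".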
Push a new post
When you next push a commit to the branch named under on: in your .travis.yml (source in the configuration above), Travis will start building and deploying your site. If the build fails, you’ll receive an email about it and can have a look through the logs to find out why.
You should now have a system that automagically renders and deploys your R Markdown posts every time you push them to your site’s GitHub repository. If anything is unclear, have a look at my minimal working repository or a real example.
If you found this helpful or have any comments or questions, feel free to get in touch.