Static or dynamic? d3js and Jekyll with PhantomJS

D3.js is a great library for creating visualisations with HTML/SVG/CSS, particularly noteworthy for being data-driven, dynamic, and flexible. Jekyll is a static site generator, again flexible but often used for blogs, and notably on GitHub Pages.

On adding some D3 charts to a Jekyll site I had to wonder, is this really the right thing to do? Isn’t there a bit of a conflict between the statically generated philosophy of Jekyll and the dynamic nature of D3? If the data is static and interaction is not required, then the graphics can be generated statically. I show how to do this using PhantomJS below.

Fetching data

For this example I’m going to use daily weather data to automatically create a chart of the day’s temperature for the date on which each post was made. I’m getting the data via the Weather Underground API. This allows a request to be made for JSON data, including hourly (or, it seems, half-hourly) historical temperature data for a station.

To avoid cross-domain issues I’m making a JSONP request with the jsonp.js D3 plugin. The call looks like this:

d3.jsonp('http://api.wunderground.com/api/{api_key}/history_20140912/q/UK/Edinburgh.json?callback={callback}', function (data) {
    // ...
});
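
Since a chart is wanted for each post's date, the history_YYYYMMDD part of the URL has to be built per request. A minimal sketch of that step follows. The helper names formatDate and historyUrl are hypothetical and not part of the original code, and the URL layout is simply copied from the call above.

```javascript
// Hypothetical helpers (not from the original post): build the request URL
// for a given day, following the URL pattern shown above.

// Format a Date as YYYYMMDD using its UTC fields.
function formatDate(d) {
    function pad(n) { return (n < 10 ? "0" : "") + n; }
    return "" + d.getUTCFullYear() + pad(d.getUTCMonth() + 1) + pad(d.getUTCDate());
}

// The {callback} placeholder is kept exactly as it appears in the call above.
function historyUrl(apiKey, dateString) {
    return "http://api.wunderground.com/api/" + encodeURIComponent(apiKey) +
        "/history_" + dateString + "/q/UK/Edinburgh.json?callback={callback}";
}
```

With these, the URL in the call above would be historyUrl(apiKey, "20140912"), where apiKey stands for your own key.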

Generating a chart

Given some data, it’s time to use D3 to create a chart. While developing this chart I made an HTML page for viewing it stand-alone; the same page is later loaded by a PhantomJS script. The container page is super simple:

<html>
    <head>
        <title>Today's Weather</title>
        <script src="d3.v3.min.js"></script>
        <script src="jsonp.js"></script>
        <script src="weather.js"></script>
        <link rel="stylesheet" href="weather.css">
    </head>
<body>
    <svg class="weather"></svg>
</body>
</html>

I’m going to generate a stupid boring looking line chart for simplicity. We grab some data (just taking raw numbers for the purposes of demonstration):

// tempm holds the temperature in degrees Celsius as a string
var temps = data.history.observations.map(function (x) {
    return parseFloat(x.tempm); // parseFloat takes no radix argument
});

Create X and Y scales:

var width = 30, height = 20;
var x_scale = d3.scale.linear()
    .range([0, width])
    .domain([0, temps.length - 1]); // indices run from 0 to length - 1

var y_scale = d3.scale.linear()
    .range([height, 0])
    .domain([d3.min(temps), d3.max(temps)]);
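
To make the inverted Y range concrete, here is a minimal sketch of the arithmetic a linear scale performs, written in plain JavaScript without D3. The linearScale helper and the sample temperatures are made up for illustration; D3's own scales do considerably more than this.

```javascript
// Illustrative stand-in for a linear scale: maps a value from
// [d0, d1] (domain) onto [r0, r1] (range) by linear interpolation.
function linearScale(domain, range) {
    var d0 = domain[0], d1 = domain[1];
    var r0 = range[0], r1 = range[1];
    return function (value) {
        return r0 + (value - d0) / (d1 - d0) * (r1 - r0);
    };
}

var height = 20;
var sampleTemps = [10, 12.5, 15];   // made-up readings in degrees Celsius

// The range runs from height down to 0 because the SVG y axis points
// downwards: the lowest temperature should sit at the bottom of the chart.
var y = linearScale(
    [Math.min.apply(null, sampleTemps), Math.max.apply(null, sampleTemps)],
    [height, 0]
);

console.log(y(10));    // 20 (bottom edge)
console.log(y(12.5));  // 10 (middle)
console.log(y(15));    // 0  (top edge)
```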
    

And then draw the actual line:

var draw_line = d3.svg.line()
    .interpolate("basis")
    .x(function (d, i) { return x_scale(i); })
    .y(function (d) { return y_scale(d); });
d3.select("svg")
    .attr("width", width)
    .attr("height", height)
    .append("path").datum(temps)
    .attr("d", draw_line);

And that’s it! It looks like this today (this is the actual SVG):

Extracting a chart with PhantomJS

So now that we have a webpage that can create a chart, it’s time to automate the production of a static SVG which we can include in a statically generated page. A PhantomJS script is much like a Node.js script (except that it doesn’t actually run under Node). PhantomJS will run our JavaScript in a built-in headless WebKit browser instance.

Strictly speaking, PhantomJS isn’t required to run our D3 script headlessly; it’s also possible with Node.js and the jsdom module. However, it proves so difficult to install the required native Node modules on many systems that it’s easier to grab PhantomJS and save a headache.

Here we require the system module to obtain arguments, and generate output on the console (which will be displayed as the output of the phantomjs command).

var system = require('system');
var date = system.args[1];
var fileName = system.args[2];

console.log("Generating: " + fileName);

This is JavaScript code executing directly, but what we really want is to load a webpage and capture its output. We do this with the page.open function.

var page = require('webpage').create();
page.open("weather.html", function (status) {
    // ...
});

The page is the one referenced above, containing a reference to D3, and the JS generating our chart. We pass a callback function to page.open which will be called on page load. To get output, we then need to evaluate a function in the context of that page. A simple example would be like this:

var title = page.evaluate(function() { return document.title; });
console.log(title);

Note that output of the page itself (such as a console.log in the function we pass to page.evaluate) will not appear on our output, but we can return a result from page.evaluate to make use of it.
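
If the page's console output is wanted while debugging, PhantomJS calls a page's onConsoleMessage handler for each message logged inside the page. A minimal sketch of relaying those messages follows; forwardConsole is a hypothetical helper name, not something from the original script.

```javascript
// Sketch: relay console output from the page to the PhantomJS script's own
// console. forwardConsole is a hypothetical helper; onConsoleMessage is the
// hook PhantomJS invokes with the text of each console message in the page.
function forwardConsole(page, log) {
    page.onConsoleMessage = function (msg) {
        log("[page] " + msg);
    };
}

// In the PhantomJS script this might be used as:
//   var page = require('webpage').create();
//   forwardConsole(page, function (line) { console.log(line); });
```

Taking the page and the logging function as parameters keeps the helper free of PhantomJS-specific modules, so it can be exercised with a stand-in object.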

var page = require('webpage').create();
page.open("weather.html", function (status) {
    page.evaluate(function (date) {
        chart(date, function () {
            window.callPhantom();
        });
    }, date);
});

In our case we pass the date argument explicitly because evaluate runs its function in the page's context, which has no access to variables in the containing scope. Rather than returning a result, we want to get output asynchronously once the data has been downloaded and the chart generated, so we pass a callback to the chart method which then calls window.callPhantom(), the “escape hatch” for calling back from the evaluate context. This triggers the onCallback handler, which extracts the rendered HTML of the page (in this case, simply the SVG markup snippet):

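var fs = require('fs');

// Called when the page invokes window.callPhantom()
page.onCallback = function () {
    // Grab the rendered markup (here, just the SVG element) from the page
    var markup = page.evaluate(function () {
        return d3.select("body").html();
    });
    fs.write(fileName, markup, 'w');
    phantom.exit();
};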

We can run it like so: phantomjs generate.js 20140912 output.svg.

Generating from Jekyll

Finally we come to the thing we came for, autogenerating the chart from Jekyll on build.

We create a simple Jekyll plugin which runs the PhantomJS script. It loops over all posts, generates a daily temperature chart for each one if it does not already exist, and writes the output to the includes directory. The filename is also added to the post's data so that templates can reference it.

module Weather
  class Generator < Jekyll::Generator
    def generate(site)
      site.posts.each do |post|
        date = post.date.strftime("%Y%m%d")
        file = "weather/#{date}.svg"
        # Run from the plugin directory, where generate.js lives
        Dir.chdir("_plugins/weather") do
          relative_file = "../../_includes/#{file}"
          # Only generate charts that do not already exist
          unless File.exist?(relative_file)
            system("phantomjs", "generate.js", date, relative_file)
          end
        end
        # Expose the include path to templates
        post.data['weather_svg'] = file
      end
    end
  end
end

We can then simply include the file in our template:

{% include {{ post.weather_svg }} %}

End result

Rendering with PhantomJS

Since we’re using PhantomJS, we can even go one step further and render the final page to PNG rather than just dumping out the SVG. This might be useful for display in older browsers (IE8).

In order to generate the correct size of image, we create the viewport at the appropriate size:

page.viewportSize = { width: 30, height: 20 };

Then when the page is ready, we simply call the render method:

page.render(fileName + ".png");

The output image could easily be used with Modernizr’s Modernizr.svg test to replace SVG on non-supporting browsers (admittedly compromising the “static” nature of the page to a small degree), or simply use the image in all cases.
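
As a rough sketch of that fallback, the choice between the two pre-generated files can hinge on a single boolean such as Modernizr.svg. The chartFileFor helper below is hypothetical, and it assumes the PNG was written next to the SVG using the fileName + ".png" naming from the render call above.

```javascript
// Hypothetical helper: choose which pre-generated file to show, given the
// result of a feature test such as Modernizr.svg (a boolean).
function chartFileFor(svgSupported, svgFileName) {
    // The PNG name follows the page.render(fileName + ".png") call above.
    return svgSupported ? svgFileName : svgFileName + ".png";
}

// In the browser, with Modernizr loaded, this might be called as:
//   var file = chartFileFor(Modernizr.svg, "weather/20140912.svg");
```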

Closing thoughts

My take-home message is that, regardless of the technologies involved, it is worth considering carefully whether content needs to be generated dynamically at page-view time. More precisely, even if content is created dynamically from changing data, it may be worth caching.

Tools such as PhantomJS allow visualisations which might normally be generated client side to be generated just once on the server. This allows us to leverage existing work, such as the wide library of available D3 visualisations, and even still develop initially in the same browser-based fashion, but then integrate this into a static build process for production use.

There is always a trade-off between static and dynamic options, and for scenarios with data updates or heavy interactivity this technique will not be suitable. Some limited interactivity is still possible, for example using CSS for highlighting, transitions, etc. It might even be worth considering a hybrid approach in some cases: pre-generating content for the initial state, then manipulating it dynamically on data update or interaction.

The demo Jekyll site used for this post is available on GitHub.
