Building a GOV.UK exhibit

The GOV.UK website is being shown at the Design Museum as part of the 2013 Designs of the Year awards. GDS asked me to develop an interactive exhibit to highlight GOV.UK’s responsive design by allowing gallery visitors to browse the site on a desktop computer, an iPhone and an iPad simultaneously.

Setting up

I chose a simple solution based on web technologies: proxy the HTTP requests to the GOV.UK site, inject some JavaScript into every page, and use a WebSocket connection to propagate navigation and scrolling events between browsers on the three devices.

Proxying requests

I began by building a basic HTTP proxy in Ruby with Rack::Proxy. I installed Thin and wrote this config.ru:

require 'rack/proxy'
require 'uri'

GOV_UK_URL = URI.parse('https://www.gov.uk')

class Proxy < Rack::Proxy
  def rewrite_env(env)
    env.
      merge(
        'rack.url_scheme' => GOV_UK_URL.scheme,
        'HTTP_HOST'       => GOV_UK_URL.host,
        'SERVER_PORT'     => GOV_UK_URL.port
      ).
      reject { |key, _| key == 'HTTP_ACCEPT_ENCODING' }
  end

  def rewrite_response(response)
    status, headers, body = response

    [
      status,
      headers.reject { |key, _| %w(status transfer-encoding).include?(key) },
      body
    ]
  end
end

run Proxy.new

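Aside from forwarding every request to https://www.gov.uk/, the proxy makes a couple of other changes: it strips the Accept-Encoding header from each request, and it strips the Status and Transfer-Encoding headers from each response.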

This simple Rack app doesn’t yet modify the content of pages at all, but it already does a decent job of proxying the site without too many problems. Thankfully most of the navigational links on GOV.UK are relative, so the browser often stays at the proxy hostname rather than navigating away to the real site; conversely, almost all of the asset URLs are absolute, so the images and stylesheets get loaded from the real site without burdening the proxy.
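
To see what rewrite_env does without starting a server, here is a minimal sketch that applies the same merge and reject to a hand-written hash. The sample env values (localhost:9292 and so on) are assumptions for illustration, and the upstream URL is passed in as a parameter instead of being read from the GOV_UK_URL constant.

```ruby
require 'uri'

# Same merge/reject chain as Proxy#rewrite_env, extracted as a plain method.
def rewrite_env(env, upstream)
  env.
    merge(
      'rack.url_scheme' => upstream.scheme,
      'HTTP_HOST'       => upstream.host,
      'SERVER_PORT'     => upstream.port
    ).
    reject { |key, _| key == 'HTTP_ACCEPT_ENCODING' }
end

# Hypothetical env for a browser asking the local proxy for /browse.
sample_env = {
  'REQUEST_METHOD'       => 'GET',
  'PATH_INFO'            => '/browse',
  'rack.url_scheme'      => 'http',
  'HTTP_HOST'            => 'localhost:9292',
  'SERVER_PORT'          => '9292',
  'HTTP_ACCEPT_ENCODING' => 'gzip, deflate'
}

rewritten = rewrite_env(sample_env, URI.parse('https://www.gov.uk'))
rewritten.each { |key, value| puts "#{key}: #{value}" }
```

The path and method pass through untouched; only the scheme, host and port are redirected at GOV.UK, and the Accept-Encoding entry disappears. The original hash is not modified, because merge and reject both return new hashes.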

Injecting JavaScript

I created an empty file, mirror.js, and used the Rack::Static middleware to serve it from the proxy app:

require 'rack'
require 'rack/proxy'
require 'uri'

GOV_UK_URL = URI.parse('https://www.gov.uk')
MIRROR_JAVASCRIPT_PATH = '/mirror.js'

# ...

use Rack::Static, urls: [MIRROR_JAVASCRIPT_PATH]
run Proxy.new

Next I wrote a Rack middleware to add a <script> tag to the <head> of text/html responses so that mirror.js got loaded on every page:

# ...

class InsertTags < Struct.new(:app)
  def call(env)
    status, headers, body = app.call(env)

    Rack::Response.new(body, status, headers) do |response|
      if media_type(response) == 'text/html'
        content = add_tags(response.body.join)
        response.body = [content]
        # Content-Length is measured in bytes, not characters.
        response.headers['Content-Length'] = content.bytesize.to_s
      end
    end
  end

  def media_type(response)
    response.content_type.to_s.split(';').first
  end

  def add_tags(content)
    content.sub(%r{(?=</head>)}, script_tags)
  end

  def script_tags
    %Q{<script src="#{MIRROR_JAVASCRIPT_PATH}"></script>}
  end
end

use Rack::Static, urls: [MIRROR_JAVASCRIPT_PATH]
use InsertTags
run Proxy.new
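
The substitution in add_tags relies on a zero-width lookahead, which is easy to misread. The sketch below runs the same pattern on two made-up documents outside Rack; the sample HTML and the tags parameter are assumptions for illustration.

```ruby
# The lookahead (?=</head>) matches the empty position just before the
# closing tag, so sub inserts text there without consuming anything.
def add_tags(content, tags)
  content.sub(%r{(?=</head>)}, tags)
end

script_tag = '<script src="/mirror.js"></script>'

# "\u00e9" is a single character that takes two bytes in UTF-8.
page = "<html><head><title>Caf\u00e9</title></head><body>Hello</body></html>"
puts add_tags(page, script_tag)

# A response with no </head> has nothing to match, so it comes back unchanged.
fragment = '<p>No head element here</p>'
puts add_tags(fragment, script_tag)

# Character count and byte count differ as soon as the page is not pure ASCII.
puts "characters: #{page.length}, bytes: #{page.bytesize}"
```

The last line matters for the Content-Length header, which counts bytes: for any page containing non-ASCII text, the character count is too small.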

Installing Faye

Using a raw WebSocket connection involves dealing with low-level details like timeouts, buffering and message encoding, as well as writing a WebSocket server to marshal communication between clients. To avoid this administrative overhead I decided to use Faye, a project which provides Ruby and JavaScript implementations of the Bayeux protocol for asynchronous publish/subscribe messaging over HTTP. Faye abstracts away the WebSocket layer and exposes a simple interface for bidirectional real-time communications: any client can publish a message on a named channel, and the Faye server delivers that message to all clients who are subscribed to the same channel.
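
The delivery rule is simple enough to model in a few lines. This toy in-memory bus is not Faye and does no networking; ToyBus and the device names are hypothetical, and the sketch only shows that every subscriber to a channel receives each message published on it.

```ruby
# A toy message bus that models publish/subscribe delivery by channel name.
class ToyBus
  def initialize
    @subscribers = Hash.new { |hash, channel| hash[channel] = [] }
  end

  def subscribe(channel, &handler)
    @subscribers[channel] << handler
  end

  def publish(channel, message)
    @subscribers[channel].each { |handler| handler.call(message) }
  end
end

bus = ToyBus.new
received = []

# Two mirroring devices listen for scroll positions.
bus.subscribe('/scroll') { |message| received << [:iphone, message] }
bus.subscribe('/scroll') { |message| received << [:ipad, message] }

# The controlling browser publishes one position; both subscribers get it.
bus.publish('/scroll', { 'x' => 0, 'y' => 240 })

# Nobody has subscribed to this channel, so the message is delivered to no one.
bus.publish('/navigate', { 'url' => '/browse' })

received.each { |device, message| puts "#{device} got #{message}" }
```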

I installed the faye gem, loaded the Thin WebSocket adapter, and added Faye’s Rack adapter to the middleware stack:

require 'faye'
require 'rack'
require 'rack/proxy'
require 'uri'

# ...

Faye::WebSocket.load_adapter('thin')
use Faye::RackAdapter, mount: '/faye'

use Rack::Static, urls: [MIRROR_JAVASCRIPT_PATH]
use InsertTags
run Proxy.new

This configures Faye to respond to HTTP requests whose paths begin with /faye. Faye can serve the source of its own JavaScript client, so I modified the InsertTags middleware to tell the browsers to load that file on every page too:

# ...

GOV_UK_URL = URI.parse('https://www.gov.uk')
FAYE_JAVASCRIPT_PATH = '/faye/faye-browser-min.js'
MIRROR_JAVASCRIPT_PATH = '/mirror.js'

# ...

class InsertTags < Struct.new(:app)
  # ...

  def script_tags
    [FAYE_JAVASCRIPT_PATH, MIRROR_JAVASCRIPT_PATH].
      map { |src| %Q{<script src="#{src}"></script>} }.join
  end
end

Syncing scroll positions

Now that every browser was loading Faye and my (empty) mirror.js script, I was ready to write some JavaScript to connect them together.

Synchronising their scroll positions was the easiest part. The scroll event fires whenever a window is scrolled, and window.scrollTo can be used to set the current scroll position, so I could just use Faye to broadcast scroll events and recreate them on other devices.

To avoid feedback loops and other asynchronous difficulties, I decided to designate each browser as either sending scroll messages or receiving them, but not both. (Gallery visitors will be interacting with the desktop machine while watching the iPhone and iPad under perspex, so one-way synchronisation is sufficient.) The simplest way to differentiate these roles was to use two hostnames: the controlling browser opens a URL containing the canonical hostname of the proxy server, while each mirroring browser is started at an alias hostname beginning with mirror. The JavaScript can then easily check the current hostname and decide whether to send or receive scroll events.

Here’s what went into mirror.js:

(function () {
  var begin = function (beginControlling, beginMirroring) {
    var faye = new Faye.Client('/faye');

    if (window.location.hostname.indexOf('mirror') === 0) {
      beginMirroring(faye);
    } else {
      beginControlling(faye);
    }
  };

  var beginControlling = function (faye) {
    window.addEventListener('scroll', function () {
      faye.publish('/scroll', { x: window.scrollX, y: window.scrollY });
    });
  };

  var beginMirroring = function (faye) {
    faye.subscribe('/scroll', function (message) {
      window.scrollTo(message.x, message.y);
    });
  };

  begin(beginControlling, beginMirroring);
}());

This is enough to reproduce the scrolling behaviour of the controlling browser in each of the mirroring browsers.

Syncing URLs

The next step was to synchronise the URL of the page being shown by all the browsers. I did this by publishing a message to the /navigate channel every time a click event occurred inside any <a> element in the controlling browser, and setting window.location.href in each mirroring browser when this message was received:

(function () {
  // ...

  var navigateTo = function (url) {
    if (window.location.href !== url) {
      window.location.href = url;
    }
  };

  var beginControlling = function (faye) {
    // ...

    window.addEventListener('click', function (event) {
      var element = event.target;

      while (element) {
        if (element.localName === 'a') {
          event.preventDefault();
          faye.publish('/navigate', { url: element.href });
          navigateTo(element.href);
          break;
        }

        element = element.parentNode;
      }
    });
  };

  var beginMirroring = function (faye) {
    // ...

    faye.subscribe('/navigate', function (message) {
      navigateTo(message.url);
    });
  };

  begin(beginControlling, beginMirroring);
}());

Manually setting window.location.href in the controlling browser (rather than allowing the default click event behaviour) has the desirable side-effect of forcing any awkward links (e.g. target="_blank") to open in the current window.

Although this code successfully synchronises the first URL change, it causes the mirroring browsers to navigate away from the mirroring hostname and onto the controlling hostname, preventing any further updates. I fixed this by updating navigateTo to rewrite all URLs to use the browser’s current protocol and host:

(function () {
  // ...

  var navigateTo = function (url) {
    var a = document.createElement('a');
    a.href = url;
    a.protocol = window.location.protocol;
    a.host = window.location.host;

    if (window.location.href !== a.href) {
      window.location.href = a.href;
    }
  };

  // ...
}());
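
The transformation itself is independent of the browser: keep the current scheme and host, and take only the path, query and fragment from the target. Here is the same idea sketched with Ruby's standard URI library, purely to show the inputs and outputs; the exhibit does this in the browser with an anchor element, and the hostnames below are made up.

```ruby
require 'uri'

# Keep the current page's scheme, host and port; take only the path, query
# and fragment from the target URL.
def rewrite_to_current_host(target_url, current_url)
  target = URI.parse(target_url)
  rewritten = URI.parse(current_url)

  rewritten.path     = target.path
  rewritten.query    = target.query
  rewritten.fragment = target.fragment
  rewritten.to_s
end

# Hypothetical address of a mirroring browser's current page.
mirror_page = 'http://mirror.exhibit.test:9292/'

# A URL published by the controlling browser is moved onto the mirror's host.
puts rewrite_to_current_host('http://exhibit.test:9292/browse/benefits?page=2#top', mirror_page)

# A link to another site keeps its path but also lands on the current host.
puts rewrite_to_current_host('https://www.example.com/about', mirror_page)
```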

Because the controlling browser is also using navigateTo, this prevents the user from navigating away from GOV.UK, although the resulting behaviour — clicking on an external link takes you to that link’s path on the current hostname — is unexpected. To avoid this I just completely disabled navigation to any external link:

(function () {
  // ...

  var beginControlling = function (faye) {
    // ...

    window.addEventListener('click', function (event) {
      var element = event.target;

      while (element) {
        if (element.localName === 'a') {
          event.preventDefault();

          if (element.host === window.location.host || element.hostname === 'www.gov.uk') {
            faye.publish('/navigate', { url: element.href });
            navigateTo(element.href);
          }

          break;
        }

        element = element.parentNode;
      }
    });
  };

  // ...
}());

Clicking on links isn’t the only way to navigate around GOV.UK. Some forms on the site (e.g. Pay your council tax) generate an HTTP redirect when submitted, so the Location headers of these responses need to be rewritten to prevent the browser navigating away from the current host:

# ...

class RewriteRedirects < Struct.new(:app)
  def call(env)
    status, headers, body = app.call(env)

    Rack::Response.new(body, status, headers) do |response|
      if response.redirect?
        url = URI.parse(response.location)
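        # Only relativise URLs on the GOV.UK host: for another host on the same
        # scheme, route_from returns a scheme-less "//host/..." URI.
        url = url.route_from(GOV_UK_URL) if url.absolute? && url.host == GOV_UK_URL.host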

        if url.relative?
          response.redirect(url.to_s, response.status)
        else
          response.status = 204
        end
      end
    end
  end
end

Faye::WebSocket.load_adapter('thin')
use Faye::RackAdapter, mount: '/faye'

use Rack::Static, urls: [MIRROR_JAVASCRIPT_PATH]
use RewriteRedirects
use InsertTags
run Proxy.new

The RewriteRedirects middleware turns absolute GOV.UK URLs into relative ones so that the browser stays on the current host. If the redirect URL points at a non-GOV.UK site, the response code is changed to 204 No Content to prevent the browser from navigating anywhere.
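
The decision can be tried outside Rack with a small helper. This is a sketch under assumptions, not the middleware itself: rewrite_location is a hypothetical name, it returns nil where the middleware would answer 204, and it checks scheme, host and port explicitly before calling route_from.

```ruby
require 'uri'

# Returns the Location value to send to the browser, or nil when the
# redirect should be suppressed.
def rewrite_location(location, upstream)
  url = URI.parse(location)

  # Already relative to the current host: pass it through.
  return url.to_s if url.scheme.nil? && url.host.nil?

  same_site = url.scheme == upstream.scheme &&
              url.host == upstream.host &&
              url.port == upstream.port

  # URI#route_from expresses an absolute URL relative to a base URL.
  same_site ? url.route_from(upstream).to_s : nil
end

upstream = URI.parse('https://www.gov.uk')

[
  'https://www.gov.uk/pay-council-tax',
  '/done',
  'https://www.example.com/'
].each do |location|
  puts "#{location} -> #{rewrite_location(location, upstream).inspect}"
end
```

The explicit host check is deliberate. When two URLs share a scheme but not a host, route_from drops only the scheme, and the resulting "//host/path" form still counts as relative, so testing relative? alone would let it through.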

Handling history

The current URL also changes when the user manipulates the browser history, e.g. with the back/forward buttons, but this isn’t caught by the click handler. A catch-all solution is to listen for the pageshow event and republish the /scroll and /navigate messages whenever a new page is shown:

(function () {
  // ...

  var beginControlling = function (faye) {
    // ...

    window.addEventListener('pageshow', function () {
      faye.publish('/scroll', { x: window.scrollX, y: window.scrollY });
      faye.publish('/navigate', { url: window.location.href });
    });
  };

  // ...
}());

(Incidentally, this papers over the race condition which occurs when the click handler is trying to publish the /navigate message before the current page unloads. If the Faye client loses this race, the subsequent pageshow event will bring the mirroring browsers back in sync.)

Unfortunately no event fires at all when GOV.UK’s JavaScript updates the current URL by calling history.pushState (e.g. on the maternity leave calculator), so I had to replace the browser’s pushState implementation with one that publishes a message:

(function () {
  // ...

  var beginControlling = function (faye) {
    // ...

    var realPushState = window.history.pushState;
    window.history.pushState = function (state, title, url) {
      // The url argument is optional; only publish when one was supplied.
      if (url) {
        faye.publish('/navigate', { url: url });
      }

      return realPushState.call(window.history, state, title, url);
    };
  };

  // ...
}());

I couldn’t find a GOV.UK page that uses history.replaceState, so I ignored it.

Restoring state

At this stage the implementation was complete enough to deliver, but I wanted to make it more robust by keeping track of the current URL and scroll position on the server so that any new client (e.g. a rebooted iPhone or iPad) could be sent straight to the right page instead of having to wait for a user interaction to trigger an update.

The first step was to write a server-side Faye extension to remember the values that appeared in the most recent /scroll and /navigate messages:

# ...

class StateCache < Struct.new(:x, :y, :url)
  def incoming(message, callback)
    channel, data = message.values_at('channel', 'data')

    case channel
    when '/scroll'
      self.x = data['x']
      self.y = data['y']
    when '/navigate'
      self.url = data['url']
    end

    callback.call(message)
  end
end

Faye::WebSocket.load_adapter('thin')
use Faye::RackAdapter, mount: '/faye', extensions: [StateCache.new(0, 0, '/')]

use Rack::Static, urls: [MIRROR_JAVASCRIPT_PATH]
use RewriteRedirects
use InsertTags
run Proxy.new

The server boots with reasonable defaults for the current URL and scroll position, but to improve accuracy I wrote a client-side extension to republish the actual values whenever the controlling browser reconnects:

(function () {
  // ...

  var beginControlling = function (faye) {
    // ...

    faye.addExtension({
      outgoing: function (message, callback) {
        if (message.channel === '/meta/handshake') {
          faye.publish('/scroll', { x: window.scrollX, y: window.scrollY });
          faye.publish('/navigate', { url: window.location.href });
        }

        callback(message);
      }
    });
  };

  // ...
}());

This automatically freshens the server’s state if it gets restarted for any reason.

The current state could now be sent to clients as soon as they connected. I did this by adding the appropriate values to the ext field of the /meta/subscribe response sent to a mirroring browser when it successfully subscribes to a channel:

# ...

class StateCache < Struct.new(:x, :y, :url)
  def incoming(message, callback)
    # ...
  end

  def outgoing(message, callback)
    channel, successful, subscription =
      message.values_at('channel', 'successful', 'subscription')

    if channel == '/meta/subscribe' && successful
      case subscription
      when '/scroll'
        message['ext'] = { x: x, y: y }
      when '/navigate'
        message['ext'] = { url: url }
      end
    end

    callback.call(message)
  end
end

# ...
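
Because the extension is a plain Ruby object, it can be exercised without a running Faye server. The sketch below reproduces the StateCache class from this article and feeds it hand-built hashes; those hashes are assumptions shaped to match the keys the code reads, not captured Bayeux traffic.

```ruby
class StateCache < Struct.new(:x, :y, :url)
  # Remember the latest scroll position and URL as messages arrive.
  def incoming(message, callback)
    channel, data = message.values_at('channel', 'data')

    case channel
    when '/scroll'
      self.x = data['x']
      self.y = data['y']
    when '/navigate'
      self.url = data['url']
    end

    callback.call(message)
  end

  # Attach the remembered state to successful subscription replies.
  def outgoing(message, callback)
    channel, successful, subscription =
      message.values_at('channel', 'successful', 'subscription')

    if channel == '/meta/subscribe' && successful
      case subscription
      when '/scroll'
        message['ext'] = { x: x, y: y }
      when '/navigate'
        message['ext'] = { url: url }
      end
    end

    callback.call(message)
  end
end

cache = StateCache.new(0, 0, '/')
pass_through = ->(message) { message }

# The controlling browser scrolls and then navigates.
cache.incoming({ 'channel' => '/scroll', 'data' => { 'x' => 0, 'y' => 300 } }, pass_through)
cache.incoming({ 'channel' => '/navigate', 'data' => { 'url' => '/browse' } }, pass_through)

# A freshly started mirror subscribes; each reply carries the cached state.
scroll_reply = cache.outgoing(
  { 'channel' => '/meta/subscribe', 'successful' => true, 'subscription' => '/scroll' },
  pass_through
)
navigate_reply = cache.outgoing(
  { 'channel' => '/meta/subscribe', 'successful' => true, 'subscription' => '/navigate' },
  pass_through
)

puts scroll_reply['ext'].inspect
puts navigate_reply['ext'].inspect
```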

To make use of this data I added another client extension to catch /meta/subscribe messages and update the browser’s scroll position and URL:

(function () {
  // ...

  var beginMirroring = function (faye) {
    // ...

    faye.addExtension({
      incoming: function (message, callback) {
        // A failed subscription carries no ext data, so check before reading it.
        if (message.channel === '/meta/subscribe' && message.ext) {
          if (message.subscription === '/scroll') {
            window.scrollTo(message.ext.x, message.ext.y);
          } else if (message.subscription === '/navigate') {
            navigateTo(message.ext.url);
          }
        }

        callback(message);
      }
    });
  };

  begin(beginControlling, beginMirroring);
}());

Enabling fullscreen

The final change was to make the proxied site appear fullscreen on the iPhone and iPad by injecting a single Apple-specific <meta> tag into each page:

# ...

class InsertTags < Struct.new(:app)
  # ...

  def add_tags(content)
    content.sub(%r{(?=</head>)}, meta_tags + script_tags)
  end

  def meta_tags
    '<meta name="apple-mobile-web-app-capable" content="yes">'
  end

  # ...
end

# ...

Further work

That’s all I did — the code is on GitHub. There were several more features I’d planned to build but ultimately didn’t need to:

A subsequent discussion with Chris Roos made me realise that writing a Chrome extension could’ve made the controlling side easier to implement: I might have been able to listen to events through the Chrome APIs (perhaps chrome.tabs.onUpdated and/or chrome.history.onVisited?) instead of trying to do it all in-page. Ultimately the controlling browser ended up being a generic WebView inside a kiosk application anyway, so a vendor-specific extension wouldn’t have worked, but I’d investigate this option more thoroughly for any similar projects in future.

Conclusions

This project took about two days, including time spent on the initial brief and the physical installation of devices at the Design Museum.

It wouldn’t have been possible to get everything working in such a short time without several advantages:

It’s extremely satisfying to work with the wind at your back like this. I enjoyed this project a lot, and in future I’ll be more likely to come back to these technologies (and these people) when I want to make something fast and fun.

The 2013 Designs of the Year exhibition runs from 20th March until 7th July. Good luck, GDS — I hope you win.