Wraith | A responsive screenshot comparison tool

What is it?

Wraith is a tool created by the BBC, and later open sourced, for quickly and easily comparing two versions of the same website, for example a development and a production instance. In config.yaml you define the pages to be tested, the environments to test, and the screen widths to test at (it was originally built as a responsive testing tool).

What do you need?

Not much. At its most basic, you can run Wraith on a desktop PC, as long as you install PhantomJS, Ruby, ImageMagick and, of course, Wraith itself (gem install wraith).

The documentation is solid and easy to follow to get it up and running.

If you are happy with it, it is a simple task to put it on a build server and link it into a CI tool like Jenkins.

Why is it useful?

Wraith is useful because one of the limitations of tools like Behat or Cucumber is that they focus on functionality but don't tell us whether the page is actually rendering as it should. Wraith gives us the chance to take top-level visual QA and automate it. It is not a substitute for a designer's eye, and does not allow us to ignore look and feel, but it does give us a baseline to work from. If QA and Creative have signed off a particular look, that can be used as the primary baseline against which Wraith is run. If the next environment in the pipeline looks like the one originally signed off, we know a proper merge has occurred and there are "no" regression problems.

What are the limitations?

It doesn't cope well with CMS changes, as these will typically be made on a live environment without being replicated on a test environment. They therefore generate differences that are not meaningful, because the changes themselves are valid. Since it is not always feasible to copy the live data down to the test environment, these differences are hard to avoid.

The descriptions (labels) given to page URL paths in config.yaml must not contain spaces. This is not documented, but it becomes pretty obvious when the output gets spammed with errors.
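
One way to catch this before a run is to check the labels up front. The snippet below is a minimal sketch in Python, not part of Wraith: it assumes the page labels have already been loaded into a plain mapping of label to URL path, and the labels, paths and function names are all made up for illustration.

```python
import re

# Hypothetical mirror of the page definitions in config.yaml,
# mapping a label to a URL path. These entries are invented.
paths = {
    "home": "/",
    "about us": "/about",
    "contact_form": "/contact",
}


def labels_with_whitespace(paths):
    """Return the labels that contain any whitespace character."""
    return [label for label in paths if re.search(r"\s", label)]


def suggest_label(label):
    """Suggest a replacement by turning runs of whitespace into underscores."""
    return re.sub(r"\s+", "_", label.strip())


for label in labels_with_whitespace(paths):
    print(f"{label!r} contains whitespace; consider {suggest_label(label)!r}")
```

Using underscores as the replacement is just a convention chosen for this sketch; any label without spaces should avoid the errors described above.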

It will generate a large number of "false positives" when you have a new project, particularly if you decide to test the tool, as I have, during a change to the fonts used on a website.


I can see this being a hugely valuable tool for post-go-live smoke testing more than anything else. When all code is deployed and you are ready, running this tool against your production and pre-production environments should result in a perfect match; if it does not, you know where to focus. It can be useful for pre-live regression as well, but you need to be willing to manually filter out the expected failures. It also lets you monitor whether style changes in one area have somehow spilled over into another.
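
The manual filtering of expected failures could be partly scripted. The following is a rough Python sketch under stated assumptions, not something Wraith provides: it assumes you already have a difference percentage per page label (how those numbers are obtained is left out), and the function name, threshold and figures are hypothetical.

```python
def unexpected_failures(diffs, threshold, expected=()):
    """Return labels whose difference exceeds the threshold and is not expected."""
    expected = set(expected)
    return sorted(
        label
        for label, percent in diffs.items()
        if percent > threshold and label not in expected
    )


# Made-up difference percentages per page, for illustration only.
diffs = {"home": 0.0, "news": 12.4, "contact": 3.1, "about": 0.2}

# "news" is known to differ (for example, because of a CMS change),
# so it is listed as expected and left out of the result.
flagged = unexpected_failures(diffs, threshold=1.0, expected={"news"})
print(flagged)
```

For a post-go-live smoke test the expected list would normally be empty, so anything flagged is a place to focus.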

It is definitely a tool I will continue to use for a while, to see if it adds enough value to suggest integrating it with our automation suite.