Understanding web performance is important for everyone, whether you’re a front-end web developer working with HTML and CSS, an application developer writing services that will interact with the web, or a content author who writes for the web without using code. While there are only a few simple tricks to making web pages load quickly, they’re unfortunately neglected all too often.
Libraries can ill afford to alienate anyone in their communities. We aim to be open institutions that welcome anyone to use our services, regardless of race, gender, class, creed, ability, or anything else. Yet when we make websites that work only for high-powered desktop computers with broadband connections, we privilege the wealthy. Design a slow enough website and even laptops on decent wireless connections may struggle to load a site in a timely fashion. But what about people who only own smartphones, or people who live in rural areas where dial-up is the only option? Poor web performance renders sites unusable for some and frustrating for all.
But what can I do?!?
Below, I’ll elaborate on why certain practices make sites faster, but I want to outline some actions you can take to immediately speed up your sites.
First of all, test your main web properties to identify where the pain points are. Google PageSpeed and Yahoo’s YSlow are two good tools that not only spot a broad range of potential problems but also provide easy instructions for ameliorating them. They both output letter grades, but it’s not worth worrying about getting an A; simply find the laggard sites and look for low-hanging fruit to fix.
Speaking of low-hanging fruit, I’m going to guess you have images on your site. In fact, images make up the bulk of most websites’ download size. 1 So if you’re a content author uploading images to a site, ensuring that their file sizes are as small as possible goes a long way toward speeding up your site. First of all, ask if the image is really necessary. While images certainly aid a site’s visual appeal, so does clean design with thoughtful colors and highlighted calls to action. It’s becoming easier and easier to avoid images, too, as CSS advances and techniques like SVG and icon fonts become more popular. In general, if an image isn’t a logo or picture, it can be replaced with those technologies at less of a performance cost.
If you’re certain an image is necessary, ensure that you’re uploading an appropriately sized one. If you upload a 1200×900 image that will be scaled down to 400×300 inside an <img> tag, resize it before uploading. Lastly, run the image through an optimizer that removes useless gunk from the file (this includes metadata and unused color profiles). ImageOptim is a great choice on Mac. I don’t use Windows so I can’t vouch for these, but FileOptimizer and RIOT (Radical Image Optimization Tool) appear to be popular choices for that platform.
What’s the second largest source of bytes in a website? JavaScript. Scripts are far smaller than images, but they’re still a prime opportunity to trim download size. You should always minify your JavaScript before putting it on a live site. What’s minification? It takes all the functionally useless bytes out of a file, producing code that works the same but is smaller. Here’s an example script before and after minification.
Before (236 bytes):
```javascript
// add a proxy server prefix to all links with class=proxied
var links = document.querySelectorAll('a[href].proxied');
Array.prototype.forEach.call(links, function (link) {
  var proxy = 'https://proxy.example.com/login?url='
    , newHref = proxy + link.href;
  link.href = newHref;
});
```
After (169 bytes):
```javascript
var links=document.querySelectorAll("a[href].proxied");Array.prototype.forEach.call(links,function(r){var e="https://proxy.example.com/login?url=",l=e+r.href;r.href=l});
```
By running my script through UglifyJS, I reduced its size by 28%. We can see where Uglify applied its tricks: it removed a comment, removed whitespace (including line breaks), and changed my local variable names to single letters. Yet the script does exactly the same thing, so there’s no reason to deliver the larger, human-readable version to a browser.
Minifying JavaScript is a best practice, but CSS can also be minified. CSS benefits primarily from whitespace removal, because selectors and properties must all remain the same. While it may not be an option in situations where your pages are served up by a CMS or vendor application, HTML can benefit substantially from having comments removed, quotes taken off of attributes, and even closing tags erased (perfectly valid under HTML5). While the outcome is ugly, the vast majority of your users won’t peek at your source code but will appreciate the quicker page load.
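To make that concrete, here’s an invented snippet (the markup isn’t from any real site) showing the sort of bytes HTML minification can strip while remaining valid HTML5.

Before:

```html
<!-- navigation -->
<ul class="nav">
  <li class="active"><a href="/home">Home</a></li>
  <li><a href="/about">About</a></li>
</ul>
```

After:

```html
<ul class=nav><li class=active><a href=/home>Home</a><li><a href=/about>About</a></ul>
```

The second version is roughly a third smaller, and browsers render the two identically.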
There’s one last optimization that most content authors or front-end web developers can do: combine like files. Rather than make the browser request several separate scripts and stylesheets, combine your assets into as few files as possible. Saving HTTP requests helps sites load quickly, especially over cellular connections. While it’s possible to minify and combine files one-by-one by pasting them into online tools, ideally you have a development workflow that employs a tool like Grunt or Gulp to minify your text-based resources and concatenate them into one big package. 2
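As a rough sketch of what such a workflow can look like, here’s a minimal gulpfile that uses the gulp-concat and gulp-uglify plugins; the folder names are placeholders you’d adapt to your own project.

```javascript
// gulpfile.js: combine and minify every script under js/ into one file
var gulp = require('gulp');
var concat = require('gulp-concat');
var uglify = require('gulp-uglify');

gulp.task('scripts', function () {
  return gulp.src('js/**/*.js')   // wherever your scripts live
    .pipe(concat('all.min.js'))   // concatenate into a single file
    .pipe(uglify())               // minify the combined result
    .pipe(gulp.dest('dist'));     // write it to a build folder
});
```

Running gulp scripts then produces a single dist/all.min.js that your pages can load in place of the individual files.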
When going through one’s scripts and stylesheets looking for optimizations, it’s again pertinent to ask how much is truly necessary. Yes, you can accomplish some gorgeous things with just a little code: slick modal dialogs, luscious fly-out menus. But is that really what your website needs? Are users struggling because your nav bar doesn’t collapse and stick to the top of the page as you scroll, or because your link text is obscure? Throwing scripts at a site can create more issues than it resolves.
But what can my server admin do?!?
If you have access to your library’s servers, then you’ll be able to change a few settings that will make broad improvements. If not, try emailing your local server admin to see if they can make these alterations.
One of the easiest, and biggest, improvements you can make is to turn on compression. With compression on, your server packs up responses into smaller files (think .zip archives) before sending them over the web. This makes a huge difference with most web assets because they’re repetitive plain text, which is very amenable to compression algorithms. The best part is that turning on compression is typically a matter of copy-pasting a few lines into a server configuration file. If you’re unsure of how to do it, see the HTML5 Boilerplate Server Configs project. If you’re not certain your site is using compression, try the aptly-named GzipWTF.
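Most library sites sit on Apache, nginx, or IIS, where compression is a configuration setting rather than code, and the Boilerplate configs linked above cover those cases. Just to illustrate how little is involved, here’s what the same idea looks like in a hypothetical Node application using the Express framework and its compression middleware:

```javascript
// hypothetical Express app: gzip compressible responses before sending them
var express = require('express');
var compression = require('compression'); // npm install compression

var app = express();
app.use(compression());            // compress responses on the way out
app.use(express.static('public')); // serve files from a "public" folder
app.listen(3000);
```

On Apache the equivalent is a handful of mod_deflate directives, which is exactly what the Boilerplate configs provide.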
What’s the only thing faster than sending a minified, compressed file? Not having to send that file at all. Browsers cache assets if they’re told to, meaning that repeat visits to your site won’t incur repeat downloads. This significantly increases speed if a visitor spends a lot of time on your site. Caching is, again, a matter of inserting a few configuration lines. It can be a little tricky, however, because when you update cached files you want users to download the new version, bypassing their local cache. One work-around is filename-based versioning. 3
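Footnote 3 has further reading, but the gist of filename-based versioning is easy to sketch. This is a toy build step rather than part of any particular tool, and the file names are made up:

```javascript
// toy build step: rename a bundle so its name changes whenever its contents do
var crypto = require('crypto');
var fs = require('fs');

var contents = fs.readFileSync('dist/all.min.js');
var hash = crypto.createHash('md5').update(contents).digest('hex').slice(0, 8);
fs.renameSync('dist/all.min.js', 'dist/all.' + hash + '.min.js');
// reference all.<hash>.min.js in your HTML and cache it far into the future;
// when the script changes, its name (and URL) changes, so browsers fetch the
// new version instead of a stale cached copy
```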
Don’t have server access? Or are your servers taxed to the breaking point? Popular front-end resources are often available over Content Delivery Networks (CDNs). These CDNs have servers spread across the globe whose only job is to serve up static assets, which get cached in browsers. Because many sites use them, users may already have assets cached in their browser (e.g. if they’ve visited Reddit recently, they have version 2.1.1 of the jQuery JavaScript library from a popular CDN run by Google). While Google and Microsoft run well-known CDNs for a few popular libraries, the lesser-known jsDelivr, CDNJS, and Bootstrap CDN give you even greater options.
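As a concrete illustration, the usual pattern looks like the snippet below. The Google-hosted jQuery URL is real; the local fallback path is a placeholder you’d point at a copy on your own server.

```html
<!-- load jQuery 2.1.1 from Google's CDN; visitors may already have it cached -->
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<!-- if the CDN is unreachable, fall back to a local copy -->
<script>window.jQuery || document.write('<script src="/js/jquery.min.js"><\/script>');</script>
```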
The Basic Rules
I’ve avoided the theory behind web performance thus far, but not because it’s complex. There are only two primary rules: send fewer bytes and send fewer requests. So given a choice between two files that fulfill the same function, choose the smaller one (the logic behind minification and compression). If you can combine multiple files into one or avoid sending something altogether, that saves requests (the logic behind concatenation and caching).
Sending fewer bytes makes intuitive sense; we know that larger files take longer to download. Not only that, we’ve all probably watched the effects of file size manifest in progress bars or massive images slowly tiling into place. The benefits of concatenation, however, may seem peculiar. Why would sending one 50kb file be quicker than sending two 25kb files, or five 10kb files? It turns out that there are inefficient redundancies across multiple HTTP requests. A request has a few stages:
- A DNS lookup translates a domain like “acrl.ala.org” into an IP address like “38.69.5.149”
- The client and server establish a connection with a TCP handshake
- The server receives the request and determines what to send back
- Finally, the data is pushed out over the network
Throughout these steps there can be a great deal of latency as data travels through the air and over wires. All of that latency is repeated with each request, making multiple requests slower. So even if we serve all our assets from the same domain, thus requiring only one DNS lookup, and configure our server to reuse the TCP connection rather than repeat the handshake on each request, there’s still additional overhead to multiple requests.
Here’s a visual explanation of this phenomenon:
While the DNS, TCP, and download times are identical for two scripts sent independently and the product of their concatenation, the total “Time To First Byte” (analogous to latency) is greater with the dual requests.
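If you’d like to see these stages for yourself, modern browsers expose them through the Resource Timing API. The snippet below is a rough sketch you could paste into the developer console; note that cross-origin resources report zeros for most timings unless the serving site opts in with a Timing-Allow-Origin header.

```javascript
// log the DNS, TCP, waiting, and download time of each resource on the current page
performance.getEntriesByType('resource').forEach(function (r) {
  console.log(
    r.name,
    'DNS: ' + (r.domainLookupEnd - r.domainLookupStart).toFixed(1) + 'ms',
    'TCP: ' + (r.connectEnd - r.connectStart).toFixed(1) + 'ms',
    'waiting: ' + (r.responseStart - r.requestStart).toFixed(1) + 'ms',
    'download: ' + (r.responseEnd - r.responseStart).toFixed(1) + 'ms'
  );
});
```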
Incidentally, the next version of the HTTP specification—HTTP 2.0, currently active as the SPDY protocol—features request multiplexing. The ability to pack multiple requests into one connection virtually eliminates the need for concatenation. It’s a long way off though, since the spec hasn’t been finalized and it will need to be implemented in a backwards-compatible way in both web browsers and servers.
Don’t Sweat What You Can’t Control
Here’s a sad truth: a lot of your library’s web properties probably have terrible performance and there’s little you can do about it. Libraries rely heavily upon third-party websites that may give you a box for pasting custom HTML into, but no or limited control over page templates. Since we’re purchasing a website and not a web service, we trade quality for convenience. 4 Perhaps because making a generically useful website is really tough, or perhaps because they just don’t know, most of these vendor sites ignore best practices. Similarly, we tend to reuse large open source applications without auditing their performance characteristics. I began listing some of the more egregious offenders, but the list became so long that I’ve put it in a separate appendix below.
There’s More
There’s a lot more to web performance, including avoiding redirects, combining images into sprites, and putting scripts down towards the end of the <body>
tag. An entire field has grown up around the topic, spurred on by the rising complexity of web applications and the prominence of mobile devices. Google’s Make the Web Faster project is a great place to learn more, as is Steve Souders’ dated but classic High Performance Web Sites. The main takeaway, however, is that understanding a few tenets of web performance can make a difference. By slimming your images and combining your CSS, you can make your sites accessible to more devices and quicker for everyone.
Appendix A: The Wall of Performance Shame
I don’t mean to imply that any of these software packages are poorly made or bad choices. In fact, I’m quite fond of many of them. But run a few through tools like Google PageSpeed and it quickly becomes apparent that even the easiest optimizations aren’t being applied.
LibGuides 1.0 loads all 5 of its scripts in the <head> and includes two large utility libraries, jQuery and Dojo, which I struggle to believe is necessary. LibGuides 2.0 does the right thing in relying on CDNs to serve up a few libraries, but it still includes 5 scripts, two of which are unminified (despite having ".min" in the name).
The EQUELLA digital repository software loads 11 CSS files and 14 scripts in the <head> of the document. I didn’t bother to check them all, but the ones I did view are unminified (e.g. they’ve left jQuery at full size, which increases the download by 154kb for no good reason). This essentially crushes the page load on all but desktop devices with decent bandwidth; I cannot imagine a user enduring the wait on a phone’s 3G connection.
DSpace loads 9 stylesheets, most of which appear unminified, and 5 scripts (which, luckily, are minified) so far up in the <head> of the document that the <title> of the page loads noticeably slowly on a phone.
VuFind loads 10 scripts in a row in the <head>
; these could be easily combined.
Thinking a little more broadly, beyond library-specific software, Drupal could do a better job concatenating and minifying resources. While its performance tools have a simple checkbox for combining scripts, by default Drupal will still load multiple resources (3-6 stylesheets and scripts) on each page. The JavaScript and CSS that come with many contributed modules tend not to be minified, and Drupal doesn’t always process them itself. I don’t know the best practices for providing client-side assets with a module, but many module authors seem to be ignoring performance.
On a positive note, the Blacklight discovery layer does a great job of combining all CSS and JavaScript into one file apiece, probably because it’s built with Ruby on Rails, which provides some nice tools for conforming to best practices. The Koha ILS also appears to keep the number of files down and pushes most of its scripts to the bottom of the document.
Notes
- The HTTP Archive is a wonderful source for data like this. ↩
- These development workflows are a bit much to go into at the moment, but trust me that they’re not too difficult to set up. There are template configurations which you can use that make it relatively painless to simply point a piece of software at a folder full of scripts and watch an optimized output appear in a prescribed destination. If there’s interest in another blog post with more details, leave me a note in the comments and I’ll happily write one. ↩
- This Stack Overflow answer gives a decent overview of filename-based versioning with some further reading. Google’s Make the Web Faster project has a nice article on how HTTP caching works. ↩
- When I say “web service” I mean platforms which expose all their functionality through an API, which allows one to write a custom interface layer on top. That’s often a lot of work but would provide total control over things like the number and order of resources loaded in the HTML. ↩