Information Architecture for a Library Website Redesign

My library is about to embark upon a large website redesign during this summer semester. This isn’t going to be just a new layer of CSS, or a minor version upgrade to Drupal, or moving a few pages around within the same general site. No, it’s going to be a huge, sweeping change that affects the whole of our web presence. With such an enormous task at hand, I wanted to discuss some of the tools and approaches that we’re using to make sure the new site meets our needs.

Why Redesign?

I’ve heard the arguments for why the wholesale website redesign is a flawed approach and why we should instead be continually, iteratively working on our sites. Continual changes keep problems from building up, and sweeping changes can disrupt users who were accustomed to the old site. The gradual redesign makes a lot of sense to me, and it also seems like a complete luxury that I’ve never had in my library positions.

The primary problem with a series of smaller changes is that the approach assumes a solid foundation to begin with. Our current site, however, has a host of interconnected problems that make tackling any individual issue a challenge. It’s like your holiday lights sitting in a box all year; they’re hopelessly tangled by the time you take them out again.

Our site has decades of discarded, forgotten content. That’s mostly harmless; it’s hard to find and sees virtually no traffic. But it’s still not great to have outdated information scattered around. In particular, I’m not thrilled that a lot of it is static HTML, images, and documents sitting outside our content management system. It’s hard to know how much content we even have because it cannot be managed in one place.

We also fell into a pattern of adding content to the site but never removing or re-organizing existing content. Someone would ask for a button here, or a page dictating a policy there, or a new FAQ entry. Pages that were added didn’t have particular owners responsible for their currency and maintenance; I, as Systems Librarian, was expected to run the technical aspects of the site but also be its primary content editor. That’s simply an impossible task, as I don’t know every detail of the library’s operations or have the time to keep on top of a menagerie of pages of dubious importance.

I tried to create a “website changes form” to manage things, but it didn’t work for staff or for me. The few staff who did fill out the form ended up requesting things that were difficult to do: large theme changes that I wasn’t comfortable making without user testing or approval from our other librarians. What little content was added consisted of minor text changes ferried through the form and me, which slowed down the editorial process and furthered the idea that web content was solely my domain.

To top our content troubles off, we’re also on an unsupported, outdated version of Drupal. Upgrading or switching a CMS isn’t necessarily related to a website redesign. If you have a functional website on a broken piece of software, you probably don’t want to toss out the good with the bad. But in our case, similar to how our ILS migration gave us the opportunity to clean up our bibliographic records, a CMS migration gives us a chance to rebuild a crumbling website. It just doesn’t make sense to invest technical effort in migrating all our existing content when it’s so clearly in need of major structural change.

Card Sort

[Image: cards in the middle of being constructed for the card sort]

Not wanting to go into a redesign process blind, we set out to collect data on our current site and how it could be improved. One of the first ways we gathered data was to ask all library staff to perform a card sort. A card sort is an activity wherein pieces of web content are put on cards which can then be placed into categories; the idea is to form a rough information architecture for your site which can dictate structure and main menus. Card sorts can be either open or closed, meaning the categories are either invented by the participants or provided ahead of time.

For our card sort, I chose to do an open sort since we were so uncertain about the categories. Second, I selected web content based on our existing site’s analytics. It was clear to me that our current site was bloated and disorganized; there were pages tucked into the nooks of cyberspace that no one had visited in years. There were all sorts of overlapping and unnecessary content. So I selected roughly twenty popular pages but also gave each group two pieces of blank paper on which to add whatever content they felt was missing.

Finally, trying to get as much and as useful data as possible, I modified the card sort procedure in a couple of ways. I asked people to role-play as different types of stakeholders (graduate & undergraduate students, faculty, administrators) and to justify their decisions from that vantage point. I also had everyone, after sorting was done, put dots on content they felt was important enough for the home page. Since one of our current site’s primary challenges is maintenance, or the lack thereof, I wanted to add one last activity wherein participants would write a “responsible staff member” on each card (e.g. the instruction librarian maintains the instruction policy page). Sadly, we ran out of time and couldn’t do that bit.

The results of the card sort were informative. A few categories emerged consistently across everyone’s sorts: collections, “about us”, policies, and current events/news. We discovered a need for new content to cover workshops, exhibits, and events happening in the library, which were currently only represented (and not very well) in blog posts. In terms of the home page, it was clear that LibGuides, collections, news, and most importantly our open hours needed to be represented.

Treejack & Analytics

Once we had enough information to build out the site’s architecture, I organized our content into a few major categories. But there were still several questions on my mind: would users understand terms like “special collections”? Would they understand where to look for LibGuides? Would they know how to find the right contact for various questions? To answer some of these questions, I turned to Optimal Workshop’s “Treejack” tool. Treejack tests a site’s information architecture by having users navigate plain text links to perform basic tasks. We created a few tasks aimed at answering our questions and recruited students to perform them. While we’re only using the free tier of Optimal Workshop, and only using student stakeholders, the data was still informative.

For one, Optimal Workshop’s results data is rich and visualized well. It shows the exact routes each user took through our site’s content, the time it took to complete a task, and whether a task was completed directly, completed indirectly, or failed. Completed directly means the user took an ideal route through our content; no bouncing up and down the site’s hierarchy. Indirect completion means they eventually got to the right place, but didn’t take a perfect path there, while failure means they ended in the wrong place. The graphs that demonstrate each task’s outcomes are wonderful:

[Image: the data & charts Treejack shows for a moderately successful task]

"Pie tree" visualizing users' paths

A “pie tree” showing users’ paths while attempting a task.

We can see here that most of our users found their way to LibGuides (named “study guides” here). But a few people expected to find them under our “Collections” category and bounced around in there, clearly lost. This tells us we should represent our guides under Collections alongside items like databases, print collections, and course reserves. While you could build and run your own Treejack-type tests, I definitely recommend Optimal Workshop as a great product that provides a lot of insight.

There’s much work to be done in terms of testing—ideally we would adjust our architecture to address the difficulties that users had, recruit different sets of users (faculty & staff), and attempt to answer more questions. That’ll be difficult during the summer, when there are fewer people on campus, but we know enough now to start adjusting our site and moving along in the redesign process.

Another piece of our redesign philosophy is using analytics about the current site to inform our decisions about the new one. For instance, I track interactions with our home page search box using Google Analytics events 1. The search box has three tabs corresponding to our discovery layer, catalog, and LibGuides. Despite thousands of searches and interactions with the search box, LibGuides search is seeing only trace usage. The tab was clicked on a mere 181 times this year; what’s worse, only 51 times did a user actually search afterwards. This trace amount of usage, plus the fact that users are clearly clicking onto the tab and then not finding what they want there, indicates it’s just not worth any real estate on the home page. When you add in that our LibGuides now appear in our discovery layer, their search tab is clearly disposable.
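
For the curious, the tracking itself is just a small snippet of JavaScript. Here is a minimal sketch assuming the standard analytics.js ga() function is already loaded on the page; the element ID and the category/action/label strings are illustrative, not our actual configuration.

// hypothetical ID for the LibGuides tab of the search box
var tab = document.getElementById('libguides-search-tab');
tab.addEventListener('click', function () {
  // record the click as a Google Analytics event: category, action, label
  ga('send', 'event', 'Search Box', 'Tab Click', 'LibGuides');
});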

What’s Next

Data, tests, and conceptual frameworks aside, our next stage will involve building something much closer to an actual, functional website. Tools like Optimal Workshop are wonderful for providing high-level views on how to structure our information, but watching a user interact with a prototype site is so much richer. We can see their hesitation, hear them discuss the meanings of our terms, get their opinions on our stylistic choices. Prototype testing has been a struggle for me in the past; users tend to fixate on the unfinished or unrefined nature of the prototype, providing feedback that tells me what I already know (yes, we need to replace the placeholder images; yes, “Lorem ipsum dolor sit amet” is written on every page) rather than something new. I hope to counter that by setting appropriate expectations and building a small but fairly robust prototype.

We’re also building our site in an entirely new piece of software, Wagtail. Wagtail is exciting for a number of reasons, and will probably have to be the subject of future posts, but it does help address some of the existing issues I noted earlier. We’re excited by the innovative StreamField approach to content—a replacement for large, rich text fields which are unstructured and often let users override a site’s base styles. We’ve also heard whispers of new workflow features which would let us send reminders to owners of different content pages to revisit them periodically. While I could do something like this myself with an ad hoc mess of calendar events and spreadsheets, having it built right into the CMS bodes well for our future maintenance plans. Obviously, the concepts underlying Wagtail and the tools it offers will influence how we implement our information architecture. But we also started gathering data long before we knew what software we’d use, so exactly how it will all work remains to be figured out.

Has your library done a website redesign or information architecture test recently? What tools or approaches did you find useful? Let us know in the comments!

Notes

  1. I described Google Analytics events in a previous Tech Connect post.

Online Privacy in Post-Election America

A commitment to protecting the privacy of our patrons is enshrined in the ALA Code of Ethics. While that has always been an important aspect of librarianship, it’s become even more pivotal in an information age where privacy is far more nuanced and difficult to achieve. Given the rhetoric of the election season, and statements made by our President-Elect as well as his Cabinet nominees 1, the American surveillance state has become even more disconcerting. As librarians, we have an obligation to empower our communities with the knowledge they need to secure their own personal information. This post will cover, at a high level, a few areas where librarians of various types can assist patrons.

The Tools

Given that so much information is exchanged online these days, librarians are in a unique position to educate patrons about the Internet. We spend so much time either building web services or utilizing them, it’s highly likely that a librarian knows more about the web than your average citizen. As such, we can relate some of the powerful pieces of software and services that aid in protecting one’s online presence. To name just a handful that almost everyone could benefit from knowing:

DuckDuckGo is a privacy-aware search engine which explicitly does not track individual users. While it is a for-profit endeavor earning money through ad revenue, its policies set it apart from major competitors such as Google and Bing.

TorBrowser is a web browser utilizing The Onion Router protocol which obfuscates the user’s IP address, essentially masking their online activities behind a web of redirects. The Tor network is run by volunteers and TorBrowser is open source software developed by a non-profit organization.

HTTPS is the encrypted version of HTTP, the data transfer protocol that powers the web. HTTPS sites are less likely to have their traffic intercepted or surveilled. Tools like HTTPS Everywhere help one find HTTPS versions of sites without too much trouble.

Two-factor authentication is available for many apps and web services. It decreases the possibility that a third-party can access your account by providing an additional layer of protection beyond your password, e.g. through a code sent to your phone.

Signal is an open source private messaging app which uses end-to-end encryption; think of it as HTTPS for your text messages. Signal is made by Open Whisper Systems which, like the Tor Project, is a non-profit.

These are just a few major tools in different areas, all of which are worth knowing about. Many have usability trade-offs but switching to just one or two is enough to substantially improve an individual’s privacy.

Privacy Workshops

Merely knowing about particular pieces of software is not enough to secure one’s communications. Tor perhaps says it best in their “Tips on Staying Anonymous”:

Tor is NOT all you need to browse anonymously! You may need to change some of your browsing habits to ensure your identity stays safe.

A laundry list of web browsers, extensions, and apps doesn’t do much by itself. A person’s behavior is still the largest factor in how private their information is. One can visit a secure HTTPS site but still use a password that’s trivial to crack; one can use the “incognito” or “privacy” mode of a browser but still be tracked by their IP address. Online privacy is an immensely complicated and difficult subject which requires knowledge of practices as well as tools. As such, libraries can offer workshops that teach both at once. Most libraries teach skills-based workshops, whether they’re on using a citation manager or evaluating information sources for credibility. Adding privacy skills is a natural extension of work we already do. Workshops can fit into particular classes—whether they’re history, computer science, or ethics—or be extra-curricular. Look for sympathetic partners on campus, such as student groups or concerned faculty, to see if you can collaborate or at least find an avenue for advertising your events.

Does your library not have anyone qualified or willing to teach a privacy workshop? Consider contacting an outside expert. The Library Freedom Project immediately comes to mind as a wonderful resource offering: a privacy toolkit for librarians, an online class, “train the trainers” type events, and community-focused workshops.2 Academic librarians may also have access to local computer security experts, whether they’re computer science instructors or particularly savvy students, who would be willing to lend their expertise. My one caution would be that just because someone is a subject expert doesn’t mean they’re equipped to effectively lead a workshop, and that working with an expert to ensure an event is tailored to your community will be more successful than simply outsourcing the entire task.

Patron Data

Depending on your position at your library, this final section might either be the most or least obvious thing to be done: control access to data about your patrons. If you’re an instruction or reference librarian, I imagine workshops were the first thing on your mind. If you’re a systems librarian such as myself, you may have thought of technologies like HTTPS or considered data security measures. This section will be longer not because it’s more important, but because these are topics I think about often as they directly relate to my job responsibilities.

Patron data is tricky. I’ll be the first to admit that my library collects quite a bit of data about patrons, a rather small amount of which contains personally identifying information. Data is extremely useful both in fine-tuning our services to meet community needs as well as in demonstrating our value to stakeholders like the college administration. Still, there is good reason to review data practices and web services to see if anything can be improved. Here’s a brief list of heuristics to use:

Are your websites using HTTPS? Secure sites, especially ones with patron accounts that hold sensitive information, help prevent data from being intercepted by third parties. I fully realize this is actually more difficult than it appears; our previous ILS offered HTTPS but only as a paid add-on which we couldn’t afford. If a vendor is the holdup here, pester them relentlessly until progress is made. I’ve found that most vendors understand that HTTPS is important; it’s just further down in their development priorities. Making a fuss can change that.

Is personal information being unnecessarily collected? What’s “necessary” is subjective, certainly. A good measure is looking at when the last time personal information was actually used in any substantive manner. If you’re tracking the names of students who ask reference questions, have you ever actually needed them for follow-ups? Could an anonymized ID be used instead? Could names be deleted after a certain amount of time has passed? Which brings us to…

Where personal information is collected, do retention policies exist? E.g. if you’re doing website user studies that record someone’s name, likeness, or voice, do you eventually delete the files? This goes for paper files as well, which can be reviewed and then shredded if deemed unnecessary. Retention policies are beneficial in a few ways. They not only prevent old data from leaking into the wrong hands, they often help with organization and “spring cleaning” tasks. I try to review my hard drive periodically for random files I’ve been sent by faculty or students which can be cleaned out.

Can patrons be empowered with options regarding their own data? Opt-in policies regarding data retention are desirable because they allow a library to collect information that might prove valuable while also giving people the ability to limit their vulnerabilities. Catalog reading lists are the quintessential example: some patrons find these helpful as a tool to review what they’ve read, while others would prefer to obscure their checkout history. It should go without saying that these options are rather useless without any surrounding education. Patrons need to know what’s at stake and how to use the systems at their disposal; the setting does nothing by itself. While optional workshops typically only touch a fragment of the overall student population, perhaps in-browser tips and suggestions can be presented to prompt our users to consider the ramifications of their account’s configuration.

Relevance Ranking

Every so often, an event will happen which foregrounds the continued relevance of our profession. The most recent American election was an unmitigated disaster in terms of information literacy 3, but it also presents an opportunity for us to redouble our efforts where they are needed. As with the terrifying revelations of Edward Snowden, we are reminded that we serve communities that are constantly at risk of oppression, surveillance, and strife. As information professionals, we should strive to take on the challenge of protecting our patrons, and much of that protection occurs online. We can choose to be paralyzed by distress when faced with the state of affairs in our country, or to be challenged to rise to the occasion.

Notes

  1. To name a few examples, incoming CIA chief Mike Pompeo supports NSA bulk data collection and President-Elect Trump has been ambiguous as to whether he supports the idea of a registry or database for Muslim Americans.
  2. Library Freedom Director Alison Macrina has an excellent running Twitter thread on privacy topics which is worth consulting whether you’re an expert or novice.
  3. To note but two examples, the President-Elect persistently made false statements during his campaign and “fake news” appeared as a distinct phenomenon shortly after the election.

GIS and Geospatial Data Tools

I was recently appointed the geography subject librarian for my library, which was mildly terrifying considering that I do not have a background in geography. But I was assigned the subject because of my interest in data visualization, and since my appointment I’ve learned a few things about the awesome opportunities to integrate Geographic Information Systems (GIS) and geospatial visualization tools into information literacy instruction and library services generally. A little bit of knowledge about GIS and geospatial visualization goes a long way, and is useful across a variety of disciplines, including the social sciences, business, the humanities, and environmental studies and sciences. If you are into open data (who isn’t?) and you like maps and / or data visualization (who doesn’t?!) then it’s definitely worth learning about some tools and resources for working with geospatial information.

About GIS and Geospatial Data

Geographic Information Systems, or GIS, are software tools that enable visualizing and interpreting data (social, demographic, economic, political, topographic, spatial, natural resources, etc.) using maps and geospatial data. Often data is visualized using layers, where a base map (containing, for example, a political map of a city) or tiles are overlaid with shapes, data points, or choropleth shading. For example, in the map below, a map of districts in Tokyo is overlaid with data points representing the number of seniors living in the area: 1

You may be familiar with Google Earth, which has a lot of features similar to a GIS (but is arguably not really a GIS, due to its lack of the data analysis and query tools typically found in a fully-featured GIS). You can download a free Pro version of Google Earth that enables you to import GIS data. GIS data can appear in a variety of formats, and while there isn’t space here to go into each of them, a few common formats you might come across include Shapefiles, KML, and GeoJSON.2 Shapefiles, as the name suggests, represent shapes (e.g., polygons) as layers of vector data that can be visualized in GIS programs and Google Earth Pro. You may also come across KML (Keyhole Markup Language) files, an XML-style standard for representing geographic data that is commonly used with Google Earth and Google Maps. GeoJSON is another format for representing geospatial information that is ideal for use with web services. The various formats of GIS and geospatial data deserve a full post of their own, and I plan to write a follow-up post exploring some of these formats and how they are used in greater detail.
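
To make that last format a bit more concrete, here is a minimal, entirely hypothetical GeoJSON feature, written as a JavaScript object literal since that is also the form you would hand to a web mapping library; note that GeoJSON orders coordinates as [longitude, latitude].

// a single GeoJSON Feature; real datasets usually wrap many of these in a FeatureCollection
var exampleFeature = {
  type: 'Feature',
  geometry: {
    type: 'Point',
    coordinates: [139.6917, 35.6895] // roughly central Tokyo
  },
  properties: {
    name: 'Example district',
    seniorPopulation: 12345 // made-up attribute value
  }
};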

GIS/Geospatial Visualization Tools

ArcGIS (ESRI)

ArcGIS is arguably the industry standard for GIS software, and the maker of ArcGIS (ESRI) publishes manuals and guides for GIS students and practitioners. There are a few different ArcGIS products: ArcGIS for Desktop, ArcGIS Online, and ArcGIS Server. Personally I am only familiar with ArcGIS Online, but you can do some pretty cool things with a totally free account, like create this map of where drones can and cannot fly in the United States: 3

ArcGIS can be very powerful and is particularly useful for complex geospatial datasets and visualizations (especially those that might require multiple layers of data or topographic / geologic data). A note about signing up with ArcGIS Online: you don’t actually need to sign up for a ‘free trial’ to explore the software – you can just create a free account that, as I understand it, is not limited to a trial period. Not all features may be available in the completely free account.

CartoDB

CartoDB is both an open source application and a freemium cloud service that can be used to make some pretty amazing geospatial visualizations that can be embedded in web pages, like this choropleth that visualizes the amount of various kinds of pollution across Los Angeles.4

CartoDB’s aesthetics are really strong, and default map settings tend to be pretty gorgeous. It also leverages Torque to enable animations (which is what’s behind the heatmap animation of this map showing Twitter activity related to Ferguson, MO over time).5 CartoDB can import Shapefiles, GeoJSON, and .csv files, and has a robust SQL API (built on PostgreSQL) that can be used to import and export data. CartoDB also has its own JavaScript library (CartoDB.js) that can be leveraged for building attractive custom apps.
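
As a rough sketch of how that SQL API can be used from the browser (the account name, table, and columns are hypothetical, and the endpoint pattern is based on CartoDB’s /api/v2/sql interface as documented at the time), a query might look something like this:

// hypothetical CartoDB account and table; the API returns JSON with a "rows" array
var account = 'mylibrary';
var query = 'SELECT name, ST_AsGeoJSON(the_geom) AS geometry FROM pollution_sites LIMIT 10';
var url = 'https://' + account + '.cartodb.com/api/v2/sql?q=' + encodeURIComponent(query);

var xhr = new XMLHttpRequest();
xhr.open('GET', url);
xhr.onload = function () {
  var data = JSON.parse(xhr.responseText);
  console.log(data.rows); // each row holds the selected columns
};
xhr.send();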

More JavaScript Libraries

In addition to CartoDB.js mentioned above, there are lots of other flexible JavaScript libraries for mapping and geospatial visualization on the scene that can be leveraged for visualizing geospatial data:

  • OpenLayers – OpenLayers enables pulling in “tile” layers as base maps from a variety of sources, as well as enabling parsing of vector data in a wide range of formats, such as GeoJSON and KML.
  • Leaflet.js – A fairly user-friendly and lightweight library used for creating basic interactive, mobile-friendly maps. In my opinion, Leaflet is a good library to get started with if you’re just jumping into geospatial visualization (see the sketch after this list).
  • D3.js – Everyone’s favorite JavaScript charting library also has some geospatial visualization features for certain kinds of maps, such as this choropleth example.
  • Mapbox – Mapbox.js is a JavaScript API library built on top of Leaflet.js, but Mapbox also offers a suite of tools for more extensive mapping and geospatial visualization needs.
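
To give a sense of how lightweight these libraries can be, here is a minimal Leaflet sketch; it assumes leaflet.js and leaflet.css are loaded and that the page contains a <div id="map"> with a height set, and the coordinates are arbitrary examples.

// create an interactive map centered on example coordinates (New York City)
var map = L.map('map').setView([40.7308, -73.9973], 14);

// pull in OpenStreetMap tiles as the base layer
L.tileLayer('https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png', {
  attribution: '&copy; OpenStreetMap contributors'
}).addTo(map);

// drop a single marker with a popup
L.marker([40.7308, -73.9973]).addTo(map).bindPopup('An example point');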

Open Geospatial Data

Librarians wanting to integrate geospatial data visualization and GIS into interdisciplinary instruction can take advantage of open data sets that are increasingly available online. Sui (2014) notes that increasingly large data sets are being released freely and openly on the web, which is an exciting trend for GIS and open data enthusiasts. However, Sui also notes that the mere fact that data is legally released and made accessible “does not necessarily mean that data is usable (unless one has the technical expertise); thus they are not actually used at all.”6  Libraries could play a crucial role in helping users understand and interpret public data by integrating data visualization into information literacy instruction.

Some popular places to find open data that could be used in geospatial visualization include:

  • Data.gov – Since 2009, Data.gov has published thousands of public open datasets, including datasets containing geographic and geospatial information. As of this month, you can now open geospatial data files directly in CartoDB (requires a CartoDB account) to start making visualizations. There isn’t a huge amount of geospatial data available currently, but Data.gov will hopefully benefit from initiatives like Project Open Data, which was launched in 2013 by the White House and designed to accelerate the publishing of open data sets by government agencies.
  • Google Public Data Explorer – This is a somewhat small set of public data that Google has gathered from other open data repositories (such as Eurostat) that can be directly visualized using Google charting tools.  For example, you could create a visualization of European population change by country using data available through the Public Data Explorer.  While the currently available data is pretty limited, Google has prepared a kind of open data metadata standard (Data Set Publishing Language, or DSPL) that might increase the availability of data through the explorer if the standard takes off.
  • publicdata.eu – The destination for Europe’s public open data. A nice feature of publicdata.eu is the ability to filter down to datasets that contain Shapefiles (.shp files), which can be directly imported into GIS software or Google Earth Pro.
  • OpenStreetMap (OSM) –  Open, crowdsourced street map data that can be downloaded or referenced to create basemaps or other geospatial visualizations that rely on transportation networks (roads, railways, walking paths, etc.).  OpenStreetMap data are open, so for those who would prefer to make applications that are based entirely on open data (rather than commercial solutions), OSM can be combined with JavaScript libraries like Leaflet.js for fully open geospatial applications.

GIS and Geospatial Visualization In the Library

I feel like I’ve only really scratched the surface with the possibilities for libraries to get involved with GIS and geospatial data.  Libraries are doing really exciting things with these technologies, whether it’s creating new ways of interacting with historical maps, lending GPS units, curating and preserving geospatial data, exploring geospatial linked data possibilities with GeoSPARQL or integrating GIS or geospatial visualization into information literacy / instruction programs.  For more ideas about integrating GIS and geospatial visualization into library instruction and services, check out these guides:

(EDIT 4/13) Also be sure to check out ALA’s Map and Geospatial Information Round Table (MAGIRT).  Thanks to Paige Andrew and Kathy Weimer for pointing out this awesome resource in the comments.

If you’re working on something awesome related to geospatial data in your library and would be interested in writing about it for ACRL TechConnect, contact me on Twitter @lpmagnuson or drop me a line in the comments!

Notes

  1. AtlasPublisher. Tokyo Senior Population. https://www.arcgis.com/home/webmap/viewer.html?webmap=6990a8c5e87b42ee80701cf985383d5d.  (Note:  Apologies if you have trouble seeing or zooming in on embedded visualizations in this post; the interaction behavior of these embedded iframes can be a little unpredictable if your cursor gets near them.  It’s definitely a drawback of embedding these interactive visualizations as iframes.)
  2. The Open Geospatial Consortium is an organization that gathers and shares information about geographic and geospatial data formats, and details about a variety of geospatial file formats and standards can be found on its website:  http://www.opengeospatial.org/.
  3. ESRI. A Nation of Drones. http://story.maps.arcgis.com/apps/MapSeries/?appid=79798a56715c4df183448cc5b7e1b999
  4. Lauder, Thomas Suh (2014). Pollution Burdens. http://graphics.latimes.com/responsivemap-pollution-burdens/.
  5. YMMV, but the performance of map animations that use Torque seems to be a little tricky, especially when embedded in an iFrame.  I tried to embed the Ferguson Twitter map into this post (because it is really cool looking), and it really slowed down page loading, and the script seemed to get stuck at times.
  6. Sui, Daniel. “Opportunities and Impediments for Open GIS.” Transactions in GIS, 18.1 (2014): 1-24.

Using Grunt to Automate Repetitive Tasks

Riding a tangent from my previous post on web performance, here is an introduction to Grunt, a JavaScript task runner.

Why would we use Grunt? It’s become a common tool for web development as it puts together a number of tedious but necessary steps for optimizing a website. However, “task runner” is intentionally generic; Grunt isn’t specifically limited to websites, as it can move and modify files in manifold ways.

Initial Setup

Unfortunately, virtually no operating system comes prepared to use Grunt out of the box. Grunt is written in Node.js and thus requires a few install steps. The good news is that Node is cross-platform; in my experience, it works better on Windows than most other programming frameworks.

  • Install Node
  • Ensure that the node and npm commands are on your path by running node --version and npm --version
    • If not, try a web search like "add node to path {{operating system}}"; it takes at most editing a single line of a particular file
  • Install Grunt globally with npm install -g grunt-cli
  • Ensure Grunt is on your path (grunt --version) and, again, search for an answer if not
  • Inside your project, run npm install grunt to install a local copy of grunt (it’ll appear in a folder named “node_modules”)

We’re ready to run! Let’s do a basic example.

First Example: Basic Web App Optimization

Say we have a simple website: there’s an index.html page, a stylesheet in a “css” subfolder, and a script in a “js” subfolder. First, let’s define what we want to accomplish. We want to: keep a full-size, easily readable copy of all our code while also building minified versions of both the CSS and JS. To do this, we’ll use three plugins: cssmin, uglify, and copy. This whole example is available on GitHub; even if you don’t use git, you can download a zip archive of the files.

First, inside our project, we run npm install grunt-contrib-cssmin grunt-contrib-uglify grunt-contrib-copy. These plugins are now installed in a “node_modules” folder, but Grunt still needs to know how to use them. It needs to know what tasks manipulate what files and other options. We provide this information in a file named “Gruntfile.js” in the root of our project. Here’s our initial one:

module.exports = function(grunt) {
    // this tells Grunt about the 3 plugins we installed 
    grunt.loadNpmTasks('grunt-contrib-cssmin');
    grunt.loadNpmTasks('grunt-contrib-uglify');
    grunt.loadNpmTasks('grunt-contrib-copy');
    grunt.initConfig({
        // our configuration will go here 
    });
    // we're going to want to run cssmin, uglify, & copy together 
    // so let's group them under a single "build" task 
    grunt.registerTask('build', ['cssmin', 'uglify', 'copy']);
};

For each task, there’ll be a section inside initConfig with its settings. For cssmin, the setup is nested a few layers but is really just a single line of code. Under a cssmin property we specify a target name, which can be anything. Targets allow us to have multiple configurations for a single task, which is handy in more complex projects but often unneeded. Under our target, which we’ll name “minify”, there’s a files property where we associate an array of input files with a single output file. This makes concatenating multiple stylesheets easy.

cssmin: {
  minify: {
    files: {
      'build/css/main.css': ['css/main.css']
    }
  }
}, // trailing comma since we'll add another section

Uglify’s setup is identical. We’ll name the target the same and only change the paths inside the files property.

uglify: {
  minify: {
    files: {
      'build/js/main.js': ['js/main.js']
    }
  }
}, // again, trailing comma, we add more below

While cssmin handles stylesheets and uglify handles JavaScript, our index.html only needs to be copied and not modified.1 See if you can write out the copy task’s settings by yourself, mimicking what we’ve already done.
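
If you get stuck, one possible configuration is shown below; the target name is arbitrary, and it relies on grunt-contrib-copy accepting the same files mapping format we used above (destination path on the left, source array on the right).

copy: {
  minify: {
    files: {
      'build/index.html': ['index.html']
    }
  }
} // last section, so no trailing comma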

Now we run grunt build in our command prompt and some messages tell us about each task’s status. Look in the “build” folder which appears after we ran the command. Far smaller, optimized versions of our main CSS and JS files are in there.

Great! We’ve accomplished our goals. But we must run grunt build over and over each time we want to remake our optimized assets, switching between our code editor and command prompt each time. If we’re doing a lot of piecemeal editing, this is most annoying. Instead, let’s use another plugin by running npm install grunt-contrib-watch to get the “watch” task and load it with the line grunt.loadNpmTasks('grunt-contrib-watch').2 Then, write this configuration:

watch: {
  minify: {
    files: ['*.html', 'css/*.css', 'js/*.js'],
    tasks: ['build']
  }
}

Watch has just two intuitive parameters in its configuration: an array of files to watch and an array of tasks to execute when those files change. The asterisks are wildcards, so unlike our settings above this stanza isn’t dependent on exact file names. Now, by running grunt watch in our command prompt, optimized assets are magically constructed every time we save changes to a file. We can edit in peace without continually switching between the command line and our editor. Better yet, watch can work with a local development server to reload new versions of files upon every edit.3 The right combination of tasks can yield super efficient workflows where we edit and view results without worrying about optimizations made behind the scenes.
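
For the live-reloading workflow just mentioned, here is a sketch of how our watch stanza could be extended; the livereload option comes from grunt-contrib-watch (see note 3), and it assumes the page includes the live-reload script or a compatible browser extension.

watch: {
  minify: {
    files: ['*.html', 'css/*.css', 'js/*.js'],
    tasks: ['build'],
    options: {
      livereload: true // start a live-reload server so the browser refreshes after each build
    }
  }
}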

More Advanced: Portable Header

While the above is suitable for much small-scale web development, Grunt can handle far more complex situations. For example, I wrote a portable HTML header which can be inserted onto various vendor websites such as LibGuides or an A to Z list. The project’s Gruntfile is 159 lines long and makes use of ten Grunt plugins.

I won’t go into detail explaining how each Grunt task’s settings work, but I will outline what’s happening. A “sass” task compiles SCSS code into minified CSS that browsers can understand using an external program. A couple of linting tools, jshint and scss-lint, check files against code quality heuristics. Our good friends copy and uglify are back doing their jobs, only this time they’re joined by htmlmin which handles the index page. “String-replace” is an example of a multi-target task; its first target searches over a series of files for strings wrapped in double curly braces like “{{example}}”. It then swaps out these placeholders with values specified in another file. The second target takes the entire contents of a stylesheet and a script and inlines them into the main HTML.
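
To illustrate the placeholder swapping described above, here is a rough sketch of what that first target might look like; the file paths, placeholder name, and replacement value are hypothetical, and the replacements option comes from the grunt-string-replace plugin.

'string-replace': {
  tpl: {
    files: {
      'build/index.html': ['src/index.html'] // hypothetical paths
    },
    options: {
      replacements: [{
        pattern: /{{libraryName}}/g, // a placeholder wrapped in double curly braces
        replacement: 'Example University Library'
      }]
    }
  }
}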

That’s a lot of labor being handled by computer programs instead of humans. While passing a couple of files through tools that remove comments and whitespace isn’t tough, the many steps in constructing an optimized HTML header from several files provide a good demonstration of Grunt’s value. While it took me some time to configure everything properly, the combined “build” task for this project has probably run hundreds of times and saved me hours of work. Not only that, because of the linting and minification, the final product is doubtless of higher quality than I could assemble manually.

The length and complexity of my Gruntfile points to one of the tougher pieces of using Grunt heavily; the order and delegated responsibilities of numerous tasks is tricky to coordinate properly. Throughout my Gruntfile, there are comments indicating when particular tasks run because running them out of order would either be fruitless or cause an error. For instance, the “string-replace” task’s “inline” target must run after those other files have been minified, otherwise the minification serves no purpose (the inlined code would be full size).

Similarly, coordinating which tasks move which files has been a constant headache for me in many projects. While the “copy” task moves images to the build folder, the “tpl” target of the “string-replace” task moves everything else. But I could’ve also used the uglify or sass tasks to move files! Since every task can potentially move the files it operates upon, it’s difficult to keep track of where a file is at a particular time in the workflow. The best way to debug these issues is to run multi-task aliases like build one at a time; first run uglify, then run cssmin, then run htmlmin… checking the state of files in between each to make sure that changes are occurring as anticipated.

Use Cases Abound

I use Grunt in almost all my projects, whether they’re web development or not. I use Grunt to copy my shell customizations into place, so that when I’m working on a new one I can just run grunt watch and rely on the changes being synched into place. I use Grunt to build Chrome extensions which require extra packaging steps before they can be pushed into the Chrome Web Store. I use Grunt in our catalog’s customized pages to minify code and also to check for potential errors. As an additional step, Grunt can be hooked up to deploy processes such that once a successful build is made the new files are pushed off to a remote server.

Grunt can be used to construct almost any workflow from a series of discrete pieces. Compiling some EAD finding aids into an HTML website via XSLT. Processing vendor MARC files with a PyMARC script and then uploading them into an ILS. Anything that can be scripted could be tied to Grunt tasks with a plug-in like grunt-exec, which executes arbitrary shell commands. However, there is a limit to what it’s sensible to do with Grunt. These last two examples are arguably better accomplished with shell scripts. Grunt is at its best when its great suite of plug-ins are relied upon and those tend to perform web-specific tasks. Grunt also requires at least a modicum of comfort with coding. It falls into an odd space, because while the configuration file is indeed JavaScript, it reads like a series of lists of settings, files, and ordered tasks. If you have more complex needs that involve if-then conditions and custom scripts, a lot of Grunt’s utility is negated. On the other hand, for those who would rather avoid code and the command line, options like the CodeKit app make more sense.
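
As a sketch of that last idea, here is roughly what a grunt-exec configuration wrapping a shell command could look like; the task name, stylesheet, and file paths are all hypothetical placeholders.

exec: {
  build_finding_aid: {
    // run an arbitrary shell command; xsltproc and these file names are placeholders
    cmd: 'xsltproc ead-to-html.xsl finding-aid.xml > build/finding-aid.html'
  }
}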

The Grunt site’s Getting Started and Sample Gruntfile pages are helpful sources of documentation.

A Beginner’s Guide to Grunt: Redux — a nice, updated overview. Some of the steps here are unnecessary for beginners, however, as they require a lot of files and structure. That’s great for experienced developers in the long run, because everything is smaller and more modular, but too much setup for simple projects.

I find myself constantly consulting the READMEs for various Grunt plugins to figure out how they work, since their options are not necessarily discoverable otherwise. A quick way to pop open the home page of a package is by running npm home grunt-contrib-uglify (inserting the plugin name of your choice), which will open the registered home page of the package, often on GitHub.

Finally, it’s worth mentioning Gulp, a competing JavaScript task runner. Gulp is the same type of tool as Grunt (you wouldn’t use both in a project) but follows a different design philosophy. In short, Gulp tends to run faster due to its design and setting it up looks more like code and less like a configuration file, which some people prefer.

Notes

  1. There’s actually another great plugin, grunt-contrib-htmlmin, which minifies HTML. Its settings are only a little bit more involved than the copy task, so trying to configure htmlmin would make another nice exercise for those wanting to build on this post.
  2. Writing grunt.loadNpmTasks for each task we add to a complex Gruntfile gets tiresome. It takes a bit more initial work—we need to run npm init before anything else, fill in some prompts, and append --save-dev to all your npm install commands—so I decided to skip it in this intro, but we can use load-grunt-tasks to get this down to a single line that doesn’t need to be updated each time a new plugin is added.
  3. The appropriate setting is options.livereload as documented here. While our scenario doesn’t quite capture how time-saving this can be, grunt watch shines when working with a language that compiles to CSS like SASS. A process like “edit SASS, compile CSS, reload web page, view changes” becomes simply “edit SASS, view changes” because the intermediary stages are triggered by grunt watch.

This Is How I Work (Nadaleen Tempelman-Kluit)

Editor’s Note: This post is part of ACRL TechConnect’s series by our regular and guest authors about The Setup of our work.

 

Nadaleen Tempelman-Kluit @nadaleen

Location: New York, NY

Current Gig: Head, User Experience (UX), New York University Libraries

Current Mobile Device: iPhone 6

Current Computer:

Work: MacBook Pro 13″ and an Apple 27-inch Thunderbolt display

An old Dell PC that I use solely to print and to access our networked resources

Home:

I carry my laptop to and from work with me and have an old MacBook Pro at home.

Current Tablet: First generation iPad, supplied by work

One word that best describes how you work: has anyone said frenetic yet?

What apps/software/tools can’t you live without?

Communication / Workflow

Slack is the UX Dept. communication tool in which all our communication takes place, including instant messaging, etc. We create topic channels in which we add links and tools and thoughts, and get notified when people add items. We rarely use email for internal communication.

Boomerang for Gmail – I write a lot of emails early in the morning, so I can schedule them to be sent at different times of the day without forgetting.

Pivotal Tracker – a user story-based project planning tool built around agile software development methods. We start with user flows, then integrate them into bite-size user stories in Pivotal, and then point them for development.

Google Drive

Gmail

Google Hangouts – We work closely with our Abu Dhabi and Shanghai campus libraries, so we do a lot of early morning and late night meetings using Google Hangouts (or GoToMeeting, below) to include everyone.

Wireframing, IA, Mockups

Sketch: A great lightweight design app

OmniGraffle: A more heavy-duty tool for wireframing, IA work, mockups, etc. Compatible with a ton of stencil libraries, including the great Konigi stencils and Google Material Design icons. Great for interactive interface demos, and for user flows and personas.

Adobe Creative Cloud

Post It notes, Graph paper, White Board, Dry-Erase markers, Sharpies, Flip boards

Tools for User Centered Testing / Methods 

GoToMeeting – to broadcast formal usability testing to observers in another room, so they can take notes and view the testing in real time and ask virtual follow-up questions for the facilitator to ask participants.

Crazy Egg – a heat-mapping, hot-spotting, A/B testing tool which, when coupled with analytics, really helps us get a picture of where users are going on our site.

Silverback – a screen-capturing usability-testing app.

Post-it Plus – We do a lot of affinity-grouping exercises and interface sketches using Post-it notes, so this app is super cool and handy.

OptimalSort – Online card-sorting software.

Personas – To think through our user flows when designing a process, service, or interface. We then use these personas to create more granular user stories in Pivotal Tracker (above).

What’s your workspace like?

I’m on the mezzanine of Bobst Library which is right across from Washington Square Park. I have a pretty big office with a window overlooking the walkway between Bobst and the Stern School of Business.

I have a huge old subway map on one wall with an original heavy wood frame, and everyone likes looking at old subway lines, etc. I also have a map sheet of the mountain I’m named after. Otherwise, it’s all white board and I’ve added our personas to the wall as well so I can think through user stories by quickly scanning and selecting a relevant persona.

I’m in an area where many of my colleagues’ mailboxes are, so people stop by a lot. I close my door when I need to concentrate, and on Fridays we try to work collaboratively in a basement conference room with a huge whiteboard.

I have a heavy wooden L-shaped desk which I am trying to replace with a standing desk.

Every morning I go to Oren’s, a great coffee shop nearby, with the same colleague and friend, and we usually do “loops” around Washington Square Park to problem solve and give work advice. It’s a great way to start the day.

What’s your best time-saving trick?

Informal (but not happenstance) communication saves so much time in the long run and helps alleviate potential issues that can arise when people aren’t communicating. Though it takes a few minutes, I try to touch base with people regularly.

What’s your favorite to-do list manager?

My whiteboard, supplemented by stickies (mac), and my huge flip chart notepad with my wish list on it. Completed items get transferred to a “leaderboard.”

Besides your phone and computer, what gadget can’t you live without?

Headphones

What everyday thing are you better at than everyone else?

I don’t think I do things better than other people, but I think my everyday strengths include:  encouraging and mentoring, thinking up ideas and potential solutions, getting excited about other people’s ideas, trying to come to issues creatively, and dusting myself off.

What are you currently reading?

I listen to audiobooks and podcasts on my bike commute. Among my favorites:

In print, I’m currently reading:

What do you listen to while at work?

Classical is the only type of music I can play while working and still be able to (mostly) concentrate. So I listen to the masters, like Bach, Mozart, and Tchaikovsky.

When we work collaboratively on creative things that don’t require earnest concentration I defer to one of the team to pick the playlist. Otherwise, I’d always pick Josh Ritter.

Are you more of an introvert or an extrovert?

Mostly an introvert who fakes being an extrovert at work but as other authors have said (Eric, Nicholas) it’s very dependent on the situation and the company.

What’s your sleep routine like?

Early to bed, early to rise. I get up between 5 and 6 and go to bed around 10.

Fill in the blank: I’d love to see _________ answer these same questions.

@Morville (Peter Morville)

@leahbuley (Leah Buley)

What’s the best advice you’ve ever received?

Show up


This Is How I (Attempt To) Work

Editor’s Note: ACRL TechConnect blog will run a series of posts by our regular and guest authors about The Setup of our work. The first post is by TechConnect alum Becky Yoose.

Ever wondered how several of your beloved TechConnect authors and alumni manage to Get Stuff Done? In conjunction with The Setup, this is the first post in a series of TechConnect authors, past and present, to show off what tools, tips, and tricks they use for work.

I have been tagged by @nnschiller in his “This is how I work” post. Normally, I just hide when these types of chain-letter events come along, but this time I’ll indulge everyone and dust off my blogging skills. I’m Becky Yoose, Discovery and Integrated Systems Librarian, and this is how I work.

Location: Grinnell, Iowa, United States

Current Gig: Assistant Professor, Discovery and Integrated Systems Librarian; Grinnell College

Current Mobile Device: Samsung Galaxy Note 3, outfitted with an OtterBox Defender cover. I still mourn the discontinuation of the Droid sliding keyboard models, but the oversized screen and stylus make up for the lack of tactile typing.

Current Computer:

Work: HP EliteBook 8460p (due to be replaced in 2015); boots Windows 7

Home: Betty, my first build; dual boots Windows 7 and Ubuntu 14.04 LTS

eeepc 901, currently b0rked due to misjudgement on my part about appropriate xubuntu distros.

Current Tablet: iPad 2, supplied by work.

One word that best describes how you work:

Panic!

Don’t panic. Nothing to see here. Move along.

What apps/software/tools can’t you live without?

Essential work computer software and tools, in no particular order:

  • Outlook – email and meetings make up the majority of my daily interactions with people at work and since campus is a Microsoft shop…
  • Notepad++ – my Swiss army knife for text-based duties: scripts, notes, and everything in between.
  • PuTTY – Great SSH/Telnet client for Windows.
  • Marcedit – I work with library metadata, so Marcedit is essential on any of my work machines.
  • MacroExpress and AutoIt – Two different Windows automation apps: MacroExpress handles more simple automation (opening programs, templating/constant data, simple workflows involving multiple programs) while AutoIt gives you more flexibility and control in the automation process, including programming local functions and more complex decision-making processes.
  • Rainmeter and Rainlander – These two provide customized desktop skins that give you direct or quicker access to specific system information, functions, or in Rainlander’s case, application data.
  • Pidgin – MPOW uses both LibraryH3lp and AIM for instant messaging services, and I use IRC to keep in touch with #libtechwomen and #code4lib channels. Being able to do all three in one app saves time and effort.
  • Jing – while the Snipping Tool in Windows 7 is great for taking screenshots for emails, Jing has proven to be useful for both basic screenshots and screencasts for troubleshooting systems issues with staff and library users. The ability to save screencasts on screencast.com is also valuable when working with vendors in troubleshooting problems.
  • CCleaner – Not only does it empty your recycling bin and temporary files/caches, the various features available in one spot (program lists, registry fixes, startup program lists, etc.) make CCleaner an efficient way to do housekeeping on my machines.
  • Janetter (modified code for custom display of Twitter lists) – Twitter is my main information source for the library and technology fields. One feature I use extensively is the List feature, and Janetter’s plugin-friendly set up allows me to highly customize not only the display but what is displayed in the list feeds.
  • Firefox, including these plugins (not an exhaustive list):

For server apps, the main app (beyond putty or vSphere) that I need is Nagios to monitor the library virtual Linux server farm. I also am partial to nano, vim, and apt.

As one of the very few tech people on staff, I need a reliable system to track and communicate technical issues with both library users and staff. Currently the Libraries is piggybacking on ITS’ ticketing system KBOX. Despite being fit into a somewhat inflexible existing structure, it has worked well for us, and since we don’t have to maintain the system, all the better!

Web services: The Old Reader, Gmail, Google Drive, Skype, Twitter. I still mourn the loss of Google Reader.

For physical items, my tea mug. And my hat.

What’s your workspace like?

Take a concrete box, place it in the dead center of the library, cut out a door in one side, place the door opening three feet from the elevator door, cool it to a consistent 63-65 degrees F., and you have my office. Spending 10+ hours a day during the week in this office means a bit of modding is in order:

  • Computer workstation set up: two HP LA2205wg 22 inch monitors (set to appropriate ergonomic distances on desk), laptop docking station, ergonomic keyboard/mouse stand, ergonomic chair. Key word is “ergonomic”. I can’t stress this enough with folks; I’ve seen friends develop RSIs on the job years ago and they still struggle with them today. Don’t go down that path if you can help it; it’s not pretty.
  • Light source: four lamps of varying size, all with GE Daylight 6500K 15 watt light bulbs. I can’t do the overhead lights due to headaches and migraines, so these lamps and bulbs help make an otherwise dark concrete box a little brighter.
  • Three cephalopods, a starfish, a duck, a moomin, and cats of various materials and sizes
  • Well stocked snack/emergency meal/tea corner to fuel said 10+ hour work days
  • Blankets, cardigans, shawls, and heating pads to deal with the cold

When I work at home during weekends, I end up in the kitchen with the laptop on the island, giving me the option to sit on the high chair or stand. Either way, I have a window to look at when I need a few seconds to think. (If my boss is reading this – I want my office window back.)

What’s your best time-saving trick?

Do it right the first time. If you can’t do it right the first time, then make the path to making it right as efficient and painless as you possibly can. Alternatively, build a time machine to prevent those disastrous metadata and systems decisions made in the past that you’re dealing with now.

What’s your favorite to-do list manager?

Post-it notes on a wall

[Image: the Big Picture list from 2012]

I have tried online to-do list managers, such as Trello; however, I have found that physical managers work best for me. In my office I have a to-do management system that comprises three types of lists:

  • The Big Picture List (2012 list pictured above) – four big Post-it sheets on my wall, labeled by season and divided by month within each sheet. Smaller Post-it notes indicate which projects are going on in which months. This is a great way to get a quick visual of what needs to be completed, what can be delayed, etc.
  • The Medium Picture List – a mounted whiteboard on the wall in front of my desk. Here specific projects are listed with one to three action items that need to be completed within a certain time, usually within one to two months.
  • The Small Picture List – written on discarded Choice review cards, the perfect size to quickly jot down things that need to be done either today or in the next few days.

Besides your phone and computer, what gadget can’t you live without?

My wrist watch, set five minutes fast. I feel self-conscious if I go out of the house without it.

What everyday thing are you better at than everyone else?

I’d like to think that I’m pretty good with adhering to Inbox Zero.

What are you currently reading?

The Practice of System and Network Administration, 2nd edition. Part curiosity, part wanting to improve at my sysadmin responsibilities, part wanting to be able to communicate better with my IT colleagues.

What do you listen to while you work?

It depends on what I am working on. I have various stations on Pandora One and a selection of iTunes playlists to choose from depending on the task at hand. The choices range from medieval chant (for long form writing) to thrash metal (XML troubleshooting).

Realistically, though, the sounds I hear most are email notifications, the operation of the elevator that is three feet from my door, and the occasional TMI conversation between students who think the hallway where my office and the elevator are located is deserted.

Are you more of an introvert or an extrovert?

An introvert blessed/cursed with her parents’ social skills.

What’s your sleep routine like?

I turn into a pumpkin at around 8:30 pm, sometimes earlier. I wake up around 4:30 am most days, though I do cheat and don’t get out of bed until around 5:15 am, checking email, news feeds, and my calendar to prepare for the coming day.

Fill in the blank: I’d love to see _________ answer these same questions.

You. Also, my cats.

What’s the best advice you’ve ever received?

Not advice per se, but life experience. There are many things one learns when living on a farm, including responsibility, work ethic, and realistic optimism. You learn to integrate work and life since, on the farm, work is life. You work long hours, but you also have to rest whenever you can catch a moment. If nothing else, living on a farm teaches you that no matter how long you put off doing something, it still has to be done. The earlier, the better, especially when it comes to shoveling manure.


Migrating to LibGuides 2.0

This summer Springshare released LibGuides 2.0, which is a complete revamp of the LibGuides system. Many libraries use LibGuides, either as course/research guides or in some cases as the entire library website, and so this is something that’s been on the mind of many librarians this summer, whichever side of LibGuides they usually see. The process of migrating is not too difficult, but the choices you make in planning the new interface can be challenging. As the librarians responsible for the migration, we will discuss our experience of planning and implementing the new LibGuides platform.

Making the Decision to Migrate

While migrating this summer was optional, Springshare will probably only support LibGuides 1 for another two years, and at Loyola we felt it was better to move sooner rather than later. Over the past few years there were perpetual LibGuides cleanup projects, and this seemed like a good opportunity to finalize that work. At the same time, we wanted to experiment with new designs for the library’s website that would bring it into closer alignment with the university’s new brand and make the site responsive, and LibGuides seemed like the ideal place to try out some of those ideas. Several new features, revealed on Springshare’s blog, resonated with subject-area specialists, which was another reason to push for a migration sooner rather than later. We also wanted to have it in place before the first day of classes, which gave us a few months to experiment.

The Reference and Electronic Resources Librarian, Will Kent, the Head of Reference, Niamh McGuigan, and the Digital Services Librarian, Margaret Heller, worked in concert to make decisions, and invited all the other reference and instruction librarians (plus anyone else who was interested) to participate in the process. There were a few ground rules the core team went by, however: we were migrating, and the process would be iterative – i.e., we weren’t waiting for perfection to launch.

Planning the Migration

During the migration planning process, the small team of three librarians worked together to create a timeline, report to library staff on progress, solicit feedback on the system, and update the LibGuides policies to reflect the new changes and functions. As far as the front-end migration went, we spoke at large staff-wide meetings, provided updates, polled subject specialists on progress, prepared our 400 databases for conversion to the new A-Z list, and demonstrated new features and changes that staff should be aware of. We would relay updates from Springshare and handle any troubleshooting questions as they happened.

Given the new features – new categories, new ways of searching, the A-Z database list, and more – it was important for us to sit down, discuss standards, and update our content policies. The good news was that most of our content was in good shape for the migration. The process was swift and, barring the inevitable tiny bugs, went smoothly.

Our original timeline was to present the migration steps at our June monthly joint meeting of collections and reference staff, and to give staff one month, until the July meeting, to complete the work. For various reasons this ended up stretching until mid-August, but we still launched the day before classes began. We are constantly in the process of updating guide types, adding new resources, and re-classifying boxes to adhere to our new policies.

Working on the Design

LibGuides 2.0 provides two basic templates: a left navigation menu, and a top tabbed menu that looks similar to the original LibGuides (additional templates are available with the LibGuides CMS product). We had originally discussed using the left navigation template and began a design based on it, but ultimately people felt more comfortable with the tabbed navigation.

Whiteboard sketch of the LibGuides UI

For the initial prototype, Margaret worked off a template that we’d used before for Omeka, which mirrors the Loyola University Chicago template very closely. We kept the standard LibGuides template – i.e., 1-3 columns, with the number of columns and sections within each column determined by the page creator – but added a few additional pieces in the header and footer, as well as making big changes to the tabs.

The first step in planning the design was to understand which customizations happened in the template and which in the header and footer, which are entered separately in the admin UI. Margaret sketched out our vision for the site on the whiteboard wall to determine which selectors already existed and which would need to be added, and to get a sense of whether we would need to change the content section at all. In the interest of completing the project in a timely fashion, we determined that the bare minimum of customization needed to unify the research guides with the rest of the university websites would be the first priority.

For those still planning a redesign, the Code4Lib community has many suggestions on what to consider. The main thing to keep in mind is that LibGuides 2.0 is based on the Bootstrap 3.0 framework, which Michael Schofield recently implored us to use responsibly. Other important considerations are the accessibility of the solution you pick and its use of white space.

LibGuides Look & Feel UI tabs

The Look & Feel section under ‘Admin’ has several tabs, with sections for Header and Footer, Custom CSS/JS, and page layout – Guide Pages Layout is the most relevant for this post.

Just as in the previous version of LibGuides, one can enter custom code for the header and footer (which in this case is almost the same as on the regular library website), as well as link to a custom CSS file (we did not include any custom JavaScript here, but did include several Google Fonts and our custom icon). The Guide Pages Layout is new, and this is where one can edit the actual template that creates each page. We didn’t make any large changes here, but were still able to achieve a unique look with custom CSS.

The new LibGuides platform is responsive, but we needed to account for several items we added to the interface. We added a search box that would allow users to search the entire university website, as well as several new logos, so Margaret added a few media queries to adjust these features on a phone or tablet, as well as adjust the spacing of the custom footer.

Improving the Design

Our first design was ready to present to the subject librarians a month after the migration process started. It was based on the principle of matching the luc.edu pages closely (example), in which the navigation tabs across the top have unusual cutouts and section titles are very large. No one was very happy with this result, however, as it made the typical LibGuides layout with multiple sections on a page unusable and left the tabs not visible enough. While one approach would have been to change the navigation to a left navigation menu and limit the number of sections, the majority of the subject librarians preferred to keep things closer to what they had been, with a view to moving toward a potential new layout in the future.

Once we determined a literal interpretation of the university website was not usable for our content, we found inspiration for the template body from another section of the university website that was aimed at presenting a lot of dynamic content with multiple sections, but kept the standard luc.edu header. This allowed us to create a page that was recognizably part of Loyola, but presented our LibGuides content in a much more usable form.

Sticky Tabs

The other piece we borrowed from the university website was sticky tabs. This was an attempt to make the tabs more visible and usable based on what we knew from usability testing on the old platform and what users would already know from the university site. Because LibGuides is based on the Bootstrap framework, it was easy to drop this in using the Affix plugin (tutorial on how to use this)1. The tabs are translucent so they don’t obscure content as one scrolls down.
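
Because jQuery and Bootstrap’s JavaScript were already on the page in our LibGuides build, wiring this up only takes a couple of lines. A rough sketch of the idea, not our exact code – the selector and the 200px offset are placeholders you would swap for whatever wraps the tabs in your own template:

$(function () {
  // Pin the guide's tab bar once the user scrolls past the page header.
  // '#guide-tabs' and the 200px offset are placeholders; inspect your own
  // LibGuides template to find the element that wraps the tabs.
  $('#guide-tabs').affix({
    offset: { top: 200 }
  });
  // Bootstrap's Affix only toggles the .affix / .affix-top classes; your
  // custom CSS still needs something like:
  //   .affix { position: fixed; top: 0; width: 100%; }
});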

Our final result was much more popular with everyone. It has a subtle background color and border around each box with a section header that stands out but doesn’t overwhelm the content. The tabs are not at all like traditional LibGuides tabs, functioning somewhat more like regular header links.

Final result.

Next Steps

Over the summer we were not able to conduct usability testing on the new interface due to the tight timeline, so the first step this fall is to integrate it into our regular usability testing schedule to make iterative changes based on user feedback. We also need to continue to audit the page to improve accessibility.

The research guides are one of the most used links on our website (anywhere between 10,000 and 20,000 visits per month), so our top priority was to make sure the migration did not interfere with use – both in terms of patron access and content creation by the subject-area librarians. Thanks to our feedback sessions, good communication with Springshare, and a reliable new platform, the migration went smoothly and without interruption.

About our guest author: Will Kent is Reference/Instruction and Electronic Resources Librarian and subject specialist for Nursing and Chemistry at Loyola University Chicago. He received his MSLIS from University of Illinois Urbana-Champaign in 2011 with a certificate in Community Informatics.

Notes
  1. You may remember that in the Bootstrap Responsibly post Michael suggested it wasn’t necessary to use this, but it is the most straightforward way to do it in LibGuides 2.0.

Bootstrap Responsibly

Bootstrap is the most popular front-end framework used for websites. An estimate by meanpath several months ago put it firmly behind 1% of the web – for good reason: Bootstrap makes it relatively painless to puzzle together a pretty awesome plug-and-play, component-rich site. Its modularity is its key feature, developed so Twitter could rapidly spin up internal microsites and dashboards.

Oh, and it’s responsive. This is kind of a thing. There’s not a library conference today that doesn’t showcase at least one talk about responsive web design. There’s a book, countless webinars, courses, whole blogs dedicated to it (ahem), and more. The pressure for libraries to have responsive, usable websites can seem to come more from the likes of us than from the patron base itself, but don’t let that discredit it. The trend is clear, and it is only a matter of time before our libraries have their mobile moment.

Library websites that aren’t responsive feel dated, and more importantly they are missing an opportunity to reach a bevy of mobile-only users that in 2012 already made up more than a quarter of all web traffic. Library redesigns are often quickly pulled together in a rush to meet the growing demand from stakeholders, pressure from the library community, and users. The sprint makes the allure of frameworks like Bootstrap that much more appealing, but Bootstrapped library websites often suffer the cruelest of responsive ironies:

They’re not mobile-friendly at all.

Assumptions that Frameworks Make

Let’s take a step back and consider whether using a framework is the right choice at all. A front-end framework like Bootstrap is a Lego set with all the pieces conveniently packed. It comes with a series of templates, a blown-out stylesheet, and scripts tuned to the environment that let users essentially copy-and-paste fairly complex web-machinery into being. Carousels, tabs, responsive dropdown menus, all sorts of buttons, alerts for every occasion, gorgeous galleries, and very smart decisions made by a robust team of developers far more capable than we are.

Except for the specific layout and the content, every Bootstrapped site is essentially a complete organism years in the making. This is also the reason that designers sometimes scoff, joking that these sites look the same. Decked-out frameworks are ideal for rapid prototyping with a limited timescale and budget because the design decisions have by and large already been made. They assume you plan to use the framework as-is, and they don’t make customization easy.

In fact, Bootstrap’s guide points out that any customization is better suited to be cosmetic than a complete overhaul. The trade-off is that Bootstrap is otherwise complete. It is tried, true, usable, accessible out of the box, and only waiting for your content.

Not all Responsive Design is Created Equal

It is still common to hear that the selling point for a swanky new site is that it is “responsive down to mobile.” The phrase probably rings a bell. It describes a website that collapses its grid as the width of the browser shrinks until its layout is appropriate for whatever screen users are carrying around. This is kind of the point – and cool, as any of us with a browser-resizing obsession could tell you.

Today, “responsive down to mobile” has a lot of baggage. Let me explain: it represents a telling and harrowing ideology in which, for these projects, mobile is the afterthought when mobile optimization should be the most important part. Library design committees don’t actually say this aloud or conceive of it when researching options, but it is implicit in the phrase. When mobile is an afterthought, the committee presumes users are more likely to visit from a laptop or desktop than from a phone (or refrigerator). This is not true.

See, a website, responsive or not, originally laid out for a 1366×768 desktop monitor in the designer’s office wistfully depends on visitors having that same browsing context. If it looks good in-office and loads fast, then looking good and loading fast must be the default. “Responsive down to mobile” is divorced from the reality that a similarly wide screen is not the common denominator. As such, responsive-down-to-mobile sites have a superficial layout optimized for the developers, not the users.

In a recent talk at An Event Apart – a conference – in Atlanta, Georgia, Mat Marquis stated that 72% of responsive websites send the same assets to mobile devices as they do to desktops, and this is largely contributing to the web feeling slower. While setting img { width: 100%; } will scale media to fit snugly to its container, it still sends the same high-resolution image to a 320px-wide phone as to a 720px-wide tablet. A 1.6mb page loads differently on a phone than on the machine it was designed on. The digital divide with which librarians are so familiar is certainly nowhere near closed, but while internet access is increasingly available, its ubiquity doesn’t translate to speed:

  1. 50% of users ages 12-29 are “mostly mobile” users, and you know what wireless connections are like,
  2. even so, the weight of the average website (currently 1.6mb) is increasing.

Last December, analysis of data from pagespeed quantiles during an HTTP Archive crawl tried to determine how fast the web was getting slower. The fastest sites are slowing at a greater rate than the big bloated sites, likely because the assets we send–like increasingly high resolution images to compensate for increasing pixel density in our devices–are getting bigger.

The havoc this wreaks on the load times of “mobile friendly” responsive websites is detrimental. Why? Well, we know that

  • users expect a mobile website to load as fast on their phone as it does on a desktop,
  • three-quarters of users will give up on a website if it takes longer than 4 seconds to load,
  • the optimistic average load time for just a 700kb website on 3G is more like 10-12 seconds

eep O_o.

A Better Responsive Design

So there was a big change to Bootstrap in August 2013 when it was restructured from a “responsive down to mobile” framework to “mobile-first.” It has also been given a simpler, flat design, which has 100% faster paint time – but I digress. “Mobile-first” is key. Emblazon this over the door of the library web committee. Strike “responsive down to mobile.” Suppress the record.

Technically, “mobile-first” describes the structure of the stylesheet using CSS3 Media Queries, which determine when certain styles are rendered by the browser.

.example {
  /* base styles load first */
}

@media screen and (min-width: 48em) {
  .example {
    /* these styles load once the screen is at least 48em wide */
  }
}

The most basic styles are loaded first. As more space becomes available, designers can assume (sort of) that the user’s device has a little extra juice and that their connection may be better, so they start adding pizzazz. One might decide that, hey, most of the devices narrower than 48em (approximately 768px with a base font size of 16px) are probably touch-only, so let’s not load any hover effects until the screen is wider.
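
The same gate works for scripted enhancements, too. A minimal JavaScript sketch of that decision – initHoverPreviews() is a made-up stand-in for whatever hover-driven extra you might add, and the 48em breakpoint simply mirrors the media query above:

// Skip hover-dependent extras on narrow, probably-touch screens.
function initHoverPreviews() {
  /* attach mouseover handlers, tooltips, etc. */
}

if (window.matchMedia('(min-width: 48em)').matches) {
  initHoverPreviews();
}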

Nirvana

In a literal sense, mobile-first is asset management. More than that, mobile-first is this philosophical undercurrent, an implicit zen of user-centric thinking that aligns with libraries’ missions to be accessible to all patrons. Designing mobile-first means designing to the lowest common denominator: functional and fast on a cracked Blackberry at peak time; functional and fast on a ten year old machine in the bayou, a browser with fourteen malware toolbars trudging through the mire of a dial-up connection; functional and fast [and beautiful?] on a 23″ iMac. Thinking about the mobile layout first makes design committees more selective of the content squeezed on to the front page, which makes committees more concerned with the quality of that content.

The Point

This is the important statement that Bootstrap now makes. It expects the design committee to think mobile-first. It comes with all the components you could want, but they want you to trim the fat.

Future Friendly Bootstrapping

This is what you get in the stock Bootstrap:

  • buttons, tables, forms, icons, etc. (97kb)
  • a theme (20kb)
  • javascripts (30kb)
  • oh, and jQuery (94kb)

That’s almost 250kb of website. This is like a browser eating a brick of Mackinac Island Fudge – and this high-calorie bloat doesn’t even include images. Consider that if the median load time for a 700kb page is 10-12 seconds on a phone, roughly a third of that time with out-of-the-box Bootstrap is spent loading just the framework’s assets.

While it’s not totally deal-breaking, 100kb is 5x as much CSS as an average site should have, as well as 15%-20% of what all the assets on an average page should weigh. – Josh Broton

To put this in context, I like to fall back on Ilya Grigorik’s example comparing load time to user reaction in his talk “Breaking the 1000ms Time to Glass Mobile Barrier.” If the site loads in just 0-100 milliseconds, this feels instant to the user. By 100-300ms, the site already begins to feel sluggish. At 300-1000ms, uh – is the machine working? After 1 second there is a mental context switch, which means that the user is impatient, distracted, or consciously aware of the load time. After 10 seconds, the user gives up.

By choosing not to pare down, your Bootstrapped library starts off on the wrong foot.

The Temptation to Widgetize

Even though Bootstrap provides modals, tabs, carousels, autocomplete, and other modules, this doesn’t mean a website needs to use them. Bootstrap lets you tailor which jQuery plugins are included in the final script. The hardest part of any redesign is to let quality content determine the tools, and not to let the ability to tabularize or scrollspy become an excuse to implement them. Oh, don’t Google those. I’ll touch on tabs and scrollspy in a few minutes.

I am going to be super presumptuous now and walk through the total Bootstrap package, then make recommendations for lightening the load.

Transitions

Transitions.js is a fairly lightweight CSS transition polyfill. What this means is that the script checks to see if your user’s browser supports CSS Transitions, and if it doesn’t then it simulates those transitions with javascript. For instance, CSS transitions often handle the smooth, uh, transition between colors when you hover over a button. They are also a little more than just pizzazz. In a recent article, Rachel Nabors shows how transition and animation increase the usability of the site by guiding the eye.

With that said, CSS Transitions have pretty good browser support and they probably aren’t crucial to the functionality of the library website on IE9.

Recommendation: Don’t Include.

Modals

“Modals” are popup windows. There are plenty of neat things you can do with them. Additionally, modals are a pain to design consistently for every browser. Let Bootstrap do that heavy lifting for you.

Recommendation: Include

Dropdown

It’s hard for a library website design committee to conclude without wanting a lot of links in the menu bar. Dropdown menus are kind of tricky to code, and Bootstrap does a really nice job of keeping them a consistent and responsive experience.

Recommendation: Include

Scrollspy

If you have a fixed sidebar or menu that follows the user as they read, scrollspy.js can highlight the section of that menu you are currently viewing. This is useful if your site has a lot of long-form articles, or if it is a one-page app that scrolls forever. I’m not sure this describes many library websites, but even if it does, you probably want more functionality than Scrollspy offers. I recommend jQuery-Waypoints – but only if you are going to do something really cool with it.
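
For the curious, the basic jQuery Waypoints pattern looks roughly like this – the selectors and the highlight behavior are hypothetical, and the snippet assumes jQuery and the Waypoints plugin are already loaded:

// Hypothetical: mark a menu item as active when its section scrolls into view.
$('#chapter-2').waypoint(function (direction) {
  $('#menu .chapter-2-link').toggleClass('active', direction === 'down');
});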

Recommendation: Don’t Include

Tabs

Tabs are a good way to break-up a lot of content without actually putting it on another page. A lot of libraries use some kind of tab widget to handle the different search options. If you are writing guides or tutorials, tabs could be a nice way to display the text.

Recommendation: Include

Tooltips

Tooltips are often descriptive popup bubbles for a section, option, or icon requiring more explanation. Tooltips.js helps handle the predictable positioning of the tooltip across browsers. With that said, I don’t think tooltips are that engaging; they’re sometimes appropriate, but you definitely used to see more of them in the past. Your library’s time is better spent de-jargoning any content that would warrant a tooltip. Need a tooltip? Why not just make whatever needs the tooltip more obvious O_o?

Recommendation: Don’t Include

Popover

Even fancier tooltips.

Recommendation: Don’t Include

Alerts

Alerts.js lets your users dismiss alerts that you might put in the header of your website. It’s always a good idea to give users some kind of control over these things. Better they read and dismiss than get frustrated from the clutter.

Recommendation: Include

Collapse

The collapse plugin allows for accordion-style sections for content similarly distributed as you might use with tabs. The ease-in-ease-out animation triggers motion-sickness and other aaarrghs among users with vestibular disorders. You could just use tabs.

Recommendation: Don’t Include

Button

Button.js gives a little extra jolt to Bootstrap’s buttons, allowing them to communicate an action or state. To see what that means, imagine you fill out a reference form and click “submit.” Button.js will put a little loader icon in the button itself and change the text to “sending ….” This way, users are told that the process is running, and maybe they won’t feel compelled to click and click and click until the page refreshes. This is a good thing.
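
A hedged sketch of that flow – the form and button selectors are hypothetical, and the markup is assumed to include data-loading-text="Sending…" on the button:

// On submit, swap the button into its "loading" state; reset it when done.
$('#reference-form').on('submit', function () {
  var $btn = $('#reference-submit').button('loading'); // shows data-loading-text
  // ...send the form via AJAX, then in the success/error callback:
  // $btn.button('reset');
});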

Recommendation: Include

Carousel

Carousels are among the most popular design elements on the web. They let a website slideshow content like upcoming events or new material. Carousels exist because design committees must be appeased. There are all sorts of reasons why you probably shouldn’t put a carousel on your website: they are largely inaccessible, have low engagement, are slooooow, and kind of imply that libraries hate their patrons.

Recommendation: Don’t Include.

Affix

I’m not exactly sure what this does. I think it’s a fixed-menu thing. You probably don’t need this. You can use CSS.

Recommendation: Don’t Include

Now, Don’t You Feel Better?

Just comparing the bootstrap.js and bootstrap.min.js files between out-of-the-box Bootstrap and a build tailored to the specs above – which of course doesn’t consider the differences in the CSS, the weight of the images not included in a carousel, or the unquantifiable amount of pain you would have inflicted – the numbers are telling:

File              Before    After
bootstrap.js      54kb      19kb
bootstrap.min.js  29kb      10kb

So, Bootstrap Responsibly

There is more to say. When bouncing this topic around Twitter a while ago, Jeremy Prevost pointed out that Bootstrap’s minified assets can be gzipped down to about 20kb total. This is the right way to serve assets from any framework. It requires an Apache config or .htaccess rule. Here is the .htaccess file used in HTML5 Boilerplate. You’ll find it well commented and modular: go ahead and just copy and paste the parts you need. You can eke out even more performance by “lazy loading” scripts, but that is a little out of the scope of this post.

Here’s the thing: when we talk about having good library websites we’re mostly talking about the look. This is the wrong discussion. Web designs driven by anything but the content they already have make grasping assumptions about how slick it would look to have this killer carousel, these accordions, nifty tooltips, and of course a squishy responsive design. Subsequently, these responsive sites miss the point: if anything, they’re mobile unfriendly.

Much of the time, a responsive library website is used as a marker that such-and-such site is credible and not irrelevant, but as such the website reflects a lack of purpose (e.g., “this website needs to increase library-card registration”). A superficial understanding of responsive web design and easy-to-grab frameworks leaves the patron as the lowest priority.

 

About Our Guest Author:

Michael Schofield is a front-end librarian in south Florida, where it is hot and rainy – always. He tries to do neat things there. You can hear him talk design and user experience for libraries on LibUX.


Two Free Methods for Sharing Google Analytics Data Visualizations on a Public Dashboard

UPDATE (October 15th, 2014):

OOCharts as an API and service, described below, will be shut down November 15th, 2014.  More information about the developers’ decision to shut down the service is here. It’s not entirely surprising that the service is going away, considering the Google superProxy option described in this post.  I’m leaving all the OOCharts instructions here for posterity, but as of November 15th you can only use the superProxy or build your own solution with the Core Reporting API.

At this point, Google Analytics is arguably the standard way to track website usage data.  It’s easy to implement, but powerful enough to capture both wide general usage trends and very granular patterns.  However, it is not immediately obvious how to share Google Analytics data either with internal staff or external stakeholders – both of whom often have widespread demand for up-to-date metrics about library web property performance.  While access to Google Analytics can be granted by Google Analytics administrators to individual Google account-holders through the Google Analytics Admin screen, publishing data without requiring authentication requires some intermediary steps.

There are two main free methods of publishing Google Analytics visualizations to the web, and both involve using the Google Analytics API: OOCharts and the Google Analytics superProxy.  Both methods rely upon an intermediary service to retrieve data from the API and cache it, both to improve retrieval time and to avoid exceeding limits on API requests.1  The first method – OOCharts – requires much less time to get running initially. However, OOCharts’ long-standing beta status and its position as a stand-alone service give it less potential for long-term support than the second method, the Google Analytics superProxy.  For that reason, while OOCharts is certainly easier to set up, the superProxy method is definitely worth the investment in time (depending on your needs).  I’ll cover both methods.

OOCharts Beta

OOCharts’ service facilitates the creation and storage of Google Analytics API Keys, which are required for sending secure requests to the API in order to retrieve data.

When setting up an OOCharts account, create your account utilizing the email address you use to access Google Analytics.  For example, if you log into Google Analytics using admin@mylibrary.org, I suggest you use this email address to sign up for OOCharts.  After creating an account with OOCharts, you will be directed to Google Analytics to authorize the OOCharts service to access your Google Analytics data on your behalf.

Let OOCharts gather Google Analytics Data on your Behalf.

After authorizing the service, you will be able to generate an API key for any of the properties to which your Google Analytics account has access.

In the OOCharts interface, click your name in the upper right corner, and then click inside the Add API Key field to see a list of available Analytics Properties from your account.

 

Once you’ve selected a property, OOCharts will generate a key for your OOCharts application.  When you go back to the list of API Keys, you’ll see your keys along with property IDs (shown in brackets after the URL of your properties, e.g., [9999999]).

Your API Keys now show your key values and your Google Analytics property IDs, both of which you’ll need to create visualizations with the OOCharts library.

 

Creating Visualizations with the OOCharts JavaScript Library

OOCharts appears to have started as a simple chart library for visualization data returned from the Google Analytics API. After you have set up your OOCharts account, download the front-end JavaScript library (available at http://docs.oocharts.com/) and upload it to your server or localhost.   Navigate to the /examples directory and locate the ‘timeline.html’ file.  This file contains a simplified example that displays web visits over time in Google Analytics’ familiar timeline format.

The code for this page is very simple, and contains two methods – JavaScript-only and with HTML attributes – for creating OOCharts visualizations.  Below, I’ve separated out the required elements for both methods.  While either method will work on its own, using HTML attributes allows for additional customization and styling:

JavaScript-Only
		<h3>With JS</h3>
		<div id='chart'></div>
		
		<script src='../oocharts.js'></script>
		<script type="text/javascript">

			window.onload = function(){

				oo.setAPIKey("{{ YOUR API KEY }}");

				oo.load(function(){

					var timeline = new oo.Timeline("{{ YOUR PROFILE ID }}", "180d");

					timeline.addMetric("ga:visits", "Visits");

					timeline.addMetric("ga:newVisits", "New Visits");

					timeline.draw('chart');

				});
			};

		</script>
With HTML Attributes:
<h3>With HTML Attributes</h3>
	<div data-oochart='timeline' 
        data-oochart-metrics='ga:visits,Visits,ga:newVisits,New Visits' 
        data-oochart-start-date='180d' 
        data-oochart-profile='{{ YOUR PROFILE ID }}'></div>
		
			<script src='../oocharts.js'></script>
			<script type="text/javascript">

			window.onload = function(){

				oo.setAPIKey("{{ YOUR API KEY }}");

				oo.load(function(){
				});
			};

		</script>

 

For either method, plug in {{ YOUR API KEY }} where indicated with the API key generated in OOCharts, and replace {{ YOUR PROFILE ID }} with the associated eight-digit profile ID.  Load the page in your browser, and you get this:

With the API Key and Profile ID in place, the timeline.html example looks like this. In this example I also adjusted the date parameter (30d by default) to 180d for more data.

This example shows you two formats for the chart – one that is driven solely by JavaScript, and another that can be customized using HTML attributes.  For example, you could modify the <div> tag to include a style attribute or CSS class to change the width of the chart, e.g.:

<h3>With HTML Attributes</h3>

<div style="width:400px" data-oochart='timeline' 
data-oochart-start-date='180d' 
data-oochart-metrics='ga:visits,Visits,ga:newVisits,New Visits' 
data-oochart-profile='{{ YOUR PROFILE ID }}'></div>

Here’s the same example.html file showing both the JavaScript-only format and the HTML-attributes format, now with a bit of styling on the HTML attributes chart to make it smaller:

You can use styling to adjust the HTML attributes example.

Easy, right?  So what’s the catch?

OOCharts only allows 10,000 requests a month – which is even easier to exceed than the Google Analytics API’s own limit of 50,000 requests per day.  Each time your page loads, you use another request.  Perhaps more importantly, your Analytics API key and profile ID are pretty much ‘out there’ for the world to see if someone views your page source, because those values are stored in your client-side JavaScript2.  If you’re making a private intranet page for your library staff, that’s probably not a big deal; but if you want to publish your dashboard fully to the public, you’ll want to make sure those values are secure.  You can do this with the Google Analytics superProxy.

Google Analytics superProxy

In 2013, Google Analytics released a method of accessing Google Analytics API data that doesn’t require end-users to authenticate in order to view data, known as the Google Analytics superProxy.  Much like OOCharts, the superProxy facilitates the creation of a query engine that retrieves Google Analytics statistics through the Google Analytics Core Reporting API, caches the statistics in a separate web application service, and enables the display of Google Analytics data to end users without requiring individual authentication. Caching the data has the additional benefit of ensuring that your application will not exceed the Google Core Reporting API request limit of 50,000 requests each day. The superProxy can be set up to refresh a limited number of times per day, and most dashboard applications only need a daily refresh of data to stay current.

The required elements of this method are available on the superProxy Github page (Google Analytics, “Google Analytics superProxy”). There are four major parts to the setup of the superProxy:

  1. Setting up Google App Engine hosting,
  2. Preparing the development environment,
  3. Configuring and deploying the superProxy to Google’s App Engine Appspot host, and
  4. Writing and scheduling queries that will be used to populate your dashboard.

Set up Google App Engine hosting

First, use your Google Analytics account credentials to access the Google App Engine at https://appengine.google.com/start. The superProxy application you will be creating will be freely hosted by the Google App Engine. Create your application and designate an Application Identifier that will serve as the endpoint domain for queries to your Google Analytics data (e.g., mylibrarycharts.appspot.com).

Create your App Engine Application

Create your App Engine Application

You can leave the default authentication option, Open to All Google Users, selected.  This setting only reflects access to your App Engine administrative screen and does not affect the ability for end-users to view the dashboard charts you create.  Only those Google users who have been authorized to access Google Analytics data will be able to access any Google Analytics information through the Google App Engine.

Ensure that API access to Google Analytics is turned on under the Services pane of the Google Developer’s Console. Under APIs and Auth for your project, visit the APIs menu and ensure that the Analytics API is turned on.

Turn on the Google Analytics API.  Make sure the name under the Projects label in the upper left corner is the same as your newly created Google App Engine project (e.g., librarycharts).

 

Then visit the Credentials menu to set up an OAuth 2.0 Client ID. Set the Authorized JavaScript Origins value to your appspot domain (e.g., http://mylibrarycharts.appspot.com). Use the same value for the Authorized Redirect URI, but add /admin/auth to the end (e.g., http://mylibrarycharts.appspot.com/admin/auth). Note the OAuth Client ID, OAuth Client Secret, and OAuth Redirect URI that are stored here, as you will need to reference them later before you deploy your superProxy application to the Google App Engine.

Finally, visit the Consent Screen menu and choose an email address (such as your Google account email address), fill in the product name field with your Application Identifier (e.g., mylibrarycharts) and save your settings. If you do not include these settings, you may experience errors when accessing your superProxy application admin menu.

Prepare the Development Environment

In order to configure superProxy and deploy it to Google App Engine you will need Python 2.7 installed and the Google App Engine Launcher (AKA the Google App Engine SDK).  Python just needs to be installed for the App Engine Launcher to run; don’t worry, no Python coding is required.

Configure and Deploy the superProxy

The superProxy application is available from the superProxy Github page. Download the .zip file and extract it onto your computer in a location you can easily reference (e.g., C:/Users/yourname/Desktop/superproxy or /Applications/superproxy). Use a text editor such as Notepad or Notepad++ to edit src/app.yaml to include your Application ID (e.g., mylibrarycharts). Then use the same editor to edit src/config.py to include the OAuth Client ID, OAuth Client Secret, and the OAuth Redirect URI that were generated when you created the Client ID in the Google Developer’s Console under the Credentials menu. Detailed instructions for editing these files are available on the superProxy Github page.

After you have edited and saved src/app.yaml and src/config.py, open the Google App Engine Launcher previously downloaded. Go to File > Add Existing Application. In the dialogue box that appears, browse to the location of your superProxy app’s /src directory.

To upload your superProxy application, use the Google App Engine Launcher and browse to the /src directory where you saved and configured your superProxy application.

Click Add, then click the Deploy button in the upper right corner of the App Engine Launcher. You may be asked to log into your Google account, and a log console may appear informing you of the deployment process. When deployment has finished, you should be able to access your superProxy application’s Admin screen at http://[yourapplicationID].appspot.com/admin, replacing [yourapplicationID] with your Application Identifier.

Creating superProxy Queries

SuperProxy queries request data from your Google Analytics account and return that data to the superProxy application. When the data is returned, it is made available to an end-point that can be used to populate charts, graphs, or other data visualizations. Most data available to you through the Google Analytics native interface is available through superProxy queries.

An easy way to get started with building a query is to visit the Google Analytics Query Explorer. You will need to login with your Google Analytics account to use the Query Explorer. This tool allows you to build an example query for the Core Reporting API, which is the same API service that your superProxy application will be using.

Google Analytics API Query Explorer

Running example queries through the Google Analytics Query Explorer can help you to identify the metrics and dimensions you would like to use in superProxy queries. Be sure to note the metrics and dimensions you use, and also be sure to note the ids value that is populated for you when using the API Explorer.

 

When experimenting with the Google Analytics Query Explorer, make note of all the elements you use in your query. For example, to create a query that retrieves the number of users, broken down by browser, that visited your site between July 4th and July 18th, 2014, you will need to select your Google Account, Property, and View from the drop-down menus, and then build a query with the following parameters:

  • ids = this is a number (usually 8 digits) that will be automatically populated for you when you choose your Google Analytics Account, Property and View. The ids value is your property ID, and you will need this value later when building your superProxy query.
  • dimensions = ga:browser
  • metrics = ga:users
  • start-date = 2014-07-04
  • end-date = 2014-07-18

You can set the max-results value to limit the number of results returned. For queries that could potentially have thousands of results (such as individual search terms entered by users), limiting to the top 10 or 50 results will retrieve data more quickly. Clicking on any of the fields will generate a menu from which you can select available options. Click Get Data to retrieve Google Analytics data and verify that your query works.

Successful Google Analytics Query Explorer query result showing visits by browser.

After building a successful query, you can replicate the query in your superProxy application. Return to your superProxy application’s admin page (e.g., http://[yourapplicationid].appspot.com/admin) and select Create Query. Name your query something to make it easy to identify later (e.g., Users by Browser). The Refresh Interval refers to how often you want the superProxy to retrieve fresh data from Google Analytics. For most queries, a daily refresh of the data will be sufficient, so if you are unsure, set the refresh interval to 86400. This will refresh your data every 86400 seconds, or once per day.

Create a superProxy Query

We can reuse all of the elements of queries built using the Google Analytics API Explorer to build the superProxy query encoded URI.  Here is an example of an Encoded URI that queries the number of users (organized by browser) that have visited a web property in the last 30 days (you’ll need to enter your own profile ID in the ids value for this to work):

https://www.googleapis.com/analytics/v3/data/ga?ids=ga:99999991
&metrics=ga:users&dimensions=ga:browser&max-results=10
&start-date={30daysago}&end-date={today}

Before saving, be sure to run Test Query to see a preview of the kind of data that is returned by your query. A successful query will return a json response, e.g.:

{"kind": "analytics#gaData", "rows": 
[["Amazon Silk", "8"], 
["Android Browser", "36"], 
["Chrome", "1456"], 
["Firefox", "1018"], 
["IE with Chrome Frame", "1"], 
["Internet Explorer", "899"], 
["Maxthon", "2"], 
["Opera", "7"], 
["Opera Mini", "2"], 
["Safari", "940"]], 
"containsSampledData": false, 
"totalsForAllResults": {"ga:users": "4398"}, 
"id": "https://www.googleapis.com/analytics/v3/data/ga?ids=ga:84099180&dimensions=ga:browser&metrics=ga:users&start-date=2014-06-29&end-date=2014-07-29&max-results=10", 
"itemsPerPage": 10, "nextLink": "https://www.googleapis.com/analytics/v3/data/ga?ids=ga:84099180&dimensions=ga:browser&metrics=ga:users&start-date=2014-07-04&end-date=2014-07-18&start-index=11&max-results=10", 
"totalResults": 13, "query": {"max-results": 10, "dimensions": "ga:browser", "start-date": "2014-06-29", "start-index": 1, "ids": "ga:84099180", "metrics": ["ga:users"], "end-date": "2014-07-18"}, 
"profileInfo": {"webPropertyId": "UA-9999999-9", "internalWebPropertyId": "99999999", "tableId": "ga:84099180", "profileId": "9999999", "profileName": "Library Chart", "accountId": "99999999"}, 
"columnHeaders": [{"dataType": "STRING", "columnType": "DIMENSION", "name": "ga:browser"}, {"dataType": "INTEGER", "columnType": "METRIC", "name": "ga:users"}], 
"selfLink": "https://www.googleapis.com/analytics/v3/data/ga?ids=ga:84099180&dimensions=ga:browser&metrics=ga:users&start-date=2014-06-29&end-date=2014-07-29&max-results=10"}

Once you’ve tested a successful query, save it; saving makes the JSON response available to an application that can visualize the data. After saving, you will be directed to the management screen for your API, where you will need to click Activate Endpoint to begin publishing the results of the query in a retrievable form. Then click Start Scheduling so that the query data is refreshed on the schedule you determined when you built the query (e.g., once a day). Finally, click Refresh Data to return data for the first time so that you can start interacting with the data returned by your query. Return to your superProxy application’s Admin page, where you will be able to manage your query and locate the public end-point needed to create a chart visualization.
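
If you just want raw numbers on a page rather than a chart, the public end-point can also be consumed directly. A rough sketch using jQuery (assumed to be loaded on your page), where ENDPOINT stands in for the JSON Response link copied from the query’s management screen and the response shape matches the example above; depending on where your page lives, you may also need to account for cross-origin restrictions:

var ENDPOINT = 'PASTE THE JSON RESPONSE LINK FROM YOUR QUERY MANAGEMENT SCREEN HERE';

$.getJSON(ENDPOINT, function (data) {
  // Each row is a [dimension, metric] pair, e.g. ["Chrome", "1456"].
  data.rows.forEach(function (row) {
    console.log(row[0] + ': ' + row[1] + ' users');
  });
  console.log('Total users: ' + data.totalsForAllResults['ga:users']);
});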

Using the Google Visualization API to visualize Google Analytics data

Included with the superProxy .zip file downloaded to your computer from the Github repository is a sample .html page located under /samples/superproxy-demo.html. This file uses the Google Visualization API to generate two pie charts from data returned from superProxy queries. The Google Visualization API is a service that can ingest raw data (such as json arrays that are returned by the superProxy) and generate visual charts and graphs. Save superproxy-demo.html onto a web server or onto your computer’s localhost.  We’ll set up the first pie chart to use the data from the Users by Browser query saved in your superProxy app.

Open superproxy-demo.html and locate this section:

var browserWrapper = new google.visualization.ChartWrapper({
  // Example Browser Share Query
  "containerId": "browser",
  // Example URL: http://your-application-id.appspot.com/query?id=QUERY_ID&format=data-table-response
  "dataSourceUrl": "REPLACE WITH Google Analytics superProxy PUBLIC URL, DATA TABLE RESPONSE FORMAT",
  "refreshInterval": REPLACE_WITH_A_TIME_INTERVAL,
  "chartType": "PieChart",
  "options": {
    "showRowNumber" : true,
    "width": 630,
    "height": 440,
    "is3D": true,
    "title": "REPLACE WITH TITLE"
  }
});

Three values need to be modified to create a pie chart visualization:

  • dataSourceUrl: This value is the public end-point of the superProxy query you have created. To get this value, navigate to your superProxy admin page and click Manage Query on the Users by Browser query you have created. On this page, right-click the DataTable (JSON Response) link and copy the URL (shown below). Paste the copied URL into superproxy-demo.html, replacing the text REPLACE WITH Google Analytics superProxy PUBLIC URL, DATA TABLE RESPONSE FORMAT. Leave quotes around the pasted URL.
Right-click the DataTable (JSON Response) link and copy the URL to your clipboard. The copied link will serve as the dataSourceUrl value in superproxy-demo.html.

  • refreshInterval – you can leave this value the same as the refresh Interval of your superProxy query (in seconds) – e.g., 86400.
  • title – this is the title that will appear above your pie chart, and should describe the data your users are looking at – e.g., Users by Browser.
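
Once those three values are filled in, the demo page hands the wrapper to the Google Visualization API loader and draws it. superproxy-demo.html already contains its own version of this boilerplate, so the following is only a rough sketch of what that step looks like:

// Load the Visualization API, then draw the wrapper defined above into
// its "browser" container. The demo page ships with the equivalent code.
google.load('visualization', '1', { packages: ['corechart'] });
google.setOnLoadCallback(function () {
  browserWrapper.draw();
});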

Save the modified file to your server or local development environment, and load the saved page in a browser.  You should see a rather lovely pie chart:

Your pie chart’s data will refresh automatically based upon the Refresh Interval you specify in your superProxy query and your page’s JavaScript parameters.

That probably seemed like a lot of work just to make a pie chart.  But now that your app is set up, making new charts from your Google Analytics data just involves visiting your App Engine site, scheduling a new query, and referencing that with the Google Visualization API.  To me, the Google superProxy method has three distinct advantages over the simpler OOCharts method:

  • Security – Users won’t be able to view your API Keys by viewing the source of your dashboard’s web page
  • Stability – OOCharts might not be around forever.  For that matter, Google’s free App Engine service might not be around forever, but betting on Google is [mostly] a safe bet
  • Flexibility – You can create a huge range of queries, and test them out easily using the API explorer, and the Google Visualization API has extensive documentation and a fairly active user group from whom to gather advice and examples.

 

Notes

  1. There is a 50,000 request per day limit on the Analytics API.  That sounds like a lot, but it’s surprisingly easy to exceed. Consider creating a dashboard with 10 charts, each making a call to the Analytics API.  Without a service that caches the data, the data is refreshed every time a user loads a page.  After just 5,000 visits to the page (which makes 10 API calls – one for each chart – each time the page is loaded), the API limit is exceeded:  5,000 page loads x 10 calls per page = 50,000 API requests.
  2. You can use pre-built OOCharts Queries – https://app.oocharts.com/mc/query/list – to hide your profile ID (but not your API Key). There are many ways to minify and obfuscate client-side JavaScript to make it harder to read, but it’s still pretty much accessible to someone who wants to get it

Analyzing Usage Logs with OpenRefine

Background

Like a lot of librarians, I have access to a lot of data, and sometimes no idea how to analyze it. When I learned about linked data and the ability to search against data sources with a piece of software called OpenRefine, I wondered if it would be possible to match our users’ discovery layer queries against the Library of Congress Subject Headings. From there I could use the linking in LCSH to find the Library of Congress Classification, and then get an overall picture of the subjects our users were searching for. As with many research projects, it didn’t really turn out like I anticipated, but it did open further areas of research.

At California State University, Fullerton, we use an open source application called Xerxes, developed by David Walker at the CSU Chancellor’s Office, in combination with the Summon API. Xerxes acts as an interface for any number of search tools, including Solr, federated search engines, and most of the major discovery service vendors. We call it the Basic Search, and it’s incredibly popular with students, with over 100,000 searches a month and growing. It’s also well-liked – in a survey, about 90% of users said they found what they were looking for. We have monthly files of our users’ queries, so I had all of the data I needed to go exploring with OpenRefine.

OpenRefine

OpenRefine is an open source tool that deals with data in a very different way than typical spreadsheets. It has been mentioned in TechConnect before, and Margaret Heller’s post, “A Librarian’s Guide to OpenRefine” provides an excellent summary and introduction. More resources are also available on Github.

One of the most powerful things OpenRefine does is to allow queries against open data sets through a function called reconciliation. In the open data world, reconciliation refers to matching the same concept among different data sets, although in this case we are matching unknown entities against “a well-known set of reference identifiers” (Re-using Cool URIs: Entity Reconciliation Against LOD Hubs).

Reconciling Against LCSH

In this case, we’re reconciling our discovery layer search queries with LCSH. This basically means it’s trying to match the entire user query (e.g. “artist” or “cost of assisted suicide”) against what’s included in the LCSH linked open data. According to the LCSH website this includes “all Library of Congress Subject Headings, free-floating subdivisions (topical and form), Genre/Form headings, Children’s (AC) headings, and validation strings* for which authority records have been created. The content includes a few name headings (personal and corporate), such as William Shakespeare, Jesus Christ, and Harvard University, and geographic headings that are added to LCSH as they are needed to establish subdivisions, provide a pattern for subdivision practice, or provide reference structure for other terms.”

I used the directions at Free Your Metadata to point me in the right direction. One note: the steps below apply to OpenRefine 2.5 and version 0.8 of the RDF extension. OpenRefine 2.6 requires version 0.9 of the RDF extension. Or you could use LODRefine, which bundles some major extensions and I hear is great, but personally haven’t tried. The basic process shouldn’t change too much.

(1) Import your data

OpenRefine has quite a few file type options, so your format is likely already supported.

 Screenshot of importing data

(2) Clean your data

In my case, this involves deduplicating by timestamp and removing leading and trailing whitespace. You can also remove odd punctuation, numbers, and even extremely short queries (<2 characters).

(3) Add the RDF extension.

If you’ve done it correctly, you should see an RDF dropdown next to Freebase.

Screenshot of correctly installed RDF extension

(4) Decide which data you’d like to search on.

In this example, I’ve decided to use just queries that are four words or fewer, and removed duplicate search queries. (Xerxes handles facet clicks as if they were separate searches, so there are many duplicates. I usually don’t remove them, though, unless they happen at nearly the same time.) I’ve also experimented with limiting to 10 or 15 characters, but there were not many more matches with 15 characters than with 10, even though the data set was much larger. It depends on how much computing time you want to spend…it’s really a personal choice. In this case, I chose 4 words because of my experience with 15 characters – longer does not necessarily translate into more matches. A cursory glance at LCSH left me with the impression that the vast majority of headings (not including subdivisions, since they’d be searched individually) were 4 words or less. This, of course, means that your data with more than 4 words is unusable – more on that later.

Screenshot of adding a column based on word count using ngrams
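The word count column in that screenshot comes from an “Add column based on this column” step. One GREL expression for it (my approximation of the ngram approach shown, which splits the query into single-word ngrams and counts them) is:

ngram(value, 1).length()

A numeric facet on the new column then limits the rows to queries of four words or fewer.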

(5) Go!

Shows OpenRefine reconciling

(6) Now you have your queries that were reconciled against LCSH, so you can limit to just those.

Screenshot of limiting to reconciled queries
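In practice this limiting is just a facet on the reconciled column. OpenRefine exposes the reconciliation judgment in GREL, so a custom text facet along these lines (assuming the queries were reconciled in place in their own column) does the trick, with “true” marking the rows that matched:

cell.recon.matched

There is also a built-in judgment facet under the Reconcile menu that accomplishes the same thing.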

Finding LC Classification

First, you’ll need to extract the cell.recon.match.id – the ID of the match, which in the case of LCSH is the URI of the concept.

Screenshot of using cell.recon.match.id to get URI of concept
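Concretely, that’s an “Add column based on this column” on the reconciled query column, with the whole expression being just:

cell.recon.match.id

which copies the matched concept’s URI into its own column for the next step.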

At this point you can choose whether to grab the HTML or the JSON, and create a new column based on this one by fetching URLs. I’ve never been able to get the parseJson() function to work correctly with LC’s JSON outputs, so for both HTML and JSON I’ve just regexed the raw output to isolate the classification. For more on regex see Bohyun Kim’s previous TechConnect post, “Fear No Longer Regular Expressions.”
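The fetch itself is an “Add column by fetching URLs” on the URI column. Fetching the URI as-is returns the HTML page; for the JSON, id.loc.gov will (as far as I’ve seen) hand back a JSON serialization if you append an extension to the URI, so the expression can be as simple as:

value + ".json"

Either way, you end up with a column of raw output to regex against.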

On the raw HTML, the easiest way to do it is to transform the cells or create a new column with:

replace(partition(value,/<li property="madsrdf:classification">(<[^>]+>)*([A-Z]{1,2})/)[1],/<li property="madsrdf:classification">(<[^>]+>)*([A-Z]{1,2})/,"$2")

Screenshot of using regex to get classification

You’ll note this will only pull out the first classification given, even if some have multiple classifications. That was a conscious choice for me, but obviously your needs may vary.

(Also, although I’m only concentrating on classification for this project, there’s a huge amount of data that you could work with – take a look at an example URI, such as the one for Acting, to see all of the different fields available.)

Once you have the classifications, you can export to Excel and create a pivot table to count the instances of each, and you get a pretty table.

Table of LC Classifications

Caveats & Further Explorations

As you can guess by the y-axis in the table above, the number of matches is a very small percentage of actual searches. First I limited to keyword searches (as opposed to title/subject), then of those only ones that were 4 or fewer words long (about 65% of keyword searches). Of those, only about 1000 of the 26000 queries matched, and resulted in about 360 actual LC Classifications. Most months I average around 500, but in this example I took out duplicates even if they were far apart in time, just to experiment.

One thing I haven’t done but am considering is allowing matches that aren’t 100%. From my example above, there are another 600 or so queries that matched at 50-99%. This could significantly increase the number of matches and thus give us more classifications to work with.
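OpenRefine keeps the score of the best reconciliation candidate, so surfacing those partial matches is just a custom numeric facet on the reconciled column with an expression like the one below (a sketch; the 50–99% cutoff is then applied in the facet, and how the score maps to a percentage depends on the reconciliation service):

cell.recon.best.score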

Some of this is related to the types of searches that students are doing (see Michael J DeMars’ and my presentation “Making Data Less Daunting” at Electronic Resources & Libraries 2014, which this article grew out of, for some crazy examples) and some to the way that LCSH is structured. I chose LCSH because I could get linked to the LC Classification and thus get a sense of the subjects, but I’m definitely open to ideas. If you know of a better linked data source, I’m all ears.

I must also note that this is a pretty inefficient way of matching against LCSH. If you know of a way I could download the entire set, I’m interested in investigating that approach as well.

Another approach that I will explore is moving away from reconciliation with LCSH (which is really more appropriate for a controlled vocabulary) to named-entity extraction, which takes natural language input and tries to recognize or extract common concepts (names, places, etc.). Here I would use it as a first step before trying to match against LCSH. Free Your Metadata has a new named-entity extraction extension for OpenRefine, so I’ll definitely explore that option.

Planned Research

In the end, although this is interesting, does it actually mean anything? My next step with this dataset is to take a subset of the search queries and assign classification numbers. Over the course of several months, I hope to see if what I’ve pulled in automatically resembles the hand-classified data, and then draw conclusions.

So far, most of the peaks are expected – psychology and nursing are quite strong departments. There are some surprises, though: education has been consistently underrepresented relative to both our enrollment numbers and the raw word counts (see our presentation for one month’s top word counts). Education students receive robust information literacy instruction. Does this mean that education students do complex searches that don’t match LCSH? Do they mostly use subject databases? Once again, this is an area for future research, should these automatic results match the classifications I do by hand.

What do you think? I’d love to hear your feedback or suggestions.

About Our Guest Author

Jaclyn Bedoya has lived and worked on three continents, although currently she’s an ER Librarian at CSU Fullerton. It turns out that growing up in Southern California spoils you, and she’s happiest being back where there are 300 days of sunshine a year. Also Disneyland. Reach her @spamgirl on Twitter or jaclynbedoya@gmail.com