Analyzing EZProxy Logs

Analyzing EZProxy logs may not be the most glamorous task in the world, but it can be illuminating. Depending on your EZProxy configuration, log analysis can allow you to see the top databases your users are visiting, the busiest days of the week, the number of connections to your resources occurring on or off-campus, what kinds of users (e.g., staff or faculty) are accessing proxied resources, and more.

What’s an EZProxy Log?

EZProxy logs are not significantly different from regular server logs.  Server logs are generally just plain text files that record activity that happens on the server.  Logs that are frequently analyzed to provide insight into how the server is doing include error logs (which can be used to help diagnose problems the server is having) and access logs (which can be used to identify usage activity).

EZProxy logs are a kind of modified access log, which record activities (page loads, http requests, etc.) your users undertake while connected in an EZProxy session. This article will briefly outline five potential methods for analyzing EZProxy logs:  AWStats, Piwik, EZPaarse, a custom Python script for parsing starting-point URLS (SPU) logs, and a paid option called Splunk.

The ability of  any log analyzer will of course depend upon how your EZProxy log directives are configured.  You will need to know your LogFormat and/or LogSPU directives in order to configure most log file analyzing solutions.  In EZProxy, you can see how your logs are formatted in config.txt/ezproxy.cfg by looking for the LogFormat directive, 1  e.g.,

LogFormat %h %l %u %t “%r” %s %b “%{user-agent}i”

and / or, to log Starting Point URLs (SPUs):

LogSPU -strftime log/spu/spu%Y%m.log %h %l %u %t “%r” %s %b “%{ezproxy-groups}i”

Logging Starting Point URLs can be useful because those tend to be users either clicking into a database or the full-text of an article, but no activity after that initial contact is logged.  This type of logging does not log extraneous resource loading, such as loading scripts and images – which often clutter up your traditional LogFormat logs and lead to misleadingly high hits.  LogSPU directives can be defined in addition to traditional LogFormat to provide two different possible views of your users’ data.  SPULogs can be easier to analyze and give more interesting data, because they can give a clearer picture of which links and databases are most popular  among your EZProxy users.  If you haven’t already set it up, SPULogs can be a very useful way to observe general usage trends by database.

You can find some very brief anonymized EZProxy log sample files on Gist:

On a typical EZProxy installation, historical monthly logs can be found inside the ezproxy/log directory.  By default they will rotate out every 12 months, so you may only find the past year of data stored on your server.

AWStats

Get It:  http://www.awstats.org/#DOWNLOAD

Best Used With:  Full Logs or SPU Logs

Code / Framework:  Perl

    An example AWStats monthly history report. Can you tell when our summer break begins?

An example AWStats monthly history report. Can you tell when our summer break begins?

AWStats Pros:

  • Easy installation, including on localhost
  • You can define your unique LogFormat easily in AWStats’ .conf file.
  • Friendly, albeit a little bit dated looking, charts show overall usage trends.
  • Extensive (but sometimes tricky) customization options can be used to more accurately represent sometimes unusual EZProxy log data.
Hourly traffic distribution in AWStats.  While our traffic peaks during normal working hours, we have steady usage going on until about 1 AM, after which point it crashes pretty hard.  We could use this data to determine  how much virtual reference staffing we should have available during these hours.

Hourly traffic distribution in AWStats. While our traffic peaks during normal working hours, we have steady usage going on until about Midnight, after which point it crashes pretty hard. We could use this data to determine how much virtual reference staffing we should have available during these hours.

 

AWStats Cons:

  • If you make a change to .conf files after you’ve ingested logs, the changes do not take effect on already ingested data.  You’ll have to re-ingest your logs.
  • Charts and graphs are not particularly (at least easily) customizable, and are not very modern-looking.
  • Charts are static and not interactive; you cannot easily cross-section the data to make custom charts.

Piwik

Get It:  http://piwik.org/download/

Best Used With:  SPULogs, or embedded on web pages web traffic analytic tool

Code / Framework:  Python

piwik visitor dashboard

The Piwik visitor dashboard showing visits over time. Each point on the graph is interactive. The report shown actually is only displaying stats for a single day. The graphs are friendly and modern-looking, but can be slow to load.

Piwik Pros:

  • The charts and graphs generated by Piwik are much more attractive and interactive than those produced by AWStats, with report customizations very similar to what’s available in Google Analytics.
  • If you are comfortable with Python, you can do additional customizations to get more details out of your logs.
Piwik file ingestion in PowerShell

To ingest a single monthly log took several hours. On the plus side, with this running on one of Lauren’s monitors, anytime someone walked into her office they thought she was doing something *really* technical.

Piwik Cons:

  • By default, parsing of large log files seems to be pretty slow, but performance may depend on your environment, the size of your log files and how often you rotate your logs.
  • In order to fully take advantage of the library-specific information your logs might contain and your LogFormat setup, you might have to do some pretty significant modification of Piwik’s import_logs.py script.
When looking at popular pages in Piwik you’re somewhat at the mercy that the subdirectories of databases have meaningful labels; luckily EBSCO does, as shown here.  We have a lot of users looking at EBSCO Ebooks, apparently.

When looking at popular pages in Piwik you’re somewhat at the mercy that the subdirectories of database URLs have meaningful labels; luckily EBSCO does, as shown here. We have a lot of users looking at EBSCO Ebooks, apparently.

EZPaarse

Get Ithttp://analogist.couperin.org/ezpaarse/download

Best Used With:  Full Logs or SPULogs

Code / Framework:  Node.js

ezPaarse’s friendly drag and drop interface.  You can also copy/paste lines for your logs to try out the functionality by creating an account at http://ezpaarse.couperin.org.

ezPaarse’s friendly drag and drop interface. You can also copy/paste lines for your logs to try out the functionality by creating an account at http://ezpaarse.couperin.org.

EZPaarse Pros:

  • Has a lot of potential to be used to analyze existing log data to better understand e-resource usage.
  • Drag-and-drop interface, as well as copy/paste log analysis
  • No command-line needed
  • Its goal is to be able to associate meaningful metadata (domains, ISSNs) to provide better electronic resource usage statistics.
ezPaarse Excel output generated from a sample log file, showing type of resource (article, book, etc.) ISSN, publisher, domain, filesize, and more.

ezPaarse Excel output generated from a sample log file, showing type of resource (article, book, etc.) ISSN, publisher, domain, filesize, and more.

EZPaarse Cons:

  • This isn’t really a con per se, but it is under development.  In Lauren’s testing, we couldn’t get of the logs to ingest correctly (perhaps due to a somewhat non-standard EZProxy logformat) but the samples files provided worked well.  As development continues  it can be expected to become more flexible with different kinds of log formats supported.
  • It’s tricky to customize the log formatting correctly, and in Lauren’s testing, if bibliographic information cannot be found for your electronic resources, the data is returned a little strangely.
  • Output is in Excel Sheets rather than a dashboard-style format.

Write Your Own with Python

Get Started With:  https://github.com/robincamille/ezproxy-analysis/blob/master/ezp-analysis.py

Best used with: SPU logs

Code / Framework:  Python

code

Screenshot of a Python script, available at Robin Davis’ Github

 

Custom Script Pros:

  • You will have total control over what data you care about. DIY analyzers are usually written up because you’re looking to answer a specific question, such as “How many connections come from within the Library?”
  • You will become very familiar with the data! As librarians in an age of user tracking, we need to have a very good grasp of the kinds of data that our various services collect from our patrons, like IP addresses.
  • If your script is fairly simple, it should run quickly. Robin’s script took 5 minutes to analyze almost 6 years of SPU logs.
  • Your output will probably be a CSV, a flexible and useful data format, but could be any format your heart desires. You could even integrate Python libraries like Plotly to generate beautiful charts in addition to tabular data.
  • If you use Python for other things in your day-to-day, analyzing structured data is a fun challenge. And you can impress your colleagues with your Pythonic abilities!

 

Action shot: running the script from the command line. (Source)

Action shot: running the script from the command line.

Custom Script Cons:

  • If you have not used Python to input/output files or analyze tables before, this could be challenging.
  • The easiest way to run the script is within an IDE or from the command line; if this is the case, it will likely only be used by you.
  • You will need to spend time ascertaining what’s what in the logs.
  • If you choose to output data in a CSV file, you’ll need more elbow grease to turn the data into a beautiful collection of charts and graphs.
output

Output of the sample script is a labeled CSV that divides connections by locations and user type (student or faculty). (Source)

Splunk (Paid Option)

Best Used with:  Full Logs and SPU Logs

Get It (as a free trial):  http://www.splunk.com/download

Code / Framework:  Various, including Python

A Splunk distribution showing traffic by days of the week.  You can choose to visualize this data in several formats, such as a bar chart or scatter plot.  Notice that this chart was generated by a syntactical query in the upper left corner:  host=lmagnuson| top limit=20 date_wday

A Splunk distribution showing traffic by days of the week. You can choose to visualize this data in several formats, such as a bar chart or scatter plot. Notice that this chart was generated by a syntactical query in the upper left corner: host=lmagnuson| top limit=20 date_wday

Splunk Pros:  

  • Easy to use interface, no scripting/command line required (although command line interfacing (CLI) is available)
  • Incredibly fast processing.  As soon as you import a file, splunk begins ingesting the file and indexing it for searching
  • It’s really strong in interactive searching.  Rather than relying on canned reports, you can dynamically and quickly search by keywords or structured queries to generate data and visualizations on the fly.
Here's a search for log entries containing a URL (digital.films.com), which Splunk uses to create a chart showing the hours of the day that this URL is being accessed.  This particular database is most popular around 4 PM.

Here’s a search for log entries containing a URL (digital.films.com), which Splunk uses to display a chart showing the hours of the day that this URL is being accessed. This particular database is most popular around 4 PM.

Splunk Cons:

    • It has a little bit of a learning curve, but it’s worth it for the kind of features and intelligence you can get from Splunk.
    • It’s the only paid option on this list.  You can try it out for 60 days with up to 500MB/day a day, and certain non-profits can apply to continue using Splunk under the 500MB/day limit.  Splunk can be used with any server access or error log, so a library might consider partnering with other departments on campus to purchase a license.2

What should you choose?

It depends on your needs, but AWStats is always a tried and true easy to install and maintain solution.  If you have the knowledge, a custom Python script is definitely better, but obviously takes time to test and develop.  If you have money and could partner with others on your campus (or just need a one-time report generated through a free trial), Splunk is very powerful, generates some slick-looking charts, and is definitely work looking into.  If there are other options not covered here, please let us know in the comments!

About our guest author: Robin Camille Davis is the Emerging Technologies & Distance Services Librarian at John Jay College of Criminal Justice (CUNY) in New York City. She received her MLIS from the University of Illinois Urbana-Champaign in 2012 with a focus in data curation. She is currently pursuing an MA in Computational Linguistics from the CUNY Graduate Center.

Notes
  1. Details about LogFormat and what each %/lettter value means can be found at http://www.oclc.org/support/services/ezproxy/documentation/cfg/logformat.en.html; LogSPU details can be found http://oclc.org/support/services/ezproxy/documentation/cfg/logspu.en.html
  2. Another paid option that offers a free trial, and comes with extensions made for parsing EZProxy logs, is Sawmill: https://www.sawmill.net/downloads.html

Making Open Access Everyone’s Business

Librarians should have a role in promoting open access content. The best methods and whether they are successful is a matter of heated debate. Take for an example a recent post by Micah Vandergrift on the ACRL Scholarly Communications mailing list, calling on librarians to stage a publishing walkout and only publish in open access library and information science journals. Many have already done so. Others, like myself, have published in traditional journals (only once in my case) but make a point of making their work available in institutional repositories. I personally would not publish in a journal that did not allow such use of my work, and I know many who feel the same way. 1 The point is, of course, to ensure that librarians are not be hypocritical in their own publishing and their use of repositories to provide open access–a long-standing problem pointed out by Dorothea Salo [2.Salo, Dorothea. “Innkeeper at the Roach Motel,” December 11, 2007. http://digital.library.wisc.edu/1793/22088.], among others2 We know that many of the reasons that faculty may hesitate to participate in open access publishing relate to promotion and tenure requirements, which generally are more flexible for academic librarians (though not in all cases–see Abigail Goben’s open access tenure experiment). I suspect that many of the reasons librarians aren’t participating more in open access has partly to do with more mundane reasons of forgetting to do so, or fearing that work is not good enough to make public.

But it shouldn’t be only staunch advocates of open access, open peer review, or new digital models for work and publishing who are participating. We have to find ways to advocate and educate in a gentle but vigorous manner, and reach out to new faculty and graduate students who need to start participating now if the future will be different. Enter Open Access Week, a now eight-year-old celebration of open access organized by SPARC. Just as Black Friday is the day that retailers hope to be in the black, Open Access Week has become an occasion to organize around and finally share our message with willing ears. Right?

It can be, but it requires a good deal of institutional dedication to make it happen. At my institution, Open Access Week is a big deal. I am co-chair of a new Scholarly Communications committee which is now responsible for planning the week (the committee used to just plan the week, but the scope has been extended). The committee has representation from Systems, Reference, Access Services, and the Information Commons, and so we are able to touch on all aspects of open access. Last year we had events five days out of five; this year we are having events four days out of five. Here are some of the approaches we are taking to creating successful conversations around open access.

    • Focus on the successes and the impact of your faculty, whether or not they are publishing in open access journals.

The annual Celebration of Faculty Scholarship takes place during Open Access Week, and brings together physical material published by all faculty at a cocktail reception. We obtain copies of articles and purchase books written by faculty, and set up laptops to display digital projects. This is a great opportunity to find out exactly what our faculty are working on, and get a sense of them as researchers that we may normally lack. It’s also a great opportunity to introduce the concept of open access and recruit participants to the institutional repository.

    • Highlight the particular achievements of faculty who are participating in open access.

We place stickers on materials at the Celebration that are included in the repository or are published in open access journals. This year we held a panel with faculty and graduate students who participate in open access publishing to discuss their experiences, both positive and negative.

  • Demonstrate the value the library adds to open access initiatives.

Recently bepress (which creates the Digital Commons repositories on which ours runs) introduced a real time map of repositories downloads that was a huge hit this year. It was a compelling visual illustration of the global impact of work in the repository. Faculty were thrilled to see their work being read across the world, and it helped to solve the problem of invisible impact. We also highlighted our impact with a new handout that lists key metrics around our repository, including hosting a new open access journal.

  • Talk about the hard issues in open access and the controversies surrounding it, for instance, CC-BY vs. CC-NC-ND licenses.

It’s important to not sugarcoat or spin challenging issues in open access. It’s important to include multiple perspectives and invite difficult conversations. Show scholars the evidence and let them draw their own conclusions, though make sure to step in and correct misunderstandings.

  • Educate about copyright and fair use, over and over again.

These issues are complicated even for people who work on them every day, and are constantly changing. Workshops, handouts, and consultation on copyright and fair use can help people feel more comfortable in the classroom and participating in open access.

  • Make it easy.

Examine what you are asking people to do to participate in open access. Rearrange workflows, cut red tape, and improve interfaces. Open Access Week is a good time to introduce new ideas, but this should be happening all year long.

We can’t expect revolutions in policy and and practice to happen overnight, or without some  sacrifice. Whether you choose to make your stand to only publish in open access journals or some other path, make your stand and help others who wish to do the same.

Notes
  1. Publishers have caught on to this tendency in librarians. For instance, Taylor and Francis has 12-18 month repository embargoes for all its journals except LIS journals. Whether this is because of the good work we have done in advocacy or a conciliatory gesture remains up for debate.
  2. Xia, Jingfeng, Sara Kay Wilhoite, and Rebekah Lynette Myers. “A ‘librarian-LIS Faculty’ Divide in Open Access Practice.” Journal of Documentation 67, no. 5 (September 6, 2011): 791–805. doi:10.1108/00220411111164673.

Hacking in Python with PyMARC

marc icon with an arrow pointing to a spreadsheet icon

The pymarc Python library is an extremely handy library that can be used for writing scripts to read, write, edit, and parse MARC records.  I was first introduced to pymarc through Heidi Frank’s excellent Code4Lib journal article on using pymarc in conjunction with MARCedit.1.  Here at ACRL TechConnect, Becky Yoose has also written about some cool applications of pymarc.

In this post, I’m going to walk through a couple of applications of using pymarc to make comma-separated (.csv) files for batch loading in DSpace and in the KBART format (e.g., for OCLC KnowledgeBase purposes).  I know what you’re thinking – why write a script when I can use MARCedit for that?  Among MARCedit’s many indispensable features is its Export Tab-Delimited Data feature. 2.  But there might be a couple of use cases where scripts can come in handy:

  • Your MARC records lack data that you need in the tab-delimited or .csv output – for example, a field used for processing in whatever system you’re loading the data into that isn’t found in the MARC record source.
  • Your delimited file output requires things like empty fields or parsed, modified fields.  MARCedit is great for exporting whole MARC fields and subfields into delimited files, but you may need to pull apart a MARC element or parse a MARC element out with some additional data.  Pymarc is great for this.
  • You have to process on a lot of records frequently, and just want something tailor-made for your use case.  While you can certainly use MARCedit to save a frequently used export template, editing that saved template when you need to change the output parameters of your delimited file isn’t easy.  With a Python script, you can easily edit your script if your delimited file requirements change.

 

Some General Notes about Python

Python is a useful scripting language to know for a lot of reasons, and for small scripting projects can have a fairly shallow learning curve.  In order to start using Python, you’ll need to set your development environment.  Macs already come with Python installed, but to double-check which version you have installed, launch Terminal and type Python -v.  In a Windows environment, you’ll need to do a few more steps, detailed here.  Full Python documentation for 2.x can be found here (though personally, I find it a little dense at times!), and CodeAcademy features some pretty extensive tutorials to help you learn the syntax quickly.  Google also has some pretty extensive tutorials on Python.  Personally, I find it easiest to learn when I have a project that actually means something to me, so if you’re familiar with MARC records,  just downloading the Python scripts mentioned in this post below and learning how to run them can be a good start.

Spacing and indentation in Python is very important, so if you’re getting errors running scripts make sure that your conditional statements are indented properly 3. Also the version of Python on your machine makes a really big difference, and the version will determine whether your code runs successfully.  These examples have all been tested with Python 2.6 and 2.7, but not with Python 3 or above.  I find that Python 2.x has more help resources out on the web, which is nice to be able to take advantage of when you’re first starting out.

Use Case 1:  MARC to KBART

The complete script described below, along with some sample MARC records, is on GitHub.

The KBART format is a NISO/United Kingdom Serials Group (UKSG) initiative designed to standardize information for use with Knowledge Base systems.  The KBART format is a series of standardized fields that can be used to identify serial coverage (e.g., start date and end date) URLs, publisher information, and more.4  Notably, it’s the required format for adding and modifying customized collections in OCLC’s Knowledge Base. 5.

In this use case, I’m using OCLC’s modified KBART file – most of the fields are KBART standard but a few fields are specific to OCLC’s Knowledge Base, e.g., oclc_collection_name.6.  Typically, these KBART files would be created either manually, or generated by using some VLOOKUP Excel magic with an existing KBART file 7.  In my use case, I needed to batch migrate a bunch of data stored in MARC records to the OCLC Knowledge Base, and these particular records didn’t always correspond neatly to existing OCLC Knowledge Base Collections.  For example, one collection we wanted to manage in the OCLC Knowledge Base was the library’s print holdings so that these titles were displayed in OCLC’s A-Z Journal lookup tool.

First, I’ll need a few helper libraries for my script:

import csv
from pymarc import MARCReader
from os import listdir
from re import search

Then, I’ll declare the source directory where my MARC records are (in this case, in the same directory the script lives in a folder called marc) and instruct the script to go find all the .mrc files in that directory.  Note that this enables the script to process either a single file of multiple MARC records, or multiple distinct files, all at once.

# change the source directory to whatever directory your .mrc files are in
SRC_DIR = 'marc/'

The script identifies MARC records by looking for all files ending in .mrc using the Python filter function.  Within the filter function, lambda creates a one-off anonymous function to define the filter parameters: 8

# get a list of all .mrc files in source directory 
file_list = filter(lambda x: search('.mrc', x), listdir(SRC_DIR))

Now I’ll define the output parameters.  I need a tab-delimited file, not a comma-delimited file, but either way I’ll use the csv.writer function to create a .txt file and define the tab delimiter (\t).  I also don’t really want the fields quoted unless there’s a special character or something that might cause a problem reading the file, so I’ll set quoting to minimal:

#create tab delimited text file that quotes if special characters are present
csv_out = csv.writer(open('kbart.txt', 'w'), delimiter = '\t', 
quotechar = '"', quoting = csv.QUOTE_MINIMAL)


And I’ll also create the header row, which includes the KBART fields required by the OCLC Knowledge Base:

#create the header row
csv_out.writerow(['publication_title', 'print_identifier', 
'online_identifier', 'date_first_issue_online', 
'num_first_vol_online', 'num_first_issue_online', 
'date_last_issue_online', 'num_last_vol_online', 
'num_last_issue_online', 'title_url', 'first_author', 
'title_id', 'coverage_depth', 'coverage_notes', 
'publisher_name', 'location', 'title_notes', 'oclc_collection_name', 
'oclc_collection_id', 'oclc_entry_id', 'oclc_linkscheme', 
'oclc_number', 'action'])

Next, I’ll start a loop for writing a row into the tab-delimited text file for every MARC record found.

for item in file_list:
 fd = file(SRC_DIR + '/' + item, 'r')
 reader = MARCReader(fd)

By default, we’ll need to set each field’s value to blank (”), so that errors are avoided if the value is not present in the MARC record.  We can set all the values to blank to start with in one statement:

for record in reader:
    publication_title = print_identifier = online_identifier 
    = date_first_issue_online = num_first_vol_online 
    = num_first_issue_online = date_last_issue_online 
    = num_last_vol_online = num_last_issue_online 
    = title_url = first_author = title_id = coverage_depth 
    = coverage_notes = publisher_name = location = title_notes 
    = oclc_collection_name = oclc_collection_id = oclc_entry_id 
    = oclc_linkscheme = oclc_number = action = ''

Now I can start pulling in data from MARC fields.  At its simplest, for each field defined in my CSV, I can pull data out of MARC fields using a construction like this:

    #title_url
    if record ['856'] is not None:
      title_url = record['856']['u']

I can use generic python string parsing tools to clean up fields.  For example, if I need to strip the ending / out of a MARC 245$a field, I can use .rsplit to return everything before that final slash (/):

 
   # publication_title
    if record['245'] is not None:
      publication_title = record['245']['a'].rsplit('/', 1)[0]
      if record['245']['b'] is not None:
          publication_title = publication_title + " " + record['245']['b']

Also note that once you’ve declared a variable value (like publication_title) you can re-use that variable name to add more onto the string, as I’ve done with the 245$b subtitle above.

As is usually the case for MARC records, the trickiest business has to do with parsing out serial ranges.  The KBART format is really designed to capture, as accurately as possible, thresholds for beginning and ending dates.  MARC makes this…complicated.  Many libraries might use the 866 summary field to establish summary ranges, but there are varying local practices that determine what these might look like.  In my case, the minimal information I was looking for included beginning and ending years, and the records I was processing with the script stored summary information fairly cleanly in the 863 and 866 fields:

    # date_first_issue_online
    if record ['866'] is not None:
      date_first_issue_online = record['863']['a'].rsplit('-', 1)[0]
 
    # date_last_issue_online
    if record ['866'] is not None:
      date_last_issue_online = record['866']['a'].rsplit('-', 1)[-1]

Where further adjustments were necessary, and I couldn’t easily account for the edge cases programmatically, I imported the KBART file into OpenRefine for further cleanup.  A sample KBART file created by this script is available here.

Use Case 2:  MARC to DSpace Batch Upload

Again, the complete script for transforming MARC records for DSpace ingest is on GitHub.

This use case is very similar to the KBART scenario.  We use DSpace as our institutional repository, and we had about 2000 historical theses and dissertation MARC  records to transform and ingest into DSpace.  DSpace facilitates batch-uploading metadata as CSV files according to the RFC4180 format, which basically means all the field elements use double quotes. 9  While the fields being pulled from are different, the structure of the script is almost exactly the same.

When we first define the CSV output, we need to ensure that quoting is used for all elements – csv.QUOTE_ALL:

csv_out = csv.writer(open('output/theses.txt', 'w'), delimiter = '\t', 
quotechar = '"', quoting = csv.QUOTE_ALL)

The other nice thing about using Python for this transformation is the ability to program in static text that would be the same for all the lines in the CSV file.  For example, I parsed together a more friendly display of the department the thesis was submitted to like so:

# dc.contributor.department
if record ['690']['x'] is not None:
dccontributordepartment = ('UniversityName.  Department of ') + 
record['690']['x'].rsplit('.', 1)[0]

You can view a sample output file created by this script on here on Github.

Other PyMARC Projects

There are lots of other, potentially more interesting things that can be done with pymarc, although certainly one of its most common applications is converting MARC to more flexible formats, such as MARCXML 10.  If you’re interested in learning more,  join the pymarc Google Group, where you can often get help by the original pymarc developers (Ed Summers, Mark Matienzo, Gabriel Farrell and Geoffrey Spear).

Notes
  1. Frank, Heidi. “Augmenting the Cataloger’s Bag of Tricks: Using MarcEdit, Python, and pymarc for Batch-Processing MARC Records Generated from the Archivists’ Toolkit.” Code4Lib Journal, 20 (2013):  http://journal.code4lib.org/articles/8336
  2. https://www.youtube.com/watch?v=qkzJmNOvY00
  3. For the specifications on indentation, take a look at http://legacy.python.org/dev/peps/pep-0008/#id12
  4. http://www.uksg.org/kbart/s5/guidelines/data_fields
  5. http://oclc.org/content/dam/support/knowledge-base/kb_modify.pdf
  6.  A sample blank KBART Excel sheet for OCLC’s Knowledge Base can be found here:  http://www.oclc.org/support/documentation/collection-management/kb_kbart.xlsx.  KBART files must be saved as tab-delimited .txt files prior to upload, but are obviously more easily edited manually in Excel
  7. If you happen to be in need of this for OCLC’s KB, or  are in need of comparing two Excel files, here’s a walkthrough of how to use VLOOKUP to select owned titles from a large OCLC KBART file: http://youtu.be/mUhkMzpPnBE
  8. A nice tutorial on using lambda and filter together can be found here:  http://www.u.arizona.edu/~erdmann/mse350/topics/list_comprehensions.html#filter
  9.  https://wiki.duraspace.org/display/DSDOC18/Batch+Metadata+Editing
  10.  http://docs.evergreen-ils.org/2.3/_migrating_your_bibliographic_records.html

Migrating to LibGuides 2.0

This summer Springshare released LibGuides 2.0, which is a complete revamp of the LibGuides system. Many libraries use LibGuides, either as course/research guides or in some cases as the entire library website, and so this is something that’s been on the mind of many librarians this summer, whichever side of LibGuides they usually see. The process of migrating is not too difficult, but the choices you make in planning the new interface can be challenging. As the librarians responsible for the migration, we will discuss our experience of planning and implementing the new LibGuides platform.

Making the Decision to Migrate

While migrating this summer was optional, Springshare will probably only support LibGuides 1 for another two years, and at Loyola we felt it was better to move sooner rather than later. Over the past few years there were perpetual LibGuides cleanup projects, and this seemed to be a good opportunity to finalize that work. At the same time, we wanted to experiment with new designs for the library’s website that would bring it in closer alignment with the university’s new brand as well as make the site responsive, and LibGuides seemed like the ideal place to experiment with some of those ideas. Several new features, revealed on Springshare’s blog, resonated with subject-area specialists which was another reason to push for a migration sooner than later. We also wanted to have it in place before the first day of classes, which gave us a few months to experiment.

The Reference and Electronic Resources librarian, Will Kent, as well as the Head of Reference, Niamh McGuigan, and the Digital Services Librarian, Margaret Heller, worked in concert to make decisions, as well as inviting all the other reference and instruction librarians (as well as anyone else who was interested) to participate in the process. There were a few ground rules the core team went by, however: we were migrating and the process was iterative, i.e. we weren’t waiting for perfection to launch.

Planning the Migration

During the migration planning process, the small team of three librarians worked together to create a timeline, report to the library staff on progress, solicit feedback on the system, and update the LibGuide policies to reflect the new changes and functions. As far as front-end migration went, we addressed large staff-wide meetings, provided updates, polled subject specialists on the progress, prepared our 400 databases for conversion to the new A-Z list, and demonstrated new features, and opened changes that they should be aware of. We would relay updates from Springshare and handle any troubleshooting questions as they happened.

Given the new features – new categories, new ways of searching, the A-Z database list, and other features, it was important for us to sit down, discuss standards, and update our content policies. The good news was that most of our content was in good shape for the migration. The process was swift and barring inevitable, tiny bugs went smoothly.

Our original timeline was to present the migration steps at our June monthly joint meeting of collections and reference staff, and give a timeline one month until the July meeting to complete the work. For various reasons this ended up stretching until mid-August, but we still launched the day before classes began. We are constantly in the process of updating guide types, adding new resources, and re-classifying boxes to adhere to our new policies.

Working on the Design

LibGuides 2.0 provides two basic templates, a left navigation menu and a top tabbed menu that looks similar to the original LibGuides (additional templates are available with the LibGuides CMS product). We had originally discussed using the left navigation box template and originally began a design based on this, but ultimately people felt more comfortable with the tabbed navigation. Whiteboard sketch of the LibGuides UI

For the initial prototype, Margaret worked off a template that we’d used before for Omeka. This mirrors the Loyola University Chicago template very closely. We kept all of the LibGuides standard template–i.e. 1-3 columns with the number of columns and sections within the column determined by the page creator, but added a few additional pieces in the header and footer, as well as making big changes to the tabs.

The first step in planning the design was to understand what customization happened in the template, and which in the header and footer which are entered separately in the admin UI. Margaret sketched out our vision for the site on the whiteboard wall to determine existing selectors and those that would need to be added, as well as get a sense of whether we would need to change the content section at all. In the interests of completing the project in a timely fashion, we determined that the bare minimum of customization to unify the research guides with the rest of the university websites would be the first priority.

For those still planning a redesign, the Code4Lib community has many suggestions on what to consider. The main thing to consider is that LibGuides 2.0 is based on the Bootstrap 3.0 framework, which Michael Schofield recently implored us to use responsibly. Other important considerations are the accessibility of the solution you pick, and use of white space.

LibGuides Look & Feel UI tabs The Look & Feel section under ‘Admin’ has several tabs with sections for Header and Footer, Custom CSS/JS, and layout of pages–Guide Pages Layout is the most relevant for this post.

Just as in the previous version of LibGuides, one can enter custom code for the header and footer (which in this case is almost the same as the regular library website), as well link to a custom CSS file (we did not include any custom Javascript here, but did include several Google Fonts and our custom icon). The Guide Pages Layout is new, and this is where one can edit the actual template that creates each page. We didn’t make any large changes here, but were still able to achieve a unique look with custom CSS.

The new LibGuides platform is responsive, but we needed to account for several items we added to the interface. We added a search box that would allow users to search the entire university website, as well as several new logos, so Margaret added a few media queries to adjust these features on a phone or tablet, as well as adjust the spacing of the custom footer.

Improving the Design

Our first design was ready to present to the subject librarians a month after the migration process started. It was based on the principle of matching the luc.edu pages closely (example), in which the navigation tabs across the top have unusual cutouts, and section titles are very large. No one was very happy with this result, however, as it made the typical LibGuides layout with multiple sections on a page unusable and the tabs not visible enough. While one approach would have been to change the navigation to left navigation menu and limit the number of sections, the majority of the subject librarians preferred to keep things closer to what they had been, with a view to moving toward a potential new layout in the future.

Once we determined a literal interpretation of the university website was not usable for our content, we found inspiration for the template body from another section of the university website that was aimed at presenting a lot of dynamic content with multiple sections, but kept the standard luc.edu header. This allowed us to create a page that was recognizably part of Loyola, but presented our LibGuides content in a much more usable form.

Sticky Tabs

Sticky Tabs

The other piece we borrowed from the university website was sticky tabs. This was an attempt to make the tabs more visible and usable based on what we knew from usability testing on the old platform and what users would already know from the university site. Because LibGuides is based on the Bootstrap framework, it was easy to drop this in using the Affix plugin (tutorial on how to use this)1. The tabs are translucent so they don’t obscure content as one scrolls down.

Our final result was much more popular with everyone. It has a subtle background color and border around each box with a section header that stands out but doesn’t overwhelm the content. The tabs are not at all like traditional LibGuides tabs, functioning somewhat more like regular header links.

mainview

Final result.

Next Steps

Over the summer we were not able to conduct usability testing on the new interface due to the tight timeline, so the first step this fall is to integrate it into our regular usability testing schedule to make iterative changes based on user feedback. We also need to continue to audit the page to improve accessibility.

The research guides are one of the most used links on our website (anywhere between 10,000 and 20,000 visits per month), so our top priority was to make sure the migration did not interfere with use – both in terms of patron access and content creation by the subject-area librarians. Thanks to our feedback sessions, good communication with Springshare, and reliable new platform, the migration went smoothly without interruption.

About our guest author: Will Kent is Reference/Instruction and Electronic Resources Librarian and subject specialist for Nursing and Chemistry at Loyola University Chicago. He received his MSLIS from University of Illinois Urbana-Champaign in 2011 with a certificate in Community Informatics.

Notes
  1. You may remember that in the Bootstrap Responsibly post Michael suggested it wasn’t necessary to use this, but it is the most straightforward way in LibGuides 2.0

Using the Stripe API to Collect Library Fines by Accepting Online Payment

Recently, my library has been considering accepting library fines via online. Currently, many library fines of a small amount that many people owe are hard to collect. As a sum, the amount is significant enough. But each individual fines often do not warrant even the cost for the postage and the staff work that goes into creating and sending out the fine notice letter. Libraries that are able to collect fines through the bursar’s office of their parent institutions may have a better chance at collecting those fines. However, others can only expect patrons to show up with or to mail a check to clear their fines. Offering an online payment option for library fines is one way to make the library service more user-friendly to those patrons who are too busy to visit the library in person or to mail a check but are willing to pay online with their credit cards.

If you are new to the world of online payment, there are several terms you need to become familiar with. The following information from the article in SixRevisions is very useful to understand those terms.1

  • ACH (Automated Clearing House) payments: Electronic credit and debit transfers. Most payment solutions use ACH to send money (minus fees) to their customers.
  • Merchant Account: A bank account that allows a customer to receive payments through credit or debit cards. Merchant providers are required to obey regulations established by card associations. Many processors act as both the merchant account as well as the payment gateway.
  • Payment Gateway: The middleman between the merchant and their sponsoring bank. It allows merchants to securely pass credit card information between the customer and the merchant and also between merchant and the payment processor.
  • Payment Processor: A company that a merchant uses to handle credit card transactions. Payment processors implement anti-fraud measures to ensure that both the front-facing customer and the merchant are protected.
  • PCI (the Payment Card Industry) Compliance: A merchant or payment gateway must set up their payment environment in a way that meets the Payment Card Industry Data Security Standard (PCI DSS).

Often, the same company functions as both payment gateway and payment processor, thereby processing the credit card payment securely. Such a product is called ‘Online payment system.’ Meyer’s article I have cited above also lists 10 popular online payment systems: Stripe, Authorize.Net, PayPal, Google Checkout, Amazon Payments, Dwolla, Braintree, Samurai by FeeFighters, WePay, and 2Checkout. Bear in mind that different payment gateways, merchant accounts, and bank accounts may or may not work together, your bank may or may not work as a merchant account, and your library may or may not have a merchant account. 2

Also note that there are fees in using online payment systems like these and that different systems have different pay structures. For example, Authorize.net has the $99 setup fee and then charges $20 per month plus a $0.10 per-transaction fee. Stripe charges 2.9% + $0.30 per transaction with no setup or monthly fees. Fees for mobile payment solutions with a physical card reader such as Square may go up much higher.

Among various online payment systems, I picked Stripe because it was recommended on the Code4Lib listserv. One of the advantages for using Stripe is that it acts as both the payment gateway and the merchant account. What this means is that your library does not have to have a merchant account to accept payment online. Another big advantage of using Stripe is that you do not have to worry about the PCI compliance part of your website because the Stripe API uses a clever way to send the sensitive credit card information over to the Stripe server while keeping your local server, on which your payment form sits, completely blind to such sensitive data. I will explain this in more detail later in this post.

Below I will share some of the code that I have used to set up Stripe as my library’s online payment option for testing. This may be of interest to you if you are thinking about offering online payment as an option for your patrons or if you are simply interested in how an online payment API works. Even if your library doesn’t need to collect library fines via online, an online payment option can be a handy tool for a small-scale fund-raising drive or donation.

The first step to take to make Stripe work is getting an API keys. You do not have to create an account to get API keys for testing. But if you are going to work on your code more than one day, it’s probably worth getting an account. Stripe API has excellent documentation. I have read ‘Getting Started’ section and then jumped over to the ‘Examples’ section, which can quickly get you off the ground. (https://stripe.com/docs/examples) I found an example by Daniel Schröter in GitHub from the list of examples in the Stripe’s Examples section and decided to test out. (https://github.com/myg0v/Simple-Bootstrap-Stripe-Payment-Form) Most of the time, getting an example code requires some probing and tweaking such as getting all the required library downloaded and sorting out the paths in the code and adding API keys. This one required relatively little work.

Now, let’s take a look at the form that this code creates.

borrowedcode

In order to create a form of my own for testing, I decided to change a few things in the code.

  1. Add Patron & Payment Details.
  2. Allow custom amount for payment.
  3. Change the currency from Euro to US dollars.
  4. Configure the validation for new fields.
  5. Hide the payment form once the charge goes through instead of showing the payment form below the payment success message.

html

4. can be done as follows. The client-side validation is performed by Bootstrapvalidator jQuery Plugin. So you need to get the syntax correct to get the code, which now has new fields, to work properly.
validator

This is the Javascript that allows you to send the data submitted to your payment form to the Stripe server. First, include the Stripe JS library (line 24). Include JQuery, Bootstrap, Bootstrap Form Helpers plugin, and Bootstrap Validator plugin (line 25-28). The next block of code includes an event handler for the form, which send the payment information to the Stripe via AJAX when the form is submitted. Stripe will validate the payment information and then return a token that identifies this particular transaction.

jspart

When the token is received, this code calls for the function, stripeResponseHandler(). This function, stripeResponseHandler() checks if the Stripe server did not return any error upon receiving the payment information and, if no error has been returned, attaches the token information to the form and submits the form.

jspart2

The server-side PHP script then checks if the Stripe token has been received and, if so, creates a charge to send it to Stripe as shown below. I am using PHP here, but Stripe API supports many other languages than PHP such as Ruby and Python. So you have many options. The real payment amount appears here as part of the charge array in line 326. If the charge succeeds, the payment success message is stored in a div to be displayed.

phppart

The reason why you do not have to worry about the PCI compliance with Stripe is that Stripe API asks to receive the payment information via AJAX and the input fields of sensitive information does not have the name attribute and value. (See below for the Card Holder Name and Card Number information as an example; Click to bring up the clear version of the image.)  By omitting the name attribute and value, the local server where the online form sits is deprived of any means to retrieve the information in those input fields submitted through the form. Since sensitive information does not touch the local server at all, PCI compliance for the local server becomes no concern. To clarify, not all fields in the payment form need to be deprived of the name attribute. Only the sensitive fields that you do not want your web server to have access to need to be protected this way. Here, for example, I am assigning the name attribute and value to fields such as name and e-mail in order to use them later to send a e-mail receipt.

(NB. Please click images to see the enlarged version.)

Screen Shot 2014-08-17 at 8.01.08 PM

Now, the modified form has ‘Fee Category’, custom ‘Payment Amount,’ and some other information relevant to the billing purpose of my library.

updated

When the payment succeeds, the page changes to display the following message.

success

Stripe provides a number of fake card numbers for testing. So you can test various cases of failures. The Stripe website also displays all payments and related tokens and charges that are associated with those payments. This greatly helps troubleshooting. One thing that I noticed while troubleshooting is that Stripe logs sometimes do lag behind. That is, when a payment would succeed, associated token and charge may not appear under the “Logs” section immediately. But you will see the payment shows up in the log. So you will know that associated token and charge will eventually appear in the log later.

recent_payment

Once you are ready to test real payment transactions, you need to flip the switch from TEST to LIVE located on the top left corner. You will also need to replace your API keys for ‘TESTING’ (both secret and public) with those for ‘LIVE’ transaction. One more thing that is needed before making your library getting paid with real money online is setting up SSL (Secure Sockets Layer) for your live online payment page. This is not required for testing but necessary for processing live payment transactions. It is not a very complicated work. So don’t be discouraged at this point. You just have to buy a security certificate and put it in your Web server. Speak to your system administrator for how to get the SSL set up for your payment page. More information about setting up SSL can be found in the Stripe documentation I just linked above.

My library has not yet gone live with this online payment option. Before we do, I may make some more modifications to the code to fit the staff workflow better, which is still being mapped out. I am also planning to place the online payment page behind the university’s Shibboleth authentication in order to cut down spam and save some tedious data entry by library patrons by getting their information such as name, university email, student/faculty/staff ID number directly from the campus directory exposed through Shibboleth and automatically inserting it into the payment form fields.

In this post, I have described my experience of testing out the Stripe API as an online payment solution. As I have mentioned above, however, there are many other online payment systems out there. Depending your library’s environment and financial setup, different solutions may work better than others. To me, not having to worry about the PCI compliance by using Stripe was a big plus. If your library accepts online payment, please share what solution you chose and what factors led you to the particular online payment system in the comments.

* This post has been based upon my recent presentation, “Accepting Online Payment for Your Library and ‘Stripe’ as an Example”, given at the Code4Lib DC Unconference. Slides are available at the link above.

Notes
  1. Meyer, Rosston. “10 Excellent Online Payment Systems.” Six Revisions, May 15, 2012. http://sixrevisions.com/tools/online-payment-systems/.
  2. Ullman, Larry. “Introduction to Stripe.” Larry Ullman, October 10, 2012. http://www.larryullman.com/2012/10/10/introduction-to-stripe/.

Please Don’t Kill the Web: A Screed on Performance & Austerity

Understanding web performance is important for everyone, whether you’re a front-end web developer working with HTML and CSS, an application developer writing services that will interact with the web, or a content author who writes for the web without using code. While there are only a few simple tricks to making web pages load quickly, they’re unfortunately eschewed all too often.

Libraries can ill afford to alienate anyone in their communities. We aim to be open institutions that welcome anyone to use our services, regardless of race, gender, class, creed, ability, or anything else. Yet when we make websites that work only for high-powered desktop computers with broadband connections, we privilege the wealthy. Design a slow enough website and even laptops on decent wireless connections may struggle to load a site in a timely fashion. But what about people who only own smart phones or people who live in rural areas where dial-up is the only option? Poor web performance renders sites unusable for some and frustrating for all.

But what can I do?!?

Below, I’ll elaborate on why certain practices make sites faster, but I want to outline some actions you can take to immediately speed up your sites.

First of all, test your main web properties to identify where the pain points are. Google Pagespeed and Yahoo’s YSlow are two good services that not only spot a broad range of potential problems but provide easy instructions for ameliorating them. They both output letter grades but it’s not worth worrying about getting an A; simply find the laggard sites and look for low-hanging fruit to fix.

Speaking of low-hanging fruit, I’m going to guess you have images on your site. In fact, images make up the bulk of most websites’ download size.1 So if you’re a content author uploading images to a site, ensuring that their file sizes are as small as possible goes a long way to speeding up your site. First of all, ask if the image is really necessary. While images certainly aid a site’s visual appeal, so does clean design with thoughtful colors and highlighted calls to action. It’s becoming easier and easier to avoid images too, as CSS advances and techniques like SVG and icon fonts become more popular. In general, if an image isn’t a logo or picture, it can be replaced through those technologies with less of a performance impact.

If you’re certain an image is necessary, ensure that you’re uploading an appropriately sized one. If you upload a 1200×900 image which will be scaled down to 400×300 inside an <img> tag, change the dimensions before uploading. Lastly, run the image through an optimizer that removes useless gunk from the file (this includes metadata and unused color profiles). ImageOptim is a great choice on Mac. I don’t use Windows so I can’t vouch for these, but FileOptimizer and RIOT (Radical Image Optimization Tool) appear to be popular choices for that platform.

What’s the second largest source of bytes in a website? JavaScript. Scripts are exponentially smaller than images but a prime opportunity to trim download size. You should always minify your JavaScript before putting it on a live site. What’s minification? It takes all the functionally useless bytes out of a file, producing text that’s operational but smaller. Here’s an example script before and after minification.

Before (236 bytes):

// add a proxy server prefix to all links with class=proxied
var links = document.querySelectorAll('a[href].proxied');

Array.prototype.forEach.call(links, function (link) {
    var proxy = 'https://proxy.example.com/login?url='
        , newHref = proxy + link.href;

    link.href = newHref;
});

After (169 bytes):

var links=document.querySelectorAll("a[href].proxied");Array.prototype.forEach.call(links,function(r){var e="https://proxy.example.com/login?url=",l=e+r.href;r.href=l});

By running my script through UglifyJS, I reduced its size 28%. We can see where Uglify applied its tricks: it removed a comment, removed whitespace (including line breaks), and changed my local variable names to single letters. Yet my script’s actions are the same, thus there’s no reason to deliver the larger, human-readable version to a browser.

Minifying JavaScript is a best practice, but CSS can also be minified. CSS benefits primarily through whitespace removal because selectors and properties must all remain the same. While it may not be an option in situations where your pages are served up by a CMS or vendor application, HTML can benefit substantially from having comments removed, quotes taken off of attributes, and even closing tags erased (perfectly valid under HTML5). While the outcome is ugly, the vast majority of your users won’t peak at your source code but will appreciate the quicker page load.

There’s one last optimization that most content authors or front-end web developers can do: combine like files. Rather than make the browser request several separate scripts and stylesheets, combine your assets into as few files as possible. Saving HTTP requests helps sites load quickly, especially over cellular connections. While it’s possible to minify and combine files one-by-one by pasting them into online tools, ideally you have a development workflow that employs a tool like Grunt or Gulp to minify your text-based resources and concatenate them into one big package. 2

When going through one’s scripts and stylesheets looking for optimizations, it’s again pertinent to ask how much is truly necessary. Yes, you can accomplish some gorgeous things with just a little code, slick modal dialogs and luscious fly-out menus. But is that really what your website needs? Are users struggling because your nav bar doesn’t collapse and stick to the top of the page as you scroll, or because your link text is obscure? Throwing scripts at a site can create more issues than it resolves.

But what can my server admin do?!?

If you have access to your library’s servers, then you’ll be able to change a few settings that will make broad improvements. If not, try emailing your local server admin to see if they can make these alterations.

One of the easiest, and biggest, improvements you can make is to turn on compression. With compression on, your server packs up requests into smaller files (think .zip archives) before sending them over the web. This makes a huge difference with most web assets because they’re repetitive plain text which is very amenable to compression algorithms. The best part is that turning on compression is typically a matter of copy-pasting a few lines into a server configuration file. If you’re unsure of how to do it, see the HTML5 Boilerplate Server Configs project. If you’re not certain your site is using compression, try the aptly-named GzipWTF.

What’s the only thing faster than sending a minified, compressed file? Not having to send that file at all. Browsers cache assets if they’re told to, meaning that repeat visits to your site won’t incur repeat downloads. This significantly increases speed if a visitor spends a lot of time on your site. Caching is, again, a matter of inserting a few configuration lines. It can be a little tricky, however, because when you update cached files you want users to download the new version, bypassing their local cache. One work-around is filename-based versioning. 3

Don’t have server access? Or are your servers taxed to the breaking point? Popular front-end resources are often available over Content Delivery Networks (CDNs). These CDNs have servers spread across the globe whose only job is to serve up static assets which get cached in browsers. Because many sites use them, users may already have assets cached in their browser (e.g. if they’ve visited Reddit recently, they have version 2.1.1. of the jQuery JavaScript library from a popular CDN run by Google). While Google and Microsoft run well-known CDNs for a few popular libraries, the lesser-known jsDelivr, CDNJS, and Bootstrap CDN give you even greater options.

The Basic Rules

I’ve avoided the theory behind web performance thus far, but not because it’s complex. There are only two primary rules: send fewer bytes and send fewer requests. So given a choice between two files that fulfill the same function, choose the smaller one (the logic behind minification and compression). If you can combine multiple files into one or avoiding sending something altogether, that saves requests (the logic behind concatenation and caching).

Sending fewer bytes makes intuitive sense; we know that larger files take longer to download. Not only that, we’ve all probably visually witnessed the effects of file size as manifested in progress bars or massive images slowly tiling into place. The benefits of concatenation, however, may seem peculiar. Why would sending one 50kb file be quicker than sending two 25kb files, or five 10kb files? It turns out that there are inefficient redundancies across multiple HTTP requests. A request has a few stages:

  • A DNS lookup translates a domain like “acrl.ala.org” into an IP address like “38.69.5.149”
  • The client and server establish a connection with a TCP handshake
  • The server receives the request and determines what to send back
  • Finally, the data is pushed out over the network

Throughout these steps there may be a great amount of latency as data travels through the air and over wires. All of that latency is repeated with each request, making multiple requests slower. So even if we serve all our assets from the same domain, thus requiring only one DNS request, and configure our server to not repeat the TCP connection on each request, there’s still additional overhead to multiple requests.

Here’s a visual explanation of this phenomenon:

While the DNS, TCP, and download times are identical for two scripts sent independently and the product of their concatenation, the total “Time To First Byte” (analogous to latency) is greater with the dual requests.

Incidentally, the next version of the HTTP specification—HTTP 2.0, currently active as the SPDY protocol—features request multiplexing. The ability to pack multiple requests into one connection virtually eliminates the need for concatenation. It’s a long way off though, since the spec hasn’t been finalized and it will need to be implemented in a backwards-compatible way in both web browsers and servers.

Don’t Sweat What You Can’t Control

Here’s a sad truth: a lot of your library’s web properties probably have terrible performance and there’s little you can do about it. Libraries rely heavily upon third-party websites that may give you a box for pasting custom HTML into, but no or limited control over page templates. Since we’re purchasing a web site and not a web service, we trade convenience for quality. 4 Perhaps because making a generically useful website is really tough, or perhaps because they just don’t know, most of these vendor sites ignore best practices. Similarly, we tend to reuse large open source applications without auditing their performance characteristics. I began listing some of the more egregious offenders, but the list became so long that I’ve put it in a separate appendix below.

There’s More

There’s a lot more to web performance, including avoiding redirects, combining images into sprites, and putting scripts down towards the end of the <body> tag. An entire field has grown up around the topic, spurred on by the rising complexity of web applications and the prominence of mobile devices. Google’s Make the Web Faster project is great place to learn more, as is Steve Souder’s dated but classic High Performance Web Sites. The main takeaway, however, is that understanding a few tenets of web performance can make a difference. By slimming your images and combining your CSS, you can make your sites accessible to more devices and quicker for everyone.

Appendix A: The Wall of Performance Shame

I don’t mean to imply that any of these software packages are poorly made or bad choices. In fact, I’m quite fond of many of them. But run a few through tools like Google Pagespeed and it quickly becomes apparent that even the easiest optimizations aren’t being applied.


LibGuides 1.0 loads all its 5 scripts in the <head> and includes two large utility libraries in jQuery and Dojo, which I struggle to believe is necessary. LibGuides 2.0 does the right thing in relying on CDNs to serve up a few libraries, but still includes 5 scripts, two of which are unminified (despite having “.min” in the name).

The EQUELLA digital repository software loads 11 CSS files and and 14 scripts in the <head> of the document. I didn’t bother to check them all, but the ones I did view are unminified (e.g. they’ve left jQuery at full size, which increases the download 154kb for no good reason). This essentially crushes the page load on all but desktop devices with decent bandwidth; I cannot imagine a user enduring the wait on a phone’s 3G connection.

DSpace loads 9 stylesheets, most of which appear unminified, and 5 scripts (which, luckily, are minified) so far up in the <head> of the document that the <title> of the page loads noticeably slow on a phone.

VuFind loads 10 scripts in a row in the <head>; these could be easily combined.

Thinking a little broader beyond library-specific software, Drupal could do a better job concatenating and minifying resources. While the performance tools have a simple checkbox for combining scripts, by default Drupal will still load multiple (3-6 stylesheets and scripts) resources on each page. The JavaScript and CSS that come with many contributed modules tend to not be minified for some reason and Drupal sometimes doesn’t process them itself. I don’t know the best practices for providing client-side assets with a module but many authors seem to be ignoring performance.

On a positive note, the Blacklight discovery layer does a great job of combining all CSS and JavaScript into one file apiece, probably due to the fact it’s built with Ruby on Rails which provides some nice tools for conforming to best practices. The Koha ILS also appears to keep the number of files down and pushes most of its scripts to the bottom of the document.

Notes

  1. The HTTP Archive is a wonderful source for data like this.
  2. These development workflows are a bit much to go into at the moment, but trust me that they’re not too difficult to set up. There are template configurations which you can use that make it relatively painless to simply point a piece of software at a folder full of scripts and watch an optimized output appear in a prescribed destination. If there’s interest in another blog post with more details, leave me a note in the comments and I’ll happily write one.
  3. This Stack Overflow answer gives a decent overview of filename-based versioning with some further reading. Google’s Make the Web Faster project has a nice article on how HTTP caching works.
  4. When I say “web service” I mean platforms which expose all their functionality through an API, which allows one to write a custom interface layer on top. That’s often a lot of work but would provide total control over things like the number and order of resources loaded in the HTML.

Bootstrap Responsibly

Bootstrap is the most popular front-end framework used for websites. An estimate by meanpath several months ago sat it firmly behind 1% of the web – for good reason: Bootstrap makes it relatively painless to puzzle together a pretty awesome plug-and-play, component-rich site. Its modularity is its key feature, developed so Twitter could rapidly spin-up internal microsites and dashboards.

Oh, and it’s responsive. This is kind of a thing. There’s not a library conference today that doesn’t showcase at least one talk about responsive web design. There’s a book, countless webinars, courses, whole blogs dedicated to it (ahem), and more. The pressure for libraries to have responsive, usable websites can seem to come more from the likes of us than from the patronbase itself, but don’t let that discredit it. The trend is clear and it is only a matter of time before our libraries have their mobile moment.

Library websites that aren’t responsive feel dated, and more importantly they are missing an opportunity to reach a bevy of mobile-only users that in 2012 already made up more than a quarter of all web traffic. Library redesigns are often quickly pulled together in a rush to meet the growing demand from stakeholders, pressure from the library community, and users. The sprint makes the allure of frameworks like Bootstrap that much more appealing, but Bootstrapped library websites often suffer the cruelest of responsive ironies:

They’re not mobile-friendly at all.

Assumptions that Frameworks Make

Let’s take a step back and consider whether using a framework is the right choice at all. A front-end framework like Bootstrap is a Lego set with all the pieces conveniently packed. It comes with a series of templates, a blown-out stylesheet, scripts tuned to the environment that let users essentially copy-and-paste fairly complex web-machinery into being. Carousels, tabs, responsive dropdown menus, all sorts of buttons, alerts for every occasion, gorgeous galleries, and very smart decisions made by a robust team of far-more capable developers than we.

Except for the specific layout and the content, every Bootstrapped site is essentially a complete organism years in the making. This is also the reason that designers sometimes scoff, joking that these sites look the same. Decked-out frameworks are ideal for rapid prototyping with a limited timescale and budget because the design decisions have by and large already been made. They assume you plan to use the framework as-is, and they don’t make customization easy.

In fact, Bootstrap’s guide points out that any customization is better suited to be cosmetic than a complete overhaul. The trade-off is that Bootstrap is otherwise complete. It is tried, true, usable, accessible out of the box, and only waiting for your content.

Not all Responsive Design is Created Equal

It is still common to hear the selling point for a swanky new site is that it is “responsive down to mobile.” The phrase probably rings a bell. It describes a website that collapses its grid as the width of the browser shrinks until its layout is appropriate for whatever screen users are carrying around. This is kind of the point – and cool, as any of us with a browser-resizing obsession could tell you.

Today, “responsive down to mobile” has a lot of baggage. Let me explain: it represents a telling and harrowing ideology that for these projects mobile is the afterthought when mobile optimization should be the most important part. Library design committees don’t actually say aloud or conceive of this stuff when researching options, but it should be implicit. When mobile is an afterthought, the committee presumes users are more likely to visit from a laptop or desktop than a phone (or refrigerator). This is not true.

See, a website, responsive or not, originally laid out for a 1366×768 desktop monitor in the designer’s office, wistfully depends on visitors with that same browsing context. If it looks good in-office and loads fast, then looking good and loading fast must be the default. “Responsive down to mobile” is divorced from the reality that a similarly wide screen is not the common denominator. As such, responsive down to mobile sites have a superficial layout optimized for the developers, not the user.

In a recent talk at An Event Apart–a conference–in Atlanta, Georgia, Mat Marquis stated that 72% of responsive websites send the same assets to mobile sites as they do desktop sites, and this is largely contributing to the web feeling slower. While setting img { width: 100%; } will scale media to fit snugly to the container, it is still sending the same high-resolution image to a 320px-wide phone as a 720px-wide tablet. A 1.6mb page loads differently on a phone than the machine it was designed on. The digital divide with which librarians are so familiar is certainly nowhere near closed, but while internet access is increasingly available its ubiquity doesn’t translate to speed:

  1. 50% of users ages 12-29 are “mostly mobile” users, and you know what wireless connections are like,
  2. even so, the weight of the average website ( currently 1.6mb) is increasing.

Last December, analysis of data from pagespeed quantiles during an HTTP Archive crawl tried to determine how fast the web was getting slower. The fastest sites are slowing at a greater rate than the big bloated sites, likely because the assets we send–like increasingly high resolution images to compensate for increasing pixel density in our devices–are getting bigger.

The havoc this wreaks on the load times of “mobile friendly” responsive websites is detrimental. Why? Well, we know that

  • users expect a mobile website to load as fast on their phone as it does on a desktop,
  • three-quarters of users will give up on a website if it takes longer than 4 seconds to load,
  • the optimistic average load time for just a 700kb website on 3G is more like 10-12 seconds

eep O_o.

A Better Responsive Design

So there was a big change to Bootstrap in August 2013 when it was restructured from a “responsive down to mobile” framework to “mobile-first.” It has also been given a simpler, flat design, which has 100% faster paint time – but I digress. “Mobile-first” is key. Emblazon this over the door of the library web committee. Strike “responsive down to mobile.” Suppress the record.

Technically, “mobile-first” describes the structure of the stylesheet using CSS3 Media Queries, which determine when certain styles are rendered by the browser.

.example {
  styles: these load first;
}

@media screen and (min-width: 48em) {

  .example {

    styles: these load once the screen is 48 ems wide;

  }

}

The most basic styles are loaded first. As more space becomes available, designers can assume (sort of) that the user’s device has a little extra juice, that their connection may be better, so they start adding pizzazz. One might make the decision that, hey, most of the devices less than 48em (720px approximately with a base font size of 16px) are probably touch only, so let’s not load any hover effects until the screen is wider.

Nirvana

In a literal sense, mobile-first is asset management. More than that, mobile-first is this philosophical undercurrent, an implicit zen of user-centric thinking that aligns with libraries’ missions to be accessible to all patrons. Designing mobile-first means designing to the lowest common denominator: functional and fast on a cracked Blackberry at peak time; functional and fast on a ten year old machine in the bayou, a browser with fourteen malware toolbars trudging through the mire of a dial-up connection; functional and fast [and beautiful?] on a 23″ iMac. Thinking about the mobile layout first makes design committees more selective of the content squeezed on to the front page, which makes committees more concerned with the quality of that content.

The Point

This is the important statement that Bootstrap now makes. It expects the design committee to think mobile-first. It comes with all the components you could want, but they want you to trim the fat.

Future Friendly Bootstrapping

This is what you get in the stock Bootstrap:

  • buttons, tables, forms, icons, etc. (97kb)
  • a theme (20kb)
  • javascripts (30kb)
  • oh, and jQuery (94kb)

That’s almost 250kb of website. This is like a browser eating a brick of Mackinac Island Fudge – and this high calorie bloat doesn’t include images. Consider that if the median load time for a 700kb page is 10-12 seconds on a phone, half that time with out-of-the-box Bootstrap is spent loading just the assets.

While it’s not totally deal-breaking, 100kb is 5x as much CSS as an average site should have, as well as 15%-20% of what all the assets on an average page should weigh. Josh Broton

To put this in context, I like to fall back on Ilya Girgorik’s example comparing load time to user reaction in his talk “Breaking the 1000ms Time to Glass Mobile Barrier.” If the site loads in just 0-100 milliseconds, this feels instant to the user. By 100-300ms, the site already begins to feel sluggish. At 300-1000ms, uh – is the machine working? After 1 second there is a mental context switch, which means that the user is impatient, distracted, or consciously aware of the load-time. After 10 seconds, the user gives up.

By choosing not to pair down, your Bootstrapped Library starts off on the wrong foot.

The Temptation to Widgetize

Even though Bootstrap provides modals, tabs, carousels, autocomplete, and other modules, this doesn’t mean a website needs to use them. Bootstrap lets you tailor which jQuery plugins are included in the final script. The hardest part of any redesign is to let quality content determine the tools, not the ability to tabularize or scrollspy be an excuse to implement them. Oh, don’t Google those. I’ll touch on tabs and scrollspy in a few minutes.

I am going to be super presumptuous now and walk through the total Bootstrap package, then make recommendations for lightening the load.

Transitions

Transitions.js is a fairly lightweight CSS transition polyfill. What this means is that the script checks to see if your user’s browser supports CSS Transitions, and if it doesn’t then it simulates those transitions with javascript. For instance, CSS transitions often handle the smooth, uh, transition between colors when you hover over a button. They are also a little more than just pizzazz. In a recent article, Rachel Nabors shows how transition and animation increase the usability of the site by guiding the eye.

With that said, CSS Transitions have pretty good browser support and they probably aren’t crucial to the functionality of the library website on IE9.

Recommendation: Don’t Include.

 Modals

“Modals” are popup windows. There are plenty of neat things you can do with them. Additionally, modals are a pain to design consistently for every browser. Let Bootstrap do that heavy lifting for you.

Recommendation: Include

Dropdown

It’s hard to conclude a library website design committee without a lot of links in your menu bar. Dropdown menus are kind of tricky to code, and Bootstrap does a really nice job keeping it a consistent and responsive experience.

Recommendation: Include

Scrollspy

If you have a fixed sidebar or menu that follows the user as they read, scrollspy.js can highlight the section of that menu you are currently viewing. This is useful if your site has a lot of long-form articles, or if it is a one-page app that scrolls forever. I’m not sure this describes many library websites, but even if it does, you probably want more functionality than Scrollspy offers. I recommend jQuery-Waypoints – but only if you are going to do something really cool with it.

Recommendation: Don’t Include

Tabs

Tabs are a good way to break-up a lot of content without actually putting it on another page. A lot of libraries use some kind of tab widget to handle the different search options. If you are writing guides or tutorials, tabs could be a nice way to display the text.

Recommendation: Include

Tooltips

Tooltips are often descriptive popup bubbles of a section, option, or icon requiring more explanation. Tooltips.js helps handle the predictable positioning of the tooltip across browsers. With that said, I don’t think tooltips are that engaging; they’re sometimes appropriate, but you definitely use to see more of them in the past. Your library’s time is better spent de-jargoning any content that would warrant a tooltip. Need a tooltip? Why not just make whatever needs the tooltip more obvious O_o?

Recommendation: Don’t Include

Popover

Even fancier tooltips.

Recommendation: Don’t Include

Alerts

Alerts.js lets your users dismiss alerts that you might put in the header of your website. It’s always a good idea to give users some kind of control over these things. Better they read and dismiss than get frustrated from the clutter.

Recommendation: Include

Collapse

The collapse plugin allows for accordion-style sections for content similarly distributed as you might use with tabs. The ease-in-ease-out animation triggers motion-sickness and other aaarrghs among users with vestibular disorders. You could just use tabs.

Recommendation: Don’t Include

Button

Button.js gives a little extra jolt to Bootstrap’s buttons, allowing them to communicate an action or state. By that, imagine you fill out a reference form and you click “submit.” Button.js will put a little loader icon in the button itself and change the text to “sending ….” This way, users are told that the process is running, and maybe they won’t feel compelled to click and click and click until the page refreshes. This is a good thing.

Recommendation: Include

Carousel

Carousels are the most popular design element on the web. It lets a website slideshow content like upcoming events or new material. Carousels exist because design committees must be appeased. There are all sorts of reasons why you probably shouldn’t put a carousel on your website: they are largely inaccessible, have low engagement, are slooooow, and kind of imply that libraries hate their patrons.

Recommendation: Don’t Include.

Affix

I’m not exactly sure what this does. I think it’s a fixed-menu thing. You probably don’t need this. You can use CSS.

Recommendation: Don’t Include

Now, Don’t You Feel Better?

Just comparing the bootstrap.js and bootstrap.min.js files between out-of-the-box Bootstrap and one tailored to the specs above, which of course doesn’t consider the differences in the CSS, the weight of the images not included in a carousel (not to mention the unquantifiable amount of pain you would have inflicted), the numbers are telling:

File Before After
bootstrap.js 54kb 19kb
bootstrap.min.js 29kb 10kb

So, Bootstrap Responsibly

There is more to say. When bouncing this topic around twitter awhile ago, Jeremy Prevost pointed out that Bootstrap’s minified assets can be GZipped down to about 20kb total. This is the right way to serve assets from any framework. It requires an Apache config or .htaccess rule. Here is the .htaccess file used in HTML5 Boilerplate. You’ll find it well commented and modular: go ahead and just copy and paste the parts you need. You can eke out even more performance by “lazy loading” scripts at a given time, but these are a little out of the scope of this post.

Here’s the thing: when we talk about having good library websites we’re mostly talking about the look. This is the wrong discussion. Web designs driven by anything but the content they already have make grasping assumptions about how slick it would look to have this killer carousel, these accordions, nifty tooltips, and of course a squishy responsive design. Subsequently, these responsive sites miss the point: if anything, they’re mobile unfriendly.

Much of the time, a responsive library website is used as a marker that such-and-such site is credible and not irrelevant, but as such the website reflects a lack of purpose (e.g., “this website needs to increase library-card registration). A superficial understanding of responsive webdesign and easy-to-grab frameworks entail that the patron is the least priority.

 

About Our Guest Author :

Michael Schofield is a front-end librarian in south Florida, where it is hot and rainy – always. He tries to do neat things there. You can hear him talk design and user experience for libraries on LibUX.


Two Free Methods for Sharing Google Analytics Data Visualizations on a Public Dashboard

UPDATE (October 15th, 2014):

OOCharts as an API and service, described below, will be shut down November 15th, 2014.  More information about the decision by the developers to shut down the service is here. It’s not entirely surprising that the service going away, considering the Google SuperProxy option described in this post.  I’m leaving all the instructions here for OOCharts for posterity, but as of November 15th you can only use the SuperProxy or build your own with the Core Reporting API.

At this point, Google Analytics is arguably the standard way to track website usage data.  It’s easy to implement, but powerful enough to capture both wide general usage trends and very granular patterns.  However, it is not immediately obvious how to share Google Analytics data either with internal staff or external stakeholders – both of whom often have widespread demand for up-to-date metrics about library web property performance.  While access to Google Analytics can be granted by Google Analytics administrators to individual Google account-holders through the Google Analytics Admin screen, publishing data without requiring authentication requires some intermediary steps.

There are two main free methods of publishing Google Analytics visualizations to the web, and both involve using the Google Analytics API: OOCharts and the Google Analytics superProxy.  Both methods rely upon an intermediary service to retrieve data from the API and cache it, both to improve retrieval time and to avoid exceeding limits in API requests.1  The first method – OOCharts – requires much less time to get running initially. However, OOCharts’ long-standing beta status, and its status as a stand-alone service has less potential for long-term support than the second method, the Google Analytics superProxy.  For that reason, while OOCharts is certainly easier to set up, the superProxy method is definitely worth the investment in time (depending on what your needs).  I’ll cover both methods.

OOCharts Beta

OOCharts’ service facilitates the creation and storage of Google Analytics API Keys, which are required for sending secure requests to the API in order to retrieve data.

When setting up an OOCharts account, create your account utilizing the email address you use to access Google Analytics.  For example, if you log into Google Analytics using admin@mylibrary.org, I suggest you use this email address to sign up for OOCharts.  After creating an account with OOCharts, you will be directed to Google Analytics to authorize the OOCharts service to access your Google Analytics data on your behalf.

Let OOCharts gather Google Analytics Data on your Behalf.

Let OOCharts gather Google Analytics Data on your Behalf.

After authorizing the service, you will be able to generate an API key for any of the properties your to which your Google Analytics account has access.

In the OOCharts interface, click your name in the upper right corner, and then click inside the Add API Key field to see a list of available Analytics Properties from your account.

 

Once you’ve select a property, OOCharts will generate a key for your OOCharts application.  When you go back to the list of API Keys, you’ll see your keys along with property IDs (shown in brackets after the URL of your properties, e.g., [9999999].

APIkeys

Your API Keys now show your key values and your Google Analytics property IDs, both of which you’ll need to create visualizations with the OOCharts library.

 

Creating Visualizations with the OOCharts JavaScript Library

OOCharts appears to have started as a simple chart library for visualization data returned from the Google Analytics API. After you have set up your OOCharts account, download the front-end JavaScript library (available at http://docs.oocharts.com/) and upload it to your server or localhost.   Navigate to the /examples directory and locate the ‘timeline.html’ file.  This file contains a simplified example that displays web visits over time in Google Analytics’ familiar timeline format.

The code for this page is very simple, and contains two methods – JavaScript-only and with HTML Attributes – for creating OOCharts visualizations.  Below, I’ve separated out the required elements for both methods.  While either methods will work on their own, using HTML attributes allows for additional customizations and styling:

JavaScript-Only
		<h3>With JS</h3>
		<div id='chart'></div>
		
		<script src='../oocharts.js'></script>
		<script type="text/javascript">

			window.onload = function(){

				oo.setAPIKey("{{ YOUR API KEY }}");

				oo.load(function(){

					var timeline = new oo.Timeline("{{ YOUR PROFILE ID }}", "180d");

					timeline.addMetric("ga:visits", "Visits");

					timeline.addMetric("ga:newVisits", "New Visits");

					timeline.draw('chart');

				});
			};

		</script>
With HTML Attributes:
<h3>With HTML Attributes</h3>
	<div data-oochart='timeline' 
        data-oochart-metrics='ga:visits,Visits,ga:newVisits,New Visits' 
        data-oochart-start-date='180d' 
        data-oochart-profile='{{ YOUR PROFILE ID }}'></div>
		
			<script src='../oocharts.js'></script>
			<script type="text/javascript">

			window.onload = function(){

				oo.setAPIKey("{{ YOUR API KEY }}");

				oo.load(function(){
				});
			};

		</script>

 

For either method, plugin {{ YOUR API KEY }} where indicated with the API Key generated in OOCharts and replace {{ YOUR PROFILE ID }} with the associated eight-digit profile ID.  Load the page in your browser, and you get this:

With the API Key and Profile ID in place, the timeline.html example looks like this.  In this example I also adjusted the date paramter (30d by default) to 180d for more data.

With the API Key and Profile ID in place, the timeline.html example looks like this. In this example I also adjusted the date parameter (30d by default) to 180d for more data.

This example shows you two formats for the chart – one that is driven solely by JavaScript, and another that can be customized using HTML attributes.  For example, you could modify the <div> tag to include a style attribute or CSS class to change the width of the chart, e.g.:

<h3>With HTML Attributes</h3>

<div style=”width:400px” data-oochart='timeline' 
data-oochart-start-date='180d' 
data-oochart-metrics='ga:visits,Visits,ga:newVisits,New Visits' 
data-oochart-profile='{{ YOUR PROFILE ID }}'></div>

Here’s the same example.html file showing both the JavaScript-only format and the HTML-attributes format, now with a bit of styling on the HTML attributes chart to make it smaller:

You can use styling to adjust the HTML attributes example.

You can use styling to adjust the HTML attributes example.

Easy, right?  So what’s the catch?

OOCharts only allows 10,000 requests a month – which is even easier to exceed than the 50,000 limit on the Google Analytics API.  Each time your page loads, you use another request.  Perhaps more importantly, your Analytics API key and profile ID are pretty much ‘out there’ for the world to see if they view your page source, because those values are stored in your client-side JavaScript2.  If you’re making a private intranet for your library staff, that’s probably not a big deal; but if you want to publish your dashboard fully to the public, you’ll want to make sure those values are secure.  You can do this with the Google Analytics superProxy.

Google Analytics superProxy

In 2013, Google Analytics released a method of accessing Google Analytics API data that doesn’t require end-users to authenticate in order to view data, known as the Google Analytics superProxy.  Much like OOCharts, the superProxy facilitates the creation of a query engine that retrieves Google Analytics statistics through the Google Analytics Core Reporting API, caches the statistics in a separate web application service, and enables the display of Google Analytics data to end users without requiring individual authentication. Caching the data has the additional benefit of ensuring that your application will not exceed the Google Core Reporting API request limit of 50,000 requests each day. The superProxy can be set up to refresh a limited number of times per day, and most dashboard applications only need a daily refresh of data to stay current.
The required elements of this method are available on the superProxy Github page (Google Analytics, “Google Analytics superProxy”). There are four major parts to the setup of the superProxy:

  1. Setting up Google App Engine hosting,
  2. Preparing the development environment,
  3. Configuring and deploying the superProxy to Google’s App Engine Appspot host; and
  4. Writing and scheduling queries that will be used to populate your dashboard.
Set up Google App Engine hosting

First, your Google Analytics account credentials to access the Google App Engine at https://appengine.google.com/start. The superProxy application you will be creating will be freely hosted by the Google App Engine. Create your application and designate an Application Identifier that will serve as the endpoint domain for queries to your Google Analytics data (e.g., mylibrarycharts.appspot.com).

Create your App Engine Application

Create your App Engine Application

You can leave the default authentication option, Open to All Google Users, selected.  This setting only reflects access to your App Engine administrative screen and does not affect the ability for end-users to view the dashboard charts you create.  Only those Google users who have been authorized to access Google Analytics data will be able to access any Google Analytics information through the Google App Engine.

Ensure that API access to Google Analytics is turned on under the Services pane of the Google Developer’s Console. Under APIs and Auth for your project, visit the APIs menu and ensure that the Analytics API is turned on.

Turn on the Google Analytics API.

Turn on the Google Analytics API.  Make sure the name under the Projects label in the upper left corner is the same as your newly created Google App Engine project (e.g., librarycharts).

 

Then visit the Credentials menu to set up an OAuth 2.0 Client ID. Set the Authorized JavaScript Origins value to your appspot domain (e.g., http://mylibrarycharts.appspot.com). Use the same value for the Authorized Redirect URI, but add /admin/auth to the end (e.g., http://mylibrarycharts.appspot.com/admin/auth). Note the OAuth Client ID, OAuth Client Secret, and OAuth Redirect URI that are stored here, as you will need to reference them later before you deploy your superProxy application to the Google App Engine.

Finally, visit the Consent Screen menu and choose an email address (such as your Google account email address), fill in the product name field with your Application Identifier (e.g., mylibrarycharts) and save your settings. If you do not include these settings, you may experience errors when accessing your superProxy application admin menu.

Prepare the Development Environment

In order to configure superProxy and deploy it to Google App Engine you will need Python 2.7 installed and the Google App Engine Launcher (AKA the Google App Engine SDK).  Python just needs to be installed for the App Engine Launcher to run; don’t worry, no Python coding is required.

Configure and Deploy the superProxy

The superProxy application is available from the superProxy Github page. Download the .zip files and extract them onto your computer into a location you can easily reference (e.g., C:/Users/yourname/Desktop/superproxy or /Applications/superproxy). Use a text editor such as Notepad or Notepad++ to edit the src/app.yaml to include your Application ID (e.g., mylibrarycharts). Then use Notepad to edit src/config.py to include the OAuth Client ID, OAuth Client Secret, and the OAuth Redirect URI that were generated when you created the Client ID in the Google Developer’s Console under the Credentials menu. Detailed instructions for editing these files are available on the superProxy Github page.

After you have edited and saved src/app.yaml and src/config.py, open the Google App Engine Launcher previously downloaded. Go to File > Add Existing Application. In the dialogue box that appears, browse to the location of your superProxy app’s /src directory.

To upload your superProxy application, use the Google App Engine Launcher and browse to the /src directory where you saved and configured your superProxy application.

To upload your superProxy application, use the Google App Engine Launcher and browse to the /src directory where you saved and configured your superProxy application.

Click Add, then click the Deploy button in the upper right corner of the App Engine Launcher. You may be asked to log into your Google account, and a log console may appear informing you of the deployment process. When deployment has finished, you should be able to access your superProxy application’s Admin screen at http://[yourapplicationID].appspot.com/admin, replacing [yourapplicationID] with your Application Identifier.

Creating superProxy Queries

SuperProxy queries request data from your Google Analytics account and return that data to the superProxy application. When the data is returned, it is made available to an end-point that can be used to populate charts, graphs, or other data visualizations. Most data available to you through the Google Analytics native interface is available through superProxy queries.

An easy way to get started with building a query is to visit the Google Analytics Query Explorer. You will need to login with your Google Analytics account to use the Query Explorer. This tool allows you to build an example query for the Core Reporting API, which is the same API service that your superProxy application will be using.

Google Analytics API Query Explorer

Running example queries through the Google Analytics Query Explorer can help you to identify the metrics and dimensions you would like to use in superProxy queries. Be sure to note the metrics and dimensions you use, and also be sure to note the ids value that is populated for you when using the API Explorer.

 

When experimenting with the Google Analytics Query explorer, make note of all the elements you use in your query. For example, to create a query that retrieves the number of users that visited your site between July 4th and July 18th 2014, you will need to select your Google Account, Property and View from the drop-down menus, and then build a query with the following parameters:

  • ids = this is a number (usually 8 digits) that will be automatically populated for you when you choose your Google Analytics Account, Property and View. The ids value is your property ID, and you will need this value later when building your superProxy query.
  • dimensions = ga:browser
  • metrics = ga:users
  • start-date = 07-04-2014
  • end-date = 07-18-2014

You can set the max-results value to limit the number of results returned. For queries that could potentially have thousands of results (such as individual search terms entered by users), limiting to the top 10 or 50 results will retrieve data more quickly. Clicking on any of the fields will generate a menu from which you can select available options. Click Get Data to retrieve Google Analytics data and verify that your query works

Successful Google Analytics Query Explorer query result showing visits by browser.

Successful Google Analytics Query Explorer query result showing visits by browser.

After building a successful query, you can replicate the query in your superProxy application. Return to your superProxy application’s admin page (e.g., http://[yourapplicationid].appspot.com/admin) and select Create Query. Name your query something to make it easy to identify later (e.g., Users by Browser). The Refresh Interval refers to how often you want the superProxy to retrieve fresh data from Google Analytics. For most queries, a daily refresh of the data will be sufficient, so if you are unsure, set the refresh interval to 86400. This will refresh your data every 86400 seconds, or once per day.

Create a superProxy Query

Create a superProxy Query

We can reuse all of the elements of queries built using the Google Analytics API Explorer to build the superProxy query encoded URI.  Here is an example of an Encoded URI that queries the number of users (organized by browser) that have visited a web property in the last 30 days (you’ll need to enter your own profile ID in the ids value for this to work):

https://www.googleapis.com/analytics/v3/data/ga?ids=ga:99999991
&metrics=ga:users&dimensions=ga:browser&max-results=10
&start-date={30daysago}&end-date={today}

Before saving, be sure to run Test Query to see a preview of the kind of data that is returned by your query. A successful query will return a json response, e.g.:

{"kind": "analytics#gaData", "rows": 
[["Amazon Silk", "8"], 
["Android Browser", "36"], 
["Chrome", "1456"], 
["Firefox", "1018"], 
["IE with Chrome Frame", "1"], 
["Internet Explorer", "899"], 
["Maxthon", "2"], 
["Opera", "7"], 
["Opera Mini", "2"], 
["Safari", "940"]], 
"containsSampledData": false, 
"totalsForAllResults": {"ga:users": "4398"}, 
"id": "https://www.googleapis.com/analytics/v3/data/ga?ids=ga:84099180&dimensions=ga:browser&metrics=ga:users&start-date=2014-06-29&end-date=2014-07-29&max-results=10", 
"itemsPerPage": 10, "nextLink": "https://www.googleapis.com/analytics/v3/data/ga?ids=ga:84099180&dimensions=ga:browser&metrics=ga:users&start-date=2014-07-04&end-date=2014-07-18&start-index=11&max-results=10", 
"totalResults": 13, "query": {"max-results": 10, "dimensions": "ga:browser", "start-date": "2014-06-29", "start-index": 1, "ids": "ga:84099180", "metrics": ["ga:users"], "end-date": "2014-07-18"}, 
"profileInfo": {"webPropertyId": "UA-9999999-9", "internalWebPropertyId": "99999999", "tableId": "ga:84099180", "profileId": "9999999", "profileName": "Library Chart", "accountId": "99999999"}, 
"columnHeaders": [{"dataType": "STRING", "columnType": "DIMENSION", "name": "ga:browser"}, {"dataType": "INTEGER", "columnType": "METRIC", "name": "ga:users"}], 
"selfLink": "https://www.googleapis.com/analytics/v3/data/ga?ids=ga:84099180&dimensions=ga:browser&metrics=ga:users&start-date=2014-06-29&end-date=2014-07-29&max-results=10"}

Once you’ve tested a successful query, save it, which will allow the json string to become accessible to an application that can help to visualize this data. After saving, you will be directed to the management screen for your API, where you will need to click Activate Endpoint to begin publishing the results of the query in a way that is retrievable. Then click Start Scheduling so that the query data is refreshed on the schedule you determined when you built the query (e.g., once a day). Finally, click Refresh Data to return data for the first time so that you can start interacting with the data returned from your query. Return to your superProxy application’s Admin page, where you will be able to manage your query and locate the public end-point needed to create a chart visualization.

Using the Google Visualization API to visualize Google Analytics data

Included with the superProxy .zip file downloaded to your computer from the Github repository is a sample .html page located under /samples/superproxy-demo.html. This file uses the Google Visualization API to generate two pie charts from data returned from superProxy queries. The Google Visualization API is a service that can ingest raw data (such as json arrays that are returned by the superProxy) and generate visual charts and graphs. Save superproxy-demo.html onto a web server or onto your computer’s localhost.  We’ll set up the first pie chart to use the data from the Users by Browser query saved in your superProxy app.

Open superproxy-demo.html and locate this section:

var browserWrapper = new google.visualization.ChartWrapper({
// Example Browser Share Query
"containerId": "browser",
// Example URL: http://your-application-id.appspot.com/query?id=QUERY_ID&format=data-table-response
"dataSourceUrl": "REPLACE WITH Google Analytics superProxy PUBLIC URL, DATA TABLE RESPONSE FORMAT",
"refreshInterval": REPLACE_WITH_A_TIME_INTERVAL,
"chartType": "PieChart",
"options": {
"showRowNumber" : true,
"width": 630,
"height": 440,
"is3D": true,
"title": "REPLACE WITH TITLE"
}
});

Three values need to be modified to create a pie chart visualization:

  • dataSourceUrl: This value is the public end-point of the superProxy query you have created. To get this value, navigate to your superProxy admin page and click Manage Query on the Users by Browser query you have created. On this page, right click the DataTable (JSON Response) link and copy the URL (Figure 8). Paste the copied URL into superproxy-demo.html, replacing the text REPLACE WITH Google Analytics superProxy PUBLIC URL, DATA TABLE FORMAT. Leave quotes around the pasted URL.
Right-click the DataTable (JSON Response) link and copy the URL to your clipboard. The copied link will serve as the dataSouceUrl value in superproxy-demo.html.

Right-click the DataTable (JSON Response) link and copy the URL to your clipboard. The copied link will serve as the dataSouceUrl value in superproxy-demo.html.

  • refreshInterval – you can leave this value the same as the refresh Interval of your superProxy query (in seconds) – e.g., 86400.
  • title – this is the title that will appear above your pie chart, and should describe the data your users are looking at – e.g., Users by Browser.

Save the modified file to your server or local development environment, and load the saved page in a browser.  You should see a rather lovely pie chart:

Your pie chart's data will refresh automatically on the refresh schedule you set in your query.

Your pie chart’s data will refresh automatically based upon the Refresh Interval you specify in your superProxy query and your page’s JavaScript parameters.

That probably seemed like a lot of work just to make a pie chart.  But now that your app is set up, making new charts from your Google Analytics data just involves visiting your App Engine site, scheduling a new query, and referencing that with the Google Visualization API.  To me, the Google superProxy method has three distinct advantages over the simpler OOCharts method:

  • Security – Users won’t be able to view your API Keys by viewing the source of your dashboard’s web page
  • Stability – OOCharts might not be around forever.  For that matter, Google’s free App Engine service might not be around forever, but betting on Google is [mostly] a safe bet
  • Flexibility – You can create a huge range of queries, and test them out easily using the API explorer, and the Google Visualization API has extensive documentation and a fairly active user group from whom to gather advice and examples.

 

Notes

  1. There is a 50,000 request per day limit on the Analytics API.  That sounds like a lot, but it’s surprisingly easy to exceed. Consider creating a dashboard with 10 charts, each making a call to the Analytics API.  Without a service that caches the data, the data is refreshed every time a user loads a page.  After just 5,000 visits to the page (which makes 10 API calls – one for each chart – each time the page is loaded), the API limit is exceeded:  5,000 page loads x 10 calls per page = 50,000 API requests.
  2. You can use pre-built OOCharts Queries – https://app.oocharts.com/mc/query/list – to hide your profile ID (but not your API Key). There are many ways to minify and obfuscate client-side JavaScript to make it harder to read, but it’s still pretty much accessible to someone who wants to get it

LibraryQuest Levels Up

Almost a year ago, GVSU Libraries launched LibraryQuest, our mobile quest-based game. It was designed to teach users about library spaces and services in a way that (we hoped) would be fun and engaging.  The game was released “into the wild” in the last week of August, 2013, which is the beginning of our fall semester.  It ran continuously until late November, shortly after midterms (we wanted to end early enough in the semester that we still had students on campus for post-game assessment efforts).  For details on the early development of the game, take a look at my earlier ACRL TechConnect post.  This article will focus on what happened after launch.

Screenshot of one of the quests.

A screenshot from one of our simpler quests as it was being played.

Running the Game

Once the app was released, we settled on a schedule that would put out between three to five new quests each month the game ran.  Designing quests is very time intensive, and 3-5 a month was all we could manage with the man and woman power we had available. We also had short duration quests run at random intervals to encourage students to keep checking the app.  Over the course of the game, we created about 30 quests total.  Almost all quests were designed with a specific educational objective in mind, such as showing students how a specific library system worked or where something or someone was in the physical building.  Quests were chiefly designed by our Digital Initiatives Librarian (me) with help and support from our implementation team and other staff in the library as needed.

For most of the quests, we developed quest write-up sheets like this one:  Raiders of the Lost…Bin.  The sheets detailed the name of the quest, points, educational objective, steps, completion codes, and any other information that defined the quest.  These sheets proved invaluable whenever a staff member needed to know something about a quest, which was often.  Even simple quests like the one above required a fair amount of cooperation and coordination.  For the raiders quest, we needed a special cataloging record created, we had to tag several plastic crowns and get them into our automatic storage and retrieval system.

For every quest players completed, they earned points.  For every thirty points they earned, they were entered once in a drawing to win an iPad.  This was a major component of the game’s advertising, since we imagined it would be the biggest draw to play (and we may not have been right, as you’ll see).  Once the game closed in November, we held the drawing, publicized the winner, and then commenced a round of post-game assessment.

Thank You for Playing: Post-Game Assessment

When the game wrapped in mid-November, we took some time to examine the statistics the game had collected.  One of our very talented design students created a game dashboard that showed all the metrics collected by the game database in graphic form. The final tally of registered players came in at 397. That means 397 people downloaded the app and logged in at least once (in case you’re curious, the total enrollment of GVSU is 25,000 students). This number probably includes a few non-students (since anyone could download the app), but we did some passes throughout the life of the game to remove non-student players from the tally and so feel confident that the vast majority of registered players are students. During development, we set a goal of having at least 300 registered players, based mostly on the cost of the game and how much money we had spent on other outreach efforts.  So we did, technically, meet that goal, but a closer examination of the numbers paints a more nuanced picture of student participation.

A screenshot of the LibraryQuest Dashboard

A detail from our game dashboard. This part shows overall numbers and quest completion dates.

Of the registered players, 173 earned points, meaning they completed at least one quest.  That means that 224 players downloaded the app and logged in at least once, but then failed to complete any quest content.  Clearly, getting players to take the first step and get involved in the game was somehow problematic.  There are any number of explanations for this, including encounters with technical problems that may have turned players off (the embedded QR code scanner was a problem throughout the life of the game), an unwillingness to travel to locations to do physical quests, or something else entirely.  The maximum number of points you could earn was 625, which was attained by one person, although a few others came close. Players tended to cluster at the lower and middle of the point spectrum, which was entirely expected.  Getting the maximum number of points required a high degree of dedication, since it meant paying very close attention to the app for all the temporary, randomly appearing quests.

A shot of a different aspect of the LibraryQuest Game dashboard.

Another detail from the game dashboard, showing acceptance and completion metrics for some of the quests. We used this extensively to determine which quests to retire at the end of the month.

In general, online-only quests were more popular than quests involving physical space, and were taken and completed more often.  Of the top five most-completed quests, four are online-only.  There are a number of possible explanations for this, including the observation offered by one of our survey recipients that possibly a lot of players were stationed at our downtown campus and didn’t want to travel to our Allendale campus, which is where most of the physical quests were located.

Finally, of our 397 registered users, only 60 registered in the second semester the game ran.  The vast majority signed up soon after game launch, and registrations tapered off over time.  This reinforced data we collected from other sources that suggested the game ran for too long and the pacing needed to be sped up.

In addition to data collected from the game itself, we also put out two surveys over the course of the game. The first was a mid-game survey that asked questions about quest design (what quests students liked or didn’t like, and why).  Responses to this survey were bewilderingly contradictory. Students would cite a quest as their favorite, while others would cite the exact same quest as their least favorite (and often for the same reasons).  The qualitative post-game evaluation we did provides some possible explanation for this (see below).  The second survey was a simple post-game questionnaire that asked whether students had enjoyed the game, whether they’d learned something, and if this was something we should continue doing.  We also asked if they had learned anything, and if so, what they had learned.  90% of the respondents to this survey indicated that they had learned something about the library, that they thought this was a good idea, and that it was something we should do again.

Finally, we offered players points and free coffee to come in to the library and spend 15-20 minutes talking to us about their experience playing the game.  We kept questions short and simple to keep within the time window.  We asked about overall impressions of the game, if the students would change anything, if they learned anything (and if so, what) and about what quests they liked or didn’t like and why.  The general tone of the feedback was very positive. Students seemed intrigued by the idea and appreciated that the library was trying to teach in nontraditional, self-directed ways.  When asked to sum up their overall impressions of the game, students said things like “Very well done, but could be improved upon”or “good but needs polish,” or my personal favorite: “an effective use of bribery to learn about the library.”

One of the things we asked people about was whether the game had changed how they thought about the library. They typically answered that it wasn’t so much that the game had changed how they thought about the library so much as it changed the way they thought about themselves in relation to it. They used words like “”aware,” “confident,” and “knowledgeable.”  They felt like they knew more about what they could do here and what we could do for them. Their retention of some of the quest content was remarkable, including library-specific lingo and knowledge of specific procedures (like how to use the retrieval system and how document delivery worked).

Players noted a variety of problems with the game. Some were technical in nature. The game app takes a long time to load, likely because of the way the back-end is designed. Some of them didn’t like the facebook login.  Stability on android devices was problematic (this is no surprise, as developing the android version was by far the more problematic part of developing the app).  Other problems were nontechnical, including quest content that didn’t work or took too long (my own lack of experience designing quests is to blame), communication issues (there’s no way to let us know when quest content doesn’t work), the flow and pacing of new quests (more content faster), and marketing issues.  These problems may in part account for the low on boarding numbers in terms of players that actually completed content.

They also had a variety of reasons for playing. While most cited the iPad grand prize as the major motivator, several of them said they wanted to learn about the library or were curious about the game, and that they thought it might be fun. This may explain differing reactions to the quest content survey that so confused me.  People who just wanted to have fun were irked by quests that had an overt educational goal.  Students who just wanted the iPad didn’t want to do lengthy or complex quests. Students who loved games for the fun wanted very hard quests that challenged them.  This diversity of desire is something all game developers struggle to cope with, and it’s a challenge for designing popular games that appeal to a wide variety of people.

Where to Go from Here

Deciding whether or not Library Quest has been successful depends greatly on what angle you look at the results from.  On one hand, the game absolutely taught people things.  Students in the survey and interviews were able to list concrete things they knew how to do, often in detail and using terminology directly from the game.  One student proudly showed us a book she had gotten from ILL, which she hadn’t known how to use before she played.  On the other hand, the overall participation was low, especially when contrasted against the expense and staff time of creating and running the game.  Looking only at the money spent (approximately $14,700), it’s easy to calculate an output of about $85 per student reached (173 with points) in development, prizes, and advertising.  The challenge is creating engaging games that are appealing to a large number of students in a way that’s economical in terms of staff time and resources.

After looking at all of this data and talking to Yeti CGI, our development partners, we feel there is still a great deal for us to learn here, and the results are promising enough that we should continue to experiment.  Both organizations feel there is still a great deal to learn about making games in physical space and that we’ve just scratched the surface of what we might be able to do.  With the lessons we have learned from this round of the game, we are looking to completely redesign the way the game app works, as well as revise the game into a shorter, leaner experience that does not require as much content or run so long.   In addition, we are seeking campus partners who would be interested in using the app in classes, as part of student life events, or in campus orientation.  Even if these events don’t directly involve the library, we can learn from the experience how to design better quest content that the library can use.  Embedding the app in smaller and more fixed events helps with marketing and cost issues.

Because the app design is so expensive, we are looking into the possibility of a research partnership with Yeti CGI.  We could both benefit from learning more about how mobile gaming works in a physical space, and sharing those lessons would get us Yeti’s help rebuilding the app as well as working with us to figure out content creation and pacing, without another huge outlay of development capital. We are also looking at ways to turn the game development itself into an educational opportunity. By working with our campus mobile app development lab, we can provide opportunities for GVSU students to learn app design.  Yeti is looking at making more of the game’s technical architecture open (for example, we are thinking about having all quest content marked up in XML) so that students can build custom interfaces and tools for the game.

Finally, we are looking at grants to support running and revising the game.  Our initial advertising and incentives budgets were very low, and we are curious to see what would happen if we put significant resources into those areas.  Would we see bigger numbers?  Would other kinds of rewards in addition to the iPad (something asked for by students) entice players into completing more quest content?  Understanding exactly how much money needs to be put into incentives and advertising can help quantify the total cost of running a large, open game for libraries, which would be valuable information for other libraries contemplating running large-scale games.

Get the LibraryQuest App: (iPhoneAndroid)

 

About our Guest Author:
Kyle Felker is the Digital Initiatives Librarian at Grand Valley State University Libraries, where he has worked since February of 2012.  He is also a longtime gamer.  He can be reached at felkerk@gvsu.edu, or on twitter @gwydion9.


A Short and Irreverently Non-Expert Guide to CSS Preprocessors and Frameworks

It took me a long time to wrap my head around what a CSS preproccessor was and why you might want to use one. I knew everyone was doing it, but for the amount of CSS I was doing, it seemed like overkill. And one more thing to learn! When you are a solo web developer/librarian, it’s always easier to clutch desperately at the things you know well and not try to add more complexity. So this post is for those of you are in my old position. If you’re already an expert, you can skip the post and go straight to the comments to sell us on your own favorite tools.

The idea, by the way, is that you will be able to get one of these up and running today. I promise it’s that easy (assuming you have access to install software on your computer).

Ok, So Why Should I Learn This?

Creating a modern and responsive website requires endless calculations, CSS adjustments, and other considerations that it’s not really possible to do it (especially when you are a solo web developer) without building on work that lots of other people have done. Even if you aren’t doing anything all that sophisticated in your web design, you probably want at minimum to have a nicely proportioned columnar layout with colors and typefaces that are easy to adapt as needed. Bonus points if your site is responsive so it looks good on tablets and phones. All of that takes a lot of math and careful planning to accomplish, and requires much attention to documentation if you want others to be able to maintain the site.

Frameworks and preprocessors take care of many development challenges. They do things like provide responsive columnar layouts that are customizable to your design needs, provide “mixins” of code others have written, allow you to nest elements rather than writing CSS with selectors piled on selectors, and create variables for any elements you might repeat such as typefaces and colors. Once you figure it out, it’s pretty addictive to figure out where you can be more efficient in your CSS. Then if your institution switches the shade of red used for active links, H1s, and footer text, you only have to change one variable to update this rather than trying to find all the places that color appears in your stylesheets.

What You Should Learn

If I’ve convinced you this is something worth putting time into, now you have to figure out where you should spend that time.This post will outline one set of choices, but you don’t need to stick with this if you want to get more involved in learning more.

Sometimes these choices are made for you by whatever language or content management system you are using, and usually one will dictate another. For instance, if you choose a framework that uses Sass, you should probably learn Sass. If the theme you’ve chosen for your content management system already includes preprocessor files, you’ll save yourself lots of time just by going with what that theme has chosen. It’s also important to note that you can use a preprocessor with any project, even just a series of flat HTML files, and in that case definitely use whatever you prefer.

I’m going to point out just a few of each here, since this is basically what I’ve used with which I am familiar. There are lots and lots of all of these things out there, and I would welcome people with specific experience to describe it in the comments.

Before you get too scared about the languages these tools use, it’s really only relevant to beginners insofar as you have to have one of these languages installed on your system to compile your Sass or Less files into CSS.

Preprocessors

Sass was developed in 2006. It can be written with Sass or SCSS syntax. Sass runs on Ruby.

Less was developed in 2009. It was originally written in Ruby, but now is in Javascript.

A Completely Non-Comprehensive List of CSS Frameworks

Note! You don’t have to use a framework for every project. If you have a two column layout with a simple breakpoint and don’t need all the overhead, feel free to not use one. But for most projects, frameworks provide things like responsive layouts and useful features (for instance tabs, menus, image galleries, and various CSS tricks) you might want to build in your site without a lot of thought.

  • Compass (runs on Sass)
  • Bootstrap (runs on Less)
  • Foundation (runs on Sass)
  • There are a million other ones out there too. Google search for css frameworks gets “About 4,860,000 results”. So I’m not kidding about needing to figure it out based on the needs of your project and what preprocessor you prefer.
A Quick and Dirty Tutorial for Getting These Up and Running

Let’s try out Sass and Compass. I started working with them when I was theming Omeka, and I run them on my standard issue work Windows PC, so I know you can do it too!

Ingredients:

Stir to combine.

Ok it’s not quite that easy. But it’s not a lot harder.

  1. Sass has great installation instructions. I’ve always run it from the command line without a fuss, and I don’t even particularly care for the command line. So feel free to follow those instructions.
  2. If you don’t have Ruby installed already (likely on a standard issue work Windows PC–it is already installed on Macs), install it. Sass suggests RubyInstaller, and I second that suggestion. There are some potential confusing or annoying things about doing it this way, but don’t worry about it until one pops up.
  3. If you are on Windows, open up a new command line window by typing cmd in your start search bar. If you are on a Mac, open up your Terminal app.
  4. Now all you have to do is install Sass with one line of code in your command line client/terminal window: gem install sass. If you are on a Mac, you might get an error message and  have to use the command sudo gem install sass.
  5. Next install Compass. You’ve got it, just type gem install compass.
  6. Read through this tutorial on Getting Started with Sass and Compass. It’s basically what I just said, but will link you out to a bunch of useful tutorials.
  7. If you decide the command line isn’t for you, you might want to look into other options for installing and compiling your files. The Sass installation instructions linked above give a few suggestions.
Working on a Project With These Tools

When you start a project, Compass will create a series of files Sass/SCSS files (you choose which syntax) that you will edit and then compile into CSS files. The easiest way to get started is to head over to the Compass installation page and use their menu to generate the command line text you’ll need to get started.

For this example, I’ll create a new project called acrltest by typing compass create acrltest. In the image below you’ll see what this looks like. This text also provides some information about what to do next, so you may want to copy it to save it for later.

compass1

You’ll now have a new directory with several recommended SCSS files, but you are by no means limited to these files. Sass provides the @import command, which imports additional SCSS files, which can either be full CSS files, or “partials”, which allow you to define styles without creating an additional CSS file. Partials start with an underscore. The normal practice is to have a partial _base.scss file, where you identify base styles. You will then import these to your screen.scss file to use whatever you have in that file.

Here’s what this looks like.

On _base.scss, I’ve created two variables.

$headings: #345fff;
$links: #c5ff34;

Now on my screen.scss file, I’ve imported my file and can use these variables.

@import "base";

a {
    color: $links;
   }

 h1 {
    color: $headings;
 }

Now this doesn’t do anything yet since there are no CSS files that you can use on the web. Here’s how you actually make the CSS files.

Open up your command prompt or terminal window and change to the directory your Compass project is in. Then type compass watch, and Compass will compile your SCSS files to CSS. Everytime you save your work, the CSS will be updated, which is very handy.

compass2

Now you’ll have something like this in the screen.css file in the stylesheets directory:

 

/* line 9, ../sass/screen.scss */
a {
  color: #c5ff34;
}

/* line 13, ../sass/screen.scss */
h1 {
  color: #345fff;
}

Note that you will have comments inserted telling you where this all came from.

Next Steps

This was an extremely basic overview, and there’s a lot more I didn’t cover–one of the major things being all the possibilities that frameworks provide. But I hope you start to get why this can ultimately make your life easier.

If you have a little experience with this stuff, please share it in the comments.