#libtechgender: A Post in Two Parts

Conversations about gender relations, bias, and appropriate behavior have bubbled up all over the technology sector recently. We have seen conferences adopt codes of conduct that strive to create welcoming atmospheres. We have also seen cases of bias and harassment, cases that may once have been tolerated or ignored, now being identified and condemned. These conversations, like gender itself, are not simple or binary, but being able to listen respectfully and talk honestly about uncomfortable topics offers hope that positive change is possible.

On October 28th Sarah Houghton, the director of the San Rafael Public Library, moderated a panel on gender in library technology at the Internet Librarian conference. In today’s post I’d like to share my small contributions to the panel discussion that day and also to share how my understanding of the issues changed after the discussion there. It is my hope that more conversations—more talking and more listening—about gender issues in library technology will be sparked by this start.

Part I: Internet Librarian Panel on Gender

Our panel’s intent was to invite librarians into a public conversation about gender issues. In the Internet Librarian program our invitation read:

Join us for a lively panel and audience discussion about the challenges of gender differences in technology librarianship. The topics of fairness and bias with both genders have appeared in articles, blogs, etc and this panel of women and men who work in libraries and gender studies briefly share personal experiences, then engage the audience about experiences and how best to increase understanding between the genders specifically in the area of technology work in librarianship. 1
Panelists: Sarah Houghton, Ryan Claringbole, Emily Clasper, Kate Kosturski, Lisa Rabey, John Bultena, Tatum Lindsay, and Nicholas Schiller

My invitation to participate on the panel stemmed from blog posts I wrote about how online conversations about gender issues can go off the rails and become disasters. I used my allotted time to share some simple suggestions I developed while observing these conversations. Coming from my personal (white cis straight male) perspective, I paid attention to things that I and my male colleagues do and say that result in unintended offense, silencing, and anger in our female colleagues. By reverse engineering these conversational disasters, I attempted to learn from unfortunate mistakes and build better conversations. Assuming honest good intentions, following these suggestions can help us avoid contention and build more empathy and trust.

  1. Listen generously. Context and perspective are vital to these discussions. If we’re actively cultivating diverse perspectives then we are inviting ideas that conflict with our assumptions. It’s more effective to assume these ideas come from unfamiliar but valid contexts than to assume they are automatically unreasonable. By deferring judgement until after new ideas have been assimilated and understood we can avoid silencing voices that we need to hear.
  2. Defensive responses can be more harmful than offensive responses. No one likes to feel called on the carpet, but the instinctive responses we give when we feel blamed or accused can be worse than simply giving offense. Defensive denials can lead to others feeling silenced, which is much more damaging and divisive than simple disagreement. It can be the difference between communicating “you and I disagree on this matter” and communicating “you are wrong and don’t get a voice in this conversation.”
  3. It is okay to disagree or to be wrong. Conversations about gender are full of fear. People are afraid to speak up for fear of reprisal. People are afraid to say the wrong thing and be revealed as a secret misogynist. People are afraid. The good news is that conversations where all parties feel welcome, respected, and listened to can be healing. Because context and perspective matter so much in how we address issues, once we accept the contexts and perspectives of others, we are more likely to receive acceptance of our own perspectives and contexts. Given an environment of mutual respect and inclusion, we don’t need to be afraid of holding unpopular views. These are complex issues and once trust is established, complex positions are acceptable.

This is what I presented at the panel session and I still stand behind these suggestions. They can be useful tools for building better conversations between people with good intentions. Specifically, they can help men in our field avoid all-too-common barriers to productive conversation.

That day I listened and learned a lot from the audience and from my fellow panelists. I shifted my priorities. I still think cultivating better conversations is an important goal. I still want to learn how to be a better listener and colleague.  I think these are skills that don’t just happen, but need to be intentionally cultivated. That said, I came into the panel believing that the most important gender-related issue in library technology was finding ways for well-intentioned colleagues to communicate effectively about an uncomfortable problem. Listening to my colleagues tell their stories, I learned that there are more direct and pressing gender issues in libraries.

Part II: After the Panel

As I listened to my fellow panelists tell their stories and then as I listened to people in the audience share their experiences, no one else seemed particularly concerned about well-intentioned people having misunderstandings or about breakdowns in communication. Instead, they related a series of harrowing personal experiences where men (and women, but mostly men) were directly harassing, intentionally abusive, and strategically cruel in ways that are having a very large impact on the daily work, career paths, and quality of life of my female colleagues. I had assumed that, since this kind of harassment clearly violates standard HR policies, the problem was adequately addressed by existing administrative procedures. That assumption is incorrect.

It is easy to ignore what we don’t see: I don’t see harassment taking place in libraries, and I don’t often hear it discussed, so it has been easy to underestimate its prevalence and the impact it has on many of my colleagues. Listening to librarians tell their stories changed that.

Then, after the conference one evening, a friend of mine was harassed on the street and I had another assumption challenged. It happened quickly: a stranger on the street harassed my friend while I watched in stunned passivity. 2 I arrived at the conference feeling positive about my grasp of the issues and also feeling confident about my role as an ally. I left feeling shaken and doubting both my thoughts and my actions.

In response to the panel and its aftermath, I’ve composed three more points to reflect what I learned. These aren’t suggestions like the ones I brought to the panel; instead they are realizations or statements. I’m obviously not an expert on the topic and I’m not speaking from a seat of authority. I’m relating stories and experiences told by others, and they tell them much better than I do. In the tradition of geeks and hackers, now that I have learned something new I’m sharing it with the community in hopes that my experience moves the project forward. It is my hope that better informed and more experienced voices will take this conversation farther than I am able to. These three realizations may be obvious to some, but because they were not obvious to me, it seems useful to clearly articulate them.

  1. Intentional and targeted harassment of women is a significant problem in the library technology field. While subtle microaggressions, problem conversations, and colleagues who deny that significant gender issues exist in libraries are problematic, these issues are overshadowed by direct and intentional harassing behavior targeting gender identity or sex. The clear message I heard at the panel was that workplace harassment is a very real and very current threat to women working in library technology fields.
  2. This harassment is not visible to those not targeted by it. It is easy to ignore what we do not see. Responses to the panel included many women in library technology sharing their experiences and commenting that it was good to hear others’ stories. Even though the experience of workplace harassment was common, those who spoke of it reported feelings of isolation. While legislation and human resources policies clearly state harassment is unacceptable and unlawful, it still happens, and when it happens the target can be isolated by the experience. Those of us who participate in library conferences, journals, and online communities can help pierce this isolation by cultivating opportunities to talk about these issues openly and publicly. By publicly talking about gender issues, we can thwart isolation and make the problems more visible to those who are not direct targets of harassment.
  3. This is a cultural problem, not only an individual problem. While no one point on the gender spectrum has a monopoly on either perpetrating or being the target of workplace harassment, the predominant narrative in our panel discussion was men harassing women. Legally speaking, these need to be treated as individual acts, but as a profession, we can address the cultural aspects of the issue. Something in our library technology culture is fostering an environment where women are systematically exposed to bad behavior from men.

In the field of Library Technology, we spend a lot of our time and brain power intentionally designing user experiences and assessing how users interact with our designs. Because harassment of some of our own is pervasive and cultural, I suggest we turn the same attention and intentionality to designing a workplace culture that is responsive to the needs of all of us who work here. I look forward to reading conference presentations, journal articles, and online discussions where these problems are publicly identified and directly addressed rather than occurring in isolation or being ignored.

  1. infotoday.com/il2013/Monday.asp#TrackD
  2. I don’t advocate a macho confrontational response, nor do I take responsibility for the actions of others, but an ally has their friend’s back and that night I did not speak up.

Building a Dynamic Image Display with Drupal & Isotope

I am in love with Isotope.  It’s not often that you hear someone profess their love for a jQuery library (unless it’s this), but there it is.  I want to display everything in animated grids.

I also love Views Isotope, a Drupal 7 module that enabled me to create a dynamic image gallery for our school’s Year in Review.  This module (paired with a few others) is instrumental in building our new digital library.

[Screenshot: the Year in Review page]

In this blog post, I will walk you through how we created the Year in Review page, and how we plan to extrapolate the design to our collection views in the Knowlton Digital Library.  This post assumes you have some basic knowledge of Drupal, including an understanding of content types, taxonomy terms and how to install a module.

Year in Review Project

Our Year in Review project began over the summer, when our communications team expressed an interest in displaying the news stories from throughout the school year in an online, interactive display.  The designer on our team showed me several examples of card-like interfaces, emphasizing the importance of ease and clean graphics.  After some digging, I found Isotope, which appeared to be the exact solution we needed.  Isotope, according to its website, assists in creating “intelligent, dynamic layouts that can’t be achieved with CSS alone.”  This jQuery library provides for the display of items in a masonry or grid-type layout, augmented by filters and sorting options that move the items around the page.

At first, I was unsure we could make this library work with Drupal, the content management system we employ for our main web site and our digital library.  Fortunately I soon learned – as with many things in Drupal – there’s a module for that.  The Views Isotope module provides just the functionality we needed, with some tweaking, of course.

We set out to display a grid of images, each representing a news story from the year.  We wanted to allow users to filter those news stories based on each of the sections in our school: Architecture, Landscape Architecture and City and Regional Planning.  News stories might be relevant to one, two or all three disciplines.  The user can see the news story title by hovering over the image, and read more about the news story by clicking on the corresponding item in the grid.

Views Isotope Basics

Views Isotope is installed in the same way as other Drupal modules.  There is an example in the module and there are also videos linked from the main module page to help you implement this in Views.  (I found this video particularly helpful.)

You must have the Views module (and its dependencies) installed to use Views Isotope, along with the Libraries module, which loads the Isotope library from your site’s libraries directory.

You also need to install the Isotope jQuery library.  It is important to note that Isotope is only free for non-commercial projects.  To install the library, download the package from the Isotope GitHub repository.  Unzip the package and copy the whole directory into your libraries directory.  Within your Drupal installation, this should be in the /sites/all/libraries folder.  Once the module and the library are both installed, you’re ready to start.
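After unzipping, the library should end up at a path like the one below (the script file name here is an assumption based on the version 1 release of Isotope; check what your download actually contains):

/sites/all/libraries/isotope/jquery.isotope.min.js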

If you have used Drupal, you have likely used Views.  It is a very common way to query the underlying database in order to display content.  The Views Isotope module provides additional View types: Isotope Grid, Isotope Filter Block and Isotope Sort Block.  These three view types combine to provide one display.  In my case, I have not yet implemented the Sort Block, so I won’t discuss it in detail here.

To build a new view, go to Structure > Views > Add a new view.  We’ll talk about the steps in more detail in our specific example below.  However, there are a few important tenets of using Views Isotope, regardless of your setup:

  1. There is a grid.  The View type Isotope Grid powers the main display.
  2. The field on which we want to filter is included in the query that builds the grid, but a CSS class is applied to it that hides it in the grid display and exposes it only as a filter value.
  3. The Isotope Filter Block drives the filter display.  Again, a CSS class is applied to the fields in the query to assign the appropriate display and functionality, instead of using default classes provided by Views.
  4. Frequently in Drupal, we are filtering on taxonomy terms.  It is important that when we display these items we do not link to the taxonomy term page, so that a click on a term filters the results instead of taking the user away from the page.

With those basic tenets in mind, let’s look at the specific process of building the Year in Review.

Building the Year in Review

Armed with the Views Isotope functionality, I started with our existing Digital Library Drupal 7 instance and one content type, Item.  Items are our primary content type and contain many, many fields, but here are the important ones for the Year in Review:

  • Title: text field containing the headline of the article
  • Description: text field containing the shortened article body
  • File: File field containing an image from the article
  • Item Class: A reference to a taxonomy term indicating if the item is from the school archives
  • Discipline: Another term reference field which ties the article to one or more of our disciplines: Architecture, Landscape Architecture or City and Regional Planning
  • Showcase: Boolean field which flags the article for inclusion in the Year in Review

The last field was essential so that the communications team liaison could curate the page.  There are more news articles in our school archives than we necessarily want to show in the Year in Review, and the showcase flag solves this problem.

In building our Views, we first wanted to pull all of the Items which have the following characteristics:

  • Item Class: School Archives
  • Showcase: True

So, we build a new View.  While logged in as administrator, we click on Structure, then Views, then Add a new view.  We want to show Content of type Item, and display an Isotope Grid of fields.  We do not want to use a pager.  In this demo, I’m going to build a Page View, but a Block works as well (as we will see later).  So my settings appear as follows:

[Screenshot: settings for the new Isotope Grid view]

Click on Continue & edit.  For the Year in Review we next needed to add our filters – for Item Class and Showcase.  Depending on your implementation, you may not need to filter the results, but likely you will want to narrow the results slightly.  Next to Filter Criteria, click on Add.

I first searched for Item Class, then clicked on Apply.

Next, I selected a value for Item Class and clicked on Apply.

I repeated the process with the Showcase field.


If you click Update Preview at the bottom of the View edit screen, you’ll see that much of the formatting is already done with just those steps.

[Screenshot: preview of the grid display]

Note that the formatting in the image above is helped along by some CSS.  To style the grid elements, the Views Isotope module contains its own CSS in the module folder ([drupal_install]/sites/all/modules/views_isotope).  You can move forward with this default display if it works for your site.  Or, you can override this in the site’s theme files, which is what I’ve done above.  In my theme CSS file, I have applied the following styling to the class “isotope-element”:

.isotope-element {
    float: left;
    height: 140px;
    margin: 6px;
    overflow: hidden;
    position: relative;
    width: 180px;
}
I put the above code in my CSS file associated with my theme, and it overrides the default Views Isotope styling.  “isotope-element” is the class applied to the div which contains all the fields being displayed for each item.  Let’s add a few more fields and see how the rendered HTML looks.
First, I want to add an image.  In my case, all of my files are fields of type File, and I handle the rendering through Media based on file type.  But you could use any image field, also.


I use the Rendered File Formatter and select the Grid View Mode, which applies an Image Style to the file, resizing it to 180 x 140.  Clicking Update Preview again shows that the image has been added to each item.

[Screenshot: grid preview with images and titles]

This is closer, but in our specific example, we want to hide the title until the user hovers over the item.  So, we need to add some CSS to the title field.


In my CSS file, I have the following:

.isotope-grid-text {
    background: none repeat scroll 0 0 #4D4D4F;
    height: 140px;
    left: 0;
    opacity: 0;
    position: absolute;
    top: 0;
    width: 100%;
    z-index: 20;
}

Note the opacity is 0 – which means the div is transparent, allowing the image to show through.  Then, I added a hover style which just changes the opacity to mostly cover the image:

.isotope-grid-text:hover {
  opacity: 0.9;
}

Now, if we update preview, we should see the changes.

[Screenshot: the title overlay appearing on hover]

The last thing we need to do is add the Discipline field for each item so that we can filter.

There are two very important things here.  First, we want to make sure that the field is not formatted as a link to the term, so we select Plain text as the Formatter.

Second, we need to apply a CSS class here as well, so that the Discipline fields show in filters, not in the grid.  To do that, check the Customize field HTML and select the DIV element.  Then, select Create a class and enter “isotope-filter”.  Also, uncheck “Apply default classes.”  Click Apply.

Using Firebug, I can now look at the generated HTML from this View and see that the isotope-element <div> contains all the fields for each item, though the isotope-filter class keeps Discipline hidden in the grid.

<div class="isotope-element landscape-architecture" data-category="landscape-architecture">
  <div class="views-field views-field-title"> (collapsed for brevity) </div>
  <div class="views-field views-field-field-file"> (collapsed for brevity) </div>
  <div>  
    <div class="isotope-filter">Landscape Architecture</div>
  </div>
</div>

You might also notice that the data-category for this element is assigned as landscape-architecture, which is our Discipline term for this item.  This data-category will drive the filters.
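Under the hood, the filter links trigger Isotope’s filtering call against those classes.  Views Isotope wires this up for you, so the following is only a sketch of what happens when a user clicks the Landscape Architecture filter (the container selector here is an assumption; check the module’s JavaScript for the real one), using the version 1 Isotope API:

// Hide every grid item that lacks the selected class, then reflow the rest.
$('.isotope-container').isotope({ filter: '.landscape-architecture' });

Because each item carries its term as a class, filtering is just a matter of swapping the selector passed to the filter option.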

So, let’s save our View by clicking Save at the top and move on to create our filter block.  Create a new view, but this time create a block which displays taxonomy terms of type Discipline.  Then, click on Continue & Edit.

The first thing we want to do is adjust the view so that the default row wrappers are not applied.  Note: this is the part I ALWAYS forget, and then when my filters don’t work it takes me forever to track it down.

Click on Settings next to Fields.

Uncheck “Provide default field wrapper elements.”  Click Apply.


Next, we do not want the fields to be links to term pages, because a user click should filter the results, not link back to the term.  So, click on the term name to edit that field.  Uncheck the box next to “Link this field to its taxonomy term page”.  Click on Apply.


Save the view.

The last thing is to make the block appear on the page with the grid.  In practice, Drupal administrators would use Panels or Context to accomplish this (we use Context), but it can also be done using the Blocks menu.

So, go to Structure, then click on Blocks.  Find our Isotope-Filter Demo block.  Because it’s a View, the title will begin with “View:”


Click Configure.  Set block settings so that the Filter appears only on the appropriate Grid page, in the region which is appropriate for your theme.  Click save.


Now, let’s visit our /isotope-grid-demo page.  We should see both the grid and the filter list.

[Screenshot: the finished grid page with its filter list]

It’s worth noting that here, too, I have customized the CSS.  If we look at the rendered HTML using Firebug, we can see that the filter list is in a div with class “isotope-options” and the list itself has a class of “isotope-filters”.

<div class="isotope-options">
  <ul class="isotope-filters option-set clearfix" data-option-key="filter">
    <li><a class="" data-option-value="*" href="#filter">All</a></li>
    <li><a class="filterbutton" href="#filter" data-option-value=".architecture">Architecture</a></li>
    <li><a class="filterbutton selected" href="#filter" data-option-value=".city-and-regional-planning">City and Regional Planning</a></li>
    <li><a class="filterbutton" href="#filter" data-option-value=".landscape-architecture">Landscape Architecture</a></li>
  </ul>
</div>

I have overridden the CSS for these classes to remove the background from the filters and change the list-style-type to none, but you can obviously make whatever changes you want.  When I click on one of the filters, it shows me only the news stories for that Discipline.  Here, I’ve clicked on City and Regional Planning.
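For reference, a minimal sketch of those overrides in a theme stylesheet might look like this (your theme’s defaults may differ, so adjust to taste):

.isotope-filters {
    list-style-type: none; /* remove the default list bullets */
}

.isotope-filters a {
    background: none; /* remove the default button background */
}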

[Screenshot: the grid filtered to City and Regional Planning]

Next Steps

So, how do we plan to use this in our digital library going forward?  So far, we have mostly used the grid without the filters, such as in one of our Work pages.  This shows the metadata related to a given work, along with all the items tied to that work.  Eventually, each of the taxonomy terms in the metadata will be a link.  The following grids are all created with blocks instead of pages, so that I can use Context to override the default term or node display.

[Screenshot: a Work page, showing metadata alongside a grid of associated items]

However, in our recently implemented Collection view, we allow users to filter the items based on their type: image, video or document.  Here, you see an example of one of our lecture collections, with the videos and the poster in the same grid, until the user filters for one or the other.

[Screenshot: a lecture collection, with videos and a poster in one filterable grid]

There are two obstacles to using this feature in a more widespread manner throughout the site.  First, I have only recently figured out how to implement multiple filter options.  For example, we might want to filter our news stories by Discipline and Semester.  To do this, we rewrite the sorting fields in our Grid display so that they all display in one field.  Then, we create two Filter blocks, one for each set of terms, as sketched below.  Implementing this across the site so that users can sort by, say, item type and vocabulary term will make it more useful to us.
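To sketch the idea (the class and data-category values here are hypothetical, assuming a Discipline term and a Semester term rewritten into one field), an item in the grid would carry both values at once:

<div class="isotope-element architecture autumn-2013" data-category="architecture autumn-2013">
  ...
</div>

Each filter block then matches against its own vocabulary’s classes, so the two sets of filters can narrow the same grid independently.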

Second, we have several Views that might return upwards of 500 items.  Loading all of the image files for this result set is costly, especially when you add in the additional overhead of a full image loading in the background for a Colorbox overlay and Drupal performance issues.  The filters will not work across pages, so if I use a pager, I will only filter the items on the page I’m viewing.  I believe this can be fixed somehow using Infinite Scroll (as described in several ways here), but I have not tried it yet.

With these two advanced features, there are many options for improving the digital library interface.  I am especially interested in how to use multiple filters on a set of search results returned from a Solr index.

What other extensions might be useful?  Let us know what you think in the comments.


Come to Our Meet-up at #LITAforum!

Are you interested in writing a guest post or becoming a regular contributor to the ACRL TechConnect blog? Or, do you blog about library technology?

Three of the ACRL TechConnect blog authors, Bohyun, Eric, and Margaret, will be at LITA Forum this year.  (Two of us are on the LITA Forum planning committee, and yes, we are very active in LITA as well as in ACRL! :)

So, we decided to have a small meet-up!

Come chat with us about everyday challenges and solutions in library technology over drinks.

This is an informal meet-up, and all are welcome!

TechConnect Meet-up at #LITAforum




Responsibilities For Open Access

In honor of Open Access Week, I want to look at some troubling recent discussions about open access, and what academic librarians who work with technology can do. As the manager of an open access institutional repository, I strongly believe that providing greater access to academic research is a good worth pursuing. But I realize that this comes at a cost, and that we have a responsibility to ensure that open access also means integrity and quality.

On “stings” and quality

By now, the article by John Bohannon in Science has been thoroughly dissected in the blogosphere 1. This was not a study per se, but rather a piece of investigative journalism looking into the practices of open access journals. Bohannon submitted variations on an article, written under African pseudonyms from fake universities, so flawed that “any reviewer with more than a high-school knowledge of chemistry…should have spotted the paper’s short-comings immediately.” Over the course of 10 months, he submitted these articles to 304 open access journals whose names he drew from the Directory of Open Access Journals and Jeffrey Beall’s list of predatory open access publishers. Ultimately 157 of the journals accepted the article and 98 rejected it, when any real peer review would have meant that it was rejected in all cases. It is worth noting that in an analysis of the raw data that Bohannon supplied, some publishers on Beall’s list rejected the paper immediately, which is a good reminder to take all such curation efforts with an appropriate amount of skepticism 2.

There are certainly many methodological flaws in this investigation, which Mike Taylor outlines in detail in his post 3, and which he concludes was specifically aimed at discrediting open access journals in favor of journals such as Science. As Michael Eisen outlines, Science has not been immune to publishing articles that should have been rejected in peer review (Bohannon told Eisen that he had intended to look at a variety of journals but that this was not practical, and that this decision was not informed by editors at Science). Eisen’s conclusion is that “peer review is a joke” and that we need to stop regarding the publication of an article in any journal as evidence that the article is worthwhile 4. Phil Davis at the Scholarly Kitchen took issue with this conclusion (among others noted above), since despite the flaws, this did turn up incontrovertible evidence that “a large number of open access publishers are willfully deceiving readers and authors that articles published in their journals passed through a peer review process…” 5. His conclusion is that open access agencies such as OASPA and DOAJ should be better at policing themselves, and that on the other side Jeffrey Beall should be cautious about suggesting a potential for guilt without evidence.

I think one of the more level-headed responses to this piece comes from outside the library and scholarly publishing world in Steven Novella’s post on Neurologica, a blog focused on science and skepticism written by an academic neurologist. He is a fan of open access and wider access to information, but makes the point familiar to all librarians that the internet creates many more opportunities to distribute both good and bad information. Open access journals are one response to the opportunities of the internet, and author-pays journals in particular, like “all new ‘funding models’,” “have the potential of creating perverse incentives.” Traditional journals fall into the same trap when they rely on impact factor to drive subscriptions, which means they may end up publishing “sexy” studies of questionable validity or failing to publish replication studies, which are the backbone of the scientific method–and in fact the only real way to establish results no matter what type of peer review has been done 6.

More “perverse incentives”

So far the criticisms of open access have revolved around one type of “gold” open access, wherein the author (or a funding agency) pays article publication fees. “Green” open access, in which a version of the article is posted in a repository, is not susceptible to abuse in quite the same way. Yet a new analysis of embargo policies by Shan Sutton shows that some publishers are targeting green open access through new policies. Springer used to have a 12-month embargo for mandated deposit in repositories such as PubMed, but now has extended it to all institutional repositories. Emerald changed its policy so that any mandated deposit to a repository (whether by funder or institutional mandate) was subject to a 24-month embargo 7.

In both cases, paid immediate open access is available for $1,595 (Emerald) or $3,000 (Springer). It seems that the publishers are counting on a “mandate” meaning that funds are available for this sort of hybrid gold open access, but that ignores the philosophy behind such mandates. While federal open access mandates do in theory have a financial incentive–that the public should not have to pay twice for research–Sutton argues that open access “mandates” at institutions are actually voluntary initiatives by the faculty, and provide waivers without question 8. Additionally, while this type of open access does provide public access to the article, it does not address larger issues of reuse of the text or data in the true sense of open access.

What should a librarian do?

The issues above are complex, but there are a few trends that we can draw on to understand our responsibilities to open access. First, there is the issue of quality, both in terms of researcher experience in working with a journal, and that of being able to trust the validity of an individual article. Second, we have to be aware of the terms under which institutional policies may place authors. As with many such problems, the technological issues are relatively trivial. To actually address them meaningfully will not happen with technology alone, but with education, outreach, and network building.

The major thing we can take away from Bohannon’s work is that we have to help faculty authors to make good choices about where they submit articles. Anyone who works with faculty has stories of extremely questionable practices by journals of all types, both open access and traditional. Speaking up about those practices on an individual basis can result in lawsuits, as we saw earlier this year. Are there technical solutions that can help weed out predatory publishers and bad journals and articles? The Library Loon points out that many factors, some related to technology, have meant that both positive and negative indicators of journal quality have become less useful in recent years. The Loon suggests that “[c]reating a reporting mechanism where authors can rate and answer relatively simple questions about their experiences with various journals seems worthwhile.” 9

The comments to this post have some more suggestions, including open peer review and a forum backed by a strong editor that could be a Yelp-type site for academic publisher reputation. I wrote about open peer review earlier this year in the context of PeerJ, and participants in that system did indeed find the experience of publishing in a journal with quick turnarounds and open reviews pleasant. (Bohannon did not submit a fake article to PeerJ). This solution requires that journals have a more robust technical infrastructure as well as a new philosophy to peer review. More importantly, this is not a solution librarians can implement for our patrons–it is something that has to come from the journals.

The idea that seems to be catching on more is the “Yelp” for scholarly publishers. This seems like a good potential solution, albeit one that would require a great deal of coordinated effort to be truly useful. The technical parts of this type of solution would be relatively easy to carry out. But how to ensure that it is useful for its users? The Yelp analog may be particularly helpful here. When it launched in 2004, Yelp asked users who were searching for information some basic questions, and asked them to provide the email addresses of additional people whom they would traditionally have asked for this information. Yelp then emailed those people as well as others with similar searches to get reviews of local businesses to build up its base of information. 10 Yelp took a risk in pursuing content in that way, since it could have been off-putting to potential users. But local business information was valuable enough to early users that they were willing to participate, and this seems like a perfect model to build up a base of information on journal publisher practices.

This helps address the problem of predatory publishers and shifting embargoes, but it doesn’t help as much with the issue of quality assurance for the article content. Librarians teach students how to find articles that claim to be peer reviewed, but long before Bohannon we knew that peer review quality varies greatly, and even when done well tells us nothing about the validity of the research findings. Education about the scholarly communication cycle, the scientific method, and critical thinking skills are the most essential tools to ensure that students are using appropriate articles, open access or not. However, those skills are difficult to bring to bear for even the most highly experienced researchers trying to keep up with a large volume of published research. There are a few technical solutions that may be of help here. Article level metrics, particularly alternative metrics, can aid in seeing how articles are being used. (For more on altmetrics, see this post from earlier this year).

One of the easiest options for article level metrics is the Altmetric.com bookmarklet. This provides article level metrics for many articles with a DOI, or articles from PubMed and arXiv. Altmetric.com offers an API with a free tier to develop your own app. An open source option for article level metrics is PLOS’s Article-Level Metrics, a Ruby on Rails application. These solutions do not guarantee article quality, of course, but hopefully help weed out more marginal articles.
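As a quick sketch, the v1 API is a simple GET request per DOI; a URL of the following form (shown here with a sample DOI) returns a JSON summary of the mentions Altmetric has collected for that article:

http://api.altmetric.com/v1/doi/10.1038/news.2011.490

The free tier is rate-limited, but it is more than enough for experimenting with a small report on your institution’s recent publications.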

No one needs to be afraid of open access

For those working with institutional repositories or other open access issues, it sometimes seems very natural for Open Access Week to fall so near Halloween. But it does not have to be frightening. Taking responsibility for thoughtful use of technical solutions and on-going outreach and education is essential, but can lead to important changes in attitudes to open access and changes in scholarly communication.

 

Notes

  1. Bohannon, John. “Who’s Afraid of Peer Review?” Science 342, no. 6154 (October 4, 2013): 60–65. doi:10.1126/science.342.6154.60.
  2. “Who Is Afraid of Peer Review: Sting Operation of The Science: Some Analysis of the Metadata.” Scholarlyoadisq, October 9, 2013. http://scholarlyoadisq.wordpress.com/2013/10/09/who-is-afraid-of-peer-review-sting-operation-of-the-science-some-analysis-of-the-metadata/.
  3. Taylor, Mike. “Anti-tutorial: How to Design and Execute a Really Bad Study.” Sauropod Vertebra Picture of the Week. Accessed October 17, 2013. http://svpow.com/2013/10/07/anti-tutorial-how-to-design-and-execute-a-really-bad-study/.
  4. Eisen, Michael. “I Confess, I Wrote the Arsenic DNA Paper to Expose Flaws in Peer-review at Subscription Based Journals.” It Is NOT Junk, October 3, 2013. http://www.michaeleisen.org/blog/?p=1439.
  5. Davis, Phil. “Open Access ‘Sting’ Reveals Deception, Missed Opportunities.” The Scholarly Kitchen. Accessed October 17, 2013. http://scholarlykitchen.sspnet.org/2013/10/04/open-access-sting-reveals-deception-missed-opportunities/.
  6. Novella, Steven. “A Problem with Open Access Journals.” Neurologica Blog, October 7, 2013. http://theness.com/neurologicablog/index.php/a-problem-with-open-access-journals/.
  7. Sutton, Shan C. “Open Access, Publisher Embargoes, and the Voluntary Nature of Scholarship: An Analysis.” College & Research Libraries News 74, no. 9 (October 1, 2013): 468–472.
  8. Ibid., 469.
  9. Loon, Library. “A Veritable Sting.” Gavia Libraria, October 8, 2013. http://gavialib.com/2013/10/a-veritable-sting/.
  10. Cringely, Robert. “The Ears Have It.” I, Cringely, October 14, 2004. http://www.pbs.org/cringely/pulpit/2004/pulpit_20041014_000829.html.

An Experiment with Publishing on GitHub

Scholarly publishing, if you haven’t noticed, is nearing a crisis. Authors are questioning the value added by publishers. Open Access publications are growing in number and popularity. Peer review is being criticized and re-invented. Libraries are unable to pay price increases for subscription journals. Traditional measures of scholarly impact and journal rankings are being questioned while new ones are developed. Fresh business models or publishing platforms appear to spring up daily.1

I personally am a little frustrated with scholarly publishing, albeit for reasons not entirely related to the above. I find that most journals haven’t adapted to the digital age yet and thus are still employing editorial workflows and yielding final products suited to print.

How come I have yet to see a journal article PDF with clickable hyperlinks? For that matter, why is PDF still the dominant file format? What advantage does a fixed-width format hold over flexible, fluid-width HTML?2 Why are raw data not published alongside research papers? Why are software tools not published alongside research papers? How come I’m still submitting black-and-white charts to publications which are primarily read online? Why are digital-only publications still bound to regular publication schedules when they could publish like blogs, as soon as the material is ready? To be fair, some journals have answered some of these questions, but the issues are still all too frequent.

So, as a bit of an experiment, I recently published a short research study entirely on GitHub.3 I included the scripts used to generate data, the data, and an article-like summary of the whole process.

What makes it possible

Unfortunately, I wouldn’t recommend my little experiment for most scholars, except perhaps for pre- or post-prints of work published elsewhere. Why? The primary reason people publish research is for tenure review, for enhancing a CV. I won’t list my study—though, arguably, I should be able to—simply because it didn’t go through the usual scholarly publishing gauntlet. It wasn’t peer-reviewed, it didn’t appear in a journal, and it wouldn’t count for much in the eyes of traditional faculty members.

However, I’m at a community college. Research and publication are not among my position’s requirements. I’m judged on my teaching and various library responsibilities, while publications are an unnecessary bonus. Would it help to have another journal article on my CV? Yes, probably. But there’s little pressure and personally I’m more interested in experimentation than in lengthening my list of publications.

Other researchers might also worry about someone stealing their ideas or data if they begin publishing an incomplete project. For me, again, publication isn’t really a competitive field. I would be happy to see someone reuse my project, even if they didn’t give proper attribution back. Openness is an advantage, not a vulnerability.

It’s ironic that being at a non-research institution frees me up to do research. It’s done mostly in my free time, which isn’t great, but the lack of pressure means I can play with modes of publication, or not worry about the popularity of journals I submit to. To some degree, this is indicative of structural problems with scholarly publishing: there’s inertia in that, in order to stay in the game and make a name for yourself, you can’t do anything too wild. You need to publish, and publish in the recognized titles. Only tenured faculty, who after all owe at least some of their success to the current system, can risk dabbling with new publishing models and systems of peer review.

What’s really good

GitHub, and the web more generally, are great mediums for scholarship. They address several of my prior questions.

For one, the web is just as suited to publishing data as text. There’s no limit on file format or (practically) size. Even if I was analyzing millions of data points, I could make a compressed archive available for others to download, verify, and reuse in their own research. For my project, I used a Google Spreadsheet which allows others to download the data or simply view it on the web. The article itself can be published on GitHub Pages, which provides free hosting for static websites.


Here’s how the final study looks when published on GitHub Pages.

While my study didn’t undergo any peer review, it is open for feedback via a pull request or the “issues” queue on GitHub. Typically, peer review is a closed process. It’s not apparent what criticisms were leveled at an article, or what the authors did to address them. Having peer review out in the open not only illuminates the history of a particular article but also makes it easier to see the value being added. Luckily, there are more and more journals with open peer review, such as PeerJ which we’ve written about previously. When I explain peer review to students, I often open up the “Peer Review history” section of a PeerJ article. Students can see that even articles written by professional researchers have flaws which the reviewing process is designed to identify and mitigate.

Another benefit of open peer review, present in publishing on GitHub too, is the ability to link to specific versions of an article. This has at least two uses. First of all, it has historical value in that one can trace the thought process of the researcher. Much like original manuscripts are a source of insight for literary analyses, merely being able to trace the evolution of a journal article enables new research projects in and of itself.

Secondly, as web content can be a moving target as it is revised over time, being able to link to specific versions aids those referencing a work. Linking to a git “commit” (think a particular point in time), possibly using perma.cc or the Internet Archive to store a copy of the project as it existed then, is an elegant way of solving this problem. For instance, at one point I manually removed some data points which were inappropriate for the study I was performing. One can inspect the very commit where I did this, seeing which lines of text were deleted and possibly identifying any mistakes which were made.
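For example, GitHub gives every commit a stable URL; with hypothetical username, repository, and hash values, a citable link looks like this:

https://github.com/username/repository/commit/a1b2c3d

Pairing a link like that with a snapshot in perma.cc or the Internet Archive protects the citation even if the repository is later moved or deleted.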

I’ve also grown tired of typical academic writing. The tendency to value erudite over straightforward language, lengthy titles with the snarky half separated from the actually descriptive half by a colon, the anxiety about the particularities of citations and style manuals; all of these I could do without. Let’s write compelling, truthful content without fetishizing consistency and losing the uniqueness of our voice. I’m not saying my little study achieves much in this regard, but it was a relief to be free to write in whatever manner I found most suitable.

Finally, and most encouraging in my mind, the time to publication of a research project can be greatly reduced with new web-based means. I wrote a paper in graduate school which took almost two years to appear in a peer-reviewed journal; by the time I was given the pre-prints to review, I’d entirely forgotten about it. On GitHub, all delays were solely my fault. While it’s true (you can see so in the project’s history) that the seeds of this project were planted nearly a year ago, I started working in earnest just a few months ago and finished the writing in early October.

What’s really bad

GitHub, while a great company which has reduced the effort needed to use version control with its clean web interface and graphical applications, is not the most universally understood platform. I have little doubt that if I were to publish a study on my blog, I would receive more commentary. For one, GitHub requires an account which only coders or technologists would be likely to have already, while many comment platforms (like Disqus) build off of common social media accounts like Twitter and Facebook. Secondly, while GitHub’s “pull requests” are more powerful than comments in that they can propose changes to the actual content of a project, they’re doubtless less understood as well. Expecting scholarly publishing to suddenly embrace software development methodologies is naive at best.

As a corollary to GitHub’s rather niche appeal, my article hasn’t undergone any semblance of peer review. I put it out there; if someone spots an inaccuracy, I’ll make note of it and address it, but no relevant parties will necessarily critique the work. While peer review has its problems—many intimate with the problems of scholarly publishing at large—I still believe in the value of the process. It’s hard to argue a publication has reached an objective conclusion when only a single pair of eyes has scrutinized it.

Researchers who are afraid of having their work stolen, or of publishing incomplete work which may contain errors, will struggle to accept open publishing models using tools like GitHub. Prof Hacker, in an excellent post on “Forking the Academy”, notes many cultural challenges to moving scholarly publishing towards an open source software model. Scholars may worry that forking a repository feels like plagiarism or goes against the tradition of valuing original work. To some extent, these fears may come more from misunderstandings than genuine problems. Using version control, it’s perfectly feasible to withhold publishing a project until it’s complete and to remove erroneous missteps taken in the middle of a work. Theft is just as possible under the current scholarly publishing model; increasing the transparency and speed of one’s publishing does not give license to others to take credit for it. Unless, of course, one uses a permissive license like the Public Domain.

Convincing academics that the fears above are unwarranted or can be overcome is a challenge that cannot be overstated. In all likelihood, GitHub as a platform will never be a major player in scholarly publishing. The learning curve, both technical and cultural, is simply too great. Rather, a good starting point would be to let the appealing aspects of GitHub—versioning, pull requests, issues, granular attribution of authorship at the commit level—inform the development of new, user-friendly platforms with final products that more closely resemble traditional journals. Prof Hacker, again, goes a long way towards developing this with a wish list for a powerful collaborative writing platform.

What about the IR?

The discoverability of web publications is problematic. While I’d like to think my research holds value for others’ literature reviews, it’s never going to show up while searching in a subscription database. It seems unreasonable to ask researchers, who already look in many places to compile complete bibliographies, to add GitHub to their list of commonly consulted sources. Further fracturing the scholarly publishing environment not only inconveniences researchers but it goes against the trend of discovery layers and aggregators (e.g. Google Scholar) which aim to provide a single search across multiple databases.

On the other hand, an increasing amount of research—from faculty and students alike—is conducted through Google, where GitHub projects will appear alongside pre-prints in institutional repositories. Simply being able to tweet out a link to my study, which is readable on a smartphone and easily saved to any read-it-later service, likely increases its readership over stodgy PDFs sitting in subscription databases.

Institutional repositories solve some, but not all, of the deficiencies of publishing on GitHub. Discoverability is increased because researchers at your institution may search the IR just like they do subscription databases. Furthermore, thanks to the Open Archives Initiative and the OAI-PMH standard, content can be aggregated from multiple IRs into larger search engines like OCLC’s OAIster. However, none of the major IR software players support versioned publication. Showing work-in-progress, linking to specific points in time of a work, and allowing for easy reuse are all lost in the IR.
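For comparison, OAI-PMH harvesting is a plain HTTP request; an aggregator pulls records from a hypothetical repository with a URL like this:

http://repository.example.edu/oai?verb=ListRecords&metadataPrefix=oai_dc

The protocol exposes each record’s current metadata, but nothing like the commit-level history that makes the GitHub approach appealing.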

Every publication in its place

As I’ve stated, publishing independently on GitHub isn’t for everyone. It’s not going to show up on your CV and it’s not necessarily going to benefit from the peer review process. But plenty of librarians are already doing something similar, albeit a bit less formal: we’re writing blog posts with original research or performing quick studies at our respective institutions. It’s not a great leap to put these investigations under version control and then publish them on the web. GitHub could be a valuable complement to more traditional venues, reducing the delay between when data is collected and when it’s available for public consumption. Furthermore, it’s not at all mutually exclusive with article submissions. One could gain both the immediate benefit of getting one’s conclusions out there and also produce a draft of a journal article.

As scholarly publishing continues to evolve, I hope we’ll see a plethora of publishing models rather than one monolithic process replacing traditional print-based journals. Publications hosted on GitHub, or a similar platform, would sit nicely alongside open, web-based publications like PeerJ, scholarly blog/journal hybrids like In The Library with the Lead Pipe, deposits in Institutional Repositories, and numerous other sources of quality content.

Notes

  1. I think a lot of these statements are fairly well-recognized in the library community, but here’s some evidence: the recent Open Access “sting” operation (which we’ll cover more in-depth in a forthcoming post) that exposed flaws in some journals’ peer review process, altmetrics, PeerJ, other experiments with open peer review (e.g. by Shakespeare Quarterly), the serials crisis (which is well-known enough to have a Wikipedia entry), predictions that all scholarship will be OA in a decade or two, and increasing demands that scholarly journals allow text mining access all come to mind.
  2. I’m totally prejudiced in this matter because I read primarily through InstaPaper. A journal like Code4Lib, which publishes in HTML, is easy to send to read-it-later services, while PDFs aren’t. PDFs also are hard to read on smartphones, but they can preserve details like layout, tables, images, and font choices better than HTML. A nice solution is services which offer a variety of formats for the same content, such as Open Journal Systems with its ability to provide HTML, PDF, and ePub versions of articles.
  3. For non-code uses of GitHub, see our prior Tech Connect post.

Redesigning the Item Record Summary View in a Library Catalog and a Discovery Interface

A. Oh, the Library Catalog

Almost all librarians have a love-hate relationship with the library catalogs (OPACs) that their patrons use. Interestingly enough, I hear a lot more complaints about the library catalog from librarians than from patrons. Sometimes it is about the catalog missing certain information that should be there for patrons. But many other times, it’s about how crowded the search results display looks. We all want a clean-looking, easy-to-navigate, and efficient-to-use library catalog. But of course, it is much easier to complain than to come up with a viable alternative.

Aaron Schmidt has recently put forth an alternative design for a library item record. In his blog post, he suggests that a library catalog shift its focus from the bibliographic information (or metadata, if not a book) of a library item to the tasks a patron performs in relation to that item, so that the catalog functions more as “a tool that prioritizes helping people accomplish their tasks, whereby bibliographic data exists quietly in the background and is exposed only when useful.” This is a great point. Throwing all the information at a user at once only overwhelms her/him. Schmidt’s sketch provides a good starting point to rethink how to design the library catalog’s search results display.


From the blog post, “Catalog Design” by Aaron Schmidt

B. Thinking about Alternative Display Design

The example above is, of course, too simple to apply to the library catalog of an academic library straight away. For a typical academic library patron to determine whether s/he wants to either check out or reserve the item, s/he is likely to need a little more information than the book title, the author, and the book image. For students who look for textbooks, for example, the edition and the year of publication are important. But I take it that Schmidt’s point was to encourage more librarians to think about alternative designs for the library catalog rather than simply compare what is available and pick what seems to be the best among those.


Florida International University Library Catalog – Discovery layer, Mango, provided by Florida Virtual Campus

Granted, there may be limitations in how much we can customize the search results display of a library catalog. But that is not a reason to stop thinking about what the optimal display design would be for library catalog search results. Sketching alternatives can be in itself a good exercise in evaluating the usability of an information system, even if not all of your design can be implemented.

Furthermore, more and more libraries are implementing a discovery layer over their library catalogs, which provides much more room to customize the display of search results than the traditional library catalog. Open source discovery systems such as Blacklight or VuFind provide great flexibility in customizing the search results display. Even proprietary discovery products such as Primo, EDS, and Summon offer some level of customization to libraries.

Below, I will discuss some principles to follow in sketching alternative designs for search results in a library catalog, present some of my own sketches, and show other examples implemented by other libraries or websites.

C. Principles

So, if we want to improve the item record summary display to be more user-friendly, where can we start and what kind of principles should we follow? These are the principles that I followed in coming up with my own design:

  • De-clutter.
  • Reveal just enough information that is essential to determine the next action.
  • Highlight the next action.
  • Shorten texts.

These are not new principles. They are widely discussed and followed by many web designers, including librarians who participate in their libraries’ website redesigns. But we rarely apply them to the library catalog because we think that the catalog is somehow beyond our control. This is not necessarily the case, however. Many libraries implement discovery layers to give their catalogs a completely different and improved look from their ILSes’ default display.

Creating a satisfactory design on one’s own, instead of simply pointing out what doesn’t work or look good in existing designs, is surprisingly hard but also a refreshing challenge. It also brings a positive shift of focus in thinking about a library catalog, from “What is the problem with the catalog?” to “What is the problem, and what can we change to solve it?”

Below I will show my own sketches for an item record summary view for the library catalog search results. These are clearly a combination of many other designs that I found inspiring in other library catalogs. (I will provide the source of those elements later in this post.) I tried to mix and revise them so that the result would follow those four principles above as closely as possible. Check them out and also try creating your own sketches. (I used Photoshop for creating my sketches.)

D. My Own Sketches

Here is the basic book record summary view. What I tried to do here is give just enough information for the next action but no more than that: title, author, type, year, publisher, and the number of library copies and holds. The next action for a patron is to check the item out. Undecided patrons, on the other hand, can click the title to see the detailed item record, or have the detailed record texted, printed, e-mailed, or used in other ways.

(1) A book item record


This is a record of a book that has a copy available to check out. Only when a patron decides to check out the item is the next set of information relevant to that action – the item location and the call number – shown.

(2) With the check-out button clicked


If no copy is available for check-out, the best way to display the item is to signal that check-out is not possible and to highlight an alternative action. You can do this either by graying out the check-out button or by hiding the button itself.

Many assume that adding more information would automatically increase the usability of a website. While there are cases in which this could be true, often a better option is to reveal information only when it is relevant.

I decided to gray out the check-out button when there is no available copy and to display the reserve button, so that patrons can place a hold. Information about how many copies the library has and how many holds have been placed (“1 hold / 1 copy”) helps a patron decide whether or not to reserve the book.
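
As a rough illustration of this behavior, here is a minimal sketch of the underlying display logic in PHP. It assumes a hypothetical template with an $item object; it is not drawn from any particular catalog or discovery product.

<?php
// Hypothetical item data; in a real template this would come from the ILS.
$item = (object) array( 'available_copies' => 0, 'holds' => 1, 'total_copies' => 1 );

if ( $item->available_copies > 0 ) {
  // The next logical action: a prominent check-out button.
  echo '<button>Check Out</button>';
} else {
  // Gray out check-out and highlight the alternative action instead.
  echo '<button disabled>Check Out</button> <button>Reserve</button>';
  echo '<span>' . $item->holds . ' hold / ' . $item->total_copies . ' copy</span>';
}
?>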

(3) A book item record when check-out is not available


I also sketched two other records for an e-book: one without the cover image and the other with it. Since the appropriate action in this case is reading online, a different button is shown. You may include the ‘Requires Login’ text or simply omit it; most patrons will understand that they have to log in to read a library e-book, and the read-online button will itself prompt a login once clicked anyway.

(4) An e-book item record without a book cover


(5) An e-book item record with a book cover


(6) When the ‘Read Online’ button is clicked, an e-book item record with multiple links/providers

When there are multiple options for one electronic resource, those options can be presented in a similar way in which multiple copies of a physical book are shown.


(7) A downloadable e-book item record

For a downloadable resource, changing the name of the button to ‘download’ is much more informative.


(8) An e-journal item record


(9) When the ‘Read Online’ button is clicked, an e-journal item record with multiple links/providers


E. Inspirations

Needless to say, I did not come up with my sketches from scratch. Here are the library catalogs whose item record summary view inspired me.


The Toronto Public Library catalog has an excellent item record summary view, which I used as a base for my own sketches. It provides just enough information for the summary view. The title is hyperlinked to the detailed item record, and the summary view displays the material type and the year in bold for emphasis. The big green button also clearly shows the next action to take. It does away with the unnecessary labels that are common in library catalogs, such as ‘Author:’ ‘Published:’ ‘Location:’ ‘Link:.’

User Experience Designer Ryan Feely, who worked on Toronto Public Library’s catalog search interface, pointed out the difference between a link and an action in his 2009 presentation “Toronto Public Library Website User Experience Results and Recommendations.” Actions need to be highlighted as a button or in some similar design to stand out to users (slide 65). And ideally, only the actions available for a given item should be displayed.

Another good point Feely makes (slide 24) is that an icon is often the center of attention, so a different icon should be used to signify each type of material, such as a DVD or an e-journal. Below are the icons that Toronto Public Library uses for the various types of library materials that do not have unique item images. These are much more informative than the common “No image available” icon.

[Toronto Public Library’s icons for eAudiobook, e-journal, eMusic, vinyl, VHS, and eVideo materials]

University of Toronto Libraries has recently redesigned its library catalog to be completely responsive. The item record summary view in the catalog is brief and clear. Each record in the summary view also uses a red or a green icon that helps patrons determine the availability of an item quickly. The icons for citing, printing, e-mailing, or texting the item record, which often clutter the catalog, are hidden behind the options icon at the bottom right corner. When the mouse hovers over it, a variety of choices appears.



Richland Library’s catalog displays library items in a grid by default, which makes the catalog more closely resemble an online bookstore or shopping website. Patrons can also change the view to show more details, with or without the item image. The item record summary view in the default grid view is brief and to the point. The main type of patron action, such as Hold or Download, is clearly differentiated from other links as an orange button.



Stanford University Library offers a grid view as well (although not as the default, as at Richland Library). The grid view is very succinct, with the item title, the call number, availability information in the form of a green checkmark, and the item location.


What is interesting about the Stanford University Library catalog (which uses Blacklight) is that when a patron hovers the mouse over an item in the grid view, the item image displays a preview link. When clicked, more detailed information is shown as an overlay.


Brigham Young University completely customized the user interface of the Primo product from Ex Libris.


And the University of Michigan Library customized the search results display of the Summon product from Serials Solutions.


Here are some other item record summary views that are also fairly straightforward and uncluttered but can be improved further.

Sacramento Public Library uses the open source discovery system VuFind with little customization.


I have not done an extensive survey of library catalogs to see which one has the best item record summary view. But it seems to me that, in general, academic libraries are more likely to provide more information than necessary in the item record summary view, and also to require patrons to click a link instead of displaying the relevant information right away. For example, the ‘Check availability’ link shown in many library catalogs is better replaced by the actual availability status, ‘available’ or ‘checked out.’ Similarly, the ‘Full-text online’ or ‘Available online’ link may be clearer as a button titled ‘Read online’ or ‘Access online.’

F. Challenges and Strategies

The biggest challenge in designing the item record summary view is to strike a balance between too little and too much information about the item. Too little information will require patrons to open the detailed item record just to determine whether the item is the one they are looking for.

Since librarians know the library catalog’s many features, they tend to err on the side of throwing all of them into the item record summary view. But too much information not only overwhelms patrons but also makes it hard for them to locate the most relevant information at that stage and to identify the next available action. Any information irrelevant to a given task is no more than noise to a patron.

This is not a problem unique to the library catalog but applies generally to any system that displays search results. In their book Designing the Search Experience, Tony Russell-Rose and Tyler Tate describe this as achieving ‘the optimal level of detail’ (p. 130).

Useful strategies for achieving the optimal level of detail for the item summary view in the case of the library catalog include:

  • Removing all unnecessary labels
  • Using appropriate visual cues to make the record less text-heavy
  • Highlighting next logical action(s) and information relevant to that action
  • Systematically guiding a patron to the actions that are relevant to a given item and her/his task at hand

Large online shopping websites such as Amazon, Barnes & Noble, and eBay all make good use of these strategies. There are no labels such as ‘price,’ ‘shipping,’ ‘review,’ etc. Amazon highlights the price and the user reviews most, since those are the two most decisive factors for consumers at the browsing stage. Amazon offers only enough information for a shopper to determine whether s/he is interested in purchasing the item, so there is not even a Buy button in the summary view. Once a shopper clicks the item title link and views the detailed item record, the buying options and the ‘Add to Cart’ button are displayed prominently.


Barnes & Noble’s default display for search results is the grid view, and the item record summary view offers only the most essential information – the item title, material type, price, and the user ratings.


eBay’s item record summary view also offers only the most essential information, the highest bid and the time left, while people browse the site deciding whether or not to examine an item in further detail.


G. More Things to Consider

The item record summary view, which we have discussed so far, is surely the main component of the search results page. But it is only one part of the search results display, and an even smaller part of the library catalog. Optimizing the search results page, for example, entails not just redesigning the item record summary view but choosing and designing many other elements of the page, such as organizing the filtering options on the left and deciding on the default and optional views. Determining the content and display of the detailed item record is another big part of creating a user-friendly library catalog. If you are interested in this topic, Tony Russell-Rose and Tyler Tate’s book Designing the Search Experience (2013) provides an excellent overview.

Librarians are professionals trained in the many uses of a searchable database: known-item searches, exploring and browsing, searches with incomplete details, compiling sets of search results, locating a certain type of item by location, type, subject, and so on. But since our work is also on the operations side of a library, we often make the mistake of regarding the library catalog as one huge inventory system that should hold and display all of the acquisition, cataloging, and holdings data of the library collection. Library patrons are rarely interested in seeing such data. They are interested in identifying relevant library items and using them. All the other information is simply a guide to achieving this ultimate goal, and the library catalog is just another tool in their many toolboxes.

Online shopping sites optimize their catalogs to make purchasing as efficient and simple as possible. Libraries and online shopping sites share the common interest of guiding users to one ultimate task: identifying an appropriate item to borrow, access, or purchase. When creating user-oriented library catalog sketches, it is helpful to check out how non-library websites display their search results as well.


Once you start looking at other examples, you will realize that there are a great many ways to display search results, and you will soon want to sketch your own alternative design for the search results display in the library catalog and the discovery system. What do you think is the optimal level of detail for library items in the library catalog or the discovery interface?


Web Scraping: Creating APIs Where There Were None

Websites are human-readable. That’s great for us; we’re humans. It’s not so great for computer programs, which tend to be better at navigating structured data than visuals.

Web scraping is the practice of “scraping” information from a website’s HTML. At its core, web scraping lets programs visit and manipulate a website much like people do. The advantage to this is that, while programs aren’t great at navigating the web on their own, they’re really good at repeating things over and over. Once a web scraping script is set up, it can run an operation thousands of times over without breaking a sweat. Compare that to the time and tedium of clicking through a thousand websites to copy-paste the information you’re interested in and you can see the appeal of automation.

Why web scraping?

Why would anybody use web scraping? There are a few good reasons which are, unfortunately, all too common in libraries.

You need an API where there is none.

Many of the web services we subscribe to don’t expose their inner workings via an API. It’s worth taking a moment to explain the term API, which is used frequently but rarely defined beyond the uninformative “Application Programming Interface.”

Let’s consider a common type of API, a search API. When you visit Worldcat and search, the site checks an enormous database of millions of metadata records and returns a nice, visually formatted list of ones relevant to your query. Again, this is great for humans. We can read through the results and pick out the ones we’re interested in. But what happens when we want to repurpose this data elsewhere? What if we want to build a bento search box, displaying results from our databases and Worldcat alongside each other?1 The answer is that we can’t easily accomplish this without an API.

For example, the human-readable results of a search engine may look like this:

1. Instant PHP Web Scraping

by Jacob Ward

Publisher: Packt Publishing 2013

2. Scraping by: wage labor, slavery, and survival in early Baltimore

by Seth Rockman

Publisher: Johns Hopkins University Press 2009

That’s fine for human eyes, but for our search application it’s a pain in the butt. Even if we could embed a result like this using an iframe, the styling might not match what we want and the metadata fields might not display in a manner consistent with our other records (e.g. why is the publication year included with publisher?). What an API returns, on the other hand, may look like this:

[
  {
    "title": "Instant PHP Web Scraping",
    "author": "Jacob Ward",
    "publisher": "Packt Publishing",
    "publication_date": "2013"
  },
  {
    "title": "Scraping by: wage labor, slavery, and survival in early Baltimore",
    "author": "Seth Rockman",
    "publisher": "Johns Hopkins University Press",
    "publication_date": "2009"
  }
]

Unless you really love curly braces and quotation marks, that looks awful. But it’s very easy to manipulate in many programming languages. Here’s a brief example in Python:

import json

# parse the JSON string into a list of dictionaries
results = json.loads( data )
for result in results:
  print( result['title'] + ' - ' + result['author'] )

Here “data” is our search results from above, and the json.loads function parses that data into a variable for us. The script then loops over each search result and prints out its title and author in a format like “Instant PHP Web Scraping – Jacob Ward”.

An API is hard to use or doesn’t have the data you need.

Sometimes services do expose their data via an API, but the API has limitations that the website’s human interface doesn’t. Perhaps it doesn’t expose all the metadata visible in search results. Fellow Tech Connect author Margaret Heller mentioned that Ulrich’s API doesn’t include subject information, though it’s present in the search results presented to human users.

Some APIs can also be more difficult to use than web scraping. The ILS at my place of work is like this: you have to pay extra to get the API activated, and it requires configuration on a shared server I don’t have access to. The API also has strict authentication requirements, even for read-only calls (e.g. when I’m just accessing publicly viewable data, not making account changes). The boilerplate code the vendor provides doesn’t work, or rather works only for trivial examples. All these hurdles combine to make scraping the catalog appealing.

As a side effect, how you repurpose a site’s data might inspire its own API. Are you missing a feature so badly that you need to hack around its absence? Writing a nice proof of concept with web scraping might demonstrate that there’s a use case for a particular API feature.

How?

More or less all web scraping works the same way:

  • Use a scripting language to get the HTML of a particular page
  • Find the interesting pieces of a page using CSS, XPath, or DOM traversal—any means of identifying specific HTML elements
  • Manipulate those pieces, extracting the data you need
  • Pipe the data somewhere else, e.g. into another web page, spreadsheet, or script

Let’s go through an example using the Directory of Open Access Journals. Now, the DOAJ has an API of sorts; it supports retrieving metadata via the OAI-PMH verbs. This means a request for a URL like http://www.doaj.org/oai?verb=GetRecord&identifier=18343147&metadataPrefix=oai_dc will return XML with information about one of the DOAJ journals. But OAI-PMH doesn’t support search; we can use standard identifiers and other means of looking up specific articles or publications, but we can’t do a traditional keyword search.
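
Before working around that limitation, it is worth seeing what OAI-PMH does give us. Here is a quick sketch using PHP’s built-in SimpleXML extension to fetch the GetRecord response at the URL above; the identifier is the one from that example.

<?php
// Fetch the OAI-PMH GetRecord response described above and parse the XML.
// SimpleXML ships with PHP, so no extra library is needed for this part.
$url = 'http://www.doaj.org/oai?verb=GetRecord'
     . '&identifier=18343147&metadataPrefix=oai_dc';
$xml = simplexml_load_file( $url );

// Dump the whole structure to inspect which metadata fields came back.
print_r( $xml );
?>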

Libraries, of the code persuasion

Before we get too far, let’s lean on those who came before us. Scraping a website is both a common task and a complex one. Remember last month, when I said that we don’t need to reinvent the wheel in our programming because reusable modules exist for most common tasks? Please let’s not write our own web scraping library from scratch.

Code libraries, which go by different names depending on the language (most amusingly, they’re called “eggs” in Python and “gems” in Ruby), are pre-written chunks of code which help you complete common tasks. Any task which several people have had to do before probably has a library devoted to it. Google searches for “best [insert task] module for [insert language]” typically turn up useful guidance on where to start.

While each language has its own means of incorporating others’ code into your own, they all basically involve two steps: 1) download the external library somewhere onto your hard drive or server, often using a command-line tool, and 2) import the code into your script. The external library should have some documentation on how to use its special features once you’ve imported it.

What does this look like in PHP, the language our example will be in? First, we visit the Simple HTML DOM website on Sourceforge to download a single PHP file. Then, we place that file in the same directory that our scraping script will live. In our scraping script, we write a single line up at the top:

<?php
require_once( 'simple_html_dom.php' );
?>

Now it’s as if the whole contents of the simple_html_dom.php file were in our script. We can use functions and classes which were defined in the other file, such as the file_get_html function which is not otherwise available. PHP actually has a few functions which are used to import code in different ways; the documentation page for the include function describes the basic mechanics.

Web scraping a DOAJ search

While the DOAJ doesn’t have a search API, it does have a search bar which we can manipulate in our scraping. Let’s run a test search, view the HTML source of the result, and identify the elements we’re interested in. First, we visit doaj.org and type in a search. Note the URL:

doaj.org/doaj?func=search&template=&uiLanguage=en&query=librarianship

The key-value pairs in the URL’s query string are what interest us. Here our search term was “librarianship,” which is the value associated with the appropriately named “query” key. If we change the word “librarianship” to a different search term and visit the new URL, we predictably see results for the new term. With easily hackable URLs like this, it’s easy for us to write a web scraping script. Here’s the first half of our example in PHP:

<?php
// see http://simplehtmldom.sourceforge.net/manual_api.htm for documentation
require_once( 'simple_html_dom.php' );

$base = 'http://www.doaj.org/doaj?func=search&template=&uiLanguage=en&query=';
$query = urlencode( 'librarianship' );

$html = file_get_html( $base . $query );
// to be continued...
?>

So far, everything is straightforward. We insert the web scraping library we’re using, then use what we’ve figured out about the DOAJ URL structure: it has a base which won’t change and a query which we want to change according to our interests. You could have the query come from command-line arguments or web form data like the $_GET array in PHP, but let’s just keep it as a simple string.

We urlencode the string because we don’t want spaces or other illegal characters sneaking their way in there; while the script still works with $query = 'new librarianship' for example, using unencoded text in URLs is a bad habit to get into. Other functions, such as file_get_contents, will produce errors if passed a URL with spaces in it. On the other hand, urlencode( 'new librarianship' ) returns the appropriately encoded string “new+librarianship”. If you do take user input, remember to sanitize it before using it elsewhere.

For the second part, we need to investigate the HTML source of DOAJ’s search results page. Here’s a screenshot and a simplified example of what it looks like:


A couple search results from DOAJ for the term “librarianship”

<div id="result">
  <div class="record" id="record1">
    <div class="imageDiv">
      <img src="/doajImages/journal.gif"><br><span><small>Journal</small></span>
    </div><!-- END imageDiv -->
    <div class="data">
      <a href="/doaj?func=further&amp;passMe=http://www.collaborativelibrarianship.org">
        <b>Collaborative Librarianship</b>
      </a>
      <strong>ISSN/EISSN</strong>: 19437528
      <br><strong>Publisher</strong>: Regis University
      <br><strong>Subject</strong>:
      <a href="/doaj?func=subject&amp;cpId=129&amp;uiLanguage=en">Library and Information Science</a>
      <br><b>Country</b>: United States
      <b>Language</b>: English<br>
      <b>Start year</b> 2009<br>
      <b>Publication fee</b>:
    </div> <!-- END data -->
    <!-- ...more markup -->
  </div> <!-- END record -->
  <div class="recordColored" id="record2">
    <div class="imageDiv">
      <img src="/doajImages/article.png"><br><span><small>Article</small></span>
    </div><!-- END imageDiv -->
    <div class="data">
       <b>Mentoring for Emerging Careers in eScience Librarianship: An iSchool – Academic Library Partnership </b>
      <div style="color: #585858">
        <!-- author (s) -->
         <strong>Authors</strong>:
          <a href="/doaj?func=search&amp;query=au:&quot;Gail Steinhart&quot;">Gail Steinhart</a>
          ---
          <a href="/doaj?func=search&amp;query=au:&quot;Jian Qin&quot;">Jian Qin</a><br>
        <strong>Journal</strong>: <a href="/doaj?func=issues&amp;jId=88616">Journal of eScience Librarianship</a>
        <strong>ISSN/EISSN</strong>: 21613974
        <strong>Year</strong>: 2012
        <strong>Volume</strong>: 1
        <strong>Issue</strong>: 3
        <strong>Pages</strong>: 120-133
        <br><strong>Publisher</strong>: University of Massachusetts Medical School
      </div><!-- End color #585858 -->
    </div> <!-- END data -->
    <!-- ...more markup -->
   </div> <!-- END record -->
   <!-- more records -->
</div> <!-- END results list -->

Even with much markup removed, there’s a lot going on here. We need to zero in on what’s interesting and find patterns in the markup that help us retrieve it. While it may not be obvious from the example above, the title of each search result is contained in a <b> tag towards the beginning of each record (lines 8 and 26 above).

Here’s a sketch of the element hierarchy leading to the title: a <div> with id=”result” > a <div> with a class of either “record” or “recordColored” > a <div> with a class of “data” > possibly an <a> tag (present in the first example, absent in the second) > the <b> tag containing the title. Noting the conditional parts of this hierarchy is important; if we didn’t note that sometimes an <a> tag is present and that the class can be either “record” or “recordColored”, we wouldn’t be getting all the items we want.

Let’s try to return the titles of all search results on the first page. We can use Simple HTML DOM’s find method to extract the content of specific elements using CSS selectors. Now that we know how the results are structured, we can write a more complete example:

<?php
require_once( 'simple_html_dom.php' );

$base = 'http://www.doaj.org/doaj?func=search&template=&uiLanguage=en&query=';
$query = urlencode( 'librarianship' );

$html = file_get_html( $base . $query );

// using our knowledge of the DOAJ results page
$records = $html->find( '.record .data, .recordColored .data' );

foreach( $records as $record ) {
  echo $record->getElementsByTagName( 'b', 0 )->plaintext . PHP_EOL;
}
?>

The beginning remains the same, but this time we actually do something with the HTML. We use find to pull the records’ <div>s with class “data.” Then we echo the first <b> tag’s text content. The getElementsByTagName method typically returns an array, but if you pass it a second integer parameter, it returns the array element at that index (0 being the first element in the array, because computer scientists count from zero). The ->plaintext property simply contains the text found in the element; if we echoed the element itself, we would see opening and closing <b> tags wrapped around the title. Finally, we append an “end-of-line” (EOL) character just to make the output easier to read.

To see our results, we can run our script on the command line. For Linux or Mac users, that likely means merely opening a terminal (in Applications/Utilities on a Mac), since those systems come with PHP pre-installed. On Windows, you may need to use WAMP or XAMPP to run PHP scripts. XAMPP gives you a “shell” button to open a terminal, while with WAMP you can add the PHP executable to your Windows PATH environment variable.

Once you have a terminal open, the php command will execute whatever PHP script you pass it as a parameter. If we run php name-of-our-script.php in the same directory as our script, we see ten search result titles printed to the terminal:

> php doaj-search.php
Collaborative Librarianship
Mentoring for Emerging Careers in eScience Librarianship: An iSchool – Academic Library Partnership
Education for Librarianship in Turkey Education for Librarianship in Turkey
Turkish Librarianship: A Selected Bibliography Turkish Librarianship: A Selected Bibliography
Journal of eScience Librarianship
Editorial: Our Philosophies of Librarianship
Embedded Academic Librarianship: A Review of the Literature
Model Curriculum for 'Oriental Librarianship' in India
A General Outlook on Turkish Librarianship and Libraries
The understanding of subject headings among students of librarianship

This is a simple, not-too-useful example, but it could be expanded in many ways. Try copying the script above and attempting some of the following:

  • Make the script return more than the ten items on the first page of results
  • Use some of DOAJ’s advanced search functions, for instance a date limiter
  • Only return journals or articles, not both
  • Return more than just the title of results, for instance the author(s), URLs, or publication date

Accomplishing these tasks involves learning more about DOAJ’s URL and markup structure, but also learning more about the scraping library you’re using.

Common Problems

There are a couple possible hangups when web scraping. First of all, many websites employ user-agent sniffing to serve different versions of themselves to different devices. A user agent is a hideous string of text which web browsers and other HTTP clients use to identify themselves.2 If a site misinterprets our script’s user agent, we may end up on a mobile or other version of a site instead of the desktop one we were expecting. Worse yet, some sites try to prevent scraping by blacklisting certain user agents.

Luckily, most web scraping libraries have tools built in to work around this problem. A nice example is Ruby’s Mechanize, which has an agent.user_agent_alias property which can be set to a number of popular web browsers. When using an alias, our script essentially tells the responding web server that it’s a common desktop browser and thus is more likely to get a standard response.
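
As far as I know, Simple HTML DOM has no equivalent alias feature, but in PHP we can get the same effect with a stream context, which attaches custom headers to our request. A minimal sketch, with an example user agent string (any mainstream browser’s will do):

<?php
require_once( 'simple_html_dom.php' );

// Build a stream context that sends a browser-like user agent header.
// The string below is only an example; substitute any common browser's.
$context = stream_context_create( array(
  'http' => array(
    'header' => "User-Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36\r\n"
  )
) );

// Fetch the page with the custom header, then parse it as usual.
$raw  = file_get_contents( 'http://www.doaj.org/', false, $context );
$html = str_get_html( $raw );
?>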

It’s also routine that we’ll want to scrape something behind authentication. While IP authentication can be circumvented by running scripts from an on-campus connection, other sites may require login credentials. Again, most web scraping libraries have built-in tools for handling authentication: we can find which form controls on the page we need to fill in, insert our username and password into the form, and then submit it programmatically. Storing a login in a plain text script is never a good idea, though, so be careful.

Considerations

Not all web scraping is legitimate. Taking data which is copyrighted and merely re-displaying it on our site without proper attribution is not only illegal, it’s just not being a good citizen of the web. The Wikipedia article on web scraping has a lengthy section on legal issues with a few historical cases from various countries.

It’s worth noting that web scraping can be very brittle, meaning it breaks often and easily. Scraping typically relies on other people’s markup to remain consistent. If just a little piece of HTML changes, our entire script might be thrown off, looking for elements that no longer exist.

One way to counteract this is to write selectors which are as broad as possible. For instance, let’s return to the DOAJ search results markup. Why did we use such a concise CSS selector to find the title when we could have been much more specific? Here’s a more explicit way of getting the same data:

$html->find( 'div#result > div.record > div.data, div#result > div.recordColored > div.data' );

What’s wrong with these selectors? We’re relying on so much more to stay the same. We need: the result wrapper to be a <div>, the result wrapper to have an id of “result”, the record to be a <div>, and the data inside the record to be a <div>. Our use of the child selector “>” means we need the element hierarchy to stay precisely the same. If any of these properties of the DOAJ markup changed, our selector wouldn’t find anything and our script would need to be updated. Meanwhile, our much more generic line still grabs the right information because it doesn’t depend on particular tags or other aspects of the markup remaining constant:

$html->find( '.record .data, .recordColored .data' );

We’re still relying on a few things—we have to, there’s no getting around that in web scraping—but a lot could change and we’d be set. If the DOAJ upgraded to HTML5 tags, swapping out <div> for <article> or <section>, we would be OK. If the wrapping <div> was removed, or had its id change, we’d be OK. If a new wrapper was inserted in between the “data” and “record” <div>, we’d be OK. Our approach is more resilient.

If you did try running our PHP script, you probably noticed it was rather slow. It’s not like typing a query into Google and seeing results immediately. We have to request a page from an external site, which then queries its backend database, processes the results, and renders HTML which we ultimately don’t use, at least not as intended. This highlights that web scraping isn’t a great option for user-facing searches; it can take too long to return results. One option is to cache searches, for instance storing the results of previous scrapings in a database and checking whether the database has something relevant before resorting to pulling content off an external site.
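
Here is one way a simple cache might look in PHP. This is only a sketch: it uses a file on disk rather than a database, and the cached_get_html helper and its one-hour lifetime are my own invention for illustration.

<?php
require_once( 'simple_html_dom.php' );

// Hypothetical helper: fetch a URL, reusing a cached copy on disk if it
// is less than $max_age seconds old. A database would work equally well.
function cached_get_html( $url, $max_age = 3600 ) {
  $cache_file = 'cache-' . md5( $url ) . '.html';
  if ( file_exists( $cache_file )
       && time() - filemtime( $cache_file ) < $max_age ) {
    // Recent cached copy found; skip the slow external request.
    return str_get_html( file_get_contents( $cache_file ) );
  }
  // No usable cache; fetch the page and save it for next time.
  $raw = file_get_contents( $url );
  file_put_contents( $cache_file, $raw );
  return str_get_html( $raw );
}
?>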

It’s also worth noting that web scraping projects should try to be reasonable about the number of times they request an external resource. Every time our script pulls in a site’s HTML, it’s another request that site’s server has to process. A site may not have an API because it cannot handle the amount of traffic one would attract. If our web scraping project is going to be sending thousands of requests per hour, we should consider how reasonable that is. A simple email to the third party explaining what we’re doing and the amount of traffic it may generate is a nice courtesy.

Overall, web scraping is handy in certain situations (see below) or for scripts which run seldom or only once. For instance, if we’re doing an analysis of faculty citations at our institution, we might not have access to a raw list of citations. But faculty may have university web pages where they list all their publications in a consistent format. We could write a script which only needs to run once, culling a large list of citations for analysis. Once we’ve scraped that information, we could use OpenRefine or other power tools to extract particular journal titles or whatever else we’re interested in.

How is web scraping used in libraries?

I asked Twitter what other libraries are using web scraping for and got a few replies:

@phette23 Pulling working papers off a departmental website for the inst repo. Had to web scrape for metadata.
— Ondatra libskoolicus (@LibSkrat) September 25, 2013

Matthew Reidsma of Grand Valley State University also had several examples:

To fuel a live laptop/iPad availability site by scraping holdings information from the catalog. See the availability site as well as the availability charts for the last few days and the underlying code which does the scraping. This uses the same Simple HTML Dom library as our example above.

It’s also used to create a staff API by scraping the GVSU Library’s Staff Directory and reformatting it; see the code and the result. The result may not look very readable—it’s JSON, a common data format that’s particularly easy to reuse in some languages such as JavaScript—but remember that APIs are for machine-readable data which can be easily reused by programs, not people.

Jacqueline Hettel of Stanford University has a great blog post that describes using a Google Chrome extension and XPath queries to scrape acknowledgments from humanities monographs in Google Books; no coding required! She and Chris Bourg are presenting their results at the Digital Library Federation in November.

Finally, I use web scraping to pull hours information from our main library site into our mobile version. I got tired of updating the hours in two places every time they changed, so now I pull them in with a PHP script. It’s worth noting that this dual-maintenance annoyance is one major reason websites can and should be built with responsive designs.

Most of these library examples are good uses of web scraping because they involve simply transporting our data from one system to another; scraping information from the catalog to display it elsewhere is a prime use case. We own the data, so there are no intellectual property issues, and they’re our own servers so we’re responsible for keeping them up.

Code Libraries

While we’ve used PHP above, there’s no need to limit ourselves to a particular programming language; most languages have popular web scraping libraries, such as Simple HTML DOM for PHP or Mechanize for Ruby.

To provide a sense of how different tools work, I’ve written a series of gists which use several of them to scrape titles from the first page of a DOAJ search.

Notes
  1. See the NCSU or Stanford library websites for examples of this search style. Essentially, results from several different search engines—a catalog, databases, the library website, study guides—are all displayed on the same page in separate “bento” compartments.
  2. The browser I’m in right now, Chrome, has this beauty for a user agent string: “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.76 Safari/537.36”. Yes, that’s right: Mozilla, Mac, AppleWebKit, KHTML, Gecko, Chrome, & Safari all make an appearance.

A Brief Look at Cryptography for Librarians

You may not think much about cryptography on a daily basis, but it underpins your daily work and personal existence. In this post I want to talk about a few realms of cryptography that affect the work of academic librarians, and about some interesting facets you may never have considered. I won’t discuss the mathematical or computer science basis of cryptography, but will look at it from a historical and philosophical point of view. If you are interested in the math and computer science, I have a few resources listed at the end in addition to a bibliography.

Note that while I will discuss some illegal activities in this post, neither I nor anyone connected with the ACRL TechConnect blog is suggesting that you actually do anything illegal. I think you’ll find the intellectual part of it stimulating enough.

What is cryptography?

Keeping information secret is as simple as hiding it from view in, say, an envelope, and trusting that only the person to whom it is addressed will read that information and then not tell anyone else. But we all know that this doesn’t actually work. A better system would only allow a person with secret credentials to open the envelope, and then for the information inside to be in a code that only she could know.

The idea of using codes to keep important information secret goes back thousands of years, but for the purposes of computer science, most of the major advances have been made since the 1970s. In the 1960s, with the advent of computing for business and military uses, it became necessary to come up with ways to encrypt data. In 1976, the concept of public-key cryptography was developed, but it wasn’t realized practically until 1978 with the paper by Rivest, Shamir, and Adleman–if you’ve ever wondered what RSA stood for, there’s the answer. There were further advancements to this system, which resulted in the digital signature algorithm as the standard used by the federal government.1 Public-key systems work basically by creating a private and a public key: the private one is known only to each individual user, and the public key is shared. Anyone can use the public key to encode a message, but without the matching private key, no one can open it. See the resources below for more on the math that makes up these algorithms.

Another important piece of cryptography is the cryptographic hash function, first developed in the late 1980s. Hash functions condense blocks of data into short, fixed-length digests; for instance, passwords stored in databases should be hashed with one of these functions rather than stored as plain text. This helps ensure that even if someone unauthorized gets access to the sensitive data, they cannot read it. Hash functions can also be used to verify the identity of a piece of digital content, which is probably how most librarians think about them, particularly if you work with a digital repository of any kind.
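
As a quick illustration of that last use, here is a minimal sketch in PHP of checksumming a file with a cryptographic hash; the file name is only an example, and SHA-256 is one of several suitable algorithms.

<?php
// Compute a fixed-length digest of a file, e.g. an item in a digital
// repository. The file name here is only an example.
$checksum = hash_file( 'sha256', 'dissertation-scan.pdf' );
echo $checksum . PHP_EOL;

// If even one bit of the file later changes, the digest changes
// completely, so re-computing and comparing digests over time can
// detect corruption or tampering.
?>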

Why do you care?

You probably send emails, log into servers, and otherwise transmit all kinds of confidential information over a network (whether a local network or the internet). Encrypted access to these services and the data being transmitted is the only way that anybody can trust that any of the information is secret. Anyone who has had a credit card number stolen and had to deal with fraudulent purchases knows first-hand how upsetting it can be when these systems fail. Without cryptography, the modern economy could not work.

Of course, we all know a recent example of cryptography not working as intended. It’s no secret by now (see above, where keeping something a secret requires that no one who knows the information tells anyone else) that the National Security Agency (NSA) has sophisticated ways of breaking codes or getting around cryptography through other methods.2 Continuing with our envelope analogy from above, the NSA coerced companies to allow it to view the content of messages before the envelopes were sealed. If the messages were encoded, it got the keys to decode the data, or broke the code using its vast resources. While these practices were supposedly limited to potential threats, there’s no denying that this makes it more difficult to trust any online communications.

Librarians certainly have a professional obligation to keep data about their patrons confidential, and so this is one area in which cryptography is on our side. But let’s now consider an example in which it is not so much.

Breaking DRM: e-books and DVDs

Librarians are exquisitely aware of the digital rights management realm of cryptography (for more on this from the ALA, see the ALA Copyright Office page on digital rights). These are algorithms that encode media in such a way that you are unable to copy or modify the material. Of course, like any code, once you break it, you can extract the material and do whatever you like with it. As I covered in a recent post, if you purchase a book from Amazon or Apple, you aren’t purchasing the content itself but a license to use it in certain prescribed ways, so legally you have no recourse to break the DRM to get at the content. That said, you might have an argument under fair use, or some other legitimate reason to break the DRM. It’s quite simple to do once you have the tools. For e-books in proprietary formats, you can download a plug-in for the Calibre program and follow the step-by-step instructions on this site. This allows you to convert proprietary formats into more open ones.

As above, you shouldn’t use software like that if you don’t have the rights to convert formats, and you certainly shouldn’t use it to pirate media. But just because it can be used for illegal purposes, does that make the software itself illegal? Breaking DVD DRM offers a fascinating example of this (for a lengthy list of CD and DVD copy protection schemes, see here, and for a list of DRM-breaking software, see here). The case of CSS (Content Scramble System) descramblers illustrates some of the strange philosophical territory this can lead into. The original descrambling code was developed in 1999 and distributed widely, which was initially ruled to be illegal. This ruling was protested in a variety of ways; the Gallery of CSS Descramblers has a lot more on this.3 One of my favorite protest CSS descramblers is the “illegal” prime number, a prime number that contains the entire code for breaking the CSS DRM. The first illegal prime number was discovered in 2001 by Phil Carmody (see his description here).4 This number is, of course, only illegal inasmuch as the information it represents is illegal–in this case a secret code that helped break another secret code.

In 2004, after years of court hearings, the California Court of Appeal overturned one of the major injunctions against posting the code, based on the fact that source code is protected speech under the First Amendment, and that CSS was no longer a trade secret.5 So you’re no longer likely to get in trouble for posting this code–but again, using it should only be done for reasons protected under fair use. One of the major reasons you might legitimately need to break the DRM on a DVD is to play DVDs on computers running the Linux operating system, which still has no free legal software that will play DVDs (there is legal software with the appropriate license for $25, however). Given that DVDs are physical media and subject to the first sale doctrine, it seems unfair that they are manufactured with limitations on how they may be played, and therefore this is a code that seems reasonable for the end consumer to break. That said, as more and more media is streamed or otherwise licensed, that argument no longer applies, and the situation becomes analogous to e-book DRM.

Learning More

The Gambling With Secrets video series explains the basic concepts of cryptography, including the mathematical proofs, using colors and other visual concepts that are easy to grasp. It comes highly recommended by all the ACRL TechConnect writers.

Since cryptography is a fairly basic part of computer science, you will not be surprised to learn that there are a few large open courses available about it. This Coursera class from Stanford is currently running, and this Udacity class from the University of Virginia is self-paced. Neither requires a lot of computer science or math skills to get started, though of course you will need a great deal of math to really get anywhere with cryptography.

A surprising but fun way to learn a bit about cryptography is from the NSA’s Kids website–I discovered this years ago when I was looking for content for my X-Files fan website, and it is worth a look if for nothing else than to see how the NSA markets itself to children. Here you can play games to learn basics about codes and codebreaking.

  1. Menezes, A., P. van Oorschot, and S. Vanstone. Handbook of Applied Cryptography. CRC Press, 1996. http://cacr.uwaterloo.ca/hac/. 1-2.
  2. See the New York Times and The Guardian for complete details.
  3. Touretzky, D. S. (2000) Gallery of CSS Descramblers. Available: http://www.cs.cmu.edu/~dst/DeCSS/Gallery, (September 18, 2013).
  4. For more, see Caldwell, Chris. “The Prime Glossary: Illegal Prime.” Accessed September 17, 2013. http://primes.utm.edu/glossary/xpage/Illegal.html.
  5. “DVDCCA v Bunner and DVDCCA v Pavlovich.” Electronic Frontier Foundation. Accessed September 23, 2013. https://www.eff.org/cases/dvdcca-v-bunner-and-dvdcca-v-pavlovich.

10 Practical Tips for Compiling Your Promotion or Tenure File

 

Flickr image by Frederic Bisson http://www.flickr.com/photos/38712296@N07/3604417507/

If you work at an academic library, you may count as faculty. Whether that faculty status comes with a tenure track or not, it usually entails a more complicated promotion procedure than professional staff status. At some libraries, the promotion policy and procedure are well documented, and a lot of help and guidance is given to those who are new to the process. At other libraries, there may be less help available, and the procedure documentation may be unclear. I recently had the experience of compiling my promotion file. I thought that creating a promotion file would not be too difficult, since I had been keeping records of most of my academic and professional activities. But this was not the case at all. Looking back, there are many things I would have done differently to make the process less stressful.

While this post does not really cover a technology topic of the kind we at ACRL TechConnect usually write about, applying for promotion and/or tenure is something that many academic librarians go through. So I wanted to share some lessons that I learned from my first-time experience of creating a promotion binder as non-tenure track faculty.

Please bear in mind that the actual process of assembling your promotion or tenure file can differ depending on your institution. At my university, everything has to be printed and filed in a binder, and multiple copies of the binder are required for the use of the tenure and promotion committee. At some places, librarians may only need to print all the documentation without actually creating a binder. At other places, you may do everything online using a system such as Digital Measures, Sedona, or Interfolio, and not have to deal with papers or binders at all. Be aware that if you do have to deal with actual photocopying, filing, and binder creation, there will be some additional challenges.

Also, my experience described here was for promotion, not tenure. If you are applying for tenure, much of the advice below still applies, but be sure to seek out tenure-specific guidance as well.

1. Get a copy of the promotion and tenure policy manual of your library and institution.

In my case, this was not possible, since my library, as well as the College of Medicine to which it belongs, did not have a promotion policy until very recently. But if you work at an established academic library, there will be a promotion/tenure procedure and policy manual for librarians. The manual may refer to the institution’s faculty promotion and tenure policy manual as well, so get a copy of both and make sure to find out under which category librarians fall. You may count as non-tenured faculty, tenure-track faculty, or simply professionals. You may also belong to an academic department and a specific college, or you may belong simply to your library, which counts as a college with a library dean.

You do not have to read the manual as soon as you start working. It will certainly not be a gripping read. But do get a copy and file it in your binder. (It is good to have a binder for promotion-related records even if you do not actually have to create a promotion binder yourself, or if everything can be filed electronically.)

2. Know when you become eligible to apply for promotion/tenure and what the criteria are.

Once you obtain a copy of your library’s promotion/tenure policy, take a quick look at the section that specifies how many years of work are required before you can apply for promotion or tenure and what the promotion/tenure criteria are. One example of the rankings of non-tenure track librarians at an academic library is: Instructor, Assistant, Associate, and University Librarian. This mirrors the academic faculty rankings of Instructor, Assistant, Associate, and Full Professor. But again, your institution may have a different system. Each level of promotion will have a minimum number of years required, such as two years for the promotion from Instructor to Assistant Librarian, and specific criteria applied to that type of promotion. This is good to know early in your career, so that you can organize your academic and professional activities to match, as much as possible, what your institution expects its librarians to accomplish.

3. Ask those who went through the same process already.

Needless to say, the most helpful advice comes from librarians who have already gone through the same process. They have a wealth of knowledge to share, so don’t hesitate to ask them what good preparatory steps to take for a future application. Even if you have a very general question, they will always point out what to pay attention to in advance.

Also, at some libraries the promotion and tenure committee holds an annual workshop for those who are interested in submitting an application. Even if you are not yet planning to apply, and it seems way too early to even consider such a thing, it may be a good idea to attend one just to get an overview. The committee consists of librarians experienced in the promotion and tenure process and is very knowledgeable about the whole thing.

4. Collect and gather documentation under the same categories that your application file requires.

The promotion file can require a lot of documentation that you may neglect to collect on a daily basis. For example, I never bothered to keep track of committee appointment notification e-mails, and the only reason I saved conference program booklets was a colleague’s advice to keep them as proof of attendance for a future promotion binder. (It would never have dawned on me. And even then, I lost some program booklets for conferences I attended.) This is not a good thing.

Since there was no official promotion policy for my library when I started, I simply created a binder and filed anything and everything that might someday be relevant to a promotion file. However, over the last five years, this binder got extremely fat. This is also not a good practice. When I needed that documentation to actually create and organize my promotion file, it was a mess. It was good that I had at least saved quite a bit of documentation, but I had to look through all of it again because items belonged to different categories and the dates were all mixed up.

So, it is highly recommended that you check the categories of the application file your library/institution requires before creating a binder. Do not just throw things into a binder or a drawer if you can help it. Make separate binders or drawers for the same categories your application file requires, such as publications, presentations, university service, community service, professional service, etc. Also organize the documentation by year and keep a list of the items in each category. Add to the list every time you file something. Pretend that you are doing this for your work, not for your promotion, to motivate yourself.

Depending on your preference and the way your institution handles the documentation for your promotion or tenure application, it may make better sense to scan and organize everything in digital form, as long as the original documents are not required. You can use a citation management system such as Mendeley or RefWorks to keep copies of all your publications, for example. These will easily generate an up-to-date bibliography of your publications for your CV. If your institution uses a system that keeps track of faculty research, grant, publication, teaching, and service activities, such as Digital Measures or Sedona, such a system may suit you better, as you can track more types of activities than just publications. You can also keep a personal digital archive of everything that will go into your application file, either on your local computer or in your Dropbox, Google Drive, or SkyDrive account. The key is to save and organize anything that would count towards promotion and tenure right away, while it is in hand.

One more thing. If you publish a book chapter, depending on the situation, you may not get a copy of the book or the final PDF version of your chapter from your editor or publisher. This is no big deal until you have to ask your library to do a rush ILL for you. So take time in advance to obtain at least one hard copy or the finished PDF version of your publication, particularly in the case of book chapters.

5. What do I put into my promotion file or tenure dossier?

There are common items, such as the personal statement, CV, publications, and services, which are specified in the promotion/tenure policy manual. But some things may not fit these categories, or may make you wonder whether they are worth putting into your promotion file. It really depends on what else you have in your application file. If your file is strong enough, you may skip things like miscellaneous talks you gave or newsletter articles you wrote for a regional professional organization. But ask a colleague for advice first, and check whether your file looks balanced in all areas.

6. Make sure to keep documentation for projects that only lived a short life.

Another thing to keep in mind is to keep track of all the projects you worked on. As time goes on, you may forget some of the work you did. If you create a website, a LibGuide, a database application, a section of the staff intranet, etc., some of those may last a long time, but others may get used only for a while and then disappear or be removed. Once a project disappears, it is hard to show it in your file as part of your work and achievements unless you documented the final result while it was up and running and being used. So take screenshots, print color copies of them, and keep a record of the dates during which you worked on the project and the date on which the result was released or implemented.
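For web-based projects, capturing that evidence can even be scripted. Here is a small sketch using Selenium with Chrome (one tool among many; the URL and the chromedriver setup are assumptions for illustration) that saves a screenshot stamped with the capture date:

    from datetime import date
    from selenium import webdriver

    def snapshot(url, name):
        """Save a dated screenshot of a project page for the promotion binder."""
        driver = webdriver.Chrome()  # assumes Chrome and chromedriver are installed
        try:
            driver.get(url)
            filename = "%s-%s.png" % (name, date.today().isoformat())
            driver.save_screenshot(filename)
            return filename
        finally:
            driver.quit()

    # Example (hypothetical URL):
    # snapshot("http://libguides.example.edu/myguide", "libguide")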

If you work in technology, you may have more of this type of work than academic publications. Check your library's promotion and tenure policy manual to see if it has a category of 'Creative Works' or something similar under which you can add these items.

If you are assembling your binder right now and some projects you worked on are completely gone, check the Wayback Machine at the Internet Archive to see if you can find an archived copy. A copy is not always available, but if you have nothing else, this may be the only way to find some evidence of your work that you can document.
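The Internet Archive exposes a small JSON API for exactly this kind of lookup. A quick sketch in Python using the requests library (the project URL shown is hypothetical):

    import requests

    def closest_snapshot(url):
        """Query the Wayback Machine availability API for the closest archived copy."""
        resp = requests.get("https://archive.org/wayback/available", params={"url": url})
        resp.raise_for_status()
        snap = resp.json().get("archived_snapshots", {}).get("closest")
        return snap["url"] if snap else None  # None means nothing was archived

    # Example (hypothetical URL of a retired project):
    # print(closest_snapshot("http://library.example.edu/old-database-app"))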

7. Update your CV and the list of Continuing Education activities on a regular basis.

Ideally, you will be doing this every year when you do your performance review, but it may not be required. Updating your CV is certainly not the most exciting thing to do, but it must be done. Over the last five years, I updated my CV only when it was required for accreditation purposes (which call for the current CVs of all faculty). This was certainly better than not updating my CV at all. But since I did not update it with the promotion application in mind, when I needed to create one for the promotion application file, I had to redo the CV, moving items around and reorganizing them into different categories. So make sure to check your library's or your institution's faculty promotion/tenure policy manual. The manual specifies the CV format that the dossier needs to follow; use that format for your CV and update it every year. (For me, the Christmas holidays may be a good time for this kind of task from now on.)

Some people keep their most up-to-date CV in their Dropbox public folder, which is also a good idea if you have a website and share your CV there.

Some of the systems I mentioned earlier, Digital Measures and Sedona, also allow you to create a custom template that you can use for the promotion/tenure application. If the system has been in use for many years at your institution, there may already be a pre-made template for promotion and tenure purposes.

8. Make sure to collect all appointment e-mails to committees and other types of services you do.

Keeping records of all your service is tricky because we tend to pay little attention to the appointment e-mails for committees and other types of service that we perform for universities or professional organizations. I assumed that they were all in my inbox somewhere and did not properly organize them. As a result, I had to spend hours looking for them when I was compiling my binder.

This can be easily avoided if you keep a well-organized e-mail archive where you file e-mails as they come in. In some cases, I found that I had either lost the appointment e-mail or never received one. You can file other e-mail correspondence as documentation for that service, but the official appointment e-mail is certainly better.
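If your appointment e-mails are already scattered across a big inbox, a script can at least narrow the search. The sketch below uses Python's standard imaplib module to find messages whose subject mentions "appointment" and copy them into a service-documentation folder; the server, credentials, and folder names are placeholders, not a recommendation of any particular setup:

    import imaplib

    # Placeholders: substitute your own server, account, and folder names.
    HOST, USER, PASSWORD = "imap.example.edu", "me", "secret"

    mail = imaplib.IMAP4_SSL(HOST)
    mail.login(USER, PASSWORD)
    mail.select("INBOX")

    # Find candidate appointment e-mails by subject keyword.
    status, data = mail.search(None, '(SUBJECT "appointment")')
    for num in data[0].split():
        # Copy each hit into a dedicated service folder (must already exist).
        mail.copy(num, "Promotion/Service")

    mail.logout()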

This also reminded me that I should write thank-you e-mails to the members of committees I chaired and to the committee chairs I worked with as a board member of the ALA New Members Round Table. It is always nicer to file a letter of appreciation than a letter of appointment, and as a committee chair or board member, sending one should be something you do without being asked. With these thank-you e-mails on file, your committee members or chairs can use them whenever they need them for a performance, promotion, or tenure review, without having to request and wait for them.

9. Check the timeline dates for the application.

Universities and colleges usually have a set of deadlines you have to meet in order to be considered for promotion or tenure. For example, you may have to meet with your supervisor or your library dean by a certain date and get the green light to go up for promotion. Your supervisor may have to file an official memorandum with the dean's office by a certain date as a formal notification. Your department chair (if you are appointed to an academic department) may have to receive a memo about your application by a certain date. Your promotion file may have to be submitted to your academic department's Promotion and Tenure committee a certain amount of time before it is forwarded to the college's Promotion and Tenure committee. The list goes on and on. These deadlines are hard to keep tabs on, but they have to be tracked carefully so that you do not miss them.

10. Plan ahead.

I had to compile and create my promotion binder, plus three copies, on a week's notice, but that was a very unusual case due to special circumstances. Something like this is unlikely to happen to you, but remember that creating the whole application file will take much more time than you imagine. I could have done some of the sorting and organizing of documentation in advance, but I delayed it because I was developing a web data application for my library. Looking back, I should have at least started working on the promotion file, even if things were unclear and even if I had little time to spare outside of my ongoing work projects. It would have given me a much more accurate sense of how much time I would eventually have to spend on the whole dossier.

Also remember to request evaluation letters in advance. This was the craziest part for me, because I was literally given one week to request and receive letters from internal and external reviewers. Asking people for a letter on a week's notice is close to asking for the impossible, particularly if the reviewers are outside your institution and have to be contacted not by you but by a third party. I was very lucky to get all the signed PDF letters in time, but I do not recommend this kind of experience to anyone.

Plan ahead and plan well in advance. Find out whether you need letters from internal or external reviewers, how many, and what the letters need to cover. Create a list of colleagues familiar with your work whom you can ask for a letter. When you request a letter, be sure to highlight the promotion or tenure criteria and what the letter needs to address, so that letter writers can quickly see what to focus on when they review your work. If there are any supplementary materials, such as publications, book chapters, or presentations, forward them as well, along with your CV and statement.

Lastly, If Your Application File Must Be in Print…

You are lucky if you have the option to submit everything electronically, or to simply hand the documentation to someone who will do the rest of the work: photocopying, filing, making binders, etc. But your institution may require the application file to be submitted in print, sometimes in multiple copies, and you may be responsible for creating those binders and copies yourself. I had to submit four identical binders, and I was the one who had to do all the photocopying, hole-punching, and filing. I can tell you that photocopying and punching holes for enough documents to fill a very thick binder, multiple times over, is not exactly inspiring work. If this is your situation, I recommend creating one binder as a master copy and using a professional photocopying/binding service to produce the duplicates; it would have been so much better for my sanity. In my case, there was too little time to create a master copy and then take it to an outside service for additional copies. So plan ahead and make sure you have time to use an outside service. I highly recommend not using your own labor for photocopying and filing.

*       *       *

If you have any extra tips or experiences to share about the promotion or tenure process at an academic library, please share them in the comments section. Hopefully, in the future, all institutions will allow people to file their documentation electronically. There are also tools such as Interfolio (http://www.interfolio.com/) that you can use, which is particularly convenient for those who need external letters sent directly to the tenure and promotion committee.

Are there any other tools? Please share them in the comments section as well. Best of luck to all librarians going for promotion and tenure!

 


Library Quest: Developing a Mobile Game App for a Library

This is the story of Library Quest (iPhone, Android), the App That (Almost) Wasn't. It's a (somewhat) cautionary tale of one library's effort to leverage gamification and mobile devices to create a new and different way of orienting students to library services and collections. Many libraries are interested in the possibilities offered by both games and mobile devices, and they should be. But developing for mobile platforms is new and largely uncharted territory for libraries, and while there have been some encouraging developments in creating games for library instruction, other avenues of game creation are mostly unexplored. This is what we learned developing our first mobile app and our first large-scale game…at the same time!


The login screen for the completed game. We use integrated Facebook login for a host of technical reasons.

Development of the Concept: Questing for Knowledge

The saga of Library Quest began in February of 2012, when I came on board at Grand Valley State University Libraries as Digital Initiatives Librarian. I had been reading some books on gamification and was interested in finding a problem that the concept might solve. I found two. First, we were about to open a new $65 million library building, and we needed ways to take advantage of the upsurge of interest we knew it would create. How could we get people who were curious about the building to learn more about our services, and strengthen that curiosity into a connection with us? Second, GVSU Libraries, like many other libraries, was struggling with service awareness. Comments by our users in the service dimension of our latest LibQUAL+ survey indicated that many patrons missed out on services like interlibrary loan because they were unaware those services existed. Students often are not interested in engaging with the library until they need something specific from us, and when that need is filled, their interest declines sharply. How could we orient students to library services and create more awareness of what we could do for them?

We designed a very simple game to address both problems. It would be a quest- or task-based game in which students actively engaged with our services and spaces, earning points and rewards as they did so. The game app would offer tasks to students and verify their progress through multistep tasks by asking users to input alphanumeric codes or scan QR codes (which we ended up putting on decals that could be stuck to any flat surface). Because this was an active game, it seemed natural to target mobile devices, so that people could play as they explored. The mobile marketplace is more or less evenly split between iOS and Android devices, so we knew we wanted the game to be available on both platforms. This became the core concept for Library Quest. Library administration gave the idea their blessing and approval to use our technology development budget, around $12,000, to develop the game. Back up and read that sentence again if you need to: yes, that entire budget was for one mobile app. The expense of building apps is the first thing to wrap your mind around if you want to create one. While people often think of apps as somehow smaller and simpler than desktop programs, the reality is very different.
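To make that core mechanic concrete, here is a toy model of code-based quest verification in Python. This is only a restatement of our concept for illustration, not Yeti CGI's actual implementation, and the quest data is invented:

    # Toy model: each quest step hides a code that players type in or
    # scan from a QR decal; points are awarded when the quest completes.
    QUESTS = {
        "find-ill": {                       # invented example quest
            "steps": ["LQ-ILL-01", "LQ-ILL-02"],
            "points": 50,
        },
    }

    def submit_code(progress, quest_id, code):
        """Advance a player's progress if the code matches the next step."""
        quest = QUESTS[quest_id]
        step = progress.get(quest_id, 0)
        if step < len(quest["steps"]) and code == quest["steps"][step]:
            progress[quest_id] = step + 1
            if progress[quest_id] == len(quest["steps"]):
                return quest["points"]      # quest finished: award points
        return 0

    progress = {}
    submit_code(progress, "find-ill", "LQ-ILL-01")         # step 1 verified
    print(submit_code(progress, "find-ill", "LQ-ILL-02"))  # prints 50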


The main game screen. We found a tabbed view worked best, with quests that are available in one tab, quests that have been accepted but not completed in another, and finished quests in the third.

We contracted with Yeti CGI, an outside game development firm, to do the coding. This was essential; app development is complicated, and we didn't have the necessary skills or experience in-house. If we hadn't used an outside developer, the game app would never have gotten off the ground. We had never worked with a game development company before, and Yeti had never worked with a library, although they had ties to higher education and were enthusiastic about the project. Working with an outside developer always carries certain risks and advantages, and communication is always an issue.

One thing we could have done more of at this stage was work on the game concept and do paper prototyping of that concept. In her book Game Design Workshop, author Tracy Fullerton stresses two key components of designing a good game: defining the experience you want the player to have, and doing paper prototyping. Defining the game experience from the player's perspective forces the designer to ask questions about how the game will play that might not otherwise occur to them. Will this be a group or a solo experience? Where will the fun come from? How will the player negotiate the rules structure of the game? What choices will they have, and at what points? As author Jane McGonigal notes, educational games often fail because they do not put the fun first, which is another way of saying that they haven't fully thought through the player's experience. Everything in the game (rules, rewards, format, etc.) should be shaped by the experience the designer wants to give the player. Early concepts can and should be tested with paper prototyping: it's a lot easier (and a lot less expensive) to change the rules structure of a game made with paper, scissors, and glue than one made with code and developers. In retrospect, we could have spent more time talking about experience and doing paper prototypes before we had Yeti start writing code. While our game is pretty solid, we may have missed opportunities to be more innovative or to provide a stronger gameplay experience.

Concept to conception: Wireframing and Usability Testing

The first few months of development were spent creating, approving, and testing paper wireframes of the interface and art concepts.  While we perhaps should have done more concept prototyping, we did do plenty of usability testing of the game interface as it developed, starting with the paper prototypes and continuing into the initial beta version of the game.  That is certainly something I would recommend that anyone else do as well.  Like a website or anything else that people are expected to use, a mobile app interface needs to be intuitive and conform to user expectations about how it should operate, and just as in website design, the only way to create an interface that does so is to engage in cycles of iterative testing with actual users.  For games, this is particularly important because they are supposed to be fun, and nothing is less fun than struggling with poor interface design.

A side note related to usability: one of the things that surfaced in prototype testing was that giving players tasks involving library resources, and watching them try to accomplish those tasks, turns out to be an excellent way of testing space and service design as well. There were times when students were struggling not with the interface but with the library! Insufficient signage, unclear space layout, and assumed knowledge of (or access to) information the students had no way of knowing all became apparent as we watched students try to do tasks that should have been simple. It serves as a reminder that usability concepts apply to the physical world as much as they do to the web, and that we can and should test services in the real world the same way we test them in virtual spaces.


A quest in progress. We can insert images and links into quest screens, which allows us to use webpages and images as clues.

Development:  Where the Rubber Meets the Phone

Involving an outside developer made the game possible, but it also meant that we had to temper our expectations about the scale of app development. This became much more apparent once we'd gotten past paper prototyping and begun testing beta versions of the game. Several ideas that we developed early on, such as notifications of new quests and an elaborate title system, had to be put aside as the game evolved, both because of cost and because developing other features more central to gameplay turned out to be more difficult than anticipated. For example, one of the core concepts of the game was that students would scan QR codes to verify that they had visited specific locations. Because mobile phone users do not typically have QR code reader software installed, Yeti built QR code reading functionality into the game app. This made scanning a code a more seamless part of gameplay, but getting the scanner to work well on both the Android and iOS versions proved a major challenge (and one that was still vexing us somewhat at launch). Tweaks to improve stability and performance on iOS threw off the Android version, and vice versa. Despite the existence of tools like PhoneGap and Adobe AIR, which will supposedly produce versions of the software that run on both platforms, there can still be a significant amount of work involved in tuning the different versions to get them to work well.
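Generating the codes for the decals is the easy half of the problem. For instance, in Python the third-party qrcode package (with Pillow installed) produces a printable image in a couple of lines; the payload format shown is invented, not the one the game actually uses:

    import qrcode  # third-party: pip install qrcode[pil]

    # Each decal encodes a short verification string for the app to match;
    # "LQ-STEP-0042" is an invented payload format, not the game's real one.
    img = qrcode.make("LQ-STEP-0042")
    img.save("lq_step_0042.png")  # print this image on the decal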

Developing apps that work on the Android platform is particularly difficult and expensive. While Apple has been accused of having a fetish for control, its proprietary approach to its mobile operating system produces a development environment that is, compared to Android, easy to navigate. Android, by contrast, is usually heavily modified by specific carriers and manufacturers to run on their hardware, which means that if you want to ensure your app runs well on an Android device, the app must be tested and debugged on that specific combination of Android version and hardware. Multiply the twelve major versions of Android still commonly used by the hundreds of devices that run it, and you begin to have an idea of the scope of the problem facing a developer. While Android accounts for only 50% of our potential player base, it easily took up 80% of the debugging time we spent with Yeti, and the result is an app that we are sure works on only a small selection of the Android devices out there. By contrast, it works perfectly well on all but the very oldest versions of iOS.

Publishing a Mobile App: (Almost) Failure to Launch

When work began on Library Quest, our campus had no formal approval process for mobile apps, and the campus store accounts were controlled by our student mobile app development lab. In the year and a half we spent building the game, control of the campus store accounts moved to our campus IT department, and formal guidelines and a process for publishing mobile apps started to materialize. All of this made perfect sense: more and more campus entities were starting to develop mobile apps, and the university was rightly concerned about branding and quality, as well as about ensuring that any published apps furthered its teaching and research mission. However, it left us trying to navigate an approval process as it materialized around us very late in development, with requests coming in for changes to the game's appearance, to bring it into line with new branding standards, when the game was almost complete.

It was here the game almost foundered as it was being launched. During some of the discussions, it surfaced that one of the commercial apps being used by the university for campus orientation bore some superficial resemblance to Library Quest in terms of functionality, and the concern was raised that our app might be viewed as a copy.  University counsel got involved.  For a while, it seemed the app might be scrapped entirely, before it ever got out to the students!  If there had been a clear approval process when we began the app, we could have dealt with this at the outset, when the game was still in the conceptual phase.  We could have either modified the concept, or addressed the concern before any development was done.  Fortunately, it was decided that the risk was minimal and we were allowed to proceed.

A quest completion screen for one of our test quests. These screens stick around when the quest is done, forming a kind of personalized FAQ about library services and spaces.

Post-Launch: Game On!

As I write this, it's over a year since Library Quest was conceived, and it has just been released "into the wild" on the Apple and Google Play stores. We've yet to begin the major advertising push for the game, but it already has over 50 registered users. While we've learned a great deal, some of the most important questions about this project are still up in the air. Can we orient students using a game? Will they learn anything? How will they react to an attempt to engage with them on mobile devices? There are not really a lot of established ways to measure success for this kind of project, since very few libraries have done anything remotely like it. We projected early in development that we wanted to see at least 300 registered users, and that we wanted at least 50 of them to earn the maximum number of points the game offers. Other metrics for success are "squishier" and involve doing surveys and focus groups once the game wraps, to gauge students' reactions to it. If we aren't satisfied with performance at the end of the year, either because we didn't have enough users or because the response was not positive, then we will look for ways to repurpose the app, perhaps as part of classroom teaching in our information literacy program, or as part of smaller-scale, more focused campus orientation activities.

Even if it’s wildly successful, the game will eventually need to wind down, at least temporarily.  While the effort-reward cycle that games create can stimulate engagement, keeping that cycle going requires effort and resources.  In the case of Library Quest, this would include the money we’ve spent on our prizes and the effort and time we spend developing quests and promoting the game.  If Library Quest endures, we see it having a cyclical life that’s dependent on the academic year.  We would start it anew each fall, promoting it to incoming freshmen, and then wrap it up near the end of our winter semester, using the summers to assess and re-engineer quests and tweak the app.

Lessons Learned:  How to Avoid Being a Cautionary Tale
  1. Check to see if your campus has an approval process and a set of guidelines for publishing mobile apps. If it doesn't, do not proceed until one exists; the lack of such a process until very late in development almost killed our game. Volunteer to help draft the guidelines and create the process, if you need to.  There should be identified campus app experts for you to talk to before you begin work, so you can ask about apps already in use and about any licensing agreements the campus may have. And there should be a mechanism to get your concept approved at the outset, as well as the finished product.
  2. Do not underestimate the power of paper.  Define your game’s concept early, and test it intensively with paper prototypes and actual users.  Think about the experience you want the players to have, as well as what you want to teach them.  That’s a long way of saying “think about how to make it fun.”  Do all of this before you touch a line of code.
  3. Keep testing throughout development.  Test your wireframes, test your beta version, test, test, test with actual players.  And pay attention to anything your testing might be telling you about things outside the game, especially if the game interfaces with the physical world at all.
  4. Be aware that mobile app development is hard, complex, and expensive.  Apps seem smaller because they’re on small devices, but in terms of complexity, they are anything but.  Developing cross-platform will be difficult (but probably necessary), and supporting android will be an ongoing challenge.  Wherever possible, keep it simple.  Define your core functionality (what does the app *have* to do to accomplish its mission) and classify everything else you’d like it to do as potentially droppable features.
  5. Consider your game’s life-cycle at the outset.  How long do you need it to run to do what you want it to do?  How much effort and money will you need to spend to keep it going for that long?  When will it wind down?
References

Fullerton, Tracy. Game Design Workshop (4th edition). Amsterdam: Morgan Kaufmann, 2008.

McGonigal, Jane. Reality Is Broken: Why Games Make Us Better and How They Can Change the World. New York: Penguin Press, 2011.

About our Guest Author:
Kyle Felker is the Digital Initiatives Librarian at Grand Valley State University Libraries, where he has worked since February of 2012.  He is also a longtime gamer.  He can be reached at felkerk@gvsu.edu, or on twitter @gwydion9.