Decentralizing Library IT

I’ve always gravitated toward library jobs in systems and technology, but I recently took on a new position as head of a tech services department in a smaller academic library.  Some of my colleagues expressed surprise that I’m moving out of a traditional library IT or systems role, but my former position was as a systems librarian within a technical services department, and for the past few years a significant amount of my time has involved developing collection- and metadata-related system integrations for acquisitions and cataloging. A few trends have made me think that I’m not alone in branching out and applying systems skills to diverse functional areas of the library.  It has become relatively commonplace for the work of technology innovation to occur, at least in part, outside of traditional library IT departments: reference and instruction librarians play a tightly integrated role in the optimization of discovery interfaces, tech services staff use Python and linked data technologies to clean up and enhance metadata, and instruction librarians and access services staff create and manage high-tech makerspaces.

More personnel across the library are embracing and developing high tech skills traditionally housed in library systems or IT departments.  The following are six general trends I’ve observed that are influencing the spread of technology development outside of traditional library IT.

Increasingly high technical skills are required for most library areas

  • Job advertisements for almost every functional area of the library emphasize advanced technical knowledge (beyond typical office application knowledge), especially with regard to ILS management.  In a 2016 study of library job advertisements, the authors found a wide range of job titles requiring knowledge and skills in information technology, including Metadata Librarian, Digital Archivist, Information Literacy Librarian, and Research Data Librarian (Shahbazi, Fahimnia, & Khoshemehr, 2016).  Scholarly communications, data services, e-resource management, reference, and other library staff positions may all be positioned outside of traditional library IT, but all are deeply involved in the utilization and development of library technologies.

Optimization of cloud-based systems can be distributed

  • With cloud-based application hosting, managing physical servers and backups may become less burdensome, but the need for knowledgeable personnel to configure and optimize often complex cloud-based systems is as essential as it has always been.  Scholarly communications, e-resource, and access services library staff may play highly integrated roles in the development and optimization of library systems.  Beyond acting as consultants for the management of these systems, knowledge of scripting and interoperability mechanisms enables staff outside of library IT to contribute holistically to the development of cloud-based library applications.

More opportunities to build integrations

  • New library services platforms often enable a number of integrations with third-party systems via APIs and other web services.1 Many of these integrations require a deep knowledge of workflows and data structures across multiple systems, so involvement from multiple functional areas is usually required. The combination of knowledge of a functional area with knowledge of how the library system works can result in some seriously powerful and useful applications.
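As a sketch of what such an integration can look like, the snippet below builds a request against a hypothetical library services platform REST API and pulls a few fields out of its JSON response. The endpoint, parameter names, and response shape here are all invented for illustration; consult your vendor’s API documentation for the real ones.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and key -- substitute your platform's real values.
BASE_URL = "https://lsp.example.org/api/v1/items"
API_KEY = "YOUR_API_KEY"

def build_request_url(barcode):
    """Construct an item-lookup URL with the query parameters encoded."""
    params = urlencode({"item_barcode": barcode, "apikey": API_KEY})
    return BASE_URL + "?" + params

def parse_item_response(raw_json):
    """Pull the fields an acquisitions workflow might care about."""
    data = json.loads(raw_json)
    return {"title": data.get("title"),
            "location": data.get("location"),
            "status": data.get("status")}

# A canned response stands in for a live API call.
sample = '{"title": "Example Title", "location": "Main Stacks", "status": "Available"}'
print(build_request_url("39000123"))
print(parse_item_response(sample))
```

The same pattern (build a parameterized request, parse the JSON that comes back) underlies most of the API work described in this post.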

The increasing importance of data wrangling

  • Sources of metadata are increasingly varied, requiring data wrangling, work that is often made more efficient by developing coding or scripting methods to automate routine tasks.  Metadata specialists are often experts in developing macros in OCLC Connexion (for example), and increasingly require access to a computing environment that enables writing, testing, and using code to further automate metadata cross-walking and cleanup, including use of Python, OpenRefine, and other tools for dealing with enormous amounts of messy data.
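For instance, a short Python script using only the standard library can normalize a messy spreadsheet export before loading it into another system. The field names and cleanup rules below are invented for illustration:

```python
import csv
import io
import re

def normalize_isbn(raw):
    """Strip hyphens, spaces, and trailing qualifiers like '(pbk.)' from an ISBN."""
    digits = re.sub(r"[^0-9Xx]", "", raw.split("(")[0])
    return digits.upper()

def clean_rows(csv_text):
    """Read a messy CSV export and return rows with trimmed titles and clean ISBNs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    cleaned = []
    for row in reader:
        row["isbn"] = normalize_isbn(row["isbn"])
        row["title"] = row["title"].strip()
        cleaned.append(row)
    return cleaned

messy = "title,isbn\n  Some Title ,978-0-12-345678-9 (pbk.)\n"
print(clean_rows(messy))
# [{'title': 'Some Title', 'isbn': '9780123456789'}]
```

Scaled up to thousands of rows, this is exactly the kind of routine cleanup that scripting makes dramatically faster than hand-editing.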

Customization of discovery and library e-content

  • Discovery platforms are complex and often highly customizable. Many reference and instruction librarians have a robust understanding of user behavior and information literacy goals that are essential for development of usable interfaces, as well as skills in user experience (UX) testing and interface design.  Reference and instruction librarians are often experts in course management systems and LibGuides, and know good tricks and hacks for optimizing digital learning content.  Library collection development, scholarly communications and technical services librarians deeply understand content and how to make it findable, and increasingly play pivotal roles in configuring harvesting and transformation of metadata into discovery systems.
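Harvesting metadata into discovery systems often happens over OAI-PMH. As a minimal sketch, this Python snippet pulls titles out of a harvested ListRecords response using only the standard library; the sample record is invented and abridged for illustration:

```python
import xml.etree.ElementTree as ET

# Namespace URIs used in OAI-PMH responses.
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

# An abridged ListRecords response standing in for a live harvest.
sample = """<?xml version="1.0"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Open Access and the Humanities</dc:title>
          <dc:identifier>https://example.org/item/42</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

def extract_titles(xml_text):
    """Pull dc:title values out of an OAI-PMH ListRecords response."""
    root = ET.fromstring(xml_text)
    return [rec.find(".//" + DC + "title").text
            for rec in root.iter(OAI + "record")]

print(extract_titles(sample))
```

A real harvest loop would also handle resumption tokens and record deletions, but the parse-and-transform step looks much like this.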

Systems beyond the ILS

  • Libraries are engaging with a much wider variety of technologies than just an ILS – libraries support institutional repository and digital library software, data management software, authority systems such as VIAF, open publishing, etc. While working with these systems does not necessarily require a background or emphasis on systems administration, it is definitely helpful to have an understanding of the architecture of such applications and how applications might interact with each other.
  • Systems knowledge is applicable to more than just technology. Thinking like a programmer can often be useful when performing workflow analysis and optimization, as well as problem-solving even in non-technical areas.

Are “true” library systems administrators still needed? (Yes, obviously)

When researching for this post, I came across an amusing article by Roy Tennant from back in 2011 titled “If You are a Library SysAdmin, you are TOAST”.  The article presents a (seemingly not satirical?) argument that movement to cloud-based systems in libraries will make library system administrators obsolete:

When I, as just a moderately savvy librarian, can learn maybe five to ten very specific steps and be able to deploy any application I would likely want to deploy, why do I need to talk to my system administrator ever again?

Obviously, six years after this article was written, with many libraries firmly embedded in the cloud with a variety of library applications, the role of the system administrator in libraries is not at all diminished.  While work involving physical servers and backups may be less common for many applications, system administrators and those with IT skills in libraries are still in huge demand to evaluate, optimize, and provide integrations for cloud-based library systems.

I think it’s safe to say that the more people in any organization with technical knowledge, the better.  Managing decentralized technology projects, however, does require leadership and coordination.  When learning to develop applications, coding is often much more fun than worrying about server administration and security – but of course, someone has to be concerned about security and help those who may just be learning about technology adopt secure development practices.  Library technology projects don’t have to come out of library IT departments, but leadership in library IT should be open to and supportive of technology initiatives coming from areas outside library IT, while facilitating secure practices.  Coordination on the part of library IT is also essential to avoid duplication of effort and to ensure that projects being developed are sustainable and supported by the technology environment of the larger organization.  Encouraging the open exchange of technology-related ideas across the library prevents tech-savvy staff from feeling they need to hide their pet projects lest they get ‘in trouble’ with a restrictive library IT department.

In my view, there’s simply too much technology change happening in libraries to keep all technology development centralized in a single unit within the library.  Adding tech-savvy positions within non-technology departments is not a bad strategy – it can help support innovation out of library departments that haven’t traditionally been expected to drive technological change.  However, continually raising expectations with regard to technical knowledge can be stressful, so ensuring that strong support for professional development is in place is also important.  In my own new position, I’m excited to channel my knowledge of APIs and interest in data visualization technologies into creating some cool collection management and assessment tools, and I’m not at all concerned that I won’t have an opportunity to apply my technical knowledge in the rapidly changing landscape of library technical services and collection development.  Working outside of library IT means that I need to communicate closely with the head of library IT about the projects I’m working on, closely follow other technology-related projects across the library, and be proactive about offering my skills where they might be helpful.  It also means that I need to work to support the technical expertise of staff in my department, particularly as related to library system management in acquisitions and cataloging.  No matter your role in the library, there’s plenty of technology-related work to go around.

  1.  See, for example, the Ex Libris and OCLC Developer Networks, both of which provide great documentation and example applications to novice developers.

Hosting a Coding Challenge in the Library

In Fall of 2016, the city of Los Angeles held a 2-week “Innovate LA” event intended to celebrate innovation and creativity within the LA region.  Dozens of organizations around Los Angeles held events during Innovate LA to showcase and provide resources for making, invention, and application development.  As part of this event, the library at California State University, Northridge developed and hosted two weeks of coding challenges, designed to introduce novice coders to basic development using existing tutorials. Coders were rewarded with digital badges distributed by the application Credly.

The primary organization of the events came out of the library’s Creative Media Studio, a space designed to facilitate audio and video production as well as experimentation with emerging technologies such as 3D printing and virtual reality.  Users have access to computers and recording equipment in the space, and can check out media production devices, such as camcorders, green screens, GoPros, and more.  Our aim was to provide a fun, very low-stress way to learn about coding, provide time for new coders to get hands-on help with coding tutorials, and generally celebrate how coding can be fun.  While anyone was welcome to join, our marketing efforts specifically focused on students, with coding challenges distributed daily throughout the Innovate LA period through Facebook.

The Challenges

The coding challenges were sourced from existing coding tutorial sites such as Free Code Camp, Learn Ruby, and Codecademy.  We wanted to offer a mix of front-end and server-side coding challenges, starting with HTML, CSS and JavaScript and ramping up to PHP, Python, and Ruby.  We tested out several free tutorials and chose the ones with the most straightforward instructions that provided immediate feedback about incorrect code. We also tried to keep the interfaces consistent, using Free Code Camp most frequently so participants could get used to the interface and focus on coding rather than the tutorial mechanism itself.

Here’s a list of the challenges and their corresponding badges earned:

  • Say Hello to the HTML Elements, Headline with the H2 Element, Inform with the Paragraph Element – HTML Ninja
  • Change the Color of Text, Use CSS Selectors to Style Elements, Use a CSS Class to Style an Element – CSS Ninja
  • Use Responsive Design with Bootstrap Fluid Containers, Make Images Mobile Responsive, Center Text with Bootstrap – Bootstrapper
  • Comment your JavaScript Code, Declare JavaScript Variables, Storing Values with the Assignment Operator – JavaScript Hacker
  • Learn how Script Tags and Document Ready Work, Target HTML Elements with Selectors Using jQuery, Target Elements by Class Using jQuery – jQuery Ninja
  • Uncomment HTML, Comment out HTML, Fill in the Blank with Placeholder Text – HTML Master
  • Style Multiple Elements with a CSS Class, Change the Font Size of an Element, Set the Font Family of an Element – CSS Master
  • Create a Bootstrap Button, Create a Block Element Bootstrap Button, Taste the Bootstrap Button Color Rainbow – Bootstrap Master
  • Getting Started and Cat/Dog – JS Game Maker
  • Target Elements by ID Using jQuery, Delete your jQuery Functions, Target the same element with multiple jQuery selectors – jQuery Master
  • Hello World, Variables and Types, Lists – Python Ninja
  • Hello World, Variables and Types, Simple Arrays – PHP Ninja
  • Hello World, Variables and Types, Math – Ruby Ninja
  • How to Use APIs with JavaScript (complete through Step 9: Authentication and API Keys) – API Ninja
  • Edit or create a Wikipedia page. You may join in at the Wikipedia Edit-a-thon or do editing remotely. The Citation Hunt tool is a cool/easy way of going about editing a Wikipedia page. Narrow it to a topic that interests you and make an edit. – WikiWiz
  • Create a 3D model for an original animated character. You may use TinkerCAD or Blender as free options, or feel free to use SolidWorks or AutoCAD if you are familiar with them. If you don’t know where to begin, TinkerCAD has step-by-step tutorials for you to bring your ideas to life. – 3D Designer
  • Get a selfie with a Google Cardboard or any virtual reality goggles – VR Explorer

Note that the final three challenges – editing a Wikipedia page, creating a 3D model, and experimenting with Google Cardboard or other virtual reality (VR) goggles – are not coding challenges, but we wanted to use the opportunity to promote some of the other services the Creative Media Studio provides.  Conveniently, the library was hosting a Wikipedia Edit-A-Thon during the same period as the coding challenges, so it made sense to leverage both of those events as part of our Innovate LA programming.

The coding challenges and instructions were distributed via Facebook, and we also held “office hours” (complete with snacks) in one of the library’s computer labs to provide assistance with completing the challenges.  The office hours were mostly informal, with two library staff members available to walk users through completing and submitting the challenges.  One special office hours session brought in a guest professor from our Cinema and Television Arts program to help users with a web-based game-making tutorial he had designed.  This partnership was very successful, and that particular session had the highest attendance of any we offered.  In future iterations of this event, more advance planning would enable us to partner with additional faculty members and feature tutorials they already use effectively with students in their curriculum.

Credly

We needed a way both to accept submissions documenting completion of coding challenges and to award digital badges.  Originally we had investigated distributing digital badges through our campus learning management system, as some learning management systems like Moodle are capable of awarding digital badges.  There were a couple of problems with this: 1) we wanted the event to be open to anyone, including members of the community who wouldn’t have access to the learning management system, and 2) the digital badge capability hadn’t been activated in our campus’ instance of Moodle.  Another route we considered was accepting submissions for completed challenges through the university’s Portfolium application, which has a fairly robust ability to accept submissions of completed work, but again, this wouldn’t facilitate participation by anyone from outside the university. Credly seemed like an easy, efficient way to both accept submissions and award badges that could also be embedded in third-party applications, such as LinkedIn.  Since we hosted the competition in 2016, the capability to integrate Credly badges in Portfolium has been made available.

Credly enables you to either design your badges using Credly’s Badge Builder or upload your own badge designs.  Luckily, we had access to amazing student designers Katie Pappace, Rose Rieux, and Eva Cohen, who custom-created our badges using Adobe Illustrator.  A Credly account for the library’s Creative Media Studio was created to issue the badges, and Credly “Credits” were defined using the custom-created badge designs for each of the coding skills for which we wanted to award badges.

When you design a credit in Credly and enable others to claim it, you have several options.  You can require a claim code, which users must submit in order to claim the credit.  Claim codes are useful if you are awarding badges based not on evidence (like a file submission) but on participation or attendance at an event, where you distribute the claim code to attendees.  When claim codes are required, you can also set approval of submissions to be automatic, so that anyone with a claim code automatically receives their badge.  We didn’t require a claim code, and instead required evidence to be submitted.

When requiring evidence, you can configure what types of evidence are appropriate to receive the badge. Choices for evidence submission include a URL, a document (Word, text, or PDF), image, audio file, video file, or just an open text submission.  As users were completing code challenges, we asked for screenshots (images) as evidence of completion for most challenges.  We reviewed all submissions to ensure the submission was correct, but by requiring screenshots, we could easily see whether or not the tutorial itself had “passed” the code submission.

Awards

Credly makes it easy to count the number of badges earned by each participant. From those numbers, we were able to determine the top badge earners and award them prizes. All participants, even those with a single badge, received buttons of each of their earned badges. In addition to the virtual and physical badges, the participants with the greatest number of earned badges received prizes: the top five earners received gift cards, and the grand prize winner also got a 3D-printed trophy designed with Tinkercad, with their photo incorporated into the trophy as a lithophane. A low-stakes award ceremony was held for all contestants and winners. The top prizes were coveted, and the ceremony was a good opportunity for students to meet others interested in coding and STEM.

Lessons Learned

Our first attempt at hosting coding challenges in the library taught us a few things.  First, taking a screenshot is definitely not a skill most participants started out with – the majority of initial questions we received from participants were not related to coding, but rather involved how to take a screenshot of their completed code to submit to Credly.  For future events, we’ll definitely make sure to include step-by-step instructions for taking screenshots on both PC and Mac with each challenge, or consider an alternative method of collecting submissions (e.g., copying and pasting code as a text submission into Credly).  It’s still important to not assume that copying and pasting text from a screen is a skill that all participants will have.

As noted above, planning ahead would enable us to more effectively reach out and partner with faculty, and possibly coordinate coding challenges with curriculum.  A few months before the coding challenges, we did reach out to computer science faculty, cinema and television arts faculty, and other faculty who teach curriculum involving code, but if we had reached out much earlier (e.g., the semester before) we likely would have been able to garner more faculty involvement.  Faculty schedules are jam-packed and often set far in advance, so at least six months of advance notice is definitely appreciated.

Only about 10% of coding challenge participants came to coding office hours regularly, but that enabled us to provide tailored, one-on-one assistance to our novice coders.  A good portion of understanding how to get started with coding and application development is not related to syntax, but involves larger questions about how applications work:  if I wanted to make a website, where would my code go?  How does a URL figure out where my website code is?  How does a browser understand and render code?  What’s the difference between JavaScript (client-side code) and PHP (server-side code), and why are they different?  These were the types of questions we really enjoyed answering with participants during office hours.  Having fewer, more targeted office hours — where open questions are certainly encouraged, but where participants know the office hours are focused on particular topics — makes attending the office hours more worthwhile, and I think gives novice coders the language to ask questions they may not know they have.

One small bit of feedback that was personally rewarding for the authors: at one of our office hours, a young woman came up to us and asked if we were the planners of the coding challenges.  When we said yes, she told us how excited she was (and a bit surprised) to see women involved with coding and development.  She asked us several questions about our jobs and how we got involved with careers relating to technology.  That interaction indicated to us that future outreach could potentially focus on promoting coding to women specifically, or hosting coding office hours to enable mentoring for women coders on campus, modeling (or joining up with) Women Who Code networks.

If you’re interested in hosting support for coding activities or challenges in your library, a great resource to get started with is Hour of Code, which promotes holding one-hour introductions to coding and computer science, particularly during Computer Science Education Week.  Hour of Code provides tutorials, resources for hosts, promotional materials and more.  This year, Hour of Code week / Computer Science Education Week will be December 4–10, 2017, so start planning now!

Making a Basic LTI (Learning Tools Interoperability) App

Learning Tools Interoperability, or LTI, is an open standard maintained by the IMS Global Learning Consortium used to build external tools or plugins for learning management systems (LMS).  A common use case for LTI is building an application that can be accessed from within the LMS to perform searches and import resources into a course.  For example, the Wikipedia LTI application enables instructors to search Wikipedia and embed links to articles directly into their courses.  Academic libraries frequently struggle to integrate library resources into learning management systems, so LTI is an obvious standard to embrace as a potential way to make library resources more accessible.  However, when I began researching how to create an LTI app, I found it very difficult to find examples of existing app code and resources to get started.  You can’t just create any old web application and have it be ‘consumable’ by a learning management system in an LTI-compliant way.  In this post, I’ll outline some of the resources I found useful for getting started building your own LTI app.

LTI General Architecture

These are the basic components of an LTI application:

  • The LTI Tool Provider (TP):  This is your application.  The tool provider is the resource the user sees when they access your application from within the learning management system.  The Wikipedia LTI app linked above is an example of a tool provider.
  • The LTI Tool Consumer (TC): This is the learning management system (e.g., Blackboard, Moodle, Canvas) from which the user accesses your tool provider application.
  • The LTI Launch:  When a user accesses your tool provider from the tool consumer, this is called “launching” the LTI application.  Parameters are passed from the tool consumer to your tool provider, including authorization parameters that ensure the user is permitted to access your application, as well as information about the user’s identity, roles within the tool consumer, and the type of request the user is sending (e.g., a “content item message” is sent to your tool to indicate the user is expecting to import a link back to the tool consumer).
  • OAuth:  LTI applications use OAuth signatures for validating messages between the Tool Consumer and the Tool Provider.  LTI requires that the Tool Consumer and the Tool Provider each be configured with a shared key and secret, which is used to build an OAuth “Access Token” to enable communication between the two systems.1
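To make the OAuth piece concrete, here is a rough sketch (in Python, since the signing logic is language-agnostic) of how an OAuth 1.0 HMAC-SHA1 signature is computed over launch parameters. A tool provider validates a launch by recomputing the signature with the shared secret and comparing it to the oauth_signature parameter it received. The parameter values below are invented, and in production you would want a vetted OAuth library rather than hand-rolled code:

```python
import base64
import hashlib
import hmac
from urllib.parse import quote

def oauth_signature(method, url, params, secret):
    """Compute the OAuth 1.0 HMAC-SHA1 signature for a set of request
    parameters (token secret is empty for a basic LTI launch)."""
    enc = lambda s: quote(str(s), safe="")
    # Percent-encode and sort all parameters except the signature itself.
    pairs = sorted((enc(k), enc(v)) for k, v in params.items()
                   if k != "oauth_signature")
    param_str = "&".join(k + "=" + v for k, v in pairs)
    # Signature base string: METHOD & encoded-URL & encoded-params.
    base = "&".join([method.upper(), enc(url), enc(param_str)])
    key = enc(secret) + "&"  # no token secret in a basic launch
    digest = hmac.new(key.encode(), base.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()

# Invented launch parameters for illustration.
launch = {
    "lti_message_type": "basic-lti-launch-request",
    "lti_version": "LTI-1p0",
    "oauth_consumer_key": "mykey",
    "oauth_nonce": "abc123",
    "oauth_timestamp": "1500000000",
    "oauth_signature_method": "HMAC-SHA1",
}
sig = oauth_signature("POST", "https://tool.example.org/launch", launch, "secret")
print(sig)
```

If the recomputed signature does not match what the tool consumer sent, the launch should be rejected as invalid.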

An additional tip for developing LTI apps:  Sign up for a free instructor account for the Canvas learning management system. Canvas accounts hosted on the Instructure website enable you to add a custom LTI tool to your course (once it’s hosted on a web server, of course) and also enable you to quickly experiment with some existing LTI applications (such as the Khan Academy and Merlot LTI apps) to explore possible functionality you might want to include in your application.  This way, you can see what an instructor or student would see when they interact with your tool provider through an LMS.

Building your first “Hello, World” LTI app (with some help from Harvard)

When I first started looking into LTI, I found it really difficult to find a full (but basic) LTI application that would give me an overall picture of how LTI apps work – there are lots of LTI class libraries out there, but I wanted an example of how all the pieces of an LTI app fit together. After some fruitless Googling and GitHub searches, I finally stumbled upon this Harvard LTI workshop repository that really helped me understand how LTI applications work.  The repository includes a full working LTI application: “plug in” some basic values and you have a functioning app, complete with OAuth authentication.

First, be sure to look at the included presentation in the repository, which is a rare example of a set of presentation slides that is 100% understandable out of context, to get a general introduction to the LTI standard and what it attempts to achieve.  You’ll also want to read through the step-by-step LTI blog tutorial that will get you set up with your first “Hello, World!” LTI application, complete with valid OAuth-signed requests.2

I found it especially useful that the Harvard LTI workshop repository includes a pseudo tool consumer (which mimics how the LMS would interact with your tool) that you can use during development on localhost.  Once you follow the steps of the tutorial to build your basic “Hello, World!” single-page LTI application, you can plug the local URL of that into the tool consumer page and check out how the parameters are passed from the tool consumer to the tool provider.   You can also examine the built-in basic LTI php class library that is included, as well as the basic OAuth functionality to see how the OAuth Access Token is constructed.

Use Case:  A WorldCat Discovery API Search and Retrieval Tool for LMS

My particular use case for exploring LTI involves building a search box that would enable a faculty user to add a link to a resource from the WorldCat Discovery system.  If your library subscribes to FirstSearch or is a WorldShare Management System (WMS) customer, you likely now have access to WorldCat Discovery; but the framework I’m using to build my app would work for any discovery layer with an API (e.g., Summon, Primo, etc.).

Searching and retrieving via LTI is straightforward.  First, using the Harvard LTI workshop LTI application, I created a /lib directory to host the WorldCat Discovery PHP library published by OCLC, cloned from GitHub.  I installed the library using Composer as described in the GitHub repository readme instructions.  I created a very simple search form and response page that enables a user to enter a query and then retrieve results from the WorldCat Discovery API based on that query. Then, I set up my “tool.php” application to display the search form and POST the query to the simple response page:

tool.php:

<?php
error_reporting(E_ALL & ~E_NOTICE);
ini_set("display_errors", 1);
require_once 'ims-blti/blti.php';
$lti = new BLTI("secret", false, false);

session_start();
header('Content-Type: text/html;charset=utf-8');
?>

<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8" />
    <title>Building Tools with the Learning Tools Interoperability Specification</title>
  </head>
  <body>
  <?php 
    if ($lti->valid) {
  ?>
    <h2>Search WorldCat Discovery</h2>
      <form action="results.php" method="post" encType="application/x-www-form-urlencoded">
        Search: <input type="text" name="query" id="query" />
         <?php
    // Re-post the LTI launch parameters as hidden fields, escaping
    // them to avoid HTML injection.
    foreach($_POST as $key => $value) {
      echo "<input type=\"hidden\" name=\"" . htmlspecialchars($key) . "\" value=\"" . htmlspecialchars($value) . "\" />\n";
    }
  ?>
      <input type="submit" name="submit" value="Submit" />
      </form>
    <pre>
    </pre>
    <?php
      } else {
    ?>
      <h2>This was not a valid LTI launch</h2>
      <p>Error message: <?= $lti->message ?></p>
    <?php
      }
    ?>
    </body>
</html>

results.php:

<?php
require_once('../lib/worldcat-discovery-php/vendor/autoload.php');

   use OCLC\Auth\WSKey;
   use OCLC\Auth\AccessToken;
   use WorldCat\Discovery\Bib;

$key = 'somekey';
$secret = 'somesecret';
$options = array('services' => array('WorldCatDiscoveryAPI', 'refresh_token'));
$wskey = new WSKey($key, $secret, $options);
$accessToken = $wskey->getAccessTokenWithClientCredentials('123', '123');

$query = $_POST["query"];
$options = array(
  'useFRBRGrouping' => 'true',
  'sortBy' => 'library_plus_relevance',
  'itemsPerPage' => 25,
  );
$bib = Bib::Search($query, $accessToken, $options);

if (is_a($bib, 'WorldCat\Discovery\Error')) {
   echo $bib->getErrorCode();
   echo $bib->getErrorMessage();
} else {
    foreach ($bib->getSearchResults() as $result){
      echo '<li><a href="'. $result . '">' . $result->getName()->getValue() .' (';
      echo ($result->getDatePublished() ?  '' . $result->getDatePublished()->getValue()  : '') . ')</a></li>';
   }
}

?>

 

The application I’ve created so far is mostly a proof of concept, and I have a few essential tasks left to finish it – first, I need to re-write the URLs to point to a specific WorldCat Discovery instance (pointing to generic WorldCat.org isn’t helpful when a user wants to embed the resources of a specific library with full-text access links); second, my app needs to enable the user to return these links to the LMS so that students / course participants can click on them.

For the second point, there is an LTI specification called the “content-item-message” that indicates that the type of interaction requested from the tool is the return of a link to the LMS.  The LMS must include this input parameter in the POST request to the tool.  The LMS “knows” to send this parameter when the tool is initially installed in the LMS.

<input type="hidden" name="lti_message_type" value="ContentItemSelectionRequest" />

The POST request to the tool must also indicate the return URL (e.g., the URL back to the LMS) where the link should be sent (the LMS should generate this input parameter for you; your tool just needs to identify this parameter and include it in the POST request to return the link to the LMS):

<input type="hidden" name="content_item_return_url" value="http://www.tc.com/item-return" />

The Tool provider must then render the link to be imported with some description of the content in JSON, for example:

{
  "@context" : "http://purl.imsglobal.org/ctx/lti/v1/ContentItem", 
  "@graph" : [ 
    { "@type" : "LtiLinkItem",
      "url" : "https://someinstitution.worldcat.org/oclc/709669613",
      "mediaType" : "text/html",
      "title" : "Global Warming: Hype or Hazard?"
    }
  ]
}

See the Content Item Message documentation for more details on returning JSON suitable for consumption by the LMS.

Learn, and do, more with LTI

You may find the basic LTI class script included in the Harvard LTI tutorial is insufficient for your use case – the code is a bit aged and the LTI specification has moved on.

A more robust LTI tool provider PHP library than the basic one included in the Harvard tutorial has been made available by IMS Global on GitHub.  You can also find a complete sample app called “Rating” that is a great example of more complex kinds of interactions with an LTI app, including how you might build a server-side data store and recall that data through the LTI app, and how you might handle the assignment of grades or scores through an LTI app.

To learn more, the Canvas Learning Management system has an excellent open course on LTI development that you can enroll yourself in with a free Canvas account.  Once enrolled in the course, you can launch your own locally developed LTI app within the course to check how parameters and data are exchanged between the LMS and the tool.

  1.  See this post on LTI and OAuth for a straightforward discussion of the general implications of OAuth for LTI application development.
  2. I skipped the steps for installing Vagrant and VirtualBox and the tutorial still worked great for me on my MAMP server, so if you’re concerned about installing those and already have a local development server installed (or you’re just working from a LAMP server online) the tutorial will still work for you.

Do Library Stuff Faster with Python

Python is a great programming language to know if you work in a library: it’s (relatively) easy to learn, its syntax is fairly clear and intuitive, and it has great, robust libraries for doing routine library tasks like hacking MARC records and working with delimited data, CSV files, JSON and XML. 1  In this post, I’ll describe a couple of projects I’ve worked on recently that have enabled me to Do Library Stuff Faster using Python.  For reference, both of these scripts were written with Python 2.7 2 in mind, but could easily be adapted for other versions of Python.

Library Holdings Lookup with Beautiful Soup

Here’s a very common library dilemma:  A generous and well-meaning patron, faculty member, or friend of the library has a large personal collection of books or other materials that they would like to bequeath to your library.  They have carefully created a spreadsheet (or word document, or hand-written index) of all of the titles and authors (and maybe dates and ISBNs) in their library and want to know if you want the items.

Many libraries (for very good reason) have policies to just say “no” to these kinds of gifts.  Well-meaning library gift givers don’t always realize that it’s an enormous amount of work for a library to evaluate materials and decide whether or not they can be added to the library’s collection. Beyond relevance to their users and condition of the items, libraries don’t want to accept gifts of duplicate copies of titles they already have in their collection due to limited shelf space.

It’s that final point – how to avoid adding duplicate titles to the collection – that led me to develop a very simple (and very hacky) script that takes a list of titles and authors and does a simple lookup to see if, at minimum, we have those same titles already in the collection.  Our ILS (Innovative Interfaces’ Millennium system) does not have a way to feed in a bunch of titles and generate a report of title matches – and I would venture to say that kind of functionality is probably not available in most library systems.  Normally when presented with the dilemma of having to check whether the library already has a set of titles, we’d sit down an unfortunate student worker and have them manually work through the list – copying and pasting titles into the library catalog and noting down any matches found.  This work is incredibly boring for the student worker, and it is a prime candidate for automation: the same task is done over and over again, with a very specific set of criteria as output (match or no match).

Python’s Beautiful Soup library is built for exactly this kind of task – instead of having your student worker scan a bunch of web pages in your catalog, the script can do approximately the same thing by sending search terms to your catalog via URL, and returning back page elements that can tell you whether or not any matches were found.  In my example script, I’m using title and author elements, but you could modify this script to use other elements as long as they are indexed in your catalog – for example, you could send ISBNs, OCLC numbers, etc.

First, using Excel I concatenate a list of titles and authors with a domain and other URL elements to search my library’s catalog.  Here’s a few examples of what the URLs look like:

http://suncat.csun.edu/search~S9/X?SEARCH=t:(Los%20Angeles%20Two%20Hundred)+and+a:(Lavender)&searchscope=9&SORT=DX
http://suncat.csun.edu/search~S9/X?SEARCH=t:(The%20Land%20of%20Journeys'%20Ending)+and+a:(Austin)&searchscope=9&SORT=DX
http://suncat.csun.edu/search~S9/X?SEARCH=t:(Mathematics%20and%20Sex)+and+a:(Ernest)&searchscope=9&SORT=DX

I’ll save the full list of these (in my example, I have over 1000 titles and authors to check) in a plain text file called advancedtitleauth.txt.
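If you’d rather skip the Excel step, the same concatenation can be done in Python.  This is just a sketch: the quote call percent-encodes spaces the way the URLs above do, and the title/author pairs shown are placeholders.

```python
# Build catalog search URLs from (title, author) pairs instead of
# concatenating in Excel.  The base URL is the one used in this post.
try:
    from urllib import quote        # Python 2
except ImportError:
    from urllib.parse import quote  # Python 3

BASE = ('http://suncat.csun.edu/search~S9/X?SEARCH=t:(%s)+and+a:(%s)'
        '&searchscope=9&SORT=DX')

def build_url(title, author):
    # quote() percent-encodes spaces and other unsafe characters
    return BASE % (quote(title), quote(author))

pairs = [('Mathematics and Sex', 'Ernest')]
urls = [build_url(t, a) for t, a in pairs]
```

Writing the resulting list to a text file gives you the same advancedtitleauth.txt input described below.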

Next, I start my Python script by calling the Beautiful Soup library, along with some other useful libraries: urllib, for fetching data by URL; csv, for working with CSV files; and re, for working with regular expressions.  You’ll probably have to install Beautiful Soup on your system first, which you can do if you have the pip Python package management system 3 installed by running sudo pip install beautifulsoup4 on your system’s command line.

from bs4 import BeautifulSoup
import urllib
import csv
import re

Then I create a blank array and define a CSV file into which to write the output of the script:

url_list = []
csv_out = csv.writer(open('output.txt', 'w'), delimiter = '\t')

The CSV file I’m creating will use tabs instead of commas as delimiters (hence delimiter = '\t').  Typically when working with library data, I prefer tab-delimited text files over comma-separated files, because you never know when a random comma is going to show up in a title and create a delimiter where there should not be one.

Then I open my list of URLs, read it, append each URL to my array, and feed each URL into Beautiful Soup:

try:
  f = open('advancedtitleauth.txt', 'rb')
  for line in f:
    url_list.append(line)
    r = urllib.urlopen(line).read()
    soup = BeautifulSoup(r)

Beautiful Soup will go fetch the web page of each URL.  Now that I have the web pages, Beautiful Soup can parse out specific features of each page.  In my case, my catalog returns a page with a single record if exactly one match is found, and a browsable index when no single match is found (e.g., your title would be here, but it isn’t, so here’s some stuff with titles that would be nearby).  I can use Beautiful Soup to return page elements that tell me whether a match was found, and if a match is found, to write the permanent URL of the match for later evaluation.  This bit of code looks for an HTML div element with the class “bibRecordLink” on the page, which only appears when a single match is found.  If this div is present on the page, the script grabs the link and drops it into the output file.

try:
      link = soup.find_all("div", class_="bibRecordLink")
      directlink = str(link[0])
      directlink = "http://suncat.csun.edu" + directlink[36:]

In the code above, [36:] is Python’s way of noting the start position of a string – so in this case, I’m getting the end of the string starting with the 36th character (which in my case, is the bibliographic ID number of the item that allows me to construct a permalink).
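If slice notation is new to you, a quick illustration:

```python
s = 'abcdefghij'

# s[start:] keeps everything from the start index to the end;
# s[:stop] keeps everything before the stop index;
# s[start:stop] keeps the middle.
assert s[3:] == 'defghij'
assert s[:3] == 'abc'
assert s[3:6] == 'def'
```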

If a title/author search results in multiple possible matches – that is, we might have multiple copies, or the title/author combo is too vague to land on just one item, the page that displays in our catalog shows a browsable list of brief record info.  In my script, I just grab the top result:

 try:
      briefcit = soup.find_all("span", class_="briefcitTitle")
      bestmatch = str(briefcit[0])
      sep = "&"
      bestmatch = bestmatch.split(sep, 1)[0]
      bestmatch = "http://suncat.csun.edu/" + bestmatch[39:]

In the code above, Beautiful Soup finds all the <span> elements with the class “briefcitTitle”, the script returns the first one, and again a URL is stored in the bestmatch variable.

You can see a sample output of my lookup script here.  You can see that for each entry, I include publication information, direct links, or a best match link if the elements are found.  If none of the elements are found for a lookup URL, the line reads:

nopub nolink nomatch

We can now divide the output file into “no match” entries, direct links, or best match links.  Direct links and best match links will need to be double-checked by a student worker to make sure they actually represent the item we looked up, including the date and edition.  The “no match” entries represent titles we don’t have in our collection, so those can be evaluated more closely to determine if we want them.
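Dividing the output file can itself be scripted.  Here’s a minimal sketch, assuming the tab-delimited layout described above, where no-match rows contain the literal string “nomatch”:

```python
import csv

def partition_results(rows):
    """Split lookup output rows into no-match entries and entries with links."""
    nomatch, links = [], []
    for row in rows:
        (nomatch if 'nomatch' in row else links).append(row)
    return nomatch, links

# Usage against the script's output file:
# with open('output.txt') as f:
#     nomatch, links = partition_results(csv.reader(f, delimiter='\t'))
```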

The script certainly has room for improvement; I could write in a lot more functionality to better identify publication information, for example, to possibly reduce or eliminate the need for manual review of direct or partial matches.  But the return on investment is fairly high for a 37-line script written in an afternoon; we can re-use it dozens of times, and hopefully save countless hours of student worker boredom (and re-assign those student workers to more complex and meaningful tasks!).

Rudimentary Keyword Frequency Analysis

This second example involves, again, dealing with a task that could be done manually, but can be done much more quickly with a script.

My university needed to submit data for the AASHE Sustainability Tracking, Assessment, and Rating System (STARS) Report (https://stars.aashe.org/), which requires the analysis of data from campus course offerings as well as research output by faculty.  To submit the report, we needed to know how many courses we offer in sustainability (defined by AASHE “in an inclusive way, encompassing human and ecological health, social justice, secure livelihoods and a better world for all generations”) and how many faculty do research in sustainability.  This project was broken up into two components:  Analysis of research data by faculty and analysis of course data.

Sustainability Keywords

Before we even started analyzing research or course data, we needed to define an approach to identifying what counts as “sustainability.”  Thankfully, there was precedent from the University of North Carolina, which had developed a list of sustainability-related keywords used to search against faculty research output.4  We adopted this list of keywords to look up in faculty research articles and course descriptions.

Research data by faculty

We don’t have a comprehensive inventory of research done by faculty at our campus.  Because we were on a somewhat tight deadline to do the analysis, I came up with a very quick and dirty way of getting a lot of citations by using Web of Science.  Web of Science enables you to do a search for research published by affiliates of your university.  I was able to retrieve about 8,000 citations written by current or former faculty associated with my institution going back about 15 years.  Of course, we cannot consider the data in Web of Science to be fully representative of faculty research output, but it seemed like a good start at least.

Web of Science enables you to export 500-record chunks of metadata, so it took an hour or so to export the metadata in several pieces (see Figure 1 for my Web of Science export criteria).

A screenshot of Web of Science's Output Records function showing the following fields selected: All records in this list (up to 500), Author(s) / Editor(s), Abstract, PubMedID, Title, Source, Keywords, Web of Science Categories, Conference Information, Research Areas.
Figure 1. Web of Science’s Output Records function showing metadata fields selected.

Once I had all of the metadata for the 8,000 or so records written by faculty at my institution, I combined them into a single file.  Next, I needed to identify records that had sustainability keywords in either the title or abstract.

First, I created an array of all of the keywords, and turned that list into a Python set.  A Python set is different from a list in that the order of terms does not matter, and is ideal for checking membership of items in the set against strings (or in my case, a bunch of citation and abstract strings).

word_list = 'Agriculture,Alternative,Applied%Science [..snip..]'
word_set = set(word_list.split(','))

Note the % in “Applied%Science”.  My set lookup couldn’t match terms with spaces, because the comparison is against individual words produced by .split() – a multi-word keyword can never equal a single word.  My hacky solution was to replace spaces with % characters, and then do a find/replace in my spreadsheet of Web of Science data to replace all keyword matches with spaces (such as Applied Science) with percentage signs (Applied%Science).  Luckily, there were only 10 or so keywords on the list with spaces, so the find/replace did not take very long.  Note also that the set match lookup is case sensitive, so I actually found it easier to just turn everything to lower case in my Web of Science spreadsheet and match on the lower case term (though I kept both upper and lower case terms in my lookup set).
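An alternative to the % workaround is to treat multi-word keywords as substring checks rather than set members.  A sketch, using an illustrative subset of keywords rather than the full UNC list:

```python
keywords = ['agriculture', 'invest', 'applied science']  # illustrative subset

# Single words can use fast set membership; phrases fall back to substring tests.
single_words = {k for k in keywords if ' ' not in k}
phrases = [k for k in keywords if ' ' in k]

def find_keywords(text):
    text = text.lower()
    found = single_words & set(text.split())
    found |= {p for p in phrases if p in text}
    return found
```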

Then I checked to see if any words were in the title, abstract, or both, and constructed my query so that a new column would be added to an output spreadsheet indicating *which* matches were found:

for row in csv_reader:
  if (set(row[23].split()) & word_set) and (set(row[9].split()) & word_set):
    csv_out.writerow(["title & abstract match",row[61],row[1],row[9],row[23],(set(row[9].split()) & word_set), (set(row[23].split()) & word_set)])

If any of the words in my set were found in both column index 23 of the spreadsheet (the abstract) and column index 9 (the title), then a row would be written to an output sheet indicating that sustainability keywords were found in the title and abstract, pulling in some citation details about the article (including author names), as well as a cell listing the matches found for both the title and abstract fields.

I did similar conditionals for rows that found, for example, just a title match or just an abstract match:

  elif set(row[9].split()) & word_set:
    csv_out.writerow(["title match",row[61],row[1],row[9],row[23], (set(row[9].split()) & word_set)])
  elif set(row[23].split()) & word_set:
    csv_out.writerow(["abstract match",row[61],row[1],row[9],row[23], (set(row[23].split()) & word_set)])

And that is pretty much the whole script!  With the output file, I did have to do a bit more work to identify current faculty at my institution, but I basically used the same set matching method above using a list provided by HR.

Because the STARS report also required analysis of courses related to sustainability, I also created a very similar script to lookup key terms found in course titles and descriptions.

Of course, just because a research article or course description has a keyword, or even multiple keywords, does not mean it’s relevant at all to sustainability.  One of the keywords identified as related to sustainability, for example, is “invest”, which basically meant that almost every finance class returned as a match.  Manual work was required to review the matches and weed out false positives, but because the keyword matching was already done and we could easily see what matches were found, this work was done fairly quickly.  We could, for example, identify courses and research articles that only had a single keyword match.  If that single keyword match was something like “sustainability” it was likely a sustainability-related course and would merit further review; if the single keyword match was something like “systems” it could probably be weeded out.

As with my author/title lookup script, if I had a bit more time to fuss with the script, I could have probably optimized it further (for example, by assigning weight to more sustainability-related keywords to help calculate a relevance score).  But again, a short amount of time invested in this script saved a huge amount of time, and enabled us to do something we would not otherwise have been able to do.
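The weighting idea might look something like this – a sketch only, with weights invented for illustration:

```python
# Hypothetical per-keyword weights: high-signal terms count for more,
# generic terms for less; anything unlisted counts as 1.
weights = {'sustainability': 3.0, 'renewable': 2.0, 'systems': 0.25, 'invest': 0.25}

def relevance_score(found_keywords):
    """Sum the weights of the keywords matched for a record."""
    return sum(weights.get(k, 1.0) for k in found_keywords)
```

Records scoring below some threshold could then be weeded automatically, leaving only borderline cases for manual review.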

Python Resources

If you’re interested in learning more about Python and its syntax, and don’t have a lot of Python experience, a good (free) place to start is Google’s Python Class, created by Nick Parlante for Google (I actually took a similar class several years ago, also created by Dr. Parlante, through Coursera, which looks to still be available).  If you want to get started using Python right away and don’t want to have to fuss with installing it on your computer, you can check out the interactive course How to Think Like a Computer Scientist created by Brad Miller and David Ranum at Luther College.  For more examples of usage in Python for library work, check out Charles Ed Hill, Heidi Frank, and Mark Pernotto’s Python chapter in the just-released LITA Guide The Librarian’s Introduction to Programming Languages, edited by Beth Thomsett-Scott (full disclosure: I am a contributor to this book).

  1. Working with CSV files and JSON Data.  In Sweigart, Al (2015). Automate the Boring Stuff with Python: Practical Programming for Total Beginners. San Francisco: No Starch Press.
  2. For an explanation of the difference between Python 2 and 3, see https://wiki.python.org/moin/Python2orPython3.  The reason I use Python 2.7 for these scripts is because of my computing environment (in which Python 2 is installed by default), but if you have Python 3 installed on your computer, note that syntactical changes in Python 3 mean that many Python 2.x scripts may require revision in order to work.
  3. For instructions on using Pip with your Python installation, see: https://pip.pypa.io/en/latest/installing/
  4.  Blank-White, Kristen. 2014. Researching the Researchers: Developing a Sustainability Research Inventory.  Presented at the 2014 AASHE Conference and Expo, Portland OR. http://www.aashe.org/files/2014conference/presentations/secondPresentationUpload/Blank-White-Kristin_Researching-the-Researchers-Developing-a-Sustainability-Research-Inventory.pdf.

Supporting Library Staff Technology Training

Keeping up with technical skills and finding time to learn new things can be a struggle, no matter your role in a library (or in any organization, for that matter).  In some academic libraries, professional development opportunities have been historically available to librarians and library faculty, and less available (or totally unavailable) for staff positions.  In this post, I argue that this disparity, where it may exist, is not only prima facie unfair, but can reduce innovation and willingness to change in the library.  If your library does not have a policy that specifically addresses training and professional development for all library staff, this post will provide some ideas on how to start crafting one.

In this post, when referring to “training and professional development,” I mostly have in mind technology training – though a training policy could cover non-technical training, such as leadership, time management, or project management training (though of course, some of those skills are closely related to technology).

Rationale

In the absence of a staff training policy or formal support for staff training, staff are likely still doing the training, but may not feel supported by the library to do so.  In ACRL TechConnect’s 2015 survey on learning programming in libraries, respondents noted disparities at their libraries between support for technical training for faculty or librarian positions and staff positions.  Respondents also noted that even though support for training was available in principle (e.g., funding was potentially available for travel or training), workloads were too high to find the time to complete training and professional development, and some respondents indicated learning on their own time was the only feasible way to train.   A policy promoting staff training and professional development should therefore explicitly allocate time and resources for training, so that training can actually be completed during work hours.

There is not a significant amount of recent research reflecting the impact of staff training on library operations.  Research in other industries has found that staff training can improve morale, reduce employee turnover and increase organizational innovation.1  In a review of characteristics of innovative companies, Choudhary (2014) found that “Not surprisingly, employees are the most important asset of an organization and the most important source of innovation.” 2  Training and workshops – particularly those that feature “lectures/talks from accomplished persons outside the organization” are especially effective in fostering happy and motivated employees 3 – and it’s happy and motivated employees that contribute most to a culture of innovation in an organization.

Key Policy Elements

Time

Your policy should outline how much time for training is available to each employee (for example, 2 hours a week or 8 hours a month).  Ensuring that staff have enough time for training while covering their existing duties is the most challenging part of implementing a training policy or plan.  For service desks in particular, scheduling adequate coverage while staff are doing professional development can be very difficult – especially as many libraries are understaffed.  To free up time, an option might be to train and promote a few student workers to do higher-level tasks to cover staff during training (you’ll need to budget to pay these students a higher wage for this work).  If your library wants to promote a culture of learning among staff, but there really is no time available to staff to do training, then the library probably needs more staff.

A training policy should be clear that training should be scheduled in advance with supervisor approval, and supervisors should be empowered to integrate professional development time into existing schedules.  Your policy may also specify that training hours can be allocated more heavily during low-traffic times in the library, such as summer, spring, and winter breaks, and that employees will likely train less during high-traffic or project-intensive times of the year.  In this way, a policy that specifies that an employee has X number of training hours per month or year might be more flexible than a policy that calls for X number of training hours per week.

Equipment and Space

Time is not enough.  Equipment – particularly mobile devices such as iPads or laptops – should also be available for staff use and checkout. These devices should be configured to enable staff to install required plugins and software for viewing webinars and training videos.   Library staff whose offices are open and vulnerable to constant interruption by patrons or student workers may find training is more effective if they have the option to check out a mobile device and head to another area – away from their desk – to focus.  Quiet spaces and webinar viewing rooms may also be required, and most libraries already have group or individual study areas.  Ensure that your policy states whether or how staff may reserve these spaces for training use.

Funding

There are tons of training materials, videos, and courses freely available online – but there are also lots of webinars and workshops that cost money and are totally worth paying for.  A library that offers funding for professional development for some employees (such as librarians or those with faculty status), but not others, risks alienating staff and sending the message that staff learning is not valued by the organization.  Staff should know what the process is to apply for funding to travel, attend workshops, and view webinars.  Be sure to write up the procedures for requesting this funding, either in the training policy itself or documented elsewhere and available to all employees.  Funding might be limited, but it’s vital to be transparent about funding request procedures.

An issue that is probably outside of the scope of a training policy, but is nonetheless very closely related, is staff pay.  If you’re asking staff to train more, know more, and do more, compensation needs to reflect this. Pay scales may not have caught up to the reality that many library staff positions now require technology skills that were not necessary in the past; some positions may need to be re-classed.  For this reason, creating a staff training policy may not be possible in a vacuum, but this process may need to be integrated with a library strategic planning and/or re-organization plan.  It’s incredibly important on this point that library leadership is on board with a potential training policy and its strategic and staffing implications.

Align Training with Organizational Goals

It likely goes without saying that training and professional development should align with organizational goals, but you should still say it in your policy – and specify where those organizational goals are documented. How those goals are set is determined by the strategic planning process at your library, but you may wish to outline in your policy that supervisors and department heads can set departmental goals and encourage staff to undertake training that aligns with these goals.  This can, in theory, get a little tricky: if we want to take a yoga class as part of our professional development, is that OK?  If your organization values mindfulness and/or wellness, it might be!

If your library wants to promote a culture of experimentation and risk-taking, consider explicitly defining and promoting those values in your policy.  This can help guide supervisors when working with staff to set training priorities.  One exciting potential outcome of implementing a training policy is to foster an environment where employees feel secure in trying out new skills, so make it clear that employees are empowered to do so.

Communication / Collaboration

Are there multiple people in your library interested in learning Ruby?  If there were, would you have any way of knowing?  Effective communication can be a massive challenge on its own (and is way beyond the scope of this post), but when setting up and documenting a training policy, you could include guidance for how staff should communicate their training activities with the rest of the library.  This could take the form of something totally low-tech (like a bulletin board or shared training calendar in the break room) or could take the form of an intranet blog where everyone is encouraged to write a post about their recent training and professional development experiences.  Consider planning to hold ‘share-fests’ a few times a year where staff can share new ideas and skills with others in the library to further recognize training accomplishments.

Training is in the Job Description

Training and professional development should be included in all job descriptions (a lot easier said than done, admittedly).  Employees need to know they are empowered to use work time to complete training and professional development.  There may be union, collective bargaining, and employee review implications to this – which I certainly am not qualified to speak on – but these issues should be addressed when planning to implement a training policy.  For new hires going forward, expect to have a period of ‘onboarding’ during which time the new staff member will devote a significant amount of time to training (this may already be happening informally, but I have certainly had experiences as a staff member being hired in and spending the first few weeks of my new job trying to figure out what my job is on my own!).

Closing the Loop:  Idea and Innovation Management

OK, so you’ve implemented a training policy, and now training and professional development is happening constantly in your library.  Awesome!  Not only is everyone learning new skills, but staff have great ideas for new services, or are learning about new software they want to implement.  How do you keep the momentum going?

One option might be to set up a process to track ideas and innovative projects in your library.  There’s a niche software industry around idea and innovation management that features some highly robust and specialized products (Brightidea, Spigit  and Ideascale are some examples), but you could start small and integrate idea tracking into an existing ticket system like SpiceWorks, OSTicket, or even LibAnswers.  A periodic open vote could be held to identify high-impact projects and prioritize new ideas and services.  It’s important to be transparent and accountable for this – adopting internally-generated ideas can in and of itself be a great morale-booster if handled properly, but if staff feel like their ideas are not valued, a culture of innovation will die before it gets off the ground.

Does your library have a truly awesome culture of learning and employee professional development?  I’d love to hear about it in the comments or @lpmagnuson.

Notes

  1. Sung, S. , & Choi, J. (2014). Do organizations spend wisely on employees? effects of training and development investments on learning and innovation in organizations. Journal of Organizational Behavior,35(3), 393-412.
  2.  Choudhary, A. (2014). Four Critical Traits of Innovative Organizations. Journal of Organizational Culture, Communication and Conflict, 18(2), 45-58.
  3. Ibid.

Store and display high resolution images with the International Image Interoperability Framework (IIIF)

Recently a faculty member working in the Digital Humanities on my campus asked the library to explore International Image Interoperability Framework (IIIF) image servers, with the ultimate goal of determining whether it would be feasible for the library to support a IIIF server as a service for the campus.  I typically am not very involved in supporting work in the Digital Humanities on my campus, despite my background in (and love for) the humanities (philosophy majors, unite!). Since I began investigating this technology, I seem to see references to IIIF-compliance popping up all over the place, mostly in discussions related to IIIF compatibility in Digital Asset Management System (DAMS) repositories like Hydra 1 and Rosetta 2, but also including ArtStor3 and the Internet Archive 4.

IIIF was created by a group of technologists from Stanford, the British Library, and Oxford to solve three problems: 1) slow loading of high resolution images in the browser, 2) high variation of user experience across image display platforms, requiring users to learn new controls and navigation for different image sites, and 3) the complexity of setting up high performance image servers.5 Image servers traditionally have also tended to silo content, coupling back-end storage with either customized or commercial systems that do not allow additional 3rd party applications to access the stored data.

By storing your images in a way that multiple applications can access and render them, you enable users to discover your content through a variety of different portals. IIIF does this by standardizing API access to images, so that any compatible application can retrieve them. For example, if you have images stored in a IIIF-compatible server, multiple front-end discovery platforms could access the images through the API, either at your own institution or at other institutions interested in providing gateways to your content. You might have images that are relevant to multiple repositories or collections; for instance, you might want your images to be discoverable through your institutional repository, discovery system, and digital archives system.

IIIF systems are designed to work with two components: an image server (such as the Python-based Loris application)6 and a front-end viewer (such as Mirador 7 or OpenSeadragon8).  There are other viewer options out there (IIIF Viewer 9, for example), and you could conceivably write your own viewer application, or write a IIIF display plugin that can retrieve images from IIIF servers.  Your image server can serve up images via APIs (discussed below) to any IIIF-compatible front-end viewer, and any IIIF-compatible front-end viewer can be configured to access information served by any IIIF-compatible image server.

IIIF Image API and Presentation API

IIIF-compatible software enables retrieval of content from two APIs: the Image API and the Presentation API. As you might expect, the Image API is designed to enable the retrieval of actual images. Supported file types depend on the image server application being used, but API calls enable the retrieval of specific file type extensions including .jpg, .tif, .png, .gif, .jp2, .pdf, and .webp.10 A key feature of the API is the ability to request images by region – meaning that if only a portion of the image is requested, the image server can return precisely the area requested.11 This enables faster, more nimble rendering of detailed image regions in the viewer.
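Before requesting regions, a client first learns an image's full dimensions from the image information ("info.json") document that an Image API server publishes for each image. Below is a minimal sketch of parsing such a response; the URLs and dimensions are hypothetical examples, abridged from the shape the Image API 2.0 specification defines.

```python
import json

# A minimal, abridged image-information ("info.json") response of the kind
# a IIIF Image API 2.0 server publishes at {server}/{identifier}/info.json.
# All URLs and dimensions below are hypothetical examples.
sample = """{
  "@context": "http://iiif.io/api/image/2/context.json",
  "@id": "http://www.example.org/imageservice/abcd1234",
  "protocol": "http://iiif.io/api/image",
  "width": 6000,
  "height": 4000,
  "profile": ["http://iiif.io/api/image/2/level2.json"]
}"""

info = json.loads(sample)
# A viewer uses these dimensions to decide which regions or tiles to request.
print(info["width"], info["height"])
```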

A screenshot showing a region of an image that can be returned via a IIIF Image API request. The region to be retrieved is specified using pixel area references (Left, Top, Right, Bottom). These reference points are then included in the request URI. (Image Source: IIIF Image API 2.0. http://iiif.io/api/image/2.0/#region)

The basic structure of a request to a IIIF image server follows a standard scheme:

{scheme}://{server}{/prefix}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}

An example request to a IIIF image server might look like this:12

http://www.example.org/imageservice/abcd1234/full/full/0/default.jpg
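As a rough illustration of how a client might assemble request URIs from this template, here is a minimal sketch; the server base URL and identifier are hypothetical examples, and a real client would follow the parameter syntax defined in the Image API specification.

```python
# Sketch: assembling IIIF Image API request URIs from the template above.
# The server base URL and identifier are hypothetical examples.

def iiif_image_url(server, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    # {scheme}://{server}{/prefix} are passed in together as `server`
    return "{}/{}/{}/{}/{}/{}.{}".format(
        server, identifier, region, size, rotation, quality, fmt)

# Request the full image at full size:
full = iiif_image_url("http://www.example.org/imageservice", "abcd1234")

# Request only a 100x150-pixel region whose top-left corner is at (125,15),
# returned as a PNG:
detail = iiif_image_url("http://www.example.org/imageservice", "abcd1234",
                        region="125,15,100,150", fmt="png")

print(full)
print(detail)
```

Because the region parameter travels in the URI itself, a viewer can request exactly the detail a user has zoomed into rather than downloading the whole high-resolution file.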

The Presentation API returns contextual and descriptive information about images, such as how an image fits in with a collection or compound object, or annotations and properties to help the viewer understand the origin of the image. The Presentation API retrieves metadata stored as “manifests” that are often expressed as JSON for Linked Data, or JSON-LD.13 Image servers such as Loris may only provide the ability to work with the Image API; Presentation API data and metadata can be stored on any server and image viewers such as Mirador can be configured to retrieve presentation API data.14
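For a sense of what a manifest looks like, here is a heavily abridged sketch of a Presentation API 2.0 manifest in JSON-LD. The URIs and labels are hypothetical, and a real manifest's canvases would each contain image annotations pointing back at an Image API service.

```json
{
  "@context": "http://iiif.io/api/presentation/2/context.json",
  "@id": "http://www.example.org/manifests/abcd1234/manifest.json",
  "@type": "sc:Manifest",
  "label": "Example Manuscript",
  "metadata": [
    { "label": "Repository", "value": "Example Library" }
  ],
  "sequences": [
    {
      "@type": "sc:Sequence",
      "canvases": [
        {
          "@id": "http://www.example.org/manifests/abcd1234/canvas/1",
          "@type": "sc:Canvas",
          "label": "folio 1 recto",
          "height": 4000,
          "width": 6000,
          "images": []
        }
      ]
    }
  ]
}
```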

Why would you need a IIIF Image Server or Viewer?

IIIF servers and their APIs are particularly suited for use by cultural heritage organizations. The ability to use APIs to render high resolution images in the browser efficiently is essential for collections like medieval manuscripts, which have very fine details that lower-quality image rendering might obscure. Digital humanities, art, and history scholars who need access to high quality images for their research are able to zoom, pan, and analyze images very closely.  This architecture can also facilitate collaborative editing of metadata – for example, a separate viewing client could be set up specifically to enable scholars to add metadata, annotations, or translations to documents without necessarily publishing the enhanced data to other repositories.

Example: Biblissima

A nice example of the power of the IIIF framework is the Biblissima Mirador demo site. As the project website describes it,

In this demo, the user can consult a number of manuscripts, held by different institutions, in the same interface. In particular, there are several manuscripts from Stanford and Yale, as well as the first example from Gallica and served by Biblissima (BnF Français 1728)….

It is important to note that the images displayed in the viewer do not leave their original repositories; this is one of the fundamental principles of the IIIF initiative. All data (images and associated metadata) remain in their respective repositories and the institutions responsible for them maintain full control over what they choose to share.15

A screenshot of the Biblissima Mirador demo site.
The Biblissima Mirador demo site displays images that are gathered from remote repositories via API. In this screenshot, the viewer can select from manuscripts available from Yale, the National Library of Wales, and Harvard.

The approach described by Biblissima represents the increasing shift toward designing repositories that guide users toward linked or related information not actually held by the repository.  While I can certainly anticipate some problems with this approach for some archival collections – injecting objects from other collections might skew the authentic representation of a collection, even if the objects are directly related to each other – it might work well to help represent provenance for collections that have been broken up across multiple institutions. Without this kind of architecture, researchers would have to visit and keep track of multiple repositories that contain similar collections or associated objects. Manuscript collections are particularly suited to this kind of approach: a single manuscript may have been separated into individual leaves held by multiple institutions worldwide, and these manuscripts can be digitally re-assembled without requiring institutions to transfer copies of files to multiple repositories.

One challenge we are running into in exploring IIIF is how to incorporate this technology into existing legacy applications that host high resolution images (for example, ContentDM and DSpace).  We wouldn’t necessarily want to build a separate IIIF image server – it would be ideal if we could continue storing our high-resolution images in our existing repositories and pull them together with a IIIF viewer such as Mirador.  There is a Python-based translator to enable ContentDM to serve up images using the IIIF standard,16 but I’ve found it difficult to find case studies or step-by-step implementation and troubleshooting information (if you have set up IIIF with ContentDM, I’d love to know about your experience!).  To my knowledge, there is not an existing way to integrate IIIF with DSpace (but again, I would love to stand corrected if there is something out there).  Because IIIF is such a new standard, and legacy applications were not necessarily built to enable this kind of content distribution, it may be some time before legacy digital asset management applications integrate IIIF easily and seamlessly.  Apart from these applications serving up content for use with IIIF viewers, embedding IIIF viewer capabilities into existing applications would be another challenge.

Finally, another challenge is discovering IIIF repositories from which to pull images and content.  Libraries looking to explore supporting IIIF viewers will certainly need to collaborate with content experts, such as archivists, historians, digital humanities and/or art scholars, who may be familiar with external repositories and sources of IIIF content that would be relevant to building coherent collections for IIIF viewers.  Viewers are manually configured to pull in content from repositories, and so any library wanting to support a IIIF viewer will need to locate sources of content and configure the viewer to pull in that content.

Supporting IIIF servers and viewers is not a trivial undertaking, but it can be a way for libraries to expand the visibility and findability of their own high-resolution digital collections (by exposing content through a IIIF-compatible server) or to enable their users to find content related to their collections (by supporting a IIIF viewer).  While my library hasn’t determined what exactly our role will be in supporting IIIF technology, we will definitely be using what we learn from this experience to shape our exploration of emerging digital asset management systems, such as Hydra and Islandora.

More Information

  • IIIF Website: http://search.iiif.io/
  • IIIF Metadata Overview: https://lib.stanford.edu/home/iiif-metadata-overview
  • IIIF Google Group: https://groups.google.com/forum/#!forum/iiif-discuss

Notes

 

  1. https://wiki.duraspace.org/display/hydra/Page+Turners+%3A+The+Landscape
  2.  Tools for Digital Humanities: Implementation of the Mirador high-resolution viewer on Rosetta – Roxanne Wyns, Business Consultant, KU Leuven/LIBIS – Stephan Pauls, Software architect. http://igelu.org/wp-content/uploads/2015/08/5.42-IGeLU2015_5.42_RoxanneWyns_StephanPauls_v1.pptx
  3. “Bottled or Tap?” A Map for Integrating International Image Interoperability Framework (IIIF) into Shared Shelf and Artstor. D-Lib Magazine, July/August 2015. http://www.dlib.org/dlib/july15/ying/07ying.html
  4. https://blog.archive.org/2015/10/23/zoom-in-to-9-3-million-internet-archive-books-and-images-through-iiif/
  5. Snydman, Stuart, Robert Sanderson and Tom Cramer. 2015. The International Image Interoperability Framework (IIIF): A community & technology approach for web-based images. Archiving Conference 1. 16-21(6). https://stacks.stanford.edu/file/druid:df650pk4327/2015ARCHIVING_IIIF.pdf.
  6. https://github.com/pulibrary/loris
  7. http://github.com/IIIF/mirador
  8.  http://openseadragon.github.io/
  9. http://klokantech.github.io/iiifviewer/
  10.  http://iiif.io/api/image/2.0/#format
  11. http://iiif.io/api/image/2.0/#region
  12. Snydman, Sanderson, and Cramer, The International Image Interoperability Framework (IIIF), 2
  13. http://iiif.io/api/presentation/2.0/#primary-resource-types-1
  14. https://groups.google.com/d/msg/iiif-discuss/F2_-gA6EWjc/2E0B7sIs2hsJ
  15.  http://www.biblissima-condorcet.fr/en/news/interoperable-viewer-prototype-now-online-mirador
  16. https://github.com/IIIF/image-api/tree/master/translators/ContentDM

Accessibility Testing LibGuides 2.0

Over the summer my library began investigating a potential migration from our current, Drupal-based subject guide system to the LibGuides content management system.  As part of our investigation, and with resources from our campus’ Universal Design Center 1, I began an initial review to determine the extent to which LibGuides 2.0 is accessible to all users, including users with disabilities and those using assistive technologies.  Our campus, like other California State University campuses, has a strong commitment to ensuring technology is accessible to all users.  The campus has a fairly extensive process for acquiring new technologies that requires all departments to review the accessibility of any technology or web-based product purchased, and the Universal Design Center assists all departments on campus with these evaluations.  While evaluating technology for accessibility is not typically my area of responsibility (in fact, I rarely have involvement in end-user facing technology, let alone testing for usability and accessibility), in this case I was interested in using LibGuides as an opportunity to learn more about accessibility for my own knowledge.  Ensuring that web content is accessible requires a blend of skills related to web markup, an understanding of user behavior, and knowledge of assistive technologies, and as a librarian I know I can benefit from a solid understanding of all of these areas.

While I am by no means an expert on accessibility, I am familiar with basic guidelines of accessibility for content creation and markup. 2  Of course, accessibility and usability in a content management system depend, in large part, on the practices followed by content creators.  LibGuides authors have a significant amount of control over the accessibility of the content they create.  For example, using the HTML source code editing features of LibGuides, any guide author can ensure their own markup is compliant with accessibility guidelines, and manually add elements such as alternative text, titled iFrames, or ARIA attributes.  However, I was especially interested in identifying any issues that LibGuides guide authors could not easily modify themselves.  While many features can be overridden via the extensive CSS customization available in LibGuides 2.0’s Bootstrap Framework3, I wanted to identify those ‘out-of-the-box’ elements that posed accessibility problems.

The issues identified below have been reported to SpringShare, and I was told by SpringShare support that all of them are being investigated and are already ‘on the list’ for future development.  As this is my first attempt at a real deep dive into web accessibility, I’m very interested in feedback about the issues identified below.  I hope I’ve interpreted the standards correctly, but I definitely welcome any feedback or corrections!

Method

A sample guide was created in a LibGuides demo instance to evaluate all built-in LibGuides box types, content types, and various multimedia elements to determine Section 508 compliance.  The following features were included on the guide that was used for testing:

LibGuides Box Types:

  • Tabbed
  • Gallery
  • Profile

LibGuides Content Types:

  • Rich Text/HTML
  • Database
  • Link
  • Media/Widget
  • Book from the Catalog
  • Document/File
  • RSS Feed
  • Guide List
  • Poll
  • Google Search

Free tools used to evaluate LibGuides accessibility include:

  • W3C Markup Validator – Valid markup is usually much more accessible markup.  Unclosed tags or nesting problems can often cause problems with screen readers, keyboard navigation, or other assistive technologies.
  • WebAIM WAVE Accessibility Tool – Enter the URL of your page, and the WAVE Tool will examine the page and automatically identify accessibility errors (elements, such as form labels, that are required for accessibility that are absent or problematically implemented), alerts (potential issues that could be improved) and features (good accessibility practices).
  • CynthiaSays – Similar to the WAVE tool, CynthiaSays automatically reads through the markup of a URL you provide and generates a comprehensive report of problems and potential issues.
  • Mozilla Firefox with the following extensions (there are likely Chrome alternatives to these):
    • Fangs – A screen reader emulator that enables you to view a text-only version of a page the way a screen-reader would read it.  Ensuring that your page is read by a screen reader the way you intend is essential for accessibility, and Fangs enables you to review the screen-readability of your page without downloading a full screen-reading desktop client such as JAWS.
    • WCAG Color Contrast Checker – A handy tool to quickly view the color contrast of your page in the browser.  Low contrast elements, such as yellow text on a white background, can be very difficult to see for a variety of users.
  • Colour Contrast Analyser – A helpful desktop client that enables automated checking to ensure that web page elements or images contain high enough contrast to be viewed and read easily by a wide variety of users.
  • JAWS – JAWS is a very popular screen reading application that enables web pages to be navigated and read aloud to users.  While this software has a cost, a free trial can be downloaded temporarily to preview the software’s functionality.

Guidelines from the US Federal Government’s Section 508 Accessibility Program, W3C’s WCAG 2.0, and CSU Northridge’s Web Accessibility Criteria were used in this evaluation.

Findings

The features below do not conform to Section 508 and/or WCAG 2.0 guidelines, and their implementation in LibGuides does not enable guide authors to easily override the code to improve accessibility manually.

Polls: Lack clear labeling of form elements (Section 508 1194.22(n))

In our testing, Poll elements lack “for” attributes in their label tags that point to the “id” attributes of the associated form elements.  Instead, poll forms make use of ‘implicit labels’, where the form element and its associated label are contained within opening and closing label tags.  For example, radio button code from a poll element is generated by LibGuides as:

<div class="radio">
<label>
<input type="radio" class="pad-left-med" name="s-lg-poll-option-13342416" 
id="s-lg-poll-option-13342416_1" value="83823" >Never
</label>
</div>

More accessible code might instead look like:

<div class="radio">
<label for="s-lg-poll-option-13342416_1">Never</label>
<input type="radio" class="pad-left-med" name="s-lg-poll-option-13342416" 
id="s-lg-poll-option-13342416_1" value="83823">
</div>
Cover images from ‘Books from the Catalog’:  Lack textual description (Section 508 1194.22(a))

In testing, whether covers were retrieved from Syndetics or Amazon, or whether default (blue or white) covers were used, all resulting “Books from the Catalog” elements lacked ALT attributes.  The images do, however, have title elements.  It could be argued that these elements are decorative and therefore do not require alternative text.  However, the default title elements (derived from the title of the book) are not especially descriptive and do not help the user understand the role of the image on the page.

For example:

<img alt="" src="http://syndetics.com/index.aspx?isbn=9780133017854/LC.GIF&amp;
client=springshare" 
title="Getting It Right for Young Children from Diverse Backgrounds" 
class="pull-left s-lg-book-cover-img-0">

This code could be made more accessible with the following:

<img alt="Getting It Right for Young Children from Diverse Backgrounds 
Cover Image" 
src="http://syndetics.com/index.aspx?isbn=9780133017854/LC.GIF&amp;
client=springshare" 
title="Getting It Right for Young Children from Diverse Backgrounds" 
class="pull-left s-lg-book-cover-img-0">
Gallery Keyboard Accessibility and Tab Navigation (Section 508 1194.21(a))

In testing, it was not possible to navigate through gallery images using keyboard tab navigation alone.  While it was possible with tab navigation to bypass the gallery (tab into and out of it into the next page element) the user would not be able to control the movement of the gallery or tab through the gallery images to access the descriptions or captions of the gallery.

Gallery Default Label and Caption Color: Insufficient contrast and readability

Firefox’s WCAG Color Contrast Checker identified the white label and caption color of the “Gallery” box type as having insufficient contrast with many images that could be used in the gallery.  Because the labels and captions appear directly overlaid upon gallery images, with no outline or background color to enhance the contrast of the text, they can be difficult to read.  There does not appear to be a way in LibGuides administrative settings to adjust the default caption style, though custom scripting might be used to override it.
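As a sketch of the kind of override that might help, custom CSS could add a translucent backdrop and text shadow behind the labels and captions. The selectors below are hypothetical placeholders; the actual class names would need to be confirmed in the markup LibGuides generates.

```css
/* Hypothetical selectors -- verify against the markup LibGuides generates */
.s-lg-gallery-label,
.s-lg-gallery-caption {
  color: #ffffff;
  background-color: rgba(0, 0, 0, 0.6); /* dark backdrop behind the text */
  text-shadow: 1px 1px 2px #000000;     /* keeps text legible over light areas */
  padding: 0.25em 0.5em;
}
```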

A screenshot of an example "Gallery" image in LibGuides. The example screenshot is of a cityscape in Israel.

Figure 1:  LibGuides gallery feature showing white label and caption that can be difficult to read against the gallery image.

Accessible Practices for Guide Authors:  A few tips

The issues identified above cannot easily be resolved through LibGuides administrative options or author controls, but there are several other important practices for guide authors to be aware of.  The tips below are by no means a comprehensive guide to accessibility; there are many more aspects to ensuring content is accessible (especially concerning the use of media, tables, and other types of content), but this list provides a few examples of things content creators can be aware of when creating guides.

Media/Widget Embed Codes:  Manually add title attributes to iframe elements

When embedding iframe media (such as a YouTube video, SoundCloud file, or Google Form) it is essential that Guide authors manually add a TITLE attribute to media embed codes.

Here is an example of a YouTube video’s embed code:

<iframe width="548" height="315" 
src="https://www.youtube.com/embed/rWDN64k977o" 
frameborder="0" allowfullscreen></iframe>

When adding code like this to a LibGuides Media/Widget feature, guide authors should manually add in a descriptive title element to briefly describe the contents of the embedded media:

<iframe title="Video tutorial on finding a book at the Oviatt Library" 
width="548" height="315" 
src="https://www.youtube.com/embed/rWDN64k977o" 
frameborder="0" allowfullscreen></iframe>

Embedded media should also always include captions for visual media and transcripts for audio and visual media.

Rich Text/HTML Content: Add alternative text to all images

When manually adding images to RichText/HTML content, guide authors should be sure to add descriptive Alternative Text in the image dialogue box:

The LibGuides image upload dialogue menu, with a black box highlighting the input field for alternative text.

Figure 2:  LibGuides Image Properties Dialogue Box used to add images.  The Alternative Text field is highlighted.

Links:  Add title and aria-label attributes

When manually adding links to resources in LibGuides, ensure the purpose of the link is clear, either with title attributes or aria-label attributes.  Avoid, where possible, vague link text such as ‘Read More’ or ‘Click Here’. If link text is vague or there is no descriptive information about the link visible on the page, use a title attribute or aria-label attribute:

Link with title attribute:

<a href="http://example.com" 
  title="Read about evaluating sources with the CRAP Test">
  The Crap Test
</a>

Link with aria-label attribute:

<a href="http://example.com" 
  aria-label="Read more about evaluating sources">
  The Crap Test
</a>
Look and Feel:  Ensure text is visually distinct from background colors

When designing the look and feel of LibGuides, where possible, ensure a high level of contrast between text and background colors for readability.  For example, consider enhancing the text contrast on box labels, which by default have somewhat low contrast (dark grey text on light grey background).  

A screenshot of a default LibGuides tab heading reading "Profile Box", with dark grey text over light grey text.

Figure 3:  LibGuides default box header, showing low contrast between text in box and background.

A LibGuides box header with text reading "Profile Box" where the text contrast has been enhanced by making it black against a light grey background.

Figure 4:  LibGuides box header with font color set to #000000 in administrative Look and Feel settings.

For any element on the page, avoid using colors that do not have high contrast with background color features.

More Resources

Many LibGuides authors have created excellent guides to accessibility for guide authors at their institutions, and SpringShare also provides a useful guide to best practices for LibGuides content creators that covers some accessibility practices.  Here are a few resources from the LibGuides community that helped me enormously when doing this evaluation:

The ACRL Universal Accessibility Interest Group (UAIG) is currently exploring the formation of a subcommittee to review LibGuides accessibility and potentially create a more comprehensive guide to best practices for LibGuides accessibility.  You can join the UAIG through your ALA / ACRL membership to learn more about this initiative.

I would also love to hear from others who have done this kind of testing and found other issues.  Do you have a guide to best practices that covers accessibility?  Are you aware of other features in LibGuides that are not accessible to all users?  Comment here or tweet me @lpmagnuson.

Notes

  1. The mission of the Universal Design Center is “to assist the campus community in creating pathways for individuals to learn, communicate, and share via information technology.  Part of the mission is to help the campus community design-in interoperability, usability, and accessibility into information technology so that individual learning and processing styles, or physical characteristics are not barriers to accessing information.” http://www.csun.edu/universaldesigncenter
  2. For an excellent overview of web accessibility compliance, see Cynthia Ng’s articles on ACRL Tech Connect at http://acrl.ala.org/techconnect/post/making-your-website-accessible-part-1-understanding-wcag, http://acrl.ala.org/techconnect/post/making-your-website-accessible-part-2-implementing-wcag, and http://acrl.ala.org/techconnect/post/making-your-website-accessible-part-3-content-wcag-compliance.
  3. For a great example of the extensive customization that can be done in LibGuides 2.0’s Bootstrap framework, see http://acrl.ala.org/techconnect/post/migrating-to-libguides-2-0

How is programming work supported (or not…) by administrators in libraries?

[Editor’s Note:  This post is part of a series of posts related to ACRL TechConnect’s 2015 survey on Programming Languages, Frameworks, and Web Content Management Systems in Libraries.  The survey was distributed between January and March 2015 and received 265 responses.  The first post in this series is available here.]

In our last post in this series, we discussed how library programmers learn about and develop new skills in programming in libraries.  We also wanted to find out how library administrators or library culture in general does or does not support learning skills in programming.

From anecdotal accounts, we hypothesized that learning new programming skills might be impeded by factors including lack of access to necessary technologies or server environments, lack of support for training, travel or professional development opportunities, or overloaded job descriptions that make it difficult to find the time to learn and develop new skills.  While respondents to our survey did in some cases indicate these barriers, we actually found that most respondents felt supported by their administration or library to develop new programming skills.

Most respondents feel supported, but lack of time is a problem

The question we asked respondents was:

Please describe how your employing institution either does or does not support your efforts to learn or improve programming or development skills. “Support” can refer to funding, training, mentoring, work time allocation, or other means of support.

The question was open-ended, enabling respondents to provide details about their experiences.  We received 193 responses to this question and categorized them by whether they indicated overall support or lack of support.  74% of respondents indicated at least some support for learning programming from their library administration, while 26% reported a lack of support.

Of those who mentioned that their administration or supervisors provide a supportive environment for learning about programming, the top kind of support mentioned was training, closely followed by funding for professional development opportunities.  Flexibility in work time was also frequently mentioned by respondents.  Mentoring and encouragement were mentioned less frequently.

 

However, even among those who feel supported in terms of funding and training opportunities, respondents indicated that time to actually complete training or professional development is, in practice, scarce:

Work time allocation is a definite issue – I’m the only systems librarian and have responsibilities governing web site, intranet, discovery layer, link resolver, ereserve system, meeting room booking system and library management system. No time for deep learning.

Low staffing often contributes to the lack of time to develop skills, even in supportive environments:

They definitely support developing new skills, but we have a very small technology staff so it’s difficult to find time to learn something new and implement it.

Respondents indicated the importance to their employers of aligning training and funding requests with current work projects and priorities:

I would be able to get support in terms of work time allocation, limited funding for training. I’m limited by external control of library technology platforms (centrally administrated), need to identify utility of learning language to justify training, use, &c.

26% of respondents indicate a lack of support for learning programming

Of those respondents who indicated that their workplace is not supportive of programming professional development or learning opportunities, lack of funding and training was the most commonly cited type of support that respondents found lacking.

Lack of  Funding and Training

The main lack of support comes in the form of funding and training. There are few opportunities to network and attend training events (other than virtually online) to learn how to do my job better. I basically have to read and research (either with a book or on the web) to learn about programming for libraries.

Respondents mentioned that though they could do training during their work hours, they are not necessarily funded to do so:

I am given time for self-education, but no formal training or provision for formal education classes.

Lack of Mentoring / Peer Support

Peer support was important to many respondents, both in supportive and unsupportive environments.  Many respondents who felt supported mentioned how important it was to have colleagues in their workplace to whom they can turn to get advice and help with troubleshooting.  Comments such as this one illustrate the difficulty of being the only systems or technology support person in one’s workplace:

They are very open to supporting me financially and giving me work time to learn (we have an institutional license to lynda.com and they have funded off site training), but there is not a lot of peer support for learning. I am a solo systems department and most of our campus IT staff are contractors, so there is not the opportunity for a community of colleagues to share ideas and to learn from each other.

Understaffing / Low Pay for Programming Skills

Closely related to the lack of peer support, respondents specifically mentioned that being the only technical staff person at their institution can make it difficult to find time for learning, and that understaffing contributes to the high workload:

There’s no money for training and we are understaffed so there’s no time for self-taught skills. I am the only non-Windows programmer so there’s no one I can confer with on programming challenges. I learn whatever I need to know on the fly and only to the degree it’s necessary to get the job done.

I’m the only “tech” on site, so I don’t have time to learn anything new.

One respondent mentioned that pay for those with programming skills is not competitive at his or her institution:

We have zero means for support, partially due to a complex web of financial reasons. No training, little encouragement, and a refusal to hire/pay at market rates programming staff.

Future Research and Other Questions

As with the first post in this series, the analysis of the data yields more questions than clear conclusions.  Some respondents indicated they have very supportive workplaces, where they feel their administration and supervisors provide every opportunity to develop new skills and learn about the technologies that interest them.  Others expressed frustration with the lack of funding or the inability to collaborate with colleagues on projects that require programming skills.

One question that requires a more thorough examination of the data is whether those whose jobs do not specifically require programming skills feel as supported in learning about programming as those who were hired to be programmers.  30% of survey respondents indicated that programming is *not* part of their official job duties, but that they do programming or similar activities to perform job functions.  Initial analysis indicates there is no significant difference between these respondents and respondents as a whole.  However, there may be differences in support based on the type of position one holds in a library (e.g., staff, faculty, or administration), and we did not gather that information from respondents in this survey.  At least two respondents, however, indicated that this may be the case at some libraries:

Training & funding is available; can have release time to attend; all is easier for librarians to obtain than for staff to obtain which is sad since staff tend to do more of the programming

Some staff have a lot of support, some have nill, it depends on where/what project you are working on.

In the next (and final) post in this series, we’ll explore some preliminary data on popular programming languages in libraries, and examine how often library programmers get to use their preferred programming languages in their work.

Where do Library Staff Learn About Programming? Some Preliminary Survey Results

[Editor’s Note:  This post is part of a series of posts related to ACRL TechConnect’s 2015 survey on Programming Languages, Frameworks, and Web Content Management Systems in Libraries.  The survey was distributed between January and March 2015 and received 265 responses.  A longer journal article with additional analysis is also forthcoming.  For a quick summary of the article below, check out this infographic.]

Our survey on programming languages in libraries has resulted in a mountain of fascinating data.  One of the goals of our survey was to better understand how staff in libraries learn about programming and develop their coding skills.  Based upon anecdotal evidence, we hypothesized that library staff members are often self-taught, learning through a combination of on-the-job learning and online tutorials.  Our findings indicate that respondents use a wide variety of sources to learn about programming, including MOOCs, online tutorials, Google searches, and colleagues.

Are programming skills gained by formal coursework, or in Library Science Master’s Programs?

We were interested in identifying where respondents learned about programming, including coursework, whether taken formally as part of a degree or continuing education program or through Massive Open Online Courses (MOOCs).  Nearly two-thirds of respondents indicated they had an MLS or were working on one:

When asked about coursework taken in programming, application, or software development, results were mixed, with the most popular choice being 1-2 classes:

However, of those respondents who have taken a course in programming (about 80% of all respondents) AND indicated that they either had an MLS or were attending an MLS program, only about a third had taken any of those courses as part of a Master’s in Library Science program:

Resources for learning about programming

The final question of the survey asked respondents, in an open-ended way, to describe resources they use to learn about programming.  It was a pretty complex question:

Please list or describe any learning resources, discussion boards or forums, or other methods you use to learn about or develop your skills in programming, application development, or scripting. Please includes links to online resources if available. Examples of resources include, but are not limited to: Lynda.com, MOOC courses, local community/college/university course on programming, Books, Code4Lib listserv, Stack Overflow, etc.).

Respondents gave, in many cases, incredibly detailed answers, and most listed multiple resources.  After coding the responses into 10 categories, some trends emerged.  The most popular resources for learning about programming, by far, were courses (whether taken formally in a classroom environment or online in a MOOC environment):

To better illustrate what each category entails, here are the top five resources in each category:

By far, the most commonly cited learning resource was Stack Overflow, followed by the Code4Lib Listserv, Books/ebooks (unspecified) and Lynda.com.  Results may skew a little toward these resources because they were mentioned as examples in the question, priming respondents to include them in their responses.  Since links to the survey were distributed, among other places, on the Code4Lib listserv, its prominence may also be influenced by response bias. One area that was a little surprising was the number of respondents that included social networks (including in-person networks like co-workers) as resources – indeed, respondents who mentioned colleagues as learning resources were particularly enthusiastic, as one respondent put it:

…co-workers are always very important learning resources, perhaps the most important!

Preliminary Analysis

While the data isn’t conclusive enough to draw any strong conclusions yet, a few thoughts come to mind:

  • About 3/4 of respondents indicated that programming was either part of their job description, or that they use programming or scripting as part of their work, even if it’s not expressly part of their job.  And yet, only about a third of respondents with an MLS (or in the process of getting one) took a programming class as part of their MLS program.  Programming is increasingly an essential skill for library work, and this survey seems to support the view that there should be more programming courses in library school curricula.
  • Obviously programming work is not monolithic – there’s lots of variation among those who do programming work that isn’t reflected in our survey, and this survey may have unintentionally excluded those who are hobby coders.  Most questions focused on programming used when performing work-related tasks, so additional research would be needed to identify learning strategies of enthusiast programmers who don’t have the opportunity to program as part of their job.
  • Respondents indicated that learning on the job is an important aspect of their work; they may not have time or institutional support for formal training or courses, and figure things out as they go along using forums like Stack Overflow and Code4Lib’s listserv.  As one respondent put it:

Codecademy got me started. Stack Overflow saves me hours of time and effort, on a regular basis, as it helps me with answers to specific, time-of-need questions, helping me do problem-based learning.

TL;DR?  Here’s an infographic:



In the next post, I’ll discuss some of the findings related to ways administration and supervisors support (or don’t support) programming work in libraries.

GIS and Geospatial Data Tools

I was recently appointed the geography subject librarian for my library, which was mildly terrifying considering that I do not have a background in geography. But I was assigned the subject because of my interest in data visualization, and since my appointment I’ve learned a few things about the awesome potential opportunities to integrate Geographic Information Systems (GIS) and geospatial visualization tools into information literacy instruction and library services generally.  A little bit of knowledge about GIS and geospatial visualization goes a long way, and is useful across a variety of disciplines, including the social sciences, business, the humanities, and environmental studies and sciences.   If you are into open data (who isn’t?) and you like maps and/or data visualization (who doesn’t?!) then it’s definitely worth it to learn about some tools and resources for working with geospatial information.

About GIS and Geospatial Data

Geographic Information Systems, or GIS, are software tools that enable visualizing and interpreting data (social, demographic, economic, political, topographic, spatial, natural resources, etc.) using maps and geospatial data. Often data is visualized using layers, where a base map (containing, for example, a political map of a city) or tiles are overlaid with shapes, data points, or choropleth shading. For example, in the map below, a map of districts in Tokyo is overlaid with data points representing the number of seniors living in the area: 1
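The choropleth shading mentioned above is, at its core, just a mapping from each region’s data value to a fill color. A minimal sketch in JavaScript, with entirely hypothetical breakpoints and hex colors (real maps usually derive breaks from the data’s quantiles):

```javascript
// Map a numeric value (e.g., the senior population of a district) to a
// fill color for choropleth shading. The breakpoints and colors below
// are made up for illustration.
function choroplethColor(value) {
  if (value > 5000) return '#a50f15'; // darkest shade: highest values
  if (value > 2000) return '#de2d26';
  if (value > 500)  return '#fb6a4a';
  if (value > 100)  return '#fcae91';
  return '#fee5d9';                   // lightest shade: lowest values
}

// A GIS would then fill each district's shape with something like:
// choroplethColor(district.seniorPopulation)
```

Most GIS tools and JavaScript mapping libraries handle this bucketing for you, but it is useful to understand what the layer is doing under the hood.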

You may be familiar with Google Earth, which has many features similar to a GIS (but is arguably not really a GIS, because it lacks the data analysis and query tools typically found in a fully-featured GIS). You can download a free Pro version of Google Earth that enables you to import GIS data. GIS data can appear in a variety of formats, and while there isn’t space here to go into each of them, a few common formats you might come across include Shapefiles, KML, and GeoJSON.2  Shapefiles, as the name suggests, represent shapes (e.g., polygons) as layers of vector data that can be visualized in GIS programs and Google Earth Pro.  You may also come across KML (Keyhole Markup Language) files, an XML-based standard for representing geographic data that is commonly used with Google Earth and Google Maps.  GeoJSON is another format for representing geospatial information that is well suited to web services.  The various formats of GIS and geospatial data deserve a full post of their own, and I plan to write a follow-up post exploring some of these formats and how they are used in greater detail.
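To make at least one of these formats concrete, here is a minimal GeoJSON Feature: a single point with a couple of descriptive properties. The coordinates are roughly central Tokyo, and the property names and values are made up for illustration:

```javascript
// A minimal GeoJSON Feature: one point plus arbitrary properties.
// Note that GeoJSON positions are ordered [longitude, latitude] --
// the reverse of the "lat, lng" order many mapping APIs expect.
const feature = {
  type: 'Feature',
  geometry: {
    type: 'Point',
    coordinates: [139.6917, 35.6895] // roughly central Tokyo
  },
  properties: {
    name: 'Example district', // property names are arbitrary
    seniorPopulation: 4200    // hypothetical data value
  }
};

// GeoJSON is plain JSON, so it round-trips cleanly through
// JSON.stringify/JSON.parse and is easy to serve from web APIs:
const parsed = JSON.parse(JSON.stringify(feature));
```

A FeatureCollection (a `type: 'FeatureCollection'` object with a `features` array) is the usual container for a whole dataset of such features.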

GIS/Geospatial Visualization Tools

ArcGIS (ESRI)

ArcGIS is arguably the industry standard for GIS software, and the maker of ArcGIS (ESRI) publishes manuals and guides for GIS students and practitioners.  There are a few different ArcGIS products:  ArcGIS for Desktop, ArcGIS Online, and ArcGIS Server.  Personally I am only familiar with ArcGIS Online, but you can do some pretty cool things with a totally free account, like create this map of where drones can and cannot fly in the United States: 3

ArcGIS can be very powerful and is particularly useful for complex geospatial datasets and visualizations (particularly visualizations that might require multiple layers of data or topographic / geologic data). A note about signing up with ArcGIS online:  You don’t actually need to sign up for a ‘free trial’ to explore the software – you can just create a free account that, as I understand it, is not limited to a trial period.  Not all features may be available in the completely free account.

CartoDB

CartoDB is both an open source application and a freemium cloud service for making geospatial visualizations that can be embedded in web pages, like this choropleth visualizing various kinds of pollution across Los Angeles.4

CartoDB’s aesthetics are really strong, and default map settings tend to be pretty gorgeous.  It also leverages Torque to enable animations (which is what’s behind the heatmap animation of this map showing Twitter activity related to Ferguson, MO over time).5  CartoDB can import Shapefiles, GeoJSON, and .csv files, and has a robust SQL API (built on PostgreSQL) that can be used to import and export data. CartoDB also has its own JavaScript library (CartoDB.js) that can be leveraged for building attractive custom apps.
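As a rough sketch of how the SQL API works: queries are sent as HTTP requests to an account-specific endpoint and come back as JSON. The account name and table below are placeholders, and the `/api/v2/sql` URL pattern is taken from CartoDB’s documentation at the time of writing, so check the current docs before relying on it:

```javascript
// Build a CartoDB SQL API request URL. 'myaccount' and 'my_table'
// are placeholders; the /api/v2/sql endpoint pattern should be
// verified against CartoDB's current documentation.
function cartoSqlUrl(account, query) {
  return 'https://' + account + '.cartodb.com/api/v2/sql?q=' +
         encodeURIComponent(query);
}

const url = cartoSqlUrl('myaccount',
  'SELECT name, the_geom FROM my_table LIMIT 10');
// Fetching this URL returns JSON (rows plus metadata) that a page
// or script can hand off to a mapping library for rendering.
```

The same endpoint accepts INSERT and UPDATE statements (with an API key), which is what makes it usable for importing and exporting data, not just querying.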

More JavaScript Libraries

In addition to CartoDB.js mentioned above, there are lots of other flexible JavaScript libraries that can be leveraged for mapping and geospatial visualization:

  • OpenLayers – OpenLayers enables pulling in ‘tile’ layers as base maps from a variety of sources, and can parse vector data in a wide range of formats, such as GeoJSON and KML.
  • Leaflet.js – A fairly user-friendly and lightweight library used for creating basic interactive, mobile-friendly maps.  In my opinion, Leaflet is a good library to get started with if you’re just jumping in to geospatial visualization.
  • D3.js – Everyone’s favorite JavaScript charting library also has some geospatial visualization features for certain kinds of maps, such as this choropleth example.
  • Mapbox – Mapbox.js is a JavaScript API library built on top of Leaflet.js, but Mapbox also offers a suite of tools for more extensive mapping and geospatial visualization needs.

Open Geospatial Data

Librarians wanting to integrate geospatial data visualization and GIS into interdisciplinary instruction can take advantage of open data sets that are increasingly available online. Sui (2014) notes that increasingly large data sets are being released freely and openly on the web, which is an exciting trend for GIS and open data enthusiasts. However, Sui also notes that the mere fact that data is legally released and made accessible “does not necessarily mean that data is usable (unless one has the technical expertise); thus they are not actually used at all.”6  Libraries could play a crucial role in helping users understand and interpret public data by integrating data visualization into information literacy instruction.

Some popular places to find open data that could be used in geospatial visualization include:

  • Data.gov – Since 2009, Data.gov has published thousands of public open datasets, including datasets containing geographic and geospatial information.  As of this month, you can now open geospatial data files directly in CartoDB (requires a CartoDB account) to start making visualizations.  There isn’t a huge amount of geospatial data available currently, but Data.gov will hopefully benefit from initiatives like Project Open Data, which was launched in 2013 by the White House and designed to accelerate the publishing of open data sets by government agencies.
  • Google Public Data Explorer – This is a somewhat small set of public data that Google has gathered from other open data repositories (such as Eurostat) that can be directly visualized using Google charting tools.  For example, you could create a visualization of European population change by country using data available through the Public Data Explorer.  While the currently available data is pretty limited, Google has prepared a kind of open data metadata standard (Data Set Publishing Language, or DSPL) that might increase the availability of data through the explorer if the standard takes off.
  • publicdata.eu – The destination for Europe’s public open data.  A nice feature of publicdata.eu is the ability to filter down to datasets that contain Shapefiles (.shp files), which can be directly imported into GIS software or Google Earth Pro.
  • OpenStreetMap (OSM) –  Open, crowdsourced street map data that can be downloaded or referenced to create basemaps or other geospatial visualizations that rely on transportation networks (roads, railways, walking paths, etc.).  OpenStreetMap data are open, so for those who would prefer to make applications that are based entirely on open data (rather than commercial solutions), OSM can be combined with JavaScript libraries like Leaflet.js for fully open geospatial applications.

GIS and Geospatial Visualization In the Library

I feel like I’ve only really scratched the surface with the possibilities for libraries to get involved with GIS and geospatial data.  Libraries are doing really exciting things with these technologies, whether it’s creating new ways of interacting with historical maps, lending GPS units, curating and preserving geospatial data, exploring geospatial linked data possibilities with GeoSPARQL or integrating GIS or geospatial visualization into information literacy / instruction programs.  For more ideas about integrating GIS and geospatial visualization into library instruction and services, check out these guides:

(EDIT 4/13) Also be sure to check out ALA’s Map and Geospatial Information Round Table (MAGIRT).  Thanks to Paige Andrew and Kathy Weimer for pointing out this awesome resource in the comments.

If you’re working on something awesome related to geospatial data in your library and would be interested in writing about it for ACRL TechConnect, contact me on Twitter @lpmagnuson or drop me a line in the comments!

Notes

  1. AtlasPublisher. Tokyo Senior Population. https://www.arcgis.com/home/webmap/viewer.html?webmap=6990a8c5e87b42ee80701cf985383d5d.  (Note:  Apologies if you have trouble seeing or zooming in on embedded visualizations in this post; the interaction behavior of these embedded iframes can be a little unpredictable if your cursor gets near them.  It’s definitely a drawback of embedding these interactive visualizations as iframes.)
  2. The Open Geospatial Consortium is an organization that gathers and shares information about geographic and geospatial data formats, and details about a variety of geospatial file formats and standards can be found on its website:  http://www.opengeospatial.org/.
  3. ESRI. A Nation of Drones. http://story.maps.arcgis.com/apps/MapSeries/?appid=79798a56715c4df183448cc5b7e1b999
  4. Lauder, Thomas Suh (2014).  Pollution Burdens. http://graphics.latimes.com/responsivemap-pollution-burdens/.
  5. YMMV, but the performance of map animations that use Torque seems to be a little tricky, especially when embedded in an iFrame.  I tried to embed the Ferguson Twitter map into this post (because it is really cool looking), and it really slowed down page loading, and the script seemed to get stuck at times.
  6. Sui, Daniel. “Opportunities and Impediments for Open GIS.” Transactions in GIS, 18.1 (2014): 1-24.