Copyright Changes: It’s All Connected

UPDATE: Just after this post was published, the U.S. Copyright Office released the long-awaited Discussion Document referenced below. In it, the Copyright Office affirms a commitment to retaining the fair use savings clause.

Libraries rely on exceptions to copyright law and provisions for fair use to provide services. Any changes to those rules have big implications for the services we provide. With potential changes coming in an uncertain political climate, I would like to take a look at what we know, what we don’t know, and how it all relates. Each piece as it currently stands works in relation to the others, and a change to any one of them changes the overall situation for libraries. We need to understand how everything fits together so that when we approach lawmakers or create policies we think holistically.

The International Situation

A few months back at the ALA Annual Conference in Chicago, I attended a panel called “Another Report from the Swamp,” a copyright-policy-focused session put on by the ALA Office for Information Technology Policy (OITP) featuring copyright experts Carrie Russell (OITP), Stephen Wyber (IFLA), and Krista Cox (Association of Research Libraries [ARL]). This panel addressed international issues in copyright in addition to those in the United States, which was a helpful perspective. They covered a number of topics, but I will focus on the Marrakesh Treaty and potential changes to US Code Title 17, Section 108 (Reproduction by libraries and archives).

Stephen Wyber and Krista Cox covered the WIPO Marrakesh Treaty to Facilitate Access to Published Works for Persons Who Are Blind, Visually Impaired or Otherwise Print Disabled (aka the Marrakesh Treaty), to which the US is a signatory but which has not yet been ratified by the US Senate (see Barack Obama’s statement in February 2016). According to them, in much of the developing world only 1% of published work is available to those with print disabilities. This was first raised as an issue in 1980, and 33 years later debates at WIPO began to address the situation. The treaty entered into force last year, and it permits authorized parties (including libraries) to make accessible copies of any work in a medium appropriate for the disabled individual. In the United States, this is generally understood to be permitted by Title 17’s fair use provision (Section 107) and Section 121 (aka the Chafee amendment), though this is still legally murky 1. This is not the case internationally. Stephen Wyber pointed out that IFLA must engage at the European level with the European Commission for negotiations at WIPO, and there is no international or cross-border provision for libraries, archives, or museums.

According to Krista Cox, a reason for the delay in ratification was that the Senate Committee on Foreign Relations wouldn’t move it forward unless it was a non-controversial treaty requiring no changes to US law (and it should not have required changes). The Association of American Publishers (AAP) wanted to include recordkeeping requirements, which disability and library advocates argued would be onerous. (A good summary of the issues is available from the ALA Committee on Legislation.) During the session, a staff attorney from the AAP stood up and made the point that, in the AAP’s view, it would be helpful for libraries to track what accessible versions of material they had made. While not covered explicitly in the session, a problem with this approach is that it would create a list of potentially sensitive information about patron activities. Even if no names were attached, the relatively small number of people making use of the service would make it possible to identify individual users. In any event, the 114th Congress took no action, and it is unclear when this issue will be taken up again. For this reason, we have to continue to rely on existing provisions of the US Code.

Along those lines, the panel gave a short update on potential changes to Section 108 of the Copyright Act, which have been under discussion for many years. Last year, the Copyright Office invited stakeholders to set up meetings to discuss revisions. The library associations met with them last July, and while the beneficiaries of Section 108 generally find revision controversial and oppose reform, the Copyright Office is continuing work on this. One fear is that a revision would remove the fair use savings clause (17 U.S.C. § 108(f)(4)). Krista Cox reported that at the Copyright Society of the USA meeting in early June 2017, the Copyright Office said it was working on a report with proposed legislation, but no one had seen this report [NOTE: the report is now available.].

Implications for Revisions to Title 17

Moving beyond the panel, let’s look at the larger implications of revising Title 17. There are some excellent reasons to revise Section 108 and other sections: just as the changes in 1976 reflected changes in photocopying technology 2, changes in digital technology and in the services libraries provide call for additional updates. In 2008, the Library of Congress Section 108 Study Group released a lengthy report with a set of recommendations for revisions, which can be boiled down to extending permissions for preservation activities (though that is a gross oversimplification). In 2015 Maria A. Pallante testified to the Committee on the Judiciary of the House of Representatives on a wide variety of changes to the Copyright Act (not just for libraries), incorporating the themes of that 2008 report along with later discussions. Essentially, she argued that changes in technology and culture over the past 20 years have made much of the act unclear, forcing reliance on loopholes and workarounds of uncertain legality. For instance, libraries rely heavily on Section 107, which covers fair use, to perform their daily functions. Her testimony points out that those activities should be explicitly permitted rather than resting on potentially ambiguous language in Section 107, since that ambiguity makes some institutions unwilling, out of fear, to perform activities that may well be permitted. On the other hand, the same ambiguous language opens up possibilities that adventurous projects such as HathiTrust have used to push boundaries and expand the nature of fair use and customary practice. The ARL has a Code of Best Practices in Fair Use that details what is currently considered customary practice. With revisions comes the possibility that what is allowed will be dictated by, for instance, the publishing lobby, and that what libraries can do will be overly circumscribed. Remember, too, that one reason for not ratifying the Marrakesh Treaty is that allowances for reproductions for the disabled are covered by fair use and the Chafee amendment.

Orphan works are another issue. While the Pallante testimony suggests it would be in everyone’s interest to have clear guidance on what a good faith effort to identify a copyright holder actually means, in many ways we would rather let general practice define it. Speaking as someone who spends a good portion of my time clearing permissions for material and frequently running into unresponsive or unknown copyright holders, I feel more comfortable pushing the envelope if I have clearly documented and consistently followed procedures based on practices that I know other institutions follow as well (see the Statement of Best Practices). This way I have been able to make useful scholarship more widely available despite the legal gray area. But it is a calculated risk, and many institutions choose never to make such works available due to the legal uncertainty. Last year the Harvard Office for Scholarly Communication released a report on legal strategies for orphan work digitization to give some guidance in this area. To summarize over 100 pages: there are a variety of legal strategies libraries can take to either minimize the risk of a dispute or reduce the negative consequences of a copyright dispute, which remains largely hypothetical when it comes to orphan works and libraries anyway.

Future Considerations

There is one other important wrinkle in all this. The Copyright Office’s future is politically uncertain. It could be removed from the purview of the Library of Congress, and the Register of Copyrights made a political appointment. A bill to that effect passed the House in April and was introduced in the Senate in May, and it was seen as a rebuke to Carla Hayden. Karyn Temple Claggett is the acting Register of Copyrights, replacing Maria Pallante, who resigned last year after Carla Hayden became the new Librarian of Congress and appointed (some say demoted) her to the post of Senior Advisor for Digital Strategy. Maria Pallante is now CEO of, you guessed it, the Association of American Publishers. The story is full of intrigue and clashing opinions; one only has to look at the “possibly not neutral” banner on Pallante’s Wikipedia page to realize that no one will agree on the reasons for Pallante’s move from Register of Copyrights (it may have been related to wasteful spending), but libraries do not see the removal of copyright from the Library of Congress as a good thing. More on this is available in the recent ALA report “Lessons From History: The Copyright Office Belongs in the Library of Congress.”

Given that we do not know what will happen to the Copyright Office, nor exactly what its report will recommend, it is critical that we pay attention to what is happening with copyright. While more explicit provisions allowing more library activities would be excellent news, as the panel at ALA pointed out, lawmakers are more exposed to Hollywood and content industry organizations such as the AAP, RIAA, and MPAA, and so may be more likely to see arguments from their point of view. We should continue to take advantage of the provisions we currently have for fair use and for providing access to orphan works, since exercising a right is one way we keep it.

  1. "Briefing: Accessibility, the Chafee Amendment, and Fair Use" (2012). Association of Research Libraries. http://www.arl.org/focus-areas/copyright-ip/fair-use/code-of-best-practices/2445-briefing-accessibility-the-chafee-amendment-and-fair-use#.WbadWMiGNPY
  2. Federal Register Vol. 81, No. 109 https://www.federalregister.gov/d/2016-13426/p-8

Working with a Web Design Firm

As I mentioned in a previous post, my library is undergoing a major website redesign. As part of that process, we contracted with an outside web design and development firm to help build the theme layer. I’ve done a couple of major website overhauls in the course of my career, but never with an outside developer participating so much. In fact, I’ve always handled the coding part of redesigns entirely by myself, as I’ve worked at smaller institutions. This post discusses what the process has been like, in case other libraries are considering working with a web designer.

An Outline

To start with, our librarians had already been working to identify components of other library websites that we liked. We used Airtable, a more dynamic sort of spreadsheet, to collect our ideas and articulate why we liked certain peer websites, some of which were libraries and some not (mostly museums and design companies). From prior work, we already knew we wanted a few different page template types. We organized our ideas around how they fit into these templates, such as a special collections showcase, a home page with a central search box, or a text-heavy policy page.

Once we knew we were going to work with the web development firm, we had a conference call with them to discuss the goals of our website redesign and show the contents of our Airtable. As we’re a small art and design library, our library director was actually the one to create an initial set of mockups to demonstrate our vision. Shortly afterwards, the designer had his own visual mockups for a few of our templates. The mockups included inline comments explaining stylistic choices. One aspect I liked about their mockups was that they were divided into desktop and mobile; there wasn’t just a “blog post” example, but a “blog post on mobile” and “blog post on desktop”. This division showed that the designer was already thinking ahead towards how the site’s theme would function on a variety of devices.

With some templates in hand, we could provide feedback. There was some push and pull; the designer thought some of our initial ideas were unimportant or against best practices, while we had strong opinions of our own. The discussion was interesting for me, as someone who is a librarian foremost but sympathetic to usability concerns and web conventions. It was good to have a designer who didn’t mindlessly follow our every request; when he felt a stylistic choice was counterproductive, he could articulate why, and that changed a few of our ideas. On some principles, however, we were insistent. For instance, we wanted to avoid multiple search boxes on a single page, such as a central catalog search plus a separate site search in the header. I find that users are easily confused when confronted with two search engines and struggle to distinguish their different purposes and domains. The designer thought it was a common enough pattern to be familiar to users, but our experience led us to insist otherwise.

Finally, once we had settled on agreeable mockups, a frontend developer turned them into code with an impressive turnaround; about 90% of the mockups were implemented within a week and a half. We weren’t given something like Drupal or WordPress templates; we received only frontend code (CSS, JavaScript) and some example templates showing how to structure our HTML. It was all in a single git repository complete with fake data, Mustache templates, and instructions for running a local Node.js server to view the examples. I was able to get the frontend repo working easily enough, but it was a bit surprising to work with code completely decoupled from its eventual destination. If we had had more funds, I would have liked the web design firm to go all the way to implementing their theme in our CMS, since I did struggle in a few places when combining the two (more on that later). But, like many libraries, we’re frugal, and it was a luxury to get this kind of design work at all.

The last of the code took a few months to arrive, mostly due to a single user interface bug we pointed out that the developer struggled to reproduce and then fix. Still, I was ready to start working with the frontend code almost exactly a month after our first conversation with the firm’s designer, and the total time from that conversation to signing off on the final templates was a little under two months. Given our hurried timeline for rebuilding our entire site over the summer, that quick delivery was a serious boon.

Code Quirks

I have a lot of opinions about how code should look and be structured, even if I don’t always follow them myself. So I was a bit apprehensive about working with an outside firm; would they deliver something highly functional but structured in an alien way? Luckily, I was pleasantly surprised by how the CSS was delivered.

First of all, the designer didn’t use plain CSS; he used SASS, which Margaret wrote about previously on Tech Connect. SASS adds several nice tools to CSS, from variables to darken and lighten functions for adjusting colors. But perhaps most importantly, it gives you much more control when structuring your stylesheets, using imports, nested selectors, and mixins. Basically, SASS is the antithesis of having one gigantic CSS file with thousands of lines. Instead, the frontend code we were given was about fifty files neatly divided by our different templates and some reusable components. Here’s the directory tree of the SASS files:

components
    about-us
    blog
    collections
    footer
    forms
    header
    home
    misc
    search
    services
fonts
reset
settings
utilities

Other than the uninformative “misc”, these folders all have meaningful names (“about-us” and “collections” refer to styles specific to particular templates we’d asked for) and it never takes me more than a moment to locate the styles I want.

Within the SASS itself, almost all styles (excepting the “reset” portion) hinge on class names. This is a best practice for CSS since it doesn’t couple your styles tightly to markup; whether a particular element is a <div>, <section>, or <article>, it will appear correctly if it bears the right class name. When our new CMS output some HTML in an unexpected manner, I was still able to utilize the designer’s theme by applying the appropriate class names. Even better, the class names are written in BEM “Block-Element-Modifier” form. BEM is a methodology I’d heard of before and read about, but never used. It uses underscores and dashes to show which high-level “block” is being styled, which element inside that block, and what variation or state the element takes on. The introduction to BEM nicely defines what it means by Block-Element-Modifier. Its usage is evident if you look at the styles related to the “see next/previous blog post” pagination at the bottom of our blog template:

.blog-post-pagination {
  border-top: 1px solid black(0.1);
 
  @include respond($break-medium) {
    margin-top: 40px;
  }
}
 
  .blog-post-pagination__title {
    font-size: 16px;
  }
 
  .blog-post-pagination__item {
    @include clearfix();
    flex: 1 0 50%;
  }
 
  .blog-post-pagination__item--prev {
    display: none;
  }

Here, blog-post-pagination is the block, __title and __item are elements within it, and the --prev modifier affects just the “previous blog post” item element. Even in this small excerpt, other advantages of SASS are evident: the respond mixin and $break-medium variable for writing responsive styles that adapt to different screen sizes, the clearfix include, and the way these related styles are all grouped under the parent blog-post-pagination block.

Trouble in Paradise

However, as much as I admire the BEM class names and the structure of the styles given to us, of course I can’t be perfectly happy. As I’ve started building out our site, I’ve run into a few obvious problems. First of all, while all the components and templates we’d asked for are well designed with clearly written code, there’s no generic framework for adding anything new. I’d hoped, and to be honest simply assumed, that a framework like Bootstrap or Foundation would be used as the basis of our styles, with more specific CSS for our components and templates. Instead, apart from a handful of minor utilities like the clearfix include referenced above, everything we received is intended only for our existing templates. That’s fine up to a point, but as soon as I went to write a page with an HTML table in it, I noticed there was no styling whatsoever.

Relatedly, since the class names are so focused on distinct blocks, when I want to build something similar but slightly different I end up with a bunch of misleading class names. So, for instance, some of our non-blog pages have templates littered with class names carrying a .blog- prefix. The easiest way to build them was to co-opt the blog styles, but now the HTML looks misleading. I suppose with more time I could write new styles that simply copy the blog ones under new names, but that also seems less than ideal in that it’s a) a lot more work and b) leads to a lot of redundant code.

Lastly, the way our CMS handles “rich text” fields (think: HTML edited in a WYSIWYG editor, not coded by hand) has caused numerous problems for our theme. The rich text output is always wrapped in a <div class="rich-text">, which made translating some of the HTML templates from the frontend code a bit tricky. The frontend styles also included a “reset” stylesheet that erased all default styles for most HTML tags. That’s fine, and a common approach for most sites, but many of the elements available in the rich text editor never had styles added back after the reset. As content authors created lower-level headings and unordered lists, they discovered that these appeared as plain text.

Reflecting on these issues, I think they boil down primarily to insufficient communication on our part. When we first asked for design work, it was very much centered on the specific templates we wanted for a few different sections of our site. I never specifically outlined a need for a generic framework that could encompass new, unanticipated types of content. While there was an offhand mention of Bootstrap early in our discussions, I didn’t make it explicit that I’d like it, or something similar, to form the backbone of the styles. I should also have made it clearer that the styles needed to anticipate working within our CMS and alongside rich text content. Instead, by the time I realized some of these issues, we had already approved much of the frontend work as complete.

Conclusion

For me, as someone who has worked at smaller libraries for the duration of my professional career, working with a web design company was a unique experience. I’m curious: has your library contracted for design or web development work? Was it successful or not? As tech-savvy librarians, we’re often asked to do everything, even when some of the tasks are beyond our skills. Working with professionals was a nice break from that and a learning experience. If I could do anything differently, I’d be more assertive about requirements in our initial talks. Outlining expectations that the styles should include a generic framework and anticipate working with our particular CMS would have saved me some time and headaches later on.

Information Architecture for a Library Website Redesign

My library is about to embark upon a large website redesign during this summer semester. This isn’t going to be just a new layer of CSS, or a minor version upgrade to Drupal, or moving a few pages around within the same general site. No, it’s going to be a huge, sweeping change that affects the whole of our web presence. With such an enormous task at hand, I wanted to discuss some of the tools and approaches that we’re using to make sure the new site meets our needs.

Why Redesign?

I’ve heard the arguments for why the wholesale website redesign is a flawed approach and why we should instead work on our sites continually and iteratively. Continual changes stop problems from building up, and large swaths of changes can disrupt users who were accustomed to the old site. The gradual redesign makes a lot of sense to me, and it also seems like a complete luxury that I’ve never had in my library positions.

The primary problem with a series of smaller changes is that the approach assumes a solid foundation to begin with. Our current site, however, has a host of interconnected problems that make tackling any individual issue a challenge. It’s like your holiday lights sitting in a box all year; they’re hopelessly tangled by the time you take them out again.

Our site has decades of discarded, forgotten content. That’s mostly harmless; it’s hard to find and sees virtually no traffic. But it’s still not great to have outdated information scattered around. In particular, I’m not thrilled that a lot of it is static HTML, images, and documents sitting outside our content management system. It’s hard to know how much content we even have because it cannot be managed in one place.

We also fell into a pattern of adding content to the site but never removing or re-organizing existing content. Someone would ask for a button here, or a page dictating a policy there, or a new FAQ entry. Pages that were added didn’t have particular owners responsible for their currency and maintenance; I, as Systems Librarian, was expected to run the technical aspects of the site but also be its primary content editor. That’s simply an impossible task, as I don’t know every detail of the library’s operations or have the time to keep on top of a menagerie of pages of dubious importance.

I tried to create a “website changes form” to manage things, but it didn’t work for staff or for me. The few staff who did fill out the form ended up requesting things that were difficult to do: large theme changes I wasn’t comfortable making without user testing or approval from our other librarians. What little content was added amounted to minor text ferried through the form and through me, which slowed down the editorial process and furthered the idea that web content was solely my domain.

To top our content troubles off, we’re also on an unsupported, outdated version of Drupal. Upgrading or switching a CMS isn’t necessarily related to a website redesign. If you have a functional website on a broken piece of software, you probably don’t want to toss out the good with the bad. But in our case, similar to how our ILS migration gave us the opportunity to clean up our bibliographic records, a CMS migration gives us a chance to rebuild a crumbling website. It just doesn’t make sense to invest technical effort in migrating all our existing content when it’s so clearly in need of major structural change.

Card Sort

Cards in the middle of being constructed.

Not wanting to go into a redesign process blind, we set out to collect data on our current site and how it could be improved. One of the first ways we gathered data was to ask all library staff to perform a card sort. A card sort is an activity wherein pieces of web content are put on cards which can then be placed into categories; the idea is to form a rough information architecture for your site which can dictate structure and main menus. You can do either an open or a closed card sort, meaning the categories are either invented by the participants or provided ahead of time.

I chose to do an open card sort since we were so uncertain about the categories. I then selected web content based on our existing site’s analytics. It was clear to me that our current site was bloated and disorganized; there were pages tucked into the nooks of cyberspace that no one had visited in years, and all sorts of overlapping and unnecessary content. So I selected ≈20 popular pages but also gave each group two pieces of blank paper on which to add whatever content they felt was missing.

Finally, to get as much useful data as possible, I modified the card sort procedure in a couple of ways. I asked people to role-play as different types of stakeholders (graduate & undergraduate students, faculty, administrators) and to justify their decisions from that vantage point. I also had everyone, after sorting was done, put dots on content they felt was important enough for the home page. Since one of our current site’s primary challenges is maintenance, or the lack thereof, I wanted to add one last activity wherein participants would write a “responsible staff member” on each card (e.g. the instruction librarian maintains the instruction policy page). Sadly, we ran out of time and couldn’t do that bit.

The results of the card sort were informative. A few categories emerged as common across everyone’s sorts: collections, “about us”, policies, and current events/news. We discovered a need for new content to cover workshops, exhibits, and events happening in the library, which were currently represented (and not very well) only in blog posts. In terms of the home page, it was clear that LibGuides, collections, news, and most importantly our open hours needed to be represented.

Treejack & Analytics

Once we had enough information to build out the site’s architecture, I organized our content into a few major categories. But there were still several questions on my mind: would users understand terms like “special collections”? Would they understand where to look for LibGuides? Would they know how to find the right contact for various questions? To answer some of these questions, I turned to Optimal Workshop’s “Treejack” tool. Treejack tests a site’s information architecture by having users navigate plain text links to perform basic tasks. We created a few tasks aimed at answering our questions and recruited students to perform them. While we’re only using the free tier of Optimal Workshop, and only student stakeholders, the data was still informative.

For one, Optimal Workshop’s results data is rich and well visualized. It shows the exact routes each user took through our site’s content, the time it took to complete a task, and whether a task was completed directly, completed indirectly, or failed. Completed directly means the user took an ideal route through our content, with no bouncing up and down the site’s hierarchy. Indirect completion means they eventually got to the right place but didn’t take a perfect path there, while failure means they ended up in the wrong place. The charts that demonstrate each task’s outcomes are wonderful:

The data & charts Treejack shows for a moderately successful task.
A “pie tree” showing users’ paths while attempting a task.

We can see here that most of our users found their way to LibGuides (named “study guides” here). But a few people expected to find them under our “Collections” category and bounced around in there, clearly lost. This tells us we should represent our guides under Collections alongside items like databases, print collections, and course reserves. While building and running your own Treejack-type tests would be easy, I definitely recommend Optimal Workshop as a great product which provides much insight.

There’s much work to be done in terms of testing—ideally we would adjust our architecture to address the difficulties that users had, recruit different sets of users (faculty & staff), and attempt to answer more questions. That’ll be difficult during the summer while there are fewer people on campus but we know enough now to start adjusting our site and moving along in the redesign process.

Another piece of our redesign philosophy is using analytics about the current site to inform our decisions about the new one. For instance, I track interactions with our home page search box using Google Analytics events 1. The search box has three tabs corresponding to our discovery layer, catalog, and LibGuides. Despite thousands of searches and interactions with the search box, LibGuides search is seeing only trace usage. The tab was clicked on a mere 181 times this year; what’s worse, only 51 times did a user actually search afterwards. This trace amount of usage, plus the fact that users are clearly clicking onto the tab and then not finding what they want there, indicates it’s just not worth any real estate on the home page. When you add in that our LibGuides now appear in our discovery layer, their search tab is clearly disposable.

What’s Next

Data, tests, and conceptual frameworks aside, our next stage will involve building something much closer to an actual, functional website. Tools like Optimal Workshop are wonderful for providing high-level views on how to structure our information, but watching a user interact with a prototype site is so much richer. We can see their hesitation, hear them discuss the meanings of our terms, get their opinions on our stylistic choices. Prototype testing has been a struggle for me in the past; users tend to fixate on the unfinished or unrefined nature of the prototype, providing feedback that tells me what I already know (yes, we need to replace the placeholder images; yes, “Lorem ipsum dolor sit amet” is written on every page) rather than something new. I hope to counter that by setting appropriate expectations and building a small but fairly robust prototype.

We’re also building our site in an entirely new piece of software, Wagtail. Wagtail is exciting for a number of reasons, and will probably have to be the subject of future posts, but it does help address some of the existing issues I noted earlier. We’re excited by the innovative StreamField approach to content, a replacement for large rich text fields, which are unstructured and often let users override a site’s base styles. We’ve also heard whispers of new workflow features that would let us send reminders to the owners of different content pages to revisit them periodically. While I could do something like this myself with an ad hoc mess of calendar events and spreadsheets, having it built right into the CMS bodes well for our future maintenance plans. Obviously, the concepts underlying Wagtail and the tools it offers will influence how we implement our information architecture. But we also started gathering data long before we knew what software we’d use, so exactly how it will all work remains to be figured out.
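
To give a sense of what the StreamField approach looks like, here is a minimal sketch of a Wagtail page model whose body is a sequence of typed blocks rather than one big rich text field. The page class and block names are just illustrations of the idea, not our actual models, and import paths vary between Wagtail versions (this follows the 2.x layout).

# A minimal, hypothetical Wagtail page model using StreamField.
from wagtail.core import blocks
from wagtail.core.fields import StreamField
from wagtail.core.models import Page
from wagtail.admin.edit_handlers import StreamFieldPanel
from wagtail.images.blocks import ImageChooserBlock


class StandardPage(Page):
    # Content is a sequence of typed blocks instead of one rich text blob,
    # so authors cannot override the theme with arbitrary markup.
    body = StreamField([
        ('heading', blocks.CharBlock()),
        ('paragraph', blocks.RichTextBlock()),
        ('image', ImageChooserBlock()),
    ], blank=True)

    content_panels = Page.content_panels + [
        StreamFieldPanel('body'),
    ]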

Has your library done a website redesign or information architecture test recently? What tools or approaches did you find useful? Let us know in the comments!

Notes

  1. I described Google Analytics events in a previous Tech Connect post.

Decentralizing Library IT

I’ve always gravitated toward library jobs in systems and technology, but I recently took on a new position as head of a technical services department in a smaller academic library. Some of my colleagues expressed surprise that I’m moving out of a traditional library IT or systems role, but my former position was as a systems librarian within a technical services department, and for the past few years a significant amount of my time has involved developing collection- and metadata-related system integrations for acquisitions and cataloging. A few trends have made me think that I’m not alone in branching out and applying systems skills to diverse functional areas of the library. It has become relatively commonplace for the work of technology innovation to occur, at least in part, outside of traditional library IT departments; for example, reference and instruction librarians playing a tightly integrated role in the optimization of discovery interfaces, tech services staff using Python and linked data technologies to clean up and enhance metadata, and instruction librarians and access services staff creating and managing high-tech makerspaces.

More personnel across the library are embracing and developing high tech skills traditionally housed in library systems or IT departments.  The following are six general trends I’ve observed that are influencing the spread of technology development outside of traditional library IT.

Increasingly high technical skills are required for most library areas

  • Job advertisements for almost every functional area of the library emphasize advanced technical knowledge (beyond typical office application knowledge), especially with regard to ILS systems management.  In a 2016 study of library job advertisements, the authors found a wide range of job titles that require knowledge and skills in information technology, including Metadata Librarian, Digital Archivist, Information Literacy Librarian, and Research Data Librarian (Shahbazi, Fahimnia, & Khoshemehr, 2016).   Scholarly communications, data services, e-resource management, reference, and other library staff positions may all be positioned outside of traditional library IT, but are all deeply involved in the utilization and development of library technologies.

Optimization of cloud-based systems can be distributed

  • With cloud-based application hosting, managing physical servers and backups may become less burdensome, but the need for knowledgeable personnel to configure and optimize often complex cloud-based systems is as essential as it has always been. Scholarly communications, e-resource, and access services staff may play highly integrated roles in the development and optimization of library systems. Beyond acting as consultants for the management of these systems, knowledge of scripting and interoperability mechanisms enables staff outside of library IT to contribute holistically to the development of cloud-based library applications.

More opportunities to build integrations

  • New library services platforms often enable a number of integrations with third party systems via APIs and other web services.1 Many of these integrations require a deep knowledge of workflows and data structures across multiple systems, so involvement from multiple functional areas is usually necessary. The combination of knowledge of a functional area with knowledge of how the library system works can result in some seriously powerful and useful applications, as in the sketch below.
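
As a minimal sketch of that combination, the script below pulls open purchase orders from a hypothetical library services platform API and writes a summary spreadsheet for staff who work outside the system. The endpoint, API key, and field names are invented placeholders, not any particular vendor’s API.

# Sketch of a cross-system integration; endpoint and fields are hypothetical.
import csv
import requests

API_BASE = "https://lsp.example.edu/api/v1"   # hypothetical library services platform
API_KEY = "changeme"                          # stored securely in practice


def fetch_open_orders():
    """Pull open purchase order lines from the (hypothetical) acquisitions API."""
    resp = requests.get(
        f"{API_BASE}/acq/po-lines",
        params={"status": "open", "apikey": API_KEY},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("po_lines", [])


def write_report(po_lines, path="open_orders.csv"):
    """Summarize the orders as a CSV for selectors or collection managers."""
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["po_number", "title", "fund", "amount"])
        for line in po_lines:
            writer.writerow([
                line.get("number"),
                line.get("title"),
                line.get("fund_code"),
                line.get("price"),
            ])


if __name__ == "__main__":
    write_report(fetch_open_orders())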

The increasing importance of data wrangling

  • Sources of metadata are increasingly varied, requiring data wrangling that is often made more efficient by coding or scripting routine tasks. Metadata specialists are often experts in developing macros in OCLC Connexion (for example), and increasingly require access to a computing environment for writing, testing, and running code to further automate metadata crosswalking and cleanup, including Python, OpenRefine, and other tools for dealing with enormous amounts of messy data. A small example of this kind of scripted cleanup follows.
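
This is a toy sketch of scripted metadata cleanup: normalizing ISBNs and trimming trailing ISBD-style punctuation from a CSV export. The file and field names are hypothetical.

# Hypothetical example of scripted metadata cleanup before loading records
# into another system.
import csv
import re


def normalize_isbn(raw):
    """Strip hyphens and spaces; keep only a plausible 10- or 13-digit ISBN."""
    digits = re.sub(r"[^0-9Xx]", "", raw or "")
    return digits.upper() if len(digits) in (10, 13) else ""


def clean_title(raw):
    """Remove trailing ISBD-style punctuation such as ' /' or ' :'."""
    return re.sub(r"\s*[/:;,]\s*$", "", (raw or "").strip())


with open("export.csv", newline="") as src, open("cleaned.csv", "w", newline="") as dest:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dest, fieldnames=["isbn", "title"])
    writer.writeheader()
    for row in reader:
        writer.writerow({
            "isbn": normalize_isbn(row.get("isbn", "")),
            "title": clean_title(row.get("title", "")),
        })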

Customization of discovery and library e-content

  • Discovery platforms are complex and often highly customizable. Many reference and instruction librarians have a robust understanding of user behavior and information literacy goals that are essential for development of usable interfaces, as well as skills in user experience (UX) testing and interface design.  Reference and instruction librarians are often experts in course management systems and LibGuides, and know good tricks and hacks for optimizing digital learning content.  Library collection development, scholarly communications and technical services librarians deeply understand content and how to make it findable, and increasingly play pivotal roles in configuring harvesting and transformation of metadata into discovery systems.

Systems beyond the ILS

  • Libraries are engaging with a much wider variety of technologies than just an ILS: institutional repository and digital library software, data management software, authority systems such as VIAF, open publishing, etc. While working with these systems does not necessarily require a background or emphasis on systems administration, it is definitely helpful to have an understanding of the architecture of such applications and how applications might interact with each other.
  • Systems knowledge is applicable to more than just technology. Thinking like a programmer can often be useful when performing workflow analysis and optimization, as well as problem-solving even in non-technical areas.

Are “true” library systems administrators still needed? (Yes, obviously)

When researching for this post, I came across an amusing article by Roy Tennant from back in 2011 titled “If You are a Library SysAdmin, you are TOAST”.  The article presents a (seemingly not satirical?) argument that movement to cloud-based systems in libraries will make library system administrators obsolete:

When I, as just a moderately savvy librarian, can learn maybe five to ten very specific steps and be able to deploy any application I would likely want to deploy, why do I need to talk to my system administrator ever again?

Obviously, six years after that article was written, with many libraries firmly embedded in the cloud with a variety of library applications, the role of the system administrator in libraries is not at all diminished. While work involving physical servers and backups may be less common for many applications, system administrators and those with IT skills in libraries are still in huge demand to evaluate, optimize, and provide integrations for cloud-based library systems.

I think it’s safe to say that the more people in any organization with technical knowledge, the better. Managing decentralized technology projects, however, does require leadership and coordination. When learning to develop applications, coding is often much more fun than worrying about server administration and security, but of course someone has to be concerned about security and help those who are just learning about technology adopt secure development practices. Library technology projects don’t have to come out of library IT departments, but leadership from library IT should be open and supportive of technology initiatives coming out of other areas while facilitating secure practices. Coordination on the part of library IT is also essential to avoid duplication of effort and to ensure that projects being developed are sustainable and supported by the technology environment of the larger organization. Encouraging the open exchange of technology-related ideas across the library prevents tech-savvy staff from feeling they need to hide their pet projects lest they get ‘in trouble’ with a restrictive library IT department.

In my view, there’s simply too much technology change happening in libraries to keep all technology development centralized in a single unit. Adding tech-savvy positions within non-technology departments is not a bad strategy; it can help support innovation out of library departments that haven’t traditionally been expected to drive technological change. However, continually raising expectations for technical knowledge can be stressful, so ensuring strong support for professional development is also important. In my own new position, I’m excited to channel my knowledge of APIs and interest in data visualization technologies into creating some cool collection management and assessment tools, and I’m not at all concerned that I won’t have an opportunity to apply my technical knowledge in the rapidly changing landscape of library technical services and collection development. Working outside of library IT means that I need to communicate closely with the head of library IT about the projects I’m working on, follow other technology-related projects across the library, and be proactive about offering my skills where they might be helpful. It also means that I need to support the technical expertise of staff in my department, particularly as related to library system management in acquisitions and cataloging. No matter your role in the library, there’s plenty of technology-related work to go around.

  1.  See, for example, the Ex Libris and OCLC Developer Networks, both of which provide great documentation and example applications to novice developers.

Voice, Natural Language Processing, and the Future of Library Experiences

Is the future of research voice controlled? It might be; when I originally had the idea for this post, my first instinct was to grab my phone and dictate my half-formed ideas into a note rather than type them out. Writing things down often makes them seem wrong and not at all what we are trying to say in our heads. (Maybe that’s not so new, since, as you may remember, Socrates had a similar instinct.) The idea came out of a few different talks at the national Code4Lib conference held in Los Angeles in March of 2017 and a talk given by Chris Bourg. Among these presentations, the themes of machine learning, artificial intelligence, natural language processing, voice search, and virtual assistants intersect to give us a vision of what is coming. The future might look like a system that can parse imprecise human language and turn it into an appropriately structured search query across a database or variety of databases, bearing in mind other variables, and return the correct results. Pieces of this exist already, of course, but I suspect over the next few years we will be building or adapting tools to perform these functions. As we do, we should think about how we can incorporate our values and skills as librarians into these tools along the way.

Natural Language Processing

I will not attempt to summarize natural language processing (NLP) here, except to say that speaking to a computer requires that the computer be able to understand what we are saying. Human, or natural, language is messy, full of nuance and context that takes people years to master, and even then it often leads to misunderstandings that can range from funny to deadly. Using a machine to understand and parse natural language requires complex techniques, but luckily there are a lot of tools that can make the job easier. For more details, you should review the NLP talks by Corey Harper and Nathan Lomeli at Code4Lib. Both talks showed that there is a great deal of complexity involved in NLP and that its usefulness is still relatively confined. Nathan Lomeli puts it like this: NLP can “cut strings, count beans, classify things, and correlate everything”. 1 Given a corpus, you can use NLP tools to figure out what certain words might be, how many of those words there are, and how they might connect to each other.
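
As a toy illustration of that “cut strings, count beans, classify things, and correlate everything” framing (my own example, not taken from either talk), the snippet below does each step with nothing but the Python standard library.

# Cut strings, count beans, classify things, and correlate everything,
# on a tiny made-up corpus.
import re
from collections import Counter
from itertools import combinations

text = """Libraries collect books. Libraries also collect data about books,
and data about how people use books."""

# Cut strings: split into sentences and lowercase word tokens.
sentences = [s for s in re.split(r"[.!?]\s*", text) if s]
tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]

# Count beans: word frequencies across the whole corpus.
counts = Counter(word for sent in tokenized for word in sent)
print(counts.most_common(5))

# Classify things (crudely): flag sentences that mention data.
for sent, words in zip(sentences, tokenized):
    label = "data-related" if "data" in words else "other"
    print(label, "->", sent)

# Correlate everything: count how often word pairs co-occur in a sentence.
cooccurrence = Counter()
for words in tokenized:
    for a, b in combinations(sorted(set(words)), 2):
        cooccurrence[(a, b)] += 1
print(cooccurrence.most_common(3))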

Processing language to understand a textual corpus has a long history but is now relatively easy for anyone to do with the tools out there. The easiest is Voyant Tools, a project by Stéfan Sinclair and Geoffrey Rockwell. It is a portal to a variety of tools for NLP. You can feed it a corpus and get back all kinds of counts and correlations. For example, Franny Gaede and I used Voyant Tools to analyze social justice research websites to develop a social justice term corpus for a research project. While a certain level of human review is required for any such project, it’s possible to see that this technology can replace a lot of human-created language. This is already happening, in fact. A tool called Wordsmith can create convincing articles about finance, sports, and technology, or really any field with a standard set of inputs and outputs in writing. If computers are writing stories, they can also find stories.

Talking to the Voice in the Machine

Finding those stories, and in turn finding the data with which to tell more stories, is where machine learning and artificial intelligence enter. In libraries we have a lot of words, and while we have various projects that are parsing those words and doing things with them, we have only begun to see where this can go. There are two sides to this. Chris Bourg’s talk at Harvard Library Leadership in a Digital Age asks the question “What happens to libraries and librarians when machines can read all the books?” One suggestion she makes is that:

we would be wise to start thinking now about machines and algorithms as a new kind of patron  — a patron that doesn’t replace human patrons, but has some different needs and might require a different set of skills and a different way of thinking about how our resources could be used. 2

One way in which we can start to address the needs of machines as patrons is by creating searches that work with them, which for now ultimately serves the needs of humans, but in the future could serve their own artificial intelligence purposes. Most people are familiar with the virtual assistants that have popped up on all platforms over the past few years. As an iOS and a Windows user, I am constantly invited to speak to Siri or Cortana to search for answers to my questions or fix something in my schedule. While I’m perfectly happy to ask Siri to remind me to bring my laptop to work at 7:45 AM or to wake me up in 20 minutes, I get mixed results when I try to ask a more complex question. 3 Sometimes when I ask the temperature on the surface of Jupiter I get the answer; other times I get today’s weather in a town called Jupiter. This is not too surprising, as asking “What is the temperature of Jupiter?” could mean a number of things. It’s on the human to specify to the computer which domain of knowledge you are referring to, which requires knowing exactly how to ask the question. Computers cannot yet do a reference interview, since they cannot pick up on subtle hidden meanings or help with the struggle for the right words the way librarians do so well. But they can help with certain types of research tasks quite well, if you know how to ask the question. Eric Frierson (PPT) gave a demonstration of his project on voice-powered search in EBSCO using Alexa. In the presentation he demonstrates the Alexa “skills” he set up for people to ask Alexa for help. They are “do you have”, “the book”, “information about”, “an overview of”, “what I should read after”, or “books like”. There is a demonstration of what this looks like on YouTube. The results are useful when you say the correct thing in the correct order, and for an active user it would be fairly quick to learn what to say, just as we learn how best to type a search query into various services.
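
For a sense of the mechanics, here is a generic sketch of a Lambda-style handler for a custom Alexa skill answering a “do you have” question. This is not Frierson’s implementation; the intent name, slot name, and catalog lookup are invented for illustration.

# Generic sketch of an Alexa custom-skill handler; names are hypothetical.
def search_catalog(title):
    """Placeholder for a real catalog or discovery-layer search."""
    fake_holdings = {"the martian": "available at the Main Library"}
    return fake_holdings.get(title.lower())


def lambda_handler(event, context):
    request = event.get("request", {})
    intent = request.get("intent", {})
    if request.get("type") == "IntentRequest" and intent.get("name") == "DoYouHaveIntent":
        slots = intent.get("slots", {})
        title = slots.get("Title", {}).get("value", "")
        holding = search_catalog(title)
        speech = (f"Yes, {title} is {holding}." if holding
                  else f"I could not find {title} in the catalog.")
    else:
        speech = "Try asking whether the library has a specific book."

    # Minimal Alexa response envelope.
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": True,
        },
    }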

Some Considerations

People in the Sixties: The government will wiretap your home. People now: Hey wire tap, can cats eat pancakes?
Alexa meme

Why ask a question of a computer rather than type it into a computer? For the reason I started this piece with, certainly: voice is there, and it’s often easier to say what you mean than to write it. There is a pragmatic side as well: if you find typing difficult, being able to speak makes life easier. When I was home with a newborn baby I really appreciated being able to dictate, and to ask Siri about the weather forecast and what time the doctor’s appointment was. Herein lies one of the many potential pitfalls of voice: who is listening to what you are saying? One recent news story puts this in perspective, as Amazon agreed to turn over data from Alexa to police in a murder investigation after the suspect gave the OK. They refused to do so at first, and it remains an open question what the legal status of a conversation with a virtual assistant is. Nor is it entirely clear, when you speak to a device, where the data is being processed. So before we all rush out and write voice search tools for all our systems, it is useful to think about where that data lives and what its purpose is.

If we would protect a user’s search query by ensuring that our catalogs are encrypted (and let’s be honest, we aren’t there yet), how do we do the same for virtual search assistants in the library catalog? For Alexa, that’s built into creating an Alexa skill, since a basic requirement for the web service used is that it meet Amazon’s security requirements. But if this data is subject to subpoena, we would have to think about it in the same way we would any other search data on a third party system. We also have to recognize that these tools are created by companies for commercial purposes, and part of that purpose is to gather data about people and sell things to them based on that data. Machine learning could eventually build on that to learn a lot more about people than they realize, which the Amazon Echo Look recently brought up as a subject of debate. There are likely to be other services popping up in addition to those offered by Amazon, Google, Apple, and Microsoft. Before long, we might expect our vendors to offer voice search in their interfaces, and we need to be aware of how that data is transmitted and where it is processed. A recently formed alliance, the Voice Privacy Alliance, is developing some standards for this.

The invisibility of the result processing has another dark side. The biases inherent in the algorithms become even more hidden, as the first result becomes the “right” one. If Siri tells me the weather in Jupiter, that’s a minor inconvenience, but if Siri tells me that “Black girls” are something hypersexualized, as Safiya Noble has found that Google does, do I (or let’s say, a kid) necessarily know something has gone wrong? 4 Without human intervention and understanding, machines can perpetuate the worst side of humanity.

This comes back to Chris Bourg’s question. What happens to librarians when machines can read all the books and have a conversation with patrons about those books? Luckily for us, it is unlikely that artificial intelligence will ever be truly self-aware, with desires, metacognition, love, and a need for growth and adventure. Those qualities will continue to make librarians useful for creating vibrant and unique collections and communities. But we will need to fit that into a world where we are having conversations with our computers about those collections and communities.

 

  1. Lomeli, Nathan. “Natural Language Processing: Parsing Through The Hype”. Code4Lib. Los Angeles, CA. March 7, 2017.
  2. “What Happens to Libraries and Librarians When Machines Can Read All the Books?” Feral Librarian, March 17, 2017. https://chrisbourg.wordpress.com/2017/03/16/what-happens-to-libraries-and-librarians-when-machines-can-read-all-the-books/.
  3. As a side issue, I don’t have a private office and I feel weird speaking to my computer when there are people around.
  4. Noble, Safiya Umoja. “Google Search: Hyper-Visibility as a Means of Rendering Black Women and Girls Invisible – InVisible Culture.” InVisible Culture: An Electronic Journal for Visual Culture, no. 19 (2013). http://ivc.lib.rochester.edu/google-search-hyper-visibility-as-a-means-of-rendering-black-women-and-girls-invisible/.

How to Price 3D Printing Service Fees

Many libraries today provide a 3D printing service. But not all of them can afford to do so for free. While free 3D printing may be ideal, it can jeopardize the sustainability of the service over time. Still, many libraries are hesitant about charging service fees.

In this post, I will outline how I determined the pricing scheme for our library’s new 3D printing service, in the hope that more libraries will consider offering 3D printing if having to charge a fee is a factor stopping them. But let me begin with libraries’ general aversion to fees.

A 3D printer in action at the Health Sciences and Human Services Library (HS/HSL), Univ. of Maryland, Baltimore

Service Fees Are Not Your Enemy

Charging fees for a library service is not something librarians should regard as taboo. We live in a time in which libraries are asked to create and provide more and more new and innovative services to help users successfully navigate the fast-changing information landscape. A makerspace and 3D printing are certainly among those new and innovative services. But at many libraries the operating budget is shrinking rather than increasing, so the most obvious choice in this situation is to aim for cost recovery.

Remember that even when a library aims for cost recovery, it will only be partial cost recovery, because a lot of staff time and expertise goes into planning and operating such new services. Libraries should not be afraid to introduce new services that require fees, because users will still benefit from those services, often far more than from a commercial equivalent (if one even exists). Think of service fees as your friend. Without them, you won’t be able to introduce and continue to provide a service that your users need. They are an expected cost of doing business, and libraries will not make a profit from them (even if they try).

Still bothered? Almost every library charges for regular (paper) printing. Should a library not provide printing services at all because they cannot be offered for free? Library users certainly wouldn’t want that.

Determining Your Service Fees

What do you need in order to create a pricing scheme for your library’s 3D printing service?

(a) First, you need to list all cost-incurring factors. Those include (i) equipment cost and wear and tear, (ii) electricity, (iii) staff time & expertise for support and maintenance, and (iv) any consumables such as 3D print filament and painter’s tape. Remember that your new 3D printer will not last forever and will need to be replaced in three to five years.

Also, some of these cost-incurring factors, such as staff time and expertise for support, are roughly fixed per 3D print job. Others, such as 3D print filament, increase in proportion to the size and density of the model being printed: the larger and denser the model, the more filament is used and the higher the cost.
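
Put together, a simple weight-based estimate might look like the sketch below, which separates the fixed per-job costs from the variable filament cost. The dollar figures are made-up examples, not recommendations.

# Sketch of a weight-based price estimate; all figures are illustrative.
FIXED_FEE_PER_JOB = 2.00       # staff time, electricity, printer wear and tear
FILAMENT_COST_PER_GRAM = 0.05  # what the library pays for filament
MARKUP = 1.0                   # 1.0 = pure cost recovery, no profit


def estimate_price(filament_grams):
    """Return an estimated charge for one 3D print job."""
    variable = filament_grams * FILAMENT_COST_PER_GRAM * MARKUP
    return round(FIXED_FEE_PER_JOB + variable, 2)


print(estimate_price(42))   # e.g. a 42-gram model -> 4.1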

(b) Second, make sure that your pricing scheme is readily understood by users. Does it quickly give users a rough idea of the cost before their 3D print job begins? An obscure pricing scheme can confuse users and may deter them from trying out a new service. That would be bad user experience.

Also, consider whether you will charge for a failed print. Perhaps you will; perhaps you won’t. Maybe you want to charge a lower fee than for a successful print. Whichever you decide, have it covered, since failed prints will certainly happen.

(c) Lastly, the pricing scheme should be easy for library staff to handle. The more staff are involved in the process of a patron using the 3D printing service from beginning to end, the more important this becomes. If the pricing scheme is difficult for staff to work with when they need to charge for and process each 3D print job, the new service will increase their workload significantly.

Which staff will be responsible for which step of the new service? What exact tasks will they need to do? For example, several staff at the circulation desk may need to learn and handle new tasks for the 3D printing service, such as labeling and putting away completed 3D models, processing the payment transaction, delivering the model, and marking the paid 3D print job as ‘completed’ in a 3D printing staff admin portal if such a system is in place. Below is a screenshot of the HS/HSL 3D Printing Staff Admin Portal developed in-house by the library IT team.

The HS/HSL 3D Printing Staff Admin Portal, University of Maryland, Baltimore

Examples – 3D Printing Service Fees

It’s always helpful to see what other libraries are doing when you need to determine your own pricing scheme. Here are some examples showing how ten libraries’ 3D printing pricing schemes have changed over the past three years.

  • UNR DeLaMare Library
    • https://guides.library.unr.edu/3dprinting
    • 2014 – $7.20 per cubic inch of modeling material (raised to $8.45 starting July, 2014).
    • 2017 – uPrint – Model Material: $4.95 per cubic inch (=16.38 gm=0.036 lb)
    • 2017 – uPrint – Support Materials: $7.75 per cubic inch
  • NCSU Hunt Library
    • https://www.lib.ncsu.edu/do/3d-printing
    • 2014- uPrint 3D Printer: $10 per cubic inch of material (ABS), with a $5 minimum
    • 2014 – MakerBot 3D Printer: $0.35 per gram of material (PLA), with a $5 minimum
    • 2017 – uPrint – $10 per cubic inch of material, $5 minimum
    • 2017 – F306 – $0.35 per gram of material, $5 minimum
  • Southern Illinois University Library
    • http://libguides.siue.edu/3D/request
    • 2014 – Originally $2 per hour of printing time; Reduced to $1 as the demand grew.
    • 2017 – Lulzbot Taz 5, Lulzbot Mini – $2.00 per hour of printing time.
  • BYU Library
  • University of Michigan Library
    • The Cube 3D printer checkout is no longer offered.
    • 2017 – Cost for professional 3d printing service; Open access 3d printing is free.
  • GVSU Library
  • University of Tennessee, Chattanooga Library
  • Port Washington Public library
  • Miami University
    • 2014 – $0.20 per gram of the finished print; 2017 – ?
  • UCLA Library, Dalhousie University Library (2014)
    • Free

Types of 3D Printing Service Fees

From the examples above, you will notice that many 3D printing fee schemes are based upon the weight of the printed model. This is because these libraries are trying to recover the cost of the filament, and the amount of filament used is most accurately reflected in the weight of the resulting 3D-printed model.

However, there are a few problems with a weight-based pricing scheme. First, it is not readily calculable by users before the print job, because they would have to weigh a model that they won’t have until it is 3D-printed. Also, once a model is printed, staff have to weigh it and calculate the cost, which is time-consuming and not very efficient.

For this reason, my library considered an alternative pricing scheme based on the size of a 3D model. The idea was that we would have empty boxes in roughly three sizes – small, medium, and large – with three different prices assigned. Whichever box a user’s 3D-printed object fit into would determine how much the user paid for it. This seemed like a great idea because, compared with the weight-based scheme, it makes it easy for both users and library staff to determine how much a model will cost to print.

Unfortunately, this size-based pricing scheme has a few significant flaws. First, a smaller model may use more filament than a larger model if it is denser (that is, if it has a higher infill ratio). Second, depending on its shape, a model that fits in a large box may use much less filament than one that fits in a small box. Think of a large tree model with thin branches, compared with a 100%-filled compact baseball model that fits into a smaller box than the tree does. Third, the print resolution, which determines the layer height, can change the amount of filament used even when the same model is printed.

Different infill ratios – Image from https://www.packtpub.com/sites/default/files/Article-Images/9888OS_02_22.png

Charging Based upon the 3D Printing Time

So we couldn’t go with the size-based pricing scheme, but we did not like the problems of the weight-based scheme, either. As an alternative, we decided on a time-based pricing scheme, because printing time is roughly proportional to the amount of filament used, but it does not require staff to weigh each model. 3D printing software gives an estimate of the printing time, and most 3D printers also display the actual printing time for each model printed.

First, we wanted to confirm the hypothesis that 3D printing time and the weight of the resulting model are roughly proportional to each other. I tested this by comparing the estimated printing time and the estimated weight of several cube models. Here is the result I got using the MakerBot Replicator 2X.

  • 9.10 gm / 36 min = 0.25 gm per min.
  • 17.48 gm / 67 min = 0.26 gm per min.
  • 30.80 gm / 117 min = 0.26 gm per min.
  • 50.75 gm / 186 min = 0.27 gm per min.
  • 87.53 gm / 316 min = 0.28 gm per min.
  • 194.18 gm / 674 min = 0.29 gm per min.
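
As a quick check, a few lines of code can recompute these ratios from the measured values above. This is just a sketch (it assumes PHP 7.4 or later for the arrow function), not anything we ran in production:

<?php
// Measured (weight in grams, time in minutes) pairs from the test cubes above.
$tests = [
    [9.10, 36], [17.48, 67], [30.80, 117],
    [50.75, 186], [87.53, 316], [194.18, 674],
];

// Grams of filament used per minute of printing for each test print.
$ratios = array_map(fn($t) => $t[0] / $t[1], $tests);

printf(
    "g/min ranges from %.2f to %.2f (mean %.2f)\n",
    min($ratios), max($ratios), array_sum($ratios) / count($ratios)
);
// Prints: g/min ranges from 0.25 to 0.29 (mean 0.27)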

There is some variance, but the hypothesis holds up. Based upon this, let’s now calculate the 3D printing cost by time.

3D plastic filament from MakerBot costs $48 for ABS/PLA and $65 for dissolvable filament per 0.90 kg (2 lb). That works out to about $0.05 per gram for ABS/PLA and $0.07 per gram for dissolvable filament, or roughly 6 cents per gram on average.

Finalizing the Service Fee for 3D Printing

In an hour of 3D printing, roughly 15.6 gm of filament is used (0.26 gm x 60 min). That gives a filament cost of about 94 cents per hour of 3D printing (15.6 gm x 6 cents). So, for recovering the cost of filament alone, I get roughly $1 per hour of 3D printing time.

Earlier, I mentioned that filament is only one of the cost-incurring factors for the 3D printing service. It’s time to bring in the others, such as hardware wear and tear, staff time, electricity, and maintenance, plus the no-charge-for-failed-prints policy we adopted at our library. These add an additional amount per 3D print job, which at my library came out to about $2. (I will not go into details about how these were determined, because they will differ at each library.) So the final service fee for our new 3D printing service was set at $3 for up to one hour of 3D printing, plus $1 per additional hour. The $3 breaks down into $1 per hour of printing to cover filament and a $2 fixed cost for every 3D print job.
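
For illustration, here is a minimal sketch of that calculation in code. The constants come from the figures above, but the function name is mine, the assumption that partial additional hours round up to the next hour is mine as well, and this is not the actual code behind our submission form:

<?php
// $2 fixed cost per job plus roughly $1 per hour of printing for filament
// (0.26 gm/min x 60 min x $0.06/gm is about $0.94, rounded to $1).
function printing_fee(float $estimatedMinutes): float
{
    $fixedCost       = 2.00; // staff time, wear and tear, electricity, failed prints
    $filamentPerHour = 1.00; // filament cost per hour of printing
    $hours           = max(1, ceil($estimatedMinutes / 60)); // assumption: partial hours round up
    return $fixedCost + $hours * $filamentPerHour;
}

echo printing_fee(45), PHP_EOL;  // prints 3: $3 for a job of up to one hour
echo printing_fee(150), PHP_EOL; // prints 5: a 2.5-hour job is $3 for the first hour plus $2 for two more hours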

To help our users quickly get an idea of how much their 3D print job will cost, we added a feature to the online HS/HSL 3D Print Job Submission Form that automatically calculates and displays the final cost based upon the printing time estimate a user enters.

 

The HS/HSL 3D Print Job Submission form, University of Maryland, Baltimore

Don’t Be Afraid of Service Fees

I would like to emphasize that libraries should not be afraid to set service fees for new services. As long as the fees are easy to understand and staff can explain the reasons behind them, they should not deter a library from introducing an innovative new service.

There is a clear benefit in walking through all the cost-incurring factors and communicating how the final pricing scheme was determined (including the verification that 3D printing time and the weight of the resulting model are roughly proportional) to all library staff who will be involved in the new 3D printing service. If a library user inquires about or challenges the service fee, staff will be able to provide a reasonable explanation on the spot.

I implemented this pricing scheme at the same time as the launch of my library’s makerspace (the HS/HSL Innovation Space at the University of Maryland, Baltimore – http://www.hshsl.umaryland.edu/services/ispace/) back in April 2015. We have been providing and charging for 3D printing for more than two years. I am happy to report that in that entire time we have not received a single complaint about the service fee. No library user expected our new 3D printing service to be free, and all the comments we received about the fee were positive. Many expressed surprise at how inexpensive our 3D printing service is and thanked us for it.

To summarize, libraries should be willing to explore and offer innovative new services even when they require service fees. If you do, make sure that the resulting pricing scheme is (a) sustainable and accountable, (b) readily graspable by users, and (c) easily handled by the library staff who will process the payment transactions. Good luck and happy 3D printing at your library!

An example model with the 3D printing cost and the filament info displayed at the HS/HSL, University of Maryland, Baltimore

Hosting a Coding Challenge in the Library

In the fall of 2016, the city of Los Angeles held a two-week “Innovate LA” event intended to celebrate innovation and creativity within the LA region. Dozens of organizations around Los Angeles held events during Innovate LA to showcase and provide resources for making, invention, and application development. As part of this event, the library at California State University, Northridge developed and hosted two weeks of coding challenges, designed to introduce novice coders to basic development using existing tutorials. Coders were rewarded with digital badges distributed by the application Credly.

The primary organization of the events came out of the library’s Creative Media Studio, a space designed to facilitate audio and video production as well as experimentation with emerging technologies such as 3D printing and virtual reality.  Users can use computers and recording equipment in the space, and can check out media production devices, such as camcorders, green screens, GoPros, and more.  Our aim was to provide a fun, very low-stress way to learn about coding, provide time for new coders to get hands-on help with coding tutorials, and generally celebrate how coding can be fun.  While anyone was welcome to join, our marketing efforts specifically focused on students, with coding challenges distributed daily throughout the Innovate LA period through Facebook.

The Challenges

The coding challenges were sourced from existing coding tutorial sites such as Free Code Camp, Learn Ruby, and Codecademy. We wanted to offer a mix of front-end and server-side coding challenges, starting with HTML, CSS, and JavaScript and ramping up to PHP, Python, and Ruby. We tested several free tutorials and chose those with the most straightforward instructions that provided immediate feedback about incorrect code. We also tried to keep the interfaces consistent, using Free Code Camp most frequently so participants could get used to the interface and focus on coding rather than on the tutorial mechanism itself.

Here’s a list of the challenges and their corresponding badges earned:

  • Say Hello to the HTML Elements, Headline with the H2 Element, Inform with the Paragraph Element – HTML Ninja
  • Change the Color of Text, Use CSS Selectors to Style Elements, Use a CSS Class to Style an Element – CSS Ninja
  • Use Responsive Design with Bootstrap Fluid Containers, Make Images Mobile Responsive, Center Text with Bootstrap – Bootstrapper
  • Comment your JavaScript Code, Declare JavaScript Variables, Storing Values with the Assignment Operator – JavaScript Hacker
  • Learn how Script Tags and Document Ready Work, Target HTML Elements with Selectors Using jQuery, Target Elements by Class Using jQuery – jQuery Ninja
  • Uncomment HTML, Comment out HTML, Fill in the Blank with Placeholder Text – HTML Master
  • Style Multiple Elements with a CSS Class, Change the Font Size of an Element, Set the Font Family of an Element – CSS Master
  • Create a Bootstrap Button, Create a Block Element Bootstrap Button, Taste the Bootstrap Button Color Rainbow – Bootstrap Master
  • Getting Started and Cat/Dog – JS Game Maker
  • Target Elements by ID Using jQuery, Delete your jQuery Functions, Target the same element with multiple jQuery selectors – jQuery Master
  • Hello World, Variables and Types, Lists – Python Ninja
  • Hello World, Variables and Types, Simple Arrays – PHP Ninja
  • Hello World, Variables and Types, Math – Ruby Ninja
  • How to Use APIs with JavaScript (complete through Step 9: Authentication and API Keys) – API Ninja
  • Edit or create a Wikipedia page. You may join in at the Wikipedia Edit-a-thon or do your editing remotely. The Citation Hunt tool is a cool, easy way of going about editing a Wikipedia page; narrow it to a topic that interests you and make an edit. – WikiWiz
  • Create a 3D model for an original animated character. You may use TinkerCAD or Blender as free options, or feel free to use SolidWorks or AutoCAD if you are familiar with them. If you don’t know where to begin, TinkerCAD has step-by-step tutorials to help you bring your ideas to life. – 3D Designer
  • Get a selfie with a Google Cardboard or any virtual reality goggles – VR Explorer

Note that the final three challenges – editing a Wikipedia page, creating a 3D model, and experimenting with Google Cardboard or other virtual reality (VR) goggles – are not coding challenges, but we wanted to use the opportunity to promote some of the other services the Creative Media Studio provides. Conveniently, the library was hosting a Wikipedia Edit-A-Thon during the same period as the coding challenges, so it made sense to leverage both events as part of our Innovate LA programming.

The coding challenges and instructions were distributed via Facebook, and we also held “office hours” (complete with snacks) in one of the library’s computer labs to provide assistance with completing the challenges. The office hours were mostly informal, with two library staff members available to walk users through completing and submitting the challenges. One special office hours session was planned with a guest professor from our Cinema and Television Arts program, who helped users with a web-based game-making tutorial he had designed. This partnership was very successful, and that session had the highest attendance of any we offered. In future iterations of this event, more advance planning would enable us to partner with additional faculty members and feature tutorials they already use effectively with students in their curriculum.

Credly

We needed a way both to accept submissions documenting completion of coding challenges and to award digital badges. Originally we had investigated distributing digital badges through our campus learning management system, as some learning management systems, like Moodle, can award digital badges. There were a couple of problems with this: 1) we wanted the event to be open to anyone, including members of the community who wouldn’t have access to the learning management system, and 2) the digital badge capability hadn’t been activated in our campus’s instance of Moodle. Another route we considered was accepting submissions for completed challenges through the university’s Portfolium application, which has a fairly robust ability to accept submissions of completed work, but again, that wouldn’t let anyone outside of the university participate. Credly seemed like an easy, efficient way to both accept submissions and award badges that could also be embedded in third-party applications, such as LinkedIn. Since we hosted the competition in 2016, the capability to integrate Credly badges in Portfolium has become available.

Credly enables you to either design your badges using Credly’s Badge Builder or upload your own badge designs. Luckily, we had access to amazing student designers Katie Pappace, Rose Rieux, and Eva Cohen, who custom-created our badges using Adobe Illustrator. A Credly account for the library’s Creative Media Studio was created to issue the badges, and Credly “Credits” were defined using the custom badge designs, one for each coding skill for which we wanted to award a badge.

When you design a credit in Credly and enable others to claim it, you have several options. You can require a claim code, which users must submit in order to claim the credit. Claim codes are useful if you want to award badges based not on evidence (like a file submission) but on participation or attendance at an event where you distribute the claim code to attendees. When claim codes are required, you can also set approval of submissions to be automatic, so that anyone with a claim code automatically receives their badge. We didn’t require a claim code, and instead required evidence to be submitted.

When requiring evidence, you can configure what types of evidence are acceptable for receiving the badge. Choices include a URL, a document (Word, text, or PDF), an image, an audio file, a video file, or an open text submission. As users completed code challenges, we asked for screenshots (images) as evidence of completion for most challenges. We reviewed all submissions to ensure they were correct, but by requiring screenshots, we could easily see whether or not the tutorial itself had “passed” the code submission.

Awards

Credly makes it easy to count the number of badges earned by each participant. From those numbers, we were able to determine the top badge earners and award them prizes. All participants, even those with a single badge, received buttons of each of their earned badges. In addition to the virtual and physical badges, participants with the greatest number of earned badges received prizes: the top five received gift cards, and the grand prize winner also got a 3D-printed trophy designed with Tinkercad that incorporated their photo as a lithophane. A low-stakes award ceremony was held for all contestants and winners. The top awards were hot commodities, and the ceremony was a good opportunity for students to meet others interested in coding and STEM.

Lessons Learned

Our first attempt at hosting coding challenges in the library taught us a few things. First, taking a screenshot is definitely not a skill most participants started out with: the majority of initial questions we received were not about coding, but about how to take a screenshot of their completed code to submit to Credly. For future events, we’ll make sure to include step-by-step instructions for taking screenshots on both PC and Mac with each challenge, or consider an alternative method of collecting submissions (e.g., copying and pasting code as a text submission into Credly). Even then, it’s important not to assume that copying and pasting text from a screen is a skill all participants will have.

As noted above, planning ahead would enable us to reach out and partner with faculty more effectively, and possibly coordinate coding challenges with curriculum. A few months before the coding challenges, we did reach out to computer science faculty, cinema and television arts faculty, and other faculty who teach courses involving code, but if we had reached out much earlier (e.g., the semester before), we likely would have been able to garner more faculty involvement. Faculty schedules are jam-packed and often set far in advance, so at least six months of notice is definitely appreciated.

Only about 10% of coding challenge participants came to coding office hours regularly, but that enabled us to provide tailored, one-on-one assistance to our novice coders.  A good portion of understanding how to get started with coding and application development is not related to syntax, but involves larger questions about how applications work:  if I wanted to make a website, where would my code go?  How does a URL figure out where my website code is?  How does a browser understand and render code?  What’s the difference between JavaScript (client-side code) and PHP (server-side code), and why are they different?  These were the types of questions we really enjoyed answering with participants during office hours.  Having fewer, more targeted office hours — where open questions are certainly encouraged, but where participants know the office hours are focused on particular topics — makes attending the office hours more worthwhile, and I think gives novice coders the language to ask questions they may not know they have.

One small bit of feedback was personally rewarding for the authors: at one of our office hours, a young woman came up and asked if we were the planners of the coding challenges. When we said yes, she told us how excited she was (and a bit surprised) to see women involved with coding and development. She asked us several questions about our jobs and how we got involved with careers relating to technology. That interaction suggested that future outreach could focus on promoting coding to women specifically, or on hosting coding office hours that provide mentoring for women coders on campus, modeling (or joining up with) Women Who Code networks.

If you’re interested in hosting support for coding activities or challenges in your library, a great resource to get started with is Hour of Code, which promotes holding one-hour introductions to coding and computer science, particularly during Computer Science Education Week. Hour of Code provides tutorials, resources for hosts, promotional materials, and more. This year, Hour of Code week / Computer Science Education Week will be December 4-10, 2017, so start planning now!

Representing Online Journal Holdings in the Library Catalog

The Problem

It isn’t easy to communicate to patrons what serials they have access to and in what form (print, online). They can find these details, sure, but they’re scattered across our library’s web presence. What’s most frustrating is that we clearly have all the necessary information, but the systems offer no built-in way to produce a clear display of it. My fellow librarians noted that “it’d be nice if the catalog showed our exact online holdings” and my initial response was to sigh and say “yes, that would be nice”.

To illustrate the scope of the problem, a user can search for journals in a few of our disparate systems:

  • we use a knowledgebase to track database subscriptions and which journals are included in each subscription package
  • the public catalog for our Koha ILS has records for our print journals, sometimes with a MARC 856$u 1 link to our online holdings in the knowledgebase
  • our discovery layer has both article-level results for the journals in our knowledgebase and journal-level search results for the ones in our catalog

While these systems overlap, they also serve distinct purposes, so the situation is not so awful. However, there are a few downsides to our triad of serials information systems. First of all, if a patron searches the knowledgebase for a journal we only have in print, the knowledgebase holdings won’t show that we offer access to print issues. To work around this, we track our print issues in both our ILS and the knowledgebase, which duplicates work and introduces possible inconsistencies.

Secondly, someone might start their research in the discovery layer, finding a journal-level record that links out to our catalog. But it’s too much to ask a user to search the discovery layer, click into the catalog, click a link out to the knowledgebase, and only then discover that our online holdings don’t include the particular volume they’re looking for. Possessing three interconnected systems creates labyrinthine search patterns and confusion amongst patrons. Simply describing the systems and their nuanced areas of overlap in this post feels like a challenge, and the audience is librarians. I can imagine how our users must feel when we try to outline the differences.

The 360 XML API

Our knowledgebase is Serials Solutions 360KB. I went looking for answers in the vendor’s help documentation, which refers to an API for the product but apparently provides no information on using it. Luckily, a quick search through GitHub projects yielded several using the API, and I was able to determine its URL structure: http://{{your Serials Solution ID}}.openurl.xml.serialssolutions.com/openurlxml?version=1.0&url_ver=Z39.88-2004&issn={{the journal’s ISSN}}

It’s probably possible to search by other parameters as well, but for my purposes ISSN was ideal so I didn’t bother investigating further. If you send a request to the address above, you receive XML in response:

<ssopenurl:openURLResponse xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:ssdiag="http://xml.serialssolutions.com/ns/diagnostics/v1.0" xmlns:ssopenurl="http://xml.serialssolutions.com/ns/openurl/v1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://xml.serialssolutions.com/ns/openurl/v1.0 http://xml.serialssolutions.com/ns/openurl/v1.0/ssopenurl.xsd http://xml.serialssolutions.com/ns/diagnostics/v1.0 http://xml.serialssolutions.com/ns/diagnostics/v1.0/diagnostics.xsd">
    <ssopenurl:version>1.0</ssopenurl:version>
    <ssopenurl:results dbDate="2017-02-15">
        <ssopenurl:result format="journal">
            <ssopenurl:citation>
                <dc:source>Croquis</dc:source>
                <ssopenurl:issn type="print">0212-5633</ssopenurl:issn>
            </ssopenurl:citation>
            <ssopenurl:linkGroups>
                <ssopenurl:linkGroup type="holding">
                    <ssopenurl:holdingData>
                        <ssopenurl:startDate>1989</ssopenurl:startDate>
                        <ssopenurl:providerId>PRVLSH</ssopenurl:providerId>
                        <ssopenurl:providerName>Library Specific Holdings</ssopenurl:providerName>
                        <ssopenurl:databaseId>ZYW</ssopenurl:databaseId>
                        <ssopenurl:databaseName>CCA Print Holdings</ssopenurl:databaseName>
                        <ssopenurl:normalizedData>
                            <ssopenurl:startDate>1989-01-01</ssopenurl:startDate>
                        </ssopenurl:normalizedData>
                    </ssopenurl:holdingData>
                    <ssopenurl:url type="source">https://library.cca.edu/</ssopenurl:url>
                    <ssopenurl:url type="journal">
                    https://library.cca.edu/cgi-bin/koha/opac-search.pl?idx=ns&q=0212-5633
                    </ssopenurl:url>
                </ssopenurl:linkGroup>
            </ssopenurl:linkGroups>
        </ssopenurl:result>
    </ssopenurl:results>
    <ssopenurl:echoedQuery timeStamp="2017-02-15T16:14:12">
        <ssopenurl:library id="EY7MR5FU9X">
            <ssopenurl:name>California College of the Arts</ssopenurl:name>
        </ssopenurl:library>
        <ssopenurl:queryString>version=1.0&url_ver=Z39.88-2004&issn=0212-5633</ssopenurl:queryString>
    </ssopenurl:echoedQuery>
</ssopenurl:openURLResponse>

If you’ve read XML before, it’s apparent how useful the above data is. It contains a list of our “holdings” for the periodical, with the start and end dates of the subscription (the end date is absent here, which implies the holdings run to the present), which database they’re in, and what URL they can be accessed at. Perfect! The XML contains precisely the information we want to display in our catalog.

Unfortunately, our catalog’s JavaScript doesn’t have permission to access the 360 XML API. Under the browser security policy known as Cross-Origin Resource Sharing (CORS), a resource must explicitly state, via the Access-Control-Allow-Origin HTTP header, that other domains or pages are allowed to request its data, and the 360 API doesn’t send that header.

We can work around this limitation, but it requires extra code on our part. While JavaScript from a web page cannot request data directly from 360, we can write a server-side script to pull the data. That server-side script can then add its own CORS header, which lets our catalog use it. So, in essence, we set up a proxy service that acts as a go-between for our catalog and the API the catalog cannot use directly. Typically, this takes little code: the server-side script takes a parameter passed to it in the URL, sends it in an HTTP request to another server, and serves back up whatever response it receives.
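
In its simplest form, the proxy is only a few lines. Here is a sketch of that bare pass-through version, not our production script; the client ID in the URL and the catalog domain in the CORS header are placeholders you would swap for your own:

<?php
// proxy.php (hypothetical name): take an ISSN from the query string, request
// the 360 XML API, and echo the response back with the CORS header that the
// 360 API itself does not send.
$issn = preg_replace('/[^0-9Xx-]/', '', $_GET['issn'] ?? '');
$url  = 'http://YOUR_CLIENT_ID.openurl.xml.serialssolutions.com/openurlxml'
      . '?version=1.0&url_ver=Z39.88-2004&issn=' . urlencode($issn);

header('Access-Control-Allow-Origin: https://your-catalog.example.edu');
header('Content-Type: text/xml');

echo file_get_contents($url);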

Of course, it didn’t turn out to be that simple in practice. As I experimented with my scripts, I could tell that the 360 data was being received, but I couldn’t parse meaningful pieces of information out of it. The data was clearly there; I could see the full XML structure with holdings details. But neither my server-side PHP nor my client-side JavaScript could “find” XML elements like <ssopenurl:linkGroup> and <ssopenurl:normalizedData>. The text before the colon in those tag names is a namespace prefix, and simple jQuery code like $('ssopenurl:linkGroup', xml), which can typically parse XML data, wasn’t working with these namespaced elements.

Finally, I discovered the solution by reading the PHP manual’s entry for the simplexml_load_string function: I can tell PHP how to parse namespaced XML by passing a namespace parameter to the parser function. So my function call turned into:

// parameters:
// 1) the XML string itself; $url is the 360 API address we want to pull from
// 2) the class of object the function should return ('SimpleXMLElement' is the default)
// 3) libxml options (0, also the default; no special options)
// 4) (finally!) the XML namespace we care about, 'ssopenurl'
// 5) true means the previous argument is a namespace prefix, not a URI
$xml = simplexml_load_string(file_get_contents($url), 'SimpleXMLElement', 0, 'ssopenurl', true);

As you can see, two of those parameters don’t even differ from the function’s defaults, but I still need to provide them to get to the “ssopenurl” namespace later. As an aside, technical digressions like these are some of the best and worst parts of my job. It’s rewarding to encounter a problem, perform research, test different approaches, and eventually solve it. But it’d also be nice, and a lot quicker, if code would just work as expected the first time around.
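
Once the namespace is set up this way, pulling the holdings out and handing them to the catalog as JSON is short work. Here is a sketch that continues from the $xml object above; the element names come from the sample response earlier in this post, and the array keys are my own:

// Walk the ssopenurl elements and collect the holdings details the catalog needs.
$holdings = [];
foreach ($xml->results->result as $result) {
    foreach ($result->linkGroups->linkGroup as $group) {
        $data  = $group->holdingData;
        $entry = [
            'database' => (string) $data->databaseName,
            // an absent end date implies the holdings run to the present
            'start'    => (string) $data->normalizedData->startDate,
        ];
        foreach ($group->url as $link) {
            $attrs = $link->attributes(); // the type attribute is not namespaced
            if ((string) $attrs['type'] === 'journal') {
                $entry['url'] = trim((string) $link);
            }
        }
        $holdings[] = $entry;
    }
}

// Send JSON back to the catalog, again with the CORS header from the proxy sketch.
header('Content-Type: application/json');
header('Access-Control-Allow-Origin: https://your-catalog.example.edu');
echo json_encode($holdings);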

The Catalog

We’re lucky that Koha’s catalog both allows for JavaScript customization and has a well-structured, easy-to-modify record display. Now that I’m able to grab online holdings data from our knowledgebase, inserting it into the catalog is trivial. If you wanted to do the same with a different library catalog, the only changes come in the JavaScript that finds ISSN information in a record and then inserts the retrieved holdings information into the display. The complete outline of the data flow from catalog to knowledgebase and back looks like this:

  • my JavaScript looks for an ISSN on the record’s display page
  • if there’s an ISSN, it sends the ISSN to my proxy script
  • the proxy script adds a few parameters & asks for information from the 360 XML API
  • the 360 XML API returns XML, which my proxy script parses into JSON and sends to the catalog
  • the catalog JavaScript receives the JSON and parses the holdings information into formatted HTML like “Online resources: 1992 to present in DOAJ”
  • the JS inserts the formatted text into the record’s “online resources” section, creating that section if it doesn’t already exist

Is there a better way to do this? Almost certainly. The six steps above should give you a sense of how convoluted the process is, hacking around a few limitations. Still, the outcome is positive: we stopped updating our print holdings in our knowledgebase and our users have more information at their fingertips. It obviates the final step in the protracted “discovery layer to catalog” search described in the opening of this post.

Our next steps are obvious, too: we should aim to get this information into the discovery layer’s search results for our journals. The general frame of this project would be the same; we already know how to get the data from the API. Much like working with a different library catalog, the only edits are in parsing ISSNs from discovery layer search results and finding a spot in the HTML to insert the holdings data. Finally, we can also remove the redundant and less useful 856$u links from our periodical MARC records now.

The Scripts

These are highly specific to our catalog, but may be of general use to others who want to see how the pieces work together:

Notes

  1. For those unfamiliar, 856 is the MARC field for URLs, whether the URL represents the actual resource being described or something supplementary. It’s pretty common for print journals to also have 856 fields for their online counterparts.

Creating an OAI-PMH Feed From Your Website

Libraries that use a flexible content management system such as Drupal or WordPress for their library website and/or resource discovery face a challenge in ensuring that their data is accessible to the rest of the library world. Whether they are making metadata usable by other libraries or portals such as DPLA, or harvesting content into a discovery layer, libraries need to take some additional steps to make this happen. While there are a number of ways to accomplish this, the most straightforward is to create an OAI-PMH feed. OAI-PMH stands for Open Archives Initiative Protocol for Metadata Harvesting, and it is a well-supported and well-understood protocol in many metadata management systems. There’s a tutorial available if you want to understand the details, and the Open Archives Initiative has detailed documentation.
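
To make that concrete, here is a small sketch of what a harvester does with a feed. The base URL is a made-up placeholder; the ListRecords verb and the oai_dc metadata prefix are part of the protocol itself, and a real harvester would also follow resumption tokens when a feed spans multiple response pages:

<?php
// Ask an OAI-PMH feed for its records as Dublin Core and print a few fields.
$base = 'https://example.edu/oai'; // placeholder: your feed's base URL
$xml  = simplexml_load_string(
    file_get_contents($base . '?verb=ListRecords&metadataPrefix=oai_dc')
);

foreach ($xml->ListRecords->record as $record) {
    // Dublin Core elements sit in their own namespaces inside <metadata>
    $dc = $record->metadata
                 ->children('http://www.openarchives.org/OAI/2.0/oai_dc/')->dc
                 ->children('http://purl.org/dc/elements/1.1/');
    echo $dc->title, ' | ', $dc->identifier, PHP_EOL;
}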

Content management tools designed specifically for library and archives use, such as LibGuides and Omeka, have a built-in OAI-PMH feed, and generally all you need to do is find the base URL and plug it in. (For instance, here is what a LibGuides OAI feed looks like.) In this post I’ll look at what options are available for Drupal and WordPress to create the feed and become a data provider.

WordPress

This is short, since there aren’t many options. If you use WordPress for your library website, you will have to experiment, as there is nothing well supported. Lincoln University in New Zealand has created a script that converts a WordPress RSS feed into a minimal OAI feed. It requires editing a PHP file to include your RSS feed URL and uploading it to a server. I admit that I have been unsuccessful at testing this, but Lincoln University has a working example and uses it to harvest their WordPress library website into Primo.

Drupal

If you use Drupal, you will first need to install a module called Views OAI-PMH. It creates a Drupal view formatted as an OAI-PMH data provider feed. Those familiar with Drupal know that the Views module can present content in a variety of ways: for instance, you can include certain fields from certain content types in a list or chart, which lets you reuse content rather than recreating it. This is no different, except that the output is an OAI-PMH-compliant XML structure. Rather than placing the view in a Drupal page or block, you create a separate page. That page becomes the base URL you provide to others or reuse in whatever way you need.

The Views OAI-PMH module isn’t the most obvious module to set up, so here are the basic steps to follow. First, enable it and set permissions as usual. You will also want to refresh your caches (I had trouble until I did this). You’ll discover that, unlike with other modules, the documentation and configuration instructions are not in the interface but in the README file, so you will need to open that from the module directory to get the configuration instructions.

To create your OAI-PMH view you have two choices: add it to an existing view, or create a new one. The module creates an example view called Biblio OAI-PMH (based on an earlier Biblio module used for creating bibliographic metadata), and you can simply edit this to create your OAI feed. Alternatively, if you already have a view with all the data you want to include, you can add an OAI-PMH display as an additional display. Either way, you’ll have to create a path for your view so that it is accessible via a URL.

The details screen for the OAI-PMH display.

The Views OAI-PMH module only supports Dublin Core at this time. If you are using Drupal for bibliographic metadata of some kind, mapping the fields is a fairly straightforward process. However, choosing Dublin Core mappings for data that is not bibliographic by nature requires some creativity and thought about where the data will end up. When I was setting this up, I was trying to harvest most of the library website into our discovery layer, so I knew how the discovery layer parsed OAI DC and could choose fields accordingly.

After adding fields to the view (just as you normally would when creating a view), you will need to open the OAI view settings and select the Dublin Core element name for each content field.

You can then map each field to the appropriate Dublin Core element. The example from my site includes some general metadata that appears on all content (such as Title), and some that appears only in specific content types. For instance, Collection Description appears only on digital collection content types. I chose not to include the body content of any page on the site, since most pages contain a lot of scripts or other code that wouldn’t be useful to harvest into the discovery layer. Explanatory content, such as the description of a digital collection or a database, was more useful to display in the discovery layer, and it exists only in dedicated fields on those content types on my Drupal site, so we could pull that out and display it.

In the end, I have a feed that looks like this. Regular pages end up with very basic metadata in the feed:

<metadata>
<oai_dc:dc xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/  http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Hours</dc:title>
<dc:identifier>http://libraries.luc.edu/hours</dc:identifier><dc:creator>Loyola University Libraries</dc:creator></oai_dc:dc>
</metadata>

Databases, by contrast, get more information pulled in. Note that there are two identifiers: one for the database URL, and one for the database description link. We will make both available, but may choose to use only one in the discovery layer and hide the other.

<metadata>
<oai_dc:dc xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/  http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Annual Bibliography of English Language and Literature</dc:title>
<dc:identifier>http://flagship.luc.edu/login?url=http://collections.chadwyck.com/home/home_abell.jsp</dc:identifier>
<dc:subject>Modern Languages</dc:subject>
<dc:type>Index/Database</dc:type>
<dc:identifier>http://libraries.luc.edu/annual-bibliography-english-language-and-literature</dc:identifier>
<dc:creator>Loyola University Libraries</dc:creator>
</oai_dc:dc>
</metadata>

When someone searches the discovery layer for something on the library website, the result shows the page right in the interface. We are still doing usability testing on this, but expect to move it into production soon.

Conclusion

I’ve just touched on two content management systems, but there are many more out there. Do you create OAI-PMH feeds of your data? What do you do with them? Share your examples in the comments.