Creating Presentations with Beautiful.AI

Updated 2018-11-12 at 3:30PM with accessibility information.

Beautiful.AI is a new website that enables users to create dynamic presentations quickly and easily with “smart templates” and other design-optimized features. So far the service is free, with a paid pro tier coming soon. I first heard about Beautiful.AI in an advertisement on NPR and was immediately intrigued. The landscape of presentation software platforms has broadened in recent years to include websites like Prezi, Emaze, and an array of others beyond the tried-and-true PowerPoint. My preferred method of creating presentations for the past couple of years has been to customize the layouts available on Canva and download the completed PDFs for use in PowerPoint. I am also someone who enjoys tinkering with fonts and other design elements until I get a presentation just right, but I know that these steps can be time-consuming and overwhelming for many people. With that in mind, I set out to put Beautiful.AI to the test by creating a short “prepare and share” presentation, for an upcoming meeting, about my first experience at ALA’s Annual Conference this past June.

A title slide created with Beautiful.AI.

Features

To help you get started, Beautiful.AI includes an introductory “Design Tips for Beautiful Slides” presentation. It is fully customizable, so you can play around with all of the features and options as you explore, or you can click on “create new presentation” to start from scratch. You’ll then be prompted to choose a theme, and you can also choose a color palette. Once you start adding slides, you can make use of Beautiful.AI’s template library. This is the foundation of the site’s usefulness because it helps alleviate guesswork about where to put content and that dreaded “staring at the blank slide” feeling. Each individual slide becomes a canvas as you create a presentation, similar to what is likely familiar from PowerPoint. In fact, all of the most popular PowerPoint features are available in Beautiful.AI; they’re just located in very different places. From the navigation at the left of the screen, users can adjust the colors and layout of each slide as well as add images, animation, and presenter notes. Options to add, duplicate, or delete a slide are available on the right of the screen. The organize feature also allows you to zoom out and see all of the slides in the presentation.

Beautiful.AI offers a built-in template to create a word cloud.

One of Beautiful.AI’s best features, and my personal favorite, is its built-in free stock image library. You can choose from pre-selected categories such as Data, Meeting, Nature, or Technology, or search for other images. An import feature is also available, but the stock images are extremely useful if you don’t have your own photos at the ready. Using these images also ensures that no copyright restrictions are violated and helps add a professional polish to your presentation. The options to add an audio track and set slide advance times are also nice for creating presentations as tutorials or introductions to a topic. When you’re ready to present, you can do so directly from the browser or export to PDF or PowerPoint. Options to share with a link or embed with code are also available.

Usability

While intuitive design and overall usability won’t necessarily make or break a presentation software platform, each plays a role in whether someone uses it more than once. For the most part, I found Beautiful.AI to be easy and fun to use. The interface is bold yet simple, and on trend with current website design aesthetics. Still, users who are new to creating presentations online in a non-PowerPoint environment may find the Beautiful.AI interface confusing at first. Most features are consolidated within icons and require you to hover over them to reveal their function. Icons like the camera representing “Add Image” are pretty obvious, but others, such as Layout and Organize, are less intuitive. Some of Beautiful.AI’s terminology may also not be as easily recognizable. For example, the term “variations” was confusing to me at first, especially since it’s only an option for the title slide.

The absence of any drag-and-drop capability for text boxes is the feature I miss most. This is where the automated design adaptability didn’t work as well as I would have expected, given that it’s one of the company’s most prominent marketing claims. On the title slide of my presentation, capitalizing a letter in the title caused the text to move closer to the edge of the slide. In Canva, I could easily pull the text block over to the left a little or adjust the font size down by a few points. I am a stickler for spacing in my presentations, and I would have expected this to be an element that the “Design AI” would pick up on. Each template also has different pre-set design elements, and it can be confusing when you choose one that includes a feature you didn’t expect. Yet text sizes that are pre-set to fit the dimensions of each template do help, not only with readability in the creation phase but also with overall visibility for audiences. Again, this alleviates some of the guesswork that often happens in PowerPoint, where you don’t know exactly how large your text will appear when projected onto a larger screen.

A slide created using a basic template and stock photos available in Beautiful.AI.

One feature that does work really well is the export option. Exporting to PowerPoint creates a perfectly sized facsimile presentation, and being able to easily download a PDF is very useful for creating handouts or archiving a presentation later on. Both are nice to have as a backup for conferences where Internet access may be spotty, and it’s nice that Beautiful.AI understands the need for these options. Unfortunately, Beautiful.AI doesn’t address accessibility on its FAQ page, nor does it offer alternative text or other web accessibility features. Users will need to add their own slide titles and alt text in PowerPoint and Adobe Acrobat after exporting from Beautiful.AI to create an accessible presentation.

Conclusion

Beautiful.AI challenged me to think in new ways about how best to deliver information in a visually engaging way. It’s a useful option for librarians and students who are looking for a presentation website that is fun to use, engaging, and on trend with current web design.

Click here to view the “My first ALA” presentation created with Beautiful.AI.


Jeanette Sewell is the Database and Metadata Management Coordinator at Fondren Library, Rice University.

National Forum on Web Privacy and Web Analytics

We had the fantastic experience of participating in the National Forum on Web Privacy and Web Analytics in Bozeman, Montana last month. This event brought together around forty people from different areas and types of libraries for in-depth discussion and planning about privacy issues in libraries. Our hosts from Montana State University, Scott Young, Jason Clark, Sara Mannheimer, and Jacqueline Frank, framed the event with different (though overlapping) areas of focus. We broke into groups based on our interests from a pre-event survey and worked through a number of activities to identify projects. You can follow along with all the activities and documents produced during the Forum in the document that collates them.

Drawing of ship
Float your boat exercise

While I was initially worried that the activities would feel too forced, they actually worked to release creative ideas. Here’s an example: our groups drew pictures of boats, with sails showing opportunities and anchors showing problems. We started out in two smaller subgroups, each drawing its own boat, then met as the larger group to combine the boat ideas. This made it easy to spot the common themes—each smaller group had written down some of the same themes (like GDPR). Working in metaphor meant we could express some more complex issues, like politics, as the ocean: something that always surrounds the issue and can be helpful or unhelpful without much warning. This helped us think differently about the issues and not get too focused on our own individual perspectives.

The process of turning metaphor into action was hard. We had to take the whole world of problems and opportunities and figure out how these could realistically be accomplished. Good and important ideas had to be left behind because they were so big there was no way to feasibly plan them, certainly not in a day or two. The differing assortment of groups (which were mixable where ideas overlapped) ensured that we were able to question each other’s assumptions and ask some hard questions. For example, one of the issues Margaret’s group had identified as a problem was disagreement in the profession about what the proper limits on privacy were. Individually identifiable usage metrics are a valuable commodity to some, and a thing not to be touched to others. While everyone in the room was probably biased more in favor of privacy than the profession at large is, we could share stories and realities of the types of data we were collecting and what it was being used for. Considering the realities of our environments, one of our ideas, bringing everyone from across the library and archives world together to create a unified set of privacy values, was not going to happen. Despite that, we were able to identify one of the core problems behind that lack of unity: in many cases, a lack of knowledge about what privacy issues exist and how they might affect institutions. When you don’t completely understand something, or only half understand it, you are more likely to be afraid of it.

On the afternoon of the second day and continuing into the morning of the third, we had to get serious and pick just one idea to focus on for a project plan. Again, the facilitators used a few processes that helped us take a big idea and break it down into more manageable components. We used “Big SCAI” thinking to frame the project: what is the status quo, what are the challenges, what actions are required, and what are the ideals. From there we worked through what was necessary for the project, what was nice to have, what we were unlikely to get, and what was completely unnecessary. This helped focus our efforts and made writing a project implementation plan much easier.

Laptop with post-its on wall.
What the workday looked like.

Writing the project implementation plan as a group was made easier by shared documents, but we all commented on the irony of using Google Docs to write privacy plans. On the other hand, trying to figure out how to write in groups and easily share what we wrote using any other platform was a challenge in the moment. This reality illustrates the problems with privacy: the tool that is easiest to use and comes to mind first will be the one that ends up being used. We have to create tools that make privacy easy (which was a discussion many of us at the Forum had), but even more so we need to think about the tradeoffs that we make in choosing a tool and educate ourselves and others about this. In this case, since all the outcomes of the project were going to be public anyway, going on the “quick and easy” side was ok.

The Forum project leaders recently presented their work at the DLF Forum 2018 conference. In this presentation, they outlined the work that they did leading up to the Forum, and the strategies that emerged from the day. They characterized the strategies as Privacy Badging and Certifications, Privacy Leadership Training, Privacy for Tribal Communities and Organizations, Model License for Vendor Contracts, Privacy Research Institute, and a Responsible Assessment Toolkit. You can read through the thought process and implementation strategies for these projects and others yourself at the project plan index. The goal is to ensure that whoever wants to do the work can do it. To quote Scott Young’s follow-up email, “We ask only that you keep in touch with us for the purposes of community facilitation and grant reporting, and to note the provenance of the idea in future proposals—a sort of CC BY designation, to speak in copyright terms.”

For us, this three-day deep dive into privacy was an inspiration and a chance to make new connections (while also catching up with some old friends). But even more, it was a reminder that you don’t need much of anything to create a community. Given the right framing, as long as you have people with differing experiences and perspectives coming together to learn from each other, you’ve facilitated community-building.

The Ex Libris Knowledge Center and Orangewashing

Two days after ProQuest completed their acquisition of Ex Libris in December 2015, Ex Libris announced the launch of their new online Customer Knowledge Center. In the press release for the Knowledge Center, the company describes it as “a single gateway to all Ex Libris knowledge resources,” including training materials, release notes, and product manuals. A defining feature is that there has never been any paywall or log-on requirement, so that all Knowledge Center materials remain freely accessible to any site visitor. Historically, access to documentation for automated library systems has been restricted to subscribing institutions, so the Knowledge Center represents a unique change in approach.

Within the press release, it is also readily apparent how Ex Libris aims to frame the openness of the Knowledge Center as a form of support for open access. As the company states in the second paragraph, “Demonstrating the Company’s belief in the importance of open access, the site is open to all, without requiring any logon procedure.” Former Ex Libris CEO Matti Shem Tov goes a step further in the following paragraph: “We want our resources and documentation to be as accessible and as open as our library management, discovery, and higher-education technology solutions are.”

The problem with how Ex Libris frames their press release is that it elides the difference between mere openness and actual open access. They are a for-profit company, and their currently burgeoning market share is dependent upon a software-as-a-service (SaaS) business model. Therefore, one way to describe their approach in this case is orangewashing. During a recent conversation with me, Margaret Heller came up with the term, based on the color of the PLOS open access symbol. Similar in concept to greenwashing, we can define orangewashing as a misappropriation of open access rhetoric for business purposes.

What perhaps makes orangewashing more difficult to diagnose at first in Ex Libris’s (and, more broadly, ProQuest’s) case is that they attempt to tie support for open access to other product offerings. Even before purchasing Ex Libris, ProQuest had been offering an author-side paid open-access publishing option on its Electronic Thesis and Dissertation platform, though we can question whether this is actually a good option for authors. For its part, Ex Libris has listened to customer feedback about open access discovery. As an example, there are now open access filters for both the Primo and Summon discovery layers.

Ex Libris has also, generally speaking, remained open to customer participation regarding systems development, particularly with initiatives like the Developer Network and Idea Exchange. Perhaps the most credible example is in a June 24, 2015 press release, where the company declares “support of the Open Discovery Initiative (ODI) and conformance with ODI’s recommended practice for pre-indexed ‘web-scale’ discovery services.” A key implication is that “conforming to ODI regulations about ranking of search results, linking to content, inclusion of materials in Primo Central, and discovery of open access content all uphold the principles of content neutrality.”

Given the above information, in the case of the Knowledge Center, it is tempting to give Ex Libris the benefit of the doubt. As an access services librarian, I understand how much of a hassle it can be to find and obtain systems documentation in order to properly do my job. I currently work for an Ex Libris institution, and can affirm that the Knowledge Center is of tangible benefit. Besides providing easier availability for their materials, Ex Libris has done fairly well in keeping information and pathing up to date. Notably, as of last month, customers can also contribute their own documentation to product-specific Community Knowledge sections within the Knowledge Center.

Nevertheless, this does not change the fact that while the Knowledge Center is unique in its format, it represents a low bar to clear for a company of Ex Libris’s size. Their systems documentation should be openly accessible in any case. Moreover, the Knowledge Center represents openness—in the form of company transparency and customer participation—for systems and products that are not open. This is why when we go back to the Knowledge Center press release, we can identify it as orangewashing. Open access is not the point of a profit-driven company offering freely accessible documentation, and any claims to this effect ultimately ring hollow.

So what is the likely point of the Knowledge Center, then? We should consider that Alma has become the predominant library services platform within academic libraries, with Primo and Summon being the only supported discovery layers for it. While OCLC and EBSCO offer or support competing products, Ex Libris already held an advantageous position even before the ProQuest purchase. Therefore, besides serving as a supportive measure for current customers, the Knowledge Center can be viewed as a sales pitch to future ones. This may be a smart business strategy, but again, it has little to do with open access.

Two other recent developments provide further evidence of Ex Libris’s orangewashing. The first is MLA’s announcement that EBSCO will become the exclusive vendor for the MLA International Bibliography. On the PRIMO-L listserv, Ex Libris posted a statement [listserv subscription required] noting that the agreement “goes against the goals of NISO’s Open Discovery Initiative…to promote collaboration and transparency among content and discovery providers.” Nevertheless, despite not being involved in the agreement, Ex Libris shares some blame given the long-standing difficulty over EBSCO not providing content to the Primo Central Index. As a result, what may occur is the “siloing” of an indispensable research database, while Ex Libris customers remain dependent on the company to help determine an eventual route to access.

Secondly, in addition to offering research publications through ProQuest and discovery service through Primo/Summon, Ex Libris now provides end-to-end content management through Esploro. Monetizing more aspects of the research process is certainly far from unusual among academic publishers and service providers. Elsevier arguably provides the most egregious example, and as Lisa Janicke Hinchliffe notes, their pattern of recent acquisitions belies an apparent goal of creating a vertical stack service model for publication services.

In considering what Elsevier is doing, it is unsurprising—from a business standpoint—for Ex Libris and ProQuest to pursue profits in a similar manner. That said, we should bear in mind that libraries are already losing control over open access as a consequence of the general strategy that Elsevier is employing. Esploro will likely benefit from having strong library development partners and “open” customer feedback, but the potential end result could place its customers in a more financially disadvantageous and less autonomous position. This is simply antithetical to open access.

Over the past few years, Ex Libris has done well not just in their product development, but also their customer support. Making the Knowledge Center “open to all” in late 2015 was a very positive step forward. Yet the company’s decision to orangewash through claiming support for open access as part of a product unveiling still warrants critique. Peter Suber reminds us that open access is a “revolutionary kind of access”—one that is “unencumbered by a motive of financial gain.” While Ex Libris can perhaps talk about openness with a little more credibility than their competitors, their bottom line is still what really matters.

Managing ILS Updates

We’ve done a few screencasts in the past here at TechConnect, and I wanted to make a new one to cover a topic that’s come up this summer: managing ILS updates. Integrated library systems are huge, unwieldy pieces of software, and it can be difficult to track what changes with each update: new settings are introduced, behaviors change, and bugs are (hopefully) fixed. The video below shows my approach to managing this process and keeping track of ongoing issues with our Koha ILS.

Blockchain: Merits, Issues, and Suggestions for Compelling Use Cases

Blockchain holds great potential for both innovation and disruption. The adoption of blockchain also poses certain risks, which will need to be addressed and mitigated before blockchain becomes mainstream. A lot of people have heard of blockchain at this point, but many are unfamiliar with how exactly this new technology works and are unsure under what circumstances or conditions it may be useful to libraries.

In this post, I will provide a brief overview of the merits and the issues of blockchain. I will also make some suggestions for compelling use cases of blockchain at the end of this post.

What Blockchain Accomplishes

Blockchain is the technology that underpins a well-known decentralized cryptocurrency, Bitcoin. Simply put, blockchain is a kind of distributed digital ledger on a peer-to-peer (P2P) network, in which records are confirmed and encrypted. Blockchain records and keeps data in its original state in a secure and tamper-proof manner[1] by its technical implementation alone, thereby obviating the need for a third-party authority to guarantee the authenticity of the data. Records in a blockchain are stored in multiple ledgers across a distributed network instead of in one central location. This prevents a single point of failure and secures records by protecting them from potential damage or loss. Blocks in each blockchain ledger are chained to one another by cryptographic hashes, and the mechanism called ‘proof of work’ makes altering past blocks computationally infeasible. (For those familiar with a version control system such as Git, a blockchain ledger can be thought of as something similar to a P2P-hosted git repository that allows sequential commits only.[2]) This makes records in a block immutable and irreversible, that is, tamper-proof.
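
To make the hash-chaining and proof-of-work ideas concrete, here is a toy sketch in Python (an illustration of the concept only, not how Bitcoin or any production blockchain is implemented): every block stores the hash of the previous block, and a nonce has to be found so that the block’s own hash meets a difficulty target, which is what makes rewriting an earlier record prohibitively expensive.

```python
import hashlib
import json
import time

DIFFICULTY = 4  # number of leading zeros required in a block hash

def block_hash(block):
    # Hash the block's contents deterministically.
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def mine_block(prev_hash, records):
    """Find a nonce so the block's hash meets the difficulty target (proof of work)."""
    block = {"timestamp": time.time(), "records": records,
             "prev_hash": prev_hash, "nonce": 0}
    while not block_hash(block).startswith("0" * DIFFICULTY):
        block["nonce"] += 1
    return block

# Build a tiny chain: each block commits to the one before it.
chain = [mine_block("0" * 64, ["genesis"])]
chain.append(mine_block(block_hash(chain[-1]), ["Alice pays Bob 5"]))
chain.append(mine_block(block_hash(chain[-1]), ["Bob pays Carol 2"]))

# Tampering with an earlier block breaks every later link.
chain[1]["records"] = ["Alice pays Bob 500"]
print(block_hash(chain[1]) == chain[2]["prev_hash"])  # False: the chain no longer validates
```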

In areas where the authenticity and security of records are of paramount importance, such as electronic health records, digital identity authentication/authorization, digital rights management, historic records that may be contested or challenged due to the vested interests of certain groups, and digital provenance to name a few, blockchain can lead to efficiency, convenience, and cost savings.

For example, with blockchain implemented in banking, one will be able to transfer funds across different countries without going through banks.[3] This can drastically lower the fees involved, and the transaction will take effect much more quickly, if not immediately. Similarly, adopted in real estate transactions, blockchain can make the process of buying and selling a property more straightforward and efficient, saving time and money.[4]

Disruptive Potential of Blockchain

The disruptive potential of blockchain lies in its aforementioned ability to render obsolete the role of a third-party authority that records and validates transactions and guarantees their authenticity should a dispute arise. In this respect, blockchain can serve as an alternative trust protocol that decentralizes traditional authorities. Because blockchain achieves this through public-key cryptography, however, losing the private key to the blockchain ledger holding one’s financial or real estate asset, for example, results in the permanent loss of that asset. With the third-party authority gone, there will be no institution to step in and remedy the situation.

Issues

The loss of a private key is only one of the issues with blockchain. Others include (a) interoperability between different blockchain systems, (b) scalability of blockchain at a global scale with large amounts of data, (c) potential security issues such as the 51% attack [5], and (d) the huge energy consumption [6] that a blockchain requires to add a block to a ledger. Note that the last issue, energy consumption, has both environmental and economic ramifications because it can cancel out the cost savings gained from eliminating a third-party authority and related processes and fees.

Challenges for Wider Adoption

There is growing interest in blockchain among information professionals, but there are also some obstacles to that interest gaining momentum and moving towards wider trial and adoption. One obstacle is the lack of general understanding of blockchain among the larger audience of information professionals. Due to its original association with Bitcoin, many mistake blockchain for cryptocurrency. Another obstacle is technical. Using blockchain requires setting up and running a node in a blockchain network, such as Ethereum[7], which may be daunting to those who are not tech-savvy. This creates a high barrier to entry for those who are not familiar with command-line scripting but still want to try out and test how a blockchain functions.

The last and most important obstacle is the lack of compelling use cases for libraries, archives, and museums. To many, blockchain is an interesting new technology, but even many blockchain enthusiasts are skeptical of its practical benefits at this point, when all associated costs are considered. Of course, this is not an insurmountable obstacle. The more familiar people become with blockchain, the more uses they will discover for it in the information profession that are uniquely beneficial for specific purposes.

Suggestions for Compelling Use Cases of Blockchain

In order to determine what may make a compelling use case of blockchain, the information profession would benefit from considering the following.

(a) What kind of data/records (or the series thereof) must be stored and preserved exactly the way they were created.

(b) What kind of information is at great risk of being altered or compromised by changing circumstances.

(c) What type of interactions may need to take place between such data/records and their users.[8]

(d) How much would be a reasonable cost for implementation.

These will help connect the potential benefits of blockchain with real-world use cases and take the information profession one step closer to wider testing and adoption. To those further interested in blockchain and libraries, I recommend the recordings from the Library 2.018 online mini-conference, “Blockchain Applied: Impact on the Information Profession,” held back in June. The Blockchain National Forum, which is funded by IMLS and is to take place in San Jose, CA on August 6th, will also be livestreamed.

Notes

[1] For an excellent introduction to blockchain, see “The Great Chain of Being Sure about Things,” The Economist, October 31, 2015, https://www.economist.com/news/briefing/21677228-technology-behind-bitcoin-lets-people-who-do-not-know-or-trust-each-other-build-dependable.

[2] Justin Ramos, “Blockchain: Under the Hood,” ThoughtWorks (blog), August 12, 2016, https://www.thoughtworks.com/insights/blog/blockchain-under-hood.

[3] The World Food Programme, the food-assistance branch of the United Nations, is using blockchain to increase their humanitarian aid to refugees. Blockchain may possibly be used for not only financial transactions but also the identity verification for refugees. Russ Juskalian, “Inside the Jordan Refugee Camp That Runs on Blockchain,” MIT Technology Review, April 12, 2018, https://www.technologyreview.com/s/610806/inside-the-jordan-refugee-camp-that-runs-on-blockchain/.

[4] Joanne Cleaver, “Could Blockchain Technology Transform Homebuying in Cook County — and Beyond?,” Chicago Tribune, July 9, 2018, http://www.chicagotribune.com/classified/realestate/ct-re-0715-blockchain-homebuying-20180628-story.html.

[5] “51% Attack,” Investopedia, September 7, 2016, https://www.investopedia.com/terms/1/51-attack.asp.

[6] Sherman Lee, “Bitcoin’s Energy Consumption Can Power An Entire Country — But EOS Is Trying To Fix That,” Forbes, April 19, 2018, https://www.forbes.com/sites/shermanlee/2018/04/19/bitcoins-energy-consumption-can-power-an-entire-country-but-eos-is-trying-to-fix-that/#49ff3aa41bc8.

[7] Osita Chibuike, “How to Setup an Ethereum Node,” The Practical Dev, May 23, 2018, https://dev.to/legobox/how-to-setup-an-ethereum-node-41a7.

[8] The interaction can also be a self-executing program when certain conditions are met in a blockchain ledger. This is called a “smart contract.” See Mike Orcutt, “States That Are Passing Laws to Govern ‘Smart Contracts’ Have No Idea What They’re Doing,” MIT Technology Review, March 29, 2018, https://www.technologyreview.com/s/610718/states-that-are-passing-laws-to-govern-smart-contracts-have-no-idea-what-theyre-doing/.

Introducing Our New Best Friend, GDPR

You’ve seen the letters GDPR in every single email you’ve gotten from a vendor or a mailing list lately, but you might not be exactly sure what it is. With GDPR enforcement starting on May 25, it’s time for a crash course in what GDPR is, and why it could be your new best friend whether you are in the EU or not.

First, you can check out the EU GDPR information site (though it will probably be under heavy load for a few days!) for lots of information on this. It’s important to recognize, however, that for universities like mine with a campus located in the EU, the regulation has created additional oversight to ensure that our own data collection practices are GDPR compliant, or that we restrict people residing in the EU from accessing non-compliant services. You should definitely work with legal counsel on your own campus in making any decisions about GDPR compliance.

So what does the GDPR actually mean in practice? The requirements break down this way: any company that holds the data of any EU citizen must provide data controls, no matter where the company or the data is located. This means that every large web platform and pretty much every library vendor must comply or face heavy fines. For personally identifiable information, which includes things like IP addresses, the GDPR offers the following protections: privacy terms and conditions must be written in easy-to-understand language; data breaches require quick notification; individuals have the right to know what data is being collected and to receive a copy of it; the “right to be forgotten,” or data erasure (unless it’s in the public interest for the data to be retained); the ability to transfer data between providers; systems must be private by design and collect only necessary data; and companies must appoint data privacy officers without conflicts of interest. How this all works in practice is not consistent, and a lot will be worked out in the courts in the coming years. Note that Google recently lost several right-to-be-forgotten cases and was required to remove information that it had originally argued was in the public interest to retain.

The GDPR has actually been around for a few years, but May 25, 2018 was set as the enforcement date, so many people have been scrambling to meet that deadline. If you’re reading this today, there’s probably not a lot of time to do anything about your own practices, but if you haven’t yet reviewed what your vendors are doing, this would be a good time. Note too that there are no rights guaranteed for any Americans, and several companies, including Facebook, have moved data governance out of their Irish office to California to be out of reach of suits brought in Irish courts.

Where possible, however, we should be using all the features at our disposal. As librarians, we already tend toward the “privacy by design” philosophy, even though we aren’t always perfect at it. As I wrote in my last post, my library worked on auditing our practices and creating a new privacy policy, and one of the last issues was figuring out how to approach some of the third-party services that we need in order to serve our patrons but that did not allow deleting data. Now some of those features are being made available. For example, Google Analytics now has a data retention feature, which allows you to set data to expire and be deleted after a certain amount of time. Google also provides more detailed instructions to ensure that you are not accidentally collecting personally identifiable information in your analytics data.

Lots of our library vendors provide personal account features, and those too are subject to these new GDPR features. This means that there are new levels of transparency about what kinds of tracking they are doing, and greater ability for patrons to control data, and for you to control data on the behalf of patrons. Here are a few example vendor GDPR compliance statements or FAQs:

Note that some vendors, like EBSCO, are moving to HTTPS on sites that weren’t using it before, and this may require changes to proxy servers or other links.

I am excited about GDPR because no matter where we are located, it gives us new tools to defend the privacy of our patrons. Even better than that, it is providing lots of opportunities on our campuses to talk about privacy with all stakeholders. At my institution, the library has been able to showcase our privacy expertise and have some good conversations about data governance and future goals for privacy. It doesn’t mean that all our problems will be solved, but we are moving in a more positive direction.

Names are Hard

A while ago I stumbled onto the post “Falsehoods Programmers Believe About Names” and was stunned. Personal names are one of the most deceptively difficult forms of data to work with, and this article touched on so many common but unaddressed problems. Assumptions like “people have exactly one canonical name” and “My system will never have to deal with names from China/Japan/Korea” were apparent everywhere. I consider myself a fairly critical and studious person; I devote time to thinking about the consequences of design decisions and carefully attempt to avoid poor assumptions. But I’ve repeatedly run into trouble when handling personal names as data. There is a cognitive dissonance surrounding names: we treat them as rigid identifiers when they’re anything but. We acknowledge their importance but struggle to treat them with the seriousness they deserve.

Names change. They change due to marriage, divorce, child custody, adoption, gender identity, religious devotion, performance art, witness protection, or none of these at all. Sometimes people just want a new name. And none of these reasons for change are more or less valid than others, though our legal system doesn’t always treat them equally. We have students who change their legal name, which is often something systems expect, but then they have the audacity to want to change their username, too! And that works less often because all sorts of system integrations expect usernames to be persistent.

Names do not have a universal structure. There is no set quantity of components in a name nor an established order to those components. At my college, we have students without surnames. In almost all our systems, surname is a required field, so we put a period “.” there to satisfy that requirement. Then, on displays in our digital repository where surnames are assumed, we end up with bolded section headers like “., Johnathan”, which look awkward.

Many Western names might follow a [Given name] – [Middle name] – [Surname] structure, and an unfortunate number of the systems I have to deal with assume all names share this structure. It’s easy to see how this yields problematic results. For instance, if you want to see a sorted list of users, you probably want to sort by family name, but many systems sort by the name in the last position, causing, for instance, Chinese names [1] to be handled differently from Western ones. [2] But it’s not only that someone might not have a middle name, or might have two middle names, or might have a family name in the first position—no, even that would be too simple! Some name components defy simple classification. I once met a person named “Bus Stop”. “Stop” is clearly not a family affiliation, despite coming in the final position of the name. Sometimes the second component of a tripartite Western name isn’t a middle name at all, but a maiden name or the second word of a two-word first name (e.g. “Mary Anne” or “Lady Bird”)! One cannot determine the role of each of a name’s pieces just by looking at a familiar structure!
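
As a small illustration of the sorting problem just described (the user records below are invented), sorting by whichever token happens to come last only works when the family name actually sits in the last position:

```python
# Invented user records; in Chinese names the family name comes first.
users = [
    {"display_name": "John Quincy Adams", "family_name": "Adams"},
    {"display_name": "Zhang An", "family_name": "Zhang"},
    {"display_name": "Ursula Brown", "family_name": "Brown"},
]

# Naive approach: assume the last token of the display name is the surname.
naive = sorted(users, key=lambda u: u["display_name"].split()[-1])
print([u["display_name"] for u in naive])
# ['John Quincy Adams', 'Zhang An', 'Ursula Brown']  -- Zhang An is filed under "An"

# Better: sort on an explicit family-name field supplied by the person themselves.
by_family = sorted(users, key=lambda u: u["family_name"])
print([u["display_name"] for u in by_family])
# ['John Quincy Adams', 'Ursula Brown', 'Zhang An']
```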

Names are also contextual. One’s name with family, with legal institutions, and with classmates can all differ. Many of our international students have alternative Westernized first names. Their family may call them Qiáng but they introduce themselves as Brian in class. We ask for a “preferred name” in a lot of systems, which is a nice step forward, but don’t ask when it’s preferred. Names might be meant for different situations. We have no system remotely ready for this, despite the personalization that’s been seeping into web platforms for decades.

So if names are such trouble, why not just do our best and move on? Aren’t these fringe cases that don’t affect the vast majority of our users? These issues simply cannot be ignored because names are vital. What one is called, even if it’s not a stable identifier, has great effects on one’s life. It’s dispiriting to witness one’s name misspelled, mispronounced, treated as an inconvenience, botched at every turn. A system that won’t adapt to suit a name delegitimizes the name. It says, “oh, that’s not your real name,” as if names had differing degrees of reality. But a person may have multiple names—or many overlapping names over time—and while one may be more institutionally recognized at a given time, none are less real than the others. Even if only a single student a year is affected, affirming their name(s) is the absolute least amount of respect we can show.

So what do we do? Endless enumeration of the difficulties of working with names does little but paralyze us. Honestly, when I consider the best implementation of personal names, the MODS metadata schema comes to mind. Having a <name> element with any number of <namePart> children is the best model available. The <namePart>s can be ordered in particular ways, a “@type” attribute can define a part’s function [3], a record can include multiple names referencing the same person, multiple names with distinct parts can be linked to the same authority record, etc. MODS has a flexible and comprehensive treatment of name data. Unfortunately, returning to “Falsehoods Programmers Believe”, none of the library systems I administer do anywhere near as good a job as this metadata schema. Nor is it necessarily a problem of Western bias—even the Chinese government can’t develop computer systems to accurately represent the names of people in the country, or even agree on what the legal character set should be! [4] It seems that programmers start their apps by creating a “users” database table with columns for unique identifier, username, and “firstname”/”lastname” [sic], and work from there. On the bright side, at least the name isn’t used as the identifier! We all learned that in databases class, but we didn’t learn to make “names” a separate table linked to “users” in our relational databases.
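
As a rough sketch of what that missing lesson might look like (the table and column names here are invented, not taken from any particular system), names can live in their own table, any number per user, each made of ordered, typed parts much like MODS <namePart> children:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,
    username TEXT UNIQUE NOT NULL
);

-- A user can have any number of names (legal, preferred, former, ...).
CREATE TABLE names (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    context TEXT                 -- e.g. 'legal', 'preferred', 'former'
);

-- Each name is an ordered list of typed parts, like MODS <namePart> elements.
CREATE TABLE name_parts (
    name_id   INTEGER NOT NULL REFERENCES names(id),
    position  INTEGER NOT NULL,  -- display order within the name
    part_type TEXT,              -- 'given', 'family', 'matronymic', ... or NULL
    value     TEXT NOT NULL
);
""")

conn.execute("INSERT INTO users (id, username) VALUES (1, 'zan')")
conn.execute("INSERT INTO names (id, user_id, context) VALUES (1, 1, 'preferred')")
conn.executemany(
    "INSERT INTO name_parts (name_id, position, part_type, value) VALUES (?, ?, ?, ?)",
    [(1, 1, "family", "Zhang"), (1, 2, "given", "An")],
)

# Reassemble the display form in its stored order, with no assumptions about structure.
parts = conn.execute(
    "SELECT value FROM name_parts WHERE name_id = 1 ORDER BY position"
).fetchall()
print(" ".join(p[0] for p in parts))  # Zhang An
```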

In my day-to-day work, the best I’ve done is to be sensitive to the importance of name changes specifically and to how our systems handle them. After a few meetings with a cross-departmental team, we developed a name change process at our college. System administrators from across the institution are on a shared listserv where name changes are announced. In the libraries, I spoke with our frontline service staff about assisting with name changes. Our people at the circulation desk know to notice name discrepancies—sometimes a name badge has been updated but not our catalog records, and we can offer to make them match—but also to guide students who may need to contact the registrar or other departments on campus to initiate the top-down name change process. While most of the library’s systems don’t easily accommodate username changes, I can write administrative scripts for our institutional repository that alter the ownership of a set of items from an old username to a new one. I think it’s important to remember that we’re inconveniencing the user with the work of implementing their name change, not the other way around. So taking whatever extra steps we can on our own, without pushing labor onto our students and staff, is the best way to mitigate how poorly our tools support the protean nature of personal names.
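
Such a script can be quite small. The sketch below is purely illustrative: the endpoint, parameter, and field names are hypothetical stand-ins rather than the API of any particular repository platform, so a real version would follow whatever API or database your repository actually exposes.

```python
import requests

# Hypothetical values -- substitute your repository's real API endpoint, token, and usernames.
API = "https://repository.example.edu/api"
TOKEN = "replace-with-an-admin-token"
OLD_USER, NEW_USER = "old_username", "new_username"

session = requests.Session()
session.headers.update({"Authorization": f"Bearer {TOKEN}"})

# Find every item currently owned by the old username (hypothetical endpoint and parameter).
items = session.get(f"{API}/items", params={"owner": OLD_USER}).json()

for item in items:
    # Reassign ownership one item at a time so a failure is easy to spot and resume.
    resp = session.patch(f"{API}/items/{item['id']}", json={"owner": NEW_USER})
    resp.raise_for_status()
    print(f"Reassigned item {item['id']} to {NEW_USER}")
```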

Notes

  1. Chinese names typically have the surname first, followed by the given name.
  2. Another poor implementation can be seen in The Chicago Manual of Style‘s indexing instructions, which have an extensive list of exceptions to the Western norm and how to handle them. But CMoS provides no guidance on how one would go about identifying a name’s cultural background or, for instance, identifying a compound surname.
  3. Although the MODS user guidelines sadly limit the use of the type attribute to a fixed list of values which includes “family” and “given”, rendering it subject to most of the critiques in this post. Substantially expanding this list with “maiden”, “patronymic/matronymic” (names based on a parental given name, e.g. Mikhailovich), and more, as well as some sort of open-ended “other” option, would be a great improvement.
  4. https://www.nytimes.com/2009/04/21/world/asia/21china.html

Our Assumptions: of Neutrality, of People, & of Systems

Discussions of neutrality have been coming up a lot in libraryland recently. I would argue that people have been talking about this for years [1][2][3][4], but this year we saw a confluence of events drive the “neutrality of libraries” topic to the fore. To be clear, I have a position on this topic [5], and it is that libraries cannot be neutral players and still claim to be a part of the society they serve. But this post is about what we assume to be neutral, what we bring forward with those assumptions, and how we react when those assumptions are challenged. When we challenge ideas that have been built into systems, either as “benevolent, neutral” librarians or “pure logic, neutral” algorithms, what part of ourselves are we challenging? How do reactions change based on who is doing the challenging? Be forewarned, this is a convoluted landscape.

At the 2018 ALA Midwinter conference, the ALA President’s program was a debate about neutrality. I will not summarize that event (see here), but I do want to call attention to something that became very clear in the course of the program: everyone was using a different definition of neutrality. People spoke with assumptions of what neutrality means and why they do, or do not, believe that it is important for libraries to maintain. But what are we assuming when we make these assumptions? Without an agreed upon definition, some referred to legal rulings to define neutrality, some used a dictionary definition (“not aligned with a political or ideological grouping” – Merriam Webster) without probing how political or ideological perspectives play out in real life. But why do we assume libraries should be neutral? What safety or security does that assumption carry? What else are we assuming should be neutral? Software? Analytics? What value judgements are we bringing forward with those assumptions?

An assumption of neutrality often comes with a transference of trust. A speaker at ALA even said that the three professions thought of as the most trustworthy (via a national poll) are firefighting, nursing, and librarianship, and so, by his logic, we must be neutral. Perhaps some do not conflate trust and neutrality, but when we do assume neutrality equates with trust in these situations, we remove the human aspect from the equation. Nurses and librarians, as people, are not neutral. People hold biases and a variety of lived experiences that shape perspectives and approaches. If you engage this line of thought and interrogate your assumptions and beliefs, it can become apparent that it takes effort to recognize and mitigate our human biases throughout the various interactions in our lives.

What of our technology? Systems and software are often put forward as logic-driven, neutral devices, considered apart from their human creators. The position of some people is that machines lack emotions and are, therefore, immune to our human biases and prejudices. This position is inaccurate and dangerous and requires our attention. Algorithms and analytics are not neutral. They are designed by people, who carry forward their own notions of what is true and what is neutral. These ideas are built into the structure of the systems and have the potential to influence our perception of reality. As we rely on “data-driven decision-making” across all aspects of our society — education, healthcare, entertainment, policy — we transfer trust and power to that data. All too often, we do that without scrutinizing the sources of the data, or the algorithms acting upon them. Moreover, as we push further into machine learning systems – systems that are trained on data to look for patterns and optimize processes – we open the door for those systems to amplify biases. To “learn” our systemic prejudices and inequities.

People far more expert in this domain than I am have raised these questions and researched the effects that biased systems can have on our society [6][7][8]. I often bring these issues up when I want to emphasize how problematic it is to let the assumption of data-driven outcomes as “truth” persist and how critical it is to apply information literacy practices to data. But as I thought about this issue and read more from these experts, I have been struck by the variety of responses that these experts elicit. How do reactions change based on who is doing the challenging?

Angela Galvan questioned assumptions related to hiring, performance, and belonging in librarianship, based on the foundation of the profession’s “whiteness,” and was met with hostile comments on the post [9]. Nicole A. Cooke wrote about implicit assumptions when we write about tolerance and diversity and has been met with hostile comments [10], while her micro-aggression research has been highlighted by Campus Reform [11], which led to a series of hostile communications directed at her. Chris Bourg’s keynote about diversity and technology at Code4Lib was met with hostility [12]. Safiya Noble wrote a book about bias in algorithms and technology, which resulted in one of the more spectacular Twitter disasters [13][14], wherein someone found it acceptable to dismiss her research without even reading the book.

Assumptions of neutrality, whether related to library services, spaces, collections, or the people doing the work, allow oppressive systems to persist and contribute to a climate where the perspectives and expertise of marginalized people in particular can be dismissed. Insisting that we promulgate the library and technology – and the people working in and with them – as neutral actors erases the realities that these women (and countless others) have experienced. Moreover, it allows those operating with harmful and discriminatory assumptions to believe that they *are* neutral, by virtue of working in those spaces, and that their truth is an objective truth. It limits the desire for dialog, discourse, and growth – because who is really motivated to listen when you think you are operating from a place of “Truth”…when you feel that the strength of your assumptions can invalidate a person’s life?

Introducing Omeka S

My library has used Omeka as part of our suite of platforms for creating digital collections and exhibits for many years now. It’s easy to administer and use, and many of our students, particularly in history or digital humanities, learn how to create exhibits with it in class or have experience with it from other institutions, which makes it a good solution for student projects. This creates challenges, however, since it’s been difficult to have multiple sites or distributed administration. A common scenario is that we have a volunteer student, often in history, working on a digital exhibit as part of a practicum, and we want the donor to review the exhibit before it goes live. We had to create administrative accounts for both the student and the donor, which required a lot of explanations about how to get in to just the one part of the system they were supposed to be in (it’s possible to create a special account to view collections that aren’t public, but not exhibits). Even though the admin accounts can’t do everything (there’s a super admin level for that), it’s a bit alarming to hand out administrative accounts to people I barely know.

This problem goes away with Omeka S, which is the new and completely rebuilt Omeka. It supports having multiple sites (which is the new name for exhibits) and distributed administration by site. Along with this, there are sophisticated metadata templates that you can assign to sites or users, which takes away the need for lots of documentation on what metadata to use for which item type. When I showed a member of my library’s technical services department the metadata templates in Omeka S, she gasped with excitement. This should indicate that, at least for those of us working on the back end, this is a fun system to use.

Trying it Out For Yourself

I have included some screenshots below, but you might want to use the Omeka S Sandbox to follow along. You can experiment with anything, and the data is reset every Monday, Wednesday, Friday, and Sunday. The sandbox includes a variety of sample exhibits; one of them, “A Battered Tin Dispatch Box,” is the source of several screenshots below.

A Quick Tour Through Omeka S

This is what the Omeka Classic administrative dashboard looks like for a super administrator.

Omeka Classic administrative interface

And this is the dashboard for Omeka S. It’s not all that different functionally, but it is definitely a different aesthetic experience.

Omeka S administrative interface

Most things in Omeka S work analogously to classic Omeka, but some things have been renamed or moved around. The documentation walks through everything in order, so it’s a great place to start learning. Overall, my feeling about Omeka S is that it’s much easier to tap into the powerful features, with less of a learning curve. I first learned Omeka S at the DLF Forum conference in fall 2017 directly from Patrick Murray-John, the Omeka Development Team Manager, and some of what is below is from his description.

Sites

Sites inset

Omeka S has the very useful concept of Sites, which again function like exhibits in classic Omeka. Each site has its own set of administrative functions and user permissions, which allow for viewer, editor, or admin roles by site. I really appreciate this, since it allowed me to give student volunteers access to just the site they needed, and when we need to give other people access to view a site before it’s published, we can do that. It’s also easier to add outside or supplementary materials to the exhibit navigation. On individual pages there are a variety of blocks available, and the layout is easier to set up for people without a lot of HTML skills.

Resource Templates

These existed in Omeka Classic, but were less straightforward. Now you can set a resource template with properties from multiple vocabularies and build the documentation right into the template. The data type can be text or URI, or draw from vocabularies with autosuggest. For example, you can set the Rights field to draw from Rights Statement options.

Items

Items work in a similar fashion to Omeka Classic. Items exist at the installation level, so they can be reused across multiple sites. What’s great is that the nature of an item can be much more flexible. Items can include URIs, maps, and multiple types of media such as a URL, HTML, IIIF images, oEmbed, or YouTube videos. This reflects the way we were actually using Omeka Classic, but without the technical overhead to make it all work. This will make it easier for more people to create much more interactive and web-integrated exhibits.

Item Sets

Item Sets are the new name for Collections and, like Items, they can have metadata from multiple vocabularies. Unlike Collections, however, items can belong to multiple Item Sets, which can be associated with sites to limit what people see. The tools for batch adding and editing are similar, but more powerful, because you can actually remove or edit metadata in bulk.

Themes

Themes in Omeka S have changed quite a bit, and as Murray-John explained, theming is more complicated than in the past. Rather than calls to local functions, Omeka S uses patterns from Zend Framework 3, so the process of theming requires more careful thought and planning. That said, the base themes provided are a great starting point, and thanks to the multiple layout options for sites, it’s less critical to be able to create custom themes for particular exhibits. I wrote about how to create themes in Omeka in 2013, and while some of that still holds true, you would want to consult the updated documentation to see how to do this in Omeka S.

Mapping

One of my favorite things in Omeka S is the Mapping module, which allows you to add geolocation metadata to items, and create a map on site pages. Here’s an example from the Omeka S Sandbox with locations related to Scotland Yard mapped for an item in the Battered Tin Dispatch Box exhibit.

Map interface for items

This can then turn into an interactive map on the front end.

Map interface for exhibits

For the vast majority of mapping projects that our students want to do, this works in a very straightforward manner. Neatline, a plugin for Omeka Classic, allows much more sophisticated mapping and timelines; while it should eventually be ported over to Omeka S, it is not currently listed as a module. In my experience, however, Neatline is more powerful than what many people are trying to do, and that added complexity can be a challenge. So I think the Mapping module looks like a great compromise.

Possible Approaches to Migration

Migration between Omeka Classic and Omeka S works well for items. For that, there’s the Omeka2 Importer module. Because exhibits work differently, they would have to be recreated. Omeka.net, the hosted version of Omeka, will stay on Omeka Classic for the foreseeable future, so there’s no concern that it will stop being supported any time soon, according to Patrick Murray-John.

Conclusion

We are still working on setting up Omeka S. My personal approach is that as new ideas for exhibits come up, we will start them in Omeka S first. As we have time and interest, we may also migrate older exhibits if they need continual management, though because some of them rely on Omeka Classic plugins, we are planning mostly to create new exhibits that don’t depend on those plugins. I am excited to pair this with our other digital collection platforms to build exhibits that use content across our platforms and extend into the wider web.

 

Reflections on Code4Lib 2018

A few members of Tech Connect attended the recent Code4Lib 2018 conference in Washington, DC. If you missed it, the full livestream of the conference is on the Code4Lib YouTube channel. We wanted to highlight some of our favorite talks and tie them into the work we’re doing.

Also, it’s worth pointing to the Code4Lib community’s Statement in Support of opening keynote speaker Chris Bourg. Chris offered some hard truths in her speech that angry men on the internet, predictably, were unhappy about, and it’s a great model that the conference organizers and attendees promptly stood in support of her.


Ashley:

One of my favorite talks at Code4lib this year was Amy Wickner’s “Web Archiving and You / Web Archiving and Us.” (Video, slides) I felt this talk really captured some of the essence of what I love most about Code4lib, this being my 4th conference in the past 5 years. (And I believe this was Amy’s first!) The talk covered a technical topic relevant to collecting libraries and handled it in a way that acknowledges and prioritizes the essential personal component of any technical endeavor. This is what I found so wonderful about Amy’s talk, and this is what I find so refreshing about Code4lib as an inherently technical conference with intentionality behind its human aspects.

Web archiving is of broad interest but can feel overwhelming to begin to tackle. I mean, the internet is just so big. Amy brought forth a proposal for ways in which a person or institution can begin thinking about how to start a web archiving project, focusing first on the significance of appraisal. Wickner, citing Terry Cook, spoke of the “care and feeding of archives” and of thinking about appraisal as storytelling. I think this is a great way to make a big internet seem smaller: it underscores the importance of care in appraisal while acknowledging that for web archiving, appraisal is an essential practice. Representation in web archives is more actively chosen during the appraisal of web materials than it has been, historically, for other formats.

This statement resonated with me: “Much of the power that archivists wield are in how we describe or create metadata that tells a story of a collection and its subjects.”

And also: For web archives, “the narrative of how they are built is closely tied to the stories they tell and how they represent the world.”

Wickner went on to discuss how web archives are and will be used, and who will use them, giving some examples but emphasizing that there are many more. She noted that we must learn to “critically read as much as learn to critically build” web archives, while acknowledging that web archives exist both within and outside of institutions. For personal archiving, it can be as simple as replacing links in documents with perma.cc, Wayback Machine, or WebRecorder links.
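As a small, entirely hypothetical illustration of that last point, here is a Python sketch that checks each link in a document against the Internet Archive’s public availability endpoint and reports the closest Wayback Machine snapshot, if one exists. The example links are placeholders; perma.cc and WebRecorder would require their own tools or accounts.

```python
# A small sketch of the "replace links with archival links" idea: for each URL,
# ask the Internet Archive's public availability endpoint whether a Wayback
# Machine snapshot exists and, if so, report that snapshot's URL.
import requests

def closest_snapshot(url):
    """Return the URL of the closest Wayback Machine snapshot, or None."""
    resp = requests.get("https://archive.org/wayback/available", params={"url": url})
    resp.raise_for_status()
    snapshot = resp.json().get("archived_snapshots", {}).get("closest")
    if snapshot and snapshot.get("available"):
        return snapshot["url"]
    return None

links = [
    "https://example.org/some-page",     # hypothetical links pulled from a document
    "https://example.org/another-page",
]
for link in links:
    archived = closest_snapshot(link)
    print(link, "->", archived or "no snapshot found (consider perma.cc or WebRecorder)")
```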

Another topic I enjoyed in this talk was the celebration of precarious web content through community storytelling on Twitter with the hashtags #VinesWithoutVines and #GifHistory, two brief but joyous moments.


Bohyun:

The part of this year’s Code4Lib conference that I found most interesting was the talks and the breakout session discussion related to machine learning and deep learning. Machine learning is a subfield of artificial intelligence, and deep learning is a kind of machine learning that utilizes hidden layers between the input layer and the output layer in order to refine and produce the algorithm that best represents the result in the output. Once such an algorithm is produced from the data in the training set, it can be applied to a new set of data to predict results. Deep learning has been making waves in many fields, such as Go playing, autonomous driving, and radiology, to name a few. There were a few different talks on this topic, ranging from reference chat sentiment analysis to feature detection (such as railroads) in map data using a convolutional neural network model.
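To make the idea of hidden layers a little more concrete, here is a minimal Keras sketch, not drawn from any of the talks, in which a network with two hidden layers is fit to toy training data and then applied to new data to predict results.

```python
# A minimal sketch of the idea described above: a network with hidden layers
# between the input and output layers is fit to a training set, and the
# learned model is then applied to new data. The data here is toy data.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Toy training set: 1000 examples with 10 features each, binary labels.
X_train = np.random.rand(1000, 10)
y_train = (X_train.sum(axis=1) > 5).astype(int)

model = keras.Sequential([
    keras.Input(shape=(10,)),               # input layer
    layers.Dense(32, activation="relu"),    # hidden layer 1
    layers.Dense(16, activation="relu"),    # hidden layer 2
    layers.Dense(1, activation="sigmoid"),  # output layer
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=5, verbose=0)

# Apply the learned model to a new set of data to predict results.
X_new = np.random.rand(5, 10)
print(model.predict(X_new))
```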

“Deep Learning for Libraries,” presented by Lauren Di Monte and Nilesh Patil from the University of Rochester, was the most practical of those talks, as it started with a specific problem to solve and resulted in action that will address the problem. In their talk, Di Monte and Patil showed how they applied deep learning techniques to a problem in their library’s space assessment. The library could not tell how many people visit to use its space and services and how many are simply passing through to get to another building or to the campus bus stop adjacent to the library. This made it difficult to decide on the appropriate staffing level or the hours that best serve users’ needs. It also prevented the library from demonstrating its reach and impact with data and advocating for needed resources or budget to decision-makers on campus. The goal of their project was to develop automated and scalable methods for conducting space assessment and reporting tools that support decision-making for operations, service design, and service delivery.

For this project, they chose an area bounded by four smart access-control gates on the first floor. They obtained the log files (with minute-by-minute data at the sensor level) from the eight bi-directional sensors on those gates and analyzed the data in order to create a recurrent neural network model. They trained the algorithm with this model so that they could predict future incoming and outgoing traffic in that area and present those findings visually in a data dashboard application. For data preparation, processing, and modeling, they used Python; the tools included Seaborn, Matplotlib, Pandas, NumPy, SciPy, TensorFlow, and Keras. They picked a recurrent neural network with stochastic gradient descent optimization, which is less complex than a time series model. For data visualization, they used Tableau. The project code is available at the library’s GitHub repo: https://github.com/URRCL/predicting_visitors.
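Their actual code lives in that GitHub repo; purely as a rough illustration of the general shape of such a pipeline, the sketch below windows a minute-by-minute count series with Pandas and NumPy and fits a small recurrent network with stochastic gradient descent in Keras. The file name, column names, and window size are all hypothetical.

```python
# A rough sketch of the kind of pipeline described above, NOT the Rochester
# team's actual code (see their GitHub repo for that). File name, column
# names, and window size are hypothetical.
import numpy as np
import pandas as pd
from tensorflow import keras
from tensorflow.keras import layers

# Minute-by-minute gate counts; "timestamp" and "count" are assumed column names.
df = pd.read_csv("gate_log.csv", parse_dates=["timestamp"])
counts = df["count"].to_numpy(dtype="float32")

# Turn the series into overlapping windows: 60 minutes in, the next minute out.
WINDOW = 60
X = np.stack([counts[i:i + WINDOW] for i in range(len(counts) - WINDOW)])
y = counts[WINDOW:]
X = X[..., np.newaxis]  # shape: (samples, timesteps, features)

model = keras.Sequential([
    keras.Input(shape=(WINDOW, 1)),
    layers.SimpleRNN(32),  # recurrent layer over the minute-level sequence
    layers.Dense(1),       # predicted traffic for the next minute
])
model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.01), loss="mse")
model.fit(X, y, epochs=10, batch_size=32, verbose=0)
```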

Their project led the library to install six more gates in order to get a better overview of space usage. As a side benefit, the library was also able to pinpoint the times when the gates malfunctioned and communicate the issue to the gate vendor. Di Monte and Patil plan to hand the project over to the library’s assessment team for ongoing monitoring and, as a next step, to look for ways to map the library’s traffic flow across multiple buildings.

Overall, there was a lot of interest in machine learning, deep learning, and artificial intelligence at the Code4Lib conference this year. The breakout session I led on these topics produced a lively discussion of a variety of tools, current and future projects at many different libraries, and the impact of rapidly developing AI technologies on society. The session also generated the #ai-dl-ml channel in the Code4Lib Slack space. The growing interest in these areas is also reflected in the newly formed Machine and Deep Learning Research Interest Group of the Library and Information Technology Association. I hope to see more talks and discussion on these topics at future Code4Lib and other library technology conferences.


Eric:

One of the talks that struck me the most this year was Matthew Reidsma’s “Auditing Algorithms.” He used examples of search suggestions in the Summon discovery layer to show biased and inaccurate results:

In 2015 my colleague Jeffrey Daniels showed me the Summon search results for his go-to search: “Stress in the workplace.” Jeff likes this search because ‘stress’ is a common engineering term as well as one common to psychology and the social sciences. The search demonstrates how well a system handles word proximities, and in this regard, Summon did well. There are no apparent results for evaluating bridge design. But Summon’s Topic Explorer, the right-hand sidebar that provides contextual information about the topic you are searching for, had an issue. It suggested that Jeff’s search for “stress in the workplace” was really a search about women in the workforce. Implying that stress at work was caused, perhaps, by women.

This sort of work is not, for me, novel or groundbreaking. Rather, it was so important to hear because of its relation to similar issues I’ve been reading about since library school. From the bias present in Library of Congress subject headings, where “Homosexuality” used to be filed under “Sexual deviance,” to Safiya Noble’s work on the algorithmic bias of major search engines like Google, where her queries for the term “black girls” yielded pornographic results, our systems are not neutral but reify the existing power relations of our society. They reflect the dominant, oppressive forces that constructed them. I contrast LC subject headings and Google search suggestions intentionally; this problem is as old as the organization of information itself. Whether we use hierarchical, browsable classifications developed by experts or estimated proximities generated by an AI with massive amounts of user data at its disposal, there will be oppressive misrepresentations if we don’t work to prevent them.

Reidsma’s work engaged with algorithmic bias in a way that I found relatable, since I manage a discovery layer. The talk made me want to immediately implement his recording script in our instance so I can start looking for and reporting problematic results. It also touched on something that has made me despair about library work lately: our reliance on vendors and their proprietary black boxes. We’ve had a number of issues lately related to full-text linking that are confusing for end users and make me feel powerless. I submit support ticket after support ticket only to be told there’s no timeline for the fix.
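Reidsma’s scripts are his own work; as a rough idea of what a recording script could look like, the sketch below runs a fixed list of sample queries against a hypothetical discovery interface and logs whatever topic suggestion appears, so results can be reviewed for bias over time. The search URL and CSS selector are invented placeholders, and a real discovery layer may render suggestions client-side or require an API instead.

```python
# A generic sketch of the "recording script" idea: run sample queries against
# a discovery layer and log whatever topic suggestion comes back for later
# review. The URL and CSS selector below are hypothetical placeholders.
import csv
import datetime
import requests
from bs4 import BeautifulSoup

QUERIES = ["stress in the workplace", "mental illness", "immigration"]
SEARCH_URL = "https://discovery.example.edu/search"  # hypothetical

with open("topic_suggestions_log.csv", "a", newline="") as fh:
    writer = csv.writer(fh)
    for query in QUERIES:
        html = requests.get(SEARCH_URL, params={"q": query}).text
        soup = BeautifulSoup(html, "html.parser")
        topic = soup.select_one(".topic-explorer .topic-title")  # hypothetical selector
        writer.writerow([
            datetime.datetime.now().isoformat(),
            query,
            topic.get_text(strip=True) if topic else "no suggestion",
        ])
```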

On a happier note, there were many other talks at Code4Lib that I enjoyed and admired: Chris Bourg gave a rousing opening keynote featuring a rallying cry against mansplaining; Andreas Orphanides, who keynoted last year’s conference, gave yet another great talk on design and systems theory full of illuminating examples; Jason Thomale’s introduction to Pycallnumber wowed me and gave me a new tool I immediately planned to use; Becky Yoose navigated the tricky balance between using data to improve services and upholding our duty to protect patron privacy. I fear I’ve left many more excellent talks unmentioned, but I don’t want to ramble any further. Suffice it to say, I always find Code4Lib worthwhile, and this year was no exception.