Libraries & Privacy in the Internet Age

Recently, we covered library data collection practices with an eye towards identifying what your library really needs to retain. In an era of seemingly comprehensive surveillance, libraries do their best to afford their patrons some privacy. Limiting our circulation statistics is a prime example: while many libraries track how many times a particular item circulates, it’s common practice to delete loan histories in patron records once items have been returned. Thus, in keeping with the Library Code of Ethics, we can “protect each library user’s right to privacy and confidentiality” while at once using data to improve our services.

However, not all information lives in books and our privacy protections must stay current with technology. Obfuscating the circulation of physical items is one thing, but what about all of our online resources? Most of the data noted in the data collection post is in and of the digital: web analytics, server logs, and heat maps. Today, people expose more and more of their personal information online and do so mostly on for-profit websites. In this post, I’ll go beyond library-specific data to talk further about how we can offer patrons enhanced privacy even when they’re not using resources we control, such as the library website or ILS.

Public Computers

Libraries are a great bastion of public computer access. We’re pretty much the only institution in modern society that a community can rely upon for free software use and web access. But how much thought do we put into the configuration of our public computers? Are we sure that each user’s session is completely isolated, unable to be accessed by others?

For a while, I tried to do quantitative research on how well libraries handled web browser settings on public computers. I went to whatever libraries I could—public, academic, law, anyone who would let me in the door and sit down at a computer, typically without a library card, which is not everyone. If I could get to a machine, I ran a brief audit of sorts, these being the main items:

  • List the web browsers on the machine, their versions, settings, & any add-ons present
  • Run Mozilla’s Plugin Check to test for outdated plugins, a common security vulnerability for browsers
  • Attempt to install problematic add-ons, such as keyloggers [1]
  • Attempt to change the browser’s settings, e.g. set it to offer to save passwords
  • Close the browser, then reopen it to see if my history and settings changes persisted
  • DELETE ALL THE THINGS

After awhile, I gave up on this effort, because I became busy with other projects and I never received a satisfactory sample size. Of the fourteen browsers across six (see what I mean about sample size?) libraries I tested, results were discouraging:

  • 93% (all but one) of browsers were outdated
  • On average, browsers had two plug-ins with known security vulnerabilities and two-and-a-half more which were outdated
  • The majority of browsers (79%) retained their history after being closed
  • A few (36%) offered to remember passwords, which could lead to dangerous accidents on shared computers
  • The majority (86%) had no add-ons installed
  • All but one allowed its settings to be changed
  • All but one allowed arbitrary add-ons to be installed

I understand that IT departments often control public computer settings and that there are issues beyond privacy which dictate these settings. But these are miserable figures by any standard. I encourage all librarians to run similar audits on their public computers and see if any improvements can be made. We’re allowing users’ sessions to bleed over into each other and giving them too much power to monitor each others’ activity. Much as libraries commonly anonymize circulation information to protect patrons from invasive government investigations, we should strive to keep web activities safe with sensible defaults. [2]

Many libraries force users to sign in or reserve a computer. Academic libraries may use Active Directory, wherein students sign in with a common login they use for other services like email, while public libraries may use PC reservation software like EnvisionWare. These approaches go a long way towards isolating user sessions, but at the cost of imposing access barriers and slowing start-up times. Now users need an AD account or library card to use your computers. Furthermore, users don’t always remember to sign off at the end of their session, meaning someone else could still sit down at their machine and potentially access their information. These can seem like unimportant edge cases, but they’re still worthy of consideration. Privacy almost always involves some kind of tradeoff, for users and for libraries. We need to ensure we’re making the right tradeoffs with due diligence.

Proactive Privacy

Libraries needn’t be on the defensive about privacy. We can also proactively help patrons in two ways: by modifying the browsers on our public computers to offer enhanced protections and by educating the public about their privacy.

While providing sensible defaults, such as not offering to remember passwords and preventing the installation of keylogging software, is helpful, it does little to offer privacy above and beyond what one would experience on a personal machine. However, libraries can use a little knowledge and research to offer default settings which are unobtrusive and advantageous. The most obvious example is HTTPS. HTTPS is probably familiar to most people; when you see a lock or other security-connoting icon in your browser’s address bar, it’ll be right alongside a URL that begins with the HTTPS scheme. You can think of the S in HTTPS as standing for “Secure,” meaning your web traffic is encrypted as it goes from node to node in between your browser and the server delivering data.

Banking sites, social media, and indeed most web accounts are commonly accessed over HTTPS connections. They operate rather seamlessly, the same as HTTP connections, with one slight caveat: HTTPS sites don’t load HTTP resources (e.g. if https://example.com happens to include the image http://cats.com/lol.jpg) by default, meaning sometimes pieces of a page are missing or broken. This commonly results in a “mixed content” warning which the user can override, though how intuitive that process is varies widely across browser user interfaces.

In any case, mixed content happens rarely enough that HTTPS, when available, is a no-brainer benefit. But here’s the rub: not all sites default to HTTPS, even if they should. Most notably, Facebook doesn’t. Do you want your patrons logging into Facebook with unencrypted credentials? No, you don’t, because anyone monitoring network traffic, using a tool like Firesheep for instance, can grab and reuse those credentials. So installing an extension like the superlative HTTPS Everywhere [3], which uses a crowdsourced set of formulas to deliver HTTPS sites where available, is of immense benefit to users even though they likely will never notice it.

HTTPS is just a start: there are numerous add-ons which offer security and privacy enhancements, from blocking tracking cookies to the NoScript Security Suite which blocks, well, pretty much everything. How disruptive these add-ons are is variable and putting NoScript or a similar script-blocking tool on public computers is probably a bad idea; it’s simply too strange for unacquainted users to understand. But awareness of these tools is vital and some of the less disruptive ones still offer benefits that the majority of your patrons would enjoy. If you’re on the fence about a particular option, a little targeted usability testing could highlight whether it’s worth it or not.

Education

In terms of education, online privacy is a massively under-taught field. Workshops in public libraries and courses in academic libraries are obvious and in-demand services we can provide. They can cater to users of all skill levels. A basic introduction might appeal to people just beginning to use the web, covering core concepts like HTTPS, session data (e.g. cookies), and the importance of auditing account settings. An advanced workshop could cover privacy software, two-factor authentication, and pivotal extensions that have a more niche appeal.

Password management alone is a rich topic. Why? Because it’s a problem for everyone. Being a modern web user virtually necessitates maintaining a double-digit number of accounts. Password best practices are fairly well-known: use lengthy, unique passwords with a mixture of character types (lowercase and uppercase letters, numbers, and punctuation). Applying them is another matter. Repeating one password across accounts means if one company get hacked, suddenly all your accounts are potentially vulnerable. Using tricky number-letter replacement strategies can lead to painful forgetting—was it LibrarianFervor with “1”s instead of “i”s, a “3” instead of an “e”, a “0” instead of an “o”, or any combination thereof? Or did I spell it in reverse? These strategies aren’t much more secure and yet they make remembering passwords tough.

Users aren’t to be blamed: creating a well-considered and scalable approach to managing online accounts is difficult. But many wonderful software packages exist for this, e.g. the open source KeePass or paid solutions like 1Password and LastPass. Merely showing users these options and explaining their immense benefits is a public service.

To use a specific example, I co-taught an interdisciplinary course recently with a title broad enough—”The Nature of Knowledge,” try that on for size—that sneaking in privacy, social media, and web browsers was easy. One task I had willing students perform was to install the PrivacyFix extension and then report back on their findings. PrivacyFix analyzes your Google and Facebook settings, telling you how much you’re worth to each company and pointing out places where you might be overexposing your information. It also includes a database of site ratings, judging sites based on how well they handle users data.

Our class was as diverse as any at my community college: we had adult students, teenage students, working parents, athletes, future teachers, future nurses, future police officers, black students, white students, Latino students, women, men. And you know what? Virtually everyone was shocked by their findings. They gasped, they changed their settings, they did independent research on online privacy, and at the end of the course they said still wanted to learn more. I hardly think this class was an anomaly. Americans know they’re being monitored at every turn. They want to share information online but they want to do so intelligently. If we offer them the tools to do so, they’ll jump at the chance.

Exeunt

For those who are curious about browser extensions, I wrote (shameless plug) a RUSQ column on web privacy that covers most of this post but goes further in detail in terms of recommendations. The Sec4Lib listserv is worth keeping an eye on as well, and if you really want to go the extra mile you could attend the Security preconference at the upcoming LITA Forum in November. Online privacy is not likely to get any less complicated in the future, but libraries should see that as an opportunity. We’re uniquely poised, both as information professionals with a devotion to privacy and as providers of public computing services, to tackle this issue. And no one is going to do it for us.

Footnotes

[1]^ Keyloggers are software which record each keystroke. As such, they can be used to find username and password information entered into web forms. I couldn’t find a free keylogger add-on for every browser so I only tested in browsers which had one available.

[2]^ I have a GitHub repository of what I consider to be sensible defaults for Mozilla Firefox, which happens to store settings in a JavaScript file and thus makes it easy to version and share them. The settings are liberally commented but not tested in production.

[3]^ As you’ll notice if you visit that link, HTTPS Everywhere is only available for Google Chrome and Mozilla Firefox. In my experience, it almost never causes problems, especially with major websites like Facebook, and there are a few similar extensions which one could try e.g. KB SSL for Chrome. Unfortunately, Internet Explorer has a much weaker add-on ecosystem with no real HTTPS solution that I’m aware of. Safari also has a weak extension ecosystem, though there is at least one HTTPS Everywhere-type option that I haven’t tried and has acknowledged limitations.

At the very least, installing HTTPS Everywhere on Firefox and Chrome still helps users who employ those browsers, without affecting users who prefer the others.