Aaron Swartz and Too-Comfortable Research Libraries

*** Update: Several references and a video added (thanks to Brett Bonfield) on Feb. 21, 2013. ***

Who was Aaron Swartz?

If you are a librarian and do not know who Aaron Swartz is, that should probably change now. He helped develop the RSS standard, was a co-founder of Reddit, worked on the Open Library project, played a lead role in stopping the Stop Online Piracy Act (SOPA), and wrote the Guerrilla Open Access Manifesto. He also downloaded and freed 20% (2.7 million documents) of the Public Access to Court Electronic Records (PACER) database, which charges access fees for United States federal court documents; about 1,600 of those documents had privacy issues.

Most famously, he was arrested in 2011 for the mass download of journal articles from JSTOR. He returned the documents to JSTOR and apologized. The Massachusetts state court dismissed the charges, and JSTOR decided not to pursue civil litigation. But MIT stayed silent, and federal prosecutors charged Swartz with wire fraud, computer fraud, unlawfully obtaining information from a protected computer, and recklessly damaging a protected computer. If convicted on these charges, Swartz, then 26, could have been sentenced to up to 35 years in prison. He committed suicide on January 11, 2013, after facing charges for two years.

Information wants to be free; Information wants to be expensive

Now, he was a controversial figure. He advocated Open Access (OA), but in his manifesto he went as far as encouraging scholars, librarians, and students who have access to copyrighted academic materials to trade passwords and circulate those materials freely, on the grounds that this is an act of civil disobedience against unjust copyright laws. He was an advocate of the open Internet, transparent government, and open access to scholarly output. But he also physically entered an MIT network wiring closet and attached his laptop to the network to download over 4 million articles from JSTOR. Most people, including librarians, are not going to advocate trading their institutions’ subscription database passwords or breaking into a staff-only computer networking area of an institution. The actual method of OA that Swartz recommended was highly controversial even among the strongest OA advocates.

But in his Guerrilla OA manifesto, Swartz raised one very valid point about the nature of information in the era of the World Wide Web. That is, information is power. (a) As power, information can be spread to and be made useful to as many of us as possible. Or, (b) it can be locked up and the access to it can be restricted to only those who can pay for it or have access privileges some other way. One thing is clear. Those who do not have access to information will be at a significant disadvantage compared to those who do.

And I would like to ask what today’s academic and/or research libraries are doing to realize Scenario (a) rather than Scenario (b). Are academic/research libraries doing enough to make information available to as many as possible?

Too-comfortable Internet, Too-comfortable academic libraries

Among the many articles I read about Aaron Swartz’s sudden death, the one that made me think most was “Aaron Swartz’s suicide shows the risk of a too-comfortable Internet.” The author of this article worries that we may now have a too-comfortable Internet. The Internet is slowly turning into just another platform for those who can afford to purchase information. The Internet as the place where you could freely find, use, modify, create, and share information is disappearing. Instead, paywalls and closed doors are being established. Useful information on the Internet is rapidly being monetized, and access is no longer free and open. Even government documents cease to be freely accessible to the public once they are put up on the Internet (likely because of digitization and online storage costs), as the case of PACER and Aaron Swartz shows. We are more and more getting used to giving up our privacy or to paying for information. This may be inevitable in a capitalist society, but should the same apply to libraries as well?

The thought about the too-comfortable Internet made me wonder whether academic research libraries were also becoming too comfortable with the status quo of licensing electronic journals and databases for patrons. When the library collection was physical, people who walked into the library were rarely turned away. The resources in the library are collected and preserved because we believe that people have the right to learn and investigate things and to form their own opinions, and that the knowledge of the past should be made available for that purpose. Regardless of age, gender, and social or financial status, libraries have welcomed and encouraged people in quest of knowledge and information. With the increasing number of electronic resources in the library, however, this has been changing.

Many academic libraries offer computers, which are necessary to access the library’s own electronic resources. But how many academic libraries keep all of their computers open to users without a log-in? Often those library computers are locked and require a username and password, which only those affiliated with the institution possess. The same often goes for electronic resources. How many academic libraries allow on-site access to electronic resources by walk-in users? How many academic libraries insist, in the license, on walk-in users’ access to the resources that they pay for? Many academic libraries also participate in the Federal Depository Library Program, which requires those libraries to provide the public with free access to the government documents that they receive. But how easy is it for the public to enter those libraries and access the free government information there?

I asked on Twitter about guest access to computers and e-resources in academic libraries. Approximately 25 academic librarians generously answered my question. (Thank you!) According to the responses, almost all of the libraries mentioned offer guest access to computers and e-resources on-site. It is to be noted, however, that a few offer guest access to neither. Also, some libraries limit guests’ computer use to somewhere between 30 minutes and 4 hours, thereby restricting access to the library’s electronic resources as well. Only a few libraries offer free wi-fi for guests. And at some libraries, guest wi-fi users are unable to access the library’s e-resources even on-site because the IP range of the guest wi-fi is different from that of the campus wi-fi.
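To make the guest wi-fi problem concrete, here is a minimal sketch of how IP-based recognition might behave. I am assuming the vendor simply checks whether a visitor's address falls inside a range the library registered; the ranges, addresses, and the `is_recognized` helper are all made up for illustration (the addresses come from blocks reserved for documentation), and real vendor systems may work differently.

```python
import ipaddress

# Hypothetical range that a library registered with an e-resource vendor.
REGISTERED_RANGES = [ipaddress.ip_network("192.0.2.0/24")]


def is_recognized(client_ip: str) -> bool:
    """Return True if the address falls inside any registered range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in network for network in REGISTERED_RANGES)


# A patron on the campus network versus a guest on a separate guest network.
campus_client = "192.0.2.57"
guest_client = "198.51.100.23"

print(is_recognized(campus_client))  # True: inside the registered range
print(is_recognized(guest_client))   # False: the guest range was never registered
```

Under this assumption, the guest sits in the same building as the campus user but is still asked to authenticate, simply because the guest network's range is not on the registered list.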

I am not sure how many academic libraries consciously negotiate walk-in users’ on-site access with e-resource vendors, or whether this happens semi-automatically because many libraries ask for the library building’s IP range to be registered with vendors so that authentication can be turned off inside the building. I surmise that publishers and database vendors will not automatically permit walk-in users’ on-site access in their licenses unless libraries ask for it. Some vendors also explicitly prohibit libraries from using their materials to fill interlibrary loan requests from other libraries. Electronic resource vendors’ and publishers’ pricing has become more and more closely tied to the number of patrons who can access their products. Academic libraries have been dealing with the escalating costs of electronic resources by filtering out library patrons and limiting access to those in specific disciplines. For example, academic medical and health sciences libraries often subscribe to databases and resources that have the most up-to-date information about biomedical research, diseases, medications, and treatments. These are almost always inaccessible to the general public and often even to those affiliated with the institution. The use of these prohibitively expensive resources is limited to a very small portion of people who are affiliated with the institution in specific disciplines such as medicine and health sciences. Academic research libraries have been partially responsible for the proliferation of these access limitations by welcoming and often preferring these limitations as a cost-saving measure. (By contrast, if those resources were in print, no librarian would think that it is OK to permanently limit their use to those in medical or health science disciplines only.)

Too-comfortable libraries do not ask themselves if they are serving the public good of providing access to information and knowledge for those who are in need but cannot afford it. Too-comfortable libraries see their role as a mediator and broker in the transaction between the information seller and the information buyer. They may act as an efficient and successful mediator and broker. But I don’t believe that that is why libraries exist. Ultimately, libraries exist to foster the sharing and dissemination of knowledge more than anything, not to efficiently mediate information leasing. And this is the dangerous idea: You cannot put a price tag on knowledge; it belongs to the human race. Libraries used to be the institution that validated and confirmed this idea. But will they continue to be so in the future? Will an academic library be able to remain a sanctuary for all ideas and a place for sharing knowledge for people’s intellectual pursuits regardless of their institutional membership? Or will it be reduced to a branch of an institution that sells knowledge to its tuition-paying customers only? While public libraries are more strongly aligned than academic libraries with this mission of making information and knowledge freely and openly available to the public, they cannot be expected to cover the research needs of patrons as fully as academic libraries can.

I am not denying that libraries are also making efforts to continue the preservation of and access to information and resources through initiatives such as HathiTrust and DPLA (Digital Public Library of America). My concern is rather whether academic research libraries are becoming too well adapted to the times of the Internet and online resources, and too comfortable serving only the needs of the most tangible patron base in the most cost-efficient way, on the assumption that the library’s mission of storing and disseminating knowledge can now be safely and neutrally relegated to the Internet and the market. But it is a fantasy to believe that the Internet will be a sanctuary for all ideas (the Internet is being censored, as shown in the case of Tarek Mehanna), and the market will surely not hold the ideal of free and open access to knowledge for the public.

If libraries do not fight for and advocate for those who are in need of information and knowledge but cannot afford it, no other institution will do so. Of course, it costs money to create, format, review, and package content. Authors as well as those who work in this business of content formatting, reviewing, packaging, and producing should be compensated for their work. But not to the extent that the content is completely inaccessible to those who cannot afford to purchase it but nevertheless want access to it for learning, inquiry, and research. This is probably the reason why we are all moved by Swartz’s Guerrilla Open Access Manifesto in spite of the illegal implications of the action that he actually recommended in the manifesto.

Knowledge and information are not like any other product for purchase. Sharing increases their value, thereby enabling innovation, further research, and new knowledge. Limiting knowledge and information to only those with access privilege and/or sufficient purchasing power creates a fundamental inequality. The mission of a research institution should never be limited to serving only its own members, in my opinion. And if the institution forgets this, it should be the library that first raises a red flag. The mission of an academic research institution is to promote the freedom of inquiry and research and to provide an environment that supports that mission inside and outside of its walls, and that is why a library is said to be the center of an academic research institution.

I don’t have any good answers to the inevitable question of “So what can an academic research library do?” Perhaps, we can start with broadening the guest access to the library computers, wi-fi, and electronic resources on-site. Academic research libraries should also start asking themselves this question: What will libraries have to offer for those who seek knowledge for learning and inquiry but cannot afford it? If the answer is nothing, we will have lost libraries.

In his talk about the Internet Archive’s Open Library project at the Code4Lib Conference in 2008 (at 11:20), Swartz describes how librarians had argued about which subject headings to use for the books in the Open Library website. And he says, “We will use all of them. It’s online. We don’t have to have this kind of argument.” Once produced, online information and resources do not incur additional costs each time they are used. Many resources, particularly scholarly research outputs, already have established buyers such as research libraries. Do we have to deny access to information and knowledge to those who cannot afford it but are seeking it, just so that we can have a market where information and knowledge resources are sold and bought, and where authors and those who work with the created content are compensated as a result? No, this is a false question. We can have both. But libraries and librarians will have to make it so.

Videos to Watch

“Code4Lib 2008: Building the Open Library – YouTube.”

“Aaron Swartz on Picking Winners” American Library Association Midwinter meeting, January 12, 2008.

“Freedom to Connect: Aaron Swartz (1986-2013) on Victory to Save Open Internet, Fight Online Censors.”

References

“Aaron Swartz.” 2013. Accessed February 10. http://www.aaronsw.com/.

“Aaron Swartz – Wikipedia, the Free Encyclopedia.” 2013. Accessed February 10. http://en.wikipedia.org/wiki/Aaron_Swartz#JSTOR.

“Aaron Swartz on Picking Winners – YouTube.” 2008. http://www.youtube.com/watch?feature=player_embedded&v=BvJqXaoO4FI.

“Aaron Swartz’s Suicide Shows the Risk of a Too-comfortable Internet – The Globe and Mail.” 2013. Accessed February 10. http://www.theglobeandmail.com/commentary/aaron-swartzs-suicide-shows-the-risk-of-a-too-comfortable-internet/article7509277/.

“Academics Remember Reddit Co-Founder With #PDFTribute.” 2013. Accessed February 10. http://www.slate.com/blogs/the_slatest/2013/01/14/aaron_swartz_death_pdftribute_hashtag_aggregates_copyrighted_articles_released.html.

“After Aaron, Reputation Metrics Startups Aim To Disrupt The Scientific Journal Industry | TechCrunch.” 2013. Accessed February 10. http://techcrunch.com/2013/02/03/the-future-of-the-scientific-journal-industry/.

American Library Association. 2013. “A Memorial Resolution Honoring Aaron Swartz.” http://connect.ala.org/files/memorial_5_aaron%20swartz.pdf.

“An Effort to Upgrade a Court Archive System to Free and Easy – NYTimes.com.” 2013. Accessed February 10. http://www.nytimes.com/2009/02/13/us/13records.html?_r=1&.

Bonfield, Brett. 2013. “Aaron Swartz.” In the Library with the Lead Pipe (February 20). http://www.inthelibrarywiththeleadpipe.org/2013/aaron-swartz/.

“Code4Lib 2008: Building the Open Library – YouTube.” 2013. Accessed February 10. http://www.youtube.com/watch?v=oV-P2uzzc4s&feature=youtu.be&t=2s.

“Daily Kos: What Aaron Swartz Did at MIT.” 2013. Accessed February 10. http://www.dailykos.com/story/2013/01/13/1178600/-What-Aaron-Swartz-did-at-MIT.

Dupuis, John. 2013a. “Around the Web: Aaron Swartz Chronological Link Roundup – Confessions of a Science Librarian.” Accessed February 10. http://scienceblogs.com/confessions/2013/01/20/around-the-web-aaron-swartz-chronological-link-roundup/.

———. 2013b. “Library Vendors, Politics, Aaron Swartz, #pdftribute – Confessions of a Science Librarian.” Accessed February 10. http://scienceblogs.com/confessions/2013/01/17/library-vendors-politics-aaron-swartz-pdftribute/.

“FDLP for PUBLIC.” 2013. Accessed February 10. http://www.gpo.gov/libraries/public/.

“Freedom to Connect: Aaron Swartz (1986-2013) on Victory to Save Open Internet, Fight Online Censors.” 2013. Accessed February 10. http://www.democracynow.org/2013/1/14/freedom_to_connect_aaron_swartz_1986.

“Full Text of ‘Guerilla Open Access Manifesto’.” 2013. Accessed February 10. http://archive.org/stream/GuerillaOpenAccessManifesto/Goamjuly2008_djvu.txt.

Groover, Myron. 2013. “British Columbia Library Association – News – The Last Days of Aaron Swartz.” Accessed February 21. http://www.bcla.bc.ca/page/news/ezlist_item_9abb44a1-4516-49f9-9e31-57685e9ca5cc.aspx#.USat2-i3pJP.

Hellman, Eric. 2013a. “Go To Hellman: Edward Tufte Was a Proto-Phreaker (#aaronswnyc Part 1).” Accessed February 21. http://go-to-hellman.blogspot.com/2013/01/edward-tufte-was-proto-phreaker.html.

———. 2013b. “Go To Hellman: The Four Crimes of Aaron Swartz (#aaronswnyc Part 2).” Accessed February 21. http://go-to-hellman.blogspot.com/2013/01/the-four-crimes-of-aaron-swartz.html.

“How M.I.T. Ensnared a Hacker, Bucking a Freewheeling Culture – NYTimes.com.” 2013. Accessed February 10. http://www.nytimes.com/2013/01/21/technology/how-mit-ensnared-a-hacker-bucking-a-freewheeling-culture.html?pagewanted=all.

March, Andrew. 2013. “A Dangerous Mind? – NYTimes.com.” Accessed February 10. http://www.nytimes.com/2012/04/22/opinion/sunday/a-dangerous-mind.html?pagewanted=all.

“MediaBerkman » Blog Archive » Aaron Swartz on The Open Library.” 2013. Accessed February 22. http://blogs.law.harvard.edu/mediaberkman/2007/10/25/aaron-swartz-on-the-open-library-2/.

Peters, Justin. 2013. “The Idealist.” Slate, February 7. http://www.slate.com/articles/technology/technology/2013/02/aaron_swartz_he_wanted_to_save_the_world_why_couldn_t_he_save_himself.html.

“Public Access to Court Electronic Records.” 2013. Accessed February 10. http://www.pacer.gov/.

“Publishers and Library Groups Spar in Appeal to Ruling on E-Reserves – Technology – The Chronicle of Higher Education.” 2013. Accessed February 10. http://chronicle.com/article/PublishersLibrary-Groups/136995/?cid=pm&utm_source=pm&utm_medium=en.

“Remember Aaron Swartz.” 2013. Celebrating Aaron Swartz. Accessed February 22. http://www.rememberaaronsw.com.

Rochkind, Jonathan. 2013. “Library Values and the Growing Scholarly Digital Divide: In Memoriam Aaron Swartz | Bibliographic Wilderness.” Accessed February 10. http://bibwild.wordpress.com/2013/01/13/library-values-and-digital-divide-in-memoriam-aaron-swartz/.

Sims, Nancy. 2013. “What Is the Government’s Interest in Copyright? Not That of the Public. – Copyright Librarian.” Accessed February 10. http://blog.lib.umn.edu/copyrightlibn/2013/02/what-is-the-governments-interest-in-copyright.html.

Stamos, Alex. 2013. “The Truth About Aaron Swartz’s ‘Crime’.” Unhandled Exception. Accessed February 22. http://unhandled.com/2013/01/12/the-truth-about-aaron-swartzs-crime/.

Summers, Ed. 2013. “Aaronsw | Inkdroid.” Accessed February 21. http://inkdroid.org/journal/2013/01/19/aaronsw/.

“The Inside Story of Aaron Swartz’s Campaign to Liberate Court Filings | Ars Technica.” 2013. Accessed February 10. http://arstechnica.com/tech-policy/2013/02/the-inside-story-of-aaron-swartzs-campaign-to-liberate-court-filings/.

“Welcome to Open Library (Open Library).” 2013. Accessed February 10. http://openlibrary.org/.

West, Jessamyn. 2013. “Librarian.net » Blog Archive » On Leadership and Remembering Aaron.” Accessed February 21. http://www.librarian.net/stax/3984/on-leadership-and-remembering-aaron/.


The End of Academic Library Circulation?

What Library Circulation Data Shows

Unless current patterns change, by 2020 university libraries will no longer have circulation desks. This claim may seem hyperbolic if you’ve been observing your library, or even if you’ve been glancing over ACRL or National Center for Education Statistics data. If you have been looking at the data, you might be familiar with a pattern that looks like this:

[Chart: total circulation]

This chart shows total circulation for academic libraries, and while there’s a decline, it certainly doesn’t look like it will hit zero anytime soon, definitely not in just 8 years. But there is a problem with this data and this perspective on library statistics. When we talk about “total circulation” we’re talking about a property of the library; we’re not really thinking about users.

Here’s another set of data that you need to look at to really understand circulation:

[Chart: fall enrollments]

Academic enrollment has been rising rapidly. This means more students, which in turn should mean greater circulation. So if total circulation has been dropping despite an increase in users, then something else must be going on. So rather than asking “How many items does my library circulate?” we need to ask “How many items does the average student check out?”
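As a rough illustration of why the per-student view matters, here is a small sketch that compares the change in total circulation with the change in circulation per student. The figures are invented for the example and are not taken from the ACRL or NCES data; the function name is likewise just a placeholder.

```python
def circulation_per_student(total_circulation: float, fte_enrollment: float) -> float:
    """Average number of items checked out per FTE student."""
    if fte_enrollment <= 0:
        raise ValueError("enrollment must be positive")
    return total_circulation / fte_enrollment


# Invented figures for one hypothetical library in two different years.
earlier = circulation_per_student(100_000, 5_000)  # 20.0 items per student
later = circulation_per_student(90_000, 7_500)     # 12.0 items per student

# Total circulation fell by 10%, but circulation per student fell by 40%.
total_change = (90_000 - 100_000) / 100_000
per_student_change = (later - earlier) / earlier

print(f"total circulation change: {total_change:.0%}")
print(f"per-student change: {per_student_change:.0%}")
```

With numbers like these, a modest dip in the total hides a much steeper drop in what the average student actually borrows, because enrollment grew over the same period.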

Here is that data:

[Chart: circulation per student]

This chart shows the upper/lower quartiles and median for circulation per FTE student.  As you can see this data shows a much more dramatic drop in the circulation of library materials. Rising student populations hide this fact.
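For readers who want to compute the same kind of summary for their own peer group, here is a minimal sketch using Python's standard `statistics` module. The per-FTE values are invented, and the "inclusive" quantile method is my own choice for the example; it is not necessarily the method used to produce the chart.

```python
import statistics

# Invented circulation-per-FTE values for nine hypothetical libraries in one year.
per_fte = [14.0, 2.0, 10.0, 6.0, 18.0, 4.0, 16.0, 8.0, 12.0]

# With n=4, quantiles() returns the three cut points:
# lower quartile, median, and upper quartile.
lower_q, median, upper_q = statistics.quantiles(per_fte, n=4, method="inclusive")

print(f"lower quartile: {lower_q}")
print(f"median: {median}")
print(f"upper quartile: {upper_q}")
```

Repeating this for each year gives three lines over time, which is the shape of the chart described above.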

[source: http://xkcd.com/605/]

But 2020? Can I be serious? The simple linear regression model in the charts is probably a good predictor of 2012, but not necessarily 2020. Hitting zero without flattening out seems pretty unlikely. However, it is worth noting that circulation per user in the lower quartile for less-than-4-year colleges reached 1.1 in 2010. If you’re averaging around 1 item per user, every user that takes out 2 items means there’s another who has checked out 0.
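To show what extrapolating a simple linear regression involves, here is a minimal sketch with an ordinary least-squares fit written out by hand. The data points are invented and deliberately fall on a perfect line, so the result says nothing about the real charts; `fit_line` is a hypothetical helper, not the code behind them.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented circulation-per-student values, for illustration only.
years = [1998, 2002, 2006, 2010]
per_student = [12.0, 10.0, 8.0, 6.0]

slope, intercept = fit_line(years, per_student)

# A short-range prediction, and the year the fitted line would cross zero.
predicted_2012 = slope * 2012 + intercept
zero_year = -intercept / slope

print(f"slope per year: {slope}")
print(f"predicted 2012 value: {predicted_2012}")
print(f"line reaches zero in: {zero_year}")
```

The short-range prediction sits close to the observed points, while the zero crossing depends entirely on the assumption that the trend never flattens, which is exactly the caveat above.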

What’s Happening Here?

Rather than waste too much time trying to predict a future we’ll live in in less than a decade, let’s explore the more interesting question: “What’s happening here?”

By far the number one hypothesis I get when I show people this data is “Clearly this is just because of the rise of e-journals and e-books”. This hypothesis is reasonable: What has happened is simply that users have switched from print to electronic. This data represents a shift in media, nothing more.

But there are 2 very large problems with this hypothesis.

First, print journal circulation is not universal among academic libraries. Where there is no print journal circulation, the effect of e-journals would not show up in circulation data. However, I don’t have information on exactly how many academic libraries did circulate print journals. Maybe the effect of e-journals on just the libraries that do circulate serials could affect the data for everyone. The data we have already shown resolves this issue. Libraries that did circulate serials would have higher circulation per user than those that did not. By showing different quartiles, we can account for this difference between libraries that did and did not circulate journals. If you look at the data, you’ll see that the upper quartile does indeed seem to have a higher rate of decline, but not enough to validate this hypothesis. The median and lower quartiles also experience this shift, so something else must be at work.

Second, e-books were not widely adopted until the mid-2000s, yet the decline before 2000 is at least as steep as the decline after it. If you look at the chart below, you’ll notice that e-book acquisition rates did not exceed print until 2010:

[Chart: ebooks vs print]

E-books, of course, do have an effect on usage, but they’re not the primary factor in this change.

So clearly we must reject the hypothesis that this is merely a media shift. Certainly the shift from print to electronic has had some effect, but it is not the sole cause. If it’s not a shift in media, the most reasonable explanation is that it’s a shift in user behavior.  Students are simply not using books (in any format) as much as they used to.

What is Causing this Shift in User Behavior?

The next question is what is the cause of this shift.

I think the simplest answer is the web. 1996 is the first data point showing a drop in circulation. Of course the web was quite small then, but AOL and Yahoo! were already around, and the Internet Archive had been founded. If you think back to a pre-web time, pretty much anything you needed to know more about required a trip to the library and checking out a book.

The most important thing to take away is that, regardless of cause, user behavior has changed and by all data points is still changing. In the end, the greatest question is how academic libraries will adapt. It is clear that the answer is not as simple as a transition to a new medium. To survive, librarians must find the answer before we have enough data to prove these predictions.

If you enjoyed exploring this data, please check out Library Data and follow @librarydata on Twitter.

Data Source:

About our guest author: Will Kurt is a software engineer at Articulate Global, pursuing his masters in computer science at the University of Nevada, Reno and is a former librarian. He holds an MLIS from Simmons College and has worked in various roles in public, private and special libraries at organizations such as: MIT, BBN Technologies and the University of Nevada, Reno. He has written and presented on a range of topics including: play, user interfaces, functional programming and data