It’s Open Access Week, which for scholarly communications librarians and institutional repository managers is one of the big events of the year to reflect on our work and educate others. Over the years, it has become less necessary to explain what open access is. Rather, everyone seems to have a perception of open access and an opinion about it. But those perceptions and opinions may not be based on the original tenets of the open access movement. The commercialization of open access means that publishing open access may now seem too expensive for individuals to pursue, and too complicated for institutions to attempt without buying a product.
In some ways, the open access movement is analogous to punk music–a movement that grew out of protest and DIY sensibilities, but was quickly coopted as soon as it became financially successful. While it was never free or easy to make work open access, changes in the market recently might make it feel even more expensive and complicated. Those who want to continue to build open access repositories and promote open access need to understand where their institution fits in the larger picture, the motivations of researchers and administration, and be able to put the right solutions together to avoid serious missteps that will set back the open access movement.
Like many exciting new ideas, open access is partially a victim of its own success. For the past ten years, Heather Morrison has kept a tally of the dramatic growth of open access in an ongoing series. Her post for this year’s Open Access Week is the source for the statistics in this paragraph. Open access content makes up a sizeable portion of online content, and therefore is more of a market force. BASE now includes 100 million articles. The Directory of Open Access Journals, even after the stricter inclusion process, has seen 11% growth in article-level searching, with around 500,000 items. There are well over a billion items with a Creative Commons license. These numbers are staggering, and give a picture of how overwhelming the total amount of available content is, let alone the open access portion. But it also means that almost everyone doing academic research will have benefited from open access content. Not everyone who has used open access (or Creative Commons licensed) content will know what it is, but as it permeates more of the web it becomes more and more relevant. It also becomes much harder to manage, and dealing with that complexity requires new solutions–which may bring new areas of profit.
An example of this new type of service is 1Science, which launched a year ago. This is a service that helps libraries manage their open access collections, both in terms of understanding what is available in their subscribed collections as well as their faculty output. 1Science grew out of longer term research projects around emerging bibliometrics, started by Eric Archambault, and according to their About Us page started as a way to improve findability of open access content, and grew into a suite of tools that analyzes collections for open access content availability. The market is now there for this to be a service that libraries are interested in purchasing. Similar moves happened with alternative metrics in the last few years as well (for instance, Plum Analytics).
But the big story for commercial open access in 2016 was Elsevier. Elsevier already had a large stable of open access author-pays journals, with fees of up to $5000. That is the traditional way that large commercial publishers have participated in open access. But Elsevier has made several moves in 2016 that are changing the face of open access. They acquired SSRN in May, which built on their acquisition of Mendeley in 2013, and hints at a longer term strategy for combining a content platform and social citation network that potentially could form a new type of open access product that could be marketed to libraries. Their move into different business models for open access is also illustrated in their controversial partnership with the University of Florida. This uses an API to harvest content from ScienceDirect published by UF researchers, but will not provide access to those without subscriptions except to certain accepted manuscripts, and grew out of a recognition that UF researchers published heavily in Elsevier journals and that working directly with Elsevier would allow them to get a large dataset of their researchers’ content and funder compliance status more easily. 1 There is a lot to unpack in this partnership, but the fact that it can even take place shows that open access–particularly funder compliance for open access versions–is something about which university administration outside the library in the Office of Research Services is taking note. Such a partnership serves certain institutional needs, but it does not create an open access repository, and in most ways serves the needs of the publisher in driving content to their platform (though UF did get a mention of interlibrary loan into the process rather than just a paywall). 
It also removes incentives for UF faculty to publish in non-Elsevier journals, since their content in Elsevier journals will become even easier to find, and there will be no need to look elsewhere for open access grant compliance. Either way, this type of move takes the control of open access out of the hands of libraries, just as so many previous deals with commercial enterprises have done.
As I said in the beginning of this piece, more and more people already know about and benefit from open access, but all those people have different motivations. I break those into three categories, and which administrative unit I think is most likely to care about that aspect of open access:
- Open access is about the justice of wider access to academic content or getting back at the big publishers for exploitative practices. These people aren’t going to be that interested in a commercial open access solution, except inasmuch as it allows more access for a lower cost–for instance, a hosted institutional repository that doesn’t require institutional investment in developers. This group may include librarians and individual researchers.
- Open access is about following the rules for a grant-funded project since so many of those require open access versions of articles. Such requirements lead to an increase in author-pays open access, since publishers can command a higher fee that can be part of the grant award or subsidized by an institution. Repositories to serve these requirements and to address these needs are in progress but still murky to many. This group may include the Office of Research Services or Office of Institutional Research.
- “Open access” is synonymous with putting articles or article citations online to create a portfolio for reputation-building purposes. This last group is going to find something like that UF/Elsevier partnership to be a great situation, since they may not be aware of how many people cannot actually read the published articles. This last group may include administrators concerned with building the institution’s reputation.
For librarians who fall into the first category but are sensitive to the needs of everyone in each category, it’s important to select the right balance of solutions to meet everyone’s needs but still maintain the integrity of the open access repository. That said, this is not easy. Meeting this variety of needs is exactly why some of these new products are entering the market, and it may seem easier to go with one of them even if it’s not exactly the right long-term solution. I see this as an important continuing challenge facing librarians who believe in open access, and one that has to underpin future repository and outreach strategies.
- Russell, Judith C.; Wise, Alicia; Dinsmore, Chelsea S.; Spears, Laura I.; Phillips, Robert V.; and Taylor, Laurie (2016) “Academic Library and Publisher Collaboration: Utilizing an Institutional Repository to Maximize the Visibility and Impact of Articles by University Authors,” Collaborative Librarianship: Vol. 8: Iss. 2, Article 4.
Anyone who has worked on an institutional repository for even a short time knows that collecting faculty scholarship is not a straightforward process, no matter how nice your workflow looks on paper or how dedicated you are. Keeping expectations for the process manageable (not necessarily low, as in my clickbaity title) and constantly simplifying and automating can, however, make the process easier to run, and therefore make it work better. I’ve written before about some ways in which I’ve automated my process for faculty collection development, as well as how I’ve used lightweight project management tools to streamline processes. My newest technique for faculty scholarship collection development brings together pieces of all those to greatly improve our productivity.
Allocating Your Human and Machine Resources
First, here is the personnel situation we have for the institutional repository I manage. Your own circumstances will certainly vary, but I think institutions of all sizes will have some version of this distribution. I manage our repository as approximately half my position, and I have one graduate student assistant who works about 10-15 hours a week. From week to week we only average about 30-40 hours total to devote to all aspects of the repository, of which faculty collection development is only a part. We have 12 librarians who are liaisons with departments and do the majority of the outreach to faculty and promotion of the repository, but a limited amount of the collection development except for specific parts of the process. While they are certainly welcome to do more, in reality, they have so much else to do that it doesn’t make sense for them to spend their time on data entry unless they want to (and some of them do). The breakdown of work is roughly that the liaisons promote the repository to the faculty and answer basic questions; I answer more complex questions, develop procedures, train staff, make interpretations of publishing agreements, and verify metadata; and my GA does the simple research and data entry. From time to time we have additional graduate or undergraduate student help in the form of faculty research assistants, and we have a group of students available for digitization if needed.
Those are our human resources. The tools that we use for the day-to-day work include Digital Measures (our faculty activity system), Excel, OpenRefine, Box, and Asana. I’ll say a bit about what each of these is and how we use it below. By far the most important innovation for our faculty collection development workflow has been integration with the Faculty Activity System, which is how we refer to Digital Measures on our campus. Many colleges and universities have some type of faculty activity system or are in the process of implementing one. These generally are adopted for purposes of annual reports, retention, promotion, and tenure reviews. I have been at two different universities working on adopting such systems, and as you might imagine, it’s a slow process with varying levels of participation across departments. Faculty do not always like these systems for a variety of reasons, and so there may be hesitation to complete profiles even when required. Nevertheless, we felt in the library that this was a great source of faculty publication information that we could use for collection development for the repository and the collection in general.
We now have a required question about including the item in the repository on every item the faculty member enters in the Faculty Activity System. If a faculty member is saying they published an article, they also have to say whether it should be included in the repository. We started this in late 2014, and it revolutionized our ability to reach faculty and departments who never had participated in the repository before, as well as simplifying the lives of faculty who were eager participants but now only had to enter data in one place. Of course, there are still a number of people whom we are missing, but this is part of keeping your expectations low–if you can’t reach everyone, focus your efforts on the people you can. And anyway, we are now so swamped with submissions that we can’t keep up with them, which is a good if unusual problem to have in this realm. Note that the process I describe below is basically the same as when we analyze a faculty member’s CV (which I described in my OpenRefine post), but we spend relatively little time doing that these days since it’s easier for most people to just enter their material in Digital Measures and select that they want to include it in the repository.
The ease of integration between your own institution’s faculty activity system (assuming it exists) and your repository certainly will vary, but in most cases it should be possible for the library to get access to the data. It’s a great selling point for the faculty to participate in the system for your Office of Institutional Research or similar office who administers it, since it gives faculty a reason to keep it up to date when they may be in between review cycles. If your institution does not yet have such a system, you might still discuss a partnership with that office, since your repository may hold extremely useful information for them about research activity of which they are not aware.
We get reports from the Faculty Activity System on roughly a quarterly basis. Faculty member data entry tends to bunch around certain dates, so we focus on end of semesters as the times to get the reports. The reports come by email as Excel files with information about the person, their department, contact information, and the like, as well as information about each publication. We do some initial processing in Excel to clean them up, remove duplicates from prior reports, and remove irrelevant information. It is amazing how many people see a field like “Journal Title” as a chance to ask a question rather than provide information. We focus our efforts on items that have actually been published, since the vast majority of people have no interest in posting pre-prints and those that do prefer to post them in arXiv or similar. The few people who do know about pre-prints and don’t have a subject archive generally submit their items directly. This is another way to lower expectations of what can be done through the process. I’ve already described how I use OpenRefine for creating reports from faculty CVs using the SHERPA/RoMEO API, and we follow a similar but much simplified process since we already have the data in the correct columns. Of course, following this process doesn’t tell us what we can do with every item. The journal title may be entered incorrectly so the API call didn’t pick it up, or the journal may not be in SHERPA/RoMEO. My graduate student assistant fills in what he is able to determine, and I work on the complex cases. As we are doing this, the Excel spreadsheet is saved in Box so we have the change history tracked and can easily add collaborators.
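To make the clean-up step concrete, here is a minimal Python sketch of the kind of filtering described above: dropping items already seen in prior reports, keeping only published items, and flagging journal fields that look like questions rather than titles. We do this by hand in Excel; the column names (`author`, `title`, `journal`, `status`) and function names are hypothetical, not the actual Digital Measures export format.

```python
import re


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace so near-identical entries match."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def citation_key(row):
    # Hypothetical column names; a real export will have its own headers.
    return (normalize(row["author"]), normalize(row["title"]), normalize(row["journal"]))


def new_published_items(current_rows, prior_rows):
    """Return rows that did not appear in earlier reports and are marked as published."""
    seen = {citation_key(r) for r in prior_rows}
    fresh = []
    for row in current_rows:
        key = citation_key(row)
        if key in seen:
            continue
        seen.add(key)  # also drops duplicates within the current report
        if normalize(row.get("status", "")) == "published":
            fresh.append(row)
    return fresh


def needs_manual_review(journal_title):
    """Flag journal fields that are empty or contain a question instead of a title."""
    return not journal_title.strip() or "?" in journal_title
```

Rows flagged by `needs_manual_review` would go to a person rather than to an automated SHERPA/RoMEO lookup, which matches the split between simple and complex cases described above.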
At this point, we are ready to move to Asana, which is a lightweight project management tool ideal for several people working on a group of related projects. Asana is far more fun and easy to work with than Excel spreadsheets, and this helps us work together better to manage workload and see where we are with all our on-going projects. For each report (or faculty member CV), we create a new project in Asana with several sections. While it doesn’t always happen in practice, in theory each citation is a task that moves between sections as it is completed, and finally checked off when it is either posted or moved off into some other fate not as glamorous as being archived as open access full text. The sections generally cover posting the publisher’s PDF, contacting publishers, reminders for followup, posting author’s manuscripts, or posting to SelectedWorks, which is our faculty profile service that is related to our repository but mainly holds citations rather than full text. Again, as part of the low expectations, we focus on posting final PDFs of articles or book chapters. We add books to a faculty book list, and don’t even attempt to include full text for these unless someone wants to make special arrangements with their publisher–this is rare, but again the people who really care make it happen. If we already know that the author’s manuscript is permitted, we don’t add these to Asana, but keep them in the spreadsheet until we are ready for them.
We contact publishers in batches, trying to group citations by journal and publisher to increase efficiency so we can send one letter to cover many articles or chapters. We note to follow up with a reminder in one month, and then again a month after that. Usually the second notice is enough to catch the attention of the publisher. As they respond, we move the citation to either the publisher’s PDF section or the author’s manuscript section, or, if posting is not permitted at all, to the SelectedWorks section. While we’ve tried several different procedures, I’ve determined it’s best for the liaison librarians to ask just for author’s accepted manuscripts for items after we’ve verified that no other version may be posted. And if we don’t ever get them, we don’t worry about it too much.
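The batching and reminder schedule can be sketched in a few lines of Python. This is only an illustration of the logic, not something we actually run (we track this in Asana); the dictionary keys `publisher`, `journal`, and `title` and the function names are assumptions made for the example.

```python
import calendar
from collections import defaultdict
from datetime import date


def add_months(d, months):
    """Return the same day-of-month `months` later, clamped to the end of short months."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def batch_by_publisher(citations):
    """Group citation titles by publisher, then by journal, so one letter covers many items."""
    batches = defaultdict(lambda: defaultdict(list))
    for c in citations:
        batches[c["publisher"]][c["journal"]].append(c["title"])
    return {pub: dict(journals) for pub, journals in batches.items()}


def followup_dates(sent_on):
    """First reminder one month after the letter, second reminder a month after that."""
    return [add_months(sent_on, 1), add_months(sent_on, 2)]
```

Grouping by publisher first and journal second mirrors how permission letters tend to be addressed: one contact per publisher, with the affected journals and articles listed inside.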
I hope you’ve gotten some ideas from this post about your own procedures and new tools you might try. Even more, I hope you’ll think about which pieces of your procedures are really working for you, and discard those that aren’t working any more. Your own situation will dictate which those are, but let’s all stop beating ourselves up about not achieving perfection. Make sure to let your repository stakeholders know what works and what doesn’t, and if something that isn’t working is still important, work collaboratively to figure out a way around that obstacle. That type of collaboration is what led to our partnership with the Office of Institutional Research to use the Digital Measures platform for our collection development, and that in turn has led to other collaborative opportunities.
The Directory of Open Access Journals (DOAJ) is an international directory of journals and index of articles that are available open access. The DOAJ, which dates back to 2003, was at the center of a controversy surrounding the “sting” conducted by John Bohannon in Science, which I covered in 2013. Essentially Bohannon used journals listed in DOAJ to try to find journals that would publish an article of poor quality as long as authors paid a fee. At the time many suggested that a crowdsourced journal reviewing platform might be the way to resolve the problem if DOAJ wasn’t a good source. While such a platform might still be a good idea, the simpler and more obvious solution is the one that seems to have happened: for DOAJ to be more strict with publishers about requirements for inclusion in the directory. 1.
The process of cleaning up the DOAJ has been going on for some time and is getting close to an important milestone. All the 10,000+ journals listed in DOAJ were required to reapply for inclusion, and the deadline for that is December 30, 2015. After that time, any journals that haven’t reapplied will be removed from the DOAJ.
“Proactive Not Reactive”
Contrary to popular belief, the process for this started well before the Bohannon piece was published 2. In December 2012 an organization called Infrastructure Services for Open Access (IS4OA) (founded by Alma Swan and Caroline Sutton) took over DOAJ from Lund University, and announced several initiatives, including a new platform, distributed editorial help, and improved criteria for inclusion. 3 Because DOAJ grew to be an important piece of the scholarly communications infrastructure it was inevitable that they would have to take such a step sooner or later. With nearly 10,000 journals and only a small team of editors it wouldn’t have been sustainable over time, and to lose the DOAJ would have been a blow to the open access community.
One of the remarkable things about the revitalization of the DOAJ is the transparency of the process. The DOAJ News Service blog has been describing the behind-the-scenes processes in detail since May 2014. One of the most useful things is a list of journals that have claimed to be listed in DOAJ but are not. Another important piece of information is the 2015-2016 development roadmap. There is a lot going on with the DOAJ update, however, so below I will pick out what I think is most important to know.
The New DOAJ
In March 2014, the DOAJ created a new application form with much higher standards for inclusion. Previously the form for inclusion was only 6 questions, but after working with the community they changed the application to require 58 questions. The requirements are detailed on a page for publishers, and the new application form is available as a spreadsheet.
While 58 questions seem like a lot, it is important to note that journals need not fulfill every single requirement, other than the basic requirements for inclusion. The idea is that journal publishers must be transparent about the structure and funding of the journal, and that journals explicitly labeled as open access meet some basic theoretical components of open access. For instance, one of the basic requirements is that “the full text of ALL content must be available for free and be Open Access without delay”. Certain other pieces are strong suggestions, but failing to meet them will not cause a journal to be rejected. For instance, the DOAJ takes a strong stand against impact factors and suggests that they not be presented on journal websites at all 4.
To highlight journals that have extremely high standards for “accessibility, openness, discoverability reuse and author rights”, the DOAJ has developed a “Seal” that is awarded to journals who answer “yes” to the following questions (taken from the DOAJ application form):
- have an archival arrangement in place with an external party (Question 25). ‘No policy in place’ does not qualify for the Seal.
- provide permanent identifiers in the papers published (Question 28). ‘None’ does not qualify for the Seal.
- provide article level metadata to DOAJ (Question 29). ‘No’ or failure to provide metadata within 3 months do not qualify for the Seal.
- embed machine-readable CC licensing information in article level metadata (Question 45). ‘No’ does not qualify for the Seal.
- allow reuse and remixing of content in accordance with a CC BY, CC BY-SA or CC BY-NC license (Question 47). If CC BY-ND, CC BY-NC-ND, ‘No’ or ‘Other’ is selected the journal will not qualify for the Seal.
- have a deposit policy registered in a deposit policy directory (Question 51). ‘No’ does not qualify for the Seal.
- allow the author to hold the copyright without restrictions (Question 52). ‘No’ does not qualify for the Seal.
Part of the appeal of the Seal is that it focuses on the good things about open access journals rather than the questionable practices. Having a whitelist is much more appealing for people doing open access outreach than a blacklist. Journals with the Seal are available in a facet on the new DOAJ interface.
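The Seal criteria amount to a simple checklist, which can be expressed as a short Python sketch. This is not DOAJ’s own code: the question numbers and disqualifying answers come from the application form quoted above, but representing the answers as a dictionary of strings keyed by question number is my assumption, and the three-month metadata deadline for Question 29 is not modeled.

```python
# Licenses that the application form says are compatible with the Seal (Question 47).
SEAL_LICENSES = {"CC BY", "CC BY-SA", "CC BY-NC"}

# One check per Seal question; each takes the (assumed) string answer for that question.
SEAL_CHECKS = {
    25: lambda a: a not in ("", "No policy in place"),  # archival arrangement
    28: lambda a: a not in ("", "None"),                # permanent identifiers
    29: lambda a: a == "Yes",                           # article level metadata to DOAJ
    45: lambda a: a == "Yes",                           # embedded CC licensing information
    47: lambda a: a in SEAL_LICENSES,                   # reuse and remixing license
    51: lambda a: a == "Yes",                           # registered deposit policy
    52: lambda a: a == "Yes",                           # author holds copyright
}


def failed_seal_questions(answers):
    """Return the question numbers whose answers would keep a journal from the Seal."""
    return [q for q, ok in SEAL_CHECKS.items() if not ok(answers.get(q, ""))]


def qualifies_for_seal(answers):
    """A journal qualifies only if every Seal question passes."""
    return not failed_seal_questions(answers)
```

Returning the list of failed questions, rather than just a yes or no, makes it easy to tell an editor exactly which policies stand between their journal and the Seal.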
Getting In and Out of the DOAJ
Part of the reworking of the DOAJ was the requirement that all currently listed journals reapply. As of November 19, just over 1,700 journals had been accepted under the new criteria, and just over 800 had been removed (you can follow the list yourself here). For now you can find journals that have reapplied with a green check mark (what DOAJ calls The Tick!). That means that about 85% of journals that were previously listed either have not reapplied, or are still in the verification pipeline 5. While DOAJ does not discuss specific reasons a journal or publisher is removed, they do give a general category for removal. I did some analysis of the data provided in the added/removed/rejected spreadsheet.
At the time of analysis, there were 1776 journals on the accepted list. 20% of these were added since September, and with the deadline looming this number is sure to grow. Around 8% of the accepted journals have the DOAJ Seal.
There were 809 journals removed from the DOAJ, and the reasons fell into the following general categories. I manually checked some of the journals with only 1 or 2 titles, and suspect that some of these may be reinstated if the publisher chooses to reapply. Note that well over half the removed journals weren’t related to misconduct but were ceased or otherwise unavailable.
| Reason for removal | Journals |
|---|---|
| Inactive (has not published in the last calendar year) | 233 |
| Suspected editorial misconduct by publisher | 229 |
| Website URL no longer works | 124 |
| Journal not adhering to Best Practice | 62 |
| Journal is no longer Open Access | 45 |
| Has not published enough articles this calendar year | 2 |
| Other; delayed open access | 1 |
| Other; no content | 1 |
| Other; taken offline | 1 |
| Removed at publisher’s request | 1 |
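For anyone who wants to repeat this kind of tally on the spreadsheet, here is a minimal Python sketch. It assumes the removed-journals sheet has been exported to rows with a column named `Reason`; that column name and the function names are hypothetical, and the actual spreadsheet headers may differ.

```python
from collections import Counter


def tally_reasons(rows, column="Reason"):
    """Count how many journals fall under each stated reason, most common first."""
    counts = Counter(row[column].strip() for row in rows if row.get(column, "").strip())
    return counts.most_common()


def percent(part, whole):
    """Share of `part` in `whole` as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)
```

Using the figures reported above, `percent(229, 809)` puts suspected editorial misconduct at roughly 28% of the 809 removals, which is consistent with the observation that well over half of the removed journals were not removed for misconduct.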
The spreadsheet lists 26 journals that were rejected. Rejected journals will know the specific reasons why their applications were rejected, but those specific reasons are not made public. Journals may reapply after 6 months once they have had an opportunity to amend the issues. 6 The general stated reasons were as follows:
| Stated reason for rejection | Journals |
|---|---|
| Has not published enough articles | 2 |
| Journal website lacks necessary information | 2 |
| Not an academic/scholarly journal | 1 |
| Web site URL doesn’t work | 1 |
The work that DOAJ is doing to improve transparency and the screening process is very important for open access advocates, who will soon have a tool that they can trust to provide much more complete information for scholars and librarians. For too long we have been forced to use the concept of a list of “questionable” or even “predatory” journals. A directory of journals with robust standards and an easy-to-understand interface will be a fresh start for the rhetoric of open access journals.
Are you the editor of an open access journal? What do you think of the new application process? Leave your thoughts in the comments (anonymously if you like).