Data, data everywhere…but do we want to drink?

The role of data, digital curation, and scholarly communication in academic libraries.

Ask around and you’ll hear that data is the new bacon (or turkey bacon, in my case. Sorry, vegetarians). It’s the hot thing that everyone wants a piece of. It is another medium with which we interact and derive meaning from. It is information[1]; potentially valuable and abundant. But much like [turkey] bacon, un-moderated gorging, without balance or diversity of content, can raise blood pressure and give you a heart attack. To understand how best to interact with the data landscape, it is important to look beyond it.

What do academic libraries need to know about data? A lot, but in order to separate the signal from the noise, it is imperative to look at the entire environment. To do this, one can look to job postings as a measure of engagement. The data curation positions, research data services departments, and data management specializations focus almost exclusively on digital data. However, these positions, which are often catch-alls for many other things do not place the data management and curation activities within the larger frame of digital curation, let alone scholarly communication. Missing from job descriptions is an awareness of digital preservation or archival theory as it relates to data management or curation. In some cases, this omission could be because a fully staffed digital collections department has purview over these areas. Nonetheless, it is important to articulate the need to communicate with those stakeholders in the job description. It may be said that if the job ad discusses data curation, digital preservation should be an assumed skill, yet given the tendencies to have these positions “do-all-the-things” it is negligent not to explicitly mention it.

Digital curation is an area that has wide appeal for those working in academic and research libraries. The ACRL Digital Curation Interest Group (DCIG) has one of the largest memberships within ACRL, with 1075 members as of March 2015. The interest group was intentionally named “digital curation” rather than “data curation” because the founders (Patricia Hswe and Marisa Ramirez) understood the interconnectivity of the domains and that the work in one area, like archives, could influence the work in another, like data management. For example, the work from Digital POWRR can help inform digital collection platform decisions or workflows, including data repository concerns. This Big Tent philosophy can help frame the data conversations within libraries in a holistic, unified manner, where the various library stakeholders work collaboratively to meet the needs of the community.

The absence of a holistic approach to data can result in the propensity to separate data from the corpus of information for which librarians already provide stewardship. Academic libraries may recognize the need to provide leadership in the area of data management, but balk when asked to consider data a special collection or to ingest data into the institutional repository. While librarians should be working to help the campus community become critical users and responsible producers of data, the library institution must empower that work by recognizing this as an extension of the scholarly communication guidance currently in place. This means that academic libraries must incorporate the work of data information literacy into their existing information literacy and scholarly communication missions, else risk excluding these data librarian positions from the natural cohort of colleagues doing that work, or risk overextending the work of the library.

This overextension is most obvious in the positions that seek a librarian to do instruction in data management, reference, and outreach, and also provide expertise in all areas of data analysis, statistics, visualization, and other data manipulation. There are some academic libraries where this level of support is reasonable, given the mission, focus, and resourcing of the specific institution. However, considering the diversity of scope across academic libraries, I am skeptical that the prevalence of job ads that describe this suite of services is justified. Most “general” science librarians would scoff if a job ad asked for experience with interpreting spectra. The science librarian should know where to direct the person who needs help with reading the spectra, or finding comparative spectra, but it should not be a core competency to have expertise in that domain. Yet experience with SPSS, R, Python, statistics and statistical literacy, and/or data visualization software find their way into librarian position descriptions, some more specialized than others.

For some institutions this is not an overextension, but just an extension of the suite of specialized services offered, and that is well and good. My concern is that academic libraries, feeling the rush of an approved line for all things data, begin to think this is a normal role for a librarian. Do not mistake me, I do not write from the perspective that libraries should not evolve services or that librarians should not develop specialized areas of expertise. Rather, I raise a concern that too often these extensions are made without the strategic planning and commitment from the institution to fully support the work that this would entail.

Framing data management and curation within the construct of scholarly communication, and its intersections with information literacy, allows for the opportunity to build more of this content delivery across the organization, enfranchising all librarians in the conversation. A team approach can help with sustainability and message penetration, and moves the organization away from the single-position skill and knowledge-sink trap. Subject expertise is critical in the fast-moving realm of data management and curation, but it is an expertise that can be shared and that must be strategically supported. For example, with sufficient cross-training liaison librarians can work with their constituents to advise on meeting federal data sharing requirements, without requiring an immediate punt to the “data person” in the library (if such a person exists). In cases where there is no data point person, creating a data working group is a good approach to distribute across the organization both the knowledge and the responsibility for seeking out additional information.

Data specialization cuts across disciplinary bounds and concerns both public services and technical services. It is no easy task, but I posit that institutions must take a simultaneously expansive yet well-scoped approach to data engagement – mindful of the larger context of digital curation and scholarly communication, while limiting responsibilities to those most appropriate for a particular institution.

[1] Lest the “data-information-knowledge-wisdom” hierarchy (DIKW) torpedo the rest of this post, let me encourage readers to allow for an expansive definition of data. One that allows for the discrete bits of data that have no meaning without context, such as a series of numbers in a .csv file, and the data that is described and organized, such as those exact same numbers in a .csv file, but with column and row descriptors and perhaps an associated data dictionary file. Undoubtedly, the second .csv file is more useful and could be classified as information, but most people will continue to call it data.

Yasmeen Shorish is assistant professor and Physical & Life Sciences librarian at James Madison University. She is a past-convener for the ACRL Digital Curation Interest Group and her research focus is in the areas of data information literacy and scholarly communication.

How is programming work supported (or not…) by administrators in libraries?

[Editor’s Note:  This post is part of a series of posts related to ACRL TechConnect’s 2015 survey on Programming Languages, Frameworks, and Web Content Management Systems in Libraries.  The survey was distributed between January and March 2015 and received 265 responses.  The first post in this series is available here.]

In our last post in this series, we discussed how library programmers learn about and develop new skills in programming in libraries.  We also wanted to find out how library administrators or library culture in general does or does not support learning skills in programming.

From anecdotal accounts, we hypothesized that learning new programming skills might be impeded by factors including lack of access to necessary technologies or server environments, lack of support for training, travel or professional development opportunities, or overloaded job descriptions that make it difficult to find the time to learn and develop new skills.  While respondents to our survey did in some cases indicate these barriers, we actually found that most respondents felt supported by their administration or library to develop new programming skills.

Most respondents feel supported, but lack of time is a problem

The question we asked respondents was:

Please describe how your employing institution either does or does not support your efforts to learn or improve programming or development skills. “Support” can refer to funding, training, mentoring, work time allocation, or other means of support.

The question was open-ended, enabling respondents to provide details about their experiences.  We received 193 responses to this question and categorized responses by whether they overall indicated support or lack of support.  74% of respondents indicated at least some support for learning programming by their library administration, while 26% report a lack of support for learning programming.

Of those who mentioned that their administration or supervisors provide a supportive environment for learning about programming, the top kind of support mentioned was training, closely followed by funding for professional development opportunities.  Flexibility in work time was also frequently mentioned by respondents.  Mentoring and encouragement were mentioned less frequently.


However, even among those who feel supported in terms of funding and training opportunities, respondents indicated that time to actually complete training or professional development, is, in practice, scarce:

Work time allocation is a definite issue – I’m the only systems librarian and have responsibilities governing web site, intranet, discovery layer, link resover, ereserve system, meeting room booking system and library management system. No time for deep learning.

Low staffing often contributes to the lack of time to develop skills, even in supportive environments:

They definitely support developing new skills, but we have a very small technology staff so it’s difficult to find time to learn something new and implement it.

Respondents indicated the importance to their employers of aligning training and funding requests with current work projects and priorities:

I would be able to get support in terms of work time allocation, limited funding for training. I’m limited by external control of library technology platforms (centrally administrated), need to identify utility of learning language to justify training, use, &c.

26% of respondents indicate a lack of support for learning programming

Of those respondents who indicated that their workplace is not supportive of programming professional development or learning opportunities, lack of funding and training was the most commonly cited type of support that respondents found lacking.

Lack of  Funding and Training

The main lack of support comes in the form of funding and training. There are few opportunities to network and attend training events (other than virtually online) to learn how to do my job better. I basically have to read and research (either with a book or on the web) to learn about programming for libraries.

Respondents mentioned that though they could do training during their work hours, they are not necessarily funded to do so:

I am given time for self-education, but no formal training or provision for formal education classes.

Lack of Mentoring / Peer Support

Peer support was important to many respondents, both in supportive and unsupportive environments.  Many respondents who felt supported mentioned how important it was to have colleagues in their workplace to whom they can turn to get advice and help with troubleshooting.  Comments such as this one illustrate the difficulty of being the only systems or technology support person in one’s workplace:

They are very open to supporting me financially and giving me work time to learn (we have an institutional license to and they have funded off site training), but there is not a lot of peer support for learning. I am a solo systems department and most of our campus IT staff are contractors, so there is not the opportunity for a community of colleagues to share ideas and to learn from each other.

Understaffing / Low Pay for Programming Skills

Closely related to the lack of peer support, respondents specifically mentioned that being the only technical staff person at their institution can make it difficult to find time for learning, and that understaffing contributes to the high workload:

There’s no money for training and we are understaffed so there’s no time for self-taught skills. I am the only non-Windows programmer so there’s no one I can confer with on programming challenges. I learn whatever I need to know on the fly and only to the degree it’s necessary to get the job done.

I’m the only “tech” on site, so I don’t have time to learn anything new.

One respondent mentioned that pay for those with programming skills is not competitive at his or her institution:

We have zero means for support, partially due to a complex web of financial reasons. No training, little encouragement, and a refusal to hire/pay at market rates programming staff.

Future Research and Other Questions

As with the first post in this series, the analysis of the data yields more questions than clear conclusions.  Some respondents indicated they have very supportive workplaces, where they feel like their administration and supervisors provide every opportunity to develop new skills and learn about the technologies they want to learn about.  Others express frustration with the lack of funding or ability to collaborate with colleagues on projects that require programming skills.

One question that requires a more thorough examination of the data is whether those whose jobs do not specifically require programming skills feel as supported in learning about programming as those who were hired to be programmers.  30% of survey respondents indicated that programming is *not* part of their official job duties, but that they do programming or similar activities to perform job functions.  Initial analysis indicates there is no significant difference between these respondents and respondents as a whole.  However, there may be differences in support based on the type of position one has in a library (e.g., staff, faculty, or administration), and we did not gather that information from respondents in this survey.  At least two respondents, however, indicates that this may be the case at least at some libraries:

Training & funding is available; can have release time to attend; all is easier for librarians to obtain than for staff to obtain which is sad since staff tend to do more of the programming

Some staff have a lot of support, some have nill, it depends on where/what project you are working on.

In the next (and final) post in this series, we’ll explore some preliminary data on popular programming languages in libraries, and examine how often library programmers get to use their preferred programming languages in their work.