Job Analyses of Earth Science Data Librarians and Data Managers

This study's purpose is to capture the skills of Earth science data managers and librarians through interviews with current job holders. Job analysis interviews were conducted with 14 participants (six librarians and eight data managers) to assess the types and frequencies of job tasks. Participants identified tasks related to communication, including collaboration, teaching, and project management activities. Data-specific tasks included data discovery, processing, and curation, which require an understanding of the data, technology, and information infrastructures to support data use, reuse, and preservation. Most respondents had formal science education and six had a master's degree in Library and Information Science. Most of the knowledge, skills, and abilities for these workers were acquired through on-the-job experience, but future professionals in these careers may benefit from tailored education informed through job analyses.


Literature review
Today's researchers face large-scale challenges as they try to navigate massive quantities of data, work across disciplinary boundaries, and keep pace with the requirements of data management plans (DMPs) and preservation needs (Jaguszewski and Williams 2013). Working across data types and scientific disciplines requires that data managers and librarians evolve and, in many instances, new job titles have emerged. Regardless of job title, these science data managers and librarians adopt innovative roles as liaisons within the science community. In these roles, data managers and librarians are not ancillary passive data providers but active participants in the research process (Rockenbach et al. 2015). Within the context of this change, more studies of the knowledge, skills, and abilities (KSA) held by current data managers and librarians are needed in order to create relevant coursework, write job descriptions, and anticipate developing trends. Park et al. (2009) summarized studies conducted over a decade, reporting "a variety of issues and trends facing the cataloging [LIS] profession" and the need to examine emerging skill sets and "identify roles and competencies engendered by the emergence of the digital environment" from the current roles, qualifications, and skill sets.
Earth science data librarian and data manager workforce. Data management roles are institution-dependent and can include science librarians, data managers, data scientists, and other data-centric job titles. Many research sponsors ask for DMPs within funding proposals. Therefore, there is a heightened focus on how researchers manage their project data throughout the data life cycle. Hence, to keep pace with funder mandates and new expectations for open science, it is necessary to have dedicated, trained data curation specialists ensuring data are usable and accessible into the indefinite future. At any point in the data life cycle, "data curation provides distinctive roles separate from those methods conducted by scientists for science" and requires dedicated staff who may be charged with many other tasks before and after other analyses (Bishop and Hank 2018). Scientists have to manage their own data to a degree, unless they can fund a dedicated data manager. Often parts of the data life cycle [e.g., minting a digital object identifier (DOI)] are done by data managers at large institutions after data creation. At that point, other data managers normalize the metadata to meet any institutional standards.
Many scientists have been looking to Library and Information Science (LIS) for guidance on how to manage their data. Still, there is confusion among primary investigators, librarians or data managers, and information technology professionals as to how these research data management (RDM) duties should be distributed; in other words, who should do what (Antell et al. 2014). Kennan (2016) documented interviews of 25 data professionals within universities and scientific research organizations. Few interviewees had formal technical training in data management, and only some reported learning such skills through on-the-job training in classes or seminars. Many skills that the data managers reported were soft skills, including teamwork, knowledge sharing, and communication. Another study found science domain knowledge is necessary to work with science data, but equally important are the data management skills related to communication, project management, community building, and data-specific tasks (Bishop and Hank 2018). In her book chapter, "A biologist adapts to librarianship," Wainscott echoes the importance of these transferable skills. Not only was the science data knowledge beneficial to her career change, but also project management and technical writing skills (Wainscott 2014). To bridge this knowledge gap, several universities began training within various STEM departments to increase awareness of RDM in the hard sciences and also in information science programs. The experimental training initiatives proved most beneficial when focused on domain-specific data needs (Wittenberg et al. 2018). Federer (2018) studied biomedical and health science data librarians and identified two distinct professional groups: "subject specialists" who specialized in a small number of tasks, and "data generalists" who worked across multiple fields. Those working across multiple fields tended to rate many more tasks as important to their jobs. Some science institutions are highly heterogeneous, and data management needs vary drastically based on research focus areas and resources, which would require more specialization than general training could provide.
There is still a notable lack of workforce studies to provide an indication of the extent of professionals in these areas, as reporting on the extent of data librarianship and data management is problematic. For example, the U.S. Bureau of Labor Statistics does not keep data on the number of these jobs. There may be disparity between job title and job responsibilities, making identification and reporting challenging. Also, many doing this type of work may consider themselves scientists and not view their data management work as a separate career path. These distinctions are no more easily made in higher education, with its variety of curricular structures and program names related to data science, which may or may not contain data management coursework. Graduate degrees in LIS focus on research data management training to prepare students for these very careers. In other domains, it is difficult to tease out the content of particular courses without conducting a deep dive into the syllabi across multiple programs. By studying those currently working within these data jobs, formal education and training requirements can be solidified by those most qualified: current job holders.
Job analyses. Job analyses are research on jobs as they currently exist and/or have existed. The KSA required for effective job performance are secondary artifacts of this work and may inform training, certification, and education in most formal professions. Data collection can either be 1) observational, such as documenting the behaviors and the tools needed to accomplish work tasks, or 2) verifiable, including a list of elements within the job environment (Singh 2008). With more resources and time, workers could be readily observed or asked to do task inventories that compile in great detail what a workday or work week looks like. These job analysis approaches work best for jobs with observable tasks, but data work is mostly intellectual work that is not easily observable (Raymond 2001). Also, job descriptions provide expectations of educational background, professional experience, and task expectations. While job descriptions provide insights into ideal candidates and job expectations, this approach may not capture what job applicants do once hired or which attributes matter the most to perform the job tasks.
As this study's purpose is to capture the KSA that eventually will inform the curriculum for Earth science data managers and librarians, iSchool educators, and librarians, the Developing a Curriculum (DACUM) approach was used (Hermann 1987). Although there are many options for curriculum development, a DACUM is the best first step to create a list of KSA, operationalized job descriptions, and, eventually, educational learning outcomes. A scope statement starts the DACUM process by having a consensus discussion with job incumbents about aspects that are and are not part of the job.
The DACUM approach is based on three core principles:
1) Job incumbents (i.e., current job holders) know their job better than anyone else, and therefore workers are the best at describing what it is they do. Job incumbents are currently working in the field and are not necessarily leaders or educators in the field.
2) The best way to define a job is by describing the specific tasks that are performed on the job. Earth science data managers and librarians, like others doing intellectual work, routinely perform tasks that might be challenging to describe (i.e., thinking). Yet, the information professionals who are actually performing the jobs currently should be the best source to explain what those tasks are and are not.
3) All tasks performed on a job require the use of knowledge, skills, and abilities for successful job performance.
The following presents the DACUM method used in this study.

Method
The DACUM approach includes a qualitative, semi-structured interview method. To inform the curriculum in iSchools as well as in domain sciences for Earth science data managers and librarians, the study recruited job incumbents in these fields. The scientific disciplines targeted reflect the areas of interest of the authors, specifically atmospheric and other Earth sciences. Institutional Review Board approval was gained prior to any recruitment for data collection. A convenience sample was used, and only participants currently employed as Earth science data managers or librarians were recruited. Recruitment took place through the Atmospheric Science Librarians International (ASLI), the Federation of Earth Science Information Partners (ESIP), and other science conferences with data managers in attendance, which included some who worked at Woods Hole Oceanographic Institution (WHOI) and the Lamont-Doherty Earth Observatory at Columbia University. This recruitment potentially reached over 100 librarians and data managers, and 14 agreed to be interviewed. The interview consisted of thirteen questions inquiring about their jobs, tasks, work experience, and educational background. The resulting interview schedule is provided.
1) What is your current job title?
2) How many years in total have you been working in your current job?
3) How many years in total have you been working with Earth science data (including relevant higher education)?
4) Please indicate your credentials and degrees.
5) Please provide any other education or training you have received that is applicable to performing your job.
6) What are some daily tasks associated with the job?
7) What are some weekly tasks associated with the job?
8) What are some less frequent tasks associated with the job?
9) Please provide any other feedback about this project.
The interviews were conducted and recorded via Zoom online meeting rooms, lasted approximately 30-40 min each, and were then transcribed and coded in NVivo. Two of the authors coded the transcriptions and inductively created and shared a codebook. Percentage agreement was calculated, but more advanced inter-rater reliability was not calculated due to little variation in responses. The coders identified common themes across responses.
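The agreement percentage mentioned above can be illustrated with a short sketch. It assumes simple percent agreement, that is, the share of transcript segments to which both coders assigned the same code; the study does not report its exact calculation. The function name and the example codes are hypothetical and are not data from this study.

```python
from typing import Sequence


def percent_agreement(coder_a: Sequence[str], coder_b: Sequence[str]) -> float:
    """Return the percentage of segments that both coders labeled identically."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must code the same segments")
    if not coder_a:
        raise ValueError("At least one coded segment is required")
    # Count the positions where the two coders chose the same code.
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# Hypothetical codes for five transcript segments (illustration only).
coder_1 = ["communication", "collaboration", "outreach", "teaching", "data tasks"]
coder_2 = ["communication", "collaboration", "outreach", "data tasks", "data tasks"]

print(percent_agreement(coder_1, coder_2))  # 80.0
```

Percent agreement does not correct for agreement expected by chance, which is why chance-corrected statistics are usually treated as the more advanced reliability measures.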

Results
Job titles, work settings, experience, and education. Participants described their job titles as Librarian (6) (with titles such as data engineering librarian and research data librarian); Manager (4); and one each Project Scientist, Director, Staff Associate, and Research Scholar. For the purpose of analyzing these results, six participants identified as Earth science librarians and eight as data managers. The average time working in science overall, including formal education, was approximately 16 years. Years in their current job ranged from less than 1 year to 16 years.
All participants held both bachelor's and master's degrees. Bachelor's degrees were in Math and Physics (2); Biology (2); Marine Biology (2); and one each from Geology, Geological Engineering, Physical Anthropology, Meteorology, History, Engineering Mechanics, English and Religious Studies, and Fine Arts. Master's degrees were in LIS (6); Geology (2); and one each from Astronomy, Meteorology, Biology, Physics, Ecology, and Marine Biology and Biological Oceanography. Three participants also held doctoral degrees in Information Technology, Ecology and Evolutionary Biology, and Physics. In addition, seven participants held one or more professional certifications in computer systems administration (2), project management, Agile Scrum Master, science and technology policy, various GIS applications, database administration, and Data Carpentry instructor. Participants had previous work experience in academic libraries (7), data repositories and data management (4), research laboratories or science agencies (6), teaching (2), and the aerospace industry (1).
On-the-job training, tools, and necessary skills. The Earth science data managers and librarians interviewed held wide and diverse skill sets. Participants mentioned learning new programming languages, software and frameworks, Data Carpentry, metadata standards, and information modeling. To perform their jobs, all participants mentioned the importance of supplemental training or self-taught skills, often specific to whatever projects arise. According to Participant 005, "Tools were dependent on who I was working with and what they were doing. A lot of my tools are things to figure out how to get the tools they were using and get sort of familiar with it!" Although many learned these technical skills while on the job, six participants mentioned that they wished they had gained such experience before starting their professional careers. Table 1 presents a breakdown of the type of tools used by participants.
Job tasks. Participants described their daily, weekly, and less frequent tasks; however, the frequency was difficult to determine because some participants' daily tasks were others' weekly tasks or less frequent tasks, and formulating a concise list proved difficult. Participants would begin speaking about daily tasks and seamlessly move to other less frequent tasks without a clear delineation. Therefore, a more meaningful frequency analysis did not occur. The five most frequently described tasks are outlined below.
Communication. The most frequently described job task, communication, appeared in all 14 interviews. Although communication occurs and is important in every profession, participants emphasized these "soft skills" over other domain knowledge or applied skills. Communication skills included verbal and written communication, overall "people skills," and the ability to handle stressful situations. Participant 005 stated soft skills are "way under-valued by a lot of people! Because people are oftentimes stressed." Six participants commented on the importance of understanding scientific methods in order to better communicate with the scientists they serve. Participant 013 mentioned the importance of being able to communicate with those working in information technology who may not necessarily have a research or science background. The role as a bridge or intermediary across domains resonated in other participants' comments.
Communication relies on an understanding between information professionals and their users. Indeed, Earth science data managers and librarians adapt to the needs of the data producers as both evolve. The "collaborative nature of data management [and library] work" entails efficient and effective communication between the information professionals and the users. Participant 008 eloquently described the role of the data managers: "I would say we're more trying to support the science community rather than doing real science ourselves." By acting as liaisons to specific scientific communities, both data managers and librarians gather pieces of information needed to complete data-specific tasks, and communication skills are pervasive throughout all aspects of these jobs.
Collaboration. Collaboration was also a consistent theme throughout the interviews. All participants collaborated with peers, fellow staff members, students, faculty, and researchers on a daily, weekly, and less frequent basis depending on the scope of the work. Nine participants emphasized the importance of determining research data needs as part of that collaboration. Overall, the general impression was that data managers often provided the support system throughout the data life cycle within their institutions and collaboration in the research enterprise was inherent to their jobs.
Nine participants also stressed the necessity of collaboration within their work teams. Five data managers specified the importance of weekly or biweekly team meetings. Participant 007 stated they hold kickoff meetings at the start of a project to get everyone on the same page and plan out everything rather than constantly relying on emails throughout a project. This internal communication allows data managers to conduct their work consistently across team members, keep each member accountable, and bounce ideas and strategies off each other for tackling different datasets and projects. Participant 014 mentioned project management as an important part of their position, further stating, "I think the surprising part of all this is how much it matters just to build that interpersonal relationship. I feel like now we just trust each other, and we work well as a unit, and we can tackle all sorts of things."
Outreach. Eleven participants included outreach in their job tasks. They inform students, faculty, researchers, and scientists that they exist, are available to help with any data questions, and can guide them through the data management process. Two data managers and one Earth science librarian mentioned updating community websites to market services outside of their institutions. Other outreach tasks included organizing programs (6), participation in committees (4), and giving presentations or workshops (4). They are recruited for research projects and to attend and present at conferences, and they market their services across their organizations and communicate directly with department subject librarian liaisons.
Teaching and research. Four Earth science librarians teach courses or workshops as part of their job. One librarian is also a university professor who teaches 2- or 3-credit-hour courses on RDM. Two other librarians prepared and taught data-related workshops to faculty and students. Participant 003 noticed "some students really take [RDM] and run with it. Other students don't really understand the applicability of it as much, but the students who take it and run with it become the data managers for their labs and start to implement best practices in their labs." Participant 009 takes pieces of workshops they normally teach and combines them into unique guest lecturer presentations depending on what faculty members request. Two participants learned teaching skills on the job, something they wished they had known prior to starting. While formal instruction tasks are not commonly included in data manager roles, one participant stated: "You have to teach…sometimes, your scientist asks what and why you're doing certain things." They see scientists' data all the time and usually help them clean up the datasets before depositing them so "their datasets can be reused by somebody else who has no background in what happened with their datasets." This type of one-to-one instruction may not be verbalized as teaching by data managers, but it is an educational part of the job.
Data-specific tasks. Ten participants described data-related tasks as large components of their jobs, with seven noting data curation tasks, including cleaning and checking data (7), processing data to be uploaded to websites (4), and entering data (2). Within the process of making datasets interoperable for web applications and other databases, data must be given appropriate metadata, archived, and given a persistent identifier (e.g., DOI). Participant 007 said the process is lengthy and involves many unique steps: "it's a 70-80 step process, some automatic, some manual, but these are the steps that we take, once our science team develops a dataset and how to get it into our production website." One of the seven participants who worked on data curation tasks mainly checked journal alerts for new datasets to enter and upload to the organization's website. Two participants worked in more liaison-specific roles, focusing on connecting individuals with data experts; a third participant focused on teaching data management, visualization, and curation workshops and classes; and a fourth participant primarily focused on website content and usability studies.
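As a rough illustration of the curation checks described above (metadata present, persistent identifier assigned), the following sketch flags records that are not yet ready for deposit. It is not a procedure reported by any participant: the required field names are hypothetical, and the DOI pattern is a commonly used approximation of modern DOI syntax rather than a full validator.

```python
import re

# Approximate shape of a modern DOI: "10.", a 4-9 digit registrant code, "/", a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Hypothetical minimum metadata fields; real repositories define their own schemas.
REQUIRED_FIELDS = ("title", "creator", "description", "identifier")


def curation_issues(record: dict) -> list:
    """List the problems that would block depositing this dataset record."""
    issues = [f"missing field: {name}" for name in REQUIRED_FIELDS if not record.get(name)]
    identifier = record.get("identifier", "")
    # Only check the identifier's shape when one has actually been supplied.
    if identifier and not DOI_PATTERN.match(identifier):
        issues.append("identifier is not DOI-shaped")
    return issues


# Hypothetical record (illustration only).
record = {
    "title": "Buoy temperature series",
    "creator": "Example Lab",
    "identifier": "10.1234/example.5678",
}
print(curation_issues(record))  # ['missing field: description']
```

A check like this covers only the automatable part of such a workflow; the manual steps participants described, such as cleaning data with the originating scientist, remain outside its scope.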
Four participants also highlighted data storage. One Earth science librarian stated, "I answer a lot of questions about available data storage on campus and help them identify appropriate data storage options here on campus." In their roles, data managers' work involves data curation tasks such as writing plans for researchers (3), answering questions involving short-term and long-term data storage (2), and actively uploading data to repositories (2). Finally, a librarian mentioned helping researchers and students find secondary data for research purposes.

Discussion
This study attempts to capture the skills of Earth science data librarians and data managers. Recruiting participants from other science disciplines may lead to other job tasks and different results. Still, some saturation in responses from 14 participants and clear themes in job tasks emerged, even if not generalizable. The participants had a great deal of work experience, and their discussion of job titles, job tasks, and other aspects of their work can inform future coursework and more quantitative research on these jobs now that frequent job tasks have been validated.
Job titles meaningless. In many data-intensive organizations, job titles and classifications were created long before current job holders. For example, the academic librarian role has rapidly evolved with changes in technology and user needs despite the job title remaining unchanged. Therefore, in nearly all cases, the titles of "data librarian" and "data manager" did not directly indicate which, if any, data-specific tasks were part of their jobs. The data management duties might be assumed to be a significant portion of the job for data managers, but further granularity about which data-specific tasks are involved is not possible with generic job titles within science organizations such as Project Scientist. All the data managers' education and work experience were for other roles in science, but over time data tasks became more central. This may explain the relative ambiguity of job titles in science, as any scientist may perform data management tasks. This may devalue the expense of those in data management roles if others can do these tasks as part of other jobs.
Education varied. Eleven participants had STEM backgrounds before working as science data managers or librarians. A STEM background means these information professionals are already familiar with science data and social norms in those communities. Two participants said this knowledge helps them best capture and organize data. On the other hand, having a formal LIS education allows librarians to understand how data are curated, accessed, and used. Most participants mentioned they learned aspects of field-specific knowledge or technical data management skills on the job.
Eight participants have STEM undergraduate and non-LIS graduate degrees; three have non-STEM undergraduate and LIS graduate degrees; and three participants have both STEM undergraduate and LIS graduate degrees. Seeing the value of the education in both, Participant 009 shared they are currently working on their second master's in LIS, while holding a librarian position: "There's a lack of librarians in academic libraries that have a STEM background and so, [...] they opened up the job posting requirements to include people with PhDs in STEM." The necessary KSA to perform these job tasks eclipse the discipline name on the degree. To many employers, the ideal remains a combination of both LIS and domain-specific knowledge, but without specialized coursework in either the domain or in LIS there will continue to be a dearth of qualified applicants. Participant 013 alluded to this unique combination of skills: "try to find the person who knows how to do, you know, both the science, the data management, and the IT. That's a rare mix." Job tasks listed in job descriptions would benefit from a shared terminology of data curation to outline what parts of data management a job entails.
For participants charged with new data management duties but lacking a science background, and for those doing data curation without education in it, the knowledge gap is clear and must be filled on the job. While more experience with science data results in improved KSA, some data curation training could help resolve potential interoperability and reuse issues when data must be shared broadly. Many in-house solutions and/or project-specific metadata jargon might be locally useful but problematic for aggregating science data.

Conclusions
This study's job analysis approach is one attempt, and a method others can use, to inform education with the current job tasks described by those doing the work. This is more important than ever in a world driven by data, machine learning, and artificial intelligence. Science data present unique challenges and require specific training, yet many traditional skills learned in graduate LIS programs remain critical. Organizations like ASLI perform outreach to students at conferences about these careers. One author met and communicated with the ASLI Membership Chair when researching LIS and iSchool graduate programs as a result of this personal interaction. "You may not know what you really want to do with [LIS] until you start school" (A. Orehek 2018, personal communication). Students can also refer to these organizations' online resources for career and further education. Many of the core concepts related to information literacy are sets of abilities enabling individuals to "recognize when information is needed and have the ability to locate, evaluate, and use effectively the needed information" (American Library Association 1989). These information literacy skills may be transferable across domains and to data, but there is no substitute for domain knowledge of science and science data when assisting others and managing it.
Clearly, further research needs to be conducted to validate the tasks mentioned by these participants across science data managers and librarians in other domains. Future work could validate these job tasks by surveying a larger number of participants for more generalizability on typical job tasks. These most frequent job tasks require workers to gain the KSA needed in curriculum to work in these emerging areas. Further, more research will help to distinguish the differences and similarities between the jobs of data managers and Earth science librarians and domain scientists. The similarities in the jobs from these fourteen participants indicate that data curation coursework would benefit anyone moving into these careers as data life cycle tasks are part of every job. A review of job descriptions over time may show changes to the expectations from employers, but those current workers will be the best data points for describing the actual work being done now. Regardless of what any data job is called, those working to facilitate science need to be supported and trained for the benefit of all science.