Harvard University, Boston, MA, February 3, 2017
Do your data management and curation practices support data reuse?
Ixchel M. Faniel, PhD
Research Scientist, OCLC Research
[email protected], @imfaniel

Dissemination Information Packages for Information Reuse (DIPIR)
The DIPIR Project was made possible by a National Leadership Grant from the Institute of Museum and Library Services, LG-06-10-0140-10, Dissemination Information Packages for Information Reuse, and support from OCLC Online Computer Library Center, Inc., and the University of Michigan.

HOW DOES CONTEXTUAL INFORMATION SERVE TO MEANINGFULLY COMMUNICATE DATA TO REUSERS?

Data collection information
Sometimes they'll simply declare we were only interested in broad-based information. We were only collecting broad-based artifacts...So, they're walking huge tracts of land, but they're only hitting big things...I've heard of things like shoulder surveys, where they literally walk side by side and pick those little things, but then, again, you've only, you're doing a very narrow tract. So there are procedures.
- Archaeologist 01

Artifact Information
I actually mean what strata it's from. I was talking about the importance of having a clear stratigraphy. And so, if they had labeled stratigraphy, let's say, A, B, C, D, E, and if they're comparing the fauna from E to A, that tells me that when they excavated, they were really careful about preserving that information.
- Archaeologist 06

Repository Information
I don't give [the archaeological repository] a sort of blanket trust that all the data in there is correct...they provide enough metadata for me to check that on my own...I sort of trust going there because I know that I can find the information I need to validate it.


STANDARDIZATION
Everything is turned into intentionally difficult codes... hundreds of lines you have to translate. It was really important to streamline that translation process.
- Data Producer 3

INTEGRATION
I don't know what kind of format the datasets had before they got integrated, but I believe there was a lot of work.
- Data Reuser 9

Repositories can't reverse producer practices

DATA DOCUMENTATION
Just so you have an idea of the issue with tooth wear...the following [seven] sites all record a single field Payne Wear...[four datasets] all code each tooth in a separate field with a Payne number. But they don't come up with a letter code for the entire specimen...The remaining datasets don't provide Payne Stages.
- Repository Staff 2

Data sharing influence on repositories & reuse

DATA CONDITION
It took 10 times longer to deal with those [coded] datasets.
- Repository Staff 2

DATA PRODUCERS SELECTION
I did think quite carefully about [including] those big subjective descriptions we write about the units..., but I decided to...I couldn't really be sure if people would necessarily want them out but they are an important part of the data set.
- Data Producer 10

Data reuse influence on repositories & sharing

REPOSITORY PROCEDURES
There are some inherent issues with CSV...the simplicity is why it is preferred for interoperability and longevity...we need to give users a few tips on working with CSV. I'm also looking into other open spreadsheet formats.
- Repository Staff 1

DATA PRODUCERS DOCUMENTATION
I had a completely different recording system for [teeth, now I'm] just using...Payne.
- Data Reuser 6
[I'm] just dropping numeric codes, not doing...numeric codes anymore.
- Data Reuser 10

Next Steps
SLO-data

Some Conclusions
Stakeholders' and reusers' needs must be considered throughout the data lifecycle
How do we do it?
Research data management takes time and effort
Who should bear the burden?
How can we lessen the burden?


Institute of Museum and Library Services
Co-PI: Elizabeth Yakel (University of Michigan)
Partners: Nancy McGovern, Ph.D. (MIT), Eric Kansa, Ph.D. (Alexandria Archive, Open Context), William Fink, Ph.D. (University of Michigan Museum of Zoology), Sarah Whitcher Kansa (Alexandria Archive, Open Context)
OCLC Fellow: Julianna Barrera-Gomez
Doctoral Students: Rebecca Frank, Adam Kriesberg, Morgan Daniels, Ayoung Yoon
Masters Students: Alexa Hagen, Jessica Schaengold, Gavin Strassel, Michele DeLia, Kathleen Fear, Mallory Hood, Annelise Doll, Monique Lowe
Undergraduates: Molly Haig

Select References

Faniel, Ixchel M., and Elizabeth Yakel. 2017. Practices Do Not Make Perfect: Disciplinary Data Sharing and Reuse Practices and Their Implications for Repository Data Curation. In Curating Research Data Volume 1: Practical Strategies for Your Digital Repository, 103-126. Chicago, IL: Association of College and Research Libraries Press.

Frank, R., Yakel, E., & Faniel, I. M. (2015). Destruction/reconstruction: Preservation of archaeological and zoological research data. Archival Science, 15(2), 141-167. doi: 10.1007/s10502-014-9238-9

Frank, R. D., Kriesberg, A., Yakel, E., & Faniel, I. M. (2015). Looting hoards of gold and poaching spotted owls: Data confidentiality among archaeologists & zoologists. Proceedings of the Association for Information Science and Technology (ASIS&T), 52.

Kriesberg, A., Frank, R., Faniel, I., & Yakel, E. (2013). The role of data reuse in the apprenticeship process. Proceedings of the Association for Information Science and Technology (ASIS&T), 50.

Faniel, I., Kansa, E., Whitcher Kansa, S., Barrera-Gomez, J., & Yakel, E. (2013). The challenges of digging data: A study of context in archaeological data reuse. Proceedings of the Joint Conference on Digital Libraries (JCDL), 295-304.

Yakel, Elizabeth, Ixchel Faniel, Adam Kriesberg, and Ayoung Yoon. 2013. Trust in Digital Repositories. International Journal of Digital Curation 8 (1): 143-56. doi:10.2218/ijdc.v8i1.251.

Daniels, M., Faniel, I., Fear, K., & Yakel, E. (2012). Managing fixity and fluidity in data repositories. In Mai, J. (Ed.), Proceedings of the 2012 iConference (pp. 279-286). New York: ACM.

Additional references for the DIPIR project:

Critical Perspectives on the Practice of Digital Archaeology

Thank you
Ixchel M. Faniel, PhD
Research Scientist
[email protected], @imfaniel

