Elena Colón-Marrero is a second-year MSI student specializing in Archives and Records Management and Preservation of Information. She is the latest participant in The Archivist’s Apprentice series, which follows School of Information students as they complete their internships.
Digging for Buried Treasure
When I got my internship outline in early May, I was a bit overwhelmed by everything that I was meant to accomplish. I felt like I was being sent out to sea on a piece of floating wood. Fortunately, my supervisors at the Mudd Library instantly greeted me with support and enthusiasm. They realized that the outline was ambitious. They eased my fears of what could have been treacherous waters by bringing me aboard their ship and outfitting me with all the gear I would need. The first week I learned the ropes and slowly started to inch my dinghy towards the water.
After the first week I was given my first big task: to conduct a survey of the digital media present within the collections that Mudd houses. It was imperative to get an idea of how many floppy disks, CD-ROMs, DVDs, Zip disks, etc. are contained in the archive. Unlike an early modern book, you can’t leave digital media on the shelf to preserve it. Different types of digital media are quickly becoming obsolete as technology advances. Extracting the data on those materials needs to be done sooner rather than later.
I wasn’t sure how this was supposed to be accomplished. My outline said the words xQuery and Access Database, but I only had a vague understanding of what it all meant. My deductive reasoning told me that I would be doing a search (query) of the collections. I had limited experience using Access as a resource, but I had never created anything in it. Jarrett Drake, the Digital Archivist, sat me down and carefully explained every step in the process.
Here’s the quick and dirty recap:
- xQuery uses a set of commands, developed by one of the librarians at Firestone Library, to search through all of the designated XML finding aids for pre-determined keywords (floppy, CD, DVD, diskette, disks, and the like).
- It then extracts the portions of each finding aid containing those keywords and outputs them into a final XML document.
- Using the output, I looked through each result and listed the collection number and the box numbers before searching in the stacks.
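The steps in the recap can be sketched in a few lines of code. The actual search was written in xQuery and is not reproduced in this post, so the Python below is only a stand-in that follows the same outline: scan XML finding aids for keywords, pull out the matching pieces, and gather them into one XML document. The element names (`ead`, `c`, `unittitle`, `container`) are borrowed from the EAD finding aid format, while the collection numbers, the sample text, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Keywords from the recap above. Matching is a plain,
# case-insensitive substring test.
KEYWORDS = ["floppy", "cd", "dvd", "diskette", "disks"]


def find_media_mentions(finding_aid_xml, keywords=KEYWORDS):
    """Return the elements whose own text mentions any keyword."""
    root = ET.fromstring(finding_aid_xml)
    hits = []
    for element in root.iter():
        text = (element.text or "").lower()
        if any(keyword in text for keyword in keywords):
            hits.append(element)
    return hits


def build_report(finding_aids):
    """Collect the hits from every finding aid into one XML document."""
    report = ET.Element("results")
    for collection_id, xml_text in finding_aids.items():
        for element in find_media_mentions(xml_text):
            hit = ET.SubElement(report, "hit", collection=collection_id)
            hit.text = element.text.strip()
    return report


# Hypothetical finding aids, keyed by a made-up collection number.
SAMPLE = {
    "AC001": (
        "<ead><c><container>Box 3</container>"
        "<unittitle>Floppy disks, 1992</unittitle></c></ead>"
    ),
    "AC002": "<ead><c><unittitle>Correspondence, 1950-1960</unittitle></c></ead>",
}

report = build_report(SAMPLE)
print(ET.tostring(report, encoding="unicode"))
```

Only the first sample collection produces a hit, and the report keeps the collection number next to the matching text, which is the information needed before heading into the stacks.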
The xQuery results gave me a map to find the media hidden within the collections. The xQuery wasn’t perfect though. Since it was looking for keywords and phrases, any sentence that contained those strings appeared in the results. A number of times I was greeted by the initials of creators. I found that the word “disk” could describe a number of things other than a floppy.
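A small sketch shows why string matching turns up these false leads. It is not the query that was actually run; the keyword list, the sample descriptions, and the function names are all invented for illustration. It compares a plain substring test with a whole-word test built on Python's `re` module.

```python
import re

KEYWORDS = ["floppy", "CD", "DVD", "diskette", "disk"]


def substring_hits(sentence):
    """Keywords that appear anywhere in the sentence, even inside other words."""
    lowered = sentence.lower()
    return [k for k in KEYWORDS if k.lower() in lowered]


def whole_word_hits(sentence):
    """Keywords that appear as whole words, with an optional plural 's'."""
    return [
        k
        for k in KEYWORDS
        if re.search(rf"\b{re.escape(k)}s?\b", sentence, flags=re.IGNORECASE)
    ]


# Hypothetical folder descriptions.
SAMPLES = [
    "3.5 inch floppy disks (2)",          # real media
    "Anecdotes and recollections, 1975",  # "cd" hides inside "Anecdotes"
    "Memo from CD to staff",              # initials, not a compact disc
    "Slide rule with rotating disk",      # a disk, but not a floppy
]

for sentence in SAMPLES:
    print(sentence, substring_hits(sentence), whole_word_hits(sentence))
```

Whole-word matching drops the hit buried inside "Anecdotes", but the initials and the slide rule still come back as matches. No pattern can settle those cases, which is why each result still had to be checked by hand.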
Coming across a box with no media felt like opening an empty chest. I had managed to dig it out of the sand only to be met with more sand. After the first few run-ins with what seemed to be fool’s gold, I learned how to discern the reliable results from those that were misleading.
The work wasn’t over when I confirmed the existence of media. For every item I found, I entered the item’s information into an Access database, recording its type, visible markings, location, data size, and any additional notes. This allowed my supervisors to determine which collections would be processed first.
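For readers who want to picture the database, here is a minimal sketch of a table with the fields listed above. It uses SQLite through Python's built-in `sqlite3` module as a stand-in for Access. The table name, column names, and sample rows are hypothetical, and ranking collections by item count is only one assumed way the survey could feed into processing priorities.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE media_item (
        id INTEGER PRIMARY KEY,
        collection_number TEXT NOT NULL,  -- location: collection
        box_number TEXT NOT NULL,         -- location: box
        media_type TEXT NOT NULL,
        markings TEXT,                    -- visible markings on the item
        data_size TEXT,
        notes TEXT
    )
    """
)

# Made-up survey entries.
rows = [
    ("AC001", "3", "3.5 inch floppy disk", "handwritten label", "1.44 MB", None),
    ("AC001", "3", "CD-ROM", "no label", "700 MB", "case cracked"),
    ("AC002", "12", "Zip disk", "printed label", "100 MB", None),
]
conn.executemany(
    "INSERT INTO media_item "
    "(collection_number, box_number, media_type, markings, data_size, notes) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)

# Count items per collection, largest first (an assumed prioritization rule).
summary = conn.execute(
    "SELECT collection_number, COUNT(*) AS item_count "
    "FROM media_item "
    "GROUP BY collection_number "
    "ORDER BY item_count DESC, collection_number"
).fetchall()
print(summary)
```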
Searching through all of the boxes was a treasure hunt. I knew I was searching for something, but I wasn’t sure what or how much I would find. The survey I completed contained over 1,000 different objects. In one collection I managed to find 8”, 5.25”, and 3.5” floppy disks. It was better than finding Spanish gold!
My survey is in no way complete. The xQuery only looked for items that were already accounted for in the finding aids. The practices relating to digital media within Mudd have evolved over the years. In some years it was good practice to account for the items and in others it was not. There is no doubt that there are many more items tucked away in boxes and folders. Those items will have to be stumbled upon. For now the survey serves as a good starting point for my next big project: processing the materials.