Special Collections by Jennifer Darragh
INTRODUCTION
The digitization of special collections is currently a very “sexy” topic within the field of library science. Digitization endeavors involve many different members of the special collections community - special collections librarians, library preservation departments, the scholars using these collections, systems librarians, and other information technology specialists. As this pot has many cooks stirring it, the perceived value and impact of digitization on special collections varies, but two things remain constant across current literature – the digitization of special collections is necessary to broaden access to unique collections for innovative scholarship, and to act as a distinguishing element for libraries as more and more general collections go digital.
WHAT YOU WILL LEARN ABOUT IN THIS CHAPTER:
1. What makes a collection a “special” collection;
2. How digitization efforts are beneficial for users of physical special collections, and the libraries with these unique collections;
3. Processing issues of “born digital” special collections; and
4. Issues libraries face when making the decision to digitize.
Special collections and digitization are both fairly expansive subjects, and this chapter only provides a glimpse into where the two library worlds merge. The reader is advised to consult the sources used to create this chapter, and to visit Tanya Zanish-Belcher’s Archives and special collections: A guide to Internet resources on the Web (accessible on the ALA Web site) and the Digital Library Federation Web site.
WHAT MAKES A COLLECTION A SPECIAL COLLECTION?
In the traditional brick and mortar library purview, a “Special Collection” is a collection of physical artifacts that are fragile (often due to age), expensive, rare, and/or in a format not conducive to housing within the general collection (Graham, 1998). These materials are segregated from the general collection, primarily to keep them safe and secure, usually within a special area of the library that has heightened security and is climate controlled (Wikipedia, 2005). A special collections library is often only accessible to a select group of users, and only one user can have access to a particular artifact at one time. Access to the collection is also typically limited by hours of operation, and the number of days the collection is on display. Users are also required to follow very specific terms and conditions when viewing or handling materials, such as viewing only one book at a time, wearing gloves, using only pencils for writing notes, and following specific copying instructions and fees.
Beyond the materials typically found in a special collections library, there are other items that could be considered “special” within a library’s general collection – “born digital” items. While these electronic materials are valuable for research and teaching, the tenuous and/or complex nature of their formatting complicates their accessibility and preservation. For a specific example, consider an EBCDIC data file on a 3480 tape cartridge. EBCDIC – the acronym for Extended Binary-Coded Decimal Interchange Code – is a character-to-number coding scheme developed by IBM for data stored on IBM mainframe computers. Your average personal computer (Windows or Mac) uses the ASCII – American Standard Code for Information Interchange – coding scheme. The successful conversion from EBCDIC to ASCII is not a trivial process, and requires significant technological know-how. In fact, there are many companies advertising conversion services on the Internet for a fee (search Google for “EBCDIC to ASCII” and take a look at the results). Prior to even thinking about the conversion process, the data has to be read from the tape cartridge. Here, there are more challenges. Does the library have ready access to a mechanism that will read the tape? Is the tape still readable? Oxidization is as much a problem for data tape cartridges as it is for microfilm, and exposure to magnetic fields will also damage tapes. Most data files are now distributed on CD-ROM or are mounted for download, so some libraries – even universities on the whole – have retired their tape readers. Therefore, it is easy to see that digital materials, like a brittle book, are in just as much danger – if not more – of losing their integrity and inherent value if not properly housed in an environment that both preserves and maintains the materials in a usable form.
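To make the conversion problem concrete, the following is a minimal Python sketch, not a description of any particular conversion service. It assumes the file uses IBM code page 037 (one of several EBCDIC variants) and a hypothetical 8-byte record layout invented for illustration. Translating character fields is a single codec call; much of the real difficulty lies in knowing the record layout and in handling numeric fields, such as packed decimals, that must not be translated as text.

```python
# Assumption: the data uses IBM code page 037 (US/Canada EBCDIC). Other
# EBCDIC code pages exist and would need a different codec name.
EBCDIC_CODEC = "cp037"


def ebcdic_text_to_str(raw: bytes) -> str:
    """Translate a character field from EBCDIC bytes to a Python string."""
    return raw.decode(EBCDIC_CODEC)


def unpack_packed_decimal(raw: bytes) -> int:
    """Decode a packed-decimal numeric field.

    Each byte holds two 4-bit digits, except the last byte, whose low
    4 bits hold the sign (0xD is negative; anything else is treated as
    positive in this sketch). No validation of the digits is attempted.
    """
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(raw[-1] >> 4)
    sign_nibble = raw[-1] & 0x0F
    value = int("".join(str(d) for d in digits))
    return -value if sign_nibble == 0x0D else value


def parse_record(record: bytes) -> dict:
    """Split one record of a hypothetical layout:
    5 bytes of EBCDIC text followed by 3 bytes of packed decimal."""
    return {
        "name": ebcdic_text_to_str(record[:5]),
        "count": unpack_packed_decimal(record[5:8]),
    }


if __name__ == "__main__":
    # "HELLO" in EBCDIC, followed by the packed decimal for +12345.
    sample = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6, 0x12, 0x34, 0x5C])
    print(parse_record(sample))
```

Running the sketch prints `{'name': 'HELLO', 'count': 12345}`. Decoding the last three bytes as text instead would yield meaningless characters, which is why a file's documentation (its record layout) matters as much as the bytes themselves.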
WHY DIGITIZE SPECIAL COLLECTIONS?
Many of the world's most prominent libraries (national, academic and research) have received significant funding to digitize artifacts housed within their special collections, and organize them into well-indexed, freely-accessible, comprehensive digital libraries.
Peter Hirtle, former Co-Director of the Cornell Institute for Digital Collections (CIDC), has identified three primary benefits for the digitization of special collections:
1) increased use;
2) facilitating new types of research; and
3) new users resulting in new uses for the materials (Hirtle, 2002).
Shan Sutton, Head of Special Collections at the University of the Pacific Library, cites the benefits identified by Hirtle, and provides two more:
4) preservation of physical artifacts; and
5) promotion of the library (Sutton, 2004).
This author has identified one more benefit, based on a recent project undertaken by Yale University (Linden & Green, 2006):
6) digitization can make a collection special.
DIGITIZATION INCREASES USAGE OF SPECIAL COLLECTIONS RESOURCES
Increased usage is considered the primary benefit when special collections are digitized. Hirtle’s prime example is the Making of America digital library collections (a joint project of Cornell University and the University of Michigan). The MOA project involved the digitization of nineteenth-century American monographs and serials in order to create a social history from the antebellum period through Reconstruction (http://cdl.library.cornell.edu/moa/about.html, ND), and includes several thousand volumes across both universities’ collections. At Cornell, “only a few hundred” volumes in hard-copy would circulate in a year (Hirtle, 2002. Pg. 43). Once digital page images of these volumes were mounted on a Web server, Cornell was quickly averaging 4,000 page views per month. Once the digital collection was enhanced – searchable text added behind the page images – the average rose to 5,000 page views per day.
DIGITIZATION OPENS DOORS TO NEW RESEARCH ON SPECIAL COLLECTIONS
Digital libraries allow for the integration of resources, often including digital surrogates of physical artifacts that 1) are varied in original format (map vs. book vs. sculpture, etc.), and 2) are permanently housed in different libraries all over the world. As a result, existing research becomes far more efficient, and the possibility of new research – using these resources in a manner never before realized – greatly increases. Historic maps, biographical resources, historical Census figures, magazines, etc. can be converted to digital form, enhanced, and then interwoven to provide a very detailed picture of a specific country, people, or organization during a specific time period. For an example, see A Vision of Britain Through Time, an interactive Web site created by the Great Britain Historical GIS Project. If a researcher had to consult all of these resources in their raw form, create digital surrogates and paper facsimiles, and enter data for integrative analyses, he/she would need to invest a considerable amount of time and a tremendous amount of effort in the project – a serious commitment that might not be feasible.
DIGITIZATION REDEFINES THE USER BASE OF SPECIAL COLLECTIONS
When developing a digital library collection, technical designers and librarians have the opportunity to develop tools for the retrieval, viewing, and analysis of information that are sophisticated on the back end but simple on the front end. Librarians can create taxonomies, annotations and help guides for specific user communities – be they elementary school students, college students, or advanced academic scholars. The English literature portion of the British Library’s Online Gallery is a good representation of how a digital library collection could appeal to different types of users. It is not uncommon to find classic literature as required reading in America’s high schools – common titles include Beowulf, Canterbury Tales, and The Iliad. A high-school English teacher wishing to enrich his/her students’ exposure to the classics could make good use of this collection. The British Library’s Online Gallery includes clearly-written, descriptive introductory information for each text and its author, and at least one digitized page image from the physical artifact. Some texts within the online collection include tools allowing for deeper exploration. For example, Canterbury Tales has an online viewing tool that allows users to compare and contrast page images from both the first and second editions; something that would appeal to the more advanced user.
PRESERVATION OF THE PHYSICAL ARTIFACT
When digital surrogates of physical artifacts are made available online, there “theoretically” is a decline in requests for access to the original items (Sutton, 2004. Pg. 234). Without regular handling, the deterioration of physical artifacts slows down. There currently is only anecdotal evidence supporting a decline in requests for access to original materials (see THE POSITIVES AND NEGATIVES FOR DIGITIZATION OF SPECIAL COLLECTIONS below for more information).
PROMOTING THE LIBRARY
General collections across academic and research libraries are becoming more homogeneous due to the widespread availability of electronic resources. James Neal, Columbia University Librarian, stated at the 2005 ACRL Annual Conference that ‘[r]esearch libraries traditionally have been evaluated by how many volumes they hold, but the smallest library can eventually access as many volumes as the largest…In the future I believe great research libraries will be evaluated more and more on their special collections’ (Albanese, 2005. Pg. 40).
Cambridge, Stanford, and the University of Virginia Libraries are just a few of the renowned research libraries actively engaged in the digitization of unique special collections. As more libraries wish to maintain and/or establish an elite reputation, the digitization of special collections will shift “from a temporary endeavor to a fundamental responsibility” (Sutton, 2004. Pg. 234).
CREATING A SPECIAL COLLECTION BY DIGITIZATION
There are resources within a library’s general collection whose value can be raised as a result of digitization. Yale University has built just such a value-added collection with the creation of the Economic Growth Center Library Collection (EGCLC). The EGCLC was created in order to “transform scholarly use of data found in print statistical publications” (Linden & Green, 2006. Pg. 1). Print statistical publications are useful for providing a snapshot of area and population demographics, but utilizing print tools for in-depth longitudinal demographic analyses is complicated due to their static and disconnected nature. The EGCLC project has taken static tables from statistical abstracts – Mexican state statistical abstracts are the most complete collection to date – and converted them into manipulable digital analysis files (Excel format). Beyond manipulable files, this collection includes PDF documents tagged with Dublin Core elements (each chapter, each volume) allowing for retrieval by state, topic, and/or date, and Data Documentation Initiative (DDI) metadata about the contents of each Excel data file (the metadata is highly detailed and strictly structured, and is output as an XML file). The EGCLC project is striving to ensure optimum usage of this resource, and is committed to preserving it over the long-term (Linden & Green, 2006).
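As an illustration of how Dublin Core tagging supports retrieval by state, topic, and/or date, here is a minimal Python sketch. It is not the EGCLC's actual record structure: the `record` wrapper element, the helper names, and all sample values are hypothetical, and only the Dublin Core namespace and element names (`title`, `coverage`, `subject`, `date`, `format`) come from the standard itself.

```python
import xml.etree.ElementTree as ET

# The Dublin Core element set namespace (from the standard).
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def make_record(title, coverage, subject, date):
    """Build one Dublin Core record for a PDF chapter (hypothetical wrapper)."""
    record = ET.Element("record")
    fields = (
        ("title", title),
        ("coverage", coverage),   # used here for the state
        ("subject", subject),     # used here for the topic
        ("date", date),
        ("format", "application/pdf"),
    )
    for element, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{element}")
        child.text = value
    return record


def find_records(records, **criteria):
    """Return the records whose Dublin Core elements match every criterion."""
    matches = []
    for record in records:
        if all(
            record.findtext(f"{{{DC_NS}}}{name}") == wanted
            for name, wanted in criteria.items()
        ):
            matches.append(record)
    return matches


if __name__ == "__main__":
    # Hypothetical sample values, invented for illustration only.
    records = [
        make_record("Chapter 2: Population", "Jalisco", "Population", "1990"),
        make_record("Chapter 5: Agriculture", "Jalisco", "Agriculture", "1990"),
        make_record("Chapter 2: Population", "Oaxaca", "Population", "1995"),
    ]
    for hit in find_records(records, coverage="Jalisco", subject="Population"):
        print(ET.tostring(hit, encoding="unicode"))
```

Because every chapter carries the same small set of elements, a search can combine any of them, which is what makes the tagged PDFs more retrievable than the print volumes they came from.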
"BORN DIGITAL" SPECIAL COLLECTIONS
Special collections of materials that are “born-digital” are those that exist in digital form only. Born-digital items can include working papers of a Nobel laureate (Word documents), a company president’s email messages, weather data recorded by a satellite, and survey data collected through a computer assisted personal interview (CAPI). In order for these materials to remain viable, they need to be assessed, processed and preserved with the same level of attention given to hard copy items within a library’s collection. Traditional methods of collection development and cataloging/metadata assignment can definitely be applied to the processing of digital items. However, some modifications and extensions of these existing methods are often necessary.
COLLECTION METADATA
Consider, for example, a popular reference work released as a new edition. Information about copyright, LOC cataloging, whether this is a first print for this edition, etc. can all be found on the verso of the title page. The authoring entity’s reason for releasing a new edition is usually also included in the book’s preface. Digital files can also be re-released as different versions, but documentation of why the file was re-released, when the update actually occurred, and who holds the explicit copyright may not be available and/or immediately clear. In order for a digital file to retain its usefulness, this information needs to be tracked down and entered into the metadata record for that file.
Levy and Marshall state “[t]he highest priority of a library, digital or otherwise, is to serve the research needs of its constituents” (1995, Pg. 80). In order to best serve users of a specific digital library collection – such as geospatial or survey data files – documentation developed according to specific metadata standards should also be included. The recognized metadata standard for geospatial data is the Federal Geographic Data Committee (FGDC) Content Standard for Digital Geospatial Metadata (CSDGM). Quickly becoming the de facto metadata standard for social science survey data is the Data Documentation Initiative (DDI) metadata schema. These schemas were developed for reasons akin to those behind the development of Encoded Archival Description (EAD) for archival finding aids: to ensure successful discovery of appropriate resources, and to preserve information about origin, version, creator, and other elements needed for research.
PRESERVATION AND ACCESS - DUAL PROCESS
Digital file preservation is an extremely complicated process; therefore, only a few aspects of the process will be covered here. It is well known in the brick and mortar library world that physical artifacts do deteriorate, but the rate of deterioration is relatively slow, and with effort, can actually be put off, or even fully prevented. Within the realm of digital libraries, the possibility of file corruption and data loss is well known, but is unfortunately far less predictable. Ideally, digital libraries should include tools for access and preservation at time of implementation, to ensure collections remain persistent over time.
Digital files are available in many different formats; some can easily migrate across different technological platforms (an ASCII file perhaps being the easiest), others are bound to proprietary software programs, and some require specific computer hardware in order to run. Over time, digital documents become increasingly vulnerable due to “the decay and obsolescence of the media on which they are stored” (Rothenberg, 1999). In order to maintain long-term usability of digital files, a viable preservation strategy is absolutely necessary. The most logical statement about digital preservation is that it should be built into the collection process – carried out at time of acquisition or very soon thereafter (Graham, 1998). That being said, there is a dizzying number of digital files sitting on servers that have long been available in digital format, making reactive preservation strategies and practices equally important.
The Digital Library Federation (DLF), a consortium of libraries and other related agencies, is the leading authority for digital library research and development. This organization is engaged in pioneering work in digital collection building, file production, preservation, usability, and digital library design and construction (DLF Home Page, 2006). With respect to preservation, a current initiative of the DLF is the Global Digital Format Registry. Only the preliminary design of this registry has been completed (as of 2003), but the proposed functions show great promise for a tool that will be extremely valuable for the persistence of digital files over the long-term:
Identification – “I have a digital object; what format is it?”
Validation – “I have an object purportedly of format F; is it?”
Transformation – “I have an object of format F, but need G; how can I produce it?”
Characterization – “I have an object of format F; what are its significant properties?”
Risk Assessment – “I have an object of format F; is it at risk of obsolescence?”
Delivery – “I have an object of format F; how can I render it?”
(Abrams & Seaman, 2003. Pg. 3)
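The first two functions, identification and validation, can be illustrated with a short Python sketch. This is not the registry's design, which the source describes only at the level of the questions above; it is one common technique, matching a file's leading bytes against known signatures. The function names and the small signature table are assumptions made for illustration, although the byte signatures themselves are the widely documented ones for these formats.

```python
# Leading-byte signatures for a few widely documented formats.
SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify(header: bytes) -> str:
    """Identification: 'I have a digital object; what format is it?'"""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def validate(header: bytes, claimed_format: str) -> bool:
    """Validation: 'I have an object purportedly of format F; is it?'

    This only checks the signature; a real validator would also check
    the internal structure of the file.
    """
    return identify(header) == claimed_format


def identify_file(path: str) -> str:
    """Read just enough of a file on disk to check its signature."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))


if __name__ == "__main__":
    print(identify(b"%PDF-1.4\n..."))
    print(validate(b"GIF89a...", "PNG image"))
```

A table like this answers only the first two questions. The remaining functions (transformation, characterization, risk assessment, and delivery) need far richer information about each format, which is the gap a shared registry is meant to fill.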
The preservation of the software and hardware needed to read file formats is also extremely important. One of the proposed methods for doing this is to encapsulate original versions of software with digital files and to create software and hardware emulators (Rothenberg, 1999). In the simplest terms, an emulator imitates a particular software program or piece of hardware, so that programs and files that depend on the original behave as though it were still present. Emulators designed for digital preservation have to be generalized, as future computing platforms are unknown.
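The idea behind emulation can be shown with a toy example. The Python sketch below is not Rothenberg's proposal or any real preservation emulator; it imitates an entirely hypothetical machine with one accumulator, a few memory cells, and five invented instructions. The point is only that a program written for the "old" machine runs unchanged, because the emulator reproduces that machine's behavior in software.

```python
def run(program, memory_size=8):
    """Emulate a hypothetical accumulator machine.

    `program` is a list of (opcode, argument) pairs. The instruction
    set is invented for this illustration:
      LOAD n   put the constant n in the accumulator
      ADD a    add the value in memory cell a to the accumulator
      STORE a  copy the accumulator into memory cell a
      PRINT    append the accumulator to the output
      HALT     stop execution
    """
    memory = [0] * memory_size
    accumulator = 0
    counter = 0          # index of the next instruction
    output = []
    while counter < len(program):
        opcode, argument = program[counter]
        counter += 1
        if opcode == "LOAD":
            accumulator = argument
        elif opcode == "ADD":
            accumulator += memory[argument]
        elif opcode == "STORE":
            memory[argument] = accumulator
        elif opcode == "PRINT":
            output.append(accumulator)
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown instruction: {opcode}")
    return output


if __name__ == "__main__":
    # A program "written for" the hypothetical machine: compute 2 + 3.
    legacy_program = [
        ("LOAD", 2),
        ("STORE", 0),
        ("LOAD", 3),
        ("ADD", 0),
        ("PRINT", None),
        ("HALT", None),
    ]
    print(run(legacy_program))
```

Preserving the program then reduces to preserving its instructions together with a specification of the machine precise enough for the emulator to be rewritten on whatever platform exists in the future.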
THE POSITIVES AND NEGATIVES FOR DIGITIZATION OF SPECIAL COLLECTIONS
Peter Hirtle, whose principal benefits for digitization of special collections were cited above, also foresees some potential negative outcomes in the following areas:
1. Requests for use of original artifacts will decrease;
2. The number of digital books vs. paper books will increase;
3. Special collections print holdings will decrease; and
4. Special collections librarianship will drastically change. (Hirtle, 2002)
Hirtle readily admits for his first point that there is no body of hard evidence supporting the theory that usage of physical artifacts will decrease. His concern is based on what he witnessed, and heard about from Michigan, pertaining to requests for print MOA titles soon after the MOA went online – namely, only a “handful” of requests were made, and when users found out about the MOA online, half the requests were withdrawn (Hirtle, 2002).
For Hirtle’s second point, he cites the University of Virginia’s e-book collection as hard evidence: 1,200 titles were made available in Microsoft Reader e-book format, and within two months more than 600,000 copies of the books were downloaded (Hirtle, 2002. Pg. 45). While many would say that reading books on-screen is not the most comfortable reading option (and Hirtle makes this assertion himself), it is undeniable that electronic texts are becoming more and more popular.
Neal and Hirtle both assert that unique, special collections will act as a distinguishing showpiece for libraries. However, Hirtle feels that the need to house printed items within special collections libraries will decrease as a result of digital facsimiles being readily available (Hirtle, 2002). He provides no quantifiable statistics to indicate that this is actually happening, but considering the oft-heard complaints about budget cuts and tight space, Hirtle’s prediction could end up ringing true. Peter Graham’s view, on the other hand (granted, his purview is a few years before Hirtle’s), is that artifactual print collections will continue to be valuable because of their physicality: bindings, printing, imprints, liner notes, and even their smells (1998).
The final point about special collections librarianship changing is considered by Hirtle to be an “inevitable change” (Hirtle, 2002. Pg. 48), and there are many others who agree – to a point (see Graham, 1998; Sutton, 2004; and Albanese, 2005). Hirtle’s concern is that the need for special collections experts will decrease, but there are those – like Graham and Sutton – who feel that special collections expertise will be necessary in order to develop and maintain these collections.
CONCLUSION
The digitization of physical special collections – print and other media – is reaching what Sutton calls a ‘point of no return,’ where digitization is becoming a “primary and permanent function…” (Sutton, 2004. Pg. 241). The fields of special librarianship and digital librarianship – seemingly so opposite from one another – have merged in a very exciting way. It is not yet known whether the impact upon artifactual collections will be more positive or negative, but the traditional landscape of special collections has definitely been changed forever.
SOURCES CONSULTED & LINKED RESOURCES
●Abrams, S. L. & Seaman, D. (2003). Towards a global digital format registry. Conference proceedings from the World Library and Information Congress: 69th IFLA General Conference and Council. August 1-9, 2003; Berlin, Germany.
●Albanese, A. R. (2005). Rarities online. Library Journal (130:18). Pages 40-43.
●British Library. (ND). Online gallery: English literature.
●British Library. (ND). Treasures in full: Caxton’s Chaucer.
●Cambridge University Library. (2006). Digital image collections.
●Cornell University Library. (2005). Making of America.
●Data Documentation Initiative Web Site.
●Digital Library Federation Web Site.
●Duke University Libraries. (2006). Rare book, manuscript, and special collections library: Policies & services: Reading room: Handling materials.
●Federal Geographic Data Committee. (2006). Geospatial metadata standards.
●Graham, P. S. (1998). New roles for special collections on the network. College & Research Libraries. Pages 1-7.
●Great Britain Historical GIS Project. (ND). A vision of Britain through time.
●Hirtle, P. B. (2002). The impact of digitization on special collections in libraries. Libraries & Culture (37:1). Pages 42-52.
●Library of Congress. (2006). Encoded archival description: Finding aids.
●Linden, J. & Green, A. (2006). Don't leave the data in the dark. D-Lib Magazine (12:1). Pages 1-9.
●Penn State University Libraries. (ND). Special collections library Web site.
●Rothenberg, J. (1998/1999). Avoiding technological quicksand: Finding a viable technical foundation for digital preservation. CLIR Report #77. Pages 1-35.
●Stanford University Libraries. (2005). Preservation department: Special projects.
●Sutton, S. (2004). Navigating the point of no return: Organizational implications of digitization in special collections. portal: Libraries and the Academy (4:2) (via Johns Hopkins University Press). Pages 233-243.
●University of Michigan. (2005). Making of America.
●University of Virginia Library. (ND). Items from special collections at the University of Virginia Library: Prepared by the electronic text center.
●Webopedia
●Wikipedia
●Wordspy
●Yale University Library. (ND). Economic Growth Center Digital Library.
●Zanish-Belcher, T. (2003). Archives and special collections: A guide to resources on the Web. College & Research Libraries News (64:3). Accessible Online.