Best Practices and Planning for Digitization Projects

Alyce L. Scott, Lecturer, San Jose State University

"A good digital initiative has a substantial design and planning component."

A Framework of Guidance for Building Good Digital Collections

The long-term viability of digital collections can be ensured only by considering, from the beginning, the ramifications of the decisions you make throughout the process. A well-developed project plan that takes into account all the factors and potential problems of your digitization project will be the most important document you create.

Although the promise of digital imaging technology is great, there are a number of drawbacks that can limit its potential and usefulness:

  • It’s labor-intensive and time-consuming, with a steep learning curve
  • It’s expensive
  • It requires a long-term commitment of funds and resources
  • There are many issues dealing with long-term preservation of digital images
  • It requires technical support, not to mention storage space and a high-bandwidth network

Because institutional resources are limited, digital imaging projects must be carefully defined up front, if not to guarantee success, then at least to minimize the risk of failure. For this reason, the success of digital projects hinges not on expensive technology but on sound project planning. Technology should never drive digital projects. Determine goals first, and only then select the appropriate technology to meet the project’s objectives.

It is best to ask a series of questions before starting a digital project.

  1. Why do you want to digitize the collection?
    • Is it heavily used and in need of preservation?
    • Do you want to provide new/remote access to materials that are too fragile to be handled in their original form, or provide improved access that will save time for users?
    • Is the use of the materials currently restricted because of location?
    • Does the collection supplement/complement other print/digital resources, perhaps in an unusual subject area?
  2. Who is the audience and what need is being filled?
  3. What access points are needed for the digital collection?
  4. How will access be delivered and what interface will be created?
  5. How will the project fit into the institution’s goals?
  6. How will you determine if the project was successful?

When creating your planning document, there are three key components to include:

  1. Selection/analysis of the collection
  2. Standards for digital imaging
  3. Metadata

Let’s look at each of these in turn:

Selection/analysis of the collection

Selecting the objects to be digitized is one of the most important issues you must address when considering a digitization project. Selection helps to ensure that resources are invested wisely in digitizing the most significant and useful collections. Remember that the cost and complexity of digital imaging projects preclude experimentation and learning through trial and error. More than likely, you will have only one chance to digitize a collection.

Issues to consider when selecting material for digitization:

  • Intellectual value of the collection to researchers
  • Demand from current (or potential) users
  • Historical or geographic area covered by the collection
  • Uniqueness - has another institution digitized the same, or similar, materials?
  • Physical condition of the collection: is the material suitable for digitization? (Issues to consider: preservation work may need to be done prior to digitization; bound volumes should open to at least a 90-degree angle to be scanned; maps may need to be significantly reduced to display online, resulting in a loss of fine detail and spatial context)
  • Copyright permission (if the materials are not in the public domain, you MUST have permission from the copyright owner to digitize the material)

Standards for digital imaging

We are making huge investments in digitizing collections, and it is incumbent upon us to create and manage our digital collections properly, both to ensure their long-term value and usefulness and to protect the investment that has been made in them. There is great value in creating "archival masters" that are rich enough to remain useful over time and that are produced in the most cost-effective manner. Once created, the rich archival master can be used to create derivatives that meet a variety of current and future users' needs. The quality and usefulness of those derivatives (e.g., for publication or image display) will be directly affected by the quality of the initial scan.

There are many best-practice recommendations for digitizing materials. Remember that these guidelines may require adaptation to particular projects, depending on source document characteristics such as font size, photographic detail, and physical size.
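Physical size, together with scanning resolution and bit depth, also determines how much storage an uncompressed master file needs. The following is a minimal sketch of that arithmetic, not part of any cited guideline; the function name and the sample dimensions are hypothetical, and real files will differ somewhat because of headers and any compression.

```python
def uncompressed_size_bytes(width_in, height_in, dpi, bit_depth):
    """Estimate the uncompressed size of a scan in bytes.

    width_in, height_in: physical size of the original in inches
    dpi: scanning resolution in dots (pixels) per inch
    bit_depth: bits stored per pixel (e.g. 8 for grayscale, 24 for color)
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bit_depth
    # Round up to a whole number of bytes.
    return (total_bits + 7) // 8


# Hypothetical example: an 8 x 10 inch print scanned at 400 dpi in 24-bit color.
size = uncompressed_size_bytes(8, 10, 400, 24)
print(f"{size / 1_000_000:.1f} MB")  # prints "38.4 MB"
```

Multiplying an estimate like this by the number of items in a collection gives a rough first figure for the storage a project will need.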

Below is a brief sample of the best practices promoted in the ALCTS Minimum Digitization Capture Recommendations.

Material type                                Minimum resolution   Minimum bit depth   Minimum color space
Books and textual documents without images   300 dpi              8                   Grayscale
Books and textual documents with images      400 dpi              8                   Grayscale
Manuscripts                                  400 dpi              24                  Color
Maps                                         300 dpi              8/24                Grayscale/Color
Photographic prints (<8" x 10")              400 dpi              8/24                Grayscale/Color
Posters                                      300 dpi              24                  Color

Some general recommendations for file formats:

Archival master files   Master files resolution
TIFF (uncompressed)     JPEG, JPEG2000, PDF/A
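The minimum capture recommendations above can also be kept in a small lookup table so that planned scanner settings are checked before a batch is scanned. This is only an illustrative sketch: the dictionary keys, class name, and function name are hypothetical, the values are transcribed from the sample above, and color space is not checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Minimum:
    dpi: int        # minimum resolution in dots per inch
    bit_depth: int  # lowest acceptable bit depth for the material type


# Values transcribed from the sample recommendations; keys are hypothetical labels.
# Where the sample lists "8/24" (grayscale/color), the lower value is used.
MINIMUMS = {
    "text": Minimum(300, 8),
    "text_with_images": Minimum(400, 8),
    "manuscript": Minimum(400, 24),
    "map": Minimum(300, 8),
    "photo_print": Minimum(400, 8),
    "poster": Minimum(300, 24),
}


def shortfalls(material, dpi, bit_depth):
    """Return a list of ways the planned settings fall below the minimum."""
    minimum = MINIMUMS[material]
    problems = []
    if dpi < minimum.dpi:
        problems.append(
            f"resolution {dpi} dpi is below the {minimum.dpi} dpi minimum"
        )
    if bit_depth < minimum.bit_depth:
        problems.append(
            f"bit depth {bit_depth} is below the {minimum.bit_depth}-bit minimum"
        )
    return problems


print(shortfalls("manuscript", 300, 24))  # one problem: resolution too low
print(shortfalls("poster", 300, 24))      # empty list: settings meet the minimum
```

Because these are minimums, a project may well choose higher settings for materials with fine detail.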

Metadata

“Anyone who has suffered the exercise in irrelevance offered by an Internet search engine will appreciate the value of precise metadata.”

Minnesota State Archives

Metadata is cataloging and technical data associated with digital images. It can be embedded in the image or stored in associated text; either way, it is essential for searching and access. Metadata is important because it facilitates the discovery of relevant information, improves the efficiency of searching, and makes it easier to find the most specific and relevant resources. If a resource is worth making available on the web, then it is worth describing with metadata to maximize the ability of users to locate it. Below is a very brief introduction to the topic of metadata.

The five basic functions of metadata (Guenther, 2004) are:

  • Resource discovery 
  • Organizing e-resources
  • Facilitating interoperability 
  • Digital identification
  • Archiving and preservation

This last function (archiving and preservation) is particularly important, as metadata is a key method of ensuring that digital resources will continue to be accessible in the future.

Generally, metadata is broken down into three main categories:

  • Descriptive metadata
  • Structural metadata
  • Administrative metadata. There are several subsets of administrative metadata, some of which are sometimes listed as separate metadata types:
    • Rights management metadata, which deals with intellectual property rights
    • Preservation metadata, which contains information needed to archive and preserve a resource
    • Technical metadata

Metadata has its roots in the cataloging of print materials, but as the categories above show, it is broader than cataloging: it also supports the structure, functionality, and preservation of digital objects. Much has been written on this topic, and the reference materials in the Selected Resources list will give you a more in-depth look at metadata and how it works.
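To make the three categories concrete, here is a minimal sketch of a record for one digitized image, with the descriptive part written out as simple Dublin Core elements (the element set listed under Selected Resources). All values, the field grouping, the `record` wrapper element, and the function name are hypothetical; only the Dublin Core namespace and element names come from the standard.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, Version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Hypothetical record for one digitized photograph, grouped by category.
record = {
    "descriptive": {
        "title": "Main Street, looking north",
        "creator": "Unknown photographer",
        "date": "1910",
        "subject": "Streets",
    },
    "structural": {
        # Order of the files that make up the object (front, then back).
        "sequence": ["photo_0001_front.tif", "photo_0001_back.tif"],
    },
    "administrative": {
        "rights": "Public domain",
        "technical": {"resolution_dpi": 400, "bit_depth": 24},
        "preservation": {"master_format": "TIFF (uncompressed)"},
    },
}


def descriptive_to_dublin_core(descriptive):
    """Serialize descriptive fields as Dublin Core elements in an XML string."""
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in descriptive.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


print(descriptive_to_dublin_core(record["descriptive"]))
```

In practice the schema and field choices would follow whatever metadata standard and collection management system the institution adopts.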

Selected Resources

General:

Handbook for Digital Projects: A Management Tool for Preservation and Access. Andover, MA: Northeast Document Conservation Center, 2000. Web. https://www.nedcc.org/assets/media/documents/dman.pdf

Moving Theory into Practice. Cornell University, 2003. Web. http://preservationtutorial.library.cornell.edu/

(This is no longer being updated, but it remains a valuable resource for reference.)

National Digital Library Program Planning Checklist. Library of Congress, National Digital Library Program, 1997. Web. https://memory.loc.gov/ammem/techdocs/prjplan.html

Standards/best practices:

BCR’s CDP Digital Imaging Best Practices, v. 2.0. BCR’s CDP Digital Imaging Best Practices Working Group, 2006. Web. http://chnm.gmu.edu/digitalhistory/links/pdf/chapter1/1.17.pdf

Federal Agencies Digital Guidelines Initiative. 2016. Web. http://www.digitizationguidelines.gov/guidelines/

Minimum Digitization Capture Recommendations. ALA-ALCTS, 2013. Web. http://www.ala.org/alcts/resources/preserv/minimum-digitization-capture-recommendations

The NINCH Guide to Good Practice in the Digital Representation and Management of Cultural Heritage Materials. National Initiative for a Networked Cultural Heritage, 2002. Web. http://chnm.gmu.edu/digitalhistory/links/pdf/chapter1/1.17.pdf

Metadata:

Descriptive Metadata for Web Archiving. OCLC Research, 2018. Web. https://www.oclc.org/research/publications/2018/oclcresearch-descriptive-metadata.html

Descriptive Metadata Guidelines. Mountain View: Research Libraries Group, 2005. Web. https://www.oclc.org/content/dam/research/activities/culturalmaterials/RLG_desc_metadata.pdf

"Dublin Core Metadata Element Set, Version 1.1." Dublin Core Metadata Initiative. Dublin Core, 11 Oct 2010. Web. http://dublincore.org/documents/dces/.

Guenther, Rebecca. Understanding Metadata. Bethesda, MD: National Information Standards Organization, 2004. Web. https://www.lter.uaf.edu/metadata_files/UnderstandingMetadata.pdf