
HathiTrust Digital Library

Tips for searching and data mining the HathiTrust Digital Library

What is in the HathiTrust Digital Library?

The HathiTrust Digital Library started with the collection of the University of Michigan Library, which was digitized by the Google Books Project. Since then, the Digital Library has grown to include collections digitized from other partner libraries and research institutions, as well as collections from other digital projects such as the Internet Archive.

Full-text access and downloading are available for items in the public domain or otherwise cleared for open access, including:

  • US federal government documents
  • Works published before 1923
  • Works still protected by copyright, but made available to HathiTrust with the permission of the copyright holder

The number of works in the HathiTrust Digital Library is large and growing. Digitized as of August 2018:

  • 16,468,932 total volumes
  • 8,016,383 book titles
  • 442,413 serial titles
  • 5,764,126,200 pages
  • 738 terabytes
  • 195 miles
  • 13,381 tons
  • 6,226,134 volumes (~38% of total) in the public domain
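
As a quick check on the figures above, the following minimal Python sketch recomputes the public domain share and the average number of pages per volume. The variable names are illustrative, and the numbers are copied from the August 2018 list rather than fetched from HathiTrust.

```python
# Figures copied from the August 2018 list above (not fetched live).
total_volumes = 16_468_932
public_domain_volumes = 6_226_134
total_pages = 5_764_126_200

# Share of volumes in the public domain.
public_domain_share = public_domain_volumes / total_volumes

# Average number of pages per volume.
pages_per_volume = total_pages / total_volumes

print(f"Public domain share: {public_domain_share:.1%}")
print(f"Average pages per volume: {pages_per_volume:.0f}")
```

The share comes out at about 37.8%, consistent with the rounded ~38% in the list. The page total is exactly 350 times the volume count, which suggests that the page figure may be an estimate based on an assumed average volume length rather than a direct tally.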

Google Books vs. HathiTrust

Comparison of the Google Books Project with HathiTrust:

Numbers:
  • Google Books Project: 20 million items and growing
  • HathiTrust: 10 million items and growing

Content type:
  • Google Books Project: books, journals, magazines, reports, and government documents
  • HathiTrust: books, journals, magazines, reports, and government documents

Organization:
  • Google Books Project: Google owns the Google Books Project and decides how to manage it
  • HathiTrust: partnership of libraries and institutions, with membership and shared governance

Full text search:
  • Google Books Project: full text search within all the content; browse to sections containing the search terms
  • HathiTrust: full text search within all the content; pages where terms are found are displayed

Full text download:
  • Google Books Project: full text download of the whole item for those items that are out of copyright
  • HathiTrust: full text download of the whole item, for members of HathiTrust, for those items that are out of copyright

Copyright status:
  • Google Books Project: does not work on identifying the copyright status of "orphan" works
  • HathiTrust: has worked on identifying the copyright status of "orphan" works

WorldCat records (see the FAQ about WorldCat records):
  • Google Books Project: MARC records in WorldCat lead to the project's "landing page" and to links to holding libraries; duplicate records sometimes occur
  • HathiTrust: MARC records in WorldCat lead to the project's "landing page" and to links to holding libraries; duplicate records sometimes occur

Long-term archiving and reformatting of full text as needed:
  • Google Books Project: not a key part of the mission
  • HathiTrust: a key part of the mission

Replacement of library material:
  • Google Books Project: not a source of replacements for missing library materials
  • HathiTrust: a source of replacements for missing library materials, under Section 108 of US copyright law

Thanks to Barbara DeFelice and William Fontaine of Dartmouth College for this chart.

Unique Items in HathiTrust

In addition to their informational value, many items in HathiTrust are of interest as historical, cultural, or archival objects. For example, the Civil War-era diary of Lucius L. Shattuck stopped a bullet, and the damage can be traced deep into the book. The original can be found in the Bentley Historical Library at the University of Michigan.

The ASU Library acknowledges the twenty-three Native Nations that have inhabited this land for centuries. Arizona State University's four campuses are located in the Salt River Valley on ancestral territories of Indigenous peoples, including the Akimel O’odham (Pima) and Pee Posh (Maricopa) Indian Communities, whose care and keeping of these lands allows us to be here today. ASU Library acknowledges the sovereignty of these nations and seeks to foster an environment of success and possibility for Native American students and patrons. We are advocates for the incorporation of Indigenous knowledge systems and research methodologies within contemporary library practice. ASU Library welcomes members of the Akimel O’odham and Pee Posh, and all Native nations to the Library.