The HathiTrust Research Center (HTRC) gives nonprofit and educational users computational access to published works in the public domain and, in the future, will offer access on limited terms to in-copyright works from the HathiTrust.
The HTRC is a collaborative research center launched jointly by Indiana University and the University of Illinois, along with the HathiTrust Digital Library. It aims to help researchers meet the technical challenges of working with massive amounts of digital text by developing cutting-edge software tools and cyberinfrastructure that enable advanced computational access to the growing digital record of human knowledge.
Leveraging data storage and computational infrastructure at Indiana University and the University of Illinois at Urbana-Champaign, the HTRC will provision a secure computational and data environment for scholars to perform research using the HathiTrust Digital Library. The center will break new ground in the areas of text mining and non-consumptive research, allowing scholars to make full use of the content of the HathiTrust Digital Library while preventing misuse of intellectual property, all within the confines of current U.S. copyright law.
HathiTrust provides APIs for accessing and analyzing bibliographic information, page images, OCR text, and other data about objects in the repository.
HathiTrust also makes the texts of public domain works available for research purposes. ASU Library has signed the institutional agreement with Google, which means ASU researchers can use the dataset of Google-digitized volumes in addition to those provided by other HathiTrust partners. The process for obtaining public domain datasets is outlined on the HathiTrust website.
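
As a rough illustration of the bibliographic lookup mentioned above, the following Python sketch builds a request URL for the HathiTrust Bibliographic API. The URL pattern and the list of identifier types reflect the publicly documented API as best understood here and should be checked against the current HathiTrust documentation; the function name `bib_api_url` and the percent-encoding of identifiers are assumptions made for this example.

```python
from urllib.parse import quote

# Base URL of the HathiTrust Bibliographic API (verify against current docs).
BASE_URL = "https://catalog.hathitrust.org/api/volumes"

# Identifier types the API is understood to accept (assumption: may change).
ID_TYPES = {"oclc", "lccn", "issn", "isbn", "htid", "recordnumber"}


def bib_api_url(id_type, id_value, detail="brief"):
    """Return the JSON lookup URL for a single identifier.

    `bib_api_url` is a hypothetical helper, not part of any HathiTrust library.
    """
    if detail not in ("brief", "full"):
        raise ValueError("detail must be 'brief' or 'full'")
    if id_type not in ID_TYPES:
        raise ValueError(f"unsupported identifier type: {id_type}")
    # Percent-encode the identifier so it is safe to place in a URL path
    # (assumption: the service accepts encoded identifiers).
    encoded = quote(str(id_value), safe="")
    return f"{BASE_URL}/{detail}/{id_type}/{encoded}.json"


if __name__ == "__main__":
    # Build (but do not fetch) the URL for an OCLC number.
    print(bib_api_url("oclc", 424023))
```

The resulting URL can be retrieved with any HTTP client. Access to page images, OCR text, and bulk datasets goes through separate services with their own terms and is not covered by this sketch.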