Text and Data Mining (TDM) is a research process that uses software to extract and organize information from text files or data sets. Researchers use TDM tools to help identify patterns, connections, and relationships in text or data.
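As a rough illustration of what "extracting and organizing information from text" can look like in practice, the following minimal Python sketch counts word frequencies across a small set of documents. The function name `term_frequencies` and the sample sentences are hypothetical and invented for this example; real TDM projects typically use more sophisticated tools and must obtain their source texts in a way the relevant license permits.

```python
import re
from collections import Counter


def term_frequencies(documents):
    """Count how often each lowercase word appears across all documents."""
    counts = Counter()
    for text in documents:
        # Treat runs of letters (and apostrophes) as words; ignore everything else.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(words)
    return counts


# Hypothetical sample texts, invented for illustration only.
sample_docs = [
    "Copyright law permits some uses of text.",
    "Licenses may restrict mining of text and data.",
]

freqs = term_frequencies(sample_docs)

# Show the most frequent terms, a very simple form of pattern identification.
print(freqs.most_common(3))
```

Even a simple count like this organizes unstructured text into data that can be compared across documents, which is the basic idea behind larger-scale TDM.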
While Text and Data Mining for non-profit educational purposes often falls under Fair Use provisions of U.S. Copyright Law (see Association of Research Libraries' ISSUE BRIEF: Text and Data Mining and Fair Use in the United States for further information), the licenses for many electronic resources prohibit TDM. Additionally, the use of bots, crawlers, scripts, or other automated methods to search for and extract data is usually not permitted. Violating licensing terms can result in the University losing access to an electronic resource.
ASU Library is actively negotiating licenses with vendors to increase permissions for TDM by ASU researchers, and we will update this guide with new resources as these permissions are secured.
Please keep in mind that even if a license permits TDM, there are usually restrictions regarding how data can be accessed, used, and disseminated. Some vendors require researchers to provide detailed information about the research they plan to conduct; they may require researchers to sign an additional license; and they may charge a fee to provide data files. Some vendors require use of a specific application programming interface (API) to conduct TDM.