Mining & Plotting Historical Scholarship

How has historical scholarship changed over the past century? What trends and changes in this scholarship have gone unnoticed?

If you could trace, plot, graph, and map how historical research in and beyond North America has changed over a century, what would you like to know?

In this project, I draw on tens of thousands of book reviews from the flagship journal of historical research in North America, the American Historical Review (AHR). I am currently analyzing 10,000 born-digital book reviews published in the AHR over the past decade.

To this collection of recent book reviews, I also want to add the approximately 50,000 to 100,000 AHR book reviews from the 112 years that preceded the born-digital reviews. These earlier reviews are available on JSTOR, and I am beginning to work with JSTOR's "Data for Research" text-mining service. With this service, I can download large numbers of these book reviews at once, with one caveat: they have been disaggregated into frequency lists of n-grams, which list individual words (unigrams), two-word pairs (bigrams), and three-word groups (trigrams) by frequency. This form of dissemination is meant to prevent piracy. Unfortunately, it also makes some forms of digital text analysis impossible (e.g., identifying which adjectives commonly appear near particular peoples, places, or events). Nonetheless, the n-gram frequency lists still allow for some basic analysis of trends in historical scholarship.
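
To make the idea of an n-gram frequency list concrete, here is a minimal Python sketch of how a text might be broken down into unigram, bigram, and trigram counts. It is only an illustration: the sample sentence and the function name ngram_counts are invented for this example, and the sketch does not reproduce JSTOR's actual file format or tokenization rules.

```python
from collections import Counter

def ngram_counts(text, n):
    """Count every run of n consecutive words in text (lowercased).

    Hypothetical helper for illustration; real services tokenize
    more carefully (punctuation, hyphens, stop words, and so on).
    """
    words = text.lower().split()
    # Slide a window of width n across the word list.
    grams = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)

# An invented sentence standing in for the text of one review.
review = "a careful study of the fur trade and a careful study of empire"

unigrams = ngram_counts(review, 1)
bigrams = ngram_counts(review, 2)
trigrams = ngram_counts(review, 3)

for phrase, count in bigrams.most_common(3):
    print(count, phrase)
```

Once only the counts are kept, the original word order cannot be rebuilt beyond three-word windows, which is why questions about which words appear near one another at longer distances are out of reach for the older reviews.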

Text-Mining Possibilities
In what ways can we apply text-mining strategies to this corpus of book reviews? What questions can we answer?

Questions marked [d] can be answered only with the born-digital reviews; those marked [d?] are better answered with them. For the remaining questions, I will be able to use the frequency lists for the older book reviews, although perhaps with less precision.

  1. Diversity in topics, regions, presses: To what extent did historical scholarship diversify over time in terms of the regions, themes, and peoples studied?
  2. Trends in words and phrases: What words or phrases suddenly appeared in certain periods? Did they represent new concepts and theories or simply new ways of describing old themes?
  3. Topic modeling [d?]: What themes in historical literature predominated for a given period? How did these themes change over time?
  4. Descriptions [d]: How did the ways scholars described particular peoples or topics change over time? For example, how did scholars' treatment of Indigenous history and Indigenous peoples change across the decades?
  5. Junctures and turning points [d?]: What moments and periods witnessed the most abrupt changes in the ways scholars wrote about history? Can we connect these ruptures to broader historical moments and events, or to specific developments in academic and historical institutions?
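
As a rough illustration of question 2, the following Python sketch tracks how often a word appears per decade and finds the decade in which it first shows up. Everything here is an assumption made for the example: the years and counts are invented, and the function names rate_by_decade and first_decade are hypothetical. It shows one simple way to work with per-review unigram counts, not a finished method.

```python
from collections import Counter, defaultdict

# Invented sample data: (publication year, unigram counts) per review.
reviews = [
    (1925, Counter({"empire": 3, "nation": 2, "the": 10})),
    (1928, Counter({"empire": 1, "frontier": 2, "the": 8})),
    (1975, Counter({"gender": 1, "nation": 1, "the": 6})),
    (1978, Counter({"gender": 3, "empire": 1, "the": 12})),
]

def rate_by_decade(reviews, term):
    """Return occurrences of term per 1,000 words, keyed by decade."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, counts in reviews:
        decade = year // 10 * 10
        hits[decade] += counts[term]          # 0 if the term is absent
        totals[decade] += sum(counts.values())
    return {d: 1000 * hits[d] / totals[d] for d in sorted(totals)}

def first_decade(rates):
    """Return the earliest decade with a nonzero rate, or None."""
    return next((d for d, rate in rates.items() if rate > 0), None)

gender_rates = rate_by_decade(reviews, "gender")
print(gender_rates)
print(first_decade(gender_rates))
```

Dividing by the total word count for each decade matters because the number and length of reviews vary over time; raw counts alone would make busier decades look like turning points.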
