I grew up in Waukesha, Wisconsin. Surrounding Waukesha are towns with names such as Pewaukee, Mukwonago, Muskego, and Oconomowoc. The nearest large city is Milwaukee. The state itself borders Minnesota, Iowa, Illinois, and Michigan (well, the Upper Peninsula anyway). What do these place names all have in common?
All are Indigenous place names that have survived centuries of colonial removal and modern (neocolonial) erasures.
This tension between Indigenous resilience and colonial and modern erasures defines nearly every corner of the Americas. Studying these place names thus offers the chance to dismantle several intellectual and academic barriers, including those dividing the local and the global, North and Latin America, and the Indigenous and the Western.
This project examines what can be learned by applying a 'big data' approach to place names. The creation of large digital datasets, and of the tools to process and analyze them, has revolutionized other fields of study, such as demographic history, public health, and, for textual study, corpus linguistics, to name a few.
I argue these same tools and methods can be applied to the study of place names or toponyms. In fact, the data is already there to do that. For example, this project relies on the free, publicly available dataset of place names produced by GeoNames. This dataset includes over 11 million world-wide place names with geographic coordinates, alternate names or aliases, feature type classifications (town/city, mountain, river, etc.), country, and administrative district(s) (i.e. the states or provinces in which the place is found).
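As a rough illustration of how such a record can be read, here is a minimal sketch in Python (the project itself uses R, and this is not the project's code). It assumes the tab-delimited, 19-column layout of the GeoNames bulk download. The function name `read_geonames` and the two sample rows are hypothetical, with approximate coordinates written by hand for the example.

```python
import csv
import io

# Column order of the GeoNames tab-delimited dump (19 fields per row).
GEONAMES_COLUMNS = [
    "geonameid", "name", "asciiname", "alternatenames",
    "latitude", "longitude", "feature_class", "feature_code",
    "country_code", "cc2", "admin1_code", "admin2_code",
    "admin3_code", "admin4_code", "population", "elevation",
    "dem", "timezone", "modification_date",
]


def read_geonames(handle):
    """Yield one dict per place from a tab-delimited GeoNames file object."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != len(GEONAMES_COLUMNS):
            continue  # skip malformed lines
        record = dict(zip(GEONAMES_COLUMNS, row))
        record["latitude"] = float(record["latitude"])
        record["longitude"] = float(record["longitude"])
        yield record


# Hand-made sample rows (illustrative values, not copied from the dataset).
sample_rows = [
    ["1", "Waukesha", "Waukesha", "", "43.01", "-88.23", "P", "PPL", "US",
     "", "WI", "", "", "", "0", "", "", "America/Chicago", "2020-01-01"],
    ["2", "Milwaukee", "Milwaukee", "", "43.04", "-87.91", "P", "PPL", "US",
     "", "WI", "", "", "", "0", "", "", "America/Chicago", "2020-01-01"],
]
sample_text = "\n".join("\t".join(row) for row in sample_rows) + "\n"

places = list(read_geonames(io.StringIO(sample_text)))
for place in places:
    print(place["name"], place["country_code"], place["admin1_code"])
```

With the real download, the same function could be pointed at an opened file instead of the in-memory sample.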
This project focuses on the 3.7 million places from the Americas or Western Hemisphere in the GeoNames dataset. Even this subset is vast, but various digital tools make it possible to mine, explore, analyze, and visualize the data and to recognize patterns in it. I use the programming language R for this analysis, along with several tools both within and beyond R. A few of these tools include:
The initial analysis for this project will engage with the following questions:
Due to the diversity and huge number of Indigenous languages in the Americas, identifying Indigenous-language place names is difficult. I hypothesize that the easier task of identifying European place names, most of which are in Spanish, English, French, and Portuguese, will allow for the assumption that the vast majority of remaining place names are indeed of Indigenous origin. There are several ways to identify European place names. For languages such as Spanish, I hypothesized that a search for Spanish definite articles (la, el, las, los) and common religious titles (san, santa) would allow for the identification of many Spanish place names. To some extent this hypothesis was correct, as the map to the right makes clear. The greatest concentration of locations on this map is in areas that were once part of Spanish America, which stretched from the southwestern part of what is now the U.S. to Patagonia.
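The search described above can be sketched in a few lines. This is a minimal Python illustration, not the project's R code. The marker list comes from the text, while the whole-word matching, the case-insensitive flag, and the sample names are assumptions made for the example.

```python
import re

# Markers named in the text: Spanish definite articles and religious titles.
SPANISH_MARKERS = ("la", "el", "las", "los", "san", "santa")

# \b restricts matches to whole words, so "Elkhorn" does not match "el".
MARKER_PATTERN = re.compile(
    r"\b(?:" + "|".join(SPANISH_MARKERS) + r")\b", re.IGNORECASE
)


def has_spanish_marker(name):
    """Return True if the name contains one of the markers as a whole word."""
    return MARKER_PATTERN.search(name) is not None


names = [
    "San Juan Island", "La Quinta Inn", "Los Angeles",
    "Santa Catarina", "Elkhorn", "Waukesha", "Lasalle",
]
flagged = [name for name in names if has_spanish_marker(name)]
print(flagged)
```

The flagged names here include "La Quinta Inn" and "Santa Catarina", which shows why a marker search alone cannot separate Spanish names from business names or from names in other Romance languages.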
However, the French also use the feminine article "la" and the Portuguese also refer to female saints as "santa." These two words, the French "la" and the Portuguese "santa," explain the concentrations of such locations in Quebec, the Mississippi Valley, and Brazil. Other anomalous spatial patterns are the result of unique historical circumstances or modern corporate expansion. For the former, in the northwestern-most corner of the U.S., the explorer Francisco de Eliza named a group of islands between Vancouver Island and present-day Seattle "San Juan." For the latter, many places in the interior of the U.S. Northwest are actually La Quinta Inns. Obviously, I need to subset out modern business names from the GeoNames database.
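One possible way to subset out such entries is to filter on the GeoNames feature code. The following Python sketch assumes that hotels carry the feature code "HTL" (a code that exists under feature class "S" in the GeoNames scheme). Whether every La Quinta Inn in the data is actually coded that way is an assumption that would need checking, and the three records below are hypothetical.

```python
# Hypothetical records holding only the fields needed for the filter.
records = [
    {"name": "La Quinta Inn", "feature_class": "S", "feature_code": "HTL"},
    {"name": "La Crosse", "feature_class": "P", "feature_code": "PPL"},
    {"name": "San Juan Island", "feature_class": "T", "feature_code": "ISL"},
]

# Feature codes to drop; extend this set as other business types turn up.
EXCLUDED_FEATURE_CODES = {"HTL"}

kept = [r for r in records if r["feature_code"] not in EXCLUDED_FEATURE_CODES]
print([r["name"] for r in kept])
```

If some businesses turn out to be coded as generic buildings, a name-based filter (for example, dropping names that end in "Inn") might be needed as a fallback.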
As a scholar of early colonial Peruvian and Andean history, I am most familiar with Peruvian place names outside of my home state of Wisconsin. Quechua place names, for example, can often be identified by some common suffixes. These include "marca (marka)," "bamba (pampa)," "mayo (mayu)," and "tambo (tampu)," to name a few (here I give traditional Spanish spellings with modern Quechua orthography in parentheses). I am currently working on creating a map of place names with these suffixes. The distribution of such place names certainly will reflect to some extent the distribution of the Quechua language. However, I hypothesize the widespread expansion of the Inka Empire from Chile to Colombia would have dispersed such place names beyond the territory where Quechua speakers reside today. This is a work in progress....
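A suffix search of this kind might look like the following Python sketch (an illustration only, since the mapping itself is still in progress and is being done in R). The four suffix pairs come from the text. The function name `quechua_suffix` and the test names are assumptions for the example.

```python
import re

# Traditional Spanish spelling -> modern Quechua orthography, as listed above.
QUECHUA_SUFFIXES = {
    "marca": "marka",
    "bamba": "pampa",
    "mayo": "mayu",
    "tambo": "tampu",
}

# Match either spelling, but only at the very end of the name.
all_spellings = sorted(set(QUECHUA_SUFFIXES) | set(QUECHUA_SUFFIXES.values()))
SUFFIX_PATTERN = re.compile(
    r"(" + "|".join(all_spellings) + r")$", re.IGNORECASE
)


def quechua_suffix(name):
    """Return the matched suffix in lowercase, or None if there is none."""
    match = SUFFIX_PATTERN.search(name.strip())
    return match.group(1).lower() if match else None


for name in ["Cajamarca", "Urubamba", "Ollantaytambo", "Putumayo", "Lima"]:
    print(name, quechua_suffix(name))
```

One caveat is that "mayo" is also the Spanish word for May, so a name such as "Veinticinco de Mayo" would match as well. A filter of this kind would need a second pass to weed out such cases.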
The clearest identifiers of the linguistic origins of a place name are of course suffixes and prefixes. Often, however, units of language may be found within words as opposed to at one end or another. I recently noticed that a fellow native Midwesterner, who happens to be a data scientist, has also found the ubiquity of Indigenous place names in the Midwest fascinating. This blogger mapped the spatial distribution of the syllable "wau," which happens to be the root of my hometown (Waukesha) and the middle of the nearest sizable city (Milwaukee). This blogger's work is here.
I decided to recreate the same type of map using the GeoNames dataset. Using the simple regular expression ".*wau.*", I identified 1657 place names in the Western Hemisphere that include this syllable or morpheme (defined as the smallest meaningful unit of language that cannot be divided).
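For illustration, here is how that pattern behaves on a handful of names, written in Python rather than the R used for the project. Matching the whole name against ".*wau.*" is equivalent to searching for "wau" anywhere in it. The case-insensitive flag is an assumption, added so that names beginning with a capital "Wau" also match.

```python
import re

# The pattern from the text, applied to the whole name.
FULL_PATTERN = re.compile(r".*wau.*", re.IGNORECASE)
# An equivalent, simpler form: look for "wau" anywhere in the name.
SEARCH_PATTERN = re.compile(r"wau", re.IGNORECASE)

names = ["Waukesha", "Milwaukee", "Pewaukee", "Wauwatosa", "Madison", "Milwaukie"]

full_matches = [n for n in names if FULL_PATTERN.fullmatch(n)]
search_matches = [n for n in names if SEARCH_PATTERN.search(n)]

print(full_matches)
print(len(search_matches), "of", len(names), "names contain 'wau'")
```

Both forms select the same names. The search form is usually easier to read and avoids unnecessary backtracking on long strings.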
A careful review of this map reveals several patterns. Most obviously, confirming the work of the above-mentioned blogger and my own experiences, place names containing the morpheme "wau" are most commonly found in the Midwest. A closer look shows the concentration of these places throughout eastern Wisconsin and northeastern Illinois. In fact, of the 1657 occurrences of this morpheme across the Western Hemisphere, nearly half are found in these two states (710 or 42.8% appear in Wisconsin and 101 or 6.1% appear in Illinois). "Wau" is commonly found in Potawatomi territory, as becomes apparent when comparing my map to the right with this map showing Indigenous territory in the Great Lakes region during the 1760s.
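The percentages quoted above follow directly from the counts, as this short Python check shows. Only the three figures from the text are used. The variable names are arbitrary.

```python
# Counts reported in the text.
total = 1657
counts = {"Wisconsin": 710, "Illinois": 101}

# Share of all "wau" place names found in each state, to one decimal place.
shares = {state: round(100 * n / total, 1) for state, n in counts.items()}
combined = round(100 * sum(counts.values()) / total, 1)

print(shares)    # {'Wisconsin': 42.8, 'Illinois': 6.1}
print(combined)  # 48.9
```

Together the two states account for 811 of the 1657 names, or about 48.9%, which is the "nearly half" referred to above.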
However, the larger map of the Western Hemisphere indicates a few other concentrations of such place names. Interestingly, in many cases elsewhere in North America, place names with "wau" still have ties to the Potawatomi heartland. Many locations throughout the west in particular are named after Milwaukee. The most obvious example is Milwaukie, Oregon. There is also "Milwaukee Peak" in Colorado, "Milwaukee Junction" in South Dakota, and "Milwaukee Bridge" in Montana. Even my much smaller hometown of Waukesha has experienced its own diaspora: there is a "Waukesha Pass" in Montana and a "Waukesha Spring" in Washington. I could not find any more information on these other Waukeshas, but it is easy to imagine settlers moving west out of Wisconsin and naming the new locations they came upon after the Wisconsin town. This is especially the case for Waukesha Spring, WA, as its Wisconsin counterpart rose to fame in the mid to late nineteenth century as the site of springs said to have curative powers.
Outside the Potawatomi heartland, there is at least one other dense concentration of place names containing "wau" with local origins. In Guyana, the GeoNames database records the presence of 102 such place names. Furthermore, these 102 places are concentrated in one small region near the country's southwestern border with Brazil. The vast majority of these places are assigned the feature class "H", signifying a hydrological feature. A little research reveals "wau" means river in the Guyanese Indigenous language of Wapishana.3