
Toponymia Americana

Indigenous and European Place Names across the Americas

I grew up in Waukesha, Wisconsin. Surrounding Waukesha are towns with names such as Pewaukee, Mukwonago, Muskego, and Oconomowoc. The nearest large city is Milwaukee. The state itself borders Minnesota, Iowa, Illinois, and Michigan (well, the Upper Peninsula anyway). What do these place names all have in common?

All are Indigenous place names that have survived centuries of colonial removal and modern (neocolonial) erasures.

This tension between Indigenous resilience and colonial and modern erasures defines nearly every corner of the Americas. Studying these place names thus offers the chance to dismantle several intellectual and academic barriers, including those dividing the local and the global, North and Latin America, and the Indigenous and the Western.

This project examines what can be learned by applying a 'big data' approach to place names. The creation of large digital datasets, and of the tools to process and analyze them, has revolutionized other fields of study, such as demographic history, public health, and, for textual study, corpus linguistics, to name a few.

I argue these same tools and methods can be applied to the study of place names or toponyms. In fact, the data is already there to do that. For example, this project relies on the free, publicly available dataset of place names produced by GeoNames. This dataset includes over 11 million world-wide place names with geographic coordinates, alternate names or aliases, feature type classifications (town/city, mountain, river, etc.), country, and administrative district(s) (i.e. the states or provinces in which the place is found).
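To make the shape of this data concrete, here is a minimal sketch of reading GeoNames-style records. It is written in Python purely for illustration (the project itself uses R). The column names follow the tab-delimited dump layout as I understand it from the GeoNames readme and should be checked against the current documentation; `read_geonames` and the sample row are hypothetical, and the row's values are made up.

```python
import csv
import io

# Column names follow the GeoNames dump layout as I understand it
# (tab-delimited, no header row); verify against the GeoNames readme.
GEONAMES_COLUMNS = [
    "geonameid", "name", "asciiname", "alternatenames",
    "latitude", "longitude", "feature_class", "feature_code",
    "country_code", "cc2", "admin1_code", "admin2_code",
    "admin3_code", "admin4_code", "population", "elevation",
    "dem", "timezone", "modification_date",
]


def read_geonames(handle):
    """Yield one dict per place from a tab-delimited GeoNames-style stream."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        record = dict(zip(GEONAMES_COLUMNS, row))
        record["latitude"] = float(record["latitude"])
        record["longitude"] = float(record["longitude"])
        # Alternate names are stored as one comma-separated field.
        alt = record["alternatenames"]
        record["alternatenames"] = alt.split(",") if alt else []
        yield record


# A made-up sample row (illustrative values, not copied from GeoNames).
SAMPLE = "\t".join([
    "1", "Exampleville", "Exampleville", "Example Town,Ejemplo",
    "43.0", "-88.2", "P", "PPL", "US", "", "WI", "", "", "",
    "0", "", "250", "America/Chicago", "2020-01-01",
]) + "\n"

places = list(read_geonames(io.StringIO(SAMPLE)))
```

With a real download, the same function could be pointed at an open file handle instead of the in-memory sample.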

This project focuses on the 3.7 million places from the Americas or Western Hemisphere in the GeoNames dataset. Despite the vastness of even this subset, various digital tools, both within and beyond R (the programming language I use for this analysis), allow the mining, exploration, pattern recognition, and visualization of this data. A few of these tools include:

  1. Regular expressions allow the identification and extraction of textual patterns such as common prefixes, suffixes, or morphemes specific to a particular Indigenous or European language. To illustrate, the regular expression ".*(pampa|bamba)(\s.*){0,1}$" identifies all place names containing a word that ends in "pampa" - the common Quechua place-name suffix for "plain" or "valley" - or in its Hispanicized equivalent, "bamba." To translate: the expression matches any string in which "pampa" or "bamba" follows any run of characters (".*") and is then followed either by a space and further words ("\s.*") or by the end of the string ("$"). The quantifier "{0,1}" makes the "space and further words" part optional: it may occur zero times or once. Both spellings must sit inside a single group, "(pampa|bamba)"; written as "(pampa)|(bamba)", the alternation would split the whole expression in two, and its left half would match "pampa" anywhere in a name. What is implicitly excluded, therefore, is pampa/bamba being followed by other letters and thus not forming the end of a word.
  2. ggplot2 and similar "packages" available in R are among the best tools for data visualization and data exploration.
  3. Subsets of this dataset can then be mapped using GIS software. While R packages such as ggplot2 do allow mapping and geovisualization, importing data subsets into GIS allows easier integration with other geographic layers, such as political borders, rivers, mountain ranges, and linguistic and historical regions.
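The pampa/bamba pattern from the first item can be tried out with a short sketch. This one is in Python rather than R, simply as an illustration. Here both spellings sit inside a single group, `(pampa|bamba)`, so that the leading `.*` and the trailing end-of-word test apply to either one. The helper name `has_pampa_word` is hypothetical, and the names are hand-picked test strings rather than query results.

```python
import re

# Both spellings share one group so the end-of-word test applies to each.
# "(\s.*)?" is the same as "(\s.*){0,1}": optional space plus more words.
PAMPA_WORD = re.compile(r".*(pampa|bamba)(\s.*)?$", re.IGNORECASE)


def has_pampa_word(name):
    """Return True if some word in the name ends in 'pampa' or 'bamba'."""
    return PAMPA_WORD.match(name) is not None


# Hand-picked test strings, not output from the GeoNames dataset.
names = ["Cochabamba", "Pampa Grande", "Urubamba River", "Pampas", "Bambamarca"]
matches = [n for n in names if has_pampa_word(n)]
```

"Pampas" and "Bambamarca" are rejected because other letters follow the suffix, which is the exclusion described above.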

Guiding Questions

The initial analysis for this project will engage with the following questions:

  1. Why did so many Indigenous place names survive throughout the Americas? How can we explain the spatial and historical variation in this resilience?
  2. To what extent are place names indicative of historical presence? In other words, how often are the linguistic origins of a place name reflective of the place's history? And, conversely, how often are Spanish or Potawatomi place names, for example, found where neither group ever resided?
  3. Is there any correlation between the survival and resilience of native populations and the survival and resilience of Indigenous-language names? What other historical factors influenced the survival of Indigenous toponyms?
  4. What conclusions can we make about the long-term effects of conquest and colonization through a comparative macro-analysis or distant reading? For example, what insights emerge through the comparison of the differing historical, demographic, economic, environmental, cultural, linguistic, and - yes - even toponymic trajectories of various American regions?
  5. With the additional application of a few, select case studies, how does such micro-analysis corroborate, complicate, or challenge the picture presented by the above macro-analysis?

Some preliminary exploratory data analysis


Identifying European Place Names

Due to the diversity and huge number of Indigenous languages in the Americas, identifying Indigenous-language place names is difficult. I hypothesize that the easier task of identifying European place names - most of which are in Spanish, English, French, and Portuguese - will allow for the assumption that the vast majority of remaining place names are indeed of Indigenous origin. There are several ways to identify European place names. For Spanish, I hypothesized that a search for Spanish definite articles (la, el, las, los) and common religious titles (san, santa) would identify many Spanish place names. To some extent this hypothesis was correct, as the map to the right makes clear. The greatest concentration of locations on this map is in areas that were once part of Spanish America, which stretched from the southwestern part of what is now the U.S. to Patagonia.

However, the French also use the feminine article "la" and the Portuguese also refer to female saints as "santa." These two words - the French "la" and the Portuguese "santa" - explain the concentrations of such locations in Quebec, the Mississippi Valley, and Brazil. Other anomalous spatial patterns are the result of unique historical circumstances or modern corporate expansion. As an example of the former, in the northwestern-most corner of the U.S. the explorer Francisco de Eliza named a group of islands between Vancouver Island and present-day Seattle "San Juan." As an example of the latter, many places in the interior of the U.S. Northwest are actually La Quinta Inns. Obviously, I need to filter modern business names out of the GeoNames data.

Locations of all place names beginning with "la", "el", "las", "los", or "san".
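The leading-word search described in this section, together with a crude filter for business names, might look something like the following. This is a Python sketch for illustration only, not the R code behind the map. `BUSINESS_HINTS` is a hypothetical stop-list invented for this example, and the names are hand-picked test strings.

```python
import re

# Leading words named in the text: Spanish articles plus "san"/"santa".
# The trailing \s requires a whole word, so "Lake" or "Sandusky" do not match.
SPANISH_LEAD = re.compile(r"^(la|el|las|los|san|santa)\s", re.IGNORECASE)

# Hypothetical stop-list for modern business names; a real one would be longer.
BUSINESS_HINTS = ("inn", "hotel", "motel")


def looks_spanish(name):
    """Return True if the name begins with one of the Spanish lead words."""
    return SPANISH_LEAD.match(name) is not None


def looks_like_business(name):
    """Return True if any whole word in the name is on the stop-list."""
    words = name.lower().split()
    return any(hint in words for hint in BUSINESS_HINTS)


# Hand-picked test strings, not output from the GeoNames dataset.
names = ["San Juan Island", "La Quinta Inn", "Los Angeles",
         "Lake Elmo", "Santa Rosa", "Sandusky"]
candidates = [n for n in names
              if looks_spanish(n) and not looks_like_business(n)]
```

A filter like this still admits French names beginning with "La" and Portuguese names beginning with "Santa", so it narrows the problem rather than solving it.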

Quechua Suffixes: Traces of Inka Expansion?

As a scholar of early colonial Peruvian and Andean history, I am most familiar, outside of my home state of Wisconsin, with Peruvian place names. Quechua place names, for example, can often be identified by some common suffixes. These include "marca (marka)," "bamba (pampa)," "mayo (mayu)," and "tambo (tampu)," to name a few (here I give traditional Spanish spellings with modern Quechua orthography in parentheses). I am currently working on creating a map of place names with these suffixes. The distribution of such place names will certainly reflect, to some extent, the distribution of the Quechua language. However, I hypothesize that the widespread expansion of the Inka Empire from Chile to Colombia would have dispersed such place names beyond the territory where Quechua speakers reside today. This is a work in progress....

Locations of all place names ending with "tambo" or "tampu" (A waystation or inn for travelers found at regular intervals on the roads of the Inka Empire).
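Because each suffix appears in two spellings, one possible approach is to normalize both to a single label before mapping. The sketch below (Python, for illustration) pairs the spellings exactly as listed above. The function name `quechua_suffix` is hypothetical, and the test names are hand-picked.

```python
# Traditional Spanish spelling -> modern Quechua orthography, as paired
# in the text. Both spellings map to the same normalized label.
SUFFIX_VARIANTS = {
    "marca": "marka", "marka": "marka",
    "bamba": "pampa", "pampa": "pampa",
    "mayo": "mayu", "mayu": "mayu",
    "tambo": "tampu", "tampu": "tampu",
}


def quechua_suffix(name):
    """Return the normalized suffix a name ends with, or None."""
    lowered = name.lower()
    for variant, normalized in SUFFIX_VARIANTS.items():
        if lowered.endswith(variant):
            return normalized
    return None


# Hand-picked test strings, not output from the GeoNames dataset.
labels = {n: quechua_suffix(n)
          for n in ["Cajamarca", "Ollantaytambo", "Putumayo",
                    "Urubamba", "Milwaukee"]}
```

Matching on the final letters alone will produce false positives; "mayo", for instance, is also the Spanish word for May, so any real analysis would need a review step.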

Hints of Indigenous Languages: Locative Morphemes

The clearest identifiers of the linguistic origins of a place name are of course suffixes and prefixes. Often, however, units of language may be found within words as opposed to at one end or the other. I recently noticed that a fellow native Midwesterner, who happens to be a data scientist, has also found the ubiquity of Indigenous place names in the Midwest fascinating. This blogger mapped the spatial distribution of the syllable "wau," which happens to be the root of my hometown's name (Waukesha) and the middle of the name of the nearest sizable city (Milwaukee). This blogger's work is here.

I decided to recreate the same type of map using the GeoNames dataset. Using the simple regular expression ".*wau.*" I identified 1657 place names in the Western Hemisphere that include this syllable or morpheme (defined as the smallest meaningful unit of language that cannot be divided).

Locations of all occurrences of the syllable or morpheme "wau" in place names within the Americas.

A careful review of this map reveals several patterns. Most obviously, confirming both the above-mentioned blogger's work and my own experience, place names containing the morpheme "wau" are most common in the Midwest. A closer look shows these places concentrated throughout eastern Wisconsin and northeastern Illinois. In fact, of the 1657 occurrences of this morpheme across the Western Hemisphere, nearly half are found in these two states (710 or 42.8% appear in Wisconsin and 101 or 6.1% appear in Illinois). "Wau" is commonly found in Potawatomi territory, as becomes apparent after comparing my map to the right with this map showing Indigenous territory in the Great Lakes region during the 1760s.

Locations of place names with "wau" laid over a "heat map" showing the densest concentrations of such place names.
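The state shares quoted above follow directly from the counts. As a quick check, here is the arithmetic as a small Python sketch, using only the three figures given in the text (1657 total, 710 in Wisconsin, 101 in Illinois); the variable names are my own.

```python
# Counts taken from the text above.
TOTAL_WAU = 1657
counts = {"Wisconsin": 710, "Illinois": 101}

# Share of all "wau" place names found in each state, to one decimal place.
shares = {state: round(100 * n / TOTAL_WAU, 1) for state, n in counts.items()}

# Combined share for the two states ("nearly half").
combined = round(100 * sum(counts.values()) / TOTAL_WAU, 1)


def contains_wau(name):
    """Case-insensitive equivalent of the pattern '.*wau.*'."""
    return "wau" in name.lower()
```

The two states together account for 811 of the 1657 names, or roughly 48.9%.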

However, the larger map of the Western Hemisphere indicates a few other concentrations of such place names. Interestingly, in many cases elsewhere in North America, place names with "wau" still have ties to the Potawatomi heartland. Many locations throughout the West in particular are named after Milwaukee. The most obvious example is Milwaukie, Oregon. There is also "Milwaukee Peak" in Colorado, "Milwaukee Junction" in South Dakota, and "Milwaukee Bridge" in Montana. Even my much smaller hometown of Waukesha has experienced its own diaspora: there is a "Waukesha Pass" in Montana and a "Waukesha Spring" in Washington. I could not find any more information on these other Waukeshas, but it is easy to imagine settlers moving west out of Wisconsin and naming the new locations they came upon after the Wisconsin town. This is especially plausible for Waukesha Spring, WA, as its Wisconsin counterpart rose to fame in the mid to late nineteenth century as the site of springs said to have curative powers.

Outside the Potawatomi heartland, there is at least one other dense concentration of place names containing "wau" with local origins. In Guyana, the GeoNames database records the presence of 102 such place names. Furthermore, these 102 places are concentrated in one small region near the country's southwestern border with Brazil. The vast majority of these places are assigned the feature class "H", signifying a hydrological feature. A little research reveals "wau" means river in the Guyanese Indigenous language of Wapishana.3

Locations of place names with "wau" in Guyana.
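The kind of breakdown described for Guyana (filter by country, then tally feature classes) can be sketched as follows. This is Python for illustration only; the records are made up and merely use GeoNames-style field names, with "GY" as the ISO country code for Guyana and "H" as the hydrological feature class mentioned above. `wau_in_country` is a hypothetical helper.

```python
from collections import Counter

# Made-up records with GeoNames-style fields (not real GeoNames rows).
records = [
    {"name": "Example Wau", "country_code": "GY", "feature_class": "H"},
    {"name": "Sample Wau Creek", "country_code": "GY", "feature_class": "H"},
    {"name": "Wau Village", "country_code": "GY", "feature_class": "P"},
    {"name": "Waukesha", "country_code": "US", "feature_class": "P"},
]


def wau_in_country(rows, country_code):
    """Keep rows from one country whose name contains 'wau'."""
    return [r for r in rows
            if r["country_code"] == country_code
            and "wau" in r["name"].lower()]


guyana = wau_in_country(records, "GY")

# Tally how many matches fall into each feature class.
by_class = Counter(r["feature_class"] for r in guyana)
```

On the real data, a tally like `by_class` is what would show the predominance of class "H" among the Guyanese names.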