Toponymia Americana

Indigenous and European Place Names across the Americas

Exploring the hidden histories behind places and their names

I grew up in Waukesha, Wisconsin. Surrounding Waukesha are towns with names such as Pewaukee, Mukwonago, Muskego, and Oconomowoc. The nearest large city is Milwaukee. The state itself borders Minnesota, Iowa, Illinois, and Michigan (well, the Upper Peninsula anyway).

What do these place names all have in common?

All are Indigenous place names that have survived centuries of colonial removal and modern (neocolonial) erasures. The pronunciation and location of these names have changed over time. For some, the original meaning has been lost outright or lost in translation. For others, the meaning and importance of these names are preserved by descendant communities, if anyone bothers to ask.

This tension between Indigenous resilience and colonial and modern erasures defines nearly every corner of the Americas. Studying these place names thus offers the chance to dismantle several intellectual and academic barriers, including those dividing the local and the global, North and Latin America, and the Indigenous and the Western.

This project examines what can be learned by applying a 'big data' approach to place names. Large digital datasets, and the tools to process and analyze them, have revolutionized other fields of study, such as demographic history, public health, and, for textual study, corpus linguistics, to name a few.

Identifying European Place Names

Place names that begin with Spanish articles ("El", "La", "Los", or "Las") or the titles of saints ("San" or "Santa").

Due to the diversity and large number of Indigenous languages in the Americas, identifying Indigenous-language place names is difficult. I hypothesize that it will be an easier task to identify European place names. We can assume that once European names in the Americas - most of which are in Spanish, English, French, and Portuguese - are identified and set aside, the vast majority of the remaining place names will be of Indigenous origin.

There are several ways to identify European place names. For Spanish, I hypothesized that searching for names that begin with Spanish definite articles (la, el, las, los) or common religious titles (san, santa) would identify many Spanish place names. To some extent this hypothesis was correct, as the map to the right makes clear. The greatest concentration of locations on this map is in areas that were once part of Spanish America, which stretched from the southwestern part of what is now the U.S. to Patagonia.
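The prefix search described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the names SPANISH_PREFIXES and starts_with_spanish_prefix, and the hand-picked sample list, are assumptions made for the example.

```python
import re

# Prefixes taken from the paragraph above: Spanish definite articles
# and the saint titles "San" / "Santa".
SPANISH_PREFIXES = ("el", "la", "los", "las", "san", "santa")

# The prefix must be a whole first word, so it has to be followed by whitespace.
# This keeps "Lagos" and "Santander" from matching on "La" or "San".
PREFIX_PATTERN = re.compile(
    r"^(?:" + "|".join(SPANISH_PREFIXES) + r")\s",
    re.IGNORECASE,
)


def starts_with_spanish_prefix(name: str) -> bool:
    """Return True if the place name begins with one of the listed prefixes."""
    return PREFIX_PATTERN.match(name) is not None


# Hand-picked sample names (illustrative only, not drawn from the dataset).
sample_names = [
    "La Paz", "San Juan", "Los Angeles", "Santa Fe", "El Paso",
    "Las Vegas", "Lagos", "Santander", "Milwaukee", "La Crosse",
]

matches = [name for name in sample_names if starts_with_spanish_prefix(name)]
print(matches)
```

A search like this works only on the surface form of a name. "La Crosse" passes the test because it begins with "La", even though the pattern cannot tell which language the article comes from.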

However, French also uses the feminine article "la," and Portuguese also refers to female saints as "santa." These two words - the French "la" and the Portuguese "santa" - explain the concentrations of such locations in Quebec, the Mississippi Valley, and Brazil. Other anomalous spatial patterns are the result of unique historical circumstances or modern corporate expansion. For the former, in the northwesternmost corner of the U.S., the explorer Francisco de Eliza named a group of islands between Vancouver Island and present-day Seattle "San Juan." For the latter, many of the matching places in the interior of the U.S. Northwest are actually La Quinta Inns. Clearly, I need to filter modern business names out of the GeoNames data.
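One possible way to set aside business names is sketched below. It assumes each record carries a GeoNames-style feature code and that hotels are tagged "HTL". Both the field names and the choice of codes and keywords are assumptions for illustration and should be checked against the GeoNames feature-code list. The records are toy examples.

```python
# Toy records. The field names are illustrative and only loosely follow
# the columns of a GeoNames export.
records = [
    {"name": "La Quinta Inn", "feature_class": "S", "feature_code": "HTL"},
    {"name": "La Quinta Inn and Suites", "feature_class": "S", "feature_code": "BLDG"},
    {"name": "La Grande", "feature_class": "P", "feature_code": "PPL"},
    {"name": "San Juan Island", "feature_class": "T", "feature_code": "ISL"},
]

# Assumption: "HTL" marks hotels. Extend this set after checking the code list.
EXCLUDED_FEATURE_CODES = {"HTL"}

# Fallback keyword check for businesses that carry a more generic feature code.
BUSINESS_KEYWORDS = {"inn", "hotel", "motel"}


def looks_like_business(record: dict) -> bool:
    """Flag a record by feature code first, then by whole-word keywords in the name."""
    if record["feature_code"] in EXCLUDED_FEATURE_CODES:
        return True
    words = record["name"].lower().split()
    return any(word in BUSINESS_KEYWORDS for word in words)


place_names = [r["name"] for r in records if not looks_like_business(r)]
print(place_names)
```

The keyword check is deliberately crude and would also drop any historical place whose name happens to contain "Inn", so a filter like this is best treated as a first pass to be reviewed by hand.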

Linguistic Clues encoded in place names

Various linguistic and historical clues lie buried among these 4 million place names. An analysis of these names in full provides some interesting results. For example, Springfield - ubiquitous in Anglo America - is found in 95 place names including 74 locations in the U.S., 10 in Canada, 10 in Jamaica, and 1 in Barbados.

Many more clues are contained within the linguistic parts of these names. For this reason, I analyze morphemes - the smallest meaningful units of language. The challenge lies in searching for such morphemes.

Today, approximately 10 million people speak Quechua in the Andes. We can trace its spread across much of the region through place names. Many place names in the region end with common Quechua suffixes like "bamba (pampa)," "mayo (mayu)", and "tambo (tampu)," which mean, respectively: plain/valley, river, and roadside inn.

Applying regular expressions to this place-name dataset allows us to conduct more complex searches. A regular expression lets a researcher define a search pattern. In this case, I want to search for all place names that end with the above suffixes. Through regular expressions we can extract common prefixes, suffixes, or other morphemes specific to a particular Indigenous or European language. To illustrate, the regular expression ".*(pampa|bamba)(\s.*){0,1}$" identifies all place names containing a word that ends in "pampa" - the common Quechua place-name suffix for "plain" or "valley" - or in its Hispanicized equivalent, "bamba." To translate: the expression matches any string in which the sub-string "pampa" or "bamba" follows any combination of characters (".*") and is then followed either by a space and further text ("\s.*") or by the end of the string ("$"). The quantifier "{0,1}" states that the group "(\s.*)" may occur zero times or once. The two spellings must share a single group, "(pampa|bamba)"; otherwise the "|" splits the whole expression in two, and the end-of-string condition applies only to "bamba." What is implicitly excluded, therefore, is "pampa" or "bamba" followed directly by other letters, and thus not forming the end of a word.
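The suffix search can be tried out with Python's built-in re module. The sketch below is an illustration under stated assumptions rather than the project's own code. It places the two spellings of each suffix in one group so that the end-of-word condition applies to both, writes "{0,1}" as the equivalent "?", and extends the same pattern to the "mayu/mayo" and "tampu/tambo" suffixes mentioned earlier. The helper names suffix_pattern and classify, and the sample names, are hypothetical.

```python
import re


def suffix_pattern(*variants: str) -> re.Pattern:
    """Build a pattern for names containing a word that ends in any of the variants."""
    alternatives = "|".join(re.escape(v) for v in variants)
    # (\s.*)? is equivalent to (\s.*){0,1}: the suffix is followed either by
    # a space and more text, or directly by the end of the string.
    return re.compile(rf".*({alternatives})(\s.*)?$", re.IGNORECASE)


# Glosses follow the text: plain/valley, river, roadside inn.
QUECHUA_SUFFIXES = {
    "plain/valley": suffix_pattern("pampa", "bamba"),
    "river": suffix_pattern("mayu", "mayo"),
    "roadside inn": suffix_pattern("tampu", "tambo"),
}


def classify(name: str) -> list[str]:
    """Return the glosses of every suffix pattern the name matches."""
    return [gloss for gloss, pattern in QUECHUA_SUFFIXES.items() if pattern.match(name)]


# Hand-picked sample names (illustrative only).
sample_names = [
    "Cochabamba", "Urubamba", "Pampa Grande", "Bambamarca",
    "Putumayo", "Ollantaytambo", "Milwaukee",
]

results = {name: classify(name) for name in sample_names}
for name, glosses in results.items():
    print(name, glosses)
```

"Pampa Grande" matches because "Pampa" is followed by a space and more text, while "Bambamarca" is excluded because letters follow "bamba" directly. Names in which the morpheme appears at the start or in the middle of a word would need a separate pattern.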

It's not a surprise to find a dense concentration of these place names within the boundaries of the former Inka Empire....