Language: Unclassified

Have you ever tried to make your own language? If so, chances are you didn’t get very far past basic vocabulary, because it’s fairly difficult. But, if you work hard and beat the odds, you might come up with a language that gains a following and stakes its claim somewhere in the world.

We have many types of constructed languages, or conlangs, all around the globe. There are secret languages, or argots, like Pig Latin, which are entirely based on an existing language (in this case, English) and are very understandable after an explanation and a bit of practice. There are also artistic languages like Dothraki or Klingon that are consciously constructed from scratch, with their own systems of grammar and vocabulary that are deliberately unlike any other in order to sound otherworldly. In addition to these, there are auxiliary (international) languages like Esperanto, a type of constructed lingua franca, which aim to encourage communication between speakers of different languages via simple and “universal” structures that draw elements from languages all over the world or in a designated region. That’s just the tip of the conlang iceberg. Whether or not any of these languages are realistically useful is debatable, but they all seem to have a purpose, and they have all achieved that purpose to some degree, regardless of how many non-native speakers they’ve enticed along the way to actively use them.

The subject of conlangs is fascinating and I will surely publish posts on them in the future, but there is one thing that sets them apart from other languages: they are not natural. This seems obvious, and the term natural itself is a controversial one, but they are not generated in the same way that real languages are.

“Real language” is constructed, of course. Unless you believe that a non-human creator crafted each and every tongue spoken throughout each of the six inhabitable continents, we can agree on that. But this construction happens over time, over a period of years, decades, centuries, sometimes millennia, and the language actively evolves and transforms as the communication of its speakers has to adapt to their circumstances. This construction is often slow-going, predictable in some ways and arbitrary in others. Speakers don’t always agree on how to speak about or describe an object, an emotion, a period of time, a unit of measurement, an abstract idea, or any other complicated facet of language and human thought. Geography and natural events force people to move their physical homes, and groups are split and isolated from one another. New machines, art, social constructs, and other creations are left needing new names and descriptors in different societies. The rise of modern language likely wasn’t as simple as the story of Babel. It was complicated, very complicated. There is a whole field of linguistic science dedicated to it, and there is still much to learn.

There exist what linguists call “language families,” sets of languages (both living and extinct) that have common origins. The term “family” is both a scientific and relational one. Two given languages can have a common parent language and they would be what we could think of as “language siblings.” The languages that they parent would then be “language cousins,” and so on. Just like a family tree, after a few generations, the relationships can become wildly complicated. Just as your third cousin twice-removed might look nothing like you, related languages can be unrecognizable without properly investigating or documenting the pedigree. There is still a common ancestor there, even if the two of you have had a sizable amount of disparate genetic influences that have made your similarities minimal. Language works in a very similar way.

There are at least 13 widely accepted and thoroughly investigated language families. Among them, the number of languages has a large range, just as human families do. The Austronesian family has more than 1,250 languages, while the Dravidian family has 86. These families divide into subgroups, sort of like smaller families, which get smaller and smaller. It should be noted that there is not, nor will there probably ever be, a universal agreement on what a language really is and what makes languages distinct. This might sound silly if you are a monolingual English speaker (believe me, English is strangely unique), but what separates a dialect from a language is a huge and contentious discussion. Bosnian, Serbian, and Croatian are extremely similar, probably more similar than Spanish and Catalan (which is a separate Romance language, not a dialect of Spanish), or even the different dialects of English. For mostly political and cultural reasons, they are separated and categorized as different, even though speakers of all three can understand each other almost perfectly, just as Brits, Aussies, and Americans can. When trying to create linguistic family trees, this complicates things.

English is part of the Indo-European language family. Further, it is a Germanic language, unlike Spanish, French, and Italian, which are Romance languages. If these classifications seem complicated, it is because they are. See below (and zoom in) and try to find English or a language you know on the most recent Indo-European family tree, a chart that even ancestry.com probably couldn’t fathom.

Indo-European family tree

As you can see, languages have changed drastically over time. Certain events or processes, such as colonization, globalization, and the dawn of the digital age, have complicated things tremendously, bringing languages into contact that never would have met previously. Because of this, there exist many types of “World Englishes,” as well as hybrid contact varieties such as Spanglish. And as the Earth’s people change and certain languages give way to new ones, other languages are left behind or isolated in the process. This brings us to a very special phenomenon: unclassified languages.

An unclassified language is one that cannot easily fit into a genetic family tree. These are very rare because, as we’ve discussed, languages are born from one another. It is the linguistic equivalent of a John Doe, a language that exists, but we don’t know where it came from or to whom it’s related. It had to come from somewhere. Like a human being, it needed parents to be born, yet identifying those parents has proven to be a difficult feat. It’s true that there are cases of languages that are not related to any other language, such as Basque or Ainu. These language isolates form their own, single-member language families. What distinguishes them from unclassified languages is that they are (or were) widely spoken and there is a large amount of evidence for them that confirms their uniqueness. An unclassified language may prove to be a language isolate upon further study, but currently, there isn’t enough evidence to make a decision about it. Unclassified languages are also distinct from extinct languages, ones that no longer have any speakers. An unclassified language might also be extinct, but there are several unclassified languages that are still spoken, albeit with a small population of speakers.

So how is this possible? There seem to be several reasons for it. The first is that sometimes we don’t have the opportunity to interpret a language at all. There are languages, such as now-extinct Cypro-Minoan, for which we have data that we have tried but failed to decipher. There are other small languages that became extinct recently, before we were able to collect data, such as Nagarchal of India. Additionally, some languages exist that we as outsiders do not have the invitation to learn more about. The languages of uncontacted peoples that live away from the rest of the world, such as Sentinelese or Himarima, are not available to us and might never be.

North Sentinel Island in the Andaman Islands below India


The next group of unclassified languages are those for which we don’t have enough data to make an informed decision. Some of these languages might be so endangered that they only have a handful of speakers, like Kujarge of Chad or Lufu of Nigeria. They also might be entirely extinct, with their final speakers being the only ones able to provide data to linguists, such as Luo of Cameroon, which only had one speaker left in 1995. Or we might just have some records, but not enough to truly determine which family the language belongs to or where it belongs within the families that it is hypothesized to be a part of. Bung, a language in Cameroon, is one of these. There are also a number of languages from centuries ago that are mentioned in historical sources and for which we have evidence, but it is too difficult to pinpoint when they went extinct or how they relate to the languages around them. Minoan, Gutian, and Khitan (Liao), which was the official language of the Liao Empire in Northeast China, are examples of these.

While unclassified languages are separate from language isolates, there are some languages that seem to be isolates, but which have so few native speakers and so few non-native speakers that it is hard to confirm them as such. Kwaza, a language in Brazil, has only around 54 speakers, has no acknowledgments in historic sources, and is starkly different from the languages in the family it was assumed to be a part of. Jalaa is a Nigerian language that is now extinct, but is alive in the memories of the younger generations whose elders spoke it, and it is also unlike any of the languages in the region. Bangime is a Malian language with roots that match the languages of West Africa, but with vocabulary that is very different. It has been described as an “anti-language,” a language that is used specifically to inhibit outsiders from understanding it, and the name that the speakers have chosen for themselves literally translates to “the hidden people.” Laal is a language that is somewhat of a mix between two language families, but its endangered status makes it difficult to pin down as an isolate or as part of one (or both) of those families.

Lastly, there are languages that are accounted for in sources, but whose existence is doubted by linguists. Many of these languages are spoken by a single ethnic group and are identified by outsiders as separate languages when they might actually be dialects of a nearby language. Oropom, Nemadi, and Wutana are examples of this. Of course, this brings us back to the conversation concerning what a language is and what a dialect is. Then, there are those languages which belong to legendary stories. The Trojan language is perhaps the most well-known, but there is no real evidence that it ever existed, and so it can’t be classified.

Unclassified languages are very uncommon, but that doesn’t mean that they don’t exist. Whether all of these languages are isolates that really are unlike any other language or none of them are, they are outliers that linguists have trouble relating to other languages. Language classification is important for language preservation and to prevent the extinction of both use and recognition. It allows us to acknowledge where we’ve come from and how creative and generative the human race can naturally be when such a thing is required of us.

Linguists are far from perfect, so maybe they are missing the mark somewhere and some of these languages will be classified in the near future. After all, even entire proposed language families are found to be misguided and inaccurate (such as Altaic), so it’s very possible that the understanding of our linguistic genetics will change as we learn more. Regardless, these languages are mysterious and a huge source of wonderment for many historical linguists. Who knows, maybe some are deliberately trying to stay unknown, desiring to stay “hidden” or unbothered, like the Himarima or Bangime. We might never complete the worldwide family tree, but I doubt that will stop us from trying.
