Archive for March, 2023

Detail #434: Number, Clusivity, Personal Pronouns

Wednesday, March 29th, 2023

I imagine this might actually be something that exists in a language. Consider first and second person pronouns and number. Normally, the number marked is the number of the group discussed. That is, when I say 'we', I might very well be the single person present who belongs to this 'we'. Of course, there's clusivity, which can clarify this, but let's consider plural 'you'. This may very well be uttered towards a single person who represents a group that is mostly absent.

Is there any language that encodes both the number of the group it refers to and the number of persons currently present out of that group?

In part, however, this might even be a bit redundant, and we could introduce a further complication beyond the redundancy.

The obvious uses are:

1-singular-plural: I who am the only person present, and some people
2-singular-plural: you who are the only person present, and some people

Is the second slot meant to signify number of non-present, or is it meant to include 'the full number of referents'? These two give different interpretations:

interpretation 1: 1-plural-plural: I, and some people
interpretation 2: 1-plural-plural: I, and some people who are present, and some other people

Thus, we here have two options: conflate the distinction whenever several members of the group are present, or distinguish them thus:

1-plural: I and some people who are present
1-plural-plural: I, and some people who are present and some people who aren't

Naturally, this should be easy to extend to duals and trials.
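
To make the second option concrete, here is a minimal Python sketch that builds the gloss labels used above from head counts. It is only an illustration under stated assumptions: the function names are hypothetical, the first slot counts the group members who are present (including the speaker or addressee), the second slot counts the absent members and is dropped when nobody is absent, and an absent group of exactly one is labelled "singular", a case the post does not discuss.

```python
def number_label(count: int) -> str:
    """Map a head count to a number label (singular/plural only)."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return "singular" if count == 1 else "plural"


def pronoun_gloss(person: int, present: int, absent: int = 0) -> str:
    """Build a gloss such as '1-singular-plural'.

    person  -- 1 or 2
    present -- group members present, including the speaker/addressee
    absent  -- group members not present (assumption: slot omitted if 0)
    """
    if person not in (1, 2):
        raise ValueError("only first and second person are modelled")
    if absent < 0:
        raise ValueError("absent cannot be negative")
    parts = [str(person), number_label(present)]
    if absent > 0:
        # Assumption: the second slot counts only the absent members.
        parts.append(number_label(absent))
    return "-".join(parts)


print(pronoun_gloss(1, present=1, absent=3))  # 1-singular-plural
print(pronoun_gloss(1, present=3))            # 1-plural
print(pronoun_gloss(1, present=3, absent=2))  # 1-plural-plural
```

Extending this to duals and trials would only mean adding more cases to `number_label`.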

An interesting simple approach for a conlang could be this though: just have singular and plural, and distinguish by number of addressees. Also, 'I' can mean 'we' if only I, out of the whole group, am present.

Word break conventions and emergent typology

Thursday, March 16th, 2023

I've been doing a lot of free-form writing in Koa this year and it's been a pretty revealing experience. There's nothing that exposes semantic gaps and structural shortcomings like trying to write complex, expressive prose; initially all my writing felt unbelievably clumsy, with none of the grace, sophistication or subtlety that I try to embody when I write in other languages I know well. After a month or two, though, I feel like I'm starting to find my voice in Koa -- or maybe more accurately, Koa is finding its own nascent voice.

This is really the first constructed language in which I've navigated this process and it's fascinating (and intimidating): coming from a place of only having written single, unconnected example sentences, how does the language in question construct, say, a whole paragraph? How does it flow structurally? I feel so practiced in other areas of language design, but here I'm just doing my best to move through it all in an intuitive way without getting hung up on my own anxiety. Someday I'll have to try to actually articulate some of these emergent principles, but I think they need time to emerge further first.

In the meantime, another thing that came up as I began keeping a regular journal in Koa was a discovery I only made when I tried to read what I'd written later on. For one thing, I knew theoretically that production and comprehension were different disciplines, but I wasn't quite prepared for just how unpracticed I was at understanding my own language. It makes sense: I'd really never had the opportunity to try to interpret speech or writing coming at me before! In response I added a word recognition module to my vocab learning program; previously it had only been testing me on production in the target language.

More surprisingly, though, it turned out that the way I've always represented Koa is kind of hard to parse. Here's an example block of text written in the traditional style:

Ta lai la ka ásulo ta la ko vúakupu e ko mivami, sii, ta mene la ko kóuva e tule lai la ni. Ni si vima poli lo kopato ve hua i cu misucu, ala he lopu poka i pea pono e ka lila ni sai i si kali. E ka tana i kali i koe ka sena. Hala kehe nu lu nike la ko mova ka kecu, ka nu lu ete la ko mupea ka háote nu ne kene koa.

As soon as I start to read it my eyes sort of go out of focus; with such a rapid stream of little words it's hard for me to keep track of where I am in the text, let alone where I am in the syntactic tree. As a result, over the past month I've been experimenting with writing roots with their particles attached to them. The precise rules about what should be attached and what should be left separate are still developing, but the essence of the system has come together nicely. Here's what that previous paragraph looks like with the new conventions:

Talai lakaásulota lakovúakupu e komivami, sii, tamene lakokóuva e tule lai laní. Nisivima poli lokopato ve hua i cumisucu, ala helopu poka i pea pono e kalílani sai i sikali. E katana i kali i koe kasena. Hala kehe nulunike lakomova kakecu, kanuluete lakomupea kaháotenu nekene koa.

Even though this was unfamiliar, I instantly found it massively easier to parse. Allison said that made sense to the extent that there were many more word shapes now for the brain to grab onto; it's also entirely clear which particles belong to which roots, and morpheme clusters mirror natural intonation groups. Here's an attempt to articulate the principles of the system.

1. Particles whose scope is a predicate -- regardless of how complex it is -- are written together with that predicate. This may require the use of additional accentuation where possessive pronouns and directionals are suffixed to the root.

ninasitemuláheta = "I couldn't make him leave"

2. Particles whose scope is a clause with a pronominal subject are joined to that clause (but see point 6).

nisánota lakomutulu kakúmumani = "I said it to make my teacher angry"

3. Particles whose scope is a clause with a full subject NP are separated from surrounding words.

nitovo ko le Kéoni i cutule = "I hope that John will come"

4. Predicate clusters -- compounds and incorporated objects -- are written together, but plain adjectival phrases are not joined to their head nouns

kalopuviko = "the weekend," but
kapasano vime = "the last statement"

5. Pronominal particles follow the same rules as predicates when used as the head of an NP, but must be marked with an accent.

laní = to me
nahunú = none of us

6. Certain particles, principally with clause-level scope, are always written separately: i, e when it means "and," au, ai, ha when it means "if," ve when used as a complementizer, and ko when used alone as a complementizer (this list may not be exhaustive). Le is also separated from its head to avoid muddlement with capitalization and foreign words.
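
The core of these principles can be sketched as a small Python function. This is an illustration under stated assumptions, not a description of any actual tool: the function name and the three tags are hypothetical, and the caller has to tag each morpheme by hand as "attach" (a particle joined to what follows), "separate" (a particle from point 6) or "root", since whether particles like e, ha, ve and ko attach depends on their meaning in context. The sketch ignores suffixed pronouns and directionals and does not add accents. When attaching particles are stranded in front of a separated particle, it writes them together as their own cluster, which is an arbitrary choice.

```python
from typing import Iterable, List, Tuple

# (morpheme, tag) where tag is "attach", "separate", or "root".
# The tags are supplied by the caller; nothing here knows Koa's lexicon.
Token = Tuple[str, str]


def join_words(tokens: Iterable[Token]) -> str:
    """Join attaching particles onto the following root."""
    words: List[str] = []
    pending: List[str] = []  # attaching particles waiting for a root
    for morpheme, tag in tokens:
        if tag == "attach":
            pending.append(morpheme)
        elif tag == "root":
            words.append("".join(pending) + morpheme)
            pending.clear()
        elif tag == "separate":
            if pending:
                # Arbitrary choice: stranded particles form one cluster.
                words.append("".join(pending))
                pending.clear()
            words.append(morpheme)
        else:
            raise ValueError(f"unknown tag: {tag!r}")
    if pending:
        words.append("".join(pending))
    return " ".join(words)


# The example from point 3, with tags assigned by hand.
sample = [
    ("ni", "attach"), ("tovo", "root"),
    ("ko", "separate"), ("le", "separate"), ("Kéoni", "root"),
    ("i", "separate"), ("cu", "attach"), ("tule", "root"),
]
print(join_words(sample))  # nitovo ko le Kéoni i cutule
```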

One point of uncertainty: when a particle is written separately from its head but is itself within the scope of other particles, are those particles also separated or should they be attached to the "frontmost" one? For example, which of the following should be the convention?

nisánota lakole Kéoni i cutule
nisánota lako le Kéoni i cutule
nisánota la ko le Kéoni i cutule
"I said it so that John would come"

I'm not sure yet; I'll get back to you after more experimentation. I suspect a standard will shape itself over time.

A bunch of this, incidentally, may actually be an artifact of trying to smoosh Koa into an alphabetic writing system. If the language could be written with a syllabary rather than an alphabet, and if there were some marking that identified the stressed syllable of predicates -- in other words, if predicates were instantly differentiated visually from particles -- then there would be a much closer match between writing and Koa's native structure.

But what, then, is Koa's native structure? I had always thought of it as a basically isolating language, but one thing that really surprised me when I first saw text written with these new conventions is how...agglutinating it looks. I'm sort of shocked that I've never asked this question before, but...where does the structure of Koa really fit, typologically?

The language is certainly about as close as you can get to monoexponential in that each morpheme is (theoretically) encoding one and only one semantic, and since I've been thinking of all particles and predicates as individual "words," my unconsidered classification of isolating seemed justified. But looking at forms like this one from above...

ninasitemuláheta = "I couldn't make him leave"

...I really wonder on what grounds I would not call that a "word." A word constituting a complete sentence, with seven morphemes, which a Turkish speaker could feel right at home with. And if that resemblance isn't just incidental but in fact diagnostic, then classifying those first five morphemes as "particles" is obscuring something important: they're actually prefixes. Occupying slots, in a specific order. Like an agglutinative language.

I'm actually not sure how to make a ruling on this, and more thought and research may be required. Some of those particles certainly can stand on their own in certain contexts -- nate "no, I can't," or keka sa? ni "who is it? me" -- and maybe more revealingly, the pronouns can appear to be gapped: 

"the one who couldn't make him leave"

"the one I couldn't make leave"

On the other hand, I've vigorously maintained previously that gapping is in fact not the best explanation for these structures, despite the fact that it's possible to draw the trees that way. It may be that this new word break convention, along with the kinds of apparent agglutinative "words" it produces, is itself also obscuring some of the true nature of the base structures. Ultimately this is not a question of graphical representation -- whether we write ni na si te mu lahe ta or ninasitemuláheta -- but of what's really happening below the surface. And I'm starting to tie my brain in knots, which is a pretty clear sign that I need to put down this problem for a bit.

More to come, clearly.

A parting of the ways

Wednesday, March 8th, 2023

My decision last month to remove all predicate roots beginning with /j/ from the Koa lexicon sparked a significant artistic crisis. I tried to accept the replacements, but as time passed I was confronted with a growing feeling that this change was not okay. I loved those proscribed roots, loved the variation in syllable structure that they provided, and realized that I would like Koa less without them; worse than that, that it would feel like it had lost some of the essence of itself. It would feel like it was no longer mine.

I was clearly right when I said that this phoneme had no place given Koa's charter, but it just doesn't matter: apparently at this point the language has developed such a strong sense of itself, especially after all the vocabulary creation and writing that's been going on this year, that honoring that personality is actually more important. The charter was supposed to be an inspiration, not a prison, and the fact is that I love what Koa has become so much that I would rather change the limits than stifle the language to fit within them.

This may seem like a lot of fuss over 20 roots and a marginal phoneme, but this is the first time I've ever consciously and intentionally prioritized aesthetics over the language's ease or clarity. It's uncomfortable, but also unquestionably the right decision.

Emboldened by this I've found myself thinking crazy thoughts, like considering adding another consonant phoneme. I experimented with [ŋ] and was shocked to discover that I actually loved it, and that it "felt" like Koa despite the fact that it would be completely off the deep end charter-wise. I don't know that I'll really go down that path, but it's sort of a wonderful thing that after 23 years there is something that Koa "feels like" to such a clear extent that it can begin to direct its own course into territory I'd never imagined.

Over the weekend I reinstated all my exiled vocabulary. It was a tremendous relief. Honestly I think I would have died on that hill for iolo alone.

Taadži Linguistics

Wednesday, March 1st, 2023

Lauren Kuffler is a computational geneticist and hobbyist conlanger. They are a Ph.D. candidate in Mammalian Genetics at Tufts Graduate School of Biomedical Sciences, focusing on the 3D context of genetic-epigenetic interactions affecting gene expression. They have a lifelong interest in linguistics. The Taadži language is their first conlang to escape private notebooks. They have been working on the language and its associated worldbuilding for two years.

Tade Taadži is the representative conlang of an ongoing worldbuilding project, focusing on a culture that arises from dispossessed peoples transported to an isolated archipelago. This article will provide a brief historical context for the language, describe its grammar and demonstrate its logo-phonetic writing system with example sentences and an illuminated text. Notable features include an extensive system of ligatures in formal texts, and a five-gender personal pronoun system.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License