Good names form Galois connections

  2024-04-01

You can know the name of that bird in all the languages of the world, but when you’re finished, you’ll know absolutely nothing whatever about the bird. You’ll only know about humans in different places, and what they call the bird. So let’s look at the bird and see what it’s doing—that’s what counts.

Richard Feynman, What Do You Care What Other People Think?

I recently read a chapter on naming things in The Programmer’s Brain by Felienne Hermans. There is a lot of research on the topic of names in programming. Researchers have found that good names improve code comprehension and that poor names correlate with buggy code, and they have proposed a few heuristics for choosing good names (see, for example, How Developers Choose Names by Dror G. Feitelson et al. and To camelcase or to under_score by Dave Binkley et al.).

The book provides a few recommendations on how to create good names (there are many books with naming heuristics, ranging from classics such as Code Complete by Steve McConnell and Clean Code by Robert C. Martin to the more recent Naming Things by Tom Benner), but it left me unsatisfied. I felt that I already knew how to choose good names, but I wanted a good mathematical model for what good names are. This article is the result of my exploration.

Objects, concepts, and expressions

I will defend the thesis that thought has a kind of priority over words and that the concept has a kind of priority over its expression.

Johan Granström, Treatise on Intuitionistic Type Theory, p. 2

In the beginning was the Word. No, scratch that. There is overwhelming evidence that the world existed for billions of years before our ancestors uttered their first words. In the beginning was the Thing.

But then came the Word, right? Nah. Living creatures created mental models of their environments long before the first language appeared. Even soulless and speechless neural networks seem to recognize concepts (such as Chihuahua or muffin) in raw sensory input. The Concept came after the Thing.

Unfortunately, concepts are internal to the mind holding them. Until we figure out how to make telepathy work, we must use some encoding to communicate concepts between minds. That’s where the Word comes into the picture. Words are arbitrary symbols representing concepts.

I’ll borrow terminology from Treatise on Intuitionistic Type Theory and use object to refer to things existing in the world (we won’t need them much, though), concept to refer to mental representations of these objects, and expression to refer to words or sentences used to communicate concepts.

The relationship between an object (Fluffy the cat), a concept (the idea of a cat in the mind), and its expression (the word cat).

Maps of meaning

The distinction between objects, concepts, and expressions is almost as old as Western philosophy. Unfortunately, it misses one crucial point: concepts and their expressions don’t map one-to-one. Most concepts have multiple expressions; most words map to several concepts (any dictionary is living proof). For example, consuming food can be labeled as eating or dining. And the word go can refer to walking or a board game. More precisely, expressions and concepts form a binary relation.

The relationship between concepts and words is many-to-many: different words can express the same concept, and unrelated concepts can map to the same word.

We must add another ingredient to the picture to tackle the problem of names: the context. Context refers to all the other things that influence the interpretation of an expression: surrounding words in the sentence, the task the reader is trying to accomplish, and the reader’s culture and background. The amount of information in the context is vast as it includes the strengths of all the neuron connections in the reader’s brain.

The mismatch of the writer’s and the reader’s context, generally known as the curse of knowledge, is one of the reasons why writing is so hard. The writer must imagine being in the reader’s shoes when structuring the material, deciding which metaphors to use, and expressing herself. The narrower the reader’s profile, the better job a writer can do. That explains why there is no one perfect book for every subject. The reader’s identity is no less important than the book’s contents.

Four people with different backgrounds interpreting one phrase in various ways. A cat lover thinks of Fluffy the cat, a construction worker thinks of the Caterpillar company, a mathematician thinks of the Cat category, and a programmer thinks of the cat command-line tool.

The context turns the many-to-many relation between concepts and words into two maps. The first map, encode : Concept → Expression, defines how the author picks words to communicate the concept. The second map, decode : Expression → Concept, defines how readers interpret these words.
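
To make the picture concrete, here is a minimal Python sketch. The vocabulary, the contexts, and the selection rules are all invented for illustration; the point is only that the raw concept-word relation is many-to-many, and that supplying a context collapses it into the two maps described above.

```python
# The raw relation between concepts and expressions: a set of (concept, word) pairs.
RELATION = {
    ("feline pet", "cat"),
    ("feline pet", "kitty"),
    ("unix concatenation tool", "cat"),
    ("board game of go", "go"),
    ("act of walking", "go"),
}

def decode(word: str, context: str) -> str:
    """How a reader in a given context interprets a word (toy rule:
    a programmer resolves ambiguity towards technical meanings)."""
    candidates = sorted(c for (c, w) in RELATION if w == word)
    technical = [c for c in candidates if "tool" in c or "game" in c]
    if context == "programmer" and technical:
        return technical[0]
    return candidates[0]

def encode(concept: str, context: str) -> str:
    """How a writer picks a word for a concept (toy rule: the shortest
    word that the intended audience decodes back to the same concept)."""
    candidates = sorted((w for (c, w) in RELATION if c == concept), key=len)
    for word in candidates:
        if decode(word, context) == concept:
            return word
    return candidates[0]  # fall back to the shortest word

print(decode("cat", context="cat lover"))          # feline pet
print(decode("cat", context="programmer"))         # unix concatenation tool
print(encode("feline pet", context="programmer"))  # kitty (avoids the clash with the tool)
```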

Names as communication channels

No matter where you go, you usually find that Claude Shannon was there and solved your problem fundamentally and elegantly. Your only solace is that sometimes, his solution was not constructive.

In information theory, a channel is a theoretical construction that allows sending messages through time and space. A channel takes messages from a set X as inputs and produces messages from a set Y as outputs. Channels are noisy: they distort the messages passing through them; that’s why the channel’s input and output sets might differ. We represent the noise mathematically by using conditional probabilities: p(y | x) is the probability of receiving message y if the sender sent message x.
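
Here is a tiny sketch of that definition in Python, with made-up probabilities: the channel is nothing more than a table of conditional probabilities p(y | x), and pushing an input distribution through the table yields the distribution of what the receiver sees.

```python
# A toy channel described entirely by its conditional probabilities p(y | x).
# The numbers are invented for illustration.
p_y_given_x = {
    "0": {"0": 0.9, "1": 0.1},
    "1": {"0": 0.2, "1": 0.8},
}

def output_distribution(p_x: dict[str, float]) -> dict[str, float]:
    """Distribution of the channel output, given a distribution over inputs."""
    p_y: dict[str, float] = {}
    for x, px in p_x.items():
        for y, pyx in p_y_given_x[x].items():
            p_y[y] = p_y.get(y, 0.0) + px * pyx
    return p_y

# A uniform input gets distorted into a skewed output.
print(output_distribution({"0": 0.5, "1": 0.5}))  # roughly {'0': 0.55, '1': 0.45}
```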

When you think of a channel, you probably imagine a cable transmitting electrical signals or how the space around you buzzes with electromagnetic waves. Channels, however, take many forms. Any storage device, such as a hard drive, a notebook, or the brain’s memory system, is a channel transmitting messages through time.

We can view words as communication channels passing concepts from one mind to another through time. As we saw in the Maps of meaning section, these channels are fundamentally noisy: the output concept might differ from the sender’s input depending on the word choice and the receiver’s context.

Words are noisy channels transmitting concepts across minds. For example, when readers see the word index, they can think of a pointer into an array instead of the inverted index concept the author sent.

One fundamental characteristic of a channel is how much its output resembles the input. Mutual information is the value quantifying this resemblance mathematically.

This view gives us a criterion for a good name: it’s the channel maximizing mutual information. In the following formula, Cs is the sender’s concept, Cr is the receiver’s concept, E is the set of all possible concept expressions, and I(Cr; Cs) is the mutual information between the receiver’s and the sender’s concepts when the sender uses expression e.

best_name(Cs) = argmax e ∈ E I(Cr; Cs)
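
To make the formula a bit more tangible, here is a hedged sketch in Python. It assumes a toy prior over the concepts the sender might mean and an invented table of how readers decode each expression (reusing the index example from the figure above), then computes I(Cr; Cs) for each candidate name of the inverted-index concept and takes the argmax. None of the numbers are meaningful; they only illustrate the shape of the computation.

```python
import math

# Concepts the sender might want to communicate, with a made-up prior.
prior = {"inverted index": 0.5, "array index": 0.5}

# How readers decode each expression into a concept: p(c_r | expression).
# All probabilities are invented purely for illustration.
decode_probs = {
    "inverted_index": {"inverted index": 0.95, "array index": 0.05},
    "index":          {"inverted index": 0.30, "array index": 0.70},
    "idx":            {"inverted index": 0.10, "array index": 0.90},
}

def mutual_information(candidate: str) -> float:
    """I(C_r; C_s) when 'inverted index' is named `candidate`
    and 'array index' keeps its usual name 'idx'."""
    encode = {"inverted index": candidate, "array index": "idx"}
    # Joint distribution p(c_s, c_r) induced by encoding and then decoding.
    joint = {(cs, cr): prior[cs] * decode_probs[encode[cs]][cr]
             for cs in prior for cr in prior}
    p_cr = {cr: sum(joint[cs, cr] for cs in prior) for cr in prior}
    return sum(p * math.log2(p / (prior[cs] * p_cr[cr]))
               for (cs, cr), p in joint.items() if p > 0)

candidates = ["index", "inverted_index"]
print({e: round(mutual_information(e), 3) for e in candidates})
print("best name:", max(candidates, key=mutual_information))
```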

The channel interpretation of names is quantitative but not particularly satisfying because it involves probabilities that are hard to compute. Furthermore, the only way to know you’ve found a good expression for your concept is to compare it with all other expressions. Could we find a more elegant way to capture the essence of a good name?

Names as Galois connections

Galois connections do something really cool: they tell you the best possible way to recover data that can’t be recovered. More precisely, they tell you the best approximation to reversing a computation that can’t be reversed.

We already know that encode and decode are not perfect inverses of each other. The next best thing we can hope for is that these maps form a relaxed version of inversion: a Galois connection.

A Galois connection is a mathematical concept capturing a relationship between four objects: two partially ordered sets A and B and two monotonic maps F and G between these sets. A partial order means that we can compare some elements of each set using an operation ≤ that satisfies a few natural laws (reflexivity, antisymmetry, and transitivity); not all elements need to be comparable. A function F is monotonic if it preserves relative order: if x ≤ y, then F(x) ≤ F(y).

Functions F: A → B and G: B → A form a Galois connection if they satisfy the following relation: F(a) ≤ b ⇔ a ≤ G(b) for all a ∈ A and b ∈ B.
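
Before applying the definition to names, it helps to see it at work on a classic numeric instance: on the integers with the usual ordering, F(a) = 2a and G(b) = ⌊b/2⌋ form a Galois connection. The sketch below is a toy check, not a proof; it verifies the defining equivalence on a small range.

```python
# Doubling and floor-halving on the integers form a Galois connection:
# 2 * a <= b holds exactly when a <= b // 2 (floor division).

def F(a: int) -> int:
    return 2 * a

def G(b: int) -> int:
    return b // 2  # floor division rounds towards negative infinity

# Check the defining equivalence F(a) <= b  <=>  a <= G(b) on a small range.
assert all((F(a) <= b) == (a <= G(b))
           for a in range(-10, 11)
           for b in range(-10, 11))
print("F(a) <= b  <=>  a <= G(b) holds on the sampled range")
```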

A Galois connection between functions F and G.

This definition is abstract and took me a while to appreciate, so let’s apply it to the problem of good names step by step. We start with the sets in question and their orderings.

Our first set is the set of expressions E. For expressions e1, e2 ∈ E, we say that e1 ≤ e2 if e1 is at least as expressive as e2 given the reader’s context. If two expressions are equally expressive, we define the shorter one as smaller. The reader’s context is a crucial component of this ordering. For example, cat is more expressive than felis catus in a regular conversation, but it might be the other way around in a scientific paper. This paragraph doesn’t do a good job of defining the expression ordering, but that’s fine: the whole point of the Galois connection is to make expression comparison precise through its connection with better-defined objects.

Our second set is the set of concepts C. For concepts c1, c2 ∈ C, we say that c1 ≤ c2 if c1 is at least as specific as c2. For example, cat ≤ animal: since every cat is an animal, there are at least as many animals as cats. The idea of concepts being more or less specific might sound handwavy, but we can make it precise using the methods of formal concept analysis.
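
For instance, here is a minimal sketch of that idea: identify each concept with the set of objects it applies to (the toy universe below is made up) and compare concepts by set inclusion, so cat ≤ animal falls out of every cat being an animal.

```python
# Concepts as sets of the objects they apply to (a made-up toy universe).
instances = {
    "cat":    {"fluffy", "tom"},
    "dog":    {"rex"},
    "animal": {"fluffy", "tom", "rex"},
}

def at_most(c1: str, c2: str) -> bool:
    """c1 <= c2: c1 is at least as specific as c2."""
    return instances[c1] <= instances[c2]  # subset inclusion

print(at_most("cat", "animal"))  # True: every cat is an animal
print(at_most("animal", "cat"))  # False
print(at_most("cat", "dog"))     # False: the two concepts are incomparable
```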

The functions forming the Galois connection are the concept decoding function decode : E → C and the concept encoding function encode : C → E. These functions are monotone because encode maps more specific concepts to more expressive names, and decode maps more expressive names to more specific concepts. The connection becomes decode(e) ≤ c ⇔ e ≤ encode(c).

Even the specialized equivalence looks unintuitive, so let’s see how it holds up in an example. Imagine we want to name a variable holding the amount of USD in a bank account (that’s our concept c). Would value be an appropriate variable name?

Let’s see what our connection tells us: decode(value) is a vague concept, less specific than the bank account balance. Hence, decode(value) > c, so value is not a good name. I would prefer usd_balance, which decodes to a more specific concept.

But wait, wouldn’t bank_account_usd_balance be an even better name? It depends on the context! A shorter name is preferable if the reader knows that we are dealing with bank accounts in a piece of code. At least, that’s how I defined the expression ordering; your mileage may vary.
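
To tie the pieces together, here is a small sketch of the check used above. It represents concepts by illustrative instance sets (as in the earlier sketch) and uses an assumed, context-dependent decode table for the candidate names; both are invented for the example rather than derived from any real codebase.

```python
# Concepts, identified with the set of situations they cover (toy universe).
concepts = {
    "usd balance of a bank account": {"checking_usd", "savings_usd"},
    "amount of money":               {"checking_usd", "savings_usd", "cash_eur"},
    "some value":                    {"checking_usd", "savings_usd", "cash_eur",
                                      "temperature", "item_count"},
}

def more_specific_or_equal(c1: str, c2: str) -> bool:
    """c1 <= c2 in the concept ordering (c1 is at least as specific as c2)."""
    return concepts[c1] <= concepts[c2]

# How a reader of this codebase might decode each candidate variable name.
decode = {
    "value":                    "some value",
    "usd_balance":              "usd balance of a bank account",
    "bank_account_usd_balance": "usd balance of a bank account",
}

target = "usd balance of a bank account"  # the concept c we want to name

for name, concept in decode.items():
    ok = more_specific_or_equal(concept, target)  # decode(e) <= c ?
    print(f"{name}: {'specific enough' if ok else 'too vague'}")

# Among the names that are specific enough, the expression ordering
# (shorter wins when equally expressive) favours usd_balance.
```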

Conclusion

In this article, we explored the difference between concepts and their expressions, determined the crucial role of the reader’s context in the interpretation, and finished with heavy mathematical machinery that pinpoints good names using a variant of Occam’s razor: A good name is as succinct as possible but still specific enough to convey the intended concept.

How do we apply this knowledge? For me, the main lesson is that managing the reader’s context is crucial for effective communication.

Happy naming!

Concise and Consistent Naming by Florian Deißenböck and Markus Pizka provides a similar, though slightly simpler, formal model of variable names in section 3.2.
