The numeric tower fiasco
✏ 2023-11-22 ✂ 2023-11-22Object-oriented programming is an exceptionally bad idea which could only have originated in California.
Introduction
Object-oriented programming
Object-oriented
is a vague term that means different things to different people.
This article focuses on the class-based flavor that the most popular programming languages (Java, Python, C++, C#) implement.
is still the dominant paradigm in the software industry.
The typical advice on modeling the problem domain is to fit your entities into hierarchies where specific types inherit common attributes and behaviors from base
classes.
That idea looked intuitive and attractive to me when I first encountered it.
I spent countless hours learning about the good
object-oriented design and trying to apply these ideas in practice.
The practice showed that real engineering problems rarely fit into rigid class hierarchies. I grew more frustrated with object-oriented programming and eventually entirely gave up on it. This article describes one example that stuck with me from the start of my programming career: modeling the hierarchy of number types.
The numeric tower
God made the integers, all the rest is the work of man.
I like math, so one of the first structures I tried to model with classes was the most precisely specified and well-studied hierarchy in human history: the numerical tower.
Mathematicians discovered The primary motivation for introducing integers was to create a numeric space where equations of the form a + x = c, where a and c are naturals, always have a solution. Rationals play the same role for equations of the form a × x = c. quite a few types of numbers; usually, each new type is the most natural extension of the previous one:
- Naturals (ℕ) are the numbers we use for counting: 0, 1, 2, ….
- Integers (ℤ) extend naturals to include negative whole numbers: 0, 1, -1, 2, -2, ….
- Rational numbers (ℚ) add common fractions where the numerator and denumerator are whole numbers: 1, ½, -½, ⅓, -⅓, ….
The tower goes up and includes real numbers, complex numbers, quaternions, etc.
The OOP design
In this section, we’ll design a class hierarchy modeling the numbers using an imaginary object-oriented programming language.
We’ll limit the tower to only two number types to keep the example simple: Naturals
and Integers
.
Most people (me included) will instinctively reach for the class structure where Natural
is the base class, and Integer
extends that base.
I’m sure you immediately spotted a problem with this design: we inverted the hierarchy, breaking the is a
relationship!
We claim that all integers are also naturals, but that’s false.
The design violates the Liskov substitution principle: passing an Integer
to a function that expects a Natural
can produce incorrect results:
> Natural.make(5) + Integer.make(10, true) // 5 + -10
15 // ouch
Enlightened by this failure, we reverse the inheritance hierarchy.
Integer
becomes our base class, and Natural
extends it.
Once the initial excitement dissipates, we face a new issue.
The simplest case, the Natural
numbers, must have the same representation as the more complex type, Integers
.
The problem becomes more apparent if we add more types to the tower, such as Rational
.
Instead of starting with the simplest case and building from it, we started with the most complicated one and made everything else a special case.
That’s why the original incorrect design was so compelling: we moved from simpler to more complex types, not the other way around!
Another design option is to give up concrete type hierarchies and clamp all number types under a single interface.
This approach is the most versatile, but it has many drawbacks:
- The programmer must write a lot of boilerplate code.
- All types in the hierarchy need to know about one another. Adding a new type requires changing all other types.
- Adding a new operation on numbers requires modifying all the classes.
It’s time to abandon our blunt OOP tools and start from the first principles.
The functional design
The object-oriented tradition obsesses with encoding is a
relations among types as a class hierarchy.
In this section, we’ll use the approach mathematicians employ: we start with simple structures and combine them to form more complex objects.
This approach is similar to using composition instead of inheritance in the OOP world.
Our tool of choice will be the Haskell programming language, but the same principles will work in all languages supporting algebraic data types and functions (even Rust would do, though it would be less pretty).
In our first attempt, we will model each numeric type with a separate Haskell type.
- According to Peano, a natural number is either a zero or a successor of a smaller natural number. This definition is equivalent to the unary numeral system, which is inefficient for computation but convenient for demonstration.
- An integer number is either a non-negative natural number n or a negative number -1 - n, where n is a natural number. Note that with this definition, each integer has a unique representation.
-
We encode the
is a
relation between numeric types as a pure total function. OOP languages usually generate these type-casting functions automatically and apply them implicitly.
Note how modular this implementation is.
Naturals can live in a separate module and don’t need to know about the existence of integers.
We can easily change the implementation of naturals (e.g., use the more efficient binary representation) or add more numeric types (e.g., Rational
s) without breaking other code.
However, the design is not yet satisfactory: we lost the ability to operate uniformly on numbers of different types.
We can add a universal Number
type to reclaim this property.
This design might seem like a re-iteration of the Number
interface story from the OO world, but it’s not:
- We have many opportunities to reduce the boilerplate code to the bare minimum. We will discuss these opportunities shortly.
-
The module implementing
Naturals
doesn’t need to know aboutIntegers
orNumbers
. - We don’t need to modify existing types when we add new functions operating on numbers.
To address the boilerplate issue, we’ll introduce the numeric type promotion operation. When we add two numbers of different types, we convert the simpler type to the more complex one using the previously discussed type conversion functions. We then apply the binary operator dealing with numbers of the same promoted type. Finally, we demote the result to the simplest type that can hold the value.
The impl_binary_op
function captures the promote/execute/simplify pattern.
Concrete binary operators (plus
, minus
, etc.) must deal only with numbers of the same type.
With this, we get a clean and concise model of numbers on a computer.
The code is easy to extend to support more numeric types, which is left as an exercise for the reader
I find adding the Complex
type particularly interesting.
Hint: you might need two different generic Number
types.
.
Conclusion
I find OOP technically unsound… I find OOP philosophically unsound… I find OOP methodologically wrong.
When I use a tool and get unsatisfactory results, I blame myself for not using the tool correctly. At first, I felt the same about the numeric tower case, but now I’m sure the problem is not me; it’s the class hierarchies.
The approach of piling classes on top of one another and trying to make this stack coherent is fundamentally flawed.
It fails spectacularly even on tiny examples where the problem domain is mathematically specified.
The numeric tower is one example, but there are more.
The circle-ellipse problem is a good one,
but my favorite is how inheriting ColorPoint
from Point
breaks the transitivity property of the equals
method:
There is no way to extend an instantiable class and add a value component while preserving the
equals
contract, unless you’re willing to forgo the benefits of object-oriented abstraction.
If something doesn’t work on small examples, why do we expect it to work in large codebases? When was the last time you looked in a complex class hierarchy and admired how easy to understand the code was?
Experts often recommend favor composition over inheritance
E.g., Joshua Bloch, Effective Java
, third edition, item 18, p. 87.
.
I agree with them and favor algebraic data types and simple functions as the ultimate form of composition.