Stepanov’s biggest blunder
✏ 2025-07-13 ✂ 2025-07-13- The curious case of adjacent difference
- Three central problems of calculus
- Symmetry vs pragmatism
- Appendix: deltas in q
The curious case of adjacent difference
If you have ever tried using the std::adjacent_difference
algorithm in c++,
I’m sure it left you puzzled.
As the name suggests, this algorithm computes differences between adjacent elements of the input sequence,
but it does one more thing: it copies the first element of the input sequence into the output sequence unmodified.
The following example demonstrates how to apply the algorithm to delta-compress a postings list of document identifiers that contain a search term
(the example is contrived since Google developed much more sophisticated posting list compression techniques).std::adjacent_difference
algorithm.
The compressed version might require less memory when encoded using variable-length integers.
#include <iostream>
#include <numeric>
#include <vector>
// prints:
// 12586 426 548 110 566
int main() {
// Sorted list of documents containing a word.
std::vector<int> postings{{12586, 13012, 13560, 13670, 14236}};
std::vector<int> compressed(postings.size());
std::adjacent_difference(
postings.cbegin(),
postings.cend(),
compressed.begin()
);
for (auto n : compressed) std::cout << n << " ";
std::cout << std::endl;
}
This extra copy makes the algorithm less generic and practical,
since a difference of two values of an arbitrary type is likely to have a different type.
For example, the difference between two timestamps is not a timestamp but a duration;
and subtracting two unsigned integers might require a signed integer.
The following example illustrates this limitation: the code attempts to compute durations between timestamps in a container, but the compiler rejects it.std::adjacent_difference
over a vector of timestamps leads to a compiler error.
#include <chrono>
#include <numeric>
#include <vector>
int main() {
using clock = std::chrono::steady_clock;
auto start = clock::now();
std::vector<clock::time_point> timestamps{
start,
start + std::chrono::seconds(5),
start + std::chrono::seconds(15)
};
std::vector<clock::duration> durations;
std::adjacent_difference(
timestamps.cbegin(),
timestamps.cend(),
std::back_inserter(durations)
);
}
Why did Alex Stepanov design adjacent_difference
in such a way?
Was it a mistake?
No, it was a deliberate design choice; the original sgi stl documentation provides the reasoning behind it:
The reason it is useful to store the value of the first element, as well as simply storing the differences, is that this provides enough information to reconstruct the input range. In particular, if addition and subtraction have the usual arithmetic definitions, then
adjacent_difference
andpartial_sum
are inverses of each other.
std::adjacent_difference
and std::partial_sum
are inverses:
computing partial sums over the adjacent differences of a sequence yields that sequence, and vice versa.std::partial_sum
undoes the effects of std::adjacent_difference
, and vice versa.
In this and other sequence examples, each line transforms the line above it.
sequence 1 1 2 3 5
adjacent_difference 1 0 1 1 2
partial_sum 1 1 2 3 5
This symmetry has a beautiful connection to the fundamental theorem of calculus, and it helped me understand both the algorithms and the theorem more intuitively. The following section explores this connection in detail.
Three central problems of calculus
According to Steven Strogatz
Steven Strogatz, Infinite Powers
, p. 144
, calculus has three central problems:
- given a curve, find its slope everywhere (the forward problem),
- given a curve’s slope everywhere, find the curve (the backward problem),
- given a curve, find an area under it (the area problem).
The fundamental theorem of calculus connects all three problems and states that the area under a slope of a curve on an interval is the difference of the curve height evaluated at the ends of this interval:
Calculus deals with continuous curves, but we will tackle these problems through the lens of discrete value sequences. We’ll view a sequence as a function over natural numbers and denote the discrete derivative of as and its antiderivative as , where
corresponds to differences between adjacent elements in the source sequence, and corresponds to partial sums. The discrete variant of the fundamental theorem becomes:
This formula has an off-by-one issue: if , we have to access the minus first element, so, for convenience, we define .
I like how Kyne Santos explained this result:
imagine you’re hiking up a mountain, going from point A to point B, and you want to find out the overall change in elevation. So let’s say that point A, the starting point, is 500 meters above sea level. In the first hour, you ascend 100 meters. In the second hour, you descend 50 meters, and in the third and final hour you ascend 200 meters to arrive at point B, which is 750 meters above sea level. The question is, what’s the overall change in elevation? Well, there’s two ways to go about it. You can find the final elevation, which is 750 meters, and just subtract the starting elevation, which was 500 and the difference between 750 and 500 is 250 meters. Or you can add up the little changes along the way. So in the first hour, we climbed 100 meters, and then we descended 50, and then we climbed another 200 so 100 minus 50 plus 200 is 250 meters. And these two approaches represent the two sides of the equation in the fundamental theorem of calculus.
Finding slopes with adjacent differences
The slope between two discrete values is their difference, so taking the difference between neighboring points of a sequence solves the forward problem, yielding slopes for a sequence of elements.
sequence 1 3 5 7 9
slopes 2 2 2 2
This operation loses information: we cannot recover the original sequence from its slopes. In calculus, taking a derivative is also a lossy operation since functions and (where is an arbitrary constant) have the same derivative .
Recovering the original with partial sums
The backward problem is the most interesting of the three. Since the sequence of slopes had one less item than the source sequence, there is not enough information to recover the latter. We know the changes between points, but not the first value to apply the changes to. Once we know it, we can compute the rest using partial sums.
slopes 2 2 2 2
sequence C C+2 C+4 C+6 C+8
The lossiness of differentiation is the reason for introducing an arbitrary constant
when computing an indefinite integral in calculus:
.
In the discrete case, this constant corresponds to the first value in the original sequence;
that’s why std::adjacent_difference
preserves the first element in its output.
Finding areas with partial sums
The area
of a sequence is the sum of its elements.
The sequence of partial sums solves this problem
because it allows us to sum the items on any subinterval of the original sequence in a single step:
(as I previously mentioned, we define
to be zero).
sequence 1 3 5 7 9
partial sums 1 4 9 16 25
Partial sums don’t introduce or lose any information. We can always recover the original sequence using the stl definition of adjacent differences:
sequence 1 3 5 7 9
partial sums 1 4 9 16 25
adjacent diff 1 3 5 7 9
Symmetry vs pragmatism
The connection between std::partial_sums
and std::adjacent_difference
is aesthetically pleasing,
but I find the design of the latter algorithm unfortunate:
-
std::adjacent_difference
is significantly less generic than its lossy version (i.e., a version that doesn’t copy the first element) would be, as it forces the output element type to match the input element type. The few times I needed to compute pairwise differences, the semantics ofstd::adjacent_difference
stood in the way, and I ended up writing a custom loop. I’m not alone. Luckily, thepairwise_transform
adapter from c++23 doesn’t make the extra copy. -
Derivatives in calculus are lossy,
so the definition of
std::adjacent_difference
doesn’t exactly correspond to a derivative. Forcing the symmetry between discrete algorithms breaks the symmetry with their continuous counterparts.
I disagree with Stepanov’s design choice, but I’m glad it made me question his intentions and find these hidden connections. api design is hard, especially when you try to express novel ideas, such as efficient generic programming, in a new programming language, as c++ was at the time. Just like Einstein’s cosmological constant, Stepanov’s extra copy turned out to be useful after all.
Appendix: deltas in q
Just like std::adjacent_difference
,
the deltas
function from the q
programming language preserves the first item of its input:
deltas 1 4 9 16
1 3 5 7
However, the deltas
function operates slightly differently from its c++ cousin:
Instead of copying the first item verbatim,
it prepends a seed value of zero to the sequence before computing pairwise differences.
deltas 1 4 9 16 == (1 - 0) (4 - 1) (9 - 4) (16 - 9) == 1 3 5 7
q
defines deltas
in terms of a more general Each Prior operator as -':
.
This direct form picks the seed based on the operation (zero for subtraction, one for multiplication, etc.)
and allows the caller to override the default through its left argument.
(*':) 2 3 4 / 1 is the identity for *.
2 6 12
1950 -': 1952 1954 1960 / Use 1950 as the seed instead of zero.
2 2 6
This design preserves symmetry with partial sums
but avoids the type mismatch between the input and output sequences, achieving both elegance and pragmatism.