Debug like Feynman, test like Faraday

  2024-09-26

Some of the finest software engineers I’ve met didn’t study computer science. They graduated with a physics degree and taught themselves programming (some people from this camp whom I haven’t met in person: Stephen Wolfram, the creator of Mathematica and Wolfram Alpha; Bartosz Milewski, the author of Category Theory for Programmers; and David Roundy, the author of the Darcs version control system). These people think differently than most engineers:

Inspired by these traits, I started to look for ways to apply insights from physics to software engineering. These disciplines have much in common: physics is about understanding and describing complex systems by looking at their behavior, which is an integral part of software engineering (another part is crafting systems that someone has to understand). The main difference is that our systems are human-made and sometimes come with source code.

This article explores how to apply physics methods to improve your software engineering skills: valuing data, investing in informal and formal models, and using the scientific method in all aspects of the trade.

Scientific method

It doesn’t matter how beautiful your theory is, it doesn’t matter how smart you are. If it doesn’t agree with experiment, it’s wrong.

Richard Feynman

For most of human history, people tried to understand the world by telling compelling stories. We believed that planets move on the surfaces of spheres made of quintessence and that all illnesses come from an imbalance of bodily fluids.

Our approach to science changed drastically around the seventeenth century with the rise of skepticism and empiricism. This revolution gave us our finest tool for understanding the world—the scientific method. The process is deceptively simple:

  1. Start with a theory as a premise and deduce a testable hypothesis.
  2. Conduct an experiment testing the hypothesis.
  3. If the results disagree with your prediction, you have refuted the theory. Otherwise, squeeze another prediction out of your theory and repeat the process.

If you like certainty, the scientific method is rather disappointing. It can never establish whether a theory is true; it can only discard flawed theories. By applying the method repeatedly, we arrive at progressively less wrong theories.

That’s all great, but what does it have to do with programming? For one, the scientific method is at the heart of efficient debugging. We devise an idea for what the bug might be, observe the program behavior to test it, and either discard the idea or devise a new experiment. Andreas Zeller, the author of GNU Data Display Debugger, wrote an entire book on systematic debugging. In later sections, we will see more applications of the method to software engineering.

Think like a scientist

Record your observations

Remember, kids, the only difference between screwing around and science is writing it down.

It’s only a few days before the next major release, and the application you work on crashed on you. You open the bug tracker and search for the error message it spat out. What luck! You found an issue with a promising title. Full of hope, you open the ticket; all you can see is status: done, resolution: fixed. It seems you are going to have a long day.

Few things are more disappointing than opening a fixed bug report and learning nothing from it. Merely recording your work turns you from a dabbler into a scientist. Did you try something that didn’t work? Record your adventure in the ticket (reporting negative results helps avoid publication bias). Found the culprit? Document the problem details and the solution for posterity.

Consistently recording your work might seem like a waste of time, but it’s usually not:

Make your bug reports and user stories an interesting read.

Test causality

The invalid assumption that correlation implies cause is probably among the two or three most serious and common errors of human reasoning.

Stephen Jay Gould, The Mismeasure of Man (Revised & Expanded), p.178

Phew, hunting down that bug took you a while. You fix the code and add a unit test demonstrating the problem. The test is green, and your change lands on the main branch. The next day, the problem appears again. How could that happen?

Many programmers write tests after they fix the code. I’m sure you did. I certainly did many times. This approach feels good but defies basic logic. If we want to demonstrate that our fix causes the test to pass, we need to establish two propositions:

  1. The test fails without the code fix (no fix implies a failed test).
  2. The test passes with the code fix (a fix implies a passed test).

What else could cause the test to pass if not your fix? I don’t know, and neither do you. The test code might have a bug or not trigger the relevant code path in the way you expected. You don’t know until you try.

Apologists of test-driven development are onto something, though I’ve never heard the causality argument from them. If you can’t force yourself to write tests before fixing the code (I can’t), check that removing the fix makes the test fail. I wish there were tools that checked that each test we write used to fail before it turned green.
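
I’m not aware of an off-the-shelf tool for this, but the check is easy to sketch. Here is a rough Python sketch, assuming the fix still sits uncommitted in a git working tree and the project uses pytest; the file path and test name are hypothetical.

    #!/usr/bin/env python3
    """Check that a bug fix and its test are causally linked:
    the test must pass with the fix and fail without it."""

    import subprocess
    import sys

    TEST = "tests/test_issue_1234.py::test_handles_empty_input"  # hypothetical test id
    FIX_FILES = ["src/parser.py"]                                # hypothetical files touched by the fix


    def test_passes() -> bool:
        """Run the single test and report whether it passed."""
        return subprocess.run(["pytest", "-q", TEST]).returncode == 0


    def main() -> int:
        # 1. The test must pass with the fix in place.
        if not test_passes():
            print("The test fails even with the fix; the fix is incomplete.")
            return 1

        # 2. Temporarily shelve the fix (but keep the new test) and expect a failure.
        subprocess.run(["git", "stash", "push", "--", *FIX_FILES], check=True)
        try:
            if test_passes():
                print("The test passes without the fix; it does not demonstrate the bug.")
                return 1
        finally:
            # Restore the fix regardless of the outcome.
            subprocess.run(["git", "stash", "pop"], check=True)

        print("Causality established: the test fails without the fix and passes with it.")
        return 0


    if __name__ == "__main__":
        sys.exit(main())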

Know your data

About 40 seconds after the explosion the air blast reached me. I tried to estimate its strength by dropping from about six feet small pieces of paper before, during and after the passage of the blast wave. […] The shift was about 2½ meters, which, at the time, I estimated to correspond to the blast that would be produced by ten thousand tons of T.N.T.

Enrico Fermi, My Observations During the Explosion at Trinity on July 16, 1945

A guy from the ops team stops by your desk. You feel uncomfortable because you forgot his name again.

Hi! I need to order machines for that new project you are working on. How many servers do we need? Where do we place them? How much disk and RAM should they have?

Your ears turn red.

Ahm, how could I know? I’m a software guy!

Software engineers are obsessed with code. We stare at it for hours daily. We can fight with religious zeal about how we place braces or what character we use for indentation. We can mull over a variable name for half an hour.

The most precious substance, the data flowing through our code, gets little attention. It hides from us behind well-named variables. It lives in files we never look at.

Why is data important anyway? Shouldn’t our code work on arbitrary inputs? Imagine you need to sort a large pile of cards with numbers printed on them. What algorithm would you use? Would you use the same algorithm with a billion cards? Would you use the same algorithm if all the numbers were zero or one? Knowing the shape and distribution of inputs and outputs helps you find more straightforward and efficient designs, discover optimizations, and spot anomalies during debugging sessions.
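
As a toy illustration (a sketch with made-up sizes and a hypothetical helper), here is how knowing that every key is zero or one changes the choice of algorithm: a single counting pass replaces a general-purpose comparison sort.

    import random
    import timeit


    def counting_sort_01(cards):
        """Sort a list known to contain only zeros and ones in a single pass."""
        zeros = cards.count(0)
        return [0] * zeros + [1] * (len(cards) - zeros)


    cards = [random.randint(0, 1) for _ in range(1_000_000)]

    general = timeit.timeit(lambda: sorted(cards), number=5)
    counting = timeit.timeit(lambda: counting_sort_01(cards), number=5)
    print(f"general-purpose sort: {general:.2f}s, counting sort: {counting:.2f}s")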

Besides the system inputs and outputs, there are two more natural data sources:

This data is a prerequisite for back-of-the-envelope calculations that can help you make decisions quickly and efficiently. If you’re new to systematic data collection, the Mature Optimization Handbook will help you get started. The next time that ops guy (Josh. His name is Josh.) comes to your desk, you will have an answer for him.
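
For illustration, here is a minimal back-of-the-envelope estimate in Python; every number is an assumption standing in for data you would collect from your own system.

    requests_per_second = 2_000      # measured peak traffic (assumed)
    bytes_per_request = 4 * 1024     # average payload stored per request (assumed)
    retention_days = 90              # how long the data is kept (assumed)
    requests_per_server = 500        # throughput of one server under load tests (assumed)
    replication_factor = 3           # copies of each record for durability (assumed)

    servers = -(-requests_per_second // requests_per_server)  # ceiling division
    daily_bytes = requests_per_second * bytes_per_request * 86_400
    total_bytes = daily_bytes * retention_days * replication_factor

    print(f"app servers needed: {servers}")
    print(f"storage: {daily_bytes / 1e9:.1f} GB/day, "
          f"{total_bytes / 1e12:.1f} TB over {retention_days} days")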

Debug mental models

So the guy says, What are you doing? You come to fix the radio, but you’re only walking back and forth! I say, I’m thinking!

Richard Feynman, Surely You’re Joking, Mr. Feynman!, He Fixes Radios by Thinking!

Your team lead rushes into your cubicle to tell you the production system barely works. You look at the symptoms and have no clue where to start. You launch the debugger and step through the code. Two hours later, you still have no clue.

Debuggers are immensely useful. Launching a debugger should be your first reaction if your application crashes with a core dump or hangs forever. However, using a debugger is hard or impossible in some areas of computing. For example, distributed systems pose the following challenges:

If using a debugger isn’t an option, creating a good mental model is your best bet. As Linus Torvalds put it:

It’s that you have to look at the level above sources. At the meaning of things. Without a debugger, you basically have to go the next step: understand what the program does. Not just that particular line.

Or as Rob Pike wrote about Ken Thompson:

When something went wrong, I’d reflexively start to dig in to the problem, examining stack traces, sticking in print statements, invoking a debugger, and so on. But Ken would just stand and think, ignoring me and the code we’d just written. After a while I noticed a pattern: Ken would often understand the problem before I would, and would suddenly announce, "I know what’s wrong." He was usually correct. I realized that Ken was building a mental model of the code and when something broke it was an error in the model.

Mental models are essential to the scientific method. Raw observations do not exist; we always interpret observations based on a theory or a model. A good model will help you map an observation to a potential cause and develop a new hypothesis to test.

How do you build a good mental model? I usually rely on one of the following methods:

Formalize your models

In every department of physical science there is only so much science, properly so-called, as there is mathematics.

Immanuel Kant

You are proud of your multi-threaded lock-free queue implementation. Your colleagues reviewed it thoroughly and couldn’t find any flaws, and hours of testing did not reveal any issues. You deployed your optimization in production, and the servers stopped handling transactions after two weeks.

Physics requires knowledge and an intuitive understanding of a lot of hairy math. Physicists rely on analogies and intuition in their daily work, but equations and experiments are what matters in the end. You leave your mark in theoretical physics by deriving an equation named after you (according to Feynman, when he first met Dirac in 1946 during Princeton University’s Bicentennial Celebration, Dirac’s first words were, I have an equation. Do you have one too?).

The prevalent wisdom is that programmers don’t need math. You can make good money in the industry even if you’ve never attended an abstract algebra class. But as you grow professionally, you eventually hit a ceiling. Math can elevate you to the next level.

Luckily, most software systems need only basic math: set theory, first-order logic, and discrete math. One way to benefit from math is to turn your vague mental model of the system into a precise formal specification. Even if the spec ends up so simple it borders on the trivial, this exercise will give you a crystal-clear understanding of what your system should do. As Leslie Lamport puts it in Specifying Systems, Mathematics is nature’s way of letting you know how sloppy your writing is.

But it gets better. Once you have a formal specification, you can feed it to a computer and ask interesting questions, such as Can my queue algorithm stall forever? TLA+ is a powerful and accessible toolbox that can help you write and check formal specifications. It is the tool that the Amazon Web Services team chose to verify their critical systems (see How Amazon Web Services Uses Formal Methods, Communications of the ACM, vol. 58, no. 4, pages 66-73).
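
To give a flavor of what a model checker does, here is a toy sketch in Python (not TLA+; the model and names are illustrative): it exhaustively explores every interleaving of two threads performing a non-atomic increment and reports the final states that violate the invariant counter == 2.

    READ, WRITE, DONE = range(3)


    def step(state, tid):
        """Advance thread `tid` by one step of its read-modify-write sequence."""
        pcs, temps, counter = state
        if pcs[tid] == READ:
            temps = temps[:tid] + (counter,) + temps[tid + 1:]
            pcs = pcs[:tid] + (WRITE,) + pcs[tid + 1:]
        elif pcs[tid] == WRITE:
            counter = temps[tid] + 1
            pcs = pcs[:tid] + (DONE,) + pcs[tid + 1:]
        return (pcs, temps, counter)


    def explore():
        """Enumerate all reachable states and collect invariant violations."""
        initial = ((READ, READ), (0, 0), 0)
        frontier, seen, violations = [initial], {initial}, []
        while frontier:
            state = frontier.pop()
            pcs, _, counter = state
            if pcs == (DONE, DONE) and counter != 2:
                violations.append(state)
            for tid in (0, 1):
                if pcs[tid] != DONE:
                    nxt = step(state, tid)
                    if nxt not in seen:
                        seen.add(nxt)
                        frontier.append(nxt)
        return violations


    for _pcs, _temps, counter in explore():
        print(f"invariant violated: counter == {counter} after both threads finished")

A real specification language expresses much richer properties (liveness, fairness) and searches state spaces far larger than a hand-rolled script can handle.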

At DFINITY, I witnessed TLA+ uncovering gnarly bugs in designs that initially looked straightforward. Cheng Huang, a principal engineering manager at Microsoft, reports a similar experience:

I wish I had learned about TLA+ much earlier in my career. Unlike other formal methods I tried, this tool is relatively easy to pick up. As a side effect, it teaches you to reason about your systems in terms of state machines, an essential skill for a software engineer.

Question your method

There’s nothing quite as frightening as someone who knows they are right.

Michael Faraday

We are living in the best of times. The ideas of rapid iteration, agile development, and continuous delivery are taking over the world. Software shops are churning out features at a stunning rate.

We are living in the worst of times. Our software is bloated, slow, and buggy. Our interfaces are overly complicated, often bordering on enraging. Our users are frustrated and depressed (try searching Google Images for frustration; how many of the pictures have computers in them?).

Any change requires us to ask two crucial questions: what we do and how we do it. Most development methodologies focus on the how and ignore the what. They aim to achieve results consistently and efficiently. This emphasis is misplaced: it makes no difference how efficiently you make things if they aren’t essential (the same idea applies to personal productivity: you become productive by clearing up room for work that matters, not by packing tasks into your schedule optimally).

How do we decide what features are worth shipping? We apply our familiar tool, the scientific method. When someone requests feature X, we:

  1. Explain why the feature or the product will make a difference. This explanation is the theory that we want to test.
  2. Look for the most straightforward experiment that could falsify the default hypothesis that feature X is a bad idea, such as analyzing the usage statistics or building a prototype.
  3. Experiment: Implement the most straightforward version of the feature and gather user feedback.
  4. Analyze the outcome. Kill the feature with fire if the users don’t care; polish it if they love it.

Once we decide what to do, the how becomes relevant again, so we apply the method to the development process:

  1. Start with the simplest process imaginable.
  2. You inevitably hit a problem (several people doing the same work, poor development speed, etc.). Come up with an idea for addressing it (introduce a weekly planning meeting, change the code review process, etc.).
  3. Experiment with the idea for a few weeks. Keep the new process if it mitigates the problem.

Be critical of your method. Challenge the status quo. Ask yourself, Why am I doing this? Is it worth my time?

Conclusion

We software engineers are applied scientists. Our experiments are cheap and easy to replicate; we can iterate and learn from our mistakes quicker than researchers in more traditional fields. However, we also tend to ignore scientific methodology that can help us build better systems faster.

The next time you debug that legacy system, be proud of who you are: a scientist a few steps away from discovery. And behave like one.

Acknowledgements

Thanks to Nikolay Komarevskiy for his suggestions for the original version of this article.
