ckBTC internals: event log

✏ 2023-05-01 ✂ 2023-05-01

Introduction
Motivation
Solution
- What is an event?
- Testing
Resources

Introduction

The chain-key Bitcoin (ckBTC) project became publicly available on April 3, 2023. The ckBTC minter smart contract is the most novel part of the product responsible for converting Bitcoin to ckBTC tokens and back. This contract features several design choices that some developers might find insightful. This article describes how the ckBTC minter, which I will further refer to as the minter, organizes its storage.

Motivation

The minter is a complicated system that must maintain a lot of state variables:

Unspent Transaction Outputs (utxos) the minter owns on the Bitcoin network, indexed and sliced in various ways (by account, state, etc.).
ckBTC to Bitcoin conversion requests, indexed by state and the arrival time.
Pending Bitcoin transactions fulfilling the withdrawal requests.
Fees owed to the Know Your Transaction (KYT) service providers.

How to preserve this state across canister upgrades?

On the one hand, weHere and further, I’ll use we to indicate the ckBTC development team. didn’t want to marshal the entire state through stable memory. We preferred to avoid pre-upgrade hooks altogether to eliminate the possibility that the minter becomes non-upgradable due to a bug in the hook.

On the other hand, we didn’t want to invest much time crafting an efficient data layout using the stable-structures package. All our data structures were in flux; we didn’t want to commit to a specific representation too early.

Luckily, the problem has peculiarities we could exploit. All of the minter’s state modifications are expensive: minting ckBTC requires the caller to invest at least a few dollars into a transaction on the Bitcoin network. Withdrawal requests involve paying transaction fees. In addition, the volume of modifications is relatively low because of the Bitcoin network limitations.

Solution

The minter employs the event sourcing pattern to organize its stable storage. It declares a single stable data structure: an append-only log of events affecting the canister state.

Each time the minter modifies its state, it appends an event to the log. The event carries enough context to allow us to reproduce the state modification later.

On upgrade, the minter starts from an empty state and replays events from the log. This approach might sound inefficient, but it works great in our case:

The number of events is relatively low because most involve a transfer on the Bitcoin network.
The cost of replaying an event is low. Replaying twenty-five thousand events consumes less than one billion instructions, which is cheaper than submitting a single Bitcoin transaction.
We can pause and resume the replay process to spread the work across multiple executions if the number of events goes out of hand.

Furthermore, the event-sourcing approach offers additional benefits beyond the original motivation:

The event log provides an audit trace for all state modifications, making the system more transparent and easier to debug.
The event log is easy to replicate to other canisters and off-chain. It’s a perfect incremental backup solution.
We can change the state representation without worrying about backward compatibility. Only the event types stored in stable memory must be backward-compatible.

What is an event?

One crucial aspect of the design is the log record’s content.

The brute-force approach is to record as events the arguments of all incoming update calls, all outgoing inter-canister calls, and the corresponding replies. This approach might work, but it takes a lot of work to implement and requires a complicated log replay procedure.

Differentiating between requests (sometimes called commands) and events is a better option. Requests come from the outside world (ingress messages, replies, timers) and might trigger canister state changes. Events record effects of productive requests on the canister state.

Let’s use the minter’s update_balance flow as an example:

A user calls the get_btc_address the obtains a Bitcoin address corresponding to the user’s principal.
The user transfers Bitcoin to that address and waits for the transaction to get enough confirmations on the Bitcoin network (as of May 2023, the minter requires at least twelve confirmations).
The user calls the update_balance endpoint on the minter.
The minter fetches the list of its utxos matching the user account from the Bitcoin canister and checks whether there are new items in the list.
The minter mints ckBTC tokens on the ledger smart contract for each new utxo and reports the results to the user.

That’s a lot of interactions! Luckily, the only significant outcome of these actions is the minter acquiring a new utxo. That’s our event type: minted(utxo, account).

If any of the intermediate steps fails or there are no new utxos in the list, the original request doesn’t affect the minter state. Thus a malicious user cannot fill the minter’s memory with unproductive events. Creating an event requires sending funds on the Bitcoin network, and that’s a slow and expensive operation.

Testing

Event sourcing introduces new sources of bugs:

The programmer can implement the state transition logic but forget to emit the corresponding event.
The event replay logic can produce different results than the original state update.

We run our integration tests on a debug version of the minter canister to address this issue. After each update call, the debug version replays the log and compares the resulting state to the current canister state. If the log becomes inconsistent with the internal state, the test causing the divergence must fail.

⊕ Enabling self-checks in the debug canister version to ensure that the event log captures all state transitions.

#[update]
fn update_balance(arg: UpdateBalanceArg) -> UpdateBalanceResult {
    let result = update_balance_impl(arg);

    #[cfg(feature = "debug_version")]
    if let Err(msg) = replay_event_log().check_equals_to(&canister_state) {
      ic_cdk::trap(&msg);
    }

    result
}

The minter’s self-checking feature helped us catch a few severe bugs early in development.

Resources

Minter’s eventlog.rs module contains even type definitions and the log replay procedure.
Designing Data-Intensive Applications by Martin Kleppmann has an in-depth explanation of the event sourcing pattern (Chapter 11: Stream Processing, p. 457).
The Internet Computer Wiki has a detailed article about ckBTC minter.