ckBTC internals: event log

  2023-05-01

Introduction

The chain-key Bitcoin (ckBTC) project became publicly available on April 3, 2023. The ckBTC minter smart contract is the most novel part of the product responsible for converting Bitcoin to ckBTC tokens and back. This contract features several design choices that some developers might find insightful. This article describes how the ckBTC minter, which I will further refer to as the minter, organizes its storage.

Motivation

The minter is a complicated system that must maintain a lot of state variables:

  • Unspent Transaction Outputs (utxos) the minter owns on the Bitcoin network, indexed and sliced in various ways (by account, state, etc.).
  • ckBTC to Bitcoin conversion requests, indexed by state and the arrival time.
  • Pending Bitcoin transactions fulfilling the withdrawal requests.
  • Fees owed to the Know Your Transaction (KYT) service providers.

How to preserve this state across canister upgrades?

On the one hand, weHere and further, I’ll use we to indicate the ckBTC development team. didn’t want to marshal the entire state through stable memory. We preferred to avoid pre-upgrade hooks altogether to eliminate the possibility that the minter becomes non-upgradable due to a bug in the hook.

The traditional canister state management scheme. The canister applies state transitions Ti to its states Si on the Wasm heap (designated with a circle) and marshals the state through stable memory on upgrades. This approach requires state representations to be backward compatible only within the scope of a single upgrade. The number of instructions needed for an upgrade is proportional to the state size.

On the other hand, we didn’t want to invest much time crafting an efficient data layout using the stable-structures package. All our data structures were in flux; we didn’t want to commit to a specific representation too early.

Managing the canister state in directly stable memory. The canister applies state transitions Ti to its states Si persisted in stable memory. This approach trades the flexibility of state representation for the predictability and safety of upgrades. The number of instructions needed for an upgrade is constant.

Luckily, the problem has peculiarities we could exploit. All of the minter’s state modifications are expensive: minting ckBTC requires the caller to invest at least a few dollars into a transaction on the Bitcoin network. Withdrawal requests involve paying transaction fees. In addition, the volume of modifications is relatively low because of the Bitcoin network limitations.

Solution

The minter employs the event sourcing pattern to organize its stable storage. It declares a single stable data structure: an append-only log of events affecting the canister state.

Each time the minter modifies its state, it appends an event to the log. The event carries enough context to allow us to reproduce the state modification later.

Managing the canister state using the event log. As in the traditional scheme, the canister applies state transitions Ti to its states on the Wasm heap, but it also records the state transitions to an append-only log data structure in stable memory. The post_upgrade hook applies all recorded transitions to the initial state to recompute the last snapshot. The number of instructions required for an upgrade is proporional to the number of state transitions. Note the absense of pre_upgrade hook.

On upgrade, the minter starts from an empty state and replays events from the log. This approach might sound inefficient, but it works great in our case:

  • The number of events is relatively low because most involve a transfer on the Bitcoin network.
  • The cost of replaying an event is low. Replaying twenty-five thousand events consumes less than one billion instructions, which is cheaper than submitting a single Bitcoin transaction.
  • We can pause and resume the replay process to spread the work across multiple executions if the number of events goes out of hand.

Furthermore, the event-sourcing approach offers additional benefits beyond the original motivation:

  • The event log provides an audit trace for all state modifications, making the system more transparent and easier to debug.
  • The event log is easy to replicate to other canisters and off-chain. It’s a perfect incremental backup solution.
  • We can change the state representation without worrying about backward compatibility. Only the event types stored in stable memory must be backward-compatible.

What is an event?

One crucial aspect of the design is the log record’s content.

The brute-force approach is to record as events the arguments of all incoming update calls, all outgoing inter-canister calls, and the corresponding replies. This approach might work, but it takes a lot of work to implement and requires a complicated log replay procedure.

Differentiating between requests (sometimes called commands) and events is a better option. Requests come from the outside world (ingress messages, replies, timers) and might trigger canister state changes. Events record effects of productive requests on the canister state.

Requests such as ingress messages come from the outside world and can trigger zero or more events.

Let’s use the minter’s update_balance flow as an example:

  1. A user calls the get_btc_address the obtains a Bitcoin address corresponding to the user’s principal.
  2. The user transfers Bitcoin to that address and waits for the transaction to get enough confirmations on the Bitcoin network (as of May 2023, the minter requires at least twelve confirmations).
  3. The user calls the update_balance endpoint on the minter.
  4. The minter fetches the list of its utxos matching the user account from the Bitcoin canister and checks whether there are new items in the list.
  5. The minter mints ckBTC tokens on the ledger smart contract for each new utxo and reports the results to the user.

That’s a lot of interactions! Luckily, the only significant outcome of these actions is the minter acquiring a new utxo. That’s our event type: minted(utxo, account).

If any of the intermediate steps fails or there are no new utxos in the list, the original request doesn’t affect the minter state. Thus a malicious user cannot fill the minter’s memory with unproductive events. Creating an event requires sending funds on the Bitcoin network, and that’s a slow and expensive operation.

Testing

Event sourcing introduces new sources of bugs:

  • The programmer can implement the state transition logic but forget to emit the corresponding event.
  • The event replay logic can produce different results than the original state update.

We run our integration tests on a debug version of the minter canister to address this issue. After each update call, the debug version replays the log and compares the resulting state to the current canister state. If the log becomes inconsistent with the internal state, the test causing the divergence must fail.

Enabling self-checks in the debug canister version to ensure that the event log captures all state transitions.
#[update]
fn update_balance(arg: UpdateBalanceArg) -> UpdateBalanceResult {
    let result = update_balance_impl(arg);

    #[cfg(feature = "debug_version")]
    if let Err(msg) = replay_event_log().check_equals_to(&canister_state) {
      ic_cdk::trap(&msg);
    }

    result
}

The minter’s self-checking feature helped us catch a few severe bugs early in development.

Resources

Similar articles