Candid for engineers

  2023-06-22

Introduction

Candid is the primary interface definition language for smart contracts hosted on the Internet Computer.

Most prevalent data-interchange formats, such as Protocol Buffers and Thrift, come straight from engineering departments.

Candid is different. Candid is a child of programming language designers who grew it from first principles. As a result, Candid makes sense but might feel alien to most engineers.

This article is the introduction to Candid that I wish I had when I started using it.

Candid overview

Like any interface definition language, Candid has multiple facets.

One facet is the textual format defining the service interface. This facet is similar in function to the gRPC system. Another facet is the binary format for encoding service requests and responses. This facet is analogous to the Protocol Buffers serialization format.

Though Candid is similar to gRPC on the surface, there is an essential distinction between the two systems.

gRPC builds strictly on top of the Protocol Buffers format. Service method definitions can refer to message definitions, but messages cannot refer to services.

Candid, on the other hand, ties the message format and service definition language into a knot. Service method definitions can refer to data types, and data types can refer to services. Services can accept as arguments and return references to other services and methods. The Candid team usually calls such designs higher-order cases. The Candid overview article introduces a higher-order function in its first example.

service counter : {
  // A method taking a reference to a function.
  subscribe : (func (int) -> ()) -> ();
}

Another distinctive Candid feature is subtyping rules for defining backward-compatible service evolution. That’s when you want formal language designers on your team.

Service definitions

Most often, developers interact with Candid through the service definition files, also known as .did files.

A .did file contains type definitions and at most one primary service definition, which must be the last clause in the .did file.

The definition of a token ledger registry service. The service keyword at the top level defines the primary service; it must appear as the last definition in the file. Note the difference between the syntactic forms of a service type definition (top) and a service definition (bottom).
// A type definition introducing the Token service interface.
type Token = service {
  token_symbol : () -> (text) query;
  balance : (record { of : principal }) -> (nat) query;
  transfer : (record { to : principal; amount : nat }) -> ();
};

service TokenRegistry : {
  // Returns a reference to a token ledger service given the token symbol.
  lookup : (symbol : text) -> (opt Token) query;
}

Two syntactic forms can introduce a service definition: with and without init arguments. The technical term for a service definition with init arguments is service constructor (some implementations use the term class).

Service definitions without (top) and with (bottom) init arguments.
service Token : {
  balance : (record { of : principal }) -> (nat) query;
  // ...
}
service Token : (init_balances : vec record { principal; nat }) -> {
  balance : (record { of : principal }) -> (nat) query;
  // ...
}

Conceptually, a service constructor represents an uninitialized canister, whereas a service represents a deployed canister. Init arguments describe the value the canister maintainers must specify when instantiating the canister.

Ideally, canister build tools should produce a service constructor. If the module contains no init args, the tools should use the form service : () -> {…}. Canister deploy tools, such as dfx deploy, should use the init args to install the canister, and use the service as the public metadata, stripping out the init args. As of July 2023, the Motoko compiler and the Rust CDK don’t follow these conventions, so people often conflate the two concepts.

Types

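In addition to a rich set of primitive types, such as booleans (bool), floats (float64), strings (text), and integers of various widths (nat8, nat16, nat32, nat64, int8, int16, int32, int64), Candid provides more advanced types and type constructors, including arbitrary-precision integers (nat and int), principals (principal), optional values (opt), vectors (vec), records (record), variants (variant), the reserved type, and references to functions (func) and services (service).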

Candid also allows recursive and mutually-recursive types.

type tree = variant { leaf : nat; children : forest };
type forest = vec tree;

Records and variants

Records and variants are the bread and butter of working with Candid.

Records and variants have similar syntax; the primary difference is the keyword introducing the type. The meanings of the constructs are complementary, however. A record type indicates that all of its fields must be set, and a variant type indicates that precisely one field must be set.

Record and variant definitions have similar syntax but different semantics. In a record, all fields must be set. In a variant, precisely one alternative must be set.
type Employee = record {
  first_name : text;
  second_name : text;
  status : EmployeeStatus;
};

type EmployeeStatus = variant {
  full_time;
  contractor : record { contract_expires_at : opt nat };
};
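
For readers coming from Rust, the record/variant split maps naturally onto structs and enums. The sketch below is a hand-written illustration using only the standard library; it is not the code the candid package would generate. The describe helper is hypothetical, and u64 stands in for Candid's unbounded nat as a simplifying assumption.

```rust
/// A record: every field must be present.
#[derive(Debug, Clone, PartialEq)]
struct Employee {
    first_name: String,
    second_name: String,
    status: EmployeeStatus,
}

/// A variant: exactly one alternative is present at a time.
#[derive(Debug, Clone, PartialEq)]
enum EmployeeStatus {
    FullTime,
    // Assumption: u64 stands in for Candid's unbounded `nat`.
    Contractor { contract_expires_at: Option<u64> },
}

/// Hypothetical helper: the compiler forces us to handle every alternative.
fn describe(e: &Employee) -> String {
    match &e.status {
        EmployeeStatus::FullTime => {
            format!("{} {} (full time)", e.first_name, e.second_name)
        }
        EmployeeStatus::Contractor { contract_expires_at: Some(t) } => {
            format!("{} {} (contract until {})", e.first_name, e.second_name, t)
        }
        EmployeeStatus::Contractor { contract_expires_at: None } => {
            format!("{} {} (open-ended contract)", e.first_name, e.second_name)
        }
    }
}

fn main() {
    let e = Employee {
        first_name: "Ada".to_string(),
        second_name: "Lovelace".to_string(),
        status: EmployeeStatus::Contractor { contract_expires_at: None },
    };
    println!("{}", describe(&e));
}
```

Leaving out a struct field or constructing two enum alternatives at once is a compile-time error in Rust, which mirrors the "all fields" versus "precisely one alternative" semantics.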

Similarly to Protocol Buffers, Candid uses integers to identify fields and alternatives. Unlike Protocol Buffers, Candid doesn’t ask the programmer to map symbolic field names to integers, relying on a hash function instead. This design choice has two practical implications.

Please refer to the hashed field names section in Joachim’s article for more insight and references.
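
To make the hashing concrete, here is a minimal sketch of the field name hash as I understand it from the Candid specification: a polynomial over the UTF-8 bytes of the name with base 223, taken modulo 2^32. The function name field_hash is my own choice for illustration.

```rust
/// Computes the 32-bit identifier derived from a field name:
/// a polynomial in the UTF-8 bytes of the name with base 223, modulo 2^32.
fn field_hash(name: &str) -> u32 {
    name.bytes()
        .fold(0u32, |h, b| h.wrapping_mul(223).wrapping_add(u32::from(b)))
}

fn main() {
    for name in ["leaf", "forest", "first_name"] {
        println!("{name:>12} -> {}", field_hash(name));
    }
}
```

The values for leaf and forest (1202717598 and 4253584605) are the same numbers that appear, LEB128-encoded, in the type table of the tree example in the binary message anatomy section.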

Tuples

Candid doesn’t provide first-class tuples. There are two constructs closely resembling tuples, however.

  1. Records with omitted field names act as type-level tuples. Candid language integrations, such as the native Motoko support and the Rust candid package, use this feature to map native tuples to Candid.
  2. Argument and result sequences in service methods behave a lot like tuples.

Tuple-like constructions in Candid: a record with tuple fields (top) and argument sequences (bottom).
// A record with tuple fields.
type Entry  = record { text; nat };
// The Entry and ExplicitEntry types are equivalent.
type ExplicitEntry = record { 0 : text; 1 : nat };

service ArithmeticService : {
  // Argument and result sequences.
  div : (dividend : nat, divisor : nat) -> (quotient : nat, remainder : nat) query;
}

Note that Candid ignores argument and result names in method signatures; it relies solely on the argument position within the sequence. Extending the argument sequence with a new optional value is safe, but adding an argument in the middle will break backward compatibility. Prefer using records as arguments and result types: you’ll have more freedom to rearrange or remove fields as the interface evolves.

Using records with named fields as method arguments and results.
service ArithmeticService : {
  div : (record { dividend : nat; divisor : nat })
     -> (record { quotient : nat; remainder : nat }) query;
}

See the Tuples section in Joachim’s article for more detail and advice.

Structural typing

Candid’s type system is structural: it treats types as equal if they have the same structure. Type names serve as monikers for the type structure, not as the type’s identity.

Variable bindings in Rust are a good analogy for type names in Candid. The let x = 5; statement binds name x to value 5, but x does not become the identity of that value. Expressions such as x == 5 and { let y = 5; y == x } evaluate to true.

Candid views types Point2d, modeling a point on a plane, and ECPoint, modeling a point on an elliptic curve, as interchangeable because they have the same structure.
// These types are identical from Candid's point of view.
type Point2d = record { x : int; y : int };
type ECPoint = record { x : int; y : int };
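
Rust's type aliases offer a rough analogy, sketched below: an alias names a structure without creating a new identity, so two aliases of the same tuple type are interchangeable. The names and the manhattan_norm function are hypothetical, and i64 stands in for Candid's unbounded int.

```rust
// Type aliases name a structure without creating a new identity,
// much like Candid type names.
type Point2d = (i64, i64);
type EcPoint = (i64, i64);

/// Accepts a Point2d; the Manhattan norm is only an illustrative computation.
fn manhattan_norm(p: Point2d) -> i64 {
    p.0.abs() + p.1.abs()
}

fn main() {
    let curve_point: EcPoint = (3, -4);
    // Compiles because both aliases denote the same structure.
    println!("{}", manhattan_norm(curve_point));
}
```

Had Point2d and EcPoint been declared as two Rust structs, the call would not compile: Rust structs are nominal, whereas Candid records are always structural.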

Usually, you don’t have to name types; you can inline them in service definitions (unless you define recursive types, of course). Assigning descriptive names can improve the interface readability, however.

Candid allows you to omit type names for non-recursive type definitions. Service types S1 and S2 are interchangeable.
type S1 = service {
  store_profile : (nat, record { name : text; age : nat }) -> ();
};

type UserProfile = record { name : text; age : nat };
type UserId = nat;

type S2 = service {
  store_profile : (UserId, UserProfile) -> ();
};

Subtyping

One of Candid’s distinctive traits is the use of structural subtyping for defining backward-compatible interface evolutions (the Candid spec calls such evolutions type upgrades). If a type T is a subtype of type V (denoted T <: V), then Candid can decode any value of type T into a value of type V.

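Let’s inspect some of the basic subtyping rules for simple values (not functions):

  1. Subtyping is reflexive and transitive: T <: T, and T <: V together with V <: W implies T <: W.
  2. Natural numbers are a subtype of integers: nat <: int.
  3. Any type is a subtype of the reserved type: T <: reserved.
  4. Adding a field to a record creates a subtype: record { name : text; age : nat } <: record { name : text }.
  5. A record lacking an optional field is a subtype of the record declaring it: record { name : text } <: record { name : text; age : opt nat }.
  6. Removing an alternative from a variant creates a subtype: variant { small; large } <: variant { tiny; small; large }.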

Function subtyping follows the standard variance rules: function g : C -> D is a subtype of function f : A -> B if A <: C and D <: B. Informally, g must accept the same or more generic arguments as f and produce the same or more specific results as f.

Function subtyping rules. As the interface evolves, the input types become more general, while output types become more restricted.

The rules mentioned in this section are by no means complete or precise; please refer to the typing rules section of the Candid specification for a formal definition.

Understanding the subtyping rules for functions is helpful for reasoning about safe interface migrations. Let’s consider a few examples of common changes that preserve backward compatibility of a function interface (note that compatibility rules for arguments and results are often reversed).

Before we close the subtyping discussion, let’s consider a sequence of type changes where an optional field gets removed and re-introduced later with a different type.

An example of the special opt subtyping rule. The first step removes the optional status field; the second step adds an optional field with the same name but an incompatible type. The horizontal bar applies the transitive property of subtyping, eliminating the intermediate type without the status field.
   record { name : text; status : opt variant { user;   admin   } }
<: record { name : text }
<: record { name : text; status : opt variant { single; married } }
------------------------------------------------------------------------
record { name : text; status : opt variant { user; admin } } <: record { name : text; status : opt variant { single; married } }

Indeed, in Candid, opt T <: opt V holds for any types T and V. This counter-intuitive property bears the name of the special opt rule, and it causes a lot of grief in practice. Multiple developers reported changing an optional field in an incompatible way, causing the corresponding values to decode as null after the upgrade.

Joachim Breitner’s opt is special article explores the topic in more detail and provides historical background.
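
The rules discussed in this section can be played with in a toy subtype checker. The sketch below is a heavy simplification and not the candid crate's API: the Ty model, the helper names, and the single-argument function type are all assumptions made for illustration, and the checker ignores reserved, empty, principals, recursive types, and several opt coercions that the specification defines.

```rust
use std::collections::BTreeMap;

/// A toy model of a few Candid types (hypothetical).
#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Null,
    Nat,
    Int,
    Text,
    Opt(Box<Ty>),
    Record(BTreeMap<&'static str, Ty>),
    Variant(BTreeMap<&'static str, Ty>),
    /// A function with a single argument and a single result.
    Func(Box<Ty>, Box<Ty>),
}

/// Returns true if `t <: v` under the simplified rules of this sketch.
fn is_subtype(t: &Ty, v: &Ty) -> bool {
    use Ty::*;
    match (t, v) {
        _ if t == v => true,
        (Nat, Int) => true,
        // The special opt rule: any opt is a subtype of any other opt.
        (Opt(_), Opt(_)) => true,
        // A record subtype may carry extra fields and may omit opt fields.
        (Record(have), Record(want)) => want.iter().all(|(name, want_ty)| {
            match have.get(name) {
                Some(have_ty) => is_subtype(have_ty, want_ty),
                None => matches!(want_ty, Opt(_)),
            }
        }),
        // A variant subtype may have fewer alternatives.
        (Variant(have), Variant(want)) => have.iter().all(|(name, have_ty)| {
            want.get(name).map_or(false, |want_ty| is_subtype(have_ty, want_ty))
        }),
        // Functions: contravariant argument, covariant result.
        (Func(a1, r1), Func(a2, r2)) => is_subtype(a2, a1) && is_subtype(r1, r2),
        _ => false,
    }
}

fn record(fields: &[(&'static str, Ty)]) -> Ty {
    Ty::Record(fields.iter().cloned().collect())
}

/// Builds a variant whose alternatives carry no payload.
fn variant(alternatives: &[&'static str]) -> Ty {
    Ty::Variant(alternatives.iter().map(|a| (*a, Ty::Null)).collect())
}

fn opt(t: Ty) -> Ty {
    Ty::Opt(Box::new(t))
}

fn main() {
    let old = record(&[("name", Ty::Text), ("status", opt(variant(&["user", "admin"])))]);
    let mid = record(&[("name", Ty::Text)]);
    let new = record(&[("name", Ty::Text), ("status", opt(variant(&["single", "married"])))]);
    println!("old <: mid: {}", is_subtype(&old, &mid));
    println!("mid <: new: {}", is_subtype(&mid, &new));
    println!("old <: new: {}", is_subtype(&old, &new));

    let f = Ty::Func(Box::new(Ty::Nat), Box::new(Ty::Int)); // f : (nat) -> (int)
    let g = Ty::Func(Box::new(Ty::Int), Box::new(Ty::Nat)); // g : (int) -> (nat)
    println!("g <: f: {}, f <: g: {}", is_subtype(&g, &f), is_subtype(&f, &g));
}
```

Note that the checker happily accepts old <: new even though the two status variants share no alternatives; that is precisely the surprising consequence of the special opt rule.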

Binary message anatomy

In Candid, a binary message defines a tuple of n values and logically consists of three parts:

  1. The type table part defines composite types (records, variants, options, vectors, etc.) required to decode the message.
  2. The types part is an n-tuple of integers specifying the types (T1,…,Tn) of values in the next section. The types are either primitives (negative integers) or pointers into the type table (non-negative integers).
  3. The values part is an n-tuple of serialized values (V1,…,Vn).

The tuple values usually correspond to service method arguments or results. For example, if we call method transfer : (to : principal, amount : nat) -> (), the argument tuple will contain two values: a principal and a natural number, and the result tuple will be empty.

Example: encoding an empty tuple

Let’s first consider the shortest possible Candid message: an empty tuple.

The shell command to encode an empty tuple using the didc tool.
$ didc encode '()'
4449444c0000

We need six bytes to encode nothing. Let’s take a closer look at them.

Bytes of an empty tuple encoding.
           ⎡ 44 ⎤ D
    Magic  ⎢ 49 ⎥ I
           ⎢ 44 ⎥ D
           ⎣ 4c ⎦ L

Type table [ 00 ] number of type table entries (0)

    Types  [ 00 ] number of tuple elements (0)
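
The six bytes above can be assembled by hand. The following sketch assumes, per my reading of the Candid specification, that both counts are unsigned LEB128 numbers; the function names are hypothetical and the code uses only the standard library.

```rust
/// Appends `value` to `out` using unsigned LEB128: seven bits per byte,
/// least-significant group first, high bit set on every byte but the last.
fn write_leb128(mut value: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Builds the message for an empty tuple.
fn encode_empty_tuple() -> Vec<u8> {
    let mut msg = b"DIDL".to_vec(); // magic
    write_leb128(0, &mut msg); // the type table has no entries
    write_leb128(0, &mut msg); // the tuple has no elements
    msg
}

/// Renders bytes as a lowercase hex string, the format didc prints.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

fn main() {
    println!("{}", to_hex(&encode_empty_tuple()));
}
```

Because there are no values, the values part of the message is empty and contributes no bytes.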

Even this trivial message reveals a few interesting details.

Example: encoding a tree

Let’s consider an encoding of a rose tree with 32-bit integers in the leaves.

A definition of a rose tree data type containing 32-bit integers (top) and the Candid representation of the same type (bottom).
// Rust
pub enum Tree { Leaf(i32), Forest(Vec<Tree>) }
// Candid
type Tree = variant { leaf : int32; forest : vec Tree };

Let’s rewrite the Tree type using at most one composite type per type definition. This canonical form will help us better understand the message type table.

The canonical representation of the Tree type.
type T0 = variant { leaf : int32; forest : T1 };
type T1 = vec T0;

Let’s encode a fork with two children, equivalent to the Rust expression Tree::Forest(vec![Tree::Leaf(1), Tree::Leaf(2)]), using the didc tool.

The shell commands to encode a tree using the didc tool. The --defs option loads type definitions from a file; the --types option specifies the types of values in the tuple (see point 2 in the binary message anatomy section).
$ echo 'type Tree = variant { leaf : int32; forest : vec Tree };' > tree.did
$ didc encode \
       --defs   tree.did \
       --types  '(Tree)' \
       '(variant { forest = vec { variant { leaf = 1 }; variant { leaf = 2 } } })'
4449444c026b029e87c0bd0475dd99a2ec0f016d000100010200010000000002000000

Let’s look closely at the bytes.

           ⎡ 44 ⎤ D
    Magic  ⎢ 49 ⎥ I
           ⎢ 44 ⎥ D
           ⎣ 4c ⎦ L

           ⎡ 02 ] number of entries (2)
           ⎢ 6b ] entry #0: variant type
           ⎢ 02 ] number of fields (2)
           ⎢ 9e ⎤
           ⎢ 87 ⎥
           ⎢ c0 ⎥ hash("leaf")
           ⎢ bd ⎥
           ⎢ 04 ⎦
Type table ⎢ 75 ] field #0 type: int32
           ⎢ dd ⎤
           ⎢ 99 ⎥
           ⎢ a2 ⎥ hash("forest")
           ⎢ ec ⎥
           ⎢ 0f ⎦
           ⎢ 01 ] field #1 type: see entry #1
           ⎢ 6d ] entry #1: vec type         
           ⎣ 00 ] vec item type: entry #0

    Types  ⎡ 01 ] number of tuple elements (1)
           ⎣ 00 ] type of the first element (entry #0)

           ⎡ 01 ] value #0: variant field #1 ("forest")
           ⎢ 02 ] number of elements in the vector
           ⎢ 00 ] variant field #0 ("leaf")
           ⎢ 01 ⎤
           ⎢ 00 ⎥ 1 : int32 (little-endian)
    Values ⎢ 00 ⎥
           ⎢ 00 ⎦
           ⎢ 00 ] variant field #0 ("leaf")
           ⎢ 02 ⎤
           ⎢ 00 ⎥ 2 : int32 (little-endian)
           ⎢ 00 ⎥
           ⎣ 00 ⎦
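
The type table walk above can be reproduced with a short parser. This is a sketch under assumptions rather than a complete decoder: it handles only the opt, vec, record, and variant opcodes (which I take to be -18, -19, -20, and -21 in signed LEB128, consistent with the 6d and 6b bytes in the dump), it stops after the type table, and all names in it are hypothetical.

```rust
/// Reads an unsigned LEB128 number at `*pos` and advances `*pos`.
fn read_leb128(bytes: &[u8], pos: &mut usize) -> u64 {
    let (mut result, mut shift) = (0u64, 0);
    loop {
        let byte = bytes[*pos];
        *pos += 1;
        result |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return result;
        }
        shift += 7;
    }
}

/// Reads a signed LEB128 number at `*pos` and advances `*pos`.
fn read_sleb128(bytes: &[u8], pos: &mut usize) -> i64 {
    let (mut result, mut shift) = (0i64, 0);
    loop {
        let byte = bytes[*pos];
        *pos += 1;
        result |= i64::from(byte & 0x7f) << shift;
        shift += 7;
        if byte & 0x80 == 0 {
            // Sign-extend if the sign bit of the last group is set.
            if shift < 64 && byte & 0x40 != 0 {
                result |= -1i64 << shift;
            }
            return result;
        }
    }
}

fn hex_to_bytes(hex: &str) -> Vec<u8> {
    (0..hex.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&hex[i..i + 2], 16).expect("valid hex digits"))
        .collect()
}

/// A type table entry; fields are (name hash, type) pairs. Negative types
/// are primitives, non-negative types are indices into the table.
#[derive(Debug, PartialEq)]
enum Entry {
    Opt(i64),
    Vector(i64),
    Record(Vec<(u64, i64)>),
    Variant(Vec<(u64, i64)>),
}

fn parse_type_table(msg: &[u8]) -> Vec<Entry> {
    assert_eq!(&msg[..4], b"DIDL", "missing magic prefix");
    let mut pos = 4;
    let count = read_leb128(msg, &mut pos);
    let mut table = Vec::new();
    for _ in 0..count {
        let opcode = read_sleb128(msg, &mut pos);
        let entry = match opcode {
            -18 => Entry::Opt(read_sleb128(msg, &mut pos)),
            -19 => Entry::Vector(read_sleb128(msg, &mut pos)),
            -20 | -21 => {
                let n = read_leb128(msg, &mut pos);
                let mut fields = Vec::new();
                for _ in 0..n {
                    let hash = read_leb128(msg, &mut pos);
                    let ty = read_sleb128(msg, &mut pos);
                    fields.push((hash, ty));
                }
                if opcode == -20 { Entry::Record(fields) } else { Entry::Variant(fields) }
            }
            other => panic!("type opcode {other} is not handled in this sketch"),
        };
        table.push(entry);
    }
    table
}

fn main() {
    let msg = hex_to_bytes(
        "4449444c026b029e87c0bd0475dd99a2ec0f016d000100010200010000000002000000",
    );
    println!("{:#?}", parse_type_table(&msg));
}
```

The parser reports a variant with fields (1202717598, -11) and (4253584605, 1), followed by a vector of entry 0. The -11 is the single byte 75 read as a signed number, which is how the int32 primitive shows up in the table.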

We can observe a few interesting details about binary encoding.

FAQ

Can I remove a record field?

Short answer: Sometimes you can, but please don’t.

Removing an opt field is always safe, but prefer marking it reserved instead. Reserved fields make it unlikely that future service developers will use the field name in an unexpected way.

// OK: the age field is optional.
 type User = record {
   name : text;
-  age : opt nat;
 };

 service UserService : {
  add_user : (User) -> (nat);
  get_user : (nat) -> (User) query;
 }
// GOOD: marking an opt field as reserved.
 type User = record {
   name : text;
-  age : opt nat;
+  age : reserved;
 };

 service UserService : {
  add_user : (User) -> (nat);
  get_user : (nat) -> (User) query;
 }

If the field is not opt, the answer depends on the record type variance.

You can remove the field if the type appears only in method arguments, but prefer marking it as reserved instead.

 service UserService : {
-  add_user : (record { name : text;  age : nat }) -> (nat);
+  add_user : (record { name : text             }) -> (nat);
 }
 service UserService : {
-  add_user : (record { name : text; age : nat      }) -> (nat);
+  add_user : (record { name : text; age : reserved }) -> (nat);
 }

You should preserve the field if the type appears in a method return type.

// BAD: the User type appears as an argument and a result.
 type User = record {
   name : text;
-  age : nat;
};

 service UserService : {
  add_user : (User) -> (nat);
  get_user : (nat) -> (User) query;
 }

Can I add a record field?

Adding an opt field is always safe.

 type User = record {
   name : text;
+  age : opt nat;
};

 service UserService : {
  add_user : (User) -> (nat);
  get_user : (nat) -> (User) query;
 }

For non-opt fields, the answer depends on the type variance.

You can safely add a non-optional field if the record appears only in method return types.

 service UserService : {
-  get_user : (nat) -> (record { name : text            }) query;
+  get_user : (nat) -> (record { name : text; age : nat }) query;
 }

Adding a non-optional field breaks backward compatibility if the record appears in a method argument.

 // BAD: breaks the client code
 service UserService : {
-  add_user : (record { name : text            }) -> (nat);
+  add_user : (record { name : text; age : nat }) -> (nat);
 }

Can I remove a variant alternative?

Changing optional variant fields is always safe. (If you use Rust, make sure you use candid package version 0.9 or higher.)

 // OK: changing an optional field
 type OrderDetails = record {
-  size : opt variant { tiny; small; medium; large }
+  size : opt variant {       small; medium; large }
 };
 service UserService : {
   order_coffee : (OrderDetails) -> (nat);
   get_order : (nat) -> (OrderDetails) query;
 }

If the variant field is not optional, the answer depends on the type variance.

You can remove alternatives if the variant appears only in method results.

 service CoffeeShop : {
-  order_size : (nat) -> (variant { tiny; small; medium; large }) query;
+  order_size : (nat) -> (variant {       small; medium; large }) query;
 }
 // BAD: this change might break clients.
 service CoffeeShop : {
-  order_coffee : (record { size : variant { tiny; small; medium; large } }) -> (nat);
+  order_coffee : (record { size : variant {       small; medium; large } }) -> (nat);
 }

Can I add a variant alternative?

Changing optional variant fields is always safe. (If you use Rust, make sure you use candid package version 0.9 or higher.)

 // OK: changing an optional field
 type User = record {
   name : text;
-  age : opt variant { child;           adult }
+  age : opt variant { child; teenager; adult }
};

 service UserService : {
   add_user : (User) -> (nat);
   get_user : (nat) -> (User) query;
 }

If the variant field is not optional, the answer depends on the type variance.

If the variant appears only in method arguments, you can safely add new alternatives.

 service UserService : {
-  add_user : (record { name : text;  age : variant { child;           adult }}) -> (nat);
+  add_user : (record { name : text;  age : variant { child; teenager; adult }}) -> (nat);
 }
// BAD: the User type appears as an argument and a result.
 type User = record {
   name : text;
-  age : variant { child;           adult }
+  age : variant { child; teenager; adult }
 };
 service UserService : {
   add_user : (User) -> (nat);
   get_user : (nat) -> (User) query;
 }
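
To see why a new alternative in a result type hurts, picture a client compiled against the old interface. The sketch below is a loose illustration, not how a Candid decoder works internally: real decoders match on field hashes rather than names, and the Age enum and decode_age function are hypothetical.

```rust
/// The client's view of the type before the upgrade.
#[derive(Debug, PartialEq)]
enum Age {
    Child,
    Adult,
}

/// Hypothetical decoder of an old client that knows only two alternatives.
/// Real Candid decoders match on field hashes, not on names.
fn decode_age(alternative: &str) -> Result<Age, String> {
    match alternative {
        "child" => Ok(Age::Child),
        "adult" => Ok(Age::Adult),
        other => Err(format!("unknown alternative: {other}")),
    }
}

fn main() {
    // Values the old service could return still decode.
    println!("{:?}", decode_age("adult"));
    // A value only the upgraded service can return does not.
    println!("{:?}", decode_age("teenager"));
}
```

The same change is harmless in an argument position because there the upgraded service is the decoder, and it knows all three alternatives.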

Can I change init args?

Short answer: yes.

Service init args are not part of the public interface. Only service maintainers encode the init args; service clients don’t have to worry about them. Service interface compatibility tools, such as didc check, ignore init args.

Can I extend return values?

Yes. You can safely append a new value to a method result sequence.

 service TokenService : {
-  balance : (of : principal) -> (nat) query;
+  balance : (of : principal) -> (amount : nat, last_tx_id : nat) query;
 }

Reordering arguments or results is a breaking change.

 service TokenService : {
-  balance : (of : principal) -> (amount : nat) query;
+  balance : (of : principal) -> (last_tx_id : nat, amount : nat) query;
 }

How do I specify the post_upgrade arg?

As of June 2023, the Candid service definition language does not support specifying post_upgrade arguments in the service definition.

However, there exists a workaround. Most canister management tools use the same type definition for encoding the init args and upgrade args. You can define a variant type to distinguish between these.

Using a variant type for differentiating between service init and upgrade arguments.
type ServiceArg = variant {
  Init    : record { minter : principal };
  // We might want to override the minter on upgrade.
  Upgrade : record { minter : opt principal }
};

service TokenService : (ServiceArg) -> {
  // …
}

Is Candid binary encoding deterministic?

No. Encoders have a lot of freedom in optimizing and rearranging the type table. Upgrading to a newer version of the Candid library might change the exact message bytes the library produces.
