diff --git a/README.mediawiki b/README.mediawiki index f7e00c590f..ec24af2039 100644 --- a/README.mediawiki +++ b/README.mediawiki @@ -995,6 +995,27 @@ Those proposing changes should consider that ultimately consent may rest with th | Standard | Rejected |- +| [[bip-0181.md|181]] +| Peer Services +| Utreexo Accumulator Specification +| Tadge Dryja, Calvin Kim, Davidson Souza +| Standard +| Draft +|- +| [[bip-0182.md|182]] +| Peer Services +| Utreexo - Transaction and block validation +| Tadge Dryja, Calvin Kim, Davidson Souza +| Standard +| Draft +|- +| [[bip-0183.md|183]] +| Peer Services +| Utreexo - Peer Services +| Tadge Dryja, Calvin Kim, Davidson Souza +| Standard +| Draft +|- | [[bip-0197.mediawiki|197]] | Applications | Hashed Time-Locked Collateral Contract diff --git a/bip-0181.md b/bip-0181.md new file mode 100644 index 0000000000..f62e1a7691 --- /dev/null +++ b/bip-0181.md @@ -0,0 +1,647 @@ +``` + BIP: 181 + Layer: Peer Services + Title: Utreexo Accumulator Specification + Author: Tadge Dryja + Calvin Kim + Davidson Souza + Comments-Summary: No comments yet. + Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0181 + Status: Draft + Type: Standards Track + Created: 2025-06-18 + License: BSD-3-Clause +``` + +## Abstract + +This BIP describes the Utreexo accumulator and its operations. It lays down how to update the +accumulator as well as how to generate and verify inclusion proofs for elements in the accumulator. + +## Motivation + +The Bitcoin network is composed of a set of nodes that validate blocks and +transactions as they are received. These nodes need to keep track of the current state of the network in order to +fulfill their role. Most importantly, they must maintain a record of all coins that +have been created but not yet spent, a collection known as the UTXO set. + +This set is typically stored in a database that must be accessed frequently and cannot +be pruned. 
As a result, the cost of running a node is directly tied to the size +of the UTXO set. Since it can grow indefinitely, bounded only by block size, it represents a +long-term scalability concern. + +Utreexo is a dynamic accumulator that enables the UTXO set to be represented in just a few kilobytes, +by requiring peers to provide additional proof data to verify the inclusion of a UTXO in the +accumulator. This allows for the construction of extremely lightweight nodes capable of performing +the same validation as a full node, without the need to store the entire UTXO set. + +This BIP specifies how the Utreexo accumulator works: the data structure and algorithms used to +maintain the accumulator, as well as how to generate and verify inclusion proofs for elements in the accumulator. +It does not define how the accumulator is used in the Bitcoin protocol, but rather provides a foundation for future +BIPs that will define how to integrate Utreexo into Bitcoin validation and the P2P network protocol. + +## License + +This document is licensed under the BSD-3-Clause license. + +## Preliminaries + +An accumulator is a cryptographic data structure that allows for the compact representation of a set, +enabling efficient membership proofs without requiring storage of the entire set. In the context of Utreexo, +the accumulator tracks the current set of unspent transaction outputs (UTXOs). + +The Utreexo accumulator is based on an append-only Merkle tree design introduced in [^1], +which provides logarithmic-sized inclusion proofs. Utreexo extends this design to support dynamic updates, +specifically enabling deletions from the set, a requirement for tracking UTXO spends in Bitcoin. +To accommodate this, Utreexo changes the storage requirement from the accumulator design in [^1] to $O(\log_2(N))$, +where $N$ is the number of elements ever added to the set, while still keeping proof sizes small and verification efficient. 
+ +## Merkle Forest + +The Utreexo accumulator consists of a set of Merkle trees: specifically, perfect binary trees with $2^n$ elements, +where each node in the tree contains a 32-byte hash. The elements being stored appear at the leaves, the bottom layer of the tree. +The topmost node is referred to as the "root," while nodes located between the leaves and the root are called "intermediate nodes." + +Any integer number of elements ($N$) can be represented as a forest of such trees. On average, a set of $N$ elements will require +approximately $\frac{\log_2(N)}{2}$ trees. The number and sizes of trees are determined by the binary representation of $N$: +each 1-bit corresponds to a tree, and its position in the binary encoding determines the size of that tree. + +For example, a forest with 5 elements (binary `0b101`) would consist of two trees: one with 4 elements (representing the 2nd bit) +and one with 1 element (representing the 0th bit). A forest with 8 elements (`0b1000`) would require only a single 8-element tree, +as 8 is a power of 2. + +More generally, for any $N$, the number of trees equals the number of set bits (1s) in the binary representation of $N$. +The size of each tree corresponds to the power of two represented by the position of each set bit. +For example, the decimal number 21 (binary `0b10101`) contains three 1-bits, meaning three trees are needed in the forest: +a 16-element tree ($2^4$), a 4-element tree ($2^2$), and a 1-element tree ($2^0$), with gaps at the 8-element ($2^3$) +and 2-element ($2^1$) positions. + +Each of the hashes in the forest can be referred to by an integer label. This labeling is a convention we find easiest +to use but does not directly affect the design of the accumulator; other labeling systems could also work and be +translated to this one. + +We label positions starting at `0` on the bottom left, incrementing as we traverse the bottom row from left to right, +and then continue on to higher rows. 
There may be gaps in the label numbers when moving up a row; the label +numbers are "padded out" to the next perfect tree that could encompass the entire forest. + +For example, a forest with 8 leaves will have a single tree and positions will be labeled like this: + +``` +14 +|---------------\ +12 13 +|-------\ |-------\ +08 09 10 11 +|---\ |---\ |---\ |---\ +00 01 02 03 04 05 06 07 +``` + +A forest with 7 leaves will look like this: + +``` + +|---------------\ +12 +|-------\ |-------\ +08 09 10 +|---\ |---\ |---\ |---\ +00 01 02 03 04 05 06 +``` + + +Adding another leaf to an accumulator that already holds $2^N$ leaves causes +the accumulator to resize to hold $2^{N+1}$ leaves. For example, when adding a leaf to the accumulator +state here: + +``` +14 +|---------------\ +12 13 +|-------\ |-------\ +08 09 10 11 +|---\ |---\ |---\ |---\ +00 01 02 03 04 05 06 07 +``` + +The new accumulator will look like so: + +``` + +|-------------------------------\ +28 +|---------------\ |---------------\ +24 25 +|-------\ |-------\ |-------\ |-------\ +16 17 18 19 +|---\ |---\ |---\ |---\ |---\ |---\ |---\ |---\ +00 01 02 03 04 05 06 07 08 +``` + +The new accumulator with all the positions: + +``` +30 +|-------------------------------\ +28 29 +|---------------\ |---------------\ +24 25 26 27 +|-------\ |-------\ |-------\ |-------\ +16 17 18 19 20 21 22 23 +|---\ |---\ |---\ |---\ |---\ |---\ |---\ |---\ +00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 +``` + +# Definitions + +- `hash` refers to a 32-byte array. +- `[]hash` refers to a vector of `hash`. +- `acc` refers to the Utreexo accumulator state. An `acc` consists of: + - `roots` refers to the roots of the Merkle trees. Represented as `[]hash`. + - `numleaves` refers to the total number of leaves ever added to the accumulator. Represented as uint64. +- `root` refers to the top `hash` in a tree in the `acc`. +- `proof` is an inclusion proof for elements in the accumulator. 
It consists of two fields: + - `targets` are the positions of the elements being proven. Represented as a vector of uint64. + - `proof` are the hashes needed to compute the roots from the targets. Represented as a `[]hash`. `proof` MUST be in ascending order by the node positions. + The proof is considered invalid otherwise. + +# Specification + +The hash function SHA-512/256[^2] is used for the hash operations in the accumulator. + +A Utreexo accumulator implementation MUST support these 3 operations: Addition, Verification, and Deletion. + +## Utility Functions + +The following utility functions are required for performing accumulator operations: + +**parent_hash(left, right):** Returns the hash of the concatenation of two child hashes (`left` and `right`). +If either child is `nil`, the result is simply the non-`nil` child (treated as if the tree has a single child at that position). +If both children are `nil`, the result is `nil`. + +Implementation: + +```python +def parent_hash(left: bytes, right: bytes) -> bytes: + if right is None and left is None: return None + if left is None: return right + if right is None: return left + + return sha512_256(left + right) +``` + +**treerows(numleaves):** Returns the minimum number of bits required to represent `numleaves - 1`. This corresponds to the height of the smallest perfect tree that could hold all `numleaves` leaves, which is the height of the largest tree in the forest when `numleaves` is a power of two. Returns `0` if `numleaves` is `0`. + +The reason for taking the minimum number of bits required for `numleaves-1` and not `numleaves` is that when `numleaves` is a power of two, we'd get an off-by-one error. + +The accumulator with `numleaves=4` is illustrated below. The highest tree is at height `2`, thus `treerows(4)` should return `2`. +If we take the minimum number of bits required for `numleaves` we'll get `3`, which is not the value we want. +If we take the minimum number of bits required for `numleaves-1` we get the correct value of `2`. 
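The off-by-one can be checked directly with Python's built-in `int.bit_length()`. The snippet below is a small illustrative sketch, not part of the specification: it restates `treerows` as described above and prints it next to the naive `numleaves.bit_length()`, so the two can be compared for a few leaf counts.

```python
def treerows(numleaves: int) -> int:
    # Bits needed to represent numleaves - 1, with 0 leaves mapping to 0 rows.
    if numleaves == 0:
        return 0
    return (numleaves - 1).bit_length()


# Compare against the naive bit length of numleaves itself.
for n in (1, 2, 3, 4, 5, 8, 9):
    print(f"numleaves={n} treerows={treerows(n)} naive={n.bit_length()}")
```

The two values differ only when `numleaves` is a power of two, which is exactly the case the `numleaves - 1` adjustment is meant to handle.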
+ +``` +row 2: 06 + |-------\ +row 1: 04 05 + |---\ |---\ +row 0: 00 01 02 03 +``` + +Implementation: + +```python +def treerows(numleaves: int) -> int: + if numleaves == 0: return 0 + return (numleaves - 1).bit_length() +``` + +**is_right_sibling(position):** Returns `true` if the given `position` corresponds to a right sibling. +A position is on the right side if its least significant bit (LSB) is set (i.e., `position & 1 == 1`). +And it is the right sibling of a **given node** if all bits but the LSB are identical. + +Implementation: + +```python +def is_right_sibling(position: int) -> bool: + return (position & 1) == 1 +``` + +**right_sibling(position):** Returns the position of the right sibling of the given `position`. +If `position` is already on the right side, it returns `position` unchanged. +Otherwise, turning on the least significant bit moves the position to the right side. + +Implementation: + +```python +def right_sibling(position: int) -> int: + return position | 1 +``` + +**sibling(position):** Returns the position of the sibling of the given `position`. +If `position` is on the right side, it returns the left sibling by turning off the least significant bit. +If `position` is on the left side, it returns the right sibling by turning on the least significant bit. + +Implementation: + +```python +def sibling(position: int) -> int: + return position ^ 1 +``` + +**parent(position, total_rows):** Returns the parent position of the given `position` in an accumulator with `total_rows` tree rows. + +Implementation: + +```python +def parent(position: int, total_rows: int) -> int: + return (position >> 1) | (1 << total_rows) +``` + +**root_position(numleaves, row, total_rows):** Returns the position of the root at the specified `row` +in an accumulator with `numleaves` leaves and `total_rows` rows. Returns an undefined (garbage) value if +no root exists at the given row. 
This can be calculated as: + +Implementation: + +```python +def root_position(numleaves: int, row: int, total_rows: int) -> int: + if row < 0 or row > total_rows: + raise ValueError("Row must be between 0 and total_rows inclusive") + + mask = (2 << total_rows) - 1 + before = numleaves & (mask << (row + 1)) + shifted = (before >> row) | (mask << (total_rows + 1 - row)) + return shifted & mask +``` + +**root_present(numleaves, row):** Returns `true` if there is a root at the specified `row` +in an accumulator with `numleaves` leaves. + +Implementation: + +```python +def root_present(numleaves: int, row: int) -> bool: + return numleaves & (1 << row) != 0 +``` + +**detect_row(position, total_rows):** Returns the row at which the given `position` resides +in an accumulator with `total_rows` rows. + +Implementation: + +```python +def detect_row(position: int, total_rows: int) -> int: + # Count the leading 1-bits of the position, starting at bit `total_rows`. + for row in range(total_rows, -1, -1): + rowbit = 1 << row + if rowbit & position == 0: return total_rows-row +``` + +**isroot(position, numleaves, total_rows):** Returns `true` if the given `position` corresponds to a root +in an accumulator with `numleaves` leaves and `total_rows` rows. +It has the following precondition: + +Implementation: + +```python +def isroot(position: int, numleaves: int, total_rows: int) -> bool: + row = detect_row(position, total_rows) + return root_present(numleaves, row) and position == root_position(numleaves, row, total_rows) +``` + +**getrootidx(numleaves, position):** Returns the index (within the accumulator's root list) +of the root that will be affected when deleting the given `position`. 
+ +Implementation: + +```python +def getrootidx(numleaves: int, position: int) -> int: + idx = 0 + for row in range(treerows(numleaves), -1, -1): + if not root_present(numleaves, row): + continue + pos = position + for _ in range(detect_row(position, treerows(numleaves)), row): pos = parent(pos, treerows(numleaves)) + if isroot(pos, numleaves, treerows(numleaves)): + return idx + idx += 1 +``` + +**getrootidxs(numleaves, positions):** Returns a list of indexes corresponding to the roots in the accumulator state +that will be affected when deleting the given set of `positions`. +This is a wrapper around **getrootidx**, applied to each position in the input list. + +Implementation: + +```python +def getrootidxs(numleaves: int, positions: [int]) -> [int]: + return [getrootidx(numleaves, pos) for pos in positions] +``` + +The following utility functions are required for [Utreexo - Peer Services](bip-0183.md): + +**max_possible_pos_at_row(row, total_rows):** Returns the greatest position that the given `row` can contain in an accumulator with `total_rows` rows. + +Implementation: + +```python +def max_possible_pos_at_row(row: int, total_rows: int) -> int: + mask = (2 << total_rows) - 1 + return ((mask << (total_rows - row)) & mask) - 1 +``` + +**is_root_position(position, num_leaves, row):** Returns whether the given `position` is a root for the given `num_leaves` and `row`. + +```python +def is_root_position(position: int, num_leaves: int, row: int) -> bool: + root_present = (num_leaves & (1 << row)) != 0 + root_pos = root_position(num_leaves, row, treerows(num_leaves)) + return root_present and root_pos == position +``` + +**proof_positions(targets, num_leaves):** Returns all the positions of the proof hashes that are required to validate the given targets. 
+ +```python +def proof_positions(targets: [int], num_leaves: int) -> [int]: + targets.sort() + + next_targets = [] + proof_positions = [] + + total_rows = treerows(num_leaves) + for row in range(total_rows + 1): + i = 0 + while i < len(targets): + target = targets[i] + + if target > max_possible_pos_at_row(row, total_rows): + i += 1 + continue + + if row != detect_row(target, total_rows): + i += 1 + continue + + if is_root_position(target, num_leaves, row): + i += 1 + continue + + if i + 1 < len(targets) and right_sibling(target) == targets[i + 1]: + parent_pos = parent(target, total_rows) + next_targets.append(parent_pos) + targets[i] = parent_pos + i += 2 # skip the sibling + continue + + # Sibling is a needed proof position + proof_positions.append(sibling(target)) + parent_pos = parent(target, total_rows) + next_targets.append(parent_pos) + targets[i] = parent_pos + i += 1 + + targets.sort() + + return proof_positions +``` + +### CalculateRoots + +Both the Verification and Deletion operations depend on the CalculateRoots function. + +- Inputs: + - `acc.numleaves`. + - `[]hash` that are the hashes for the `proof.targets`. + - `proof`. + +The passed in `[]hash` and `proof.targets` should be in the same order. The element at index `i` in `[]hash` should +be the hash for the element at index `i` in `proof.targets`. Otherwise the returned roots will be invalid. + +The calculate roots algorithm is defined as `CalculateRoots(numleaves, []hash, proof) -> calculated_roots`: + +- Check if the length of `proof.targets` is equal to the length of `[]hash`. Return early if they're not equal. +- Map `proof.targets` to their hash. +- Sort `proof.targets`. +- Loop until `proof.targets` is empty: + - Pop off the first target in `proof.targets`. Pop off the associated `hash` as well. + - If the target is a root, we append the current position's `hash` to the calculated_roots vector and continue. + - Check if the next target in `proof.targets` is the right sibling of the current target. 
If it is, grab its hash as the sibling hash. Otherwise the next hash in `proof.proof` is the sibling hash. Raise error if `proof.proof` is empty. + - Figure out if the sibling hash is on the left or the right. + - Apply *parent_hash* to the current position's `hash` and the sibling `hash` with regards to their positioning. + - Calculate parent position. + - Insert parent position into the sorted `proof.targets`. + - Map parent hash to the parent position. +- Return calculated_roots + +The algorithm implemented in python: + +```python +import bisect + +def calculate_roots(numleaves: int, dels: [bytes], proof: Proof) -> [bytes]: + if not proof.targets: return [] + # dels is None during deletion, where every target is treated as an empty (None) leaf. + if dels is not None and len(proof.targets) != len(dels): return [] + + position_hashes = {} + for i, target in enumerate(proof.targets): + position_hashes[target] = None if dels is None else dels[i] + + # Work on a copy so that the caller's proof is not consumed. + proof_hashes = list(proof.proof) + calculated_roots = [] + sortedTargets = sorted(proof.targets) + while sortedTargets: + pos = sortedTargets.pop(0) + cur_hash = position_hashes.pop(pos) + + if isroot(pos, numleaves, treerows(numleaves)): + calculated_roots.append(cur_hash) + continue + + parent_pos = parent(pos, treerows(numleaves)) + if sortedTargets and right_sibling(pos) == sortedTargets[0]: + sib_pos = sortedTargets.pop(0) + p_hash = parent_hash(cur_hash, position_hashes.pop(sib_pos)) + else: + proofhash = proof_hashes.pop(0) # raises IndexError if the proof is too short + p_hash = parent_hash(proofhash, cur_hash) if is_right_sibling(pos) else parent_hash(cur_hash, proofhash) + + position_hashes[parent_pos] = p_hash + bisect.insort(sortedTargets, parent_pos) + + return calculated_roots +``` + +## Addition + +Addition adds a leaf to the accumulator. The existence of an added leaf can later be verified +with an inclusion proof. + +Inputs: + - `acc`. + - `hash` to be added. + +The Addition algorithm Add(`acc`, `hash`) is defined as: + +- From row 0 to and **including** `treerows(acc.numleaves)` + - Break if there's no root at this row. + - Remove the last root from `acc.roots`. 
+ - Calculate the parent hash of the removed root and the `hash` to be added using *parent_hash*. + - Make the result from `parent_hash` the new `hash`. +- Increment `acc.numleaves` by 1. +- Append `hash` to `acc.roots`. + +The algorithm implemented in python: + +```python +def add(self, hash: bytes): + for row in range(treerows(self.numleaves)+1): + if not root_present(self.numleaves, row): break + root = self.roots.pop() + hash = parent_hash(root, hash) + + self.roots.append(hash) + self.numleaves += 1 +``` + +## Verification + +- Inputs: + - The accumulator state. + - `[]hash` that are the hashes for the `proof.targets`. + - `proof`. + +The Verification algorithm `Verify(acc, []hash, proof) -> bool` is defined as: + +- Raise error if the length of `[]hash` differs from the length of `proof.targets`. +- Get modified_roots from `CalculateRoots(acc.numleaves, []hash, Proof)`. +- Get `root_idxs` from `getrootidxs`. +- Raise error if the lengths of `modified_roots` and `root_idxs` do not match. +- Attempt to match roots in modified_roots with roots in `acc`. Raise error if we don't find all the roots in the modified_roots in `acc`. +- Return `true`. + +The algorithm implemented in python: + +```python +def verify(self, dels: [bytes], proof: Proof) -> bool: + if len(dels) != len(proof.targets): + raise ValueError("len of dels and proof.targets differ") + + root_candidates = calculate_roots(self.numleaves, dels, proof) + root_idxs = getrootidxs(self.numleaves, proof.targets) + + if len(root_candidates) != len(root_idxs): + raise ValueError("length of calculated roots from the proof and expected root count differ") + + for i, idx in enumerate(root_idxs): + if self.roots[idx] != root_candidates[i]: + raise ValueError("calculated roots from the proof and matched roots differ") + + return True +``` + +## Deletion + +Deletion removes leaves from the accumulator. The deletion algorithm takes in a `proof` but it does not +verify that the proof is valid. It assumes that the passed in proof has already passed verification. 
+ +- Inputs: + - The accumulator state. + - `proof`. + +The Deletion algorithm `Delete(acc, Proof) -> acc` is defined as: + +- Get the indexes of the affected roots, `root_idxs`, from `getrootidxs`. +- Get `modified_roots` from `CalculateRoots(acc.numleaves, nil, Proof)`. The target hashes are passed as `nil`, so every deleted leaf is treated as empty. +- Replace the roots at the indexes in `root_idxs` in `acc.roots` with the corresponding `modified_roots`. + +The algorithm implemented in python: + +```python +def delete(self, proof: Proof): + modified_roots = calculate_roots(self.numleaves, None, proof) + root_idxs = getrootidxs(self.numleaves, proof.targets) + for i, idx in enumerate(root_idxs): + self.roots[idx] = modified_roots[i] +``` + +## Rationale + +**Why use a hash-based accumulator instead of something more powerful (e.g., RSA accumulators[^3], class groups[^4], etc.)?** + +While RSA accumulators and similar constructions offer significant advantages in proof size, often allowing a +single proof to cover an entire block's worth of UTXOs, the trade-offs in proof generation cost and latency are +substantial. In RSA-based designs, creating a proof for any given UTXO at arbitrary times can be computationally +intensive, especially as the number of UTXOs grows. + +Utreexo's design is driven by the need for *bridge nodes*: nodes that maintain backward compatibility with existing +Bitcoin nodes and wallets that do not use Utreexo. A bridge node must be able to generate an inclusion proof for +any UTXO at any time, with low latency, to support lightweight clients and peer nodes. + +With the Utreexo hash-based accumulator, bridge nodes can produce such proofs efficiently. Proofs are derived +directly from the current accumulator state, and crucially, when a bridge node processes and verifies a new +block, it can update its state such that all previously known proofs remain valid or are efficiently updatable +with no additional per-proof recomputation cost. In other words, proof maintenance happens "for free" +as part of normal block validation. 
+ +By contrast, group-of-unknown-order accumulators (like RSA or class groups) typically require significant +computation to update all outstanding proofs whenever the accumulator changes, making the bridge node use +case effectively infeasible at Bitcoin scale. + +There are additional disadvantages to non-hash-based accumulators, including: + + - Some require trusted setups (though not all). + - They introduce new cryptographic assumptions, which Bitcoin users may be hesitant to adopt. + - Many are not quantum-safe, whereas hash-based designs like Utreexo inherit Bitcoin's existing reliance on hash functions. + +That said, the most critical blocker remains the computational overhead of proof maintenance at scale. + +If future accumulator designs emerge that allow low-latency, per-UTXO proof generation for arbitrary historical +elements (without excessive computation), we would gladly reconsider. Much of the engineering effort behind +Utreexo lies not in the specific accumulator design, but in integrating any accumulator cleanly into +Bitcoin's validation and P2P protocol layers. + +**Why use the Utreexo Merkle-tree-like accumulator instead of a sparse Merkle tree[^5]?** + +While sparse Merkle trees are a functional alternative, they introduce significant inefficiencies in proof size and caching behavior. + +Bitcoin transaction patterns exhibit strong locality: most spends occur from recently created UTXOs. Utreexo's +forest-based Merkle accumulator takes advantage of this. Newly created UTXOs from a block are appended to +the bottom right of the forest, keeping them clustered. + +As a result, when the next block spends UTXOs, the deletions often target elements also near the bottom right, and +their proofs tend to overlap at very low tree heights, minimizing proof size and hashing cost. + +By contrast, a sparse Merkle tree would spread new UTXOs uniformly across a massive keyspace, making proof paths +diverge much higher in the tree. 
This would lead to larger proof sizes, and worse cache locality, increasing computational +and I/O costs for proof generation and verification. + +While sparse trees may simplify some aspects of implementation, the proof size and performance characteristics of +Utreexo’s locality-aware forest design make it better suited for Bitcoin’s UTXO model and transaction patterns. + + +## Reference Implementations + +In Python - https://github.com/utreexo/pytreexo + +In Rust - https://github.com/mit-dci/rustreexo + +In Go - https://github.com/utreexo/utreexo + +## Backward Compatibility + +The Utreexo accumulator is a new data structure to the existing Bitcoin protocol and does not pose any backwards compatibility issues. + +## Related Work + +[UHS: Full-node security without maintaining a full UTXO set](https://gnusha.org/pi/bitcoindev/CAApLimjfPKDxmiy_SHjuOKbfm6HumFPjc9EFKvw=3NwZO8JcmQ@mail.gmail.com/) + +[The TXO bitfield](https://gnusha.org/pi/bitcoindev/CA+KqGkpa0=O-ob6SsxST6bHwHu9hTnS16wnpNusrbc8nXVEouA@mail.gmail.com/) + +[AssumeUTXO](https://github.com/bitcoin/bitcoin/blob/master/doc/design/assumeutxo.md) + +## Acknowledgements + +We thank Pieter Wuille for originally discussing about the idea of an accumulator with a feasible bridge node for Bitcoin on the beaches of the Caribbean with Tadge Dryja. +We also thank BOB Spaces for lending the space to draft this BIP. + +## References + +[^1]: Reyzin, Leonid, and Sophia Yakoubov. "Efficient asynchronous accumulators for distributed PKI." International Conference on Security and Cryptography for Networks. Cham: Springer International Publishing, 2016. https://eprint.iacr.org/2015/718 +[^2]: Gueron, Shay, Simon Johnson, and Jesse Walker. "SHA-512/256." 2011 Eighth International Conference on Information Technology: New Generations. IEEE, 2011. 
https://ieeexplore.ieee.org/abstract/document/5945260 +[^3]: Josh Benaloh and Michael de Mare, "One-way accumulators: A decentralized alternative to digital signatures," EUROCRYPT 1993. +https://link.springer.com/chapter/10.1007/3-540-48285-7_24 +[^4]: Boneh, Dan, Benedikt Bünz, and Ben Fisch. "Batching techniques for accumulators with applications to IOPs and stateless blockchains." Advances in Cryptology–CRYPTO 2019: 39th Annual International Cryptology Conference, Santa Barbara, CA, USA, August 18–22, 2019, Proceedings, Part I 39. Springer International Publishing, 2019. +[^5]: Szydlo, Michael. "Merkle tree traversal in log space and time." International Conference on the Theory and Applications of Cryptographic Techniques. Berlin, Heidelberg: Springer Berlin Heidelberg, 2004. diff --git a/bip-0182.md b/bip-0182.md new file mode 100644 index 0000000000..fed060e05a --- /dev/null +++ b/bip-0182.md @@ -0,0 +1,369 @@ +``` + BIP: 182 + Layer: Peer Services + Title: Utreexo - Transaction and block validation + Author: Tadge Dryja + Calvin Kim + Davidson Souza + Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0182 + Status: Draft + Type: Standards Track + Created: 2023-10-01 + License: BSD-3-Clause + Requires: 181 +``` + +## Abstract + +This BIP defines the rules for validating blocks and transactions using the +Utreexo accumulator. It is important to note that this BIP does not define the +Utreexo accumulator itself, for that see [Utreexo Accumulator Specification](bip-0181.md). This document is only concerned with +the general rules for validating blocks and transactions using the Utreexo, +so that all Utreexo nodes can stay in consensus with one another. + +## Motivation + +Although Utreexo in its current form is not proposed as a soft fork, it is essential that +all implementations adhere to a consistent workflow when performing consensus-critical +operations. 
This BIP defines that workflow, along with the specific rules and their +required ordering. + +There are five consensus-critical components when using the Utreexo accumulator to +represent the UTXO set: + + - 1: The serialization format of each UTXO ("leaf data"). + - 2: The hash function used to hash the leaf data. + - 3: Which transaction outputs are excluded from the accumulator. + - 4: The order of operations for the additions and deletions in the accumulator. + - 5: The format of the UTXO proof. + +A discrepancy in any of the five components above will result in a divergent +accumulator state, leading to consensus incompatibilities. + +## License + +This BIP is licensed under the BSD 3-clause license. + +## Specification + +### Node Hashes + +During a node's normal operation, it will need to compute the leaf hash for UTXOs +being added or removed from the accumulator. The leaf hash is a 32-byte hash that +is computed using the SHA-512/256 hash function. See [UTXO Hash Preimages](#utxo-hash-preimages) for the +details on how to compute the leaf hash. + +Unless otherwise specified, all fields are in little-endian format. + +#### UTXO Hash Preimages + +Individual UTXOs are represented as 32-byte hashes in the Utreexo accumulator. To obtain this +hash, you must compute the SHA-512/256 hash of the following data: + +| Name | Type | Description | +| ------------------ | ------------------------ | ----------------------------------------- | +| Utreexo_Tag_V1 | 64-byte array | The version tag to be prepended to the leafhash. | +| Utreexo_Tag_V1 | 64-byte array | The version tag to be prepended to the leafhash. | +| BlockHash | 32-byte array | The hash of the block in which this tx was confirmed. | +| TXID | 32-byte array | The transaction's TXID | +| Vout | 4-byte unsigned integer | The output index of this UTXO | +| Header code | 4-byte unsigned integer | The block height and iscoinbase. 
This is a value obtained by left shifting the block height that confirmed this transaction by one bit, and then OR-ing it with 1, only if this transaction is a coinbase. | +| Amount | 8-byte unsigned integer | The amount in satoshis for this UTXO | +| Output script size | varint | The output script length in bytes | +| Output script | variable byte array | The output script of the UTXO | + +Each field is defined as follows: + +##### Version tag + +We use [tagged hashes](https://github.com/bitcoin/bips/blob/master/bip-0340.mediawiki#user-content-Design) for the hashes committed in the accumulator for versioning +purposes. This is added so that if there are changes in the preimage of the +hash, the version tag helps to avoid misinterpretation. + +The Utreexo version tag is the SHA512 hash of the string `UtreexoV1`, which is represented as the vector +`[85 116 114 101 101 120 111 86 49]` and hex `0x5574726565786f5631`. (The resulting 64-byte output is +`5b832db8ca26c25be1c542d6cceddda8c145615cff5c35727fb3462610807e20ae534dc3f64299199931772e03787d18156eb3151e0ed1b3098bdc8445861885`). + +##### Blockhash + +We commit to the hash of the block which confirms the UTXO. This +is not currently used in the validation code, but could be used in a future +version to increase the work required for collision attacks. +A valid blockhash requires a large amount of work, which would prevent an +attacker from performing a standard cycle-finding collision attack in $2^{n/2}$ +operations for an n-bit hash. + +This could allow a later or alternate version to use shorter truncated hashes, +saving bandwidth and storage while still keeping Bitcoin's $2^{128}$ security. + +##### TXID + +The TXID is the transaction ID of the transaction that created this UTXO. + +##### VOUT + +The output index of the UTXO in the transaction. + +##### Header code + +This field stores the block height and a boolean for marking that the UTXO was +created by a coinbase transaction. 
It mostly serves to save space, as the coinbase +boolean can be stored in a single bit. + +This field is a value obtained by left shifting the block height that +confirmed this transaction by one bit, and then setting the least significant bit to 1 only +if it's part of a coinbase transaction. The code to do that is like so: + +``` +header_code = block_height +header_code <<= 1 +if IsCoinBase { + header_code |= 1 // only set the bit 0 if it's a coinbase. +} +``` + +The block height is needed because, during transaction validation, it is used in +the BIP-0068 relative lock-time check. In current nodes, the block height is stored locally +as a part of the UTXO set. Since Utreexo nodes get this data from peers, we need +to commit to the block height to avoid security vulnerabilities. + +The boolean for coinbase outputs is needed as they may not be spent before having 100 confirmations. +This data is also currently stored locally as a part of the UTXO set for current nodes. + +##### Amount + +This field is added to commit to the value of the UTXO. With current nodes, this +is stored in the UTXO set but since we receive this in the proof from our peers, +we need to commit to this value to avoid malicious peers that may send over the +wrong amount. + +##### Output script size + +As the output script ("scriptPubKey" in Bitcoin Core and btcd) is a variable length byte array, we prepend it with the +length. + +##### Output script + +This field is added to commit to the output script of the UTXO. With current +nodes, this is stored in the UTXO set but since we receive this in the proof +from our peers, we need to commit to this value to avoid malicious peers that +may send over the wrong output script. + +#### Hash function + +The leaf data is hashed with SHA-512/256, which gives us a 32 byte hash. +It was chosen over SHA-256 due to the faster performance on 64 bit systems. + +#### Excluded UTXOs from the accumulator + +Not all transaction outputs are added to a node's UTXO set.
Normal Bitcoin nodes +only form consensus on the set of transactions, not on the UTXO set, so different +nodes can omit different outputs and stay compatible as long as those outputs are +never spent. Utreexo nodes, however, do require explicit consensus on the UTXO set +as all proofs are with respect to the Merkle roots of the entire set. + +For this reason, we define which UTXOs are not inserted to the accumulator. Any +variations here will result in Utreexo nodes with incompatible proofs. + +##### Provably unspendable transaction outputs + +There are outputs in the Bitcoin network that we can guarantee that they cannot +be spent without a hard-fork of the network. The following output types are not +added to the accumulator: +- Outputs whose output script starts with an OP_RETURN (0x6a) +- Outputs with an output script larger than 10,000 bytes + +##### Same block spends + +Often, UTXOs are created and spent in the same block. This is allowed by Bitcoin +consensus rules as long as the output being spent is created by a transaction earlier +in the block than the spending transaction. +In Utreexo, nodes inspect blocks and identify which outputs are being created +and destroyed in the same block, and exclude them from the accumulator and proofs. + +There's no need to provide proofs for outputs which have been created in the same +block. Adding and then immediately removing the output from the accumulator would be +possible but doesn't serve any purpose - once outputs are spent, their past existence +cannot be proven with the Utreexo accumulator (and SPV proofs already provide that). + +For these reasons, outputs which are spent in the same block where they are created +are omitted from the accumulator, and those inputs are omitted from block proofs. + +#### Order of operations + +The Utreexo accumulator lacks associative properties during addition and the +ordering of which UTXO hash gets added first is consensus critical. 
For +the modification of the accumulator the steps are as follows: + +1. Batch remove the UTXOs that were spent in the block based on the algorithm + defined in [Utreexo Accumulator Specification](bip-0181.md). Deletions themselves are order-independent. +2. Batch add all non-excluded outputs in the order they're included in the + Bitcoin block. Additions are order-dependent. + +The removal and the addition of the hashes follow the algorithms defined in +[Utreexo Accumulator Specification](bip-0181.md). + +#### Format of the UTXO proof + +The UTXO proof has 2 elements: the accumulator proof and the leaf data. The +leaf data provides the necessary UTXO data for block validation that would be +stored locally for non-Utreexo nodes. Non-Utreexo nodes store this data (under "chainstate/" for Bitcoin Core) +but since Utreexo nodes don't store this data, it must be provided. + +The accompanying accumulator proof proves that the given leaf data are committed +in the accumulator. Without this accumulator proof, the Utreexo nodes would not have +a way to ensure that the given UTXO data exists in the UTXO set. The accumulator proof +and its verification ensure that Utreexo nodes are not fooled into accepting transactions +whose input UTXOs do not exist. + +[Accumulator Proof](bip-0181.md#definitions) is defined in BIP-0181, and contains two elements: + +1. A vector of positions of the UTXO hashes in the accumulator. +2. A vector of hashes required to hash up to the roots. + +For (1), positions are in the order of the leaves that are being proven in +the accumulator. These are all the inputs in the natural blockchain order, +excluding same-block spends. + +The UTXO hash preimages follow the same ordering as (1) in the accumulator +proofs. Each of the positions in (1) refers to the UTXO hash preimage at the same +index.
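
The UTXO hash preimage layout and the header code defined under "UTXO Hash Preimages" can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. The helper names are hypothetical, `varint` is assumed to mean Bitcoin's CompactSize encoding, and the block hash and TXID are assumed to be passed in their internal (non-reversed) byte order. The final hashing step uses `hashlib.new("sha512_256")`, which is only available when the local OpenSSL build provides that algorithm.

```python
import hashlib
import struct

# The tag is SHA-512("UtreexoV1"); the preimage table lists it twice.
UTREEXO_TAG_V1 = hashlib.sha512(b"UtreexoV1").digest()


def header_code(block_height: int, is_coinbase: bool) -> int:
    """Shift the height left by one bit; bit 0 is set only for coinbase outputs."""
    return (block_height << 1) | (1 if is_coinbase else 0)


def compact_size(n: int) -> bytes:
    """Bitcoin CompactSize encoding (assumed to be the 'varint' of the table)."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)


def leaf_preimage(block_hash: bytes, txid: bytes, vout: int, block_height: int,
                  is_coinbase: bool, amount: int, script: bytes) -> bytes:
    """Concatenate the fields in table order; integers are little-endian."""
    return b"".join([
        UTREEXO_TAG_V1,
        UTREEXO_TAG_V1,
        block_hash,                                            # 32 bytes
        txid,                                                  # 32 bytes
        struct.pack("<I", vout),                               # 4 bytes
        struct.pack("<I", header_code(block_height, is_coinbase)),
        struct.pack("<Q", amount),                             # 8 bytes
        compact_size(len(script)),
        script,
    ])


def leaf_hash(preimage: bytes) -> bytes:
    """SHA-512/256 of the preimage; needs OpenSSL support for 'sha512_256'."""
    return hashlib.new("sha512_256", preimage).digest()
```

For example, an output confirmed at height 100 in a coinbase transaction gets header code `201`, while a non-coinbase output at the same height gets `200`.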
+ +| Field Name | Data Type | Byte Size | Description | +| ------------------- | ------------------- | --------- | ---------------------------------------- | +| Accumulator Proof | variable byte array | variable | The Utreexo proof as defined in BIP-0181 | +| UTXO hash preimages | variable byte array | variable | The UTXO data needed to validate all the transactions in the block | + +#### UTXO proof validation + +For each block, the UTXO proof must be provided with the bitcoin block for +validation to be possible. Without the UTXO proof, it's not possible to +validate that the inputs being referenced exist in the UTXO set. + +The end result of the UTXO proof validation is the vector of UTXO +hash preimages that are required to perform the rest of the consensus +validation checks. Note that the resulting data from the UTXO proof validation +is the same data that would normally be fetched from the locally stored UTXO +set. + +The order of operations for the UTXO proof validation is: + +1. Hash the UTXO preimages. +2. Verify that the UTXO preimages exist in the accumulator with the verification + algorithm specified in BIP-0181. + +### BIP-0030 + +[`BIP-0030`](https://github.com/bitcoin/bips/blob/master/bip-0030.mediawiki) is an added +consensus check that prevents duplicate TXIDs. This check and the historical violations +of this check affect the consensus validation for Utreexo nodes. + +### BIP-0030 and BIP-0034 consensus check + +Before `BIP-0030`, the Bitcoin consensus rules allowed for duplicate TXIDs. If two +transactions shared the same TXID, the transaction outputs of the succeeding +transaction would overwrite the previously created UTXOs. It was assumed that +TXIDs were unique but it was trivially easy to create a duplicate transaction that was +exactly the same, resulting in a duplicate `TXID` for coinbase transactions by re-using +the same bitcoin address.
+ +The `BIP-0030` check is a consensus check that enforces that newly created transactions +do not have outputs that overwrite an existing UTXO. + +`BIP-0034` introduces a rule that requires the block height to be included in the coinbase field +of the coinbase transaction. The main reason for the change was to make +coinbase transactions unique so that the expensive check of going through the +UTXO set wouldn't be needed. However, there were blocks in the past that had +random bytes that could be interpreted as block heights. The lowest implicated block +heights are: 209,921, 490,897, and 1,983,702. + +Up until block 209,921 the BIP-0030 checks are performed by non-Utreexo nodes. +Since Utreexo nodes only keep the UTXO set commitment, it's not possible to +perform the `BIP-0030` check. In theory, those blocks can't be reorged because +of checkpoints, which go back to block height 295,000 with the block hash +`00000000000000004d9b4ef50f0f9d686fd69db2e03af35a100370c64632a983`. Any chain that +doesn't include this block at height 295,000 isn't valid as removing this check +would be a hard-fork. We note, however, that as of version `30.0`, Bitcoin Core +will remove the checkpoints[^1], as they are not needed anymore to prevent attacks +against nodes during Initial Block Download. This is effectively a hard-fork, +though one that will probably never actually be triggered. + +Block 1,983,702 is the first block at which Utreexo nodes would be in danger of a +consensus failure due to the inability to perform the BIP-0030 checks if someone were +to reuse the coinbase transaction from block 164,384. However, this block will happen roughly +21 years from now, and some mitigations have been proposed [^2].
+ +### Historical BIP-0030 violations + +There were two UTXOs that were overwritten by repeated transactions: + +- `e3bf3d07d4b0375638d5f1db5255fe07ba2c4cb067cd81b84ee974b6585fb468:0` at block height 91,722 +- `d5d27987d2a3dfc724e359870c6644b40e497bdc0589a033220fe15429d88599:0` at block height 91,812 + +Since the leaf hashes that are committed to the Utreexo accumulator commit to +the block hash as well, all the leaf hashes are unique and the two historical +violations do not happen with how the UTXO set is represented with the Utreexo +accumulator. To be consensus compatible with clients that retain only the second +occurrences of these outputs, the leaves representing the corresponding first UTXOs in the Utreexo accumulator +are hardcoded as unspendable. + +These two leaf hashes encoded in hex string are: + + 1. `84b3af0783b410b4564c5d1f361868559f7cf77cfc65ce2be951210357022fe3` + 2. `bc6b4bf7cebbd33a18d6b0fe1f8ecc7aa5403083c39ee343b985d51fd0295ad8` + +(1) represents the UTXO created at block height 91,722 and (2) represents the +UTXO created at block height 91,812. + +## Rationale + +**Why use the Utreexo accumulator to keep track of UTXOs instead of a key-value database like leveldb?** + +There are two main advantages to using the Utreexo accumulator instead of a key-value database like leveldb: + + 1. Puts a cap on the UTXO set growth. + 2. Performance gains with the elimination of random reads/writes. + +### Puts a cap on the UTXO set growth + +The UTXO set is the collection of unspent transaction outputs and is used to verify blocks. +As the number of Bitcoin users grows, the UTXO set grows with it. + +The UTXO set is currently around 10GB in 2025 and with pruning that's all it takes to maintain a full node. +However, as the UTXO set grows, the disk storage requirement will grow along with it and increase the barrier to running a full node. + +Currently, the UTXO set size is $O(N)$ where $N$ is the number of UTXOs.
+By utilizing the Utreexo accumulator, we're able to cap the UTXO set growth at $O(log_2(N))$. + +### Performance gains with the elimination of random reads/writes + +For block validation, there's 3 main bottlenecks: + + 1. The internet speed that determines how fast the block is downloaded. + 2. The disk speed that determines how fast the relevant UTXOs for the block are fetched from the disk. + 3. The CPU speed that determines how fast the signatures will be validated. + +During the default Initial Block Download with assumevalid on, the slowest operation is (2). +This is because fetching and storing the UTXOs are random disk reads and writes, the slowest part of a disk operation. + +However when a node is utilizing the Utreexo accumulator, nothing needs to be fetched from the disk. +Instead, the random disk reads and writes are replaced with Utreexo verification which are just hash calculations. + +## Reference Implementation + +[Utreexod](https://github.com/utreexo/utreexod): A full node implementation with Utreexo support, written in Golang. + +[Floresta](https://github.com/vinteumorg/floresta): A lightweight Utreexo client, written in Rust. + +## Backward Compatibility + +Utreexo nodes are fully backwards compatible with current nodes as they will follow the same chain tip as the current nodes. +Similarly, Utreexo nodes will only consider currently valid transactions for mempool acceptance. + +## Acknowledgements + +We thank BOB Spaces for lending the space to draft this BIP. 
+ +## References + +[^1]: https://groups.google.com/g/bitcoindev/c/qyId8Yto45M +[^2]: https://delvingbitcoin.org/t/great-consensus-cleanup-revival/710 diff --git a/bip-0183.md b/bip-0183.md new file mode 100644 index 0000000000..f8f4a2d3ee --- /dev/null +++ b/bip-0183.md @@ -0,0 +1,570 @@ +``` + BIP: 183 + Layer: Peer Services + Title: Utreexo - Peer Services + Author: Tadge Dryja + Calvin Kim + Davidson Souza + Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0183 + Status: Draft + Type: Standards Track + Created: 2024-08-08 + License: BSD-3-Clause + Requires: 181, 182 +``` + +## Abstract + +Utreexo creates a compact representation of the UTXO set that only takes a couple of kilobytes. +When spending a transaction, one must provide an inclusion proof for the UTXOs being spent. +This BIP defines the networking-layer changes needed to allow nodes to exchange the inclusion proofs. +This document **does not** describe how to validate blocks and transactions using the provided data, see [BIP-0182](bip-0182.md) for more details. + +## Motivation + +Utreexo nodes require the inclusion proof to fully validate blocks and transactions. +Each block has a corresponding inclusion proof with it and this inclusion proof for blocks up to height 906,937 requires an additional 631.85GB, which is roughly 40GB less than the size of the block data. +Each transaction also has a corresponding inclusion proof with it and for normal transaction relay, the proof is roughly 3 times the size of the transaction. +It's still reasonable for a single node to download this extra data but a little caching goes a long way in reducing the amount of data that one has to download. +We define the new P2P messages for the inclusion proofs to support caching to reduce bandwidth while also allowing a high bandwidth, low-latency usage. + +## License + +This BIP is licensed under the BSD-3-Clause license. 
+ +## Overview + +### Requirements and Compatibility + +Nodes implementing Utreexo can choose which messages to support. +There are a number of configurations possible, and this BIP does not restrict nodes to any subsets of messages. + +That said, there are three likely types of nodes: +1. Compact State Nodes (CSNs) +2. Bridge nodes +3. Archive nodes + +CSNs have the goal of minimizing data storage and download while performing block validation. +Archive and bridge nodes store more data and provide this data to CSNs. + +Bridge nodes are nodes that can add inclusion proofs to mempool transactions, support the same set of messages as CSNs, and should in fact be indistinguishable from CSNs on the network. +Archive nodes are able to serve the blocks and the inclusion proofs. However, they are not able to generate the inclusion proofs as they do not keep the full UTXO set. + +Note that the archive and bridge capabilities of a node are separate; a bridge node can be bridge only, without previous block proof data, and an archive node doesn't need to be able to bridge. + +The one exception to this flexibility is that archive nodes must provide both the blocks and the inclusion proofs. +While theoretically possible to split these two resources, the blocks are quite small relative to the block proofs, and it simplifies clients to be able to rely on being able to request both over the same connection. + +### Pre-P2P: Bridge Building + +When introducing Utreexo into an existing network, there are two things needed before CSNs can operate. +First, archive nodes need to build proofs for old blocks to serve during the initial block download (IBD). +Second, nodes need to build and maintain the UTXO merkle forest, and an index of outpoints to leaves of that forest, so that they can build proofs for new transactions. +Both of these processes happen without any p2p messages by taking an already existing, synchronized archive full node and going through its stored block data. 
+ +Once an archive and bridge node have been established, CSNs download blocks and inclusion proofs to perform IBD and stay in sync with the bitcoin network. + +### Initial Block Download + +Conventionally, IBD is done by a headers-first block download, in which the node downloads all the Bitcoin block headers, verifies that they connect, and follows up by downloading the block data for validation. + +Below image illustrates how a non-Utreexo node would perform the IBD. + +![Non-Utreexo IBD](bip-0183/non-utreexo-initial-block-download.png) + +Utreexo nodes still perform the headers-first phase. +However, in addition to blocks, they also require the inclusion proof for UTXOs spent in that block. +Hence, a Utreexo node will send a `getutreexoproof` message along with the `getdata` message for a given block. +This flow is the simplest change and allows a Utreexo node to validate and perform IBD. However, it requires downloading about twice as much data as a conventional node, because the inclusion proof for a block is roughly the same size as the block itself. + +Below image illustrates how a Utreexo node would perform the IBD. + +![Utreexo node IBD](bip-0183/utreexo-initial-block-download.png) + +For Utreexo nodes with memory to spare, we introduce a `TTL` message that will have a time-to-live value for each of the outputs in a given block. + +With these TTL values, a node receiving the `TTL` message will be able to determine which outputs to cache using the Clairvoyant algorithm[^1], which allows the IBD-ing node to reduce the bandwidth required to sync in the most efficient way possible. + +The node will have the block and the TTLs for the outputs of the given block, which it can then use to cache parts of the inclusion proof and only request the needed parts of an inclusion proof for future blocks. + +We note that it is feasible for a node to receive incorrect TTL values from malicious nodes and this can negatively impact the bandwidth savings.
+This can be mitigated by either: + +1. Avoiding downloading TTL values for blocks too far into the future since the damage done will be greater. +2. Rely on the pre-committed *TTL accumulator* in the node software. + The TTL accumulator has TTLs for each of the blocks accumulated. + With this accumulator, the node can check if the received TTL value is valid or invalid by checking for its existence in the TTL accumulator. + +The TTL accumulator is described in detail in the section [Commitment scheme for TTL messages](#commitment-scheme-for-ttl-messages) below. + +Below image illustrates how a Utreexo node would perform a bandwidth efficient IBD. + +![Bandwidth efficient Utreexo node IBD](bip-0183/bandwidth-efficient-utreexo-initial-block-download.png) + +### Transaction relay + +Non-Utreexo transaction relay is done by sending an inv message with the hash of the transaction and a type field that denotes that this hash represents a transaction. +If the node receiving the inv does not have a transaction matching that hash, the node then requests the transaction using a getdata message. + +Below image illustrates how a non-Utreexo node would relay transactions. + +![Non-Utreexo TX relay](bip-0183/non-utreexo-tx-relay.png) + +The transaction relay for Utreexo nodes doesn't add any extra round trips. +However, it does include extra inventory vectors in the inv message. + +We introduce a new inventory vector type called `utreexoproofhash`, which makes up the extra information that a Utreexo node will receive. + +A hash with the type `utreexoproofhash` represents four Utreexo merkle tree positions, each of them little-endian serialized and taking up 8 bytes in the 32-byte hash. +When sending an inv message to a Utreexo node for a transaction, we append `utreexoproofhash` inventory vectors to represent the merkle tree positions for each of the UTXOs being referenced in the inputs of the transaction. 
+The Utreexo merkle tree positions are explained in detail in [Utreexo Accumulator Specification](bip-0181#Merkle Forest). +Since the hash in an inventory vector is always 32 bytes, any unused space will be padded with the max uint64 value of 18446744073709551615. + +With these merkle tree positions for the UTXOs referenced in the inputs, we can calculate the positions of the merkle hashes needed to prove them. +These positions are then sent over in the `getdata` message as another inventory vector. + +Below image illustrates how a Utreexo node would relay transactions. + +![Utreexo TX relay](bip-0183/utreexo-tx-relay.png) + +There may be cases where the transaction is referencing more than 4 merkle positions. +In this case, the extra positions are added as another inventory vector. +There can be as many additional inventory vectors for the `utreexoproofhash`es as needed. +An inventory vector of type `utreexoproofhash` will be ignored if it's not preceded by an inventory vector of type `transaction`. + +Below image illustrates how a Utreexo node would relay transactions with multiple inventory vectors of the type `utreexoproofhash`. + +![Utreexo TX relay multiple Utreexo proof hash vectors](bip-0183/utreexo-tx-relay-with-multiple-proofhash-inventory-vectors.png) + +It's possible to have an inv message with multiple txs as well. +Note that an inventory vector of type `utreexoproofhash` MUST be appended to the `tx` inventory vector. + +Below image illustrates how a Utreexo node would relay multiple transactions. + +![Utreexo TX relay with multiple txs](bip-0183/utreexo-tx-relay-with-multiple-txs.png) + +### Block Propagation + +Legacy block propagation without Compact Blocks consists of three steps: + +1. Node A sends an inv message or a block header to Node B. +2. Node B makes a getdata request for the block. +3. Node A sends the block data to Node B. + +Below image illustrates how a non-Utreexo node would relay blocks without using Compact Blocks.
+ +![Non-Compact-Block Block Propagation](bip-0183/non-compact-block-block-propagation.png) + +The same block propagation with Utreexo nodes will look like so: + +1. Node A sends an inv message or a block header to Node B. +2. Node B makes a getdata request for the block. +3. Node B makes a getutreexoproof request for the block. +4. Node A sends the block data to Node B. +5. Node A sends the inclusion proof to Node B. + +Note that while Node A sent the inv or the blockhash to Node B, Node B is free to ask for the Utreexo proof from a node other than Node A. +This allows a Utreexo node to be notified of new blocks from non-Utreexo nodes. + +Since there's no PoW required for the inclusion proof, the block may be valid and the proof may be invalid. +If the block header validation passed while the full block validation fails, Node B should request the inclusion proof from a different peer. +If the new proof and the block pass validation, we can conclude that Node A is malicious and ban the peer. + +Below image illustrates how a Utreexo node would relay blocks without using Compact Blocks. + +![Non-Compact-Block Block Propagation with Utreexo Nodes](bip-0183/non-compact-block-utreexo-block-propagation.png) + +Since the inclusion proof is cached for each of the transaction in the mempool, it's possible to omit the proof hashes for the input UTXOs that we can already prove on our own. +This method looks like so: + +1. Node A sends an inv message or a block header to Node B. +2. Node B makes a getdata request (MSG_UTREEXO_SUMMARY) for the given blockhash. +3. Node A sends the utreexoblocksummary message to Node B. +4. Node B calculates which proof hashes and leafdatas it needs to prove this block. +5. Node B makes a getdata request for the block to Node A. +6. Node B makes a getutreexoproof request for the block to Node A. +7. Node A sends the block data to Node B. +8. Node A sends the requested inclusion proof data to Node B. 
+ +As with the getutreexoproof message, Node B is free to ask for the utreexoblocksummary message from a node other than Node A. +Since there's no commitment to anything in a utreexoblocksummary message, the information given in it can be false. +Should the block fail to validate with this propagation, Node B should request the full proof from a different peer. +Should the proof and the block pass validation, we can conclude that Node A is malicious and ban the peer. + +All of the above propagation works the same with Compact Block propagation as well. +The requester would need to send a getdata request (MSG_UTREEXO_SUMMARY) after the Compact Block propagation has concluded for high-bandwidth Compact Block propagation and after the header/inv message was received from the broadcasting peer. + +Below image illustrates how a Utreexo node would relay blocks in a bandwidth efficient manner without using Compact Blocks. + +![Bandwidth Saving Non-Compact-Block Block Propagation with Utreexo Nodes](bip-0183/bandwidth-saving-non-compact-block-utreexo-block-propagation.png) + +## Specification + +Several new data structures and messages are introduced to make the IBD and tx relay possible. +All structures are little-endian encoded unless otherwise noted. + +### New data structures + +#### Compact leaf data + +For a CSN to learn the data associated with a UTXO, it must ask for it from a peer that has it. +To authenticate this data, it is committed into the accumulator, and therefore cannot be changed by the peer. +The committed data is defined in [Utreexo - Transaction and block validation](bip-0182#UTXO Hash Preimages), but for some information in the leaf data, the receiving peer might already have it, so sending it again is a waste of bandwidth. +To save that bandwidth, we only send a Compact Leaf Data, that contains all missing information for the receiving peer to reconstruct the full leaf data. 
+A compact leaf data is defined as: + +| Field | Type | Description | +|--------------|------------------------------|-----------------| +| header code | uint32 | This is a value obtained by left shifting the block height that confirmed this transaction by one bit, and then OR-ing it with 1, only if this transaction is a coinbase. | +| amount | int64 | The amount in sats locked on this output | +| scriptPubkey | reconstructable scriptPubkey | The scriptPubkey in a reconstructable format, see [Reconstructable Script](#Reconstructable-Script) for more details | + +#### Reconstructable Script + +For some script types (e.g. `ScriptHash`, `PubkeyHash`, `WitnessScriptHash`, `WitnessPubkeyHash`) the scriptPubkey does not contain the actual locking condition, only a hash of it. +The script which is evaluated is provided as an element of the scriptSig or witness data. + +Therefore, we can safely omit the locking script hash from the UTXO data and reconstruct it from the witness or scriptSig. + +A Reconstructable Script is a tagged union that lets nodes recreate the script without necessarily providing redundant information. +If we can reconstruct the committed hash from the transaction data, we only indicate which script type to expect. +Only if the actual script cannot be reconstructed from transaction data, as in the case of taproot outputs, do we send the actual script.
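
As an illustration of the idea, the following Python sketch serializes a Reconstructable Script and rebuilds one script type from spending data. It is a sketch under stated assumptions: the function names are hypothetical, the tag values are the ones listed in the tag table of this section, and the length is written as a single byte, so scripts of 253 bytes or more are rejected instead of being varint-encoded. Only the `WitnessV0ScriptHash` case is shown, using the BIP-0141 rule that such a scriptPubkey is `OP_0` followed by a 32-byte push of the SHA-256 of the witness script.

```python
import hashlib

TAG_OTHER = 0x00                    # full script is sent
TAG_WITNESS_V0_SCRIPT_HASH = 0x04   # script is rebuilt from the witness


def serialize_reconstructable(tag: int, script: bytes = b"") -> bytes:
    """Encode the tagged union; the script is included only for tag 0x00."""
    if tag != TAG_OTHER:
        return bytes([tag])
    if len(script) >= 0xFD:
        # Simplification of this sketch: no multi-byte varint lengths.
        raise ValueError("sketch only handles single-byte lengths")
    return bytes([tag, len(script)]) + script


def rebuild_p2wsh_script_pubkey(witness_script: bytes) -> bytes:
    """OP_0 (0x00), push of 32 bytes (0x20), then SHA-256(witness script)."""
    return b"\x00\x20" + hashlib.sha256(witness_script).digest()
```

With this encoding a `WitnessV0ScriptHash` output costs one byte on the wire instead of the 34-byte scriptPubkey plus its length prefix, because the receiver recomputes the script from the witness of the spending input.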
+ +The serialization and tag values are given below: + +| Field | Type | Description | Required | +|---------|-------------------------|-------------------|--------------------------| +| tag | 1-byte unsigned integer | Script type | yes | +| length | varint | The script length | only if tag type is 0x00 | +| script | variable-length vector | The actual script | only if tag type is 0x00 | + +The possible values for the tag are: + +| Value | Script Type | +|-------|---------------------| +| 0x00 | Other | +| 0x01 | Pubkey Hash | +| 0x02 | WitnessV0PubkeyHash | +| 0x03 | ScriptHash | +| 0x04 | WitnessV0ScriptHash | + +#### TTL Info + +For all UTXOs that get added to the Utreexo merkle forest, a TTL info exists for it and includes information necessary for efficiently caching and requesting proofs. +The TTL value provides information to determine which leaves should be cached and the death position is used to calculate which positions in the merkle forest we need to prove a block. + +| Field | Type | Description | +|----------------|--------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| TTL | varint | The time-to-live value of a leaf in the Utreexo merkle forest. The value is determined by the amount of leaves that were added to the accumulator since its creation | +| death position | varint | The position in the Utreexo merkle forest when the leaf was removed | + +#### Utreexo TTL + +| Field | Type | Description | +|--------------|---------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| block height | uint32 | The time-to-live value of a leaf in the Utreexo merkle forest. 
The value is determined by the amount of leaves that were added to the accumulator since its creation | +| length | varint | The length of the TTLs | +| TTLs | vector of TTL infos | The TTL Info for the UTXOs that are added to the Utreexo merkle forest in blockchain ordering. See [BIP-0182](bip-0182.md#excluded-utxos-from-the-accumulator) for the UTXOs that are not added to the Utreexo merkle forest | + +### New Messages + +#### MSG_UTREEXO_PROOF + +`MSG_UTREEXO_PROOF` is all the data required for a CSN or archive node using the Utreexo accumulators to validate a Bitcoin block. + +Its `cmdString` for P2PV1 is `uproof`. +Its [BIP324 P2PV2](https://github.com/bitcoin/bips/blob/master/bip-0324.mediawiki#user-content-v2_Bitcoin_P2P_message_structure) message type is `29`. + +| Field | Type | Description | +|--------------------------------|------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------| +| blockhash | 32 byte vector | The hash of the block that this inclusion proof proves | +| length of the proof hashes | varint | The length of the proof hashes | +| proof hashes | vector of 32 byte vectors | The hashes requested by MSG_GET_UTREEXO_PROOF. MUST be in tree order | +| length of the target locations | varint | The length of the target locations | +| target locations | vector of varint values | The Utreexo merkle tree locations of the leafdatas. MUST be in blockchain order. MUST include all the locations or none of the locations | +| length of the leafdatas | varint | The length of the leafdatas | +| leafdatas | vector of compact leafdatas | The preimage of the committed UTXOs requested by the MSG_GET_UTREEXO_PROOF. MUST be in blockchain order. See compact leaf data for details | + +The proof hashes MUST be in merkle forest tree ordering. 
+See BIP [Utreexo Accumulator Specification](bip-0181.md#Merkle Forest) for an explanation of how each of the hashes in the merkle forest is positioned. + +Each target location represents the position of the leaf data at the same index. +While each leaf data represents a UTXO in a given block, not all are added as per [Utreexo - Validation Layer](bip-0182.md#Excluded UTXOs from the accumulator). + +#### MSG_GET_UTREEXO_PROOF + +`MSG_GET_UTREEXO_PROOF` is a message to request the inclusion proof for a given block. + +Its `cmdString` for P2PV1 is `getuproof`. +Its [BIP324 P2PV2](https://github.com/bitcoin/bips/blob/master/bip-0324.mediawiki#user-content-v2_Bitcoin_P2P_message_structure) message type is `30`. + +| Field | Type | Description | +|---------------------------|-----------------------------|--------------------------------------------------------------------| +| blockhash | 32 byte vector | The hash of the bitcoin block that we want the inclusion proof for | +| include all | boolean | A boolean value to request all parts of the inclusion proof | +| proof request bitmap | variable-length byte vector | A bitmap of the requested proof hashes | +| leaf data request bitmap | variable-length byte vector | A bitmap of the requested leafdatas | + +The bitmaps here are formatted as big-endian and padded to the nearest byte, with 1 meaning a request for the proof hash or the leaf data, and 0 meaning omit the proof hash or the leaf data. + +Since there's one corresponding leaf data per target location, it's trivial to generate a bitmap for the leafdatas. + +Using the [proof_positions](bip-0181.md#utility-functions) function, it's possible to generate the positions of the needed proof hashes for a given set of targets. +With these positions, we can set the bit in the bitmap for the hashes we require.
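
A request bitmap of this kind could be built as in the Python sketch below. The function name is hypothetical, and the bit layout is an assumption: "big-endian" is read here as item 0 mapping to the most significant bit of the first byte, with the unused low bits of the last byte left at 0 as padding. Implementations should confirm the layout against the reference implementations before relying on it.

```python
from typing import Iterable


def build_request_bitmap(total_items: int, wanted: Iterable[int]) -> bytes:
    """Return a bitmap with one bit per item, padded up to a whole byte.

    total_items: number of proof hashes (or leaf datas) the full proof holds.
    wanted: zero-based indexes of the items being requested (bit set to 1).
    """
    bitmap = bytearray((total_items + 7) // 8)
    for index in wanted:
        if not 0 <= index < total_items:
            raise ValueError(f"index {index} out of range")
        # Assumed layout: item 0 is the most significant bit of byte 0.
        bitmap[index // 8] |= 0x80 >> (index % 8)
    return bytes(bitmap)
```

For a proof with 10 hashes where only the first and last are missing locally, `build_request_bitmap(10, [0, 9])` yields the two bytes `0x80 0x40` under the layout assumed above.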
+ +#### MSG_UTREEXO_TTLS + +`MSG_UTREEXO_TTLS` is the requested group of Utreexo TTLs, together with the proof hashes needed to validate that the given TTLs were committed in the provided binary. + +Its `cmdString` for P2PV1 is `uttls`. +Its [BIP324 P2PV2](https://github.com/bitcoin/bips/blob/master/bip-0324.mediawiki#user-content-v2_Bitcoin_P2P_message_structure) message type is `31`. + +| Field | Type | Description | +|----------------------------|-------------------------------------|-----------------------------------------------| +| length of the Utreexo TTLs | varint | The length of the Utreexo TTLs | +| Utreexo TTLs | vector of Utreexo TTLs | The vector of the requested Utreexo TTLs | +| length of the proof hashes | varint | The length of the proof hashes | +| proof hashes | vector of 32 byte hashes | The vector of the requested proof hashes | + +#### MSG_GET_UTREEXO_TTLS + +`MSG_GET_UTREEXO_TTLS` is used to request a MSG_UTREEXO_TTLS message. + +Its `cmdString` for P2PV1 is `getuttls`. +Its [BIP324 P2PV2](https://github.com/bitcoin/bips/blob/master/bip-0324.mediawiki#user-content-v2_Bitcoin_P2P_message_structure) message type is `32`. + +| Field | Type | Description | +|----------------------|--------|----------------------------------------------------------------------------------------------------------------------| +| Version | uint32 | The height of the committed TTL accumulator. It's used to specify which accumulator the TTL should be proved against | +| Start height | uint32 | The first block for which TTLs will be provided | +| Max receive exponent | uint8 | Denotes how many TTLs should be provided in total. The provided TTL count will be $2^{\text{max receive exponent}}$ | + +#### MSG_UTREEXO_SUMMARY + +`MSG_UTREEXO_SUMMARY` is the data needed to calculate the missing merkle forest positions required to validate a given block. + +Its `cmdString` for P2PV1 is `usummary`.
+Its [BIP324 P2PV2](https://github.com/bitcoin/bips/blob/master/bip-0324.mediawiki#user-content-v2_Bitcoin_P2P_message_structure) message type is `33`. + +| Field | Type | Description | +|----------------------------|-------------------------|-----------------------------------------------------------------------------------------------------------------| +| blockhash | 32 byte vector | The hash of the block that this Utreexo block summary is for | +| num adds | varint | The count of leaves added to the accumulator in the block this Utreexo block summary is for | +| length of target locations | varint | The length of the target locations | +| target locations | vector of uint64 values | The Utreexo merkle tree locations of the leafdatas. MUST be in blockchain order. MUST include all the locations | + +#### MSG_UTREEXO_TX + +`MSG_UTREEXO_TX` is a regular (non-Utreexo) Bitcoin transaction with its inclusion proof appended. + +Its `cmdString` for P2PV1 is `utreexotx`. +Its [BIP324 P2PV2](https://github.com/bitcoin/bips/blob/master/bip-0324.mediawiki#user-content-v2_Bitcoin_P2P_message_structure) message type is `34`. + +| Field | Type | Description | +|----------------------------|------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| transaction | MSG_TX | The bitcoin transaction. Unconfirmed inputs are marked by shifting the outpoint index left by 1 bit and setting the LSB | +| length of the proof hashes | varint | The length of the proof hashes | +| proof hashes | vector of 32 byte hashes | The vector of the proof hashes for the inclusion proof | +| length of the leafdatas | varint | The length of the leafdatas | +| leafdatas | vector of compact leafdatas | The preimage of the leafdatas referenced in the bitcoin transaction. MUST be in the order of the referenced inputs.
Unconfirmed inputs do not have a corresponding leaf data. See compact leaf data for details | + +For each of the referenced inputs in the Bitcoin transaction, we shift the index of its outpoint left by one bit and set the lowest bit only if the referenced UTXO is unconfirmed: + +``` +index <<= 1 +if IsUnconfirmed { + index |= 1 // only set the bit if the UTXO referenced by the outpoint is unconfirmed +} +``` + +This step is required because if the unconfirmed UTXO is not explicitly marked, then a malicious peer can omit the leaf data for a confirmed UTXO and mislead us into believing that the transaction is an orphan. + +#### MSG_UTREEXO_ROOT + +`MSG_UTREEXO_ROOT` is the utreexo accumulator state at a given height, together with a proof against a utreexo accumulator of the utreexo roots. + +Its `cmdString` for P2PV1 is `uroot`. +Its [BIP324 P2PV2](https://github.com/bitcoin/bips/blob/master/bip-0324.mediawiki#user-content-v2_Bitcoin_P2P_message_structure) message type is `35`. + +| Field | Type | Description | +|----------------------------|------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| numleaves | varint | The number of leaves that were ever added to the accumulator at this block height. See [numleaves](bip-0181.md#Definitions) | +| target | varint | The position of the utreexo root in the optional accumulator of the utreexo roots | +| blockhash | 32 byte vector | The blockhash for this utreexo accumulator state | +| length of the root hashes | varint | The length of the root hashes | +| root hashes | vector of 32 byte hashes | The utreexo roots for the UTXO set at the blockhash.
See [roots](bip-0181.md#Definitions) | +| length of the proof hashes | varint | The length of the proof hashes | +| proof hashes | vector of 32 byte hashes | The proof hashes needed to validate with the pre-committed utreexo accumulator of the utreexo roots | + +This message is for implementing an out-of-order block validation node[^2] or softchains[^3]. + +Because the size of the state needed to validate blocks is so small with Utreexo, nodes can perform IBD in parallel and out of order. + +For example, a computer could divide the task of validating 800,000 blocks into 100 tasks of 8,000 blocks each: blocks 1 through 8,000, 8,001 through 16,000, 16,001 through 24,000, and so on. + +In order to start the 16,001 through 24,000 IBD task, however, the node needs to know the state of the utxo set at block 16,000, so that it can validate and modify the accumulator. + +In order to do this, the binary can provide "linkup hints", where the state of the accumulator is given for a desired block hash. + +While giving the state of the system might seem at first glance to be introducing a trust assumption, these are not trusted states. +The node performing IBD tentatively uses the state given for a block height, but when the thread "below" reaches that height, it checks that the accumulator state arrived at through full validation matches the state given. +If that linkup does not successfully happen, the IBD process should halt. + +These hints are statements of fact that are hard-coded into the program itself, and if they are false all bets are off about the program. + +Archive nodes create a forest of linkup hints, so that they can prove, with respect to the linkup forest roots in a node performing IBD, what their binary has claimed the utxo accumulator state to be at any block height. + +#### MSG_GET_UTREEXO_ROOT + +`MSG_GET_UTREEXO_ROOT` is used to request a utreexo accumulator state at a given height. + +Its `cmdString` for P2PV1 is `geturoot`.
+Its [BIP324 P2PV2](https://github.com/bitcoin/bips/blob/master/bip-0324.mediawiki#user-content-v2_Bitcoin_P2P_message_structure) message type is `36`. + +| Field | Type | Description | +|----------------------------|-------------------------|------------------------------------------------------------------------------------------------------------------| +| blockhash | 32 byte vector | The hash of the block that the requested utreexo root message is for | + +### New Inventory Types + +#### MSG_UTREEXO_PROOF_HASH + +Defined as `6`. + +- It's used in the `inv` and `getdata` messages to communicate positions in the Utreexo merkle forest. +- The communicated positions MUST be in order of the referenced UTXO in the inputs. +- Unconfirmed UTXOs for the given transaction will NOT have positions associated with them. +- MUST be appended to another `invvect` of type `MSG_TX`, `MSG_WITNESS_TX`, `MSG_UTREEXO_TX`, or `MSG_WITNESS_UTREEXO_TX`. +- Ignored if an `invvect` of type `MSG_UTREEXO_PROOF_HASH` is not preceded by any of the above 4 `invvect` types. + +#### MSG_UTREEXO_SUMMARY + +Defined as `7`. + +It's used in the `getdata` messages to communicate the block hash of the desired Utreexo summary. + +#### MSG_UTREEXO_FLAG + +Defined as `1 << 24`. + +It can be set with `MSG_TX` and `MSG_WITNESS_TX` to indicate in `getdata` messages that a Utreexo tx is desired. + +#### MSG_UTREEXO_TX + +Defined as `16777217` or `1 << 24 | 1`. + +Used to indicate in a `getdata` message that a Utreexo tx is desired. + +#### MSG_WITNESS_UTREEXO_TX + +Defined as `1090519041` or `1 << 30 | 1 << 24 | 1`. + +Used to indicate in a `getdata` message that a witness Utreexo tx is desired. + +### Commitment scheme for TTL messages + +We choose an arbitrary height `X` and go through each `TTL info` in all the `Utreexo TTL` values up until that height.
+ +If the TTL in the `TTL info` is greater than the [numleaves](bip-0181.md#Definitions) value of the Utreexo accumulator at the chosen height `X`, we reset the `death position` and the `TTL` values to their default of 0. +Then these `Utreexo TTL` values are hashed with the hash function SHA512/256[^4] and added in height order to the commitment Utreexo accumulator. + +Note that this commitment Utreexo accumulator is separate from the Utreexo accumulator being used to represent the UTXO set. + +The resulting [numleaves](bip-0181.md#Definitions) and [roots](bip-0181.md#Definitions) are committed into the distributed binary. Nodes that opt in can then use them to validate that the `Utreexo TTL` values received from peers were generated according to the commitment scheme described above. + +### Signaling + +This BIP allocates two new service bits: + +| Service bit | Value | Description | +|----------------------|----------------|-------------------------------------------------------------------------------------------------------------------------------------| +| NODE_UTREEXO | 1 << 12 | Nodes that signal this bit MUST be able to propagate inclusion proofs for new blocks and transactions and for their other advertised services. Nodes signaling NODE_UTREEXO and NODE_NETWORK_LIMITED MUST serve inclusion proofs for the last 288 blocks. Nodes signaling NODE_UTREEXO and NODE_NETWORK MUST serve inclusion proofs for all historical blocks. | +| NODE_UTREEXO_ARCHIVE | 1 << 13 | Nodes that signal this bit MUST be able to serve historical inclusion proofs for all blocks. These nodes do not have to serve historical blocks. | + +`NODE_UTREEXO` signals that the node understands Utreexo and will serve inclusion proofs for advertised txs and blocks. + +`NODE_UTREEXO_ARCHIVE` is specifically for nodes that only keep the historical inclusion proofs for all the blocks. +This bit is to allow for nodes that *only* serve the historical inclusion proofs.
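
The service-bit rules above can be sketched as a small lookup. This is an illustrative Python sketch only: the function name `proof_history_served` and its string return values are hypothetical, while the values of `NODE_NETWORK` (`1 << 0`) and `NODE_NETWORK_LIMITED` (`1 << 10`) are taken from the existing Bitcoin P2P protocol rather than from this BIP.

```python
NODE_NETWORK = 1 << 0            # existing bit: serves all historical blocks
NODE_NETWORK_LIMITED = 1 << 10   # existing bit (BIP159): serves the last 288 blocks
NODE_UTREEXO = 1 << 12           # allocated by this BIP
NODE_UTREEXO_ARCHIVE = 1 << 13   # allocated by this BIP


def proof_history_served(services: int) -> str:
    """Return how far back a peer must serve inclusion proofs.

    Hypothetical return values: "all" (every historical block),
    "last288" (latest 288 blocks), "new" (only txs and new blocks),
    or "none" (the peer does not signal Utreexo support).
    """
    if services & NODE_UTREEXO_ARCHIVE:
        return "all"
    if services & NODE_UTREEXO:
        # Check NODE_NETWORK first, since a full node may set both block bits.
        if services & NODE_NETWORK:
            return "all"
        if services & NODE_NETWORK_LIMITED:
            return "last288"
        return "new"
    return "none"


print(proof_history_served(NODE_NETWORK_LIMITED | NODE_UTREEXO))  # prints "last288"
```

A node looking for historical proofs during IBD could use a check like this to filter candidate peers before sending `MSG_GET_UTREEXO_PROOF` requests.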
+ +Example cases: + +| Used Service bits | Description | +|------------------------------------------------- |--------------------------------------------------------------------------------------------------------------------------------------| +| NODE_NETWORK, NODE_UTREEXO_ARCHIVE, NODE_UTREEXO | Historical blocks + inclusion proofs for historical blocks + inclusion proofs for txs and new blocks | +| NODE_NETWORK_LIMITED, NODE_UTREEXO | Latest 288 blocks + inclusion proofs for latest 288 blocks + inclusion proofs for txs and new blocks | +| NODE_UTREEXO | Inclusion proofs for txs and new blocks | +| NODE_UTREEXO_ARCHIVE | Inclusion proofs for historical blocks | + +## Rationale + +**Why is there a separate NODE_UTREEXO_ARCHIVE service bit from the NODE_UTREEXO service bit?** + +For archive nodes, we wanted the ability for a node to keep just the historical Utreexo proofs since the historical blocks can be served by any archival node. +In order to differentiate nodes that serve just the historical Utreexo proofs and not the blocks, we needed to create a separate service bit. + +**Why is there a need for the MSG_UTREEXO_PROOF message? Couldn't there be a MSG_UTREEXO_BLOCK message where the Utreexo proof data is included along with the Bitcoin block data?** + +It's for the same reason as why we have a separate NODE_UTREEXO_ARCHIVE service bit. + +We wanted to allow for a node that only keeps the historical Utreexo proofs. +If we only had a MSG_UTREEXO_BLOCK message, all Utreexo archive nodes would need to keep the block data as well. + +**Why is there a need for TTL messages?** + +We wanted there to be a caching method that has the same security as [assumed valid](https://bitcoincore.org/en/2017/03/08/release-0.14.0/#assumed-valid-blocks) that would help Utreexo nodes save bandwidth during the initial block download.
+ +The TTL data in the TTL message allows each individual Utreexo node to calculate which leaves to cache with the Clairvoyant algorithm[^1], allowing for optimal memory utilization. +This data could have been provided in the software distribution itself, but to save on binary size, we instead put the roots of the TTL accumulator in the binary and propagate the actual TTL data over the Bitcoin P2P network with the TTL message. +Since the TTL data is committed in the TTL accumulator, a Utreexo node can validate that the received TTL message is included in the TTL accumulator. + +**Why are the positions in the Utreexo merkle forest communicated via inventory vectors instead of a separate message?** + +We decided to communicate the positions in the Utreexo merkle forest by inventory vectors instead of a separate message to avoid an extra round trip during transaction propagation. + +As mentioned above in [Transaction Relay](#transaction-relay), non-Utreexo nodes propagate a transaction in these 3 steps: + + 1. Receive the inventory message for the transaction. + 2. Send a getdata message for the transaction. + 3. Receive the transaction. + +Utreexo nodes follow the same 3 steps because of the new MSG_UTREEXO_PROOF_HASH. +If we were to implement this with a separate message, we would add a round trip and the entire transaction propagation would look like these 5 steps: + + 1. Receive the inventory message for the transaction. + 2. Send a message to get the positions in the Utreexo merkle forest for the transaction. + 3. Receive the positions in the Utreexo merkle forest. + 4. Send a getdata message for the transaction. + 5. Receive the transaction. + +This would add delay to transaction propagation, so we decided on using extra inventory vectors to communicate the positions. + +**Why is there a need for the Utreexo root message?** + +The Utreexo root message serves two purposes: + + 1.
Provide the Utreexo root data to nodes doing out-of-order block validation. + 2. Provide the Utreexo root data to Softchain clients. + +Out-of-order block validation requires the Utreexo roots at the previous block in order to validate a given block. +While this data can be provided in the software itself, this would bloat the binary. +For this reason, we chose the Bitcoin peer-to-peer network as the method of providing this data to such nodes. + +For Softchain clients, it is vitally important to get the correct Utreexo roots for a given block in order for the fraud proof mechanism to function properly. +Given both use cases, we decided that the Bitcoin P2P network was the best method of distributing the Utreexo roots. + +## Backwards Compatibility + +This change introduces a new primitive that doesn't interact with existing protocols. + +## Acknowledgements + +The original idea for the Reconstructable Script was detailed in [Cory Fields' UHS](https://gnusha.org/pi/bitcoindev/CAApLimjfPKDxmiy_SHjuOKbfm6HumFPjc9EFKvw=3NwZO8JcmQ@mail.gmail.com/) under the section "TxIn De-duplication".
+ +## References + +[^1]: https://en.wikipedia.org/wiki/Page_replacement_algorithm#The_theoretically_optimal_page_replacement_algorithm +[^2]: https://blog.bitmex.com/out-of-order-block-validation-with-utreexo-accumulators/ +[^3]: https://gist.github.com/RubenSomsen/7ecf7f13dc2496aa7eed8815a02f13d1#softchains-sidechains-as-a-soft-fork-via-proof-of-work-fraud-proofs +[^4]: https://eprint.iacr.org/2010/548.pdf diff --git a/bip-0183/bandwidth-efficient-utreexo-initial-block-download.png b/bip-0183/bandwidth-efficient-utreexo-initial-block-download.png new file mode 100644 index 0000000000..3893c35b7a Binary files /dev/null and b/bip-0183/bandwidth-efficient-utreexo-initial-block-download.png differ diff --git a/bip-0183/bandwidth-saving-non-compact-block-utreexo-block-propagation.png b/bip-0183/bandwidth-saving-non-compact-block-utreexo-block-propagation.png new file mode 100644 index 0000000000..5288b7c68e Binary files /dev/null and b/bip-0183/bandwidth-saving-non-compact-block-utreexo-block-propagation.png differ diff --git a/bip-0183/non-compact-block-block-propagation.png b/bip-0183/non-compact-block-block-propagation.png new file mode 100644 index 0000000000..6d399ddf75 Binary files /dev/null and b/bip-0183/non-compact-block-block-propagation.png differ diff --git a/bip-0183/non-compact-block-utreexo-block-propagation.png b/bip-0183/non-compact-block-utreexo-block-propagation.png new file mode 100644 index 0000000000..69de7614eb Binary files /dev/null and b/bip-0183/non-compact-block-utreexo-block-propagation.png differ diff --git a/bip-0183/non-utreexo-initial-block-download.png b/bip-0183/non-utreexo-initial-block-download.png new file mode 100644 index 0000000000..ba29ac7b70 Binary files /dev/null and b/bip-0183/non-utreexo-initial-block-download.png differ diff --git a/bip-0183/non-utreexo-tx-relay.png b/bip-0183/non-utreexo-tx-relay.png new file mode 100644 index 0000000000..d981f67a7d Binary files /dev/null and b/bip-0183/non-utreexo-tx-relay.png differ 
diff --git a/bip-0183/utreexo-initial-block-download.png b/bip-0183/utreexo-initial-block-download.png new file mode 100644 index 0000000000..982c75b5bf Binary files /dev/null and b/bip-0183/utreexo-initial-block-download.png differ diff --git a/bip-0183/utreexo-tx-relay-with-multiple-proofhash-inventory-vectors.png b/bip-0183/utreexo-tx-relay-with-multiple-proofhash-inventory-vectors.png new file mode 100644 index 0000000000..76380d7486 Binary files /dev/null and b/bip-0183/utreexo-tx-relay-with-multiple-proofhash-inventory-vectors.png differ diff --git a/bip-0183/utreexo-tx-relay-with-multiple-txs.png b/bip-0183/utreexo-tx-relay-with-multiple-txs.png new file mode 100644 index 0000000000..cba838da2c Binary files /dev/null and b/bip-0183/utreexo-tx-relay-with-multiple-txs.png differ diff --git a/bip-0183/utreexo-tx-relay.png b/bip-0183/utreexo-tx-relay.png new file mode 100644 index 0000000000..4158c7d26f Binary files /dev/null and b/bip-0183/utreexo-tx-relay.png differ