De-Satoshize the buildMerkleTree function:

- Clarify the terminology in the existing explanation.
- Add an explanation of what the point of the structure is.
- Note how non-power-of-two transaction list sizes are handled.
- Rename variables to be more helpful than i,i2,j etc.
- Add a more detailed explanation of each step of the algorithm.
Mike Hearn
2011-06-24 16:18:06 +00:00
parent 65bb4a20f8
commit 66e596a8eb

@@ -281,36 +281,56 @@ public class Block extends Message {
}
private List<byte[]> buildMerkleTree() {
// The merkle hash is based on a tree of hashes calculated from the transactions:
// The Merkle root is based on a tree of hashes calculated from the transactions:
//
// merkleHash
// /\
// / \
// root
// / \
// / \
// A B
// / \ / \
// tx1 tx2 tx3 tx4
// t1 t2 t3 t4
//
// Basically transactions are hashed, then the hashes of the transactions are hashed
// again and so on upwards into the tree. The point of this scheme is to allow for
// disk space savings later on.
// The tree is represented as a list: t1,t2,t3,t4,A,B,root where each entry is a hash.
//
// This function is a direct translation of CBlock::BuildMerkleTree().
// The hashing algorithm is double SHA-256. Each leaf is the hash of the serialized contents of a
// transaction. Each interior node is the hash of the concatenation of its two child hashes.
//
// This structure allows the creation of a proof that a transaction was included in a block without having to
// provide the full block contents. Instead, you can provide only a Merkle branch. For example, to prove tx2 was
// in a block you can just provide tx2, hash(tx1) and B. Now the other party has everything they need to
// derive the root, which can be checked against the block header. These proofs aren't used right now but
// will be helpful later when we want to download partial block contents.
//
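// Note that if a level has an odd number of nodes, the last node on that level is paired with itself, so the
// number of transactions need not be a power of two. For 5 transactions this gives the same root as
// repeating t5 to fill out a full tree, which would look like this: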
//
//          root
//          /   \
//        /       \
//      1           6
//    /   \       /   \
//   2     3     4     5
//  / \   / \   / \   / \
// t1 t2 t3 t4 t5 t5 t5 t5
ArrayList<byte[]> tree = new ArrayList<byte[]>();
// Start by adding all the hashes of the transactions as leaves of the tree.
for (Transaction t : transactions) {
tree.add(t.getHash().hash);
}
int j = 0;
// Now step through each level ...
for (int size = transactions.size(); size > 1; size = (size + 1) / 2) {
// and for each leaf on that level ..
for (int i = 0; i < size; i += 2) {
int i2 = Math.min(i + 1, size - 1);
byte[] a = Utils.reverseBytes(tree.get(j + i));
byte[] b = Utils.reverseBytes(tree.get(j + i2));
tree.add(Utils.reverseBytes(doubleDigestTwoBuffers(a, 0, 32, b, 0, 32)));
int levelOffset = 0; // Offset in the list where the currently processed level starts.
// Step through each level, stopping when we reach the root (levelSize == 1).
for (int levelSize = transactions.size(); levelSize > 1; levelSize = (levelSize + 1) / 2) {
// For each pair of nodes on that level:
for (int left = 0; left < levelSize; left += 2) {
// The right hand node can be the same as the left hand, in the case where the current level has an odd
// number of nodes and the last node has no partner.
int right = Math.min(left + 1, levelSize - 1);
byte[] leftBytes = Utils.reverseBytes(tree.get(levelOffset + left));
byte[] rightBytes = Utils.reverseBytes(tree.get(levelOffset + right));
tree.add(Utils.reverseBytes(doubleDigestTwoBuffers(leftBytes, 0, 32, rightBytes, 0, 32)));
}
j += size;
// Move to the next level.
levelOffset += levelSize;
}
return tree;
}
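
The Merkle branch proof described in the new comment can be illustrated with a small standalone sketch. This is not bitcoinj code: the class MerkleSketch and the helpers hashPair, buildTree, branch and rootFromBranch are hypothetical names chosen for illustration, the leaves are dummy 32-byte values rather than real transaction hashes, and the Utils.reverseBytes calls from the diff are left out on the assumption that byte order does not matter for demonstrating the idea.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MerkleSketch {
    // Double SHA-256 of left || right (hypothetical helper, no byte reversal).
    static byte[] hashPair(byte[] left, byte[] right) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            sha256.update(left);
            sha256.update(right);
            byte[] firstPass = sha256.digest(); // digest() also resets the instance
            return sha256.digest(firstPass);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Same level-by-level loop as in the diff: leaves first, root last.
    static List<byte[]> buildTree(List<byte[]> leafHashes) {
        List<byte[]> tree = new ArrayList<>(leafHashes);
        int levelOffset = 0;
        for (int levelSize = leafHashes.size(); levelSize > 1; levelSize = (levelSize + 1) / 2) {
            for (int left = 0; left < levelSize; left += 2) {
                int right = Math.min(left + 1, levelSize - 1); // last node pairs with itself
                tree.add(hashPair(tree.get(levelOffset + left), tree.get(levelOffset + right)));
            }
            levelOffset += levelSize;
        }
        return tree;
    }

    // Collects the sibling hash needed at each level to climb from a leaf to the root.
    static List<byte[]> branch(List<byte[]> tree, int leafCount, int index) {
        List<byte[]> siblings = new ArrayList<>();
        int levelOffset = 0;
        for (int levelSize = leafCount; levelSize > 1; levelSize = (levelSize + 1) / 2) {
            int sibling = Math.min(index ^ 1, levelSize - 1);
            siblings.add(tree.get(levelOffset + sibling));
            index /= 2;
            levelOffset += levelSize;
        }
        return siblings;
    }

    // Recomputes the root from one leaf hash, its position, and its branch.
    static byte[] rootFromBranch(byte[] leafHash, int index, List<byte[]> siblings) {
        byte[] current = leafHash;
        for (byte[] sibling : siblings) {
            current = (index % 2 == 0) ? hashPair(current, sibling) : hashPair(sibling, current);
            index /= 2;
        }
        return current;
    }

    public static void main(String[] args) {
        List<byte[]> leaves = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            byte[] dummy = new byte[32]; // stand-in for a transaction hash
            Arrays.fill(dummy, (byte) i);
            leaves.add(dummy);
        }
        List<byte[]> tree = buildTree(leaves);
        byte[] root = tree.get(tree.size() - 1);
        for (int i = 0; i < leaves.size(); i++) {
            byte[] derived = rootFromBranch(leaves.get(i), i, branch(tree, leaves.size(), i));
            System.out.println("leaf " + i + " verifies: " + Arrays.equals(derived, root));
        }
    }
}
```

With five leaves the list holds 5 + 3 + 2 + 1 = 11 hashes, and each branch contains three siblings, one per level below the root. A verifier only needs the leaf, its index, and those siblings, which is the saving the comment refers to.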