Blockchain Byte - Week 25 Verification & Addition of Blocks
Recap
During Week 24, we explained the Merkle Tree and the Merkle Root, along with their application in a blockchain. To summarize:
- A Merkle Root is created by repeatedly hashing pairs of nodes in a Merkle Tree until a single hash, the root, remains.
- The Merkle Root is thus representative of all the data in that Merkle Tree.
- Merkle Trees are used to summarize all the transactions in a block & provide a very efficient way to check whether a transaction is included in a block.
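As a quick refresher, the pairwise hashing summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `sha256d` and `merkle_root` are made up for this example, double SHA-256 is used because that is what Bitcoin uses, and an odd node at any level is paired with a copy of itself.

```python
import hashlib


def sha256d(data):
    """Double SHA-256 (the hash Bitcoin uses; other chains may differ)."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(transactions):
    """Hash pairs of nodes, level by level, until a single root remains."""
    if not transactions:
        raise ValueError("a Merkle Tree needs at least one transaction")
    level = [sha256d(tx) for tx in transactions]  # leaf hashes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: pair an odd node with itself
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


txs = [b"A pays B 100", b"B pays C 50", b"C pays D 10"]
root = merkle_root(txs)

# Changing any single transaction produces a different root.
tampered_root = merkle_root([b"A pays B 999", b"B pays C 50", b"C pays D 10"])
print(root.hex())
print(root != tampered_root)
```

Because the root depends on every leaf, it works as a compact fingerprint of all the transactions in the block.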
This week let us understand Block Verification / Validation & addition of these blocks to be part of the Blockchain.
Block Validation
Once the blocks are created and the miners solve the proof-of-work problem, the next step is the independent validation of each new block by every node in the network. Each node performs a series of tests on a newly created block to validate it before propagating it to its peers.
In normal books of account, a transaction has to be authorized to be part of the ledger. This is the crypto equivalent of an authorization. Here, we double-check the transaction parameters (like amount, beneficiary, transferor details etc.) to see if everything is right before we commit the transaction to the system.
Similarly, when a node receives a new block, it validates the block by checking it against certain criteria (valid block data structure, acceptable block size etc) which have to be met for the validation checks to be completed; otherwise the block is rejected & will not form part of the decentralized ledger.
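To make the idea of "checking against certain criteria" concrete, here is a minimal sketch of such a checklist. Everything in it is hypothetical: the `Block` class, the `validate_block` and `mine` functions, the 1,000,000-byte `MAX_BLOCK_SIZE` and the "hash must start with 00" rule (a toy stand-in for a real proof-of-work target) are assumptions for illustration, not the actual rules of any particular blockchain.

```python
import hashlib
from dataclasses import dataclass

MAX_BLOCK_SIZE = 1_000_000  # hypothetical size limit, in bytes


@dataclass
class Block:
    previous_hash: str
    transactions: list
    nonce: int = 0

    def serialize(self):
        return f"{self.previous_hash}|{'|'.join(self.transactions)}|{self.nonce}".encode()

    def block_hash(self):
        return hashlib.sha256(self.serialize()).hexdigest()


def validate_block(block, target_prefix="00"):
    """Return (is_valid, reason). Every check must pass, else the block is rejected."""
    if not isinstance(block.previous_hash, str) or len(block.previous_hash) != 64:
        return False, "invalid block data structure"
    if not block.transactions:
        return False, "block has no transactions"
    if len(block.serialize()) > MAX_BLOCK_SIZE:
        return False, "block size not acceptable"
    if not block.block_hash().startswith(target_prefix):
        return False, "proof-of-work not satisfied"
    return True, "ok"


def mine(block, target_prefix="00"):
    """Toy proof-of-work: bump the nonce until the hash has the required prefix."""
    while not block.block_hash().startswith(target_prefix):
        block.nonce += 1
    return block


candidate = mine(Block(previous_hash="0" * 64, transactions=["A pays B 100"]))
print(validate_block(candidate))                       # (True, 'ok')
print(validate_block(Block("bad", ["A pays B 100"])))  # (False, 'invalid block data structure')
```

Each node would run a checklist like this on its own, so a block that fails any single check is dropped without being passed on to peers.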
We also discussed consensus in Week 9 & Week 10. Through consensus protocols, each node validates the block & agrees to it being added to the blockchain as part of the decentralized ledger.
What happens if a miner adds a fraudulent or invalid transaction(s) to a block?
The independent validation of each block by every node ensures that the miners cannot cheat. If they do cheat, their blocks will be rejected & they will lose their reward (e.g., the newly minted bitcoin earned through the coinbase transaction - refer to Week 14 & Week 15 for more on incentives) & all the electricity & processing power expended to solve the proof-of-work will be wasted.
Block Addition
Once the nodes validate a block, the block has to be added to the blockchain. We know that a blockchain is nothing but a series of blocks "chained" to each other. So, all that the nodes need to do is find the last block in the blockchain and then chain this new block to it... right?
Well, that is the gist of how a block is added but there are some more nuances to it as we will see below.
When a node receives a new block, it will, after validation, try to connect the block to the existing blockchain. This is done by referring to the PREVIOUS BLOCK HASH in the new block's header, which is a reference to that block's parent.
Then the node will attempt to find that parent in the existing blockchain. The parent block will most of the time be the last block in the blockchain which means the new block will extend the blockchain.
There could be scenarios where a valid block is received but no parent is found in the blockchain. In that case, the new block is called an ORPHAN block. Such blocks are saved in a special pool called the orphan pool, where they stay till their parent blocks are found.
Once the parent is found, the orphan block is pulled from the orphan pool and linked to the parent, and hence becomes part of the blockchain.
So, why should the parent block be missing and be found after the orphan block?
Let us assume two transactions occurring one after the other:
- A transfers funds (say, 100 BTC) to B &
- B transfers funds (say, 50 BTC) to C.
So the order of the transactions should (in accounting terms) be :
Transaction 1
DEBIT A's wallet BTC 100
CREDIT B's wallet BTC 100
Transaction 2
DEBIT B's wallet BTC 50
CREDIT C's wallet BTC 50
Now, assume Transaction 1 & Transaction 2 are mined in two separate blocks - Block 1 & Block 2 respectively, where Block 2 is the child of Block 1. Assume Block 1 is mined first and Block 2 gets mined immediately after.
Because blocks travel across the network along different paths, some nodes may receive the blocks in reverse order, which means those nodes receive Block 2 first and then Block 1.
Thus Block 2 is placed in the orphan pool till Block 1 arrives, and then both the blocks are linked to the blockchain.
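The Block 1 / Block 2 scenario above can be sketched as a small simulation. This is a simplified illustration, not how any real node software is written: the `Block` and `Node` classes and the `receive` method are hypothetical names, blocks are assumed to be already validated, and forks (two blocks sharing the same parent) are ignored.

```python
import hashlib


class Block:
    def __init__(self, previous_hash, data):
        self.previous_hash = previous_hash  # reference to the parent block
        self.data = data
        self.hash = hashlib.sha256(f"{previous_hash}|{data}".encode()).hexdigest()


class Node:
    def __init__(self, genesis):
        self.chain = {genesis.hash: genesis}  # linked blocks, keyed by hash
        self.tip = genesis.hash               # hash of the last block
        self.orphan_pool = {}                 # parent hash -> blocks waiting for it

    def receive(self, block):
        """Link an already validated block, or park it in the orphan pool."""
        if block.previous_hash not in self.chain:
            self.orphan_pool.setdefault(block.previous_hash, []).append(block)
            return "orphan"
        self._link(block)
        return "linked"

    def _link(self, block):
        self.chain[block.hash] = block
        if block.previous_hash == self.tip:
            self.tip = block.hash  # the new block extends the blockchain
        # Orphans that were waiting for this block can now be linked as well.
        for child in self.orphan_pool.pop(block.hash, []):
            self._link(child)


genesis = Block("0" * 64, "genesis")
block_1 = Block(genesis.hash, "A pays B 100")
block_2 = Block(block_1.hash, "B pays C 50")

node = Node(genesis)
print(node.receive(block_2))        # orphan
print(node.receive(block_1))        # linked
print(node.tip == block_2.hash)     # True
```

Note that the arrival of Block 1 is what triggers Block 2 being pulled out of the orphan pool, so the node ends up with the same chain it would have built had the blocks arrived in order.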
This is how, at a high level, blocks are validated and added to a blockchain. Once a block is added, it becomes embedded into the blockchain & is immutable. What does immutable mean?
Immutable means: not capable of or susceptible to change
(Source: Merriam-Webster)
Thus, once a transaction is part of a blockchain, it is practically impossible to alter that transaction subsequently. To be more accurate, it is not impossible but extremely difficult to do so. We will get into that later, but for now let us understand that transactions in a blockchain are immutable & as more blocks are added, it becomes more and more difficult to alter a transaction. Can you guess why that could be?
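Changing a transaction in a block changes its block header, which contains a hash of all the transactions in that block (the Merkle Root).
The hash of that header is what links the block to the next block, which stores it as its PREVIOUS BLOCK HASH, and so on. Hence, the altered block no longer matches the reference held by the next block, and the headers of all subsequent blocks would have to change too. All those blocks are invalidated & the more blocks that have been added on top, the more blocks would be invalidated.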
The below visual makes it more clear
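The same cascade can be sketched in code. This is a toy model under stated assumptions: the header is reduced to just the previous block hash plus a hash of the transactions, there is no proof-of-work, and the helper names `header_hash`, `build_chain` and `blocks_invalidated` are invented for this example.

```python
import hashlib

GENESIS_PARENT = "0" * 64  # placeholder parent hash for the first block


def header_hash(previous_hash, transactions):
    """Hash a simplified header: parent hash plus a hash of the transactions."""
    tx_digest = hashlib.sha256("|".join(transactions).encode()).hexdigest()
    return hashlib.sha256(f"{previous_hash}|{tx_digest}".encode()).hexdigest()


def build_chain(blocks_of_transactions):
    chain, previous_hash = [], GENESIS_PARENT
    for transactions in blocks_of_transactions:
        block = {"previous_hash": previous_hash, "transactions": list(transactions)}
        block["hash"] = header_hash(previous_hash, block["transactions"])
        chain.append(block)
        previous_hash = block["hash"]
    return chain


def blocks_invalidated(chain):
    """Count the blocks from the first broken block up to the end of the chain."""
    expected_previous = GENESIS_PARENT
    for index, block in enumerate(chain):
        recomputed = header_hash(block["previous_hash"], block["transactions"])
        if block["previous_hash"] != expected_previous or block["hash"] != recomputed:
            # This block and everything built on top of it is no longer valid.
            return len(chain) - index
        expected_previous = block["hash"]
    return 0


chain = build_chain([["A pays B 100"], ["B pays C 50"], ["C pays D 10"], ["D pays E 5"]])
print(blocks_invalidated(chain))  # 0

chain[1]["transactions"][0] = "B pays C 5000"  # tamper with the second block
print(blocks_invalidated(chain))  # 3
```

Tampering with an older block breaks more of the chain than tampering with a recent one, which is why a transaction becomes harder to alter as more blocks are added after it.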
So, is it all hunky-dory in the crypto world? Can I close my bank accounts and start using my wallet and cryptocurrencies for day-to-day activities? Well, not quite yet.
Blockchain is a FOUNDATIONAL TECHNOLOGY. It has the potential to create new foundations for our economic & social systems. But it could take years, maybe decades for the technology to be mainstream.
Back to the present -
Do all nodes "see" the same blocks or transactions for verification at the same time?
Are the nodes perfectly in sync with each other at all times?
What will happen if there is a network breakdown for some time and the node has to "catch up" with the ledger updates? Or, if it is a new node, how does its copy of the ledger "catch up" with the decentralized one?
How do the nodes reach a consensus in an open network without relying on each other?
To answer the above & more, we will deep dive into Byzantine Fault Tolerance (BFT) & Nakamoto Consensus next week.