Blockchain Byte - Week 26 Node Synchronization

Photo by Shubham Dhage on Unsplash


Recap

During Week 25, we explained how:

a. Blocks are validated &

b. Blocks are added to the decentralized ledger

We also mentioned that Blockchain is a FOUNDATIONAL TECHNOLOGY which has the potential to create new ways of processing transactions & storing data.

The current processes for initiating, authorizing and recording transactions in traditional ledgers are also implemented in a blockchain but done by nodes & protocols (code) without manual intervention.

Security is also ensured through the use of cryptography & mathematics. The concept of a truly digital asset with transferable value is so powerful that it has already opened up paradigms which never existed before, such as Non-Fungible Tokens (NFTs) and Decentralized Finance (DeFi), and even central banks are experimenting with Central Bank Digital Currencies (CBDCs). Innovations are ongoing in the crypto space to improve efficiency, scalability & interoperability.

  • Efficiency - Processing transactions with least friction
  • Scalability - Increase in transactions processed per unit of time
  • Interoperability - between different blockchains

But for now, as with any new technology, blockchain requires new paradigms & a shift in mindset, so the concepts need to be clear before it is ready for mass adoption.

We ended Week 25 with the question - How do nodes reach a consensus in an open network without relying on each other?

Before we explore Nakamoto Consensus & Byzantine Fault Tolerance (BFT), let us first understand Peer-to-Peer & Synchronization.

Peer-to-Peer

We know that a blockchain is structured so that all the connected computers (or nodes) forming part of the network are peers to each other (hence P2P, or Peer-to-Peer, technology). There is no centralized server or company which manages the network. We also know that nodes (full nodes) maintain their own copy of the decentralized ledger. So let us understand:

  1. What happens when a new node joins a blockchain network? From where does it get its copy of the decentralized ledger?
  2. What happens when a node goes offline for a couple of hours, days or months?

First things first, let us understand what synchronization is.

Synchronization

As per thefreedictionary.com, the definitions (among others) include:

  1. To transfer data between (two devices) to ensure that the same data is stored on both
  2. To execute such a transfer to cause the content of the devices (files or folders or other data sets) to be identical
a white corded phone next to a laptop
Photo by Thorsten Konersmann / Unsplash

So, we understand from the above that

  1. Two or more devices must store the same data to be in sync with each other and
  2. The process of ensuring that the data on these devices remains the same at all times is called synchronization.

When we store a document or file in Google Docs, Dropbox or Apple iCloud, we can access the same document or file from any device (phone, tablet or laptop). If we update it from one device, we see the updated version on our other devices, because the application syncs all of them with the latest copy.
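The idea of bringing one device in line with another can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each device is modelled as a dictionary of file names to versions, one device is treated as the up-to-date source, and the function name `synchronize` and the sample file names are hypothetical. Real sync services also have to handle conflicts and edits on both sides, which this sketch ignores.

```python
def synchronize(source: dict, target: dict) -> None:
    """Make `target` hold exactly the same data as `source` (one-way sync)."""
    # Copy over anything that is new or different on the source device.
    for name, version in source.items():
        if name not in target or target[name] != version:
            target[name] = version
    # Remove anything the source device no longer has.
    for name in list(target):
        if name not in source:
            del target[name]


# Hypothetical sample data: the phone holds the latest copy.
phone = {"notes.txt": "v2", "photo.jpg": "v1"}
laptop = {"notes.txt": "v1", "old.doc": "v1"}

synchronize(phone, laptop)
print(laptop == phone)  # both devices now store the same data
```

After the call, both dictionaries hold the same entries, which is the condition described in point 1 above.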

Let us take another example: a bank. At a very high level, all the applications used by the bank & its customer data are "stored" in the production servers (application servers, database servers etc.). If a customer wants to check their balance, they have to access the server after proper authentication. What if this server goes offline for any reason? What do you think will happen?

The bank's applications will shut down and not function as long as its servers are down. Customers will not be able to access their balances or perform transactions either online or offline. This creates a huge reputational risk for the bank.

Under server management, this risk can be mitigated by setting up backup servers which take over the "responsibility" of running applications & accessing databases until the production server is fixed.

Server Mgmt-2.jpg

Now, to ensure continuity & consistency in the functioning of the bank, all the data in the backup server should be the same as that in the production server. For example, if a customer has a balance of USD 100 in their account on the production server, the backup server cannot show a different balance (like USD 200 or USD 10) once it takes over. It has to show the exact same balance as the production server.
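A minimal sketch of this production/backup arrangement follows. It assumes the simplest possible scheme, in which every write is applied to both servers at once; the `Server` and `Bank` classes and the customer name are hypothetical, and real banking systems use far more elaborate replication than this.

```python
class Server:
    def __init__(self, name):
        self.name = name
        self.balances = {}   # customer -> balance in USD
        self.online = True


class Bank:
    def __init__(self):
        self.production = Server("production")
        self.backup = Server("backup")

    def set_balance(self, customer, amount):
        # Apply every write to both servers so they hold identical data.
        for server in (self.production, self.backup):
            server.balances[customer] = amount

    def get_balance(self, customer):
        # Read from production when it is up, otherwise fail over to backup.
        server = self.production if self.production.online else self.backup
        return server.balances[customer]


bank = Bank()
bank.set_balance("alice", 100)

before = bank.get_balance("alice")   # served by the production server
bank.production.online = False       # production goes down
after = bank.get_balance("alice")    # served by the backup server

print(before, after)
```

Because both servers received the same write, the customer sees the same balance before and after the failover.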

Also, large companies like Alphabet, Meta or Netflix maintain hundreds of servers, all synchronized and efficiently utilized through load balancers, which distribute customer requests across servers according to server load so that no single server is overloaded at any point in time.

All these servers need to be in sync with each other and they should reflect the same "state" meaning they should all reflect the same balances & transactions at any point in time.

The visualization below makes this clearer:

Server Mgmt.jpg

So, the servers are always synchronized with each other, ensuring that everything inside them (applications & databases) is in sync and reflects the same state at all points in time. This is a continuous and ongoing process.
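The load-balancing idea mentioned above can be illustrated with a small sketch. It assumes a "least loaded server first" rule, which is only one of several strategies real load balancers use; the `Server` class, the `pick_server` function and the server names are hypothetical.

```python
class Server:
    def __init__(self, name):
        self.name = name
        self.active_requests = 0


def pick_server(servers):
    # Choose the server with the fewest active requests.
    # When several servers are tied, min() returns the first one in the list.
    return min(servers, key=lambda s: s.active_requests)


servers = [Server("s1"), Server("s2"), Server("s3")]
assignments = []

for request_id in range(6):
    server = pick_server(servers)
    server.active_requests += 1
    assignments.append(server.name)

print(assignments)
print([s.active_requests for s in servers])
```

In this run the six requests end up spread evenly, two per server, so no single server carries more load than the others.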

Let us now go further and see how this is implemented in a blockchain.

Node Synchronization in a Blockchain

New (Full) Node joining the network

  1. Once a new full node joins the network, it first connects to other nodes in the network. The nodes exchange messages (called inventory messages) that identify the blocks the new node needs in order to "catch up" with the other full nodes. A new node always starts with the genesis block (Block #0), which is part of the Bitcoin software.
  2. To catch up with the other nodes, the new node has to download all the remaining blocks so that it is synchronized with the other full nodes in the blockchain.

The visualizations below make this clearer:

Slide1.JPG
Slide2.JPG
Slide3.JPG
Slide4.JPG
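The catch-up process for a new node can also be sketched in code. This is a toy model, not Bitcoin's actual protocol or block format: blocks are plain dictionaries, a peer is just another Python object, and every name here (`make_block`, `Node`, `sync_from` and so on) is hypothetical. The one detail taken from the text is that each node starts with the genesis block built in.

```python
import hashlib


def block_hash(height, prev_hash, data):
    # Toy hash over the block's fields (not Bitcoin's real block format).
    return hashlib.sha256(f"{height}|{prev_hash}|{data}".encode()).hexdigest()


def make_block(height, prev_hash, data):
    return {"height": height, "prev_hash": prev_hash, "data": data,
            "hash": block_hash(height, prev_hash, data)}


# Block #0 ships with the software, so every node already has it.
GENESIS = make_block(0, "0" * 64, "genesis")


class Node:
    def __init__(self):
        self.chain = [GENESIS]

    def height(self):
        return self.chain[-1]["height"]

    def blocks_after(self, height):
        # Blocks this node holds above the given height.
        return self.chain[height + 1:]

    def accept(self, block):
        # Only accept a block that correctly extends our current tip.
        tip = self.chain[-1]
        valid = (block["height"] == tip["height"] + 1
                 and block["prev_hash"] == tip["hash"]
                 and block["hash"] == block_hash(
                     block["height"], block["prev_hash"], block["data"]))
        if valid:
            self.chain.append(block)
        return valid

    def sync_from(self, peer):
        # Work out how far behind we are, then download and check each block.
        missing = max(peer.height() - self.height(), 0)
        for block in peer.blocks_after(self.height()):
            if not self.accept(block):
                break
        return missing


# An existing peer that already holds blocks 0 to 5.
peer = Node()
for i in range(1, 6):
    peer.chain.append(make_block(i, peer.chain[-1]["hash"], f"txs-{i}"))

new_node = Node()                       # starts with the genesis block only
downloaded = new_node.sync_from(peer)
print(downloaded, new_node.height())
```

Note that the new node checks each downloaded block against its own tip instead of simply copying the peer's list, which reflects the idea that nodes do not have to trust each other.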

Nodes going offline

In a public blockchain, anyone can join as a node and leave at any point in time. (There are different types of blockchains depending on the access rights given to nodes, which we will get into later!) It is also possible that, due to a network issue, some nodes go offline for some time.

The same principle applies here as for new nodes. A node which goes offline "catches up" once it comes back online by downloading the missing blocks from the other nodes. This process of comparing its blocks with those of peer nodes & adding the missing ones happens whenever a node comes back online.
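The comparison step for a returning node can be illustrated as follows. This sketch assumes the node's chain is simply a shorter prefix of the network's chain (no forks), and represents blocks as plain labels; the function `missing_blocks` and the block names are hypothetical.

```python
def missing_blocks(local_chain, peer_chain):
    """Return the blocks the peer has beyond our local tip."""
    # This sketch assumes both chains agree on every block we already hold.
    if peer_chain[:len(local_chain)] != local_chain:
        raise ValueError("chains diverge; this sketch does not handle forks")
    return peer_chain[len(local_chain):]


# The network and our node both hold blocks 0 to 3.
network_chain = [f"block-{i}" for i in range(4)]
my_chain = list(network_chain)

# Our node goes offline; meanwhile the network adds blocks 4 to 7.
network_chain += [f"block-{i}" for i in range(4, 8)]

# Back online: compare with a peer and download only what is missing.
catch_up = missing_blocks(my_chain, network_chain)
my_chain.extend(catch_up)

print(catch_up)
print(my_chain == network_chain)
```

Only the blocks produced while the node was away are transferred; the blocks it already held are not downloaded again.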

We know that miners mine a block roughly every 10 minutes (in Bitcoin) and that a successful block is broadcast to all the nodes for validation & verification. So when a new transaction or block is to be validated, do all the nodes "see" it at the same time?