Blockchain Byte - Week 13 : Hashing & Public Key Cryptography
Table of Contents
- Recap
- Digital Signature
- Encryption & Decryption
- Checksum
- Passwords
Recap
Last week, we introduced hashing. To recap, hashing is the process of
taking an input of any length -> running it through a hashing algorithm -> getting an output of fixed length.
This means the length of the hash output is independent of the length of the input: whether the input contains 10, 100, 1,000 or more characters, the output always has the same fixed length.
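A minimal Python sketch of this property, assuming SHA-256 (from the standard hashlib module) as the hashing algorithm. The function name sha256_hex and the sample inputs are made up for illustration:

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Inputs of very different lengths (illustrative values only).
inputs = ["a", "a" * 100, "a" * 100_000]

for item in inputs:
    digest = sha256_hex(item)
    print(f"input length {len(item):>6} -> hash length {len(digest)}")
```

Every line reports a hash length of 64 hexadecimal characters (256 bits), no matter how long the input is.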
We also noted that a blockchain uses hash functions to create a record of the data stored on it. For example, if a block contains 100 transactions, a hash is created from those 100 transactions.
Why? So that any change to a single piece of data is easily identified. We all know that the blockchain architecture is decentralized, meaning there is no centralized institution or authority verifying the authenticity of any transaction or maintaining IT security. The nodes are the ones carrying out the verification. (We will get into how transactions are added to the blockchain later!!)
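As a rough illustration of how one hash can expose a change to any transaction in a block, here is a small Python sketch. It is a simplification and not how any particular blockchain lays out its data; the function block_hash, the transaction strings and the use of SHA-256 are assumptions made for this example:

```python
import hashlib


def block_hash(transactions: list[str]) -> str:
    """Hash all transactions in a block into one fixed-length value."""
    hasher = hashlib.sha256()
    for tx in transactions:
        data = tx.encode("utf-8")
        # Length-prefix each transaction so the boundaries are unambiguous.
        hasher.update(len(data).to_bytes(4, "big"))
        hasher.update(data)
    return hasher.hexdigest()


# 100 made-up transactions.
transactions = [f"account-{i} pays account-{i + 1} {i} coins" for i in range(100)]
original = block_hash(transactions)

# Change the amount in a single transaction.
tampered = list(transactions)
tampered[42] = "account-42 pays account-43 9999 coins"

print(original == block_hash(tampered))  # False
```

Changing one transaction out of 100 produces a different block hash, so any node that recomputes the hash can spot the change.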
Digital Signature
So, there should be some security or check to ensure no one is able to change any part of the transaction during verification or once verified. This is where cryptography plays a fundamental role in ensuring transaction authentication without a trusted third party.
Before we see how it is done in blockchain, let us take the simpler scenario of implementing a digital signature.
What is a digital signature? It is a technique for the secure transmission of documents that ensures:
- The authenticity of the source, and
- That the document has not been altered in transit from sender to receiver.
How do we ensure the authenticity of a physical document? By signing it physically. The signature is proof that the person signing has verified and agreed to the contents of the document.
How does the same authenticity and verification happen in the digital domain? It has to be proved that
- The person sending the digital document (the owner) is actually the source from which the document originated
- The contents of the document have not been tampered with in transit.
How can the recipient of the document ensure both the above conditions are satisfied?
Let us put these conditions in a diagram for better visualization
In a digital signature, the sender uses two keys. Let us name them Key A and Key B. The sender keeps Key A to themselves and uses it to encrypt. Whatever Key A has encrypted can be decrypted only with Key B, which the sender gives to the receiver. This means these are paired keys: only Key B can unlock (decrypt) whatever Key A has locked (encrypted), so anything that Key B unlocks must have been locked with Key A.
Let us visualize this as below :
Encryption & Decryption
How exactly do encryption and decryption work? The document is run through a hashing algorithm, which creates a hash output. We then use Key A to encrypt the hash. Now, why are we encrypting only the hash instead of the whole document?
A hash is created from all the input characters of the document. Last week, we mentioned that if we change even a single character in the input document, the hash changes completely. So the hash stands in for the whole document: any change to the document shows up as a different hash. Encrypting the hash, which is much shorter than the document, is also much faster than encrypting the entire document, and the effort stays the same because the hash length is fixed irrespective of the length of the document.
The sender transmits the document together with the encrypted hash. The receiver, who has the paired Key B, uses it to decrypt the encrypted hash and recover the hash that the sender created. The receiver also runs the received document through the same hashing algorithm to generate a second hash. The receiver can now compare the hash initially created by the sender with the hash just generated. If the two match, the receiver can be confident that the document is authentic. Otherwise, either
- The document has been tampered with, or
- It was NOT sent by the sender using Key A
The diagram below makes this clearer.
Now that the overall concept is clear, in technical terms -
- Key A, which the sender used to encrypt the document hash, is called the sender's PRIVATE KEY.
- Key B, which was provided to the receiver to decrypt the document hash, is called the sender's PUBLIC KEY.
- The concept of using a PAIRED PUBLIC & PRIVATE KEY system is called ASYMMETRIC CRYPTOGRAPHY.
- Asymmetric cryptography enables users who do not know each other to exchange encrypted information.
- A public key is made available to all. It is like a bank account number, which can be provided to anyone.
- The private key, which remains secret, acts as the password to the same bank account.
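To make the flow concrete, here is a minimal Python sketch of signing a document hash with the private key and verifying it with the public key. It assumes textbook RSA as the asymmetric algorithm and SHA-256 as the hash function, neither of which is specified above. The primes, the function names (document_hash, sign, verify) and the sample documents are made up for illustration, and the key is far too small to be secure. It needs Python 3.8 or later for pow() with a negative exponent.

```python
import hashlib

# Toy key pair built from two known Mersenne primes. These values are
# assumptions for illustration only and are far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # modulus, shared by both keys
E = 65537                          # public exponent: public key is (N, E)
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent: private key is (N, D)


def document_hash(document: bytes) -> int:
    """Hash the document and reduce it to an integer smaller than N."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """Sender: encrypt the document hash with the private key."""
    return pow(document_hash(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Receiver: decrypt the signature with the public key and compare it
    with a freshly computed hash of the received document."""
    return pow(signature, E, N) == document_hash(document)


document = b"Pay 10 coins to Bob"
signature = sign(document)

print(verify(document, signature))                  # True
print(verify(b"Pay 1000 coins to Bob", signature))  # False
```

Real systems rely on vetted cryptographic libraries, much larger keys and padded or elliptic-curve signature schemes. The sketch only mirrors the hash -> encrypt -> decrypt -> compare steps described above.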
Now, a question!! Is it possible for two different documents to produce the exact same hash output? Well, it is possible, but with a good hash function the probability of that happening is extremely small. When it does happen, it is called a HASH COLLISION.
Can we think of some real-life examples of using hash functions?
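Collisions are easy to see once the hash output is made artificially short. The Python sketch below keeps only the first 4 hexadecimal characters (16 bits) of a SHA-256 hash, so there are just 65,536 possible outputs and a collision must turn up quickly. The names tiny_hash and find_collision, and the choice of counting numbers as inputs, are assumptions for this illustration:

```python
import hashlib


def tiny_hash(text: str) -> str:
    """First 4 hex characters (16 bits) of SHA-256: a deliberately weak hash."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:4]


def find_collision() -> tuple[str, str]:
    """Try the inputs "0", "1", "2", ... until two share the same tiny hash."""
    seen = {}  # tiny hash -> first input that produced it
    counter = 0
    while True:
        candidate = str(counter)
        digest = tiny_hash(candidate)
        if digest in seen:
            return seen[digest], candidate
        seen[digest] = candidate
        counter += 1


first, second = find_collision()
print(first, second, tiny_hash(first), tiny_hash(second))
```

With the full 256-bit output, the same search would have to work through an astronomically larger number of inputs, which is why collisions are so rare in practice.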
Checksum
Checksums are commonly used when downloading operating system images or software. When we download software or an operating system image, how do we verify that what we are downloading is authentic and not a fake containing malware or a virus? The verification is done through a checksum.
A checksum is the output of a hashing algorithm’s application to a piece of input data, in this case, an OS image or program (we discussed document earlier).
The checksum for a particular software or OS image is listed on the vendor's website. To confirm that the downloaded file is safe, the user computes the checksum of the downloaded file and compares it with the checksum listed on the vendor's site. If the two values match, the file is the same one the vendor published. If they don't match, the file may have been corrupted or tampered with and shouldn't be installed.
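The comparison can be sketched in a few lines of Python. This assumes the vendor publishes a SHA-256 checksum as a hexadecimal string; the function names file_checksum and matches_published_checksum are made up for illustration:

```python
import hashlib


def file_checksum(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks
    so that large OS images do not have to fit in memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published_checksum(path: str, published: str) -> bool:
    """Compare the computed checksum with the one listed by the vendor."""
    return file_checksum(path) == published.strip().lower()
```

A call would look like matches_published_checksum("downloaded.iso", checksum_from_vendor_site), where both the file name and the variable are placeholders.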
Passwords
When we create a password on a website, the raw password is generally not what is saved in the site's database. What is saved is the hash of the password. When we enter the password the next time, the program checks whether the hash of the entered password matches the saved hash and grants access accordingly.
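Here is a minimal Python sketch of that check. The text above describes a plain hash; this sketch goes one step further and uses a random salt with PBKDF2 (hashlib.pbkdf2_hmac), which is common practice for password storage but is an assumption here, not something stated above. The function names, the iteration count and the sample password are illustrative only:

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a hash from the password and salt using PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def register(password: str) -> tuple[bytes, bytes]:
    """Return (salt, hash): what the site stores instead of the raw password."""
    salt = os.urandom(16)
    return salt, hash_password(password, salt)


def check_login(entered: str, salt: bytes, stored_hash: bytes) -> bool:
    """Hash the entered password the same way and compare with the stored hash."""
    return hmac.compare_digest(hash_password(entered, salt), stored_hash)


salt, stored = register("correct horse battery staple")
print(check_login("correct horse battery staple", salt, stored))  # True
print(check_login("wrong guess", salt, stored))                   # False
```

The site never needs to keep the raw password: it stores only the salt and the hash, and repeats the same computation at login.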
Another simple write up on hashing, hash functions & hash tables can be found here.