Blockchain - Part 1, Introduction
Blockchain is one of the new technologies taking the industry by storm. In this series of articles we will go through how blockchain technology works.
What is Blockchain
In the simplest terms, a blockchain is a list of immutable records of data, or blocks, where each block points to its immediate predecessor through a cryptographic hash (more about this later), forming a chain, i.e., a blockchain. Since blocks are linked via a hash function, they are resistant to change.
As you can see from the above diagram, each block has the hash of the previous block in its header. Suppose a malicious entity tries to forge block 3. To do this, it has to either create a block whose hash is the same as that of block 3 (which is computationally infeasible for a secure hash function), or change every block following block 3, which requires a lot of computing power and resources.
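To make the linking concrete, here is a minimal sketch in Python. It is not how any particular blockchain stores its blocks: the field names (`index`, `data`, `prev_hash`), the use of SHA-256 over a JSON encoding, and the all-zero placeholder for the first block are all assumptions made for illustration.

```python
import hashlib
import json


def block_hash(block):
    """Hash a block's contents, including the previous block's hash."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def build_chain(records):
    """Link each record to its predecessor through prev_hash."""
    chain = []
    prev_hash = "0" * 64  # placeholder: the first block has no predecessor
    for index, data in enumerate(records):
        block = {"index": index, "data": data, "prev_hash": prev_hash}
        chain.append(block)
        prev_hash = block_hash(block)
    return chain


def is_valid(chain):
    """Check that every block still points at the hash of the block before it."""
    for prev, current in zip(chain, chain[1:]):
        if current["prev_hash"] != block_hash(prev):
            return False
    return True


chain = build_chain(["block 1", "block 2", "block 3", "block 4", "block 5"])
print(is_valid(chain))  # True

chain[2]["data"] = "forged block 3"  # tamper with block 3
print(is_valid(chain))  # False
```

Changing block 3 changes its hash, so the `prev_hash` stored in block 4 no longer matches. To hide the forgery, the attacker would have to rewrite block 4, then block 5, and so on.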
A blockchain is not owned by any single entity. Instead, it is managed by a cluster of computers (or nodes) in a peer-to-peer network, where each node can read the data in the blockchain. This open nature makes blockchain well suited to serve as a distributed ledger that can record and verify a transaction between two parties.
Blockchain was invented in 2008 by a person or group of people using the name Satoshi Nakamoto (whose real identity is unknown), to serve as the distributed ledger for the cryptocurrency Bitcoin.
Before diving into blockchain technology, you should familiarize yourself with the following topics.
Distributed Ledger :-
A ledger, by definition, is a book of records. A centralized ledger is owned and maintained by a central authority, for example a bank. There are two problems with a centralized ledger:
- If the central authority goes down, the whole system goes down and no more records can be processed.
- If the central authority has malicious intent, it can corrupt the whole system; for example, it can take all the money from a user without the user's consent.
In a distributed ledger there is no central authority. The database is replicated and synchronized among multiple nodes. Each node maintains an identical copy of the ledger and updates it independently. The nodes use a consensus algorithm to decide which ledger copy is correct.
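As a rough illustration of replicated copies, the sketch below uses a simple majority vote to pick the ledger that most nodes agree on. The majority vote is only a stand-in for a real consensus algorithm, and the node names and ledger entries are hypothetical.

```python
import hashlib
from collections import Counter


def ledger_digest(ledger):
    """Fingerprint a ledger copy so that copies can be compared cheaply."""
    joined = "\n".join(ledger).encode("utf-8")
    return hashlib.sha256(joined).hexdigest()


def majority_ledger(copies):
    """Return the ledger copy held by the largest number of nodes."""
    counts = Counter(ledger_digest(ledger) for ledger in copies.values())
    winning_digest, _ = counts.most_common(1)[0]
    for ledger in copies.values():
        if ledger_digest(ledger) == winning_digest:
            return ledger


honest = ["alice pays bob 5", "bob pays carol 2"]
copies = {
    "node_a": list(honest),
    "node_b": list(honest),
    "node_c": ["alice pays mallory 500", "bob pays carol 2"],  # corrupted copy
}
print(majority_ledger(copies) == honest)  # True
```

A single corrupted node is outvoted by the two honest ones, so no single participant can rewrite the records on its own.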
Cryptographic Hashing :-
Hashing is a process by which data of any size is converted into a fixed-size output. The same input always produces the same output. This process is carried out by a hash function, and the output is called the hash value, or simply the hash. Two different inputs can produce the same hash value; this is called a collision. A cryptographic hash function is a hash function suitable for use in cryptography. It has the following properties :-
- The same input data always produces the same hash value
- It should not be feasible to recover the input data from the hash value
- It should be impractical to find two different inputs with the same hash value, i.e., collisions should be infeasible to find in practice
- A small change in the input data should change the hash value so much that the new hash value appears unrelated to the old one
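Some of these properties can be observed directly. The sketch below assumes Python 3 and SHA-256 from the standard `hashlib` module; the helper names `sha256_hex` and `differing_bits` are made up for this example.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 hash of a string as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a, hex_b):
    """Count the bit positions at which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


first = sha256_hex("hello world")
second = sha256_hex("hello world!")  # one extra character

print(first == sha256_hex("hello world"))  # True: same input, same hash
print(len(first), len(second))             # 64 64: fixed-size output (256 bits as hex)
print(differing_bits(first, second))       # how many of the 256 bits changed
```

For a good hash function, a one-character change is expected to flip roughly half of the output bits, which is why the two hashes look unrelated.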
MD5, SHA-1 and SHA-2 are some popular examples of cryptographic hash functions (MD5 and SHA-1 are no longer considered collision resistant and should not be relied on for security). If you are using Linux or macOS, you can explore MD5 hashing from the command line, for example with the md5sum tool on Linux or the md5 tool on macOS.
Hashing two similar inputs, “hello world” and “hello world 2”, generates completely different hash values.
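The same comparison can also be sketched in Python, assuming Python 3 and its standard `hashlib` module (the helper name `md5_hex` is made up for this example):

```python
import hashlib


def md5_hex(text):
    """Return the MD5 hash of a string as a hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


for text in ("hello world", "hello world 2"):
    print(f"{text!r}: {md5_hex(text)}")
```

Note that `echo` appends a newline to its input by default, so a hash computed on the command line may differ from the one printed here unless the newline is suppressed.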
Public and private key cryptography or asymmetric cryptography:-
As the name suggests, one of the keys is kept private and the other is made public. The two keys are mathematically linked, such that a message encrypted with one key can only be decrypted with the other.
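The relationship between the two keys can be illustrated with textbook RSA on tiny numbers. This is a toy sketch only: the primes, the exponent, and the function names are chosen for readability, there is no padding, and keys this small offer no security at all. It assumes Python 3.8 or later for the modular inverse via `pow`.

```python
# Toy RSA key pair built from tiny primes. Illustrative only: never use for real security.
p, q = 61, 53
n = p * q                 # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                    # public exponent, chosen coprime to phi
d = pow(e, -1, phi)       # private exponent: modular inverse of e (Python 3.8+)

public_key = (e, n)
private_key = (d, n)


def encrypt(message, key):
    """Encrypt an integer message smaller than the modulus."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext, key):
    """Decrypt with the other key of the pair."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


message = 65                                  # must be an integer smaller than n
ciphertext = encrypt(message, public_key)     # anyone can encrypt with the public key
recovered = decrypt(ciphertext, private_key)  # only the private key holder can decrypt
print(recovered == message)  # True
```

The public key `(e, n)` can be shared freely. Recovering `d` from it requires factoring `n`, which is trivial here but is believed to be infeasible for the key sizes used in practice.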
Digital Signature:-
A digital signature is a mechanism for verifying the authenticity of a message. Digital signatures use asymmetric (public and private key) cryptography.
In this process, the sender first computes the hash of the message, then encrypts the hash value using the sender’s private key. The sender then sends the message, along with the encrypted hash and the name of the hash algorithm (for example MD5 or SHA-1), to the receiver. On receiving the message, the receiver decrypts the encrypted hash with the sender’s public key, computes the hash of the received message, and compares the two. If they match, the receiver can be confident that the message was sent by the holder of the private key and was not altered in transit.
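The sign-and-verify flow can be sketched with the same textbook RSA idea. Again this is a toy under stated assumptions: the key is built from two publicly known Mersenne primes, the hash is simply reduced modulo `n`, and there is no padding, so it only shows the shape of the process. Real systems use a vetted library and a standard signature scheme. It assumes Python 3.8 or later.

```python
import hashlib

# Toy RSA key pair from two publicly known Mersenne primes. Illustrative only.
p, q = 2**61 - 1, 2**89 - 1
n = p * q
e = 65537                          # public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (Python 3.8+)


def message_hash(message):
    """Hash the message and map the digest to an integer smaller than n."""
    digest = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % n


def sign(message):
    """Sender: transform the message hash with the private key."""
    return pow(message_hash(message), d, n)


def verify(message, signature):
    """Receiver: recover the hash with the public key and compare."""
    return pow(signature, e, n) == message_hash(message)


signature = sign("pay bob 5 coins")
print(verify("pay bob 5 coins", signature))    # True
print(verify("pay bob 500 coins", signature))  # False
```

The second check fails because the altered message hashes to a different value than the one the sender signed.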
Consensus Algorithm:-
As the name suggests, a consensus algorithm is a set of rules that the nodes in a distributed system use to agree on the state of the system as a whole.
In a blockchain network, all nodes have a copy of the blockchain. When a new block is introduced, the nodes use a consensus algorithm to agree on which node’s block is correct and should be considered the source of truth.
The following are some consensus algorithms :-
Proof of Work: This is the consensus algorithm used in Bitcoin. Nodes in the system have to solve a complex mathematical puzzle, which is time consuming and requires a lot of resources (e.g., electricity and computing power).
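A common way to picture such a puzzle is a search for a number (a nonce) that makes the block's hash start with a given number of zeros. The sketch below is a simplified stand-in and not Bitcoin's actual procedure, which differs in its details. The `mine` function, the block text, and the difficulty value are assumptions made for illustration.

```python
import hashlib


def mine(block_data, difficulty):
    """Try nonces until the hash starts with `difficulty` zero hex digits."""
    target_prefix = "0" * difficulty
    nonce = 0
    while True:
        candidate = f"{block_data}{nonce}".encode("utf-8")
        digest = hashlib.sha256(candidate).hexdigest()
        if digest.startswith(target_prefix):
            return nonce, digest
        nonce += 1


nonce, digest = mine("block 3: alice pays bob 5", difficulty=4)
print(nonce, digest)
```

Finding the nonce takes many attempts, but anyone can check the result with a single hash. Each extra zero digit multiplies the expected work by 16.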
Proof of Stake: This is a consensus algorithm which, unlike proof of work, does not require heavy computing power. Nodes that want to participate in the block creation process have to put a certain amount of coins into the network as a stake. The higher the stake, the higher the chance of being selected as the validator that creates a block. When a node is selected, it validates the transactions, adds them to a block, signs the block, and adds it to the blockchain.
In the next article we will look at how blockchain works, using Bitcoin as an example.
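The "higher stake, higher chance" rule can be sketched as a weighted random draw. This is only one simple way to model validator selection, and real proof of stake protocols add further rules. The node names, stake amounts, and fixed random seed are hypothetical.

```python
import random


def pick_validator(stakes, rng):
    """Choose one node at random, with probability proportional to its stake."""
    nodes = list(stakes)
    weights = [stakes[node] for node in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]


stakes = {"node_a": 50, "node_b": 30, "node_c": 20}  # hypothetical stakes in coins
rng = random.Random(42)  # fixed seed so the run is repeatable

wins = {node: 0 for node in stakes}
for _ in range(10_000):
    wins[pick_validator(stakes, rng)] += 1
print(wins)  # node_a should win roughly half of the rounds
```

Over many rounds each node is selected roughly in proportion to its share of the total stake.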