A Bitcoin protocol explainer

Much has been written around the Bitcoin protocol. I will walk you through a practical exercise in which we will rebuild Bitcoin together.

If you work in Tech, you have surely heard blockchain described as a magic cost-saving panacea that will cut expenses and improve efficiency in whatever process or application you apply it to.

Of course, this is far from reality.

Blockchain is an inherently inefficient technology that should only be explored for very specific use cases.

Most people have a vague idea of what blockchain is.

They might have heard of Bitcoin, perhaps even bought some, and then dipped their toes into other cryptocurrencies, all without a clear understanding of the underlying technology.

Blockchain was popularized by the anonymous person (or group) known as Satoshi Nakamoto when, between 2007 and 2009, he/she/they invented Bitcoin, a peer-to-peer payment network that runs on a cryptographic protocol, with transactions recorded on a specific version of blockchain technology.

However, blockchain as an idea is far older than that. It was introduced in 1991 by Haber and Stornetta in the Journal of Cryptology, in the context of time-stamping digital documents. It was not called “blockchain” but the concept has been around for a long time. Blockchain, as we understand it today, is a “special” ledger: distributed, transparent, append-only and cryptographically secured.

In this article, I am going to explain some of the intricacies of the Bitcoin protocol and blockchain as well as a few of the most salient architectural choices made by Satoshi.

This understanding will be extremely useful when, as professionals or crypto-enthusiasts, we all investigate cryptocurrencies, cryptoventures and wider blockchain-based solutions.

Table of Contents

  • What is Bitcoin
  • Rebuilding Bitcoin
    • The Double-spend problem in the Bitcoin protocol
    • Proof-of-Work mechanism
    • What are the incentives in the Bitcoin protocol?
    • What could go wrong…
    • The chain can fork
    • Merkle Trees in Bitcoin
  • A couple of loose ends
    • Why the 10-minute block?
    • Why the small block size?
  • Conclusive Thoughts

What is Bitcoin

The first clarification we need to make concerns the relationship between the Bitcoin protocol, the Bitcoin network, the Bitcoin currency and the Bitcoin blockchain. This clarification is important because these terms are often used interchangeably, which can generate some confusion.

The Bitcoin network is a peer-to-peer (decentralized) payment network that functions on a cryptographic protocol. This is, in a way, similar to the World Wide Web operating on top of HTTP (the Hypertext Transfer Protocol).

The members of the Bitcoin network exchange bitcoin, the units of currency that represent economic value (bitcoin in this sense is a “cryptocurrency”). They do so by broadcasting digitally signed messages that contain information regarding the transaction.

Each transaction is then recorded, upon validation through a mechanism which we will review shortly, in a public database known as the blockchain. The blockchain has some very specific features that make it different from a common database, and we will read about some of those later in this article.

For now, we should know that the blockchain is an append-only chain of blocks: each block is pretty small, capped at 1 megabyte (we’ll find out why), and is made up of a set of value-exchange transactions generated between users.

Before the Bitcoin network went live on 3rd January 2009, Satoshi wrote a white paper called “Bitcoin: A Peer-to-Peer Electronic Cash System” that he shared on a cryptography mailing list in order to get feedback from the expert community. This process much resembles that of an academic researcher who goes through rounds of peer review to ensure their findings are sound.

Copy of the original email sent by Satoshi Nakamoto

Rebuilding Bitcoin

Many articles, blog posts and books have been written around Bitcoin and many of them explain the workings of the protocol with a varying degree of detail.

I will attempt to walk you through a practical exercise in which we will rebuild Bitcoin together, starting from the most intuitive concepts and slowly iterating to add the more elaborate features.

We will start with a very rough version of the Bitcoin digital currency protocol.

As we mentioned earlier, Bitcoin is a form of digital money and, as such, each bitcoin can be exchanged between people.

Let’s assume that Alice wants to send some Bitcoin (BTC) to Bob. To do so, she writes the message “Alice sends 1 BTC to Bob”, she digitally signs it with her private key and broadcasts this message. At this point, the network is aware that Alice is sending this money to Bob.

The private key is, for all practical purposes, a random number and it is mathematically linked to a public key. While it is simple to go from the private key to the public key, it is nearly impossible to do the opposite. The technology currently used in Bitcoin to perform these operations is known as Elliptic Curve Cryptography (ECC). 
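To make the “easy one way, nearly impossible the other way” idea concrete, here is a minimal sketch of how a public key can be derived from a private key on secp256k1, the elliptic curve Bitcoin uses. The function names (point_add, public_key, on_curve) are my own, and this plain-Python version is for illustration only: it is neither optimized nor hardened, and real wallets rely on audited libraries.

```python
import secrets

# secp256k1 domain parameters: the curve is y^2 = x^3 + 7 over the field F_P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (  # generator point
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a + (-a) is the point at infinity
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P   # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord line
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def public_key(private_key: int):
    """Compute private_key * G with the double-and-add method."""
    result, addend = None, G
    while private_key:
        if private_key & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        private_key >>= 1
    return result


def on_curve(point) -> bool:
    x, y = point
    return (y * y - x * x * x - 7) % P == 0


private = secrets.randbelow(N - 1) + 1  # random integer in [1, N-1]
public = public_key(private)
print(on_curve(public))
```

Going from private to public takes a few hundred point additions. Going back would mean solving the elliptic curve discrete logarithm problem, for which no efficient method is publicly known.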

The immediate problem that we have with this initial iteration of the protocol is that the exact same message can be replicated multiple times and Alice can send the same 1 BTC many times over to Bob.

We can improve this initial sketch by introducing a serial number linked to each BTC. In this way, when Alice initiates a transaction, her message would look something like “Alice sends 1 BTC to Bob – serial number 15674981”. If she wants to send another bitcoin, she will need something like “Alice sends 1 BTC to Bob – serial number 15674982” and so on.

This is a better version of the protocol but now we need a mechanism to ensure that Alice doesn’t double spend the coins she owns.

We could do that by introducing a Third Party that certifies which coins have been spent at any point in time. This central authority would effectively act like a Bank.

However in our protocol we want to do something much more ambitious than that: we want to replace any sort of trusted third parties with the network itself.

In order to achieve this, we are going to ask any person on the network to maintain a copy of the ledger (ie the list of all transactions ever executed) and we will call such a public shared ledger a blockchain. The network members that keep a copy of the distributed ledger are known as full nodes.

Alice can now send her coin to Bob: “Alice sends 1 BTC to Bob – serial number 15674981”. Bob can verify in his version of the blockchain that she actually has the coin and, if satisfied, can accept the transaction.

After accepting the transaction, Bob can update the ledger and broadcast the new version to the whole network.

This sounds like everything is working fine… but the double-spend problem is not yet solved.

The Double-spend problem in the Bitcoin protocol

I will show what Alice could do.

She could send an “Alice sends 1 BTC to Bob – serial number 15674981” message to Bob and an “Alice sends 1 BTC to Charlie – serial number 15674981” message to Charlie at the same time.

Bob and Charlie would each verify their own version of the ledger and, because this is happening at more or less the same moment, they would both accept Alice’s coin… and one of them would be robbed of his BTC!
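The scenario can be sketched in a few lines. NaiveNode is a hypothetical class of mine that models a network member who checks only its own copy of the ledger, which is exactly the weakness Alice exploits.

```python
class NaiveNode:
    """A network member that trusts only its own copy of the ledger."""

    def __init__(self, name, unspent):
        self.name = name
        # serial number -> current owner, as seen by this node
        self.unspent = dict(unspent)

    def accept(self, sender, receiver, serial):
        """Accept the coin if the local ledger says the sender owns it."""
        if self.unspent.get(serial) != sender:
            return False
        self.unspent[serial] = receiver
        return True


ledger = {15674981: "Alice"}
bob = NaiveNode("Bob", ledger)
charlie = NaiveNode("Charlie", ledger)

# Alice sends the same serial number to both at the same time.
bob_ok = bob.accept("Alice", "Bob", 15674981)
charlie_ok = charlie.accept("Alice", "Charlie", 15674981)
print(bob_ok, charlie_ok)  # True True: both accepted the same coin
```

Each node’s check is correct in isolation. The problem is that the two local ledgers now disagree about who owns coin 15674981.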

One possible solution to this could be for Bob and Charlie to ask the rest of the network to verify the transaction they have received.

In this way, they don’t just rely on their own version of the ledger but ask the other members of the network to check the transaction against their versions as well. Only after many network members accept the transaction is it considered legitimate.

Is this good enough? Not really.

Alice is a very smart fraudster: she could create millions of fake identities in the network and, when Bob and Charlie are trying to validate the transactions, she would unleash those identities to falsely mark both transactions as legit.

The solution to this problem comes with the introduction of a well-thought-out consensus mechanism in the protocol, a system called Proof-of-Work (PoW).

The PoW mechanism is designed so that validating transactions is costly for network members, and validating fraudulent ones is against their own interest!

Proof-of-Work mechanism

This is also the point at which we need to introduce incentives in our protocol.

We need to reward network members for helping validate transactions and being truthful. We also need to make it artificially hard and expensive for network members to validate transactions, so that it becomes anti-economic to try and fool the system.

Let’s run through an example.

Let’s assume that Alice sends an “Alice sends 1 BTC to Bob – serial number 15674981” message and, in order for this message to get validated, it is broadcast to the members of the network.

We can think of a network member called Maria that has a bunch of pending transactions to be validated in the queue, for example:

  • “John sends 1 BTC to Romeo – serial number 15623481”
  • “Fred sends 1 BTC to Gaia – serial number 16931581”
  • “Luca sends 1 BTC to Paul – serial number 19713981”

and finally

  • “Alice sends 1 BTC to Bob – serial number 15674981”

Maria checks her copy of the blockchain and can see that these transactions are all valid. She wants to help ensure truthfulness in the network and therefore she wants to broadcast her findings about this block of transactions. Before the protocol allows her to broadcast her verification, she needs to solve a computationally hard problem, the Proof-of-Work.

Here, I will have to go into a bit of detail, but please make an effort to stay with me.

Let’s call h a hash function known by everyone in the network – it’s built into the protocol. In Bitcoin, as of now we use the well-known SHA-256 hash function.

A hash function maps an input of arbitrary size to a fixed-length output, usually written as a string of hexadecimal digits. Any change in the input causes the hash to change too, and it is practically impossible to find two different inputs with the same hash. Hashing is a one-way function: it is easy to go from the input to the hash but extremely difficult to go the other way.

Let’s give Maria’s queue of pending transactions a label, q, for our easy reference.

Maria appends a number n (called the nonce) to q and hashes the combination.

For example, if we use q = “I love Italy” and the nonce n = 0 then:

h(I love Italy0) = 50286ee57614be7c3436a387d2526e93885ac53220e126897ea965492f92a3f1

Alternatively, if we use q = “I love Italy” and the nonce n = 1 then:

h(I love Italy1) = 9745f193b904ea0a4efe3613d11a4c21e875e13fdfe65951f55b697e216d0710

A small change in the input generates a large variation in the output.
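A minimal sketch of this hashing step using Python’s standard hashlib module. How the string is encoded and how the nonce is appended are my assumptions (UTF-8 text, nonce written as decimal digits), so the digests printed here may differ from the ones quoted above. The point is that consecutive nonces give outputs that look unrelated.

```python
import hashlib


def h(text: str) -> str:
    """SHA-256 of a text string, returned as 64 hexadecimal digits."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


q = "I love Italy"
for nonce in range(3):
    # Only the last character of the input changes between iterations.
    print(nonce, h(f"{q}{nonce}"))
```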

Computing a single hash is cheap, but finding an input whose hash meets a given condition takes many attempts, and the Bitcoin protocol has a clever way of increasing or decreasing the amount of computation required based on the state of the network.

In the Bitcoin protocol, we call target a 256-bit number that all Bitcoin clients share. The SHA-256 hash of a block’s header (Bitcoin applies SHA-256 twice) must be lower than or equal to the current target for the block to be accepted by the network.

The lower the target, the more difficult it is to generate a block.

Block generation is not a long, set problem (such as doing a million hashes), but it is a randomized activity that resembles a lottery.

When the nonce varies, each hash gives you a random number between 0 and the maximum value of a 256-bit number (an extremely high value).

If your hash is below the target, then you win. If not, you increment the nonce and try again.

The target is adjusted automatically by the algorithm so that a new Bitcoin block is mined roughly every 10 minutes on average (and I’ll explain later why 10 minutes).
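A rough sketch of how this adjustment works. The details below go beyond what this article covers, but they are how Bitcoin behaves as far as I know: the target is recomputed every 2016 blocks, scaled by how long those blocks actually took compared with the expected two weeks, and a single adjustment is limited to a factor of 4 in either direction. The function name retarget is my own, and the real implementation also caps the result at a maximum target, which I omit here.

```python
BLOCKS_PER_PERIOD = 2016
TARGET_SPACING_S = 600                                   # 10 minutes
EXPECTED_TIMESPAN_S = BLOCKS_PER_PERIOD * TARGET_SPACING_S  # two weeks


def retarget(old_target: int, actual_timespan_s: int) -> int:
    """Scale the target by actual/expected time, limited to a factor of 4."""
    clamped = max(EXPECTED_TIMESPAN_S // 4,
                  min(actual_timespan_s, EXPECTED_TIMESPAN_S * 4))
    return old_target * clamped // EXPECTED_TIMESPAN_S


# Blocks arrived twice as fast as intended: the target halves,
# so on average twice as many hashes are needed per block.
print(retarget(1000, EXPECTED_TIMESPAN_S // 2))  # 500
```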

In our example, we can now assume Maria has identified the right nonce that generates the hash expected to win the mathematical challenge at hand. She can now broadcast the transactions as well as the nonce n.

The other network members (the nodes) can verify that n is actually the correct solution: this verification is trivial and requires very little computational expenditure.
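Maria’s search and the other nodes’ check can be sketched as follows. This is a toy version under my own assumptions: the “block” is just the string q with the nonce appended, a single SHA-256 is used, and TOY_TARGET is chosen so that roughly 1 hash in 65,536 qualifies (the real target is vastly lower). The function names are hypothetical.

```python
import hashlib


def block_hash(q: str, nonce: int) -> int:
    """Hash q with the nonce appended and read the digest as a 256-bit number."""
    digest = hashlib.sha256(f"{q}{nonce}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def mine(q: str, target: int) -> int:
    """Try nonces 0, 1, 2, ... until the hash is at or below the target."""
    nonce = 0
    while block_hash(q, nonce) > target:
        nonce += 1
    return nonce


def verify(q: str, nonce: int, target: int) -> bool:
    """Checking a claimed solution costs a single hash."""
    return block_hash(q, nonce) <= target


TOY_TARGET = 2**240 - 1  # top 16 bits of the hash must be zero

q = "Alice sends 1 BTC to Bob – serial number 15674981"
nonce = mine(q, TOY_TARGET)
print(nonce, verify(q, nonce, TOY_TARGET))
```

The asymmetry is the whole point: mine may need tens of thousands of attempts here, while verify always needs exactly one.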

What are the incentives in the Bitcoin protocol?

But why would network members use so much computational power and energy at their own cost to look for the right nonce and ensure integrity in the Bitcoin network? The answer lies in incentives.

Those network members that spend time and energy solving this hash puzzle in order to validate the transactions are called miners. The first miner that finds a valid nonce for each block is rewarded with some bitcoin.

When the protocol first started on 3rd January 2009, this reward was set at 50 bitcoin per block.

Every 210,000 blocks (approximately every 4 years) the bitcoin reward is halved. So far, this happened on 28th Nov 2012 (bitcoin reward set at 25 BTC), 9th July 2016 (bitcoin reward set at 12.5 BTC), 11th May 2020 (bitcoin reward set at 6.25 BTC) with the next halving event bound to happen around February-May 2024.

The 2024 halving will set the bitcoin reward for miners to 3.125 BTC.

Halving events in Bitcoin block rewards

As per the original Bitcoin protocol created by Satoshi Nakamoto, there will only be 21 million bitcoin ever in circulation, with the last bitcoin expected to be mined around the year 2140.
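The 21 million figure follows directly from the halving schedule, which can be checked with a little arithmetic. In this sketch I count in satoshi, the smallest unit of bitcoin (one hundred-millionth of a BTC), and halve the reward with integer division; the function names are my own.

```python
SATOSHI_PER_BTC = 100_000_000
HALVING_INTERVAL = 210_000
INITIAL_REWARD = 50 * SATOSHI_PER_BTC


def block_reward(height: int) -> int:
    """Reward (in satoshi) for the block at a given height."""
    halvings = height // HALVING_INTERVAL
    return INITIAL_REWARD >> halvings  # each halving is an integer shift


def total_supply() -> int:
    """Sum the rewards of every era until the reward rounds down to zero."""
    total, era = 0, 0
    while (reward := INITIAL_REWARD >> era) > 0:
        total += reward * HALVING_INTERVAL
        era += 1
    return total


print(block_reward(840_000) / SATOSHI_PER_BTC)  # 3.125
print(total_supply() / SATOSHI_PER_BTC)         # just under 21 million
```

Because each era pays half of the previous one, the total is a geometric series: 210,000 × 50 × (1 + 1/2 + 1/4 + …), which approaches 21 million BTC without ever quite reaching it.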

We should note that block reward is not the only incentive for miners, because the network also charges transaction fees, which go to the miners that validate those transactions.

It is widely expected that, after 2140, transaction fees will be a big enough incentive for miners to keep sustaining the security of the network.

What could go wrong…

There is a potential vulnerability that has to be highlighted around the Proof-of-Work: this is the so-called 51 percent attack.

If an attacker controlling 51% of the hashing power of the network started, with fraudulent intent, to rewrite recent blocks (for example to reverse their own payments and spend the same coins twice), that could create a problem for the overall integrity of the network.

However, importantly, orchestrating such an attack would be very expensive. Buying 51% of the network would cost a few billion dollars in computational resources and energy expenditure – at the current size of the Bitcoin network. This exercise would get more expensive as the network gets bigger.

Additionally, as soon as these invalid transactions are inserted in the ledger, people would question the integrity of the whole system causing a sudden drop in the price of bitcoin. This would lead to the original expenditure in buying computational power becoming a malinvestment (ie a waste of money).

In essence, the 51 percent attack is definitely a potential weakness of the network, but such an attack is unlikely, makes little economic sense, and becomes less probable as the network grows (because the more hashing power the network has, the more expensive it is to overpower).

The chain can fork

In our attempt to rebuild Bitcoin, we have now reached a critical step in the process.

We know that miners validate blocks of legitimate transactions by solving a computationally hard puzzle. Whichever miner solves this puzzle first earns a reward. The chance of solving the puzzle first depends on how much computational power and energy the miner contributes: roughly speaking, the larger a miner’s share of the computational resources applied to the Bitcoin network, the higher its chance of being first at solving the hash challenge and hence receiving the bitcoin reward.

What happens if two miners solve the hash problem at nearly the same time? This is a possibility that, although probabilistically remote, we cannot exclude.

In such a scenario, the blockchain produces a so-called fork, with the two newly minted blocks becoming the start of two new chains. From that point onwards, the nodes keep track of both chains until one of them becomes the ultimate “winner”, ie keeps getting new blocks appended to it.

Safeguarding the correct ordering of the chain is possible because each block stores the hash of its predecessor, so every block is cryptographically linked to the one before it.
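Here is a minimal sketch of that linking. The Block class and its fields are a simplification of mine (a real block header carries more metadata), but it shows why changing an old block breaks every link after it.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    prev_hash: str          # hash of the previous block
    transactions: tuple     # simplified: transactions as plain strings

    def hash(self) -> str:
        payload = self.prev_hash + "|" + "|".join(self.transactions)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_valid_chain(chain) -> bool:
    """Every block must reference the hash of the block right before it."""
    return all(curr.prev_hash == prev.hash()
               for prev, curr in zip(chain, chain[1:]))


genesis = Block("0" * 64, ("first block",))
b1 = Block(genesis.hash(), ("Alice sends 1 BTC to Bob",))
b2 = Block(b1.hash(), ("Fred sends 1 BTC to Gaia",))
chain = [genesis, b1, b2]

# Rewrite the middle block: b2 still points at the hash of the original b1.
forged_b1 = Block(genesis.hash(), ("Alice sends 1 BTC to Charlie",))
tampered = [genesis, forged_b1, b2]

print(is_valid_chain(chain), is_valid_chain(tampered))  # True False
```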

I will try to explain this concept with some pictures.

When two blocks are mined at nearly the same time, a fork appears in the blockchain:

Example of a chain fork

At this point, miners keep track of both chains until we have a clear “winner”. This is the chain to which more blocks are appended:

Resolution of a chain fork

The “greenlight” chain in my representation is the blockchain that has the agreed-upon sorting of blocks, as per network validation. This is now considered the source of truth for our transactions.

In the Bitcoin network, a block is by convention not treated as final until at least 5 more blocks have been appended to it in the longest (greenlight) chain: this is known as the “6 confirmations” rule, and it is what makes each transaction effectively final.

At this point, the entire network has witnessed this transaction being sent by the user, validated by each node, and confirmed by the miner. Final settlement is achieved and funds are irreversibly passed from the sender to the receiver.
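Why does waiting for more blocks help? Satoshi’s white paper (section 11, “Calculations”) estimates the probability that an attacker holding a fraction q of the hashing power ever catches up after z blocks have been added on top of a transaction. The sketch below follows that calculation; the function name is my own.

```python
import math


def attacker_success(q: float, z: int) -> float:
    """Probability that an attacker with hash share q overtakes z blocks."""
    p = 1.0 - q            # honest share of the hashing power
    if q >= p:
        return 1.0         # a majority attacker always catches up eventually
    lam = z * q / p        # expected attacker progress while z blocks are mined
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam**k / math.factorial(k)
        total -= poisson * (1.0 - (q / p) ** (z - k))
    return total


for z in (0, 1, 3, 6):
    print(z, round(attacker_success(0.10, z), 7))
```

With 10% of the hashing power, the attacker’s chances fall off exponentially as z grows, which is why a handful of confirmations is treated as good enough in practice.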

Merkle trees in Bitcoin

Of course, in my exposition I have oversimplified some of the details when representing the Bitcoin blockchain.

We understood that the Bitcoin blockchain is a distributed, linked list of blocks. Each block contains a list of transactions and a piece of information that stores metadata (the block header, which includes the hash of the previous block).

We can now introduce the next level of detail, by learning that part of this metadata is the root of a specific data structure called “Merkle tree”. The root of a block’s Merkle tree is known as “Merkle root”.

The introduction of the Merkle tree is of fundamental importance because it makes the way the network stores and verifies data significantly more efficient, without losing any security.

A Merkle Tree, in fact, is well known in Computer Engineering as a data structure that allows a large set of information to be verified for accuracy very quickly.

Two properties of Merkle trees that are incredibly important for blockchain are Data Integrity and Partial Verification:

  • Data integrity – The root of the tree changes if any node in the tree is modified. Even the same dataset, with its elements presented in a different order, will produce a different tree.
  • Partial verification – We do not need the full tree to verify whether a transaction belongs to it. We only need a path that starts in one of the leaves and goes all the way up to the root (proof of inclusion).

A Merkle Tree representation
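Both properties can be seen in a small sketch. It is simplified with respect to Bitcoin: I hash raw byte strings instead of real transaction IDs and ignore byte-order details, but I keep two conventions Bitcoin uses as far as I know, namely double SHA-256 and duplicating the last node when a level has an odd number of entries. All function names are my own.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _padded(level):
    """Duplicate the last node if the level has an odd number of nodes."""
    return level + [level[-1]] if len(level) % 2 else level


def merkle_levels(leaves):
    """All levels of the tree, from hashed leaves up to the root."""
    levels = [[sha256d(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        level = _padded(levels[-1])
        levels.append([sha256d(level[i] + level[i + 1])
                       for i in range(0, len(level), 2)])
    return levels


def merkle_root(leaves) -> bytes:
    return merkle_levels(leaves)[-1][0]


def merkle_proof(leaves, index):
    """Sibling hashes on the path from leaf `index` up to the root."""
    proof = []
    for level in merkle_levels(leaves)[:-1]:
        level = _padded(level)
        sibling = index ^ 1                      # neighbour in the same pair
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_proof(leaf: bytes, proof, root: bytes) -> bool:
    node = sha256d(leaf)
    for sibling, sibling_is_left in proof:
        node = sha256d(sibling + node) if sibling_is_left else sha256d(node + sibling)
    return node == root


txs = [b"tx-a", b"tx-b", b"tx-c", b"tx-d", b"tx-e"]
root = merkle_root(txs)
proof = merkle_proof(txs, 2)
print(len(proof), verify_proof(b"tx-c", proof, root))
```

For five transactions the proof contains only three hashes. In general it grows with the logarithm of the number of transactions, which is what makes partial verification cheap.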

The usage of Merkle Trees to store data in the blockchain ensures that information can be securely validated in the most efficient way possible without the need of a central authority.

In this sense, we can finally explain why Bitcoin’s architecture with its Proof-of-Work represents the first example of a working real-world application that solves the famous Byzantine Generals problem.

If you are a Game Theory nerd like myself, you have surely heard of this long-standing challenge.

A formulation of the Byzantine Generals problem is offered below, and I’ll leave it to the reader to work out the parallels with our Bitcoin network protocol.

Several generals are besieging Byzantium. They have surrounded the city and now they must collectively decide when to attack. If all generals attack at the same time, they will win. If they attack at different times, they will lose. The generals have no secure communication channels with one another: any messages they send or receive may have been intercepted or deceptively sent by Byzantium’s defenders.

As per our explanation of the Bitcoin protocol, we now know that Proof-of-Work, combined with economic incentives, offers an elegant and practical solution to the Byzantine Generals problem in an open network that anyone can join.

A couple of loose ends

I hope that you have followed me in this walkthrough of the Bitcoin protocol.

With the information discussed in this article, you should now have a good enough understanding of how Bitcoin works.

You can certainly go into more detail on just about any of the concepts described here and, if you are interested in knowing more, I definitely invite you to keep your research going and/or to contact me to exchange ideas (or to point out any mistake I may have made!).

But before we wrap up, I would like to close a couple of loose ends that we have from our conversation above.

Why the 10-minute block?

We said that the mining of a new block happens at a (more or less) regular interval of about 10 minutes.

Where are these 10 minutes coming from?

Satoshi Nakamoto does not offer an explanation of this design choice in his paper. However, we can hypothesize that this decision was taken as a trade-off between two factors, Network Latency and Confirmation Time:

  1. Network latency – after a block is mined, it takes some time for other network members to hear about that block (any network suffers from a latency lag). During that time, other miners are still competing (ie mining) against that block. This leads to a waste of mining resources: several miners keep spending computational power on competing blocks, but we know that the network can eventually accept only one.
  2. Confirmation time – as previously mentioned, in order for blocks of transactions to be considered “final” in the network, we need to have 5 blocks appended to the current one. This process takes, of course, some time too.

Now, the Network Latency factor needs the block time to be long whereas the Confirmation Time factor needs the block time to be short.
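The tension can be put into rough numbers. In this sketch I assume that block discovery behaves like a Poisson process (a common modelling assumption, not something the protocol states), so the chance that a competing block is found while the first one is still propagating is 1 − e^(−latency / block time). The 10-second latency is a made-up figure for illustration only.

```python
import math


def stale_probability(latency_s: float, block_time_s: float) -> float:
    """Chance that another block is found during the propagation delay."""
    return 1.0 - math.exp(-latency_s / block_time_s)


def time_to_finality_s(block_time_s: float, confirmations: int = 6) -> float:
    """Average wait for the given number of confirmations."""
    return block_time_s * confirmations


LATENCY_S = 10.0  # hypothetical network propagation delay
for block_time in (10, 60, 600):
    print(block_time,
          round(stale_probability(LATENCY_S, block_time), 4),
          time_to_finality_s(block_time) / 60, "min")
```

Under these assumptions, a 10-second block time would waste well over half of the mining effort on competing blocks, while a 10-minute block time wastes under 2% at the price of a one-hour wait for 6 confirmations.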

Hence, a trade-off was chosen of ~10 minutes. We don’t have any formal communication from Satoshi Nakamoto on this aspect, however we can reasonably assume that he was able to back this decision with some mathematical computations.

Why the small block size?

At the beginning of this piece, I mentioned that each Bitcoin block is capped at 1 megabyte, but I didn’t really explain why.

The understanding of this design option is of great importance, because assigning a limit to the size of each block gives the Bitcoin network some constraints with regards to its scalability.

The on-chain transaction processing capacity of the bitcoin network is determined by the average block creation time of ~10 minutes and the block size limit of 1 megabyte.

Jointly, these conditions affect the network’s throughput.

Often, people who do not intimately understand the architecture of the Bitcoin network, its purpose and where its security comes from point out that existing payment processors, such as Visa, can already process ~1700 transactions per second. By contrast, due to the intentional limitations just mentioned, Bitcoin has a maximum transaction processing capacity of between 3.3 and 7 transactions per second.
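The range quoted above can be reproduced with back-of-the-envelope arithmetic. The average transaction sizes used here (250 and 500 bytes) are my assumption, chosen to show where figures like these can come from; actual transaction sizes vary.

```python
BLOCK_SIZE_BYTES = 1_000_000   # 1 megabyte block size limit
BLOCK_TIME_S = 600             # ~10 minutes between blocks


def max_tps(avg_tx_bytes: int) -> float:
    """Upper bound on transactions per second for a given average size."""
    tx_per_block = BLOCK_SIZE_BYTES / avg_tx_bytes
    return tx_per_block / BLOCK_TIME_S


print(round(max_tps(500), 1))  # 3.3
print(round(max_tps(250), 1))  # 6.7
```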

There are valid reasons why Satoshi Nakamoto kept the block size small:

  • Larger blocks make nodes more expensive to operate;
  • Because larger blocks make nodes more expensive to operate, fewer people would run one. This would lead to centralized entities having more power, diluting the decentralization proposition of the Bitcoin network and undermining its security;
  • Ultimately, no amount of max block size would support all the world’s future transactions on the main blockchain, therefore increasing the block size would not solve any actual problem. The most viable and effective techniques for scalability are layer-2 solutions, such as the popular Lightning Network.

In summary, Satoshi Nakamoto intentionally adopted a small block size to offer the possibility of running a node to the widest possible pool of people, hence enabling decentralization at scale and ensuring the security of the network.

The Bitcoin network has been running since 3rd January 2009 and, as of today, no attacker has managed to take over its consensus or rewrite its settled transaction history.

Conclusive thoughts

First and foremost, if you made it here, congratulations!

Your understanding of how Bitcoin works is now far superior to the one that most people on the planet have.

This article was intended as an introduction to the world’s fastest growing protocol in adoption: Bitcoin. Many details and technical nuances were omitted in this piece. I tried to capture the essence of the technology and some of the most interesting (at least to me) design choices made by Satoshi Nakamoto.

I am planning to write more about Bitcoin and other cryptocurrencies, so you may want to check back here occasionally or follow me on LinkedIn and Twitter.