technical backgrounder
Network Protocols
After the successful installation of client software, the software will connect to other nodes. The connection is established via the common network protocols (TCP/IP) and by default via port 8333. The node that wants to establish a connection sends an initial message (the version message) to a known internet protocol (IP) address of another node. The message contains information about the node itself and the local copy of the Bitcoin ledger, enabling the connection to be established. Optionally, the address list of the new peer can be queried (getaddr message) so that the node can extend its known network with additional connections.
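The version, verack, and getaddr messages all travel in the same kind of envelope. The following is a minimal sketch of that envelope, assuming the header layout from the public protocol documentation (4-byte network identifier, 12-byte command name, payload length, 4-byte checksum). None of these details are given in the text, and the helper name frame_message is purely illustrative.

```python
import hashlib
import struct

# Assumption: mainnet network identifier, taken from the public protocol documentation.
MAINNET_MAGIC = bytes.fromhex("f9beb4d9")


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def frame_message(command: str, payload: bytes = b"") -> bytes:
    """Wrap a payload in the 24-byte message header (illustrative helper)."""
    name = command.encode("ascii").ljust(12, b"\x00")  # command, zero-padded to 12 bytes
    length = struct.pack("<I", len(payload))           # payload length, little-endian
    checksum = double_sha256(payload)[:4]              # first 4 bytes of the double hash
    return MAINNET_MAGIC + name + length + checksum + payload


verack = frame_message("verack")    # verack carries no payload
getaddr = frame_message("getaddr")  # getaddr carries no payload either
print(verack.hex())
```

A version message would use the same framing, but its payload (protocol version, services, timestamp, and so on) is omitted here to keep the sketch short.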
New nodes face the initial problem of not having any known addresses. To enable the software to establish a connection when the client is started for the first time, an IP list is supplied with the initial download. Alternatively, IP addresses can be entered manually.
Typically, every node attempts to maintain at least eight connections. The actual number can diverge substantially from this value. With standard settings, nodes maintain on average thirty-two active connections. Firewall and router settings may limit the number of connections. In general, broader network connectivity facilitates the exchange of data.
If an interruption causes a node to have fewer than eight active connections, it will immediately attempt to establish new connections. For this purpose, known IP addresses can be used or new IP addresses requested from other nodes.
Let us assume that Tamara downloads and installs the Bitcoin client. After successful installation, the client consults the supplied IP list and establishes a connection with one of the other nodes.
In order to do this, the client sends a version message to the other node, which we will call Edith. Edith responds with a verack (version acknowledged) message.
Edith can accept the connection by responding with a version message and waiting for the verack confirmation. Tamara can then request Edith’s IP address list. This is achieved using a getaddr message. Edith will send a random selection of IP addresses from her large pool of known IP addresses to Tamara using several addr messages. The random selection process is termed bootstrapping.
In many cases, Edith sends IP addresses that are present in her pool but that she is not currently connected with. This leads to a more robust network topology. Tamara receives the IP addresses of Michèle and Jake. She can use the new IP addresses to send further version messages and thus establish new connections.
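The exchange between Tamara and Edith can be mimicked with a toy model. This is only a sketch: no networking takes place, the connect method stands in for the version/verack handshake, and the class name Node, the parameter max_addresses, and the extra peer "carol" are assumptions made for illustration.

```python
import random


class Node:
    """Toy node that only tracks addresses; no networking is performed."""

    def __init__(self, address, known=()):
        self.address = address
        self.known = set(known)   # pool of known peer addresses
        self.connected = set()    # currently connected peers

    def handle_getaddr(self, max_addresses=3, rng=random):
        """Answer a getaddr request with a random sample from the pool."""
        pool = sorted(self.known)
        return rng.sample(pool, min(max_addresses, len(pool)))

    def connect(self, peer):
        """Stand-in for the version/verack handshake."""
        self.connected.add(peer.address)
        peer.connected.add(self.address)
        self.known.add(peer.address)
        peer.known.add(self.address)


edith = Node("edith", known={"michele", "jake", "carol"})
tamara = Node("tamara", known={"edith"})

tamara.connect(edith)
learned = edith.handle_getaddr(max_addresses=2, rng=random.Random(1))
# Tamara ignores her own address if Edith happens to send it back.
tamara.known.update(a for a in learned if a != tamara.address)
print(sorted(tamara.known))
```

Because the sample is random, repeated requests to different peers gradually fill Tamara's address pool with nodes that are unrelated to each other.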
In the real Bitcoin network, bootstrapping creates quasi-random connection paths. These lead to a complex and randomized network topology. Local proximity is irrelevant for the choice of connections. As a result, the random creation of network partitions is practically impossible. Moreover, if a partition is created, it can be detected by a sudden drop in network computing resources (more precisely, in the rate at which new blocks are created) and in the number of transaction messages.
Partitions generally only become a problem if complete isolation of a certain (geographical) area persists over an extended period of time. This is a manageable risk, considering that any communication between two subnetworks can resolve partitions.
Geographical characteristics of the node distribution can therefore be of interest due to geopolitical considerations and to protect against (natural) disasters.
The Bitcoin system offers each network participant the possibility of independently verifying the legitimacy of all transactions included in the Bitcoin blockchain. If a network participant waives this option, he automatically loses part of his independence and must place a certain amount of trust in his information sources.
The exact structure of these dependencies and the extent of the trust vary greatly. Indirect network participation can take the form of centralized subnetworks or simplified payment verification (SPV).
Centralized Subnetworks
Centralized subnetworks display the highest form of dependency. The participants are only indirectly connected to the Bitcoin network and rely exclusively on the information and communications channel of a specific node.
Clients that are connected to a centralized subnetwork can exercise the wallet function without the need for direct access to the Bitcoin network. The central node is used as a proxy server, which can be consulted periodically to check the Bitcoin balances of the user’s addresses. In addition, transaction messages are transmitted to the central node and thus indirectly relayed to the Bitcoin network.
A connection to a centralized subnetwork can be much more convenient for a user since he has only to install a light client or to manage his Bitcoin balances via a web application. The resulting dependencies are hardly noticeable under normal operations. However, it would be possible for a central node to either withhold certain information from the participants or not relay their transactions to the rest of the network and thus block them. This can be done intentionally or can happen as a consequence of technical issues. In this respect, centralized subnetworks lose a large part of the robustness properties of a peer-to-peer network and introduce new vulnerabilities into the system.
In many cases, centralized subnetworks are also accompanied by custody services. In such relationships, the owner transfers complete control of his Bitcoin units to the central node. He does not hold a private key for the corresponding balance but only has a user account on the service provider’s platform, with which he can request the delivery of his Bitcoin units. The actual Bitcoin transaction is initiated by the central node. In such a relationship, the user only gets an IOU promising to deliver the Bitcoin units on request. This is comparable to credit money, for which the value of the promise depends on the creditworthiness of the issuer.
Simplified Payment Verification (SPV) Node
Simplified Payment Verification (SPV) clients facilitate the use of the wallet function without it being necessary to store a full copy of the Bitcoin blockchain locally. As opposed to indirect network participants who are tied to a central node, SPV nodes possess direct access to the Bitcoin network. The required data are sourced by various nodes and can be partially verified.
The diversity of data sources and the possibility of partially verifying the received data give the SPV node greater security and independence than a connection to a centralized subnetwork.
An SPV node holds only a small part of each block: the so-called block header. Among other things, the block header includes an identification number that depends on the transactions included, but it does not include the transactions themselves. For this reason, SPV clients require only around a thousandth of the storage capacity of full nodes. An SPV client needs to store only eighty bytes per block. More importantly, this amount remains unchanged regardless of the number of transactions included, resulting in a linear growth path even with a large increase in users and transactions.
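The eighty-byte figure can be checked with a little arithmetic. The sketch below assumes the six-field header layout from the public protocol documentation (the field list is not given in the text) and a hypothetical chain length of 600,000 blocks.

```python
import struct

# Assumed header fields: version, previous block hash, Merkle root,
# timestamp, difficulty bits, nonce (all fixed-width, no padding).
HEADER_FORMAT = "<L32s32sLLL"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def header_storage_bytes(block_count: int) -> int:
    """Storage needed to keep one header per block."""
    return block_count * HEADER_SIZE


blocks = 600_000  # hypothetical chain length
print(HEADER_SIZE)                         # 80
print(header_storage_bytes(blocks) / 1e6)  # 48.0 (megabytes)
```

The result depends only on the number of blocks, not on how many transactions each block contains, which is the linear growth path described above.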
Full nodes use the block height to verify a transaction. To ensure that the Bitcoin unit (unspent transaction output) referenced in a transaction has not already been used, the full nodes check the complete Bitcoin blockchain. SPV nodes instead use a heuristic based on the block depth, that is, the number of confirmations that secure a transaction. If the block is referenced by a certain number of additional blocks (usually six), SPV nodes regard the transactions contained in it as valid. Due to the high computational resources required to create these subsequent blocks and the various sources used to obtain the information, the probability of a manipulation attempt is very low.
SPV nodes source information by selectively querying individual transactions. This creates two problems.
First, SPV nodes can verify whether a received transaction actually belongs to a block; however, they do not know whether they are being denied information or whether another, possibly competing transaction exists.
Second, information gathering can lead to privacy problems. If an SPV node asks only for transactions in connection with its own public keys (or Bitcoin addresses), the other nodes will be able to connect these pseudonyms to its IP address and create a distinct user profile. As a countermeasure, the SPV node could request a large amount of additional data. However, the large volume of data would undermine the original purpose for implementing the SPV client.
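The first point, that an SPV node can check whether a transaction belongs to a block, rests on the identification number in the block header that depends on the included transactions. The sketch below assumes that this value is a Merkle root built with double SHA-256, as in the public protocol documentation. The function names and the five dummy transaction hashes are illustrative only.

```python
import hashlib


def dsha(data: bytes) -> bytes:
    """Double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _next_level(level):
    """Hash neighbouring pairs; an odd last element is paired with itself."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [dsha(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Fold a non-empty list of transaction hashes into a single root."""
    level = list(leaves)
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_branch(leaves, index):
    """Collect the sibling hashes a full node would send for leaves[index]."""
    branch, level = [], list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        branch.append((level[sibling], sibling < index))  # (hash, sibling is on the left)
        level = _next_level(level)
        index //= 2
    return branch


def verify_branch(tx_hash, branch, root):
    """Recompute the root from one transaction hash and its branch."""
    current = tx_hash
    for sibling, sibling_is_left in branch:
        current = dsha(sibling + current) if sibling_is_left else dsha(current + sibling)
    return current == root


txs = [dsha(bytes([i])) for i in range(5)]  # five hypothetical transaction hashes
root = merkle_root(txs)
proof = merkle_branch(txs, 3)
print(verify_branch(txs[3], proof, root))   # True
```

The check proves inclusion only. It cannot reveal whether a peer is withholding other transactions, which is exactly the limitation described in the first point.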
To counteract the second problem, transactions are usually queried via so-called bloom filters. Bloom filters specify a search request using hash functions. The SPV node sends a request for transactions that match a certain search pattern after applying various hash functions. The precision can vary according to requirements.
There is still a trade-off between privacy and data volume. False positive results are possible or even desirable because of the probabilistic nature of the system. False negative results are not possible. If a transaction is rejected by the filter, it is irrelevant for the SPV client.
Bloom filters serve the purpose of disguising the search queries by SPV nodes. Due to the nature of the hash function, it is much harder to identify the pattern behind the query. The idea originated with an academic essay by Bloom and was formalized by BIP0037 for the Bitcoin system.
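A minimal Bloom filter shows why false positives can occur while false negatives cannot. This is a sketch, not the BIP 37 construction: salted SHA-256 is swapped in for the hash functions that BIP 37 specifies, and the class name, parameters, and sample addresses are assumptions.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter with the bit array stored in a single integer."""

    def __init__(self, size_bits: int, num_hashes: int):
        self.size = size_bits
        self.num_hashes = num_hashes  # must stay below 256 in this sketch
        self.bits = 0

    def _positions(self, item: bytes):
        # Assumption: salted SHA-256 stands in for the filter's hash functions.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: bytes) -> bool:
        """True for every added item; occasionally True for other items."""
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


wallet_filter = BloomFilter(size_bits=256, num_hashes=3)
wallet_filter.add(b"hypothetical-address-1")
wallet_filter.add(b"hypothetical-address-2")
print(wallet_filter.might_contain(b"hypothetical-address-1"))  # True
```

A smaller size_bits value makes unrelated items match more often, which increases the data volume but hides the wallet's real addresses better. This is the trade-off between privacy and data volume.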
The Exchange of Blocks
When the client software is first started, it spends several hours downloading, verifying, and indexing all the blocks of the Bitcoin blockchain. The first block, the genesis block, is integrated into the client software on delivery. All subsequent blocks have to be procured from the other nodes and verified by the client software. The volume of data contained in the Bitcoin blockchain was approximately 205 gigabytes (GB) at the end of 2019.
Each block needs to be downloaded and verified only once. Long loading times occur only if the client has to catch up on a large number of blocks; that is, during the initial installation of the client software or if the node was not connected to the network for a long time.
The comparison between two copies of the Bitcoin blockchain takes place via the mutual exchange of getblocks messages. These messages contain the identification number of the newest block in the local chain. If the two chains are equivalent, no blocks need to be exchanged. However, if one of the two nodes receives a getblocks message with an identification number that does not correspond to the last block of the local chain, it will try to locate the block with this identification number within the local chain and send an inv (inventory) message with the identification numbers of the successors of this block. The node that receives the inv message, then, has the possibility to request the respective blocks using getdata messages.
This principle is used to prevent a node from receiving block data that it already has. Each node can independently decide which data it wants to request from which nodes.
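The comparison of two chain copies can be sketched with plain lists of block identification numbers. The model is deliberately simplified: it follows the description above, in which a getblocks message carries only the newest local block, and it treats getdata as a direct append. The function names, the max_items limit, and the block labels are assumptions.

```python
def handle_getblocks(local_chain, peer_tip, max_items=500):
    """Build the inv list: identifiers of the blocks that follow peer_tip locally."""
    if peer_tip not in local_chain:
        return []  # unknown block: this sketch simply offers nothing
    start = local_chain.index(peer_tip) + 1
    return local_chain[start:start + max_items]


def sync(local_chain, remote_chain):
    """Bring local_chain up to date with remote_chain via inv and getdata."""
    inv = handle_getblocks(remote_chain, local_chain[-1])
    for block_id in inv:
        # A real node would request each block with getdata and verify it first.
        local_chain.append(block_id)
    return inv


edith_chain = ["b0", "b1", "b2", "b3", "b4", "b5"]
tamara_chain = ["b0", "b1", "b2"]

print(sync(tamara_chain, edith_chain))  # ['b3', 'b4', 'b5']
print(sync(tamara_chain, edith_chain))  # []
```

The second call returns an empty list because both chains are now equivalent, so no block data is transferred twice.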
When a node receives a block, it independently examines the validity of the transactions that the block contains and verifies that the transactions reference only previously unspent transaction outputs (UTXO) and were initiated by the owner.
The node also checks the reference to the previous block and examines the current block’s identification number to determine whether it meets the threshold value criterion. Each node can thus clearly determine whether a block fulfills the various consensus conditions. If and only if all checks are passed, the node will include the block in its version of the blockchain.
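Two of these checks, the reference to the preceding block and the threshold value criterion, can be sketched on the block header alone. The header layout, the compact encoding of the threshold in the bits field, and the genesis block field values are taken from public documentation and are assumptions relative to the text. Transaction validation is left out entirely.

```python
import hashlib
import struct

HEADER_FORMAT = "<L32s32sLLL"  # version, previous hash, Merkle root, time, bits, nonce


def block_id(header: bytes) -> int:
    """Double SHA-256 of the 80-byte header, read as a little-endian integer."""
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return int.from_bytes(digest, "little")


def bits_to_target(bits: int) -> int:
    """Expand the compact bits field into the full threshold value."""
    exponent, mantissa = bits >> 24, bits & 0xFFFFFF
    return mantissa * 256 ** (exponent - 3)


def passes_header_checks(header: bytes, expected_prev: bytes) -> bool:
    """Check the link to the previous block and the threshold criterion."""
    _version, prev, _root, _time, bits, _nonce = struct.unpack(HEADER_FORMAT, header)
    return prev == expected_prev and block_id(header) <= bits_to_target(bits)


# Genesis block header fields as publicly documented (assumed, not from the text).
genesis = struct.pack(
    HEADER_FORMAT,
    1,
    bytes(32),
    bytes.fromhex("4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b")[::-1],
    1231006505,
    0x1D00FFFF,
    2083236893,
)
print(passes_header_checks(genesis, bytes(32)))
print(f"{block_id(genesis):064x}")
```

Changing any header field, for example the nonce, yields a completely different identification number, which will almost certainly fail the threshold test.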
The Exchange of Transactions
Transaction messages are payment orders that nodes can verify, relay, and process. The method for exchanging transaction messages is very similar to that used for blocks. The inv messages can alternatively include transaction identification numbers. If a node receives an inv message that contains an unknown transaction identification number, the node can similarly use a getdata message to request the transaction.
The actual transmission of the transaction data is subsequently made using a tx message. If a node receives a requested tx message, it will first examine it and forward it only if the validation is successful. The validation is performed using predefined unlocking conditions and signatures. If validation fails, the transaction will be discarded. This protects the network from certain types of DoS attacks, which attempt to bring data transmission to a halt by flooding the service with a large number of invalid transactions. However, if the validation is successful, the transaction message will be filed in the node’s local memory, the so-called mempool, and offered to other nodes as part of an inv message.
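The relay rule can be summarized in a few lines. This is a toy model: the class name RelayNode is hypothetical, and the is_valid callback stands in for the real checks of unlocking conditions and signatures.

```python
class RelayNode:
    """Toy relay logic; is_valid stands in for script and signature checks."""

    def __init__(self, is_valid):
        self.is_valid = is_valid
        self.mempool = {}  # transaction id -> raw transaction

    def on_inv(self, txids):
        """Return the transaction ids worth requesting with getdata."""
        return [txid for txid in txids if txid not in self.mempool]

    def on_tx(self, txid, raw_tx):
        """Validate a received tx; return the ids to announce to other peers."""
        if txid in self.mempool or not self.is_valid(raw_tx):
            return []  # duplicates and invalid transactions are not relayed
        self.mempool[txid] = raw_tx
        return [txid]  # offered onward in an inv message


node = RelayNode(is_valid=lambda raw: raw.startswith(b"ok"))
print(node.on_inv(["a", "b"]))   # ['a', 'b']
print(node.on_tx("a", b"ok-1"))  # ['a']
print(node.on_tx("b", b"bad"))   # []
print(node.on_inv(["a", "b"]))   # ['b']
```

Invalid transactions never enter the mempool and are never announced, so a flood of invalid data stops at the first honest node that receives it.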
Pseudonyms
Owing to the decentralized structure of the Bitcoin network, it is not possible to manage Bitcoin balances and access rights in a traditional manner. There is no central authority that is responsible for opening accounts, recording owners’ personal details, and authorizing subsequent access. Therefore, decentralization makes it extremely difficult to examine the legitimacy of ownership claims.
The use of real identities in the form of personal names and personal details is neither feasible nor desirable in the Bitcoin system. It is not desirable because if Bitcoin addresses were registered under personal names, it would be possible to associate all transactions to individuals. Information about salary payments, purchase preferences, and personal wealth would be accessible to everyone. It is not feasible because in a decentralized system it is impossible to provide proof of identity in the same way as in the traditional financial system. The Bitcoin system uses pseudonyms instead of actual identities to guarantee the legitimacy of transactions. A pseudonym-based solution in a decentralized system requires that the following conditions are met:
- Participants must be able to create their own pseudonyms without the assistance of a central party.
- No two pseudonyms may overlap.
- Ownership claims to the pseudonyms must be publicly verifiable so that access to the respective Bitcoin balances can be restricted.
Bitcoin satisfies these conditions by using pairs of cryptographic keys. A pair consists of a private and a public key. The public key (or the Bitcoin address derived from it) acts as a pseudonym that represents the identity of the respective participant but cannot be easily linked to a person (point 1). In practice, the number of pseudonyms is so large that the probability of two persons choosing the same pseudonym is negligible (point 2). The private key must always remain in the exclusive possession of the person who generated the pseudonym and thereby provides proof that the owner of the respective pseudonym is authorized to use it (point 3).
Generating a Key Pair
To create a key pair, a person must select at random an element from an unimaginably large set of numbers, which ranges from 1 to 115,792,089,237,316,195,423,570,985,008,687,907,852,837,564,279,074,904,382,605,163,141,518,161,494,336; that is, between 1 and a seventy-eight-digit number. The selected number serves as a private key and can be subsequently used to provide proof of ownership.
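Selecting such a number is a one-liner as long as a cryptographically secure random source is used. The sketch below assumes Python's secrets module as that source. The constant N is the order of the base point of the curve used by Bitcoin (secp256k1), so the upper bound quoted above corresponds to N - 1.

```python
import secrets

# Order of the secp256k1 base point; valid private keys lie in [1, N - 1].
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def generate_private_key() -> int:
    """Draw a uniformly random integer from 1 to N - 1."""
    return secrets.randbelow(N - 1) + 1


k_prv = generate_private_key()
print(len(str(N - 1)))  # 78
```

An ordinary pseudo-random generator such as the random module is not suitable here, because its output can be predicted.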
The public key is derived from the private key. It is derived by multiplying a commonly known base point $G$ of the elliptic curve by the previously selected private key $k_{prv}$. For this reason, the public key is a point $K_{pub}$ on an elliptic curve that is represented by an x and a y value. The multiplication is shown in the formula below:
$K_{pub} = k_{prv} \circ G$.
It is crucial that multiplications based on elliptic curves cannot be inverted. Otherwise, every person who knows the pseudonym could then derive the corresponding right of access in the form of the private key.
Owing to the one-way function, people are able to disclose their public key as a pseudonym while at the same time retaining exclusive knowledge of their private key.
A person can choose a private key, derive a pseudonym from it, and receive a Bitcoin payment on behalf of the pseudonym. Since the person is in exclusive possession of the private key, it can be used to prove ownership of the associated pseudonym and all of its assets.
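The multiplication of the base point G by a private key can be sketched in pure Python with the double-and-add method. The curve parameters are the public secp256k1 constants, which are not listed in the text, and the small private key used at the end is hypothetical. The code is meant for illustration only: it is not constant-time and should not be used to handle real keys.

```python
# Public secp256k1 parameters: curve y^2 = x^3 + 7 over the field of integers mod P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its mirror image
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P      # tangent (doubling)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P     # chord (addition)
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point=G):
    """Double-and-add: compute k times point."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def on_curve(point):
    x, y = point
    return (y * y - x * x * x - 7) % P == 0


k_prv = 12345                 # hypothetical private key, far too small for real use
K_pub = scalar_mult(k_prv)    # public key as an (x, y) point
print(on_curve(K_pub))        # True
```

Computing K_pub takes a few hundred point operations. Recovering k_prv from K_pub has no comparably fast method, which is the one-way property the text relies on.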
The most common pseudonyms are Bitcoin addresses. To derive a Bitcoin address from a public key, we need a few additional steps as shown in figure 4.1. The Bitcoin address is nothing more than a hash value (see section 4.2) of the public key. For now, we will consider public keys and Bitcoin addresses as equivalents. We will look at some advantages of Bitcoin addresses later and differentiate them from the public key.
To further describe pseudonyms and access rights, we will follow Tamara, who just joined the Bitcoin network in chapter 3. Tamara now needs a pseudonym, which she will use to receive Bitcoin units. The following steps will be executed by her wallet software.
First, a random number, $k_{prv}$, is chosen as the private key:
$k_{prv}$ = 100649517912463298218554941963735551419990... 919394775808943667076258561523410426.
From $k_{prv}$, the software derives the corresponding public key by means of multiplication on elliptic curves (see section 4.3.2). Tamara obtains the point with the following coordinates as her public key:
$x_{K_{pub}}$ = 430861088190638454717842912988288069473526... 45388418363213743744756576526107326,
$y_{K_{pub}}$ = 746045400873459552096268383348084222597854... 86813648239447613724663528494663884.
Representation of the Keys
Bitcoin introduced another format: the so-called Base58Check. This format uses base 58, which consists of the integers 1–9 as well as all uppercase and lowercase alphabetical characters, except for O (uppercase o), l (lowercase l), and I (uppercase i). The large base permits the information to be written very compactly. At the same time, this format avoids alphanumeric symbols that can be confused with others when being transcribed. As a further measure against transmission errors, the format contains a checksum with which some typos can be recognized.
Base58Check is used to display private keys and some pseudonyms, where a prefix identifies the type of data. If the sequence starts with a 1 or a 3, it is a pseudonym. Addresses that start with bc1 are also pseudonyms, but they use a separate encoding (Bech32) rather than Base58Check. The prefixes 5, K, and L refer to a private key. Private keys in Base58Check are also called wallet import format (WIF) keys.
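The encoding can be sketched as follows. The sketch assumes that the checksum consists of the first four bytes of a double SHA-256 hash and that each leading zero byte is written as the character 1. Both details come from public documentation rather than from the text, and the function names are illustrative.

```python
import hashlib

# Digits 1-9, then upper- and lowercase letters without O, I, and l.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def checksum(data: bytes) -> bytes:
    """First four bytes of the double SHA-256 hash (assumed checksum rule)."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]


def b58check_encode(version: bytes, payload: bytes) -> str:
    data = version + payload
    data += checksum(data)
    number = int.from_bytes(data, "big")
    encoded = ""
    while number:
        number, rem = divmod(number, 58)
        encoded = ALPHABET[rem] + encoded
    zeros = len(data) - len(data.lstrip(b"\x00"))  # leading zero bytes become '1'
    return "1" * zeros + encoded


def b58check_decode(text: str) -> bytes:
    """Return version byte(s) plus payload; raise ValueError on a typo."""
    number = 0
    for char in text:
        number = number * 58 + ALPHABET.index(char)  # fails on 0, O, I, and l
    zeros = len(text) - len(text.lstrip("1"))
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    data = b"\x00" * zeros + body
    if len(data) < 4 or checksum(data[:-4]) != data[-4:]:
        raise ValueError("checksum mismatch")
    return data[:-4]


encoded = b58check_encode(b"\x00", bytes(range(1, 21)))  # hypothetical 20-byte payload
print(encoded)
print(b58check_decode(encoded).hex())
```

If a single character of the encoded string is changed, decoding fails with very high probability, which is how typos are recognized.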
Public keys are usually presented in hexadecimal. This notation, which employs base 16, uses the integers 0–9 and the characters a–f. A single character in hexadecimal represents exactly four bits in binary ($2^4 = 16^1$).
When Tamara’s public key coordinates from section 4.1.1 are converted into the hexadecimal system, they respectively produce the following two alphanumeric sequences:
$x_{K_{pub}}$ = 5f41df966899767381592461911e12789393736b29... 0a5d4beda3ba573d5582be,
$y_{K_{pub}}$ = a4f0ac5d9ca56b776db9f10895303efc8450892e0f... 8bd99db228dbd1206f08cc.
To be presented in a single alphanumeric sequence, the coordinates are concatenated and supplemented by the prefix 04. This representation is called the uncompressed public key $K_{pub}$:
$K_{pub} = 04 \,\|\, x_{K_{pub}} \,\|\, y_{K_{pub}}$ = 045f41df966899767381592461911e12789393736b... 290a5d4beda3ba573d5582bea4f0ac5d9ca56b776d... b9f10895303efc8450892e0f8bd99db228dbd1206f... 08cc
Because the public key corresponds to a point on a predefined elliptic curve, the x-coordinate is sufficient to compute the corresponding y-coordinate. More precisely, given any value for x, there are no more than two potential candidates for y. This is due to the symmetry of the elliptic curves (see section 4.3.2). To obtain a unique point, the x value is extended by a prefix. The prefix is 02 if the y value of the public key is even and 03 if the y value is odd. In Tamara’s case the prefix 02 is used. In what follows, this representation is referred to as the compressed public key.
Compressed $K_{pub}$ = 025f41df966899767381592461911e12789393736b... 290a5d4beda3ba573d5582be.
The compressed public key has the great advantage that it is shorter. For most transactions, public keys have to be included in the transaction at some point and therefore become part of the Bitcoin blockchain. A shorter key length reduces the required storage space.
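Compression and its reversal can be sketched in a few lines. The coordinates below are Tamara's hexadecimal values, under the assumption that the ellipses in the printed values are line-continuation marks rather than omissions (each joined string has exactly 64 hexadecimal characters). The field prime P and the square-root shortcut are public properties of the curve used by Bitcoin, not statements from the text.

```python
P = 2**256 - 2**32 - 977  # field prime of secp256k1 (public constant)


def compress(x: int, y: int) -> str:
    """Prefix 02 for an even y value, 03 for an odd one, followed by x."""
    prefix = "02" if y % 2 == 0 else "03"
    return prefix + f"{x:064x}"


def decompress(key_hex: str):
    """Recover (x, y) from a compressed key by solving y^2 = x^3 + 7."""
    prefix, x = key_hex[:2], int(key_hex[2:], 16)
    y = pow((pow(x, 3, P) + 7) % P, (P + 1) // 4, P)  # valid because P % 4 == 3
    if (y % 2 == 0) != (prefix == "02"):
        y = P - y                                     # pick the other candidate
    return x, y


# Tamara's coordinates, joined across the assumed line-continuation marks.
x = 0x5F41DF966899767381592461911E12789393736B290A5D4BEDA3BA573D5582BE
y = 0xA4F0AC5D9CA56B776DB9F10895303EFC8450892E0F8BD99DB228DBD1206F08CC

print(compress(x, y))
```

The compressed form has 66 hexadecimal characters instead of 130, which is where the storage saving comes from.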
Bitcoin Addresses
The most common pseudonym is the Bitcoin address. It is derived from a public key to which two hash functions are applied one after the other. The double hash is a one-way function; in other words, the public key cannot be derived from the Bitcoin address.
A Bitcoin address has a length of 160 bits but is generally presented in Base58Check format with the prefix 1.
SHA256 and RIPEMD160 are the names of the two hash functions. Bitcoin addresses are also referred to as public-key-hashes. The main advantages of the Bitcoin address over the public key are its convenience, security, and flexibility.
First, a Bitcoin address is significantly shorter than the public key and therefore better suited for daily use. It contains a checksum due to the Base58Check encoding.
Second, the Bitcoin address offers some added security. Even if an attacker were to discover a vulnerability in the elliptic curve, he would only be able to start an attack once he had obtained the public key of the target. The Bitcoin address ensures that the public key has to be disclosed only at the time of a transaction. This makes Bitcoin addresses much more robust against the threat of quantum computers.
Third, Bitcoin addresses can also be constructed as so-called pay-to-script-hash addresses. These addresses are not derived from the hash value of a public key and are therefore not classic Bitcoin addresses. Instead, they are based on the hash value of a whole locking script that binds the access right to a specific condition. This allows unusual pseudonym constructs to be created, which for example require that payments must be signed by several private keys or can only be spent after a certain amount of time. These pay-to-script-hash addresses always begin with the number 3.
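The derivation of both address types can be sketched as follows. Several assumptions go beyond the text: the version bytes 0x00 and 0x05 for the prefixes 1 and 3, the checksum rule of Base58Check, and the availability of RIPEMD-160 in Python's hashlib, which depends on the underlying OpenSSL build. The input values at the end are hypothetical placeholders, not real keys or scripts.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def hash160(data: bytes) -> bytes:
    """SHA-256 followed by RIPEMD-160, giving a 160-bit value."""
    sha = hashlib.sha256(data).digest()
    # Raises ValueError if the local OpenSSL build does not offer ripemd160.
    return hashlib.new("ripemd160", sha).digest()


def base58check(version: bytes, payload: bytes) -> str:
    data = version + payload
    data += hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    number = int.from_bytes(data, "big")
    encoded = ""
    while number:
        number, rem = divmod(number, 58)
        encoded = ALPHABET[rem] + encoded
    zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * zeros + encoded


def public_key_hash_address(public_key: bytes) -> str:
    """Classic address: hash of a public key, assumed version byte 0x00."""
    return base58check(b"\x00", hash160(public_key))


def script_hash_address(locking_script: bytes) -> str:
    """Pay-to-script-hash address: hash of a script, assumed version byte 0x05."""
    return base58check(b"\x05", hash160(locking_script))


if __name__ == "__main__":
    placeholder_key = bytes.fromhex("02" + "11" * 32)  # hypothetical compressed key
    print(public_key_hash_address(placeholder_key))    # starts with 1
    print(script_hash_address(b"\x51"))                # starts with 3
```

Because the compressed and uncompressed forms of a public key are different byte strings, feeding them into this derivation produces two different addresses.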
Compressed Keys and Bitcoin Addresses
Although the compressed and the uncompressed public keys represent the exact same point on the elliptic curve and derive from the same private key, they lead to different Bitcoin addresses. If Tamara generates her Bitcoin address from the uncompressed public key, she obtains the first address below. If she uses the compressed public key instead, she obtains the second.
From the uncompressed public key: 1E8jc2eRXmjF2FKebTZwAsxwaRWeDvEwDj
From the compressed public key: 13HE523Wvpqzjijjb1z3NDUz25AQN2eLw1
Tamara can use her private key to access both of these Bitcoin addresses. However, her private key can be represented in two distinct ways. Private keys with the prefix 5 manage Bitcoin addresses generated from the uncompressed public key. Private keys with the prefix K or L are used to manage Bitcoin addresses generated from the compressed public key (see below). The prefix facilitates in particular the import of the keys into wallet software. The information in the prefix tells the software which pseudonyms it needs to check for balances and therefore significantly increases efficiency. The “compressed” WIF version of Tamara’s private key $k_{prv}$ corresponds to the following string:
$k_{prv}$ = L4gGHffx1goCCfDCpGAdZYmjKPgNk1mBnT2dPakUkRWjEec7ArQY.
Strictly speaking, however, the term “compressed” private key is incorrect. It is not a compressed version of the information but merely a signal that indicates which pseudonyms to use. In fact, the “compressed” private key is even 8 bits or 2 hexadecimal characters longer than the uncompressed format. The length of the private key is not very crucial because it is never transmitted with transactions and therefore will not burden the blockchain.
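The two representations differ by a single marker byte, as the following sketch shows. It assumes the version byte 0x80 for private keys and the marker byte 0x01 for the compressed variant, both taken from public documentation. The key value 1 is a hypothetical example.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check(data: bytes) -> str:
    """Append a 4-byte double SHA-256 checksum and encode in base 58."""
    data += hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    number = int.from_bytes(data, "big")
    encoded = ""
    while number:
        number, rem = divmod(number, 58)
        encoded = ALPHABET[rem] + encoded
    zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * zeros + encoded


def to_wif(k_prv: int, compressed: bool) -> str:
    body = b"\x80" + k_prv.to_bytes(32, "big")  # assumed version byte for private keys
    if compressed:
        body += b"\x01"                         # marker: use the compressed public key
    return base58check(body)


k_prv = 1  # hypothetical private key
print(to_wif(k_prv, compressed=False))  # 51 characters, starts with 5
print(to_wif(k_prv, compressed=True))   # 52 characters, starts with K or L
```

The extra byte is the 8 bits mentioned above, and it is the reason the "compressed" string is one character longer.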
Disposable Pseudonyms
All pseudonyms in the Bitcoin system are designed to be used only once. This may not seem intuitive (compared with bank account numbers), but it is based on the fact that the Bitcoin blockchain is public. If a person always uses the same pseudonym, it will be easier for others to identify patterns in transactions and to track down the corresponding identity to the pseudonym. If the identification succeeds, all past and future transactions of the person concerned could be queried and monitored.
To make such analyses more difficult, most wallets create a new key pair for each transaction and always use different Bitcoin addresses. When a payment is made, the wallet sends the invoiced amount to the invoice issuer’s address and additionally generates new addresses to which the remaining balance is transferred. Observers cannot distinguish between the invoice amount and the change and are equally unable to discover which pseudonym is retained by the owner.
The constant need for new addresses poses questions regarding the selection process and the organization of the access data. New pseudonyms can either be generated from random, independent private keys or be based on an initial value, a so-called seed.