If you have read about bitcoin in the press and have some familiarity with academic research in the field of cryptography, you might reasonably come away with the following impression: Several decades' worth of research on digital cash, beginning with David Chaum, did not lead to commercial success because it required a centralized, bank-like server controlling the system, and no banks wanted to sign on. Along came bitcoin, a radically different proposal for a decentralized cryptocurrency that did not need the banks, and digital cash finally succeeded. Its inventor, the mysterious Satoshi Nakamoto, was an academic outsider, and bitcoin bears no resemblance to earlier academic proposals.
This article challenges that view by showing nearly all of the technical components of bitcoin originated in the academic literature of the 1980s and 1990s . This is not to diminish Nakamoto's achievement but to point out he stood on the shoulders of giants. Indeed, by tracing the origins of the ideas in bitcoin, we can zero in on Nakamoto's true leap of insight—the specific, complex way in which the underlying components are put together. This helps explain why bitcoin took so long to be invented. Readers already familiar with how bitcoin works may gain a deeper understanding from this historical presentation. Bitcoin's intellectual history also serves as a case study demonstrating the relationships among academia, outside researchers, and practitioners, and offers lessons on how these groups can benefit from one another.
The Ledger
If you have a secure ledger, the process to leverage it into a digital payment system is straightforward. For example, if Alice sends Bob $100 by PayPal, then PayPal debits $100 from Alice's account and credits $100 to Bob's account. This is also roughly what happens in traditional banking, although the absence of a single ledger shared between banks complicates things.
This idea of a ledger is the starting point for understanding bitcoin. It is a place to record all transactions that happen in the system, and it is open to and trusted by all system participants. Bitcoin converts this system for recording payments into a currency. Whereas in banking, an account balance represents cash that can be demanded from the bank, what does a unit of bitcoin represent? For now, assume that what is being transacted holds value inherently.
How can you build a ledger for use in an environment like the Internet where participants may not trust each other? Let's start with the easy part: the choice of data structure. There are a few desirable properties. The ledger should be immutable or, more precisely, append only: you should be able to add new transactions but not remove, modify, or reorder existing ones. There should also be a way to obtain a succinct cryptographic digest of the state of the ledger at any time. A digest is a short string that makes it possible to avoid storing the entire ledger, knowing that if the ledger were tampered with in any way, the resulting digest would change, and thus the tampering would be detected. The reason for these properties is that unlike a regular data structure that is stored on a single machine, the ledger is a global data structure collectively maintained by a mutually untrusting set of participants. This contrasts with another approach to decentralizing digital ledgers,7,13,21 in which many participants maintain local ledgers and it is up to the user querying this set of ledgers to resolve any conflicts.
Linked timestamping. Bitcoin's ledger data structure is borrowed, with minimal modifications, from a series of papers by Stuart Haber and Scott Stornetta written between 1990 and 1997 (their 1991 paper had another co-author, Dave Bayer).5,22,23 We know this because Nakamoto says so in his bitcoin white paper.34 Haber and Stornetta's work addressed the problem of document timestamping—they aimed to build a "digital notary" service. For patents, business contracts, and other documents, one may want to establish that the document was created at a certain point in time, and no later. Their notion of document is quite general and could be any type of data. They do mention, in passing, financial transactions as a potential application, but it was not their focus.
In a simplified version of Haber and Stornetta's proposal, documents are constantly being created and broadcast. The creator of each document asserts a time of creation and signs the document, its timestamp, and the previously broadcast document. This previous document has signed its own predecessor, so the documents form a long chain with pointers backwards in time. An outside user cannot alter a timestamped message since it is signed by the creator, and the creator cannot alter the message without also altering the entire chain of messages that follows. Thus, if you are given a single item in the chain by a trusted source (for example, another user or a specialized timestamping service), the entire chain up to that point is locked in, immutable, and temporally ordered. Further, if you assume the system rejects documents with incorrect creation times, you can be reasonably assured that documents are at least as old as they claim to be. At any rate, bit-coin borrows only the data structure from Haber and Stornetta's work and reengineers its security properties with the addition of the proof-of-work scheme described later in this article.
In their follow-up papers, Haber and Stornetta introduced other ideas that make this data structure more effective and efficient (some of which were hinted at in their first paper). First, links between documents can be created using hashes rather than signatures; hashes are simpler and faster to compute. Such links are called hash pointers. Second, instead of threading documents individually—which might be inefficient if many documents are created at approximately the same time—they can be grouped into batches or blocks, with documents in each block having essentially the same time-stamp. Third, within each block, documents can be linked together with a binary tree of hash pointers, called a Merkle tree, rather than a linear chain. Incidentally, Josh Benaloh and Michael de Mare independently introduced all three of these ideas in 1991,6 soon after Haber and Stornetta's first paper.
Merkle trees. Bitcoin uses essentially the data structure in Haber and Stornetta's 1991 and 1997 papers, shown in simplified form in Figure 2 (Nakamoto was presumably unaware of Benaloh and de Mare's work). Of course, in bitcoin, transactions take the place of documents. In each block's Merkle tree, the leaf nodes are transactions, and each internal node essentially consists of two pointers. This data structure has two important properties. First, the hash of the latest block acts as a digest. A change to any of the transactions (leaf nodes) will necessitate changes propagating all the way to the root of the block, and the roots of all following blocks. Thus, if you know the latest hash, you can download the rest of the ledger from an untrusted source and verify that it has not changed. A similar argument establishes another important property of the data structure—that is, someone can efficiently prove to you that a particular transaction is included in the ledger. This user would have to send you only a small number of nodes in that transaction's block (this is the point of the Merkle tree), as well as a small amount of information for every following block. The ability to efficiently prove inclusion of transactions is highly desirable for performance and scalability.
Merkle trees, by the way, are named for Ralph Merkle, a pioneer of asymmetric cryptography who proposed the idea in his 1980 paper.33 His intended application was to produce a digest for a public directory of digital certificates. When a website, for example, presents you with a certificate, it could also present a short proof that the certificate appears in the global directory. You could efficiently verify the proof as long as you know the root hash of the Merkle tree of the certificates in the directory. This idea is ancient by cryptographic standards, but its power has been appreciated only of late. It is at the core of the recently implemented Certificate Transparency system.30 A 2015 paper proposes CONIKS, which applies the idea to directories of public keys for end-to-end encrypted emails.32 Efficient verification of parts of the global state is one of the key functionalities provided by the ledger in Ethereum, a new cryptocurrency.
Bitcoin may be the most well-known real-world instantiation of Haber and Stornetta's data structures, but it is not the first. At least two companies—Surety starting in the mid-1990s and Guardtime starting in 2007—offer document timestamping services. An interesting twist present in both of these services is an idea mentioned by Bayer, Haber, and Stornetta,5 which is to publish Merkle roots periodically in a newspaper by taking out an ad. Figure 3 shows a Merkle root published by Guardtime.
Byzantine fault tolerance. Of course, the requirements for an Internet currency without a central authority are more stringent. A distributed ledger will inevitably have forks, which means that some nodes will think block A is the latest block, while other nodes will think it is block B. This could be because of an adversary trying to disrupt the ledger's operation or simply because of network latency, resulting in blocks occasionally being generated near-simultaneously by different nodes unaware of each other's blocks. Linked timestamping alone is not enough to resolve forks, as was shown by Mike Just in 1998.26
A different research field, fault-tolerant distributed computing, has studied this problem, where it goes by different names, including state replication. A solution to this problem is one that enables a set of nodes to apply the same state transitions in the same order—typically, the precise order does not matter, only that all nodes are consistent. For a digital currency, the state to be replicated is the set of balances, and transactions are state transitions. Early solutions, including Paxos, proposed by Turing Award winner Leslie Lamport in 1989,28,29 consider state replication when communication channels are unreliable and when a minority of nodes may exhibit certain "realistic" faults, such as going offline forever or rebooting and sending outdated messages from when it first went offline. A prolific literature followed with more adverse settings and efficiency trade-offs.
A related line of work studied the situation where the network is mostly reliable (messages are delivered with bounded delay), but where the definition of "fault" was expanded to handle any deviation from the protocol. Such Byzantine faults include both naturally occurring faults as well as maliciously crafted behaviors. They were first studied in a paper also by Lamport, cowritten with Robert Shostak and Marshall Pease, as early as 1982.27 Much later, in 1999, a landmark paper by Miguel Castro and Barbara Liskov introduced practical Byzantine fault tolerance (PBFT), which accommodated both Byzantine faults and an unreliable network.8 Compared with linked time-stamping, the fault-tolerance literature is enormous and includes hundreds of variants and optimizations of Paxos, PBFT, and other seminal protocols.
In his original white paper, Nakamoto does not cite this literature or use its language. He uses some concepts, referring to his protocol as a consensus mechanism and considering faults both in the form of attackers, as well as nodes joining and leaving the network. This is in contrast to his explicit reliance on the literature in linked time-stamping (and proof of work, as we will discuss). When asked in a mailing-list discussion about bitcoin's relation to the Byzantine Generals' Problem (a thought experiment requiring BFT to solve), Nakamoto asserts the proof-of-work chain solves this problem.35
In the following years, other academics have studied Nakamoto consensus from the perspective of distributed systems. This is still a work in progress. Some show that bitcoin's properties are quite weak,45 while others argue that the BFT perspective does not do justice to bitcoin's consistency properties.41 Another approach is to define variants of well-studied properties and prove that bitcoin satisfies them.19 Recently these definitions were substantially sharpened to provide a more standard consistency definition that holds under more realistic assumptions about message delivery.37 All of this work, however, makes assumptions about "honest," that is, procotol-compliant, behavior among a subset of participants, whereas Nakamoto suggests that honest behavior need not be blindly assumed, because it is incentivized. A richer analysis of Nakamoto consensus accounting for the role of incentives does not fit cleanly into past models of fault-tolerant systems.
back to top Proof Of Work
Virtually all fault-tolerant systems assume that a strict majority or supermajority (for example, more than half or two-thirds) of nodes in the system are both honest and reliable. In an open peer-to-peer network, there is no registration of nodes, and they freely join and leave. Thus an adversary can create enough Sybils, or sockpuppet nodes, to overcome the consensus guarantees of the system. The Sybil attack was formalized in 2002 by John Douceur,14 who turned to a cryptographic construction called proof of work to mitigate it.
The origins. To understand proof of work, let's turn to its origins. The first proposal that would be called proof of work today was created in 1992 by Cynthia Dwork and Moni Naor.15 Their goal was to deter spam. Note that spam, Sybil attacks, and denial of service are all roughly similar problems in which the adversary amplifies its influence in the network compared to regular users; proof of work is applicable as a defense against all three. In Dwork and Naor's design, email recipients would process only those email messages that were accompanied by proof that the sender had performed a moderate amount of computational work—hence, "proof of work." Computing the proof would take perhaps a few seconds on a regular computer. Thus, it would pose no difficulty for regular users, but a spammer wishing to send a million email messages would require several weeks, using equivalent hardware.
Note that the proof-of-work instance (also called a puzzle) must be specific to the email, as well as to the recipient. Otherwise, a spammer would be able to send multiple messages to the same recipient (or the same message to multiple recipients) for the cost of one message to one recipient. The second crucial property is that it should pose minimal computational burden on the recipient; puzzle solutions should be trivial to verify, regardless of how difficult they are to compute. Additionally, Dwork and Naor considered functions with a trapdoor, a secret known to a central authority that would allow the authority to solve the puzzles without doing the work. One possible application of a trapdoor would be for the authority to approve posting to mailing lists without incurring a cost. Dwork and Naor's proposal consisted of three candidate puzzles meeting their properties, and it kicked off a whole research field, to which we will return.
cryptocurrency market bitcoin usd