A Study on Blockchain + Privacy Preservation, with PlatON as an Example
Author: Wu Zhuocheng
Privacy coins began to emerge in 2014, and as interest in blockchain privacy preservation grew, a range of techniques sprang up across the cryptocurrency field. To this day, however, no single project has fully implemented privacy-preserving technology. Privacy preservation should not be confused with the many over-hyped techniques that have become popular. If blockchain is to develop into a true virtual parallel world, it cannot avoid building a sound economic system for that new world.
In the traditional economic system, the only factors of production were land and labor. In the industrial era, capital and entrepreneurial talent joined them (Marshall's theory). In the digital era, data has become an important factor of production. Market-based allocation of factors of production can improve production efficiency, but data has a special property, “what you see is what you get”, that sets it apart from other factors: once data has been seen it has effectively been obtained, so its worth is not reflected by a price. We are all owners and suppliers of data, just as everyone provides labor, but we are not paid for providing data. The root cause is that data has not been privatized, and this is where privacy-preserving technologies come in.
Several Technologies for Privacy Preservation
The China Academy of Information and Communications Technology (CAICT), drawing on the Privacy-preserving Computation Technology Research Report (2020) published by the Big Data Technology and Standard Committee of the China Communications Standards Association (CCSA), classifies privacy-preserving technologies into five categories: Federated Learning, Differential Privacy, Secure Multi-Party Computation, Homomorphic Encryption, and Trusted Execution Environment. Of these, federated learning and differential privacy are used mainly in machine learning and apply only moderately complex encryption to the raw data, so they are outside the scope of this discussion. In addition, blockchain has its own privacy-preserving technique based on zero-knowledge proofs.
Secure Multi-Party Computation (MPC) was first proposed in 1982 by Turing Award winner Yao Qizhi (Andrew Chi-Chih Yao), an academician of the Chinese Academy of Sciences. The setting is a distributed network of N mutually untrusted nodes. Each node i holds private data x_i, and the nodes jointly evaluate a function f(x_1, ..., x_N) so that each obtains the result y without revealing its own input. If every node arrives at the same y, the result of the computation can be exported. The biggest advantage of MPC is that the input data stays private throughout while the result is still computed accurately. The challenge is communication overhead: bandwidth requirements are high and grow with the number of collaborating participants. A single secure computation can currently complete at the millisecond level, but in a big data scenario a data application or model training job involves tens of thousands of data samples. Computation efficiency and communication burden are the bottlenecks that hinder the development of MPC.
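The MPC idea described above can be illustrated with additive secret sharing, one of the simplest MPC building blocks. The following is a minimal sketch only: the modulus PRIME, the function names, and the salary figures are assumptions made for illustration, and a real protocol would also need authenticated channels and protection against misbehaving nodes.

```python
import secrets

# Field modulus chosen for illustration only (an assumption, not from any spec).
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split `value` into n additive shares that sum to `value` mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def secure_sum(private_inputs):
    """Compute the sum of all inputs without any party seeing another's input."""
    n = len(private_inputs)
    # Party i splits its input x_i and sends share j to party j.
    distributed = [share(x, n) for x in private_inputs]
    # Party j adds the shares it received. One partial sum on its own is
    # uniformly random and reveals nothing about any single input.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    # Publishing and adding the partial sums reconstructs only the total.
    return sum(partial_sums) % PRIME


# Hypothetical example: three nodes learn their combined total only.
salaries = [5200, 6100, 4800]
total = secure_sum(salaries)
print(total)  # 16100
```

Every input is sent to every other party as a share, so the number of messages grows with the square of the number of participants, which is the communication burden the text refers to.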
Homomorphic Encryption (HE) is a form of asymmetric cryptography in which all participants can encrypt data and compute on it, but only the private key holder can decrypt it. The special feature of HE is that it allows computation directly on encrypted data, and decrypting the result in theory gives the same value as running the computation on the plaintext. In practice, high accuracy is difficult to achieve under HE, and balancing the complexity of encryption against the accuracy of computation is a major test. Fully homomorphic encryption is still largely at the theoretical stage: it lags in trustworthiness, flexibility, and efficiency, and its construction and implementation are too complicated for large-scale commercial application.
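To make "computing directly on encrypted data" concrete, here is a toy sketch of the Paillier cryptosystem, which is only additively homomorphic (it is not fully homomorphic encryption). The tiny primes and the function names are assumptions for illustration; the parameters are far too small to be secure.

```python
import math
import secrets


def keygen(p=1009, q=1013):
    """Toy Paillier key generation. The small primes are for illustration only."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                # standard simple choice of generator
    mu = pow(lam, -1, n)     # modular inverse (Python 3.8+)
    return (n, g), (lam, mu)


def encrypt(pub, m):
    """Anyone holding the public key can encrypt."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Only the private key holder can decrypt."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Multiplying two ciphertexts adds the underlying plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)


pub, priv = keygen()
c_sum = add_encrypted(pub, encrypt(pub, 120), encrypt(pub, 305))
print(decrypt(pub, priv, c_sum))  # 425
```

The party calling add_encrypted never sees 120 or 305. Supporting multiplication as well as addition on ciphertexts is what fully homomorphic schemes add, at a much higher cost.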
Trusted Execution Environment (TEE) is the most widely used of these technologies in large-scale commercial applications, for example Touch ID and face recognition on cell phones. TEE data encryption relies on hardware devices, and computation is performed in an isolated execution environment protected by hardware, so security depends on a trusted hardware vendor. Phala Network, Oasis Labs, and Enigma are major projects applying TEE, which is closer to practical scenarios than other privacy-preserving computation solutions.
Zero-knowledge Proof (ZKP) is a special kind of interactive proof in which the prover knows the answer to a question and can convince the verifier that the answer is correct without revealing any useful information about it. Zero-knowledge proofs enable flexible interactive computation and cross-validation, but they are still difficult to implement because they require repeated rounds of verification to establish that the answer is true, which is very demanding in terms of hashrate. Currently, it takes around 7 seconds to generate such a proof, and more hashrate is needed to speed this up. ZK Rollup on Ethereum Layer 2 is an application of zero-knowledge proofs, so its significance lies not only in scaling but also in helping Ethereum achieve off-chain privacy-preserving computation.
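The interactive prove-and-verify loop described above can be sketched with the classic Schnorr identification protocol, in which the prover shows knowledge of a secret exponent x behind a public value y = G^x mod P. This is an illustrative stand-in, not the proof system used by ZK Rollups, and the tiny group parameters P, Q, G and the function names are assumptions chosen for readability.

```python
import secrets

# Toy group: P = 2*Q + 1 with P and Q prime; G = 4 generates the order-Q subgroup.
# These parameters are far too small for real use.
P, Q, G = 2039, 1019, 4


def run_round(claimed_x, public_y):
    """One commit / challenge / response round of the Schnorr protocol."""
    r = secrets.randbelow(Q)        # prover: fresh random nonce
    t = pow(G, r, P)                # prover -> verifier: commitment
    c = secrets.randbelow(Q)        # verifier -> prover: random challenge
    s = (r + c * claimed_x) % Q     # prover -> verifier: response
    # Verifier checks G^s == t * y^c (mod P) and learns nothing else about x.
    return pow(G, s, P) == (t * pow(public_y, c, P)) % P


def verify_knowledge(claimed_x, public_y, rounds=5):
    """Repeat the round several times; a cheater must guess every challenge."""
    return all(run_round(claimed_x, public_y) for _ in range(rounds))


secret_x = 777
public_y = pow(G, secret_x, P)

print(verify_knowledge(secret_x, public_y))      # honest prover is accepted
print(verify_knowledge(secret_x + 1, public_y))  # wrong secret is rejected
```

With this toy group a prover who does not know x passes a single round only when the challenge happens to be 0 (probability 1/1019), so repeating the round drives the cheating probability toward zero. That repetition is the kind of verification cost the text mentions.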
The biggest challenge facing privacy-preserving computation is how to improve its efficiency and achieve large-scale commercial application. The aforementioned technologies all share this problem, whether computation-based like HE and MPC or verification-based like ZKP. TEE, the only one in commercial use, relies on hardware, and the R&D and production of dedicated computation hardware require huge upfront costs. This is why the concept of privacy-preserving computation has been around since 2014 but no project has truly landed. In this respect it is more like an industrial blockchain, which differs from the traditional blockchain in that it needs to bridge the gap between the virtual world and reality.
PlatON's privacy-preserving design combines several of the aforementioned technologies and seeks to secure the privacy of the entire network from three perspectives: Privacy-preserving Computation, Privacy Verification, and Dedicated Privacy Circuit. First, privacy-preserving computation is implemented through MPC and HE. Next, the computation results are verified through ZKP and Verifiable Computation (VC). Finally, in combination with contractual computation, the encrypted smart contract is compiled into a circuit and split into multiple subtasks, and an incentive mechanism attracts idle hashrate in the network to compute them, which addresses the efficiency problem common to the aforementioned technologies. This idea is in fact borrowed from ZK Rollup on Ethereum, which moves complex operations off the chain and transmits only the computation results back to the main chain. Because smart contracts must be compiled into circuits, the PlatON team has to work with mainstream hardware vendors in the industry to further improve computational performance through better hardware.
PlatON’s Special PoS Consensus Algorithm
According to the official white paper, PlatON will launch dedicated FPGA/ASIC-based computation hardware when appropriate. This is not simply PoW mining. PoW is just one type of consensus protocol, and as long as the community agrees, it can be replaced with another type altogether, such as PoS, toward which Ethereum is moving. PlatON, however, separates consensus from hashrate, and hashrate is used only to perform privacy-preserving computation. The PlatON public chain distributes computational tasks, matches them with hashrate, and records transactions, while the core computational work happens outside the public chain. Of course, you can interpret this as PoW in disguise, but privacy-preserving computation is not a meaningless puzzle game. Even without the blockchain, this computing hardware could be deployed in the centralized world to provide privacy preservation.
From this, the author speculates that the maintenance of the PlatON ecosystem may be divided into two parts: one is using the PoS protocol to earn fixed block rewards, and the other is providing hashrate to earn labor fees from data demand sides.
The first part is PoS. The four mainstream models in the industry are Chain-Based, DPoS, VRF, and BFT. Chain-Based is the earliest form of PoS; it selects the validator for block generation based on the amount of tokens held, and this is the model currently adopted by Ethereum. DPoS is a model in which each token holder appoints a delegate who participates in generating and validating blocks, and it is currently used by EOS. VRF selects validators at random through a verifiable random function, and representative projects include Dfinity and Algorand. BFT runs a Byzantine Fault Tolerance protocol in which, after validators are selected, multiple rounds of voting confirm the final block. NEO currently adopts this type of consensus algorithm.
According to the official blue paper, PlatON uses a special PoS consensus algorithm, Giskard, which consists of PPoS (PlatON PoS) and BFT. PPoS is essentially a combination of Chain-Based and VRF: a binomial cumulative distribution function curve is built from each node's stake, and a VRF is then used to randomly select validators. The advantage of this consensus is that the selection is random and is not linearly related to a node's stake. Once the validators are confirmed, each of them verifies the generated blocks through the BFT protocol until block consensus is reached, which reduces the probability of a block being controlled by a malicious node. In theory, the Giskard consensus mechanism can endogenously inhibit the expansion of mining pools and so protect the decentralization and security of the PlatON public chain.
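One way a binomial cumulative distribution and a VRF can be combined to pick validators is sortition in the style used by Algorand. The sketch below follows that general pattern and is an assumption about the mechanism, not PlatON's actual implementation. In particular, pseudo_vrf is a plain SHA-256 hash standing in for a real VRF (it produces no verifiable proof), and all names and parameters are hypothetical.

```python
import hashlib
from math import comb


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def pseudo_vrf(secret_key: bytes, seed: bytes) -> float:
    """Stand-in for a VRF: maps (key, seed) to a number in [0, 1).

    A real VRF would also return a proof that others can verify with the
    node's public key; a bare hash cannot do that.
    """
    digest = hashlib.sha256(secret_key + seed).digest()
    return int.from_bytes(digest, "big") / 2**256


def sortition(secret_key, seed, stake_units, total_units, expected_seats):
    """Return how many validator seats this node wins in the current round."""
    p = expected_seats / total_units       # per-unit selection probability
    u = pseudo_vrf(secret_key, seed)
    seats = 0
    # Find the interval of the binomial CDF that the random value falls into.
    while seats < stake_units and u >= binomial_cdf(seats, stake_units, p):
        seats += 1
    return seats


# Hypothetical round: 1000 staked units network-wide, about 25 seats expected.
for name, stake in [("node-a", 50), ("node-b", 200), ("node-c", 10)]:
    won = sortition(name.encode(), b"round-1", stake, 1000, 25)
    print(name, stake, won)
```

A larger stake shifts the curve so that winning a seat becomes more likely, but the outcome for any single round still depends on the random value rather than on the stake alone.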
The second part is providing privacy-preserving computation to earn labor fees from data demand sides. The author believes that this is the essence of the PlatON consensus protocol, and if it works as expected, market-based pricing of data as a factor of production will be realized. Data transactions face two problems: first, ownership is unclear, so data is easily used without authorization; second, data structures are diverse, so data is difficult to quantify by a uniform standard.
The method shown in the blue paper is to use cryptographic techniques such as HE and MPC to confirm data rights and determine the owner of the data. Adhering to the principle of data sovereignty in data transactions makes it possible to trade the right to use data without affecting its ownership. There are two methods for pricing data. The first is absolute pricing, that is, the price that data users are willing to pay to obtain the data. The second is relative pricing, that is, evaluating the contribution that each member of a given data set makes to a given common task. The latter uses the Shapley value as its main evaluation tool, a concept introduced in 1953 by the well-known economist Lloyd Shapley (winner of the 2012 Nobel Prize in economics) in his study of cooperative games.
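Relative pricing with the Shapley value can be sketched as follows: each data provider's share is its average marginal contribution over every order in which providers could join the task. The providers A, B, C and the accuracy table below are made-up numbers for illustration, and this exact enumeration is only practical for a handful of participants.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values by enumerating every join order."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining this coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: t / n_orders for p, t in totals.items()}


# Hypothetical model accuracy (%) reached with each combination of datasets.
ACCURACY = {
    frozenset(): 0,
    frozenset("A"): 60,
    frozenset("B"): 40,
    frozenset("C"): 20,
    frozenset("AB"): 80,
    frozenset("AC"): 70,
    frozenset("BC"): 50,
    frozenset("ABC"): 90,
}

phi = shapley_values(["A", "B", "C"], lambda s: ACCURACY[frozenset(s)])
print({p: round(v, 2) for p, v in phi.items()})
# {'A': 48.33, 'B': 28.33, 'C': 13.33}
```

The three shares add up to 90, the value achieved by the full coalition, so the total reward is divided exactly among the contributors.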
The Status Quo of Industry Development
There are two main development paths for the privacy-preserving segment: one is the privacy coin, and the other is privacy-preserving public chain.
Privacy coins typically include XMR, DASH, ZEC, and XZC. XMR, the leading project in this field, emerged in 2014. A privacy coin only needs to encrypt information such as the sender, receiver, transaction amount, and transaction IP, so that only the two parties to the transaction (or authorized third parties) can view the transaction information with the private key. Because currency circulation does not involve much complicated information, anonymous transactions are not a hard problem for cryptocurrencies, and the technology is mature. In fact, BTC is also upgrading its privacy algorithm through community voting, and technologies such as CoinJoin can merge multiple transactions to obscure the upstream source of a UTXO.
Privacy-preserving public chain technology is more complicated, because in essence it has to encrypt smart contracts. It needs to encrypt the input data, the output data, and the network state so that they are concealed from all parties except the user, including the node that executes the smart contract. At present, ZK Rollup on Ethereum Layer 2 and Polkadot's parachain Phala have the most promising prospects, but they can only exist as sub-chains or parachains, mainly to provide data computation for the main chain, and the results must be returned to the main chain. Developing an independent privacy-preserving public chain is more difficult than the above-mentioned technologies, and the leading projects for the time being are PlatON and Oasis. Once completed, they will release substantial potential. The reason is simple: as independent public chains, they allow developers to build privacy smart contracts directly on the main chain, and they can also serve as side chains or parallel chains that provide privacy-preserving computation for other public chains.
The Oasis team includes Professor Dawn Song of the University of California, Berkeley, and a number of world-leading security experts. The project has received $45 million in investment from venture capital firms such as Binance Labs and a16z. In addition, Oasis has achieved interoperability with the Ethereum network and has attracted developers to build NFT projects on it. PlatON has so far received $50 million in investment from Alpine Capital, Hash Global Capital, and other institutions. Like Oasis, PlatON enables high-concurrency privacy-preserving computation. In addition to the consensus network (mainnet) and the privacy-preserving computation network, PlatON has one innovation: an independent AI network layer that aims to train big data models.