Original Title: "Possible futures of the Ethereum protocol, part 4: The Verge"
Author: Vitalik Buterin
Translation: Mensh, ChainCatcher
Special thanks to Justin Drake, Hsiao-wei Wang, Guillaume Ballet, Ignacio, Josh Rudolf, Lev Soukhanov, Ryan Sean Adams, and Uma Roy for their feedback and review.
One of the most powerful features of blockchain is that anyone can run a node on their own computer and verify the correctness of the chain. Even if 95% of the nodes running chain consensus (PoW, PoS) immediately agreed to change the rules and started producing blocks under the new rules, everyone running a fully validating node would refuse to accept the chain. Miners or stakers who are not part of this cabal would automatically converge on the chain that continues to follow the old rules, keep building on it, and fully validating users would follow that chain.
This is the key difference between blockchains and centralized systems. However, for this property to hold, running a fully validating node needs to be actually feasible for enough people. This applies both to proposers (because if proposers do not validate the chain, they are not contributing to enforcing the protocol rules) and to ordinary users. Today, running a node on a consumer laptop (including the one used to write this article) is possible, but difficult. The Verge aims to change this, making fully validating the chain so computationally cheap that every mobile wallet, browser wallet, and even smartwatch does it by default.

The Verge 2023 Roadmap
Originally, "Verge" referred to moving Ethereum state storage to Verkle trees, a tree structure that allows much more compact proofs, enabling stateless validation of Ethereum blocks. A node could validate an Ethereum block without storing any Ethereum state (account balances, contract code, storage...) on its hard drive, at the cost of a few hundred kB of proof data and a few hundred milliseconds of extra time to verify a proof. Today, the Verge represents a larger vision focused on maximally resource-efficient validation of the Ethereum chain, which includes not only stateless validation but also using SNARKs to verify all Ethereum execution.
In addition to the long-term focus on SNARK-verifying the entire chain, another new question concerns whether Verkle trees are the optimal technology in the first place. Verkle trees are vulnerable to quantum computer attacks, so if we replace the current KECCAK Merkle Patricia tree with Verkle trees, we will have to replace the tree again later. The natural alternative is to skip straight to using STARKed Merkle branches over a binary tree. Historically, this approach has been considered infeasible due to overhead and technical complexity. Recently, however, we have seen Starkware prove 1.7 million Poseidon hashes per second on a laptop using circle STARKs, and thanks to techniques like GKR, proving times for more "traditional" hashes are also falling rapidly. As a result, over the past year "the Verge" has become more open-ended, with several possibilities on the table.
The Verge: Key Goals
In this chapter
What problem are we solving?
Today, Ethereum clients need to store hundreds of gigabytes of state data to validate blocks, and this amount increases every year. Raw state data increases by about 30GB per year, and each client must store some extra data on top to efficiently update the trie.

This reduces the number of users who can run fully validating Ethereum nodes: although large hard drives capable of storing all Ethereum state and even years of history are common, the computers people typically buy often have only a few hundred gigabytes of storage. State size also creates huge friction in the process of initially setting up a node: the node needs to download the entire state, which can take hours or days. This creates various knock-on effects. For example, it greatly increases the difficulty for node operators to upgrade their node setup. Technically, upgrades can be done without downtime—start a new client, wait for it to sync, then shut down the old client and transfer keys—but in practice, this is technically very complex.
How does it work?
Stateless validation is a technique that allows nodes to validate blocks without possessing the entire state. Instead, each block comes with a witness that includes: (i) the values, code, balances, and storage at specific locations in the state that the block will access; (ii) cryptographic proofs that these values are correct.
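As a rough illustration, here is a minimal Python sketch of what a stateless validator does with such a witness. Everything here is illustrative rather than a specified format, and `verify_proofs` and `execute` are hypothetical stand-ins for the proof-checking and EVM-execution logic:

```python
from dataclasses import dataclass

@dataclass
class Witness:
    # (i) the pre-state values the block will access:
    # address -> (balance, nonce, code chunks, storage slots)
    accessed_state: dict
    # (ii) cryptographic proofs (e.g., Merkle/Verkle branches) that these
    # values are committed to by the parent block's state root
    proofs: list

def stateless_validate(block, parent_state_root, witness) -> bool:
    # Check the witness against the state root the node already trusts
    # (verify_proofs is a hypothetical helper)
    if not verify_proofs(witness.accessed_state, witness.proofs, parent_state_root):
        return False
    # Re-execute the block against only the witnessed fragment of state
    # (execute is a hypothetical helper)
    post_state_root = execute(block, witness.accessed_state)
    return post_state_root == block.post_state_root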
In practice, implementing stateless validation requires changing Ethereum's state tree structure. This is because the current Merkle Patricia tree is extremely unfriendly to implementing any cryptographic proof scheme, especially in the worst case. This is true for both "raw" Merkle branches and the possibility of "wrapping" them into STARKs. The main difficulties stem from some weaknesses of MPT:
1. It is a hexary tree (i.e., each node has 16 children). This means that in a tree of size N, a proof on average requires 32*(16-1)*log16(N) = 120*log2(N) bytes, or about 3,840 bytes in a tree with 2^32 items. For a binary tree, only 32*(2-1)*log2(N) = 32*log2(N) bytes are needed, or about 1,024 bytes.
2. Code is not Merklized. This means that to prove any access to account code, the entire code must be provided, up to 24,000 bytes.

We can calculate the worst case as follows:
30,000,000 gas / 2,400 (cold account read cost) * (5 * 480 + 24,000) = 330,000,000 bytes
The branch cost is slightly reduced (using 5*480 instead of 8*480), because when there are more branches, their top parts are repeated. But even so, the amount of data to be downloaded in a single slot is completely unrealistic. If we try to wrap it with STARK, we encounter two problems: (i) KECCAK is relatively unfriendly to STARK; (ii) 330MB of data means we must prove 5 million calls to the KECCAK round function, which may be unprovable for all but the most powerful consumer hardware, even if we can make STARK proofs of KECCAK more efficient.
If we directly replace the hexary tree with a binary tree and additionally Merklize the code, the worst case becomes roughly 30,000,000 / 2,400 * 32 * (32 - 14 + 8) = 10,400,000 bytes (the 14 is subtracted because with roughly 2^14 branches their top levels are shared, and the 8 is the depth of the proof into the code-chunk leaf). Note that this requires changing gas costs to charge for each individual code chunk accessed; EIP-4762 does exactly that. 10.4 MB is much better, but for many nodes it is still too much data to download in a single slot. Therefore, we need to introduce more powerful technology. There are two leading solutions: Verkle trees and STARKed binary hash trees.
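The arithmetic behind these worst cases can be checked directly; the constants below come straight from the text (gas limit, cold-read cost, maximum code size, branch sizes):

```python
GAS_LIMIT = 30_000_000
COLD_ACCOUNT_READ = 2_400        # gas per cold account access
MAX_CODE_SIZE = 24_000           # bytes; code is not Merklized in the MPT

accesses = GAS_LIMIT // COLD_ACCOUNT_READ     # 12,500 worst-case accesses

# Hexary MPT: ~5 effective branch levels of 480 bytes each (the top levels
# are shared across branches), plus the full code of every account touched
print(accesses * (5 * 480 + MAX_CODE_SIZE))   # 330,000,000 bytes

# Binary tree with Merklized code: 32-byte siblings over a depth-32 path,
# minus ~14 shared top levels, plus an 8-level proof into the code chunk
print(accesses * 32 * (32 - 14 + 8))          # 10,400,000 bytes
```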
Verkle Trees
Verkle trees use elliptic curve-based vector commitments to make shorter proofs. The key is that, regardless of the tree width, each parent-child relationship in the proof is only 32 bytes. The only limitation on proof tree width is that if the proof tree is too wide, proof computation efficiency decreases. The proposed implementation for Ethereum has a width of 256.

Thus, the size of a single branch in the proof becomes 32 * log256(N) = 4 * log2(N) bytes. Therefore, the theoretical maximum proof size is roughly 30,000,000 / 2,400 * 32 * (32 - 14 + 8) / 8 = 1,300,000 bytes (due to the uneven distribution of state chunks, the actual figure is slightly different, but this is a good first approximation).
It should also be noted that in all the above examples, this "worst case" is not the worst: a worse case is if an attacker deliberately "mines" two addresses so that they have a long common prefix in the tree and reads data from one of them, which may double the worst-case branch length. But even with such a case, the worst-case proof length for Verkle trees is 2.6MB, which is basically in line with the current worst-case witness data.
We can also use this observation for another purpose: making access to "adjacent" storage very cheap, whether many code chunks of the same contract or adjacent storage slots. EIP-4762 provides a definition of adjacency and charges only 200 gas for adjacent accesses. With adjacent accesses, the worst-case proof size becomes 30,000,000 / 200 * 32 = 4,800,000 bytes, which is still roughly within tolerance. If we want to reduce this value for safety, we can slightly increase the fee for adjacent accesses.
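The same back-of-the-envelope calculation for Verkle trees, with 4 bytes per binary-equivalent tree level instead of 32:

```python
accesses = 30_000_000 // 2_400         # 12,500 worst-case cold accesses
# Width-256 Verkle commitments cost 32 bytes per level, i.e. 4 bytes per
# binary-equivalent level, since 32 * log256(N) = 4 * log2(N)
print(accesses * 4 * (32 - 14 + 8))    # 1,300,000 bytes

adjacent_accesses = 30_000_000 // 200  # EIP-4762: 200 gas per adjacent access
print(adjacent_accesses * 32)          # 4,800,000 bytes in the adjacent-access worst case
```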
STARKed Binary Hash Trees
The principle of this technology is self-evident: you make a binary tree, take the (at most 10.4 MB) proof of the values in a block, and replace that proof with a STARK proving it. In this way, the proof itself contains only the data being proved, plus a fixed 100-300 kB overhead for the actual STARK.
The main challenge here is verification time. We can do basically the same calculation as above, except we calculate hashes instead of bytes. A 10.4 MB block means 330,000 hashes. If we add the possibility of an attacker "mining" addresses with a long common prefix in the address tree, the worst-case number of hashes will reach about 660,000. Therefore, if we can prove 200,000 hashes per second, that's fine.
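A quick sanity check of the required proving rate, using the numbers above:

```python
witness_bytes = 10_400_000
hashes = witness_bytes // 32   # ~325,000 hashes in a 10.4 MB worst-case witness
worst_case = 2 * hashes        # "mined" common-prefix addresses can roughly double this
rate = 200_000                 # hashes proved per second
print(worst_case / rate)       # ~3.25 s, within a 12 s slot
```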
On consumer laptops using the Poseidon hash function, these numbers have already been achieved, and the Poseidon hash function is specifically designed for STARK-friendliness. However, the Poseidon system is still relatively immature, so many people do not yet trust its security. Therefore, there are two realistic paths forward:
If we want to prove conservative hash functions, the Starkware STARK circuit at the time of writing can only prove 10-30k hashes per second on a consumer laptop. However, STARK technology is improving rapidly. Even today, GKR-based technology shows the potential to increase this speed to the 100-200k range.
Witness Use Cases Beyond Block Validation
Besides block validation, there are three other key use cases that require more efficient stateless validation:
All these use cases have one thing in common: they require a considerable number of proofs, but each proof is small. Therefore, STARK proofs do not make practical sense for them; instead, the most realistic approach is to use Merkle branches directly. Another advantage of Merkle branches is updatability: given a proof of a state object X rooted at block B, if a child block B2 and its witness are received, the proof can be updated to be rooted at block B2. Verkle proofs are also natively updatable.
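A minimal sketch of this updatability property for a binary Merkle branch: verifying the branch recomputes the root, and re-rooting a proof at a child block only requires swapping in the sibling hashes that the new block's witness shows have changed. The data layout here is illustrative, not a specified format:

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_branch(leaf: bytes, index: int, siblings: list, root: bytes) -> bool:
    """Recompute the root of a binary Merkle tree from a leaf and its sibling path."""
    node = leaf
    for sibling in siblings:
        # The low bit of the index says whether we are the right or left child
        node = sha256(sibling + node) if index & 1 else sha256(node + sibling)
        index >>= 1
    return node == root

def update_branch(siblings: list, updated: dict) -> list:
    """Re-root a proof at a child block B2: replace only the sibling hashes
    (keyed by tree level) that B2's witness reports as changed."""
    return [updated.get(level, sib) for level, sib in enumerate(siblings)]
```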
How does this relate to existing research?
What else can be done?
The main remaining work is:
1. More analysis of the consequences of EIP-4762 (stateless gas cost changes)
2. More work on completing and testing the transition program, which is the main complexity of any stateless environment implementation
3. More security analysis of Poseidon, Ajtai, and other "STARK-friendly" hash functions
4. Further development of ultra-efficient STARK protocol features for "conservative" (or "traditional") hashes, such as Binius or GKR-based approaches.
In addition, we will soon have to choose among three options: (i) Verkle trees, (ii) STARK-friendly hash functions, and (iii) conservative hash functions. Their characteristics can be roughly summarized in the table below:

In addition to these "headline numbers," there are some other important considerations:
If we want to achieve Verkle witness updatability in a quantum-safe and reasonably efficient way, another possible approach is lattice-based Merkle trees.
If, in the worst case, the efficiency of the proof system is not high enough, we can also use the unexpected tool of multidimensional gas to compensate: set separate gas limits for (i) calldata; (ii) computation; (iii) state access, and possibly other different resources. Multidimensional gas increases complexity, but in exchange, it more strictly limits the ratio between the average and worst cases. With multidimensional gas, the maximum number of branches that need to be proved in theory may be reduced from 12,500 to, for example, 3,000. This would make BLAKE3 (barely) sufficient even today.
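A sketch of what multidimensional gas could look like at the block level: each resource gets its own limit, so no single block can max out state accesses. The limit values here are illustrative, not proposed numbers:

```python
from dataclasses import dataclass

@dataclass
class ResourceUsage:
    calldata: int       # bytes
    computation: int    # gas-like units
    state_access: int   # number of cold accesses

# Hypothetical per-resource block limits (illustrative numbers only)
BLOCK_LIMITS = ResourceUsage(calldata=1_000_000, computation=30_000_000, state_access=3_000)

def block_within_limits(txs: list) -> bool:
    """Under multidimensional gas, a block must satisfy every per-resource
    limit separately, instead of one scalar gas limit covering them all."""
    totals = ResourceUsage(
        calldata=sum(t.calldata for t in txs),
        computation=sum(t.computation for t in txs),
        state_access=sum(t.state_access for t in txs),
    )
    return (totals.calldata <= BLOCK_LIMITS.calldata
            and totals.computation <= BLOCK_LIMITS.computation
            and totals.state_access <= BLOCK_LIMITS.state_access)
```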

Multidimensional gas allows block resource limits to more closely match the underlying hardware resource limits
Another unexpected tool is to delay state root computation until the slot after the block. This gives us a full 12 seconds to compute the state root, which means that even in the most extreme case, a proving rate of only 60,000 hashes per second is sufficient, again making BLAKE3 barely meet the requirements.
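The arithmetic: with a full slot available for proving, the required rate drops well below the targets above:

```python
worst_case_hashes = 660_000
slot_seconds = 12
print(worst_case_hashes / slot_seconds)  # 55,000 hashes/s, so ~60,000/s suffices
```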
The downside of this approach is that it increases light client latency by one slot, but there are more clever techniques to reduce this delay to only the proof generation delay. For example, proofs can be broadcast on the network as soon as they are generated by any node, rather than waiting for the next block.
How does it interact with other parts of the roadmap?
Statelessness greatly reduces the difficulty of solo staking. This becomes even more valuable if technologies that lower the minimum balance for solo staking, such as Orbit SSF or application-layer strategies like squad staking, become available.
If EOF is introduced at the same time, multidimensional gas analysis becomes easier. This is because the main execution complexity of multidimensional gas comes from handling child calls that do not pass all gas from the parent call, and EOF can simply make such child calls illegal, making this problem trivial (and native account abstraction will provide an in-protocol alternative for the current main use case of partial gas).
There is also an important synergy between stateless validation and history expiry. Today, clients must store nearly 1TB of historical data; this data is several times the size of state data. Even if clients are stateless, unless we can relieve clients of the responsibility of storing historical data, the dream of almost no-storage clients cannot be realized. The first step in this direction is EIP-4444, which also means storing historical data in torrents or the Portal Network.
Validity Proofs of EVM Execution

What problem are we solving?
The long-term goal of Ethereum block validation is clear—it should be possible to validate an Ethereum block by: (i) downloading the block, or even just downloading a small part of the block's data availability sampling; (ii) verifying a small proof that the block is valid. This would be a very low-resource operation, doable in mobile clients, browser wallets, or even on another chain (without the data availability part).
To achieve this, we need SNARK or STARK proofs for (i) the consensus layer (i.e., proof of stake) and (ii) the execution layer (i.e., EVM). The former is a challenge in itself and should be addressed as we continue to improve the consensus layer (e.g., for single-slot finality). The latter requires EVM execution proofs.
What is it and how does it work?
Formally, in the Ethereum specification, the EVM is defined as a state transition function: you have a pre-state S and a block B, and you compute a post-state S' = STF(S, B). If users run a light client, they do not have S or S' in full, or even the full block B; instead, they have a pre-state root R, a post-state root R', and a block hash H.
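In sketch form, the statement a SNARK would need to prove looks roughly like the check below. The helper names (`hash_block`, `verify_witnesses`, `stf`, `compute_post_root`) are hypothetical stand-ins; only the public/private input split follows the structure described above:

```python
def evm_execution_statement(R: bytes, R_post: bytes, H: bytes,
                            block_body, accessed_state, witnesses) -> bool:
    """Public inputs: pre-state root R, post-state root R_post, block hash H.
    Private inputs: the block body and the fragment of state it touches."""
    # The private block body must match the publicly committed block hash
    if hash_block(block_body) != H:
        return False
    # The accessed state must really be part of the pre-state committed by R
    if not verify_witnesses(accessed_state, witnesses, R):
        return False
    # Re-executing the state transition S' = STF(S, B) must yield R_post
    post_state = stf(accessed_state, block_body)
    return compute_post_root(post_state, witnesses) == R_post
```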
If a SNARK proving this statement exists, a light client can fully validate Ethereum EVM execution. This already makes a client's resource requirements quite low. To achieve a truly fully validating Ethereum client, the same must be done for consensus.
Validity proofs for EVM computation already exist and are widely used by L2s. However, much work remains to make EVM validity proofs feasible on L1.
How does this relate to existing research?
What else can be done?
Today, validity proofs for the EVM are lacking in two respects: security and verification time.
A secure validity proof needs to ensure that the SNARK actually verifies the EVM computation and that there are no vulnerabilities. The two main techniques for improving security are multi-prover setups and formal verification. Multi-prover means having multiple independently written validity proof implementations, just like having multiple clients; if a block is proven by a sufficiently large subset of these implementations, clients accept it. Formal verification involves using tools commonly used to prove mathematical theorems, such as Lean4, to prove that the validity proof accepts only correct executions of the underlying EVM specification (such as the EVM K semantics or the Python-written Ethereum Execution Layer Specification, EELS).
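The multi-prover acceptance rule is simple to state; here is a sketch (the verifier set and threshold are illustrative):

```python
def accept_block(block, proofs: dict, verifiers: dict, threshold: int) -> bool:
    """Accept a block only if enough independently implemented proof systems
    verify it, mirroring how client diversity protects consensus today."""
    passing = sum(
        1 for name, verify in verifiers.items()
        if name in proofs and verify(block, proofs[name])
    )
    return passing >= threshold
```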
Sufficiently fast verification time means that any Ethereum block can be verified in less than 4 seconds. Today, we are still far from this goal, although we are much closer than we imagined two years ago. To achieve this, we need progress in three directions:

There are challenges in implementing this kind of parallelized proving. Even in the worst case, where one very large transaction takes up the entire block, the computation cannot be split by transaction; it must be split by opcode (of the EVM, or of an underlying VM such as RISC-V). Ensuring that the VM's "memory" remains consistent between the different parts of the proof is a key implementation challenge. However, if we can achieve this kind of recursive proving, then we know that, even without any other improvements, at least the prover latency problem is solved.
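A sketch of this chunked, recursive approach; every helper here (`split_into_opcode_chunks`, `execute_to_commitment`, `prove_chunk`, `aggregate`) is a hypothetical stand-in:

```python
def prove_block(block, pre_state_commitment):
    # Split by opcode count, not by transaction: one huge transaction can
    # fill the whole block, so transaction boundaries are not usable
    chunks = split_into_opcode_chunks(block)
    proofs, state = [], pre_state_commitment
    for chunk in chunks:
        # Each chunk proof binds a commitment to the full VM state (pc, stack,
        # memory, ...) before and after, so adjacent chunks glue together
        next_state = execute_to_commitment(chunk, state)
        proofs.append(prove_chunk(chunk, state, next_state))  # parallelizable
        state = next_state
    # Recursively aggregate the chunk proofs into a single block proof
    return aggregate(proofs)
```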
In addition, the two tools mentioned in the previous section (multidimensional gas and delayed state root) can also help here. However, it is worth noting that, unlike stateless validation, using these two tools means we already have enough technology to do what we currently need, and even with these technologies, full ZK-EVM verification still requires more work—just less work than otherwise.
One point not mentioned above is prover hardware: using GPUs, FPGAs, and ASICs to generate proofs faster. Fabric Cryptography, Cysic, and Accseal are three companies making progress in this area. This is very valuable for L2, but is unlikely to be a decisive factor for L1, because there is a strong desire for L1 to remain highly decentralized, which means proof generation must be within the reasonable reach of Ethereum users and should not be bottlenecked by a single company's hardware. L2s can make more aggressive trade-offs.
There is more work to be done in these areas:
Possible trade-offs:
How does it interact with other parts of the roadmap?
The core technology required to achieve L1 EVM validity proofs is largely shared with two other areas:
Once validity proofs are successfully implemented on L1, truly easy solo staking can finally be achieved: even the weakest computers (including phones or smartwatches) can stake. This further increases the value of solving other limitations of solo staking (such as the 32ETH minimum).
In addition, L1 EVM validity proofs can greatly increase the L1 gas limit.
Validity Proofs of Consensus

What problem are we solving?
If we want to fully validate an Ethereum block with SNARK, EVM execution is not the only part we need to prove. We also need to prove consensus, i.e., the part of the system that handles deposits, withdrawals, signatures, validator balance updates, and other elements of Ethereum proof-of-stake.
Consensus is much simpler than the EVM, but it faces the challenge that there are no L2 EVM rollups forcing most of this work to happen anyway. Any implementation of proving Ethereum consensus therefore needs to be done "from scratch," although the proof systems themselves can build on shared work.
What is it and how does it work?
The beacon chain is defined as a state transition function, just like the EVM. That state transition function is dominated by three things: ECADDs (for verifying BLS signatures), pairings (also for verifying BLS signatures), and SHA256 hashes (for reading and updating state).
In each block, we need to prove 1-16 BLS12-381 ECADDs for each validator (possibly more than one, as signatures may be included in multiple sets). This can be offset by subset precomputation techniques, so we can say each validator only needs to prove one BLS12-381 ECADD. Currently, there are 30,000 validator signatures per slot. In the future, with single-slot finality, this may change in two directions: if we take the "brute force" route, the number of validators per slot may increase to 1 million. Meanwhile, if Orbit SSF is adopted, the number of validators will remain at 32,768 or even decrease to 8,192.

How BLS aggregation works: verifying the total signature only requires one ECADD per participant, not one ECMUL. But 30,000 ECADDs is still a large proof load.
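A sketch of the aggregate check using the py_ecc library. Hash-to-curve is omitted: `message_point` is assumed to be the message already hashed into G2, and public keys live in G1 as in Ethereum's scheme:

```python
from py_ecc.bls12_381 import G1, Z1, add, pairing

def verify_aggregate(pubkeys, message_point, aggregate_signature) -> bool:
    # One ECADD per participant: this loop is the bulk of the proving load
    aggregate_pubkey = Z1              # the identity point in G1
    for pk in pubkeys:
        aggregate_pubkey = add(aggregate_pubkey, pk)
    # A single (expensive) pairing equation: e(agg_pk, H(m)) == e(G1, sig)
    return pairing(message_point, aggregate_pubkey) == pairing(aggregate_signature, G1)
```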
As for pairings, there are currently up to 128 attestations per slot, meaning 128 pairings need to be verified. With EIP-7549 and further modifications, this can be reduced to 16 per slot. The number of pairings is small, but each is extremely costly: a pairing takes thousands of times longer to run (or prove) than an ECADD.
A major challenge in proving BLS12-381 operations is that there is no convenient curve with an order equal to the BLS12-381 field size, which adds considerable overhead to any proof system. On the other hand, the proposed Verkle tree for Ethereum is built with the Bandersnatch curve, making BLS12-381 itself the native curve in the SNARK system for proving Verkle branches. A relatively simple implementation can prove 100 G1 additions per second; to make proof speed fast enough, clever techniques like GKR will almost certainly be needed.
For SHA256 hashes, the worst case today is an epoch-transition block, where the entire validator short-balance tree and many validator balances are updated. Each validator's short balance is only one byte, so about 1 MB of data gets rehashed, corresponding to 32,768 SHA256 calls. If a thousand validators' balances move above or below a threshold, updating the effective balances in the validator records costs a thousand Merkle branches, so perhaps 10,000 hashes. The shuffling mechanism requires 90 bits per validator (so 11 MB of data), but this can be computed at any point during an epoch. With single-slot finality, these numbers may go up or down depending on the details: shuffling becomes unnecessary, although Orbit may restore this need to some extent.
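The hashing load sketched numerically, following the figures in the text (the per-update branch cost is a rough assumption):

```python
validators = 2**20                    # ~1 million validators
balance_bytes = validators * 1        # one "short balance" byte each -> ~1 MB
print(balance_bytes // 32)            # 32,768 SHA256 calls to rehash the tree

threshold_crossings = 1_000           # validators crossing an effective-balance threshold
hashes_per_branch = 10                # assumed ballpark cost of one Merkle-branch update
print(threshold_crossings * hashes_per_branch)  # ~10,000 extra hashes

shuffle_bits = 90 * validators        # 90 bits of shuffling data per validator
print(shuffle_bits / 8 / 2**20)       # ~11 MB, computable at leisure during the epoch
```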
Another challenge is the need to read the entire validator state, including public keys, to validate a block. For 1 million validators, just reading the public keys takes 48 million bytes, plus Merkle branches. This requires millions of hashes per epoch. If we must prove the validity of PoS, a realistic approach is some form of incrementally verifiable computation: storing, inside the proof system's memory, a separate data structure optimized for efficient lookups and for proving updates to itself.
In summary, there are many challenges. To address these challenges most effectively, a deep redesign of the beacon chain will likely be needed, which may coincide with the move to single-slot finality. Features of such a redesign may include:
How does this relate to existing research?
What else needs to be done, and what are the trade-offs?
In reality, it will take us years to obtain validity proofs for Ethereum consensus. This is roughly the same amount of time needed to achieve single-slot finality, Orbit, signature algorithm changes, and the security analysis required to have enough confidence to use "aggressive" hash functions like Poseidon. Therefore, the wisest approach is to solve these other problems and consider STARK-friendliness while doing so.
The main trade-off is likely to be in the order of operations, between a more gradual approach to reforming the Ethereum consensus layer and a more aggressive "change many things at once" approach. For the EVM, a gradual approach is reasonable because it minimizes disruption to backward compatibility. For the consensus layer, the impact on backward compatibility is smaller, and there is also value in a more "comprehensive" rethink of various details of how the beacon chain is constructed to optimize SNARK-friendliness in the best way.
How does it interact with other parts of the roadmap?
When redesigning Ethereum PoS in the long term, STARK-friendliness must be a primary consideration, especially for single-slot finality, Orbit, signature scheme changes, and signature aggregation.