Introduction:
Recently, Vitalik and several scholars jointly published a paper that touches on how Tornado Cash implements its anti-money-laundering scheme (in essence, letting the withdrawer prove that their deposit belongs to a set that contains no tainted funds). However, the paper does not explain Tornado Cash’s business logic and principles in detail, leaving some readers with only a superficial understanding.
It is worth noting that privacy projects such as Tornado genuinely use the zero-knowledge property of the ZK-SNARK algorithm, whereas most solutions carrying the ZK banner, such as Rollups, rely only on the succinctness of ZK-SNARK. People often conflate Validity Proof with ZK, and Tornado is an excellent example for clarifying what ZK means in a real-world application.
The author of this article wrote a piece on the principles of Tornado in 2022 for Web3Caff Research. This article extracts and expands certain sections of that original work to provide a systematic understanding of Tornado Cash.
Original article link: https://research.web3caff.com/zh/archives/2663?ref=157

Tornado Cash is a mixing protocol built on zero-knowledge proofs. Its older version launched in 2019, and a beta of the updated version was rolled out at the end of 2021. The older version achieved a good level of decentralization: the on-chain contracts are open-source and free of multi-signature control, and the frontend code is open-source and backed up on the IPFS network. Because the older version is simpler, this article focuses on explaining it.
Tornado’s core approach is to mix many deposit and withdrawal actions together. After depositing Tokens into Tornado, a depositor presents a ZK Proof showing that they made an earlier deposit and then withdraws to a new address, which breaks the link between the deposit and withdrawal addresses.

To put it more concretely, imagine Tornado as a glass box filled with Coins deposited by many individuals. We can see who deposited Coins, but the Coins themselves are highly homogeneous. If a stranger takes a Coin out of the box, it is hard to tell who originally put that Coin there.

(Image source: RareSkills)
Such scenarios seem commonplace. When we swap for a few ETH from a Uniswap pool, we cannot tell whose ETH we received, because many liquidity providers supply the pool. The difference lies in the process: with Uniswap, a swap requires paying in another Token of equivalent value, so funds cannot be “privately” transferred. A mixer, in contrast, only requires the withdrawer to present their deposit receipt.
To make deposits and withdrawals look homogeneous, each Tornado pool keeps the deposit and withdrawal amounts identical. For instance, if a pool has 100 depositors and 100 withdrawers, all of these actions are publicly visible, yet nothing connects a particular deposit to a particular withdrawal. Everyone deposits and withdraws the same amount, making it hard to trace the movement of funds. Clearly, this is an inherent advantage for money laundering.

The key question arises: when withdrawing, how does one prove their prior deposit? The address initiating the withdrawal is not linked to any deposit address, so how does one verify the right to withdraw? The most direct method would be for the withdrawer to reveal their deposit record, but that would expose their identity. This is where zero-knowledge proofs come into play.
With a ZK Proof, a withdrawer can prove that they have a deposit record in the Tornado contract and that this deposit has not yet been withdrawn. The beauty of zero-knowledge proofs is that they preserve privacy: the public learns only that the withdrawer did make a deposit, not which depositor they are.

Proving “I’ve deposited in the Tornado pool” can be translated into “My deposit record can be found in the Tornado contract.” If Cn denotes a deposit record and Tornado’s set of deposit records is {C1, C2, …, C100, …}, Bob needs to prove that he holds the secret credentials (which act like a private key) that generated one record in this set, without revealing which Cn it is. This is where the properties of the Merkle Proof are used.
All of Tornado’s deposit records are aggregated into a Merkle Tree constructed on-chain. The tree has 2^20 leaves (over 1 million), most of which remain blank (holding an initial value). Each new deposit writes its commitment into the next leaf and then updates the tree’s root.

For example, if Bob’s deposit is the 10,000th in Tornado’s history, its associated value Cn becomes the tree’s 10,000th leaf, i.e., C10000 = Cn. The contract then automatically computes the new Root.
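As a rough illustration of how each deposit can update the Root without recomputing the whole tree, here is a minimal Python sketch of an append-only Merkle tree. It is not Tornado’s contract code: the class name, the all-zero blank leaf, and SHA-256 as the hash are assumptions made for illustration, and SHA-256 merely stands in for whatever hash the real contracts use.

```python
import hashlib


def hash_pair(left: bytes, right: bytes) -> bytes:
    """Hash two child nodes into their parent (SHA-256 is a stand-in)."""
    return hashlib.sha256(left + right).digest()


class IncrementalMerkleTree:
    """Append-only Merkle tree that keeps one node per level, not every leaf."""

    def __init__(self, depth: int, zero_leaf: bytes = b"\x00" * 32):
        self.depth = depth
        # zeros[i] is the root of a completely blank subtree of height i.
        self.zeros = [zero_leaf]
        for _ in range(depth):
            self.zeros.append(hash_pair(self.zeros[-1], self.zeros[-1]))
        # filled[i] is the most recent left-hand node seen at level i.
        self.filled = self.zeros[:depth]
        self.next_index = 0
        self.root = self.zeros[depth]

    def insert(self, leaf: bytes) -> int:
        """Place a commitment in the next blank leaf and recompute the root."""
        if self.next_index >= 2 ** self.depth:
            raise ValueError("tree is full")
        index = self.next_index
        node, position = leaf, index
        for level in range(self.depth):
            if position % 2 == 0:
                # Left child: everything to the right is still blank.
                self.filled[level] = node
                node = hash_pair(node, self.zeros[level])
            else:
                # Right child: pair with the stored left sibling.
                node = hash_pair(self.filled[level], node)
            position //= 2
        self.root = node
        self.next_index += 1
        return index
```

Each insertion touches only one node per level, so a tree with 2^20 leaves would need 20 hash evaluations per deposit in this sketch rather than a rebuild of the full tree.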

(Image source: RareSkills)
A Merkle Proof is concise and efficient. To prove that a transaction Td exists within a Merkle Tree, one only needs to provide the associated Merkle Proof, which remains compact even when the Merkle Tree is vast, because its size grows with the depth of the tree rather than with the number of leaves.

To validate that a transaction, say H3, is indeed included in the Merkle Tree, one has to show that H3 combined with other data from the Merkle Tree reproduces the Root. This data (including Td) constitutes the Merkle Proof. When Bob wants to withdraw, he needs to prove two things:
·Cn is in the Merkle Tree built on-chain by Tornado, for which he can construct a Merkle Proof containing Cn;
·Cn is related to Bob’s deposit voucher.
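The first of these two claims can be sketched in a few lines of Python. This is a generic Merkle Proof over a small tree, not Tornado’s circuit: the function names are hypothetical and SHA-256 is an assumed stand-in for the real hash.

```python
import hashlib


def hash_pair(left: bytes, right: bytes) -> bytes:
    """Hash two child nodes into their parent (SHA-256 is a stand-in)."""
    return hashlib.sha256(left + right).digest()


def build_levels(leaves):
    """Return every level of the tree, leaves first.

    The number of leaves must be a power of two.
    """
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append(
            [hash_pair(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)]
        )
    return levels


def merkle_proof(levels, index):
    """Collect the sibling hash at each level on the path from leaf to root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # the sibling differs in the lowest bit
        index //= 2
    return proof


def verify_proof(leaf, index, proof, root):
    """Recompute the root from the leaf and its siblings, then compare."""
    node = leaf
    for sibling in proof:
        if index % 2 == 0:
            node = hash_pair(node, sibling)
        else:
            node = hash_pair(sibling, node)
        index //= 2
    return node == root


# Usage: eight made-up commitments, prove that the one at index 5 is included.
leaves = [hashlib.sha256(bytes([i])).digest() for i in range(8)]
levels = build_levels(leaves)
root = levels[-1][0]
proof = merkle_proof(levels, 5)
print(len(proof), verify_proof(leaves[5], 5, proof, root))
```

Note that a plain Merkle Proof like this one reveals the leaf and its position. In Tornado the same check is carried out inside the ZK Proof, so the verifier learns only that some leaf passes it.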

Many functions are pre-implemented in the frontend code of Tornado’s user interface. When a depositor opens the Tornado Cash webpage and clicks the deposit button, the frontend code generates two random numbers, K and r, locally. It then computes Cn = Hash(K, r) and passes Cn (labeled as the commitment in the diagram below) to the Tornado contract, which incorporates it into its Merkle Tree. Simply put, K and r act like private keys. They are critical, and users should store them safely, because they are required again during withdrawal.
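The deposit-side computation can be sketched as follows. This is an illustration only, not the frontend’s actual code: the function names, the 31-byte length of K and r, and SHA-256 as the hash are all assumptions.

```python
import hashlib
import secrets


def compute_commitment(k: bytes, r: bytes) -> bytes:
    """Cn = Hash(K, r); SHA-256 over the concatenation is a stand-in hash."""
    return hashlib.sha256(k + r).digest()


def make_deposit_note():
    """Generate K and r locally and derive the commitment Cn from them.

    Only Cn would be sent to the contract; K and r stay with the user.
    """
    k = secrets.token_bytes(31)  # assumed length, for illustration only
    r = secrets.token_bytes(31)
    return k, r, compute_commitment(k, r)


k, r, commitment = make_deposit_note()
print("send to contract:", commitment.hex())
print("keep secret     :", (k + r).hex())
```

Because Cn is a hash of K and r, publishing Cn on-chain does not by itself reveal the credentials that will later be needed to withdraw.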

An “encryptedNote” is an optional feature that lets users encrypt the credentials K and r with a private key and store them on-chain so that they are not lost.
Note that K, r, and Cn are all generated off-chain and only Cn is sent to the contract, so neither the Tornado contract nor any external observer learns K and r. If K and r are exposed, it is akin to having one’s wallet private key stolen.

Upon receiving a user’s deposit and the computed Cn = Hash(K, r), the Tornado contract places Cn at the bottom level of the Merkle Tree as a new leaf node and then updates the root. Note, however, that the leaves of this Merkle Tree are not kept in the contract’s state; they are only recorded as event parameters in past blocks. The Tornado contract itself keeps only the Merkle root. During withdrawal, users prove via a Merkle Proof that their deposit record corresponds to the current Merkle root, a concept somewhat resembling withdrawals from a light-client cross-chain bridge. This is where Tornado’s ingenuity shows: to save gas, the full Merkle tree is never written to contract state, only its root, while the leaves live in historical block records, a mechanism somewhat analogous to the way Rollups save gas (though the details differ).
During withdrawal, the withdrawer enters the credentials (the random numbers K and r generated at deposit time) on the frontend webpage. The Tornado Cash frontend code uses K and r, Cn = Hash(K, r), and the Merkle Proof corresponding to Cn to generate a ZK Proof, which shows that Cn corresponds to a deposit record in the Merkle Tree and that K and r are the valid credentials for Cn. In essence, this step proves knowledge of the keys to one deposit record in the Merkle Tree. When the ZK Proof is submitted to the Tornado contract, all four of these inputs (K, r, Cn, and the Merkle Proof) stay concealed, so outsiders, including the Tornado contract itself, learn none of them, which safeguards user privacy.

An interesting detail is that the deposit operation derives Cn from two random numbers, K and r, instead of just one, because a single random number might not be secure enough and could potentially be brute-forced.
As for the symbol “A” in the illustration, it represents the address receiving the withdrawal and is supplied by the withdrawer. “nf”, meanwhile, is an identifier that prevents replay attacks. Its value is nf = Hash(K), where K is one of the two random numbers (K and r) used at deposit time to generate Cn. As a result, each Cn has exactly one corresponding nf.
Why is replay protection needed? By design, a withdrawal does not reveal which deposit in the Merkle Tree the withdrawn funds correspond to. Because the link between a deposit and a withdrawal is hidden, a malicious user could otherwise withdraw against the same deposit repeatedly and drain the pool of tokens.

Here, the nf identifier works much like the transaction counter “nonce” attached to every Ethereum address, which exists to prevent transaction replays. With each withdrawal request, the user must submit an nf. The contract checks whether this nf has been used before: if it has, the withdrawal is rejected; if not, the withdrawal proceeds and the nf is recorded, so any later attempt to reuse it will be rejected.
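The bookkeeping described above amounts to a set-membership check. A minimal Python sketch, assuming a hypothetical `MixerPool` class, SHA-256 as the hash, and a boolean `proof_ok` flag standing in for the result of real ZK Proof verification:

```python
import hashlib


def nullifier_hash(k: bytes) -> bytes:
    """nf = Hash(K); SHA-256 is a stand-in for the real hash."""
    return hashlib.sha256(k).digest()


class MixerPool:
    """Toy model of the replay check only; no funds or proofs are handled."""

    def __init__(self):
        self.spent_nullifiers = set()

    def withdraw(self, nf: bytes, proof_ok: bool) -> bool:
        """Return True if the withdrawal is accepted.

        proof_ok is a placeholder for the outcome of verifying the ZK Proof.
        """
        if nf in self.spent_nullifiers:
            return False  # this deposit has already been withdrawn
        if not proof_ok:
            return False  # rejected withdrawals do not consume the nf
        self.spent_nullifiers.add(nf)
        return True


pool = MixerPool()
nf = nullifier_hash(b"\x01" * 31)
print(pool.withdraw(nf, proof_ok=True))  # first use of this nf
print(pool.withdraw(nf, proof_ok=True))  # same nf submitted again
```

Recording a 32-byte nf per withdrawal is all the state this check needs, and the nf alone does not reveal which Cn it belongs to.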

Some might wonder: can someone fabricate an nf that the contract has not yet recorded? Not in practice. The ZK Proof must establish that nf = Hash(K) and that the random number K is tied to a deposit record Cn. An arbitrarily chosen nf matches no recorded deposit, so no valid ZK Proof can be generated for it and the withdrawal fails.
Others might ask: is there a way to do without nf? Since the withdrawer must submit a ZK Proof attesting to their connection to some Cn, wouldn’t it suffice to check whether the same ZK Proof has already been submitted on-chain? The cost of that approach would be exorbitant: the Tornado Cash contract does not permanently store previously submitted ZK Proofs, precisely to avoid wasting storage, and comparing every new ZK Proof against all earlier ones is far more resource-intensive than recording a compact identifier such as nf. Moreover, a prover can generally produce different valid proofs for the same Cn, so comparing proofs would not reliably detect a repeated withdrawal.
According to the withdrawal function’s code, the required parameters and business logic are as follows: the user submits a ZK Proof and nf (NullifierHash) = Hash(K), and designates a recipient address for the withdrawal. The ZK Proof conceals the values of Cn, K, and r, so the outside world cannot determine the user’s identity. Typically, the withdrawer specifies a clean, new recipient address to avoid revealing personal information.
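To make the split between public and hidden values concrete, the statement behind the ZK Proof can be written as an ordinary check. A real ZK-SNARK convinces the contract that this check passes without revealing the witness; the Python sketch below simply runs it in the clear. The class and function names, the SHA-256 hash, and the exact layout of the inputs are assumptions for illustration, not Tornado’s circuit.

```python
import hashlib
from dataclasses import dataclass
from typing import List


def h(*parts: bytes) -> bytes:
    """Stand-in hash: SHA-256 over the concatenated parts."""
    return hashlib.sha256(b"".join(parts)).digest()


@dataclass
class PublicInputs:
    root: bytes            # Merkle root recorded by the contract
    nullifier_hash: bytes  # nf = Hash(K)
    recipient: str         # address A; bound to the proof, unused in the check


@dataclass
class Witness:
    k: bytes               # hidden
    r: bytes               # hidden
    leaf_index: int        # hidden position of Cn in the tree
    siblings: List[bytes]  # hidden Merkle Proof for Cn


def statement_holds(pub: PublicInputs, wit: Witness) -> bool:
    """Check the relation that the ZK Proof attests to."""
    # 1. The submitted nf really is Hash(K).
    if h(wit.k) != pub.nullifier_hash:
        return False
    # 2. Cn = Hash(K, r) is a leaf of the tree with the public root.
    node, index = h(wit.k, wit.r), wit.leaf_index
    for sibling in wit.siblings:
        node = h(node, sibling) if index % 2 == 0 else h(sibling, node)
        index //= 2
    return node == pub.root


# Usage with a two-leaf tree: Bob's commitment next to someone else's.
k, r = b"\x11" * 31, b"\x22" * 31
other_leaf = h(b"another deposit")
root = h(h(k, r), other_leaf)
pub = PublicInputs(root=root, nullifier_hash=h(k), recipient="0xRecipient")
wit = Witness(k=k, r=r, leaf_index=0, siblings=[other_leaf])
print(statement_holds(pub, wit))
```

Everything in `Witness` stays on the user’s machine, while everything in `PublicInputs` is visible on-chain, which is why an observer cannot tell which Cn was spent.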

However, a minor challenge emerges: for the sake of untraceability, users often initiate the withdrawal from a freshly generated address, which holds no ETH to cover gas fees. The withdrawal therefore explicitly names a relayer that covers the gas fees, and the mixer contract deducts a portion of the user’s withdrawal to compensate the relayer.
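The payout arithmetic can be sketched as follows. The function name, the wei amounts, and the 1% fee in the usage are hypothetical; the actual fee is whatever the withdrawer and the relayer agree on.

```python
def split_withdrawal(denomination_wei: int, relayer_fee_wei: int) -> dict:
    """Split a fixed-size withdrawal between the recipient and the relayer."""
    if relayer_fee_wei < 0 or relayer_fee_wei > denomination_wei:
        raise ValueError("fee must be between 0 and the pool denomination")
    return {
        "recipient": denomination_wei - relayer_fee_wei,
        "relayer": relayer_fee_wei,
    }


ETH = 10 ** 18
# A 1 ETH withdrawal with an assumed 0.01 ETH relayer fee.
print(split_withdrawal(ETH, ETH // 100))
```

The fresh recipient address never has to hold ETH beforehand, because the relayer is paid out of the withdrawn amount itself.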

In conclusion, Tornado Cash obscures the connection between depositors and withdrawers. With a large user base, a withdrawer is like a criminal blending into a bustling crowd, which makes tracking difficult for authorities. The withdrawal process uses a ZK-SNARK whose concealed “witness” contains the pivotal information about the withdrawer, and this is arguably the mixer’s most important feature. At present, Tornado may be one of the cleverest applications of ZK.





