
03 - Cryptography primer

1. Why this chapter exists

Chapters 04 and 05 will talk about Spend statements, value commitments, binding signatures, and Halo 2 transcripts. None of that vocabulary is usable until the reader has pinned down the underlying notation: which group is which, what a pairing is, why Pedersen commitments are homomorphic, how BLAKE2 personalisation turns a hash function into a domain-separated PRF. This chapter is the calibration step. By the end of it, you should be able to read the personalisation tag b"ZcashTxHash_" in zcash_primitives/src/transaction/txid.rs#L33-L40 and explain why every BLAKE2b call site needs one.

2. Definitions

Definition (prime field). $\mathbb{F}_p$ is the finite field of order $p$ for prime $p$; $\mathbb{F}_p^*$ is its multiplicative group of $p-1$ non-zero elements.

Definition (cyclic group, additive). $\mathbb{G}$ is a cyclic group of prime order $q$ written additively with generator $G$; $|\mathbb{G}| = q$. For an integer $k$, $[k]G$ is $G$ added to itself $k$ times.

Definition (discrete logarithm problem, DLP). Given $G, H \in \mathbb{G}$ with $H = [k]G$, find $k$. Zcash assumes DLP is hard in every group it uses.
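The two definitions above can be tried out in a toy group. The sketch below is an illustration under stated assumptions, not anything from the Zcash codebase: it models the additive group as the order-1019 subgroup of the integers modulo 2039 (so $[k]G$ becomes $g^k \bmod p$), and the names `pow_mod`, `scalar_mul` and `discrete_log` are hypothetical. The brute-force search only terminates quickly because the group is tiny; in a 255-bit group the same loop is infeasible, which is the DLP assumption.

```rust
// Toy parameters (assumption): P = 2*Q + 1 is a safe prime, so the squares
// modulo P form a subgroup of prime order Q. Real curves use ~255-bit orders.
const P: u64 = 2039;
const Q: u64 = 1019;
const G: u64 = 4; // 2^2, a generator of the order-Q subgroup

/// Square-and-multiply modular exponentiation.
fn pow_mod(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut acc = 1u64;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    acc
}

/// [k]point in additive notation, modelled as point^k mod P.
fn scalar_mul(k: u64, point: u64) -> u64 {
    pow_mod(point, k % Q, P)
}

/// Exhaustive search for k with [k]base == target.
/// Linear in the group order, so only usable for toy sizes.
fn discrete_log(base: u64, target: u64) -> Option<u64> {
    let mut acc = 1u64; // the identity, i.e. [0]base
    for k in 0..Q {
        if acc == target {
            return Some(k);
        }
        acc = acc * base % P;
    }
    None
}
```

Calling `discrete_log(G, scalar_mul(777, G))` recovers the scalar 777 after at most 1019 steps; the cost of that loop grows with the group order, which is why the real groups are chosen so large.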

Definition (pairing). A non-degenerate bilinear map $e\colon \mathbb{G}_1 \times \mathbb{G}_2 \to \mathbb{G}_T$ between three prime-order-$r$ groups, satisfying $e([a]P, [b]Q) = e(P, Q)^{ab}$ for all $a, b \in \mathbb{F}_r$, $P \in \mathbb{G}_1$, $Q \in \mathbb{G}_2$.
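A short worked consequence of bilinearity, using only the definition above: scalars can be moved freely between the two arguments.

$$
e([a]P, Q) \;=\; e(P, Q)^{a} \;=\; e(P, [a]Q), \qquad e([a]P, [b]Q) \;=\; e([ab]P, Q).
$$

So for non-identity $P$ and $Q$, given $[a]P$, $[b]Q$ and $[c]P$, the check $e([a]P, [b]Q) = e([c]P, Q)$ passes exactly when $c = ab \bmod r$. This is how a verifier can test a multiplicative relation between scalars it never sees.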

Definition (commitment scheme). An algorithm $\mathsf{Com}(m; r) \to c$ that is

  • binding: hard to find $(m_1, r_1)$ and $(m_2, r_2)$ with $m_1 \neq m_2$ and $\mathsf{Com}(m_1; r_1) = \mathsf{Com}(m_2; r_2)$;
  • hiding: $c$ reveals nothing about $m$ to an efficient adversary.

Definition (Pedersen commitment). In $\mathbb{G}$ of prime order $q$ with two generators $G, H$ such that $\log_G H$ is unknown,

$$\mathsf{Com}(m; r) \;=\; [m]G \;+\; [r]H.$$

Pedersen commitments are additively homomorphic, perfectly hiding, and computationally binding under DLP.

Definition (PRF from BLAKE2b). Sapling defines $\mathsf{PRF}^{x}_{k}(m) = \mathsf{BLAKE2b}(\text{pers}_x;\, k \mathbin{\|} m)$, where $\text{pers}_x$ is a 16-byte personalisation string fixing the PRF instance. The construction is in zcash_spec and reused by every workspace crate via the PrfExpand helper.

Definition (Fiat-Shamir transform). A reduction from an interactive 3-move protocol to a non-interactive one: replace the verifier's challenge with $H(\text{transcript})$ for a public hash $H$ in the random-oracle model.

Invariant (one personalisation per call site). Every BLAKE2b invocation in Zcash uses a unique 16-byte personalisation. The reason is cross-protocol replay: if two protocols hash similar inputs with the same BLAKE2b instance, a digest that one of them accepts could be repurposed by an attacker in the other. Distinct personalisations make the two hash functions independent, so adding a hash invocation means adding a new personalisation tag.

3. The code

3.1 Groups and fields

Zcash uses several groups. Each row corresponds to one crates.io dependency declared in Cargo.toml:

| Curve | Base field | Group order | Used for |
|---|---|---|---|
| BLS12-381 ($\mathbb{G}_1, \mathbb{G}_2$) | $\mathbb{F}_q$, $q$ 381-bit | $r$, 255-bit | Sapling Groth16 |
| Jubjub | $\mathbb{F}_r$ where $r$ is the BLS12-381 scalar-field modulus | 252-bit prime | Sapling commitments, key agreement |
| Pallas | $\mathbb{F}_p$, $p \approx 2^{254}$ (255-bit) | $q_{\text{Pallas}}$ | Orchard arithmetic |
| Vesta | $\mathbb{F}_{q_{\text{Pallas}}}$ | $p_{\text{Pallas}}$ | Orchard recursion |
| secp256k1 | Bitcoin curve | 256-bit | Transparent ECDSA |

The Pallas/Vesta pair is a 2-cycle of elliptic curves: the base field of one equals the scalar field of the other. This is essential for efficient recursive proofs (Halo); see chapter 05.
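Spelled out with $p = p_{\text{Pallas}}$ and $q = q_{\text{Pallas}}$ from the table, the 2-cycle condition reads:

$$
\#E_{\text{Pallas}}(\mathbb{F}_p) \;=\; q, \qquad \#E_{\text{Vesta}}(\mathbb{F}_q) \;=\; p.
$$

Coordinates of Pallas points are elements of $\mathbb{F}_p$, which is the scalar field of Vesta, so a circuit whose native field is $\mathbb{F}_p$ can do Pallas point arithmetic directly; the same holds with the roles of the two curves swapped.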

The Jubjub curve's base field equals BLS12-381's scalar field, which means Jubjub point arithmetic reduces to native field arithmetic inside a BLS12-381-based SNARK and is therefore cheap. Sapling uses this for in-circuit elliptic-curve operations.

Read in code: the workspace Cargo.toml pulls bls12_381, jubjub, pasta_curves, secp256k1, group, and ff from crates.io.

The Pallas / Vesta type aliases used throughout the Orchard code live in pasta_curves/src/pallas.rs and pasta_curves/src/vesta.rs.

3.2 Pairings and Groth16

BLS12-381 is a pairing-friendly curve: $\mathbb{G}_1, \mathbb{G}_2$ are specific subgroups of elliptic-curve points and $\mathbb{G}_T \subseteq \mathbb{F}_{q^{12}}^*$.

Sapling proofs are Groth16 SNARKs with a constant-size pairing check at verification:

$$e(A, B) \;\stackrel{?}{=}\; e(\alpha G_1, \beta G_2) \cdot e(C_{\text{pub}}, \gamma G_2) \cdot e(C, \delta G_2).$$

Here $(A, B, C)$ is the proof and $C_{\text{pub}}$ is the combination of verifying-key points weighted by the public inputs. You do not need to memorise this; what matters is that the verification is a constant-size pairing equation, and that the verifying key contains $\alpha G_1, \beta G_2, \gamma G_2, \delta G_2$ and a vector of $\mathbb{G}_1$ points for the public inputs. bellman::groth16::Proof is the type; zcash_proofs consumes prepared verifying keys produced once and cached.

3.3 Hash functions and PRFs

BLAKE2b / BLAKE2s. Pervasive in Zcash. Both support a personalisation string (16 bytes for BLAKE2b, 8 bytes for BLAKE2s) that acts as domain separation. The idiomatic Zcash usage is

$$H_{\text{pers}}(m) \;=\; \mathsf{BLAKE2b}\!\bigl( \text{key} = \emptyset,\; \text{personalisation} = \text{pers},\; m \bigr).$$

Personalisation tags in this codebase are short ASCII strings such as "ZcashTxHash_", "ZTxIdSaplingHash", "Zcash_ExpandSeed". The full list of TxId personalisations lives at the top of zcash_primitives/src/transaction/txid.rs.

SHA-256, RIPEMD-160. Used in the transparent layer for Bitcoin compatibility: $\mathsf{Hash160}(x) = \mathsf{RIPEMD160}(\mathsf{SHA256}(x))$ for P2PKH addresses; $\mathsf{Hash256}(x) = \mathsf{SHA256}(\mathsf{SHA256}(x))$ for some legacy contexts. Sprout circuits also use SHA-256, because the original Zerocash construction did.

Pedersen and Sinsemilla hashes. Algebraic hash functions (output is a curve point) optimised for SNARK-friendliness. Defined and motivated in chapter 04 (Pedersen) and chapter 05 (Sinsemilla).

$\mathsf{PRF}^{\text{expand}}$. The single PRF used pervasively for key derivation:

$$\mathsf{PRF}^{\text{expand}}_{\mathsf{sk}}(t) \;=\; \mathsf{BLAKE2b}\!\bigl( \text{pers} = \text{"Zcash\_ExpandSeed"},\; \mathsf{sk} \mathbin{\|} t \bigr),$$

where $t$ is a tag byte (and sometimes more bytes). Defined once in zcash_spec and reused everywhere. Grep PrfExpand in the workspace.
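The shape of this construction can be sketched without any dependency by injecting the hash as a parameter. Everything below is an assumption made for illustration: `prf_expand` is a hypothetical name, not the zcash_spec API, and the `hash` argument stands in for a personalised BLAKE2b. Only the fixed 16-byte tag and the `sk || t` input layout come from the formula above.

```rust
/// Personalisation tag for PRF^expand. The `[u8; 16]` type makes the
/// compiler reject a tag of any other length.
const EXPAND_SEED: &[u8; 16] = b"Zcash_ExpandSeed";

/// Hypothetical sketch of PRF^expand_sk(t) with an injected hash.
/// `hash(personalisation, input)` is assumed to behave like personalised
/// BLAKE2b; it is a parameter so the sketch stays std-only.
fn prf_expand<H>(hash: H, sk: &[u8], t: &[u8]) -> Vec<u8>
where
    H: Fn(&[u8; 16], &[u8]) -> Vec<u8>,
{
    // The hash input is the key followed by the domain tag bytes: sk || t.
    let mut input = Vec::with_capacity(sk.len() + t.len());
    input.extend_from_slice(sk);
    input.extend_from_slice(t);
    hash(EXPAND_SEED, &input)
}
```

Different derived keys come from different values of `t` under the same `sk`, while the personalisation keeps all of them separate from every other BLAKE2b use in the protocol.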

3.4 Commitments

Pedersen. As in Section 2. Properties:

  • Additively homomorphic: $\mathsf{Com}(m_1; r_1) + \mathsf{Com}(m_2; r_2) = \mathsf{Com}(m_1 + m_2;\, r_1 + r_2)$.
  • Perfectly hiding (the randomness completely masks the message).
  • Computationally binding under DLP.

The homomorphism is the mathematical engine behind shielded value conservation. Chapter 04 shows how it lets a transaction prove that input value equals output value without revealing the values themselves.
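A minimal sketch of the homomorphic property, assuming a toy group rather than Jubjub: the order-1019 subgroup of the integers modulo 2039, where $[m]G + [r]H$ is modelled as $g^m h^r \bmod p$. The names `commit` and `add_points` are hypothetical, and the two generators are chosen for readability; in this toy group $\log_G H$ is easy to find, so the sketch shows the algebra only and is not binding.

```rust
// Toy parameters (assumption): squares modulo the safe prime P form a
// subgroup of prime order Q.
const P: u64 = 2039;
const Q: u64 = 1019;
const G: u64 = 4; // first generator
const H: u64 = 9; // second generator; a real scheme needs log_G H unknown

/// Square-and-multiply modular exponentiation.
fn pow_mod(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut acc = 1u64;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    acc
}

/// Com(m; r) = [m]G + [r]H, written multiplicatively as G^m * H^r mod P.
fn commit(m: u64, r: u64) -> u64 {
    pow_mod(G, m % Q, P) * pow_mod(H, r % Q, P) % P
}

/// Group addition of two commitments (multiplication mod P in this model).
fn add_points(a: u64, b: u64) -> u64 {
    a * b % P
}
```

With these definitions, `add_points(commit(10, 20), commit(5, 7))` equals `commit(15, 27)`: the sum of two commitments is a commitment to the sum of the messages under the sum of the randomness.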

Pedersen hash. Generalise the commitment to many generators $G_1, \ldots, G_n$:

$$\mathsf{PedHash}(m_1, \ldots, m_n) \;=\; \sum_{i=1}^{n} [m_i] G_i.$$

Collision-resistant under DLP and much cheaper inside a SNARK than SHA-256, because the curve arithmetic is carried out over the SNARK's native field whereas SHA-256 has to be emulated with bit-level constraints. Sapling's note commitments and Merkle-tree hashes use Pedersen-hash variants.
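The formula can be sketched in the same kind of toy group (the order-1019 subgroup modulo 2039, an assumption for illustration). The four generators and the name `pedersen_hash` are hypothetical; real generators are derived so that no relation between them is known, which is not true of the small squares used here.

```rust
// Toy parameters (assumption): squares modulo the safe prime P form a
// subgroup of prime order Q.
const P: u64 = 2039;
const Q: u64 = 1019;
// One generator per message chunk. Illustrative values only: their mutual
// discrete logs are easy to find in a group this small.
const GENERATORS: [u64; 4] = [4, 9, 25, 49];

/// Square-and-multiply modular exponentiation.
fn pow_mod(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut acc = 1u64;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    acc
}

/// PedHash(m_1..m_4) = sum of [m_i]G_i, modelled as the product of
/// G_i^{m_i} mod P.
fn pedersen_hash(chunks: &[u64; 4]) -> u64 {
    GENERATORS
        .iter()
        .zip(chunks.iter())
        .fold(1u64, |acc, (g, m)| acc * pow_mod(*g, *m % Q, P) % P)
}
```

Because each chunk has its own generator, moving a value from one position to another changes the output: `pedersen_hash(&[1, 0, 0, 0])` is 4 while `pedersen_hash(&[0, 1, 0, 0])` is 9.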

Value commitments. Sapling uses

$$\mathsf{VCom}(v, r) \;=\; [v]V \;+\; [r]R \;\in\; \mathbb{G}_{\text{Jubjub}},$$

with curve-specific generators $V, R$. The crucial property is

$$\sum_{i \in \text{in}} \mathsf{VCom}(v_i, r_i) \;-\; \sum_{j \in \text{out}} \mathsf{VCom}(v_j, r_j) \;=\; [v_{\text{bal}}]V \;+\; [r_{\text{bal}}]R,$$

the binding equation: the prover proves it knows $r_{\text{bal}}$ relative to a public $v_{\text{bal}}$, completing the value-conservation proof. Knowledge of $r_{\text{bal}}$ is what the "binding signature" demonstrates.
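The binding equation can be checked numerically in a toy group. This is a sketch under stated assumptions: the group is the order-1019 subgroup modulo 2039 instead of Jubjub, the bases 4 and 9 stand in for $V$ and $R$, and all function names are hypothetical.

```rust
// Toy parameters (assumption): squares modulo the safe prime P form a
// subgroup of prime order Q.
const P: u64 = 2039;
const Q: u64 = 1019;
const V: u64 = 4; // value base (stand-in)
const R: u64 = 9; // randomness base (stand-in)

/// Square-and-multiply modular exponentiation.
fn pow_mod(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut acc = 1u64;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    acc
}

/// VCom(v, r) = [v]V + [r]R, written multiplicatively.
fn value_commit(v: u64, r: u64) -> u64 {
    pow_mod(V, v % Q, P) * pow_mod(R, r % Q, P) % P
}

/// Group negation: the modular inverse, via Fermat's little theorem.
fn negate(point: u64) -> u64 {
    pow_mod(point, P - 2, P)
}

/// Sum of the commitments to a list of (value, randomness) pairs.
fn sum_commitments(notes: &[(u64, u64)]) -> u64 {
    notes
        .iter()
        .fold(1u64, |acc, &(v, r)| acc * value_commit(v, r) % P)
}

/// Left-hand side of the binding equation: inputs minus outputs.
fn net_commitment(inputs: &[(u64, u64)], outputs: &[(u64, u64)]) -> u64 {
    sum_commitments(inputs) * negate(sum_commitments(outputs)) % P
}
```

For inputs `(50, 11)` and `(30, 22)` and a single output `(70, 5)`, the net commitment equals `value_commit(10, 28)`: the public balance is 10 and the prover's secret $r_{\text{bal}}$ is $11 + 22 - 5 = 28$.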

3.5 Signatures

ECDSA (secp256k1). Used for transparent inputs. Standard Bitcoin signatures; see the secp256k1 crate.

RedDSA / RedJubjub / RedPallas. Sapling and Orchard use RedDSA, a re-randomisable EdDSA-style signature scheme. The instantiation over Jubjub is RedJubjub (Sapling); over Pallas is RedPallas (Orchard).

A RedDSA signature key is a pair $(\mathsf{sk}, \mathsf{pk})$ with $\mathsf{pk} = [\mathsf{sk}]G$. To sign $M$:

  1. Sample $r \stackrel{\$}{\leftarrow} \mathbb{F}_q$; compute $R = [r]G$.
  2. Compute challenge $c = H(R \mathbin{\|} \mathsf{pk} \mathbin{\|} M) \in \mathbb{F}_q$.
  3. Set $s = r + c \cdot \mathsf{sk} \pmod{q}$.
  4. The signature is $(R, s)$.

Verification: $[s]G \stackrel{?}{=} R + [c]\mathsf{pk}$.
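The four signing steps and the verification equation can be sketched as follows. This is an illustration under loud assumptions, not RedJubjub: the group is the toy order-1019 subgroup modulo 2039, the challenge hash is a non-cryptographic FNV-1a stand-in for the BLAKE2b-based hash, and the nonce is passed in by the caller because std has no random number generator (real code must sample it uniformly and never reuse it). The names `sign`, `verify` and `challenge` are hypothetical.

```rust
// Toy parameters (assumption): squares modulo the safe prime P form a
// subgroup of prime order Q.
const P: u64 = 2039;
const Q: u64 = 1019;
const G: u64 = 4;

/// Square-and-multiply modular exponentiation.
fn pow_mod(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut acc = 1u64;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    acc
}

/// FNV-1a absorption step. NOT cryptographic: a stand-in for BLAKE2b.
fn fnv1a(mut state: u64, bytes: &[u8]) -> u64 {
    for b in bytes {
        state ^= u64::from(*b);
        state = state.wrapping_mul(0x0000_0100_0000_01b3);
    }
    state
}

/// Step 2: c = H(R || pk || M), reduced into the scalar field.
fn challenge(r_point: u64, pk: u64, msg: &[u8]) -> u64 {
    let mut h = 0xcbf2_9ce4_8422_2325u64;
    h = fnv1a(h, &r_point.to_le_bytes());
    h = fnv1a(h, &pk.to_le_bytes());
    h = fnv1a(h, msg);
    h % Q
}

/// Steps 1, 3 and 4: R = [r]G, s = r + c*sk mod q, signature (R, s).
fn sign(sk: u64, nonce: u64, msg: &[u8]) -> (u64, u64) {
    let pk = pow_mod(G, sk % Q, P);
    let r_point = pow_mod(G, nonce % Q, P);
    let c = challenge(r_point, pk, msg);
    let s = (nonce % Q + c * (sk % Q)) % Q;
    (r_point, s)
}

/// Checks [s]G == R + [c]pk, written multiplicatively.
fn verify(pk: u64, msg: &[u8], sig: (u64, u64)) -> bool {
    let (r_point, s) = sig;
    let c = challenge(r_point, pk, msg);
    pow_mod(G, s % Q, P) == r_point * pow_mod(pk, c, P) % P
}
```

A signature produced by `sign(123, 456, b"spend")` verifies under the public key `pow_mod(G, 123, P)`, and changing `s` by one makes verification fail.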

This is Schnorr-style; what makes it "Red" is the re-randomisation:

$$\mathsf{rk} \;=\; \mathsf{pk} \;+\; [\alpha]G, \qquad \mathsf{rsk} \;=\; \mathsf{sk} \;+\; \alpha \pmod{q}.$$

A signature under $\mathsf{rsk}$ verifies under $\mathsf{rk}$. The randomiser $\alpha$ is uniform per spend, so $\mathsf{rk}$ is unlinkable to the underlying $\mathsf{pk}$. Sapling spend authorisation uses this: the spend description publishes $\mathsf{rk}$; the spender signs under $\mathsf{rsk}$; a Spend Authorisation Signature is included in the description.
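The claim that a signature under the randomised signing key verifies under the randomised public key follows in two lines from the definitions above, with the challenge computed over the randomised key, $c = H(R \mathbin{\|} \mathsf{rk} \mathbin{\|} M)$:

$$
\begin{aligned}
[\mathsf{rsk}]G &= [\mathsf{sk} + \alpha]G = [\mathsf{sk}]G + [\alpha]G = \mathsf{pk} + [\alpha]G = \mathsf{rk}, \\
[s]G &= [r + c \cdot \mathsf{rsk}]G = [r]G + [c]\bigl([\mathsf{rsk}]G\bigr) = R + [c]\,\mathsf{rk}.
\end{aligned}
$$

The first line shows that the randomised pair is itself a valid key pair; the second is then the ordinary verification equation applied to that pair.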

Binding signature. A signature whose verification key is computed from the value commitments themselves. The combined value commitment

$$\sum \mathsf{cv}_{\text{in}} \;-\; \sum \mathsf{cv}_{\text{out}} \;-\; [v_{\text{bal}}]V$$

should equal $[r_{\text{bal}}]R$ for some $r_{\text{bal}}$ known only to the spender. The spender publishes a signature whose verification key is exactly that point, using $R$ as the group generator. Verifying the signature proves the prover knew $r_{\text{bal}}$; hence values balance.
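To see why the signature implies balance, expand the verification key using the value-commitment form $[v]V + [r]R$ for each commitment. The symbol $\mathsf{bvk}$ is shorthand introduced here for that key.

$$
\begin{aligned}
\mathsf{bvk} &= \sum_{i \in \text{in}} \mathsf{cv}_i \;-\; \sum_{j \in \text{out}} \mathsf{cv}_j \;-\; [v_{\text{bal}}]V \\
&= \Bigl[\sum_{i} v_i - \sum_{j} v_j - v_{\text{bal}}\Bigr]V \;+\; \Bigl[\sum_{i} r_i - \sum_{j} r_j\Bigr]R.
\end{aligned}
$$

If the values balance, the coefficient of $V$ is zero and $\mathsf{bvk} = [r_{\text{bal}}]R$ with $r_{\text{bal}} = \sum_i r_i - \sum_j r_j$, which the spender knows. If they do not balance, producing a valid signature with base $R$ would require knowing $\log_R \mathsf{bvk}$, and with it a discrete-log relation between $V$ and $R$, which is assumed to be infeasible.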

Read in code: redjubjub (used by Sapling) and reddsa (Orchard).

3.6 Key agreement

In a group of prime order $q$ with generator $G$:

$$\text{Alice}: \quad a \stackrel{\$}{\leftarrow} \mathbb{F}_q^*, \quad A = [a]G,$$

$$\text{Bob}: \quad b \stackrel{\$}{\leftarrow} \mathbb{F}_q^*, \quad B = [b]G,$$

then $[a]B = [b]A = [ab]G$ is the shared secret. Both parties feed it to a key-derivation function $\mathsf{KDF}$ to get a symmetric key.

Sapling and Orchard both use ECDH on Jubjub / Pallas for note encryption (chapter 08), with $G$ being a per-recipient diversifier generator $g_d$ rather than a fixed generator. This is part of how diversified addresses work.
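A minimal sketch of the exchange with a caller-chosen base, so the base can play the role of a diversifier generator. The assumptions are the usual ones for this chapter's toy examples: the group is the order-1019 subgroup modulo 2039 rather than Jubjub or Pallas, the secrets are fixed numbers instead of uniform samples, and `public_key` and `shared_secret` are hypothetical names. The key-derivation step is left out.

```rust
// Toy parameters (assumption): squares modulo the safe prime P form a
// subgroup of prime order Q.
const P: u64 = 2039;
const Q: u64 = 1019;

/// Square-and-multiply modular exponentiation.
fn pow_mod(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut acc = 1u64;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    acc
}

/// [secret]base: the public value sent to the other party.
/// `base` may be a fixed generator or a per-recipient one.
fn public_key(secret: u64, base: u64) -> u64 {
    pow_mod(base, secret % Q, P)
}

/// [own_secret]peer_public: both sides arrive at the same point.
fn shared_secret(own_secret: u64, peer_public: u64) -> u64 {
    pow_mod(peer_public, own_secret % Q, P)
}
```

With base 25 and secrets 321 and 654, `shared_secret(321, public_key(654, 25))` and `shared_secret(654, public_key(321, 25))` are the same group element, which both parties would then pass to the KDF.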

3.7 Symmetric primitives

Note encryption uses ChaCha20-Poly1305, an authenticated encryption scheme (AEAD) built from a stream cipher: $\mathsf{Enc}_k(n, m) \to c$ where $n$ is a 12-byte nonce and the output includes a 16-byte tag. The Zcash spec always uses $n = 0$ because each key is single-use. The AEAD discipline still applies: never reuse $(k, n)$; authenticate any associated data the protocol defines (Zcash note encryption uses none); always check the tag before using the plaintext. The dependency is declared in zcash_primitives/Cargo.toml.

3.8 Zero-knowledge proofs

Zcash uses two families of NIZK arguments:

  • Groth16 (Sapling, Sprout): preprocessing SNARK, constant proof size ($2 \times \mathbb{G}_1 + 1 \times \mathbb{G}_2 = 192$ bytes in compressed form on BLS12-381), constant verification cost (a single pairing-product equation). Requires a per-circuit trusted setup, performed in a multi-party computation ceremony ("Powers of Tau" plus a circuit-specific phase). The proving key is many megabytes; the verifying key is a few kilobytes.
  • Halo 2 (Orchard): a PLONK-derived argument with a polynomial commitment based on the Inner Product Argument (IPA). No trusted setup: the public parameters are curve generators derived transparently, so anyone can regenerate and check them. It uses a custom arithmetisation (custom gates, lookups, permutations) tuned for the Pallas/Vesta cycle.

The interface as seen from librustzcash is, in both cases:

$$\mathsf{Prover}(\text{circuit}, \text{public inputs } x, \text{witness } w) \to \pi,$$

$$\mathsf{Verifier}(\text{vk}, x, \pi) \to \{0, 1\}.$$

The witness includes secret values such as note values, randomness, the spending key, and the Merkle path. The public input includes the anchor, the value commitment, the nullifier, $\mathsf{rk}$, and the output commitment.

For Sapling the verifying-key hashes are bundled with the binaries in zcash_proofs/src/lib.rs.

The proving keys are downloaded via download-params.

3.9 Fiat-Shamir and personalisation

Many protocols are stated as interactive: prover sends commitment, verifier sends challenge, prover sends response. The Fiat-Shamir transform replaces the verifier's challenge with a hash of the prover's messages (and prior context), producing a non-interactive protocol in the random-oracle model. It is everywhere in Zcash:

  • The RedDSA challenge $c = H(R \mathbin{\|} \mathsf{pk} \mathbin{\|} M)$.
  • The IPA challenges inside Halo 2.
  • Sighash for transparent inputs (a generalised Fiat-Shamir).

Whenever you see let chal = blake2b(transcript), that is a Fiat-Shamir challenge.

BLAKE2b personalisations are 16 bytes (BLAKE2s personalisations are 8); a shorter tag is padded with zero bytes. Examples seen in this codebase:

  • "Zcash_ExpandSeed": $\mathsf{PRF}^{\text{expand}}$.
  • "Zcash_nf": Sapling nullifier PRF (BLAKE2s, hence 8 bytes).
  • "ZTxIdSaplingHash": sighash sub-tree.
  • "Zcash_RedJubjubH": RedJubjub challenge hash.

If you ever add a new hash usage, define a new personalisation. Reusing an existing one is a bug.
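The byte-level side of this rule can be sketched with std alone. Both helpers are hypothetical names, not functions from librustzcash. The second one assumes, as ZIP 244 describes for the transaction ID digest, that the 12-byte "ZcashTxHash_" prefix is completed to 16 bytes by the 4-byte little-endian consensus branch ID; the branch ID in the usage note is an arbitrary placeholder.

```rust
/// Pad an ASCII tag with zero bytes to the 16-byte BLAKE2b personalisation
/// width. Returns None if the tag is already too long to fit.
fn pad_personalisation(tag: &[u8]) -> Option<[u8; 16]> {
    if tag.len() > 16 {
        return None;
    }
    let mut out = [0u8; 16];
    out[..tag.len()].copy_from_slice(tag);
    Some(out)
}

/// Compose a 16-byte personalisation from the 12-byte prefix and a
/// 4-byte little-endian suffix (assumed here to be the branch ID).
fn txid_personalisation(branch_id: u32) -> [u8; 16] {
    let mut out = [0u8; 16];
    out[..12].copy_from_slice(b"ZcashTxHash_");
    out[12..].copy_from_slice(&branch_id.to_le_bytes());
    out
}
```

For example, `txid_personalisation(0x0102_0304)` ends in the bytes `04 03 02 01`, so two different branch IDs give two different personalisations and therefore two independent hash functions.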

4. Failure modes

A contributor who confuses the primitives in this chapter produces errors that pass unit tests but break interoperability:

  • Field confusion. Jubjub's base field equals BLS12-381's scalar field, but its scalar field does not. Pallas and Vesta swap base and scalar. Calling Fr::from_bytes on an $\mathbb{F}_p$ representation looks plausible and compiles, but produces wrong curve points downstream.
  • Endianness drift. Zcash standardises on little-endian for most field serializations, but a handful of legacy Bitcoin-derived contexts use big-endian. The ZIPs spell out the order. Mixing the two has caused real production bugs across multiple wallets.
  • Pedersen-window off-by-one. Each Pedersen window has its own generator, derived deterministically from a hash of an index. Reusing a generator across windows breaks collision resistance.
  • Personalisation reuse. As stated above, every new BLAKE2b call site must add a new 16-byte personalisation. Two recent changes to the protocol added new sighash sub-trees, each with its own tag; do the same.
  • Nonce / randomness reuse. Every RedDSA signature, every note randomness, every diversifier randomness must be sampled uniformly and independently. The Sprout counterfeiting CVE (chapter 12) is the canonical example of what a flaw at this layer can cost.
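The first two failure modes come down to the fact that a byte string carries neither its field nor its byte order. A minimal sketch, assuming 8-byte toy encodings instead of the real 32-byte ones, with hypothetical function names:

```rust
/// Interpret 8 bytes as a little-endian integer.
fn decode_le(bytes: [u8; 8]) -> u64 {
    u64::from_le_bytes(bytes)
}

/// Interpret the same 8 bytes as a big-endian integer.
fn decode_be(bytes: [u8; 8]) -> u64 {
    u64::from_be_bytes(bytes)
}

/// Decode a little-endian field element, rejecting non-canonical values
/// (those not below the modulus of the field the caller has in mind).
fn decode_canonical_le(bytes: [u8; 8], modulus: u64) -> Option<u64> {
    let value = u64::from_le_bytes(bytes);
    if value < modulus {
        Some(value)
    } else {
        None
    }
}
```

The bytes `[1, 0, 0, 0, 0, 0, 0, 0]` decode to 1 as little-endian and to `1 << 56` as big-endian. Likewise the encoding of 1500 is canonical for modulus 2039 but rejected for modulus 1019; when both moduli are close in size, most encodings pass both checks, which is why decoding into the wrong field can go unnoticed.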

5. Spec pointers

  • Zcash Protocol Specification, sections 5.4 and 5.6: the full table of personalisations and the precise PRF constructions used by Sapling and Orchard.
  • ZIP 32: the hierarchical-deterministic key derivation tree that all $\mathsf{PRF}^{\text{expand}}$ invocations sit inside.
  • Groth, 2016: the original Groth16 paper. Read sections 1 and 3 to understand the pairing equation cited above.
  • Halo 2 book: the canonical reference for the Halo 2 proof system used by Orchard. Chapter 05 cites specific sections.
  • BLAKE2 RFC 7693: the RFC specification of BLAKE2b and BLAKE2s. The personalisation parameter Zcash relies on is detailed in the original BLAKE2 specification document, which the RFC references.

6. Exercises

  1. Trace a personalisation. Search the workspace for the byte string b"Zcash_ExpandSeed". List every call site and, for each, identify the input t it passes.
  2. Verify a Pedersen identity. In a scratch test, sample $m_1, m_2, r_1, r_2$ uniformly, compute $c_1 = \mathsf{Com}(m_1; r_1)$ and $c_2 = \mathsf{Com}(m_2; r_2)$, and confirm in code that $c_1 + c_2 = \mathsf{Com}(m_1 + m_2;\, r_1 + r_2)$ holds in jubjub::SubgroupPoint. Add it as a unit test under zcash_primitives (do not commit; this is a scratch exercise).
  3. Add a new BLAKE2b call. Pretend you need a new BLAKE2b hash under the personalisation "OnboardingPrim" followed by two space characters (16 bytes in total). Write the helper as fn h_onboarding(input: &[u8]) -> [u8; 32] in a throwaway file. Run cargo check. Then delete the helper and the file before moving on; the exercise is to convince yourself that the personalisation is a 16-byte string parameter you can add anywhere, not a magic constant.

Answers in the code

7. Further reading

  • chapter 16: the windowed encoding for Pedersen hashes, in-circuit cost, generator derivation.
  • chapter 17: the polynomial-commitment layer underneath Halo 2.
  • Boneh, Drijvers, Neven, Compact Multi-Signatures for Smaller Blockchains, 2018: background on Schnorr-style signatures and their re-randomisation properties, the foundation of RedDSA.