
Reading the Zcash Protocol Specification

Why This Chapter Exists

The protocol spec is dense, mathematical, and required reading. This chapter teaches you how to navigate it: section numbering, notation conventions, and how to translate a spec rule into a code search query.

protocol.pdf (the Zcash Protocol Specification, currently the NU6 edition with NU7 drafts circulating) is the single authoritative document for what Zcash is. A principal cryptography engineer at ZODL needs to read it fluently, not just refer to it occasionally.

This chapter is a guide to reading the PDF efficiently.

Structure of the Document

The spec is organized into:

  1. Introduction: scope, conventions, audience. Skip after one read.
  2. Concepts: high-level overview of the cryptography and chain model. Read once, then keep open as a reference.
  3. Abstract protocol: defines the protocol in math, independent of byte encoding. This is what cryptographers should read.
  4. Concrete protocol: defines specific algorithms, hash functions, and serializations. This is what implementers should read.
  5. Consensus changes for network upgrades: a section per activated NU, listing what changed and at which height.
  6. Differences from the original Zerocash protocol.
  7. Acknowledgements and References.
  8. Appendices: normative parameter tables, test vectors, hash personalizations.

The appendices contain the personal-string registry, group element encodings, test vectors, and "for-the-implementer" tables. Bookmark them; you will refer to them constantly.

Notation Conventions

The spec uses a mix of mathematical notation and pseudocode. Some conventions you must internalize:

  • types in calligraphic font: sets and groups (G, F).
  • bold lowercase: scalars in a specific field.
  • bold uppercase: group elements.
  • := for definition, = for equality, == is not used.
  • || is concatenation, and 0xN denotes hex.
  • [0..k] is the set of integers from 0 to k inclusive. Some sections use an exclusive upper bound instead; check the prose.
  • subscripts indicate domain separation or parameter binding.
  • "the bit-string s of length n" usually means little-endian unless the prose says otherwise.
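The range and bit-order conventions above are easy to get subtly wrong in code. A minimal sketch, assuming helper names modelled on the spec's integer and bit-sequence conversion functions (i2lebsp, lebs2ip, and inclusive are illustrative names here, not an API from Zebra or librustzcash):

```python
from typing import List


def i2lebsp(length: int, x: int) -> List[int]:
    """Bit sequence of `length` bits for x, least significant bit first."""
    if not 0 <= x < (1 << length):
        raise ValueError("x does not fit in the requested number of bits")
    return [(x >> i) & 1 for i in range(length)]


def lebs2ip(bits: List[int]) -> int:
    """Inverse of i2lebsp: read a little-endian bit sequence as an integer."""
    return sum(bit << i for i, bit in enumerate(bits))


def inclusive(a: int, b: int) -> range:
    """The integers from a to b with both endpoints included."""
    return range(a, b + 1)


bits = i2lebsp(8, 6)              # 6 = 0b110, so bits 1 and 2 are set
round_trip = lebs2ip(bits)        # back to the integer 6
closed = list(inclusive(0, 3))    # four elements: 0, 1, 2, 3
half_open = list(range(0, 3))     # Python's default excludes the endpoint
```

Note that Python's range is exclusive at the top, so an inclusive spec range always needs the explicit + 1.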

Hash inputs are explicitly bracketed; do not be casual about the order of concatenation. ZIP-244 bugs have come from getting the order wrong.
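To see why the order and the personal string both matter, here is a small sketch using BLAKE2b-256 from Python's standard library. The personal strings and input values are made up for illustration; they are not taken from the spec.

```python
import hashlib


def blake2b_256(person: bytes, data: bytes) -> bytes:
    """BLAKE2b with a 32-byte digest and a 16-byte personalization."""
    if len(person) != 16:
        raise ValueError("personalization must be exactly 16 bytes here")
    return hashlib.blake2b(data, digest_size=32, person=person).digest()


# Hypothetical personal strings, 16 bytes each (not from the spec).
PERSON_A = b"Example_PersonA_"
PERSON_B = b"Example_PersonB_"

left = bytes([0x01]) * 32
right = bytes([0x02]) * 32

# Same inputs, different concatenation order: unrelated digests.
order_matters = blake2b_256(PERSON_A, left + right) != blake2b_256(PERSON_A, right + left)

# Same bytes, different personal string: unrelated digests.
person_matters = blake2b_256(PERSON_A, left + right) != blake2b_256(PERSON_B, left + right)
```

Both flags come out True. An implementation that swaps two fields still produces a well-formed 32-byte value, which is why this class of bug tends to surface only against test vectors or another implementation.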

Normative vs Non-normative

Anything in the main body of the spec is normative. Footnotes that clarify or explain are non-normative. Appendices are usually normative parameter tables (test vectors, hash personalizations) but the prose around them may not be. The opening pages identify the convention.

If you find a discrepancy between the spec text and a test vector, the test vector wins by convention because it is more precise, but file a zcash/zips issue immediately. This has happened.

The Pseudocode Style

The spec's pseudocode is closer to mathematics than to a programming language. A for each loop is a set operation; a function defined in the math fonts is a deterministic function, not a procedure.

Specific gotchas:

  • conditional bit-vector operations: b ? x : y means "if b is 1 return x else y", with no notion of short-circuit evaluation.
  • ranges may be inclusive or exclusive; the prose specifies. Do not guess.
  • "an honest party computes" is a non-normative explainer; the consensus rule is whatever the validation rule says.
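The conditional gotcha above can be made concrete. A rough sketch of b ? x : y as a pure function over fixed-width integers, in which both operands are always evaluated; the function name and the 32-bit default width are assumptions for illustration:

```python
def select(b: int, x: int, y: int, width: int = 32) -> int:
    """Return x if b == 1, else y, without branching on b.

    x and y are already computed by the time this runs, so there is
    no short-circuit: neither operand is skipped.
    """
    if b not in (0, 1):
        raise ValueError("b must be a single bit")
    full = (1 << width) - 1
    mask = -b & full                 # all ones when b == 1, zero when b == 0
    return (y ^ (mask & (x ^ y))) & full


picked_x = select(1, 0xAAAA, 0x5555)
picked_y = select(0, 0xAAAA, 0x5555)
```

This shows the semantics only. Python integers are not constant-time, so it says nothing about side-channel behavior.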

The Validating-implementer's Reading Order

A first pass should go in this order:

  1. read the concepts section once.
  2. read the abstract protocol section once. Skim the parts you know; pause on commitment schemes, key derivation, value pool, note encryption.
  3. read the concrete protocol section sequentially. Compare each subsection to the corresponding directory in zebra-chain and librustzcash.
  4. for each network upgrade you are implementing or auditing, read the matching consensus-changes section.

Second pass: each time you touch a piece of code, open the corresponding spec section and read alongside.

Worked Example: ZIP-244 Transaction Id

Pick this as your first deep dive. The flow:

  1. open the spec section on transaction identifiers. The v5 txid is computed by hashing per-pool digests under a personal-string tree.
  2. open zebra-chain/src/transaction/txid.rs. Identify which function corresponds to which spec equation.
  3. open zcash_primitives (via the dependency) and find the underlying digest construction.
  4. open ZIP-244 itself. Cross-reference with the spec text.
  5. open zebra-script/src/lib.rs, find calculate_sighash, and trace the sighash variant of the same computation.

You should be able to point at any byte of a v5 transaction and say which input to which digest under which personal string it contributes to. If you cannot, you do not yet read the spec fluently. Keep at it.
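As a starting point for that exercise, here is a sketch of the top of the digest tree for a v5 transaction with no transparent, Sapling, or Orchard components. The personal strings, field order, and constants are written from memory of ZIP-244 and ZIP-225 and are assumptions to verify against those documents. The function names are illustrative and do not mirror zebra-chain or zcash_primitives.

```python
import hashlib
import struct


def blake2b_256(person: bytes, data: bytes) -> bytes:
    """BLAKE2b-256 with a 16-byte personalization."""
    if len(person) != 16:
        raise ValueError("personalization must be 16 bytes")
    return hashlib.blake2b(data, digest_size=32, person=person).digest()


def le32(x: int) -> bytes:
    """Unsigned 32-bit little-endian encoding."""
    return struct.pack("<I", x)


# Assumed constants; check each against ZIP-225 and ZIP-244.
V5_HEADER = 0x80000005         # version 5 with the overwintered bit set
V5_VERSION_GROUP_ID = 0x26A7270A
NU5_BRANCH_ID = 0xC2D6D0B4


def header_digest(branch_id: int, lock_time: int, expiry_height: int) -> bytes:
    data = (le32(V5_HEADER) + le32(V5_VERSION_GROUP_ID) + le32(branch_id)
            + le32(lock_time) + le32(expiry_height))
    return blake2b_256(b"ZTxIdHeadersHash", data)


def empty_pool_digest(person: bytes) -> bytes:
    """Digest of a pool with no components: the hash of the empty string."""
    return blake2b_256(person, b"")


def txid_digest(branch_id: int, header: bytes, transparent: bytes,
                sapling: bytes, orchard: bytes) -> bytes:
    # The root personalization binds the consensus branch ID.
    person = b"ZcashTxHash_" + le32(branch_id)
    return blake2b_256(person, header + transparent + sapling + orchard)


hd = header_digest(NU5_BRANCH_ID, lock_time=0, expiry_height=0)
t = empty_pool_digest(b"ZTxIdTranspaHash")
s = empty_pool_digest(b"ZTxIdSaplingHash")
o = empty_pool_digest(b"ZTxIdOrchardHash")
txid = txid_digest(NU5_BRANCH_ID, hd, t, s, o)
```

The sketch stops at the pool level. The per-pool digests have their own subtrees, and those subtrees are what steps 2 and 3 above ask you to trace in the code.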

Skim List Per File

A rough mapping from spec sections to Zebra source you can use as a two-way table:

Section numbers shift across spec revisions; treat this as a template, not gospel.

Tools

  • pdftotext protocol.pdf - | less for searchable text. The rendered PDF is the canonical form, but grep is faster for lookups.
  • the Zcash Spec katex web rendering (linked from zcash/zips).
  • katex-header.html in this repo root suggests the Zebra book also renders some spec-style math.
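Once the spec is available as text, turning a rule into a search query is a few lines of code. A minimal sketch, assuming the text was produced beforehand with pdftotext protocol.pdf protocol.txt; the function name and the sample text are hypothetical:

```python
import re
from typing import List, Tuple


def find_in_spec(text: str, pattern: str, context: int = 1) -> List[Tuple[int, str]]:
    """Return (1-based line number, excerpt) for each line matching pattern.

    The excerpt includes `context` lines on either side of the match.
    Matching is case-insensitive.
    """
    lines = text.splitlines()
    rx = re.compile(pattern, re.IGNORECASE)
    hits = []
    for i, line in enumerate(lines):
        if rx.search(line):
            lo = max(0, i - context)
            hi = min(len(lines), i + context + 1)
            hits.append((i + 1, "\n".join(lines[lo:hi])))
    return hits


# Made-up sample standing in for the extracted spec text.
sample = "\n".join([
    "Some heading",
    "the digest uses personalization ExamplePersonal1",
    "unrelated line",
    "another digest uses personalization ExamplePersonal2",
])

hits = find_in_spec(sample, r"personalization\s+\w+", context=0)

# Against the real text (not executed here):
#   text = open("protocol.txt", encoding="utf-8").read()
#   find_in_spec(text, r"personal")
```

The same pattern can then be reused in a code search over zebra-chain or librustzcash, which gives you both halves of the spec-to-code mapping from one query.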

Active Reading Habits

  • never read a hash construction without writing the personal string into your notes.
  • never read a circuit description without listing the public inputs and the secret witness.
  • never read a signing operation without identifying which key signs, which message is signed, and what the verifier learns.
  • when in doubt, write down the consensus rule in your own words and ask a maintainer to confirm.

See Also

  • 12-protocol-history-and-governance.md (why the spec is shaped this way).
  • 14-zips-and-ceremonies.md (the ZIP process, which feeds spec revisions).
  • 16-formalisation-opportunities.md (turning spec sections into Lean).

Spec Pointers

Exercises

  1. Pick one consensus rule in the spec and find its implementation in zebra-consensus. Cite both.
  2. Identify one piece of notation in the spec that is also used in Zebra code (e.g. for a hash function or a field operation). Confirm the names match.
  3. Find a spec section whose corresponding code lives in more than one crate. Explain in one sentence why.