Bytecode
At each cycle of the RISC-V virtual machine, the current instruction (as indicated by the program counter) is "fetched" from the bytecode and decoded. In Jolt, this is proven by treating the bytecode as a lookup table, and fetches as lookups. To prove the correctness of these lookups, we use the Shout lookup argument.
One distinguishing feature of the bytecode Shout instance is that we have multiple instances of the read-checking and -evaluation sumchecks. Intuitively, the bytecode serves as the "ground truth" of what's being executed, so one would expect many virtual polynomial claims to eventually lead back to the bytecode. And this holds in practice –– in the Jolt sumcheck DAG diagram, we see that there are three colors of edges (corresponding to stages 1, 2, and 3,) pointing to the bytecode read-checking node, and two colors of edges (corresponding to stages 1 and 3) pointing to the bytecode -evaluation node.
Each stage has its a unique opening point, so having in-edges of different colors implies that multiple instances of that sumcheck must be run in parallel to prove the different claims.
Read-checking
Another distinguishing feature of the bytecode Shout instance is that we treat each entry of lookup table as containing a tuple of values, rather than a single value. Intuitively, this is because each instruction in the bytecode encodes multiple pieces of information: opcode, operands, etc.
The figure below loosely depicts the relationship between bytecode and polynomial.
We start from some ELF file, compiled from the guest program. For each instruction in the ELF (raw bytes), we decode/preprocess the instruction into a structured format containing the individual witness values used in Jolt:
Then, we compute a Reed-Solomon fingerprint of some subset of the values in the tuple, depending on what claims are being proven. These fingerprints serve as the coefficients of the polynomial for that read-checking instance.
The figure above depicts the polynomial we would use to prove the read-checking instance for the Stage 2 claims, labeled as "ra/wa claims" in the DAG diagram.
As explained below, the and polynomials correspond to the registers (rs1
, rs2
, rd
) in the bytecode.
Each coefficient in is , for the registers of one particular instruction.
We can write the read-checking sumcheck expression for the Stage 2 claims as the following:
where we treat
as a single multilinear polynomial.
Observe that this sumcheck satisfies our needs for the Stage 2 claims: , , are the claims output by the registers read/write-checking sumcheck, and we can simply compute Reed-Solomon fingerprint of these claims to obtain the left-hand side (i.e. input claim) of sumcheck expression above.
Registers
Technically, the explanation above glosses over some details for the sake of exposition. To be precise, the claims output by registers read/write-checking are actually:
where .
So we define:
where :
- if the -th instruction in the bytecode has
- otherwise
and similarly for and .
Equivalently,
where (respectively, and ) is the bitvector denoting the rd
(respectively, rs1
and rs2
) for the -th instruction in the bytecode.
Instruction address
Each instruction in the bytecode has two associated "addresses":
- its index in the expanded bytecode. "Expanded" bytecode refers to the preprocessed bytecode, after instructions are expanded to their virtual sequences
- its memory address as given by ELF. All the instructions in a virtual sequence are assigned the address of the "real" instruction they were expanded from.
The former is used for the -evaluation sumcheck, described below. The latter is used to enforce program counter updates in the [R1CS constraints] (./r1cs_constraints.md), and is treated as a part of the tuple of values in the preprocessed bytecode.
The "outer" and shift sumchecks in Spartan output claims about the virtual UnexpandedPC
polynomial, which corresponds to the latter. These claims are proven using bytecode read-checking.
Flags
There are two types of Boolean flags used in Jolt:
- Circuit flags, used in R1CS constraints
- Lookup table flags, used in the instruction execution Shout
The associated flags for a given instruction in the bytecode can be computed a priori (i.e. in preprocessing), so any claims about these flags arising output by Spartan or instruction execution Shout are also proven using bytecode read-checking.
raf-evaluation
There are two -evaluation instances for bytecode, corresponding to claims arising from stages 1 and 3.
The polynomial, in the context of bytecode, is the program counter (PC). The two sumchecks that output claims about the PC are the "outer" and "shift" sumchecks used to prove Jolt's R1CS constraints.
One-hot checks
Jolt enforces that the polynomials used for bytecode Shout are one-hot, using a Booleanity and Hamming weight sumcheck as described in the paper. These implementations follow the Twist and Shout paper closely, with no notable deviations.