Precompile: Frontend and backend for building circuits by dreamATD · Pull Request #799 · scroll-tech/ceno

dreamATD · 2024-12-31T12:53:37Z

This PR implements an expression-based, Plonkish-like GKR IOP protocol. The circuit is represented as a Chip, which holds all the information needed to process the commit phases and the GKR proving phase. The current implementation assumes two commit phases. To process the GKR phase, we extract a GKRCircuit from the chip and run the GKR protocol. As for implementation status, the GKR phase is ready for review, while the commit phases have not been finalized.

Defining a GKR IOP protocol for a chip involves defining build_commit_phase, build_commit_phase2 and build_gkr_phase. In particular, build_gkr_phase builds the GKR layers in reverse order. Besides specifying the expressions, a layer often needs either to pass an evaluation from an input of a succeeding layer to an output of the current layer, or to perform some computation on it before feeding it to the current layer. To simplify both cases, we use an evaluation tape to store the evaluations and EvalExpression to define the computation. Each layer input is assigned a position in the evaluation tape. EvalExpression is defined as follows:

#[derive(Clone, Debug)]
pub enum EvalExpression {
    Single(usize),
    Linear(usize, Constant, Constant),
    Partition(Vec<Box<EvalExpression>>, Vec<(usize, Constant)>),
}

Each variant describes how to compute the output evaluations. For more details, please refer to gkr_iop/src/evaluation.rs (https://github.com/scroll-tech/ceno/blob/tianyi/refactor-prover/gkr_iop/src/evaluation.rs).
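To make the evaluation tape idea concrete, here is a minimal sketch, not the code from gkr_iop/src/evaluation.rs. It assumes that `Constant` can be modelled as a plain `i64`, that `Linear(pos, a, b)` means `tape[pos] * a + b`, and it omits `Partition` entirely. The names `TapeExpr`, `evaluate` and `demo` are hypothetical.

```rust
/// Hypothetical stand-in for the crate's `Constant` type (assumption: a plain integer).
type Constant = i64;

/// Simplified mirror of `EvalExpression`; the `Partition` variant is omitted.
#[derive(Clone, Debug)]
enum TapeExpr {
    /// Copy the evaluation stored at the given tape position.
    Single(usize),
    /// Assumed meaning: `tape[pos] * scale + offset`.
    Linear(usize, Constant, Constant),
}

impl TapeExpr {
    /// Compute one output evaluation from the values currently on the tape.
    fn evaluate(&self, tape: &[i64]) -> i64 {
        match *self {
            TapeExpr::Single(pos) => tape[pos],
            TapeExpr::Linear(pos, scale, offset) => tape[pos] * scale + offset,
        }
    }
}

/// Evaluate two output expressions against a tape holding two layer-input evaluations.
fn demo() -> Vec<i64> {
    // Positions 0 and 1 are the tape slots assigned to the layer inputs.
    let tape = vec![5, 11];
    let outputs = [TapeExpr::Single(1), TapeExpr::Linear(0, 3, 2)];
    outputs.iter().map(|e| e.evaluate(&tape)).collect()
}
```

In this sketch, `Single` covers the "pass an evaluation through unchanged" case and `Linear` covers the "compute something before feeding the layer" case described above.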

Here are some subsequent tasks:

- [ ] Parallelize the vector evaluations under subprotocols/src/expression/.
- [ ] Devirgo migration.
- [ ] Benchmarks.
- [ ] Keccak example and benchmarks.

Although the tasks above still need to be done, I suggest starting the first round of review now. I would like to see comments from @naure and @hero78119 so that I can adjust the design before moving forward.

Upd: The design doc: https://hackmd.io/@sphere-liu/HyLR-h2L1g.

Suggestions for #799 Feel free to pick and choose from the suggestions. I talk about most of them on your PR. --------- Co-authored-by: dreamATD <tianyi.liu.08@gmail.com>

naure

First pass on gkr_iop. It makes sense so far.

hero78119

Awesome job!
I left a few comments in separate sections: because the PR is large, I did the review in several sittings.

Most of the work on reusing utility code can be done later. I think the most important point is to try one precompile (e.g. keccak-f) first and benchmark its preliminary performance. Once it meets the requirements, we can proceed to more engineering polish :)

hero78119 · 2025-03-05T02:46:00Z

related to #191

- Remove buffers and replace the underlying util functions.
- Add comments and fix some tiny bugs
- Suggestions for 'Frontend and backend for building circuits' (#801)
- Refine according to comments
- refine the protocol prover and verifier structs
- Add more comments
- Tiny fix according to the latest comments.

Closes issue #632.

IO is named `debug_println` in guest program debug builds, assuming the guest program has no `println!` use case. In debug builds, we extend the stack address range slightly to cover 256k reserved for IO. This extra reserved space is also reflected in the linker script, so writes to this region do not trigger complaints from either the ELF or the RISC-V emulator.

In addition, this PR fixes an earlier problem where meaningful symbols in the bss/sbss sections were skipped because their values are 0. We need to reserve and pad memory to cover them, since they may be static variables that are zero-initialized or uninitialized. Without this, the emulator also complains that the region is not writable.

- clean up the previous workaround in the guest program for IO
- extend the stack address for the IO consistency check during debug builds
- refactor `load_elf` to fix the bss/sbss padding issue
- the e2e command also shows the IO result
- respect the profile when compiling guest program examples

A guest program with IO:

```bash
cargo run --release --features sanity-check --package ceno_zkvm --bin e2e -- --platform=ceno --hints=10 --public-io=4191 examples/target/riscv32im-ceno-zkvm-elf/release/examples/ceno_rt_io
cargo run --features sanity-check --package ceno_zkvm --bin e2e -- --platform=ceno --hints=10 --public-io=4191 examples/target/riscv32im-ceno-zkvm-elf/debug/examples/ceno_rt_io
```

Closes #936.

### Design rationales

- introduce `VirtualPolynomialsBuilder` to lift a witness of "ArcPoly" type into an expression container, so that it can take part in calculations in the expression domain
- apply `VirtualPolynomialsBuilder` in the tower prover
- keep scalars in the base field where possible by introducing an `Either<Base, Ext>` type
- reserve room in the design for the "eq" degree -1 optimisation
  > this part of the work has not been done yet and is left as future work :)

`VirtualPolynomialsBuilder` is more of a utility for the ceno main sumcheck flow. For the GKR layer circuit in gkr-iop #799, the expression system will be applied directly to the chip builder, skipping `VirtualPolynomialsBuilder`.

### benchmark

There is no impact on the e2e benchmark before/after this change, which is expected.

2^20

```
fibonacci_max_steps_1048576/prove_fibonacci/fibonacci_max_steps_1048576
                        time:   [2.3583 s 2.3709 s 2.3848 s]
                        change: [-1.8405% -1.0740% -0.2480%] (p = 0.03 < 0.05)
                        Change within noise threshold.
```

2^21

```
fibonacci_max_steps_2097152/prove_fibonacci/fibonacci_max_steps_2097152
                        time:   [4.4650 s 4.4758 s 4.4867 s]
                        change: [-0.6673% -0.3122% +0.0493%] (p = 0.13 > 0.05)
                        No change in performance detected.
```

2^22

```
fibonacci_max_steps_4194304/prove_fibonacci/fibonacci_max_steps_4194304
                        time:   [9.0115 s 9.0574 s 9.1011 s]
                        change: [-1.0658% -0.3407% +0.3803%] (p = 0.40 > 0.05)
                        No change in performance detected.
```
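The "keep scalars in the base field" rationale can be illustrated with a toy sketch. This is not the ceno implementation: the enum `Scalar`, the function `mul`, the modulus `P = 97` and the quadratic extension `F_p[X]/(X^2 - 5)` are all assumptions chosen so the arithmetic is easy to follow. The point is only that a product stays in the cheap base representation unless one operand is already an extension element.

```rust
/// Toy prime modulus, for illustration only.
const P: u64 = 97;
/// 5 is a quadratic non-residue mod 97, so X^2 - 5 defines a degree-2 extension.
const W: u64 = 5;

/// Hypothetical scalar that is either a base-field or an extension-field element.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Scalar {
    /// Element of F_p.
    Base(u64),
    /// Element c0 + c1 * X of F_p[X]/(X^2 - W).
    Ext([u64; 2]),
}

/// Multiply two scalars, promoting to the extension only when necessary.
fn mul(a: Scalar, b: Scalar) -> Scalar {
    use Scalar::*;
    match (a, b) {
        // Base * Base stays in the base field: one multiplication.
        (Base(x), Base(y)) => Base(x * y % P),
        // Base * Ext scales each coefficient: two multiplications.
        (Base(x), Ext([c0, c1])) | (Ext([c0, c1]), Base(x)) => {
            Ext([x * c0 % P, x * c1 % P])
        }
        // Ext * Ext is the full extension product, using X^2 = W.
        (Ext([a0, a1]), Ext([b0, b1])) => Ext([
            (a0 * b0 + W * (a1 * b1 % P)) % P,
            (a0 * b1 + a1 * b0) % P,
        ]),
    }
}
```

With this shape, code that mostly handles base-field constants avoids paying for extension-field multiplication on every operation.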

sync up #799 with master

### Change scope

- [x] unify `Expression` with ceno
- [x] unify sumcheck with ceno
- [ ] WIP GKR witness generation, take bit benchmark as example

---------

Co-authored-by: Zhang Zhuo <mycinbrin@gmail.com>

```
RUST_LOG=info JEMALLOC_SYS_WITH_MALLOC_CONF=retain:true,metadata_thp:always,thp:always,dirty_decay_ms:-1,muzzy_decay_ms:-1,abort_conf:true cargo run --features jemalloc --package gkr_iop --bin lookup_keccak
```

> this only covers the prover flow, not the verifier flow yet

benchmark command

```
JEMALLOC_SYS_WITH_MALLOC_CONF=retain:true,metadata_thp:always,thp:always,dirty_decay_ms:-1,muzzy_decay_ms:-1,abort_conf:true cargo bench -p gkr_iop --features jemalloc --bench lookup_keccakf
```

Benchmark results on an AMD EPYC 32-core machine

| Version                | Throughput (keccak/s)  |
|------------------------|------------------------|
| Ceno Keccak version    | 4215                   |
| Plonky3 + Baby Bear    | 1188.47                |
| Plonky3 + Goldilocks   | 683.05                 |
| Ceno (textbook gkr)    | 128                    |

---------

Co-authored-by: Zhang Zhuo <mycinbrin@gmail.com>

hero78119

amazing work with many inspiring new designs 👍 !!

### Change

This PR syncs with ceno master and rolls back part of the changes to ensure the ceno main-flow benchmark is not affected.

### benchmark against master

| Benchmark                        | Median Time (s)  | Median Change (%)                   |
|----------------------------------|------------------|-------------------------------------|
| fibonacci_max_steps_1048576      | 2.1283           | +2.0905% (Change within noise)      |
| fibonacci_max_steps_2097152      | 3.6231           | +0.9229% (No change in performance) |
| fibonacci_max_steps_4194304      | 6.4747           | -0.1104% (No change in performance) |

---------

Co-authored-by: Zhang Zhuo <mycinbrin@gmail.com>
Co-authored-by: xkx <xiakunxian130@gmail.com>
Co-authored-by: Akase Haruka <lightsing@users.noreply.github.com>

This PR builds on top of #799 with one extra commit, 48ded1a, which introduces backend expressions and caches them in the constraint system. This aligns the design with precompiles, so the next refactoring step, introducing precompile chips into the main flow, becomes easier. The main sumcheck read/write lookup expressions were simplified, and the post `evaluate()` step was removed.

### Expression

Expressions are simplified into 2 kinds: frontend and backend expressions.

- frontend expression: expression with Witin/StructuralWitin/Fixed, in recursive/nested style
- backend expression: expression with Witin only, in monomial style.

After circuit setup, the contents of both kinds of expression are fully known and frozen. At runtime, we take a backend expression and evaluate its scalars with the "challenge/instance" values; the resulting expression can then be put into sumcheck.

### benchmark

The nice thing is that there is no performance difference before/after the change.

| Benchmark                        | Median Time (s)  | Median Change (%)                            |
|----------------------------------|------------------|----------------------------------------------|
| fibonacci_max_steps_1048576      | 2.0641           | -0.9869% (No change in performance detected) |
| fibonacci_max_steps_2097152      | 3.5514           | -1.0748% (Change within noise threshold)     |
| fibonacci_max_steps_1048576      | 2.0641           | -0.9869% (No change in performance detected) |
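The difference between the nested "frontend" style and the monomial "backend" style can be sketched as follows. This is an illustrative toy, not ceno's `Expression` type: `Expr`, `Monomials` and `to_monomials` are hypothetical names, witnesses are identified by a bare `usize`, and scalars are plain `i64` constants rather than values resolved from challenges or instances at runtime.

```rust
use std::collections::BTreeMap;

/// Hypothetical nested ("frontend") expression over witness ids and constants.
#[derive(Clone, Debug)]
enum Expr {
    Const(i64),
    Wit(usize),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Monomial ("backend") form: sorted list of witness ids -> scalar coefficient.
type Monomials = BTreeMap<Vec<usize>, i64>;

/// Flatten a nested expression into a sum of monomials.
fn to_monomials(e: &Expr) -> Monomials {
    match e {
        // A constant is the empty product of witnesses.
        Expr::Const(c) => BTreeMap::from([(vec![], *c)]),
        Expr::Wit(id) => BTreeMap::from([(vec![*id], 1)]),
        Expr::Add(a, b) => {
            // Merge terms, summing coefficients of identical monomials.
            let mut out = to_monomials(a);
            for (vars, c) in to_monomials(b) {
                *out.entry(vars).or_insert(0) += c;
            }
            out
        }
        Expr::Mul(a, b) => {
            // Distribute: every term on the left times every term on the right.
            let (lhs, rhs) = (to_monomials(a), to_monomials(b));
            let mut out = Monomials::new();
            for (va, ca) in &lhs {
                for (vb, cb) in &rhs {
                    let mut vars = [va.as_slice(), vb.as_slice()].concat();
                    vars.sort_unstable();
                    *out.entry(vars).or_insert(0) += ca * cb;
                }
            }
            out
        }
    }
}
```

For example, the nested expression `(w0 + 2) * w1` flattens to the two monomials `1 * w0 * w1` and `2 * w1`. Because the flattening depends only on the circuit, it could be computed once at setup and cached, which is the role the description above gives to backend expressions.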

--------- Co-authored-by: Mihai <mihai.calancea@gmail.com> Co-authored-by: mcalancea <mihai@inversed.tech> Co-authored-by: Sphere L <sph6r6.l1u@gmail.com> Co-authored-by: Ming <hero78119@gmail.com> Co-authored-by: Zhang Zhuo <mycinbrin@gmail.com> Co-authored-by: xkx <xiakunxian130@gmail.com> Co-authored-by: Akase Haruka <lightsing@users.noreply.github.com>

dreamATD requested review from hero78119 and naure January 1, 2025 02:37

dreamATD force-pushed the tianyi/refactor-prover branch 2 times, most recently from dc664af to 929eddf Compare January 1, 2025 03:46

kunxian-xia self-requested a review January 2, 2025 08:50

matthiasgoergens self-requested a review January 2, 2025 08:52

matthiasgoergens mentioned this pull request Jan 2, 2025

Suggestions for 'Frontend and backend for building circuits' #801

Merged

dreamATD linked an issue Jan 6, 2025 that may be closed by this pull request

frontend design for precompiles #591

Closed

naure reviewed Jan 6, 2025

Comment thread gkr_iop/src/chip.rs Outdated

Comment thread gkr_iop/src/chip/builder.rs Outdated

Comment thread gkr_iop/examples/multi_layer_logup.rs Outdated

Comment thread gkr_iop/src/chip/protocol.rs Outdated

dreamATD commented Jan 7, 2025

Comment thread gkr_iop/src/lib.rs Outdated

dreamATD force-pushed the tianyi/refactor-prover branch from 16b57f3 to 29061f1 Compare January 9, 2025 11:30

dreamATD force-pushed the tianyi/refactor-prover branch from cffdd03 to d51562b Compare January 15, 2025 01:20

hero78119 reviewed Jan 15, 2025

Comment thread gkr_iop/examples/multi_layer_logup.rs Outdated

Comment thread gkr_iop/examples/multi_layer_logup.rs Outdated

Comment thread gkr_iop/src/chip.rs Outdated

hero78119 reviewed Jan 15, 2025

Comment thread gkr_iop/src/lib.rs Outdated

Comment thread multilinear_extensions/src/virtual_poly.rs Outdated

Comment thread subprotocols/src/expression.rs Outdated

Comment thread subprotocols/src/expression/evaluate.rs Outdated

hero78119 reviewed Jan 15, 2025

Comment thread subprotocols/src/expression.rs Outdated

dreamATD force-pushed the tianyi/refactor-prover branch 2 times, most recently from 7762988 to 87d1a30 Compare January 15, 2025 13:32

dreamATD force-pushed the tianyi/refactor-prover branch from 87d1a30 to 88f9b00 Compare February 19, 2025 06:54

mcalancea mentioned this pull request Mar 4, 2025

[Circuit] Naive keccak_f prototype (v1) #847

Closed

hero78119 mentioned this pull request Mar 5, 2025

Sumcheck optimizations #191

Closed

2 tasks

hero78119 mentioned this pull request Mar 5, 2025

"Some Improvements for the PIOP for ZeroCheck" paper follow up #537

Closed

dreamATD force-pushed the tianyi/refactor-prover branch from 88f9b00 to eb4c9cb Compare March 22, 2025 11:50

dreamATD force-pushed the tianyi/refactor-prover branch from eb4c9cb to ba053c2 Compare March 22, 2025 11:53

add keccak_f precompiles & utilities

5e18d9a

hero78119 and others added 2 commits May 7, 2025 08:19

Merge branch 'tianyi/refactor-prover' into tianyi/keccak-opt

6d8dd1c

hero78119 mentioned this pull request May 8, 2025

support build virtual polynomials in expression style #937

Merged

hero78119 added 2 commits May 9, 2025 17:02

merge with master

c22db17

log_state work in keccak guest program

c3086a4

lispc changed the title ~~Frontend and backend for building circuits~~ Precompile: Frontend and backend for building circuits May 13, 2025

hero78119 and others added 6 commits May 16, 2025 14:20

merge with master

7527048

Refactor keccak circuit to be a single layer

7d8afae

Merge pull request #941 from hero78119/tianyi/refactor-prover

fab82b4

sync up #799 with master

tmp

109b043

Refactor keccak circuit to be a single layer

1339fc0

Merge branch 'tianyi/keccak-opt' into tianyi/refactor-prover

b38ac9d

dreamATD force-pushed the tianyi/refactor-prover branch from 834e0d6 to b38ac9d Compare May 19, 2025 22:34

Update .gitignore

24a43b4

hero78119 mentioned this pull request May 30, 2025

[Experiment] new crates subprotocols #960

Draft

hero78119 and others added 2 commits June 3, 2025 17:58

WIP refactor #799 (#952)

009a61f

hero78119 approved these changes Jun 5, 2025

hero78119 mentioned this pull request Jun 5, 2025

generalize main sumcheck as one gkr layer #964

Merged

hero78119 and others added 2 commits June 7, 2025 03:57

merge with master

5368ef9

hero78119 enabled auto-merge June 6, 2025 21:09

hero78119 added this pull request to the merge queue Jun 6, 2025

Merged via the queue into master with commit 4160d42 Jun 6, 2025
4 checks passed

hero78119 deleted the tianyi/refactor-prover branch June 6, 2025 21:32

hero78119 merged 39 commits into master from tianyi/refactor-prover


6 participants
