Introducing a fully CUDA-optimized implementation of the non-interactive Sumcheck protocol for arbitrary functions over multilinear polynomials.

What is ICICLE
ICICLE is our software library built to empower cryptographers in implementing advanced algorithms and protocols — starting with Zero-Knowledge Proofs (ZKPs) — with exceptional performance and ease of use. It marks a major step toward hardware-agnostic cryptographic solutions, ensuring seamless compatibility across diverse hardware platforms with zero switching costs.
Highlights: What’s New in V3.5
The key highlight of V3.5 is the new Sumcheck API. Below, we dive into the details of how to use this interface and the Program engine, which lets the user generate a Sumcheck proof for any statement expressed as an arbitrary function of multilinear extensions (MLEs).
We also implemented proof-of-work in this version to accelerate grinding in FRI-like protocols. Shoutout to Eylon Yogev and Giacomo Fenzi for requesting this feature! Additionally, we added the Poseidon2 sponge function and made the following bug fixes:
- get_device_count() now returns the actual device count instead of 0. Thanks to Zircuit for reporting this!
- The vecops Rust wrapper now supports batch sizes greater than 1.
- Fixed host-math inv2() by disabling the no-aliasing optimization.
For further information, see the full release notes.
Program — Lambda Functions
The Program class lets users define expressions on vector elements, which ICICLE compiles into a fused implementation for the backends. This approach mitigates memory bottlenecks while allowing users to customize algorithms like Sumcheck. Program supports only element-wise lambda functions and currently includes two predefined programs: A(X) * B(X) - C(X) and eq(X) * (A(X) * B(X) - C(X)), with more to come in future updates. The Rust API currently supports only these predefined functions, with full API coverage planned for upcoming releases.
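To make the idea of a fused element-wise program concrete, here is a minimal sketch in plain C++. It does not use the ICICLE Program API: the names run_fused and ab_minus_c, and the toy prime field with modulus 2^31 - 1, are assumptions chosen for illustration only. The point it shows is that the whole expression is evaluated in one pass over the inputs, without materializing an intermediate vector for A * B.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using u64 = std::uint64_t;
constexpr u64 P = 2147483647ULL;  // toy prime 2^31 - 1 (assumption, not an ICICLE field)

// Field helpers; inputs are assumed to be already reduced mod P.
inline u64 f_sub(u64 a, u64 b) { return (a + P - b) % P; }
inline u64 f_mul(u64 a, u64 b) { return (a * b) % P; }

// Hypothetical stand-in for a "program": applies a user lambda to every index
// of three input vectors in a single pass, with no intermediate vectors.
template <typename F>
std::vector<u64> run_fused(const std::vector<u64>& a, const std::vector<u64>& b,
                           const std::vector<u64>& c, F program) {
  assert(a.size() == b.size() && b.size() == c.size());
  std::vector<u64> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) out[i] = program(a[i], b[i], c[i]);
  return out;
}

// The expression A * B - C written as an element-wise lambda.
inline std::vector<u64> ab_minus_c(const std::vector<u64>& a, const std::vector<u64>& b,
                                   const std::vector<u64>& c) {
  return run_fused(a, b, c, [](u64 x, u64 y, u64 z) { return f_sub(f_mul(x, y), z); });
}
```

Swapping in a different lambda changes the expression without touching the loop, which is roughly the flexibility the Program class is described as offering.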
Program Use Case: Sumcheck API
Sumcheck is a fundamental protocol with broad applications in ZKPs. Its most common use case is in R1CS proof systems such as Spartan, where it efficiently verifies constraints of the form eq(X) * (A(X) * B(X) - C(X)), encapsulating degree-two relations. A year ago, we started our journey with a research paper in which we developed parallelizable algorithms for arbitrary product Sumcheck. We then released an Arkworks-based proof of concept that lets users prove Sumcheck for arbitrary functions of MLEs.
Applications such as the Jolt VM use Sumcheck with arbitrary products and linear combinations of these products. Notably, in Jolt’s LASSO lookup arguments, the primary Sumcheck function depends on the lookup table structure defined by the user’s application.
Another key application is HyperPlonk, which allows users to define custom gates for high-degree constraints and leverage Sumcheck for efficient proof generation.
With the Program class, ICICLE users can now generate Sumcheck proofs for arbitrary functions, enabling scalable and flexible proof generation across various cryptographic protocols.
Note: This is the first time Fiat-Shamir is being run in ICICLE. As a result, both Sumcheck CUDA and Sumcheck CPU are currently undergoing an audit. If you’re considering incorporating Sumcheck into a production system, we’d be happy to share our design documents and audit report (which will be made public in the future).

Full Example
In this section, we demonstrate the Sumcheck C++ API.
First, we define the parameters for the polynomials, such as their size and number. Specifically, we use the function eq(X) * (A(X) * B(X) - C(X)), which involves four polynomials. In this example, these polynomials are generated randomly.

In this step, we compute the claimed sum, that is, the sum of eq(X) * (A(X) * B(X) - C(X)) over all points of the Boolean hypercube:
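As a rough illustration of what this sum is, the following plain C++ sketch adds up eq * (A * B - C) over the evaluation tables of the four polynomials. It is independent of ICICLE and its backends: the function name claimed_sum and the toy prime field with modulus 2^31 - 1 are assumptions made for this sketch, and each table is assumed to hold the polynomial's values on the Boolean hypercube, already reduced mod P.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using u64 = std::uint64_t;
constexpr u64 P = 2147483647ULL;  // toy prime 2^31 - 1 (assumption, not an ICICLE field)

inline u64 f_add(u64 a, u64 b) { return (a + b) % P; }
inline u64 f_sub(u64 a, u64 b) { return (a + P - b) % P; }
inline u64 f_mul(u64 a, u64 b) { return (a * b) % P; }

// Sum of eq[i] * (a[i] * b[i] - c[i]) over every hypercube point i.
// Each vector is the evaluation table of one multilinear polynomial.
inline u64 claimed_sum(const std::vector<u64>& eq, const std::vector<u64>& a,
                       const std::vector<u64>& b, const std::vector<u64>& c) {
  assert(eq.size() == a.size() && a.size() == b.size() && b.size() == c.size());
  u64 sum = 0;
  for (std::size_t i = 0; i < eq.size(); ++i) {
    sum = f_add(sum, f_mul(eq[i], f_sub(f_mul(a[i], b[i]), c[i])));
  }
  return sum;
}
```

For one-variable tables eq = {1, 2}, A = {3, 4}, B = {5, 6}, C = {7, 8}, the sum is 1 * (15 - 7) + 2 * (24 - 8) = 40.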

In this example, the calculation is performed using the CPU backend, but you can also use the GPU backend to generate the Sumcheck proof:

Next, we create the transcript configuration with default values. The transcript is essential for the Fiat-Shamir scheme, as both the Prover and Verifier use it to ensure consistency.
We then set up the Sumcheck Prover, which uses the function eq(X) * (A(X) * B(X) - C(X)). Since this function is commonly used, it has a predefined type, allowing it to run faster.
Finally, we create the Sumcheck configuration object and an empty proof object to store the proof generated by the Prover.

At this stage, the proof is generated. The Sumcheck Prover produces the proof, which is then stored in the sumcheck_proof object. This proof will later be sent to the Verifier.
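For readers who want a feel for what the Prover does in each round, here is a minimal textbook-style sketch in plain C++. It is not ICICLE's implementation or API: the names round_poly and fold, the toy prime field with modulus 2^31 - 1, and the convention that the first variable is the most significant index bit are all assumptions. In each round the Prover sends the round polynomial g as its values at t = 0, 1, 2, 3 (the function has degree 3 in each variable), then fixes the first variable to the challenge r.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using u64 = std::uint64_t;
using Table = std::vector<u64>;   // evaluations on the hypercube, reduced mod P
constexpr u64 P = 2147483647ULL;  // toy prime 2^31 - 1 (assumption, not an ICICLE field)

inline u64 f_add(u64 a, u64 b) { return (a + b) % P; }
inline u64 f_sub(u64 a, u64 b) { return (a + P - b) % P; }
inline u64 f_mul(u64 a, u64 b) { return (a * b) % P; }

// The combine function eq * (a * b - c).
inline u64 combine(u64 eq, u64 a, u64 b, u64 c) { return f_mul(eq, f_sub(f_mul(a, b), c)); }

// Value of the table with the first variable set to t and the rest fixed to index i.
inline u64 at_first_var(const Table& v, std::size_t i, u64 t) {
  const std::size_t half = v.size() / 2;
  return f_add(v[i], f_mul(t, f_sub(v[i + half], v[i])));
}

// Round polynomial g, returned as its evaluations g(0), g(1), g(2), g(3).
inline std::array<u64, 4> round_poly(const Table& eq, const Table& a,
                                     const Table& b, const Table& c) {
  assert(eq.size() >= 2 && eq.size() % 2 == 0);
  assert(a.size() == eq.size() && b.size() == eq.size() && c.size() == eq.size());
  std::array<u64, 4> g{};  // zero-initialized accumulators
  const std::size_t half = eq.size() / 2;
  for (u64 t = 0; t < 4; ++t) {
    for (std::size_t i = 0; i < half; ++i) {
      g[t] = f_add(g[t], combine(at_first_var(eq, i, t), at_first_var(a, i, t),
                                 at_first_var(b, i, t), at_first_var(c, i, t)));
    }
  }
  return g;
}

// Fix the first variable to the challenge r, halving the table.
inline Table fold(const Table& v, u64 r) {
  const std::size_t half = v.size() / 2;
  Table out(half);
  for (std::size_t i = 0; i < half; ++i) out[i] = at_first_var(v, i, r % P);
  return out;
}
```

The invariant to look for is that g(0) + g(1) equals the current claim, and that after folding with r the next round's g(0) + g(1) equals the previous g(r).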

Once the proof is ready, we proceed to the Verifier side. Here, a new Sumcheck object is created for verification. We also define a boolean variable to store the verification result.

This line of code executes the verification process. After execution, the verification_pass variable is set to true if the proof is valid, or false otherwise.
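Conceptually, the check the Verifier runs can be sketched as follows in plain C++. This is a textbook-style illustration, not ICICLE's verify call: the names verify_rounds and eval_from_points, the toy prime field with modulus 2^31 - 1, and the representation of each round polynomial by its values at t = 0, 1, 2, 3 are assumptions. For simplicity the challenges and the final evaluation of the function at the challenge point are passed in as arguments, whereas a real non-interactive verifier derives the challenges from the transcript.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using u64 = std::uint64_t;
using RoundPoly = std::array<u64, 4>;  // values g(0), g(1), g(2), g(3), reduced mod P
constexpr u64 P = 2147483647ULL;       // toy prime 2^31 - 1 (assumption, not an ICICLE field)

inline u64 f_add(u64 a, u64 b) { return (a + b) % P; }
inline u64 f_sub(u64 a, u64 b) { return (a + P - b) % P; }
inline u64 f_mul(u64 a, u64 b) { return (a * b) % P; }

// Modular inverse via Fermat's little theorem: a^(P-2) mod P.
inline u64 f_inv(u64 a) {
  assert(a % P != 0);
  u64 result = 1, base = a % P, e = P - 2;
  while (e > 0) {
    if (e & 1) result = f_mul(result, base);
    base = f_mul(base, base);
    e >>= 1;
  }
  return result;
}

// Lagrange interpolation: evaluate at r the cubic given by its values at 0..3.
inline u64 eval_from_points(const RoundPoly& g, u64 r) {
  r %= P;
  u64 acc = 0;
  for (u64 j = 0; j < 4; ++j) {
    u64 num = 1, den = 1;
    for (u64 k = 0; k < 4; ++k) {
      if (k == j) continue;
      num = f_mul(num, f_sub(r, k));
      den = f_mul(den, f_sub(j, k));
    }
    acc = f_add(acc, f_mul(g[j], f_mul(num, f_inv(den))));
  }
  return acc;
}

// Each round must satisfy g(0) + g(1) == current claim; the claim then becomes g(r).
// The last claim must match the function's evaluation at the challenge point.
inline bool verify_rounds(const std::vector<RoundPoly>& rounds, const std::vector<u64>& challenges,
                          u64 claimed_sum, u64 final_eval) {
  if (rounds.size() != challenges.size()) return false;
  u64 claim = claimed_sum % P;
  for (std::size_t i = 0; i < rounds.size(); ++i) {
    if (f_add(rounds[i][0], rounds[i][1]) != claim) return false;
    claim = eval_from_points(rounds[i], challenges[i]);
  }
  return claim == final_eval % P;
}
```

A proof with a wrong claimed sum fails the first round check, and a proof whose last claim disagrees with the final evaluation fails at the end.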

It’s important to note that the transcript used by both the Prover and Verifier must be identical to ensure the Fiat-Shamir scheme produces consistent results on both sides.
Our Fiat-Shamir implementation closely follows the Merlin library implementation. Specifically, we append relevant metadata to the Prover’s messages, and challenges are generated using the strong Fiat-Shamir transformation.
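The reason both sides need identical transcripts can be seen in a small sketch. The code below is a toy in plain C++ and is neither Merlin nor ICICLE's transcript: the ToyTranscript type, its labels, and the FNV-1a mixing step are assumptions for illustration, and FNV-1a is not cryptographically secure, so this must not be used in a real protocol. It only shows the pattern of absorbing labeled messages and squeezing a challenge from the accumulated state.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

using u64 = std::uint64_t;
constexpr u64 P = 2147483647ULL;  // toy prime 2^31 - 1 (assumption, not an ICICLE field)

// Toy transcript. FNV-1a is used only to keep the sketch self-contained;
// it is NOT a secure hash and stands in for a real sponge or hash function.
struct ToyTranscript {
  u64 state = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis

  void absorb_byte(unsigned char byte) {
    state ^= byte;
    state *= 1099511628211ULL;  // FNV-1a 64-bit prime
  }

  void absorb_u64(u64 v) {
    for (int i = 0; i < 8; ++i) absorb_byte(static_cast<unsigned char>(v >> (8 * i)));
  }

  // Absorb a message together with its label and the label length (metadata).
  void absorb(const std::string& label, u64 value) {
    assert(!label.empty());
    absorb_u64(label.size());
    for (unsigned char ch : label) absorb_byte(ch);
    absorb_u64(value);
  }

  // Derive a field element from everything absorbed so far.
  u64 challenge(const std::string& label) {
    absorb(label, 0);
    return state % P;
  }
};
```

If the Prover and the Verifier absorb the same labeled messages in the same order, they derive the same challenge; changing any absorbed message changes the state, so the two sides would diverge.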
Future Work and Upcoming Releases
In V3.5, the Sumcheck API supports only large fields. Since large fields have seen greater adoption than small fields, we have prioritized their support. Small fields will be added in a future release.
As mentioned above, while the C++ Sumcheck API is fully featured, the Rust implementation currently supports only predefined functions, and Golang support is not available. Full Rust and Go support will be provided in upcoming releases.
ICICLE V3.6 will introduce a new backend: Metal support! This will enable all ICICLE code to run efficiently on Apple Silicon. If you’d like to experiment with the next version, send us an email at hi@ingonyama.com.