TFHE-rs v0.6: Zero-Knowledge Support and Signed Integer Operations on GPU
TFHE-rs v0.6 introduces Zero-Knowledge Proofs, a cryptographic technique that complements FHE. This version also extends GPU support to signed integer operations and adds further cryptographic features, such as the generation of encrypted randomness.
Zero-Knowledge Proof for Compact Public Key encryption
In addition to the standard private-key setting, TFHE-rs now encompasses the public-key scheme described in Marc Joye's work. Because this scheme allows anyone to encrypt a message, it is essential in some cases to prove that the encryption was performed correctly. The latest version of TFHE-rs can generate a Zero-Knowledge Proof that a public-key encryption was performed correctly. The proof reveals nothing about the encrypted message except its already known range. This technique is derived from Benoit Libert's work.
Deploying this feature is straightforward: the client generates the proof at encryption time, and the server verifies it before proceeding with homomorphic computations.
Encrypting and proving an FheUint64 takes 6.9 seconds on a Dell XPS 15 9500, simulating a client machine. Verification on an AWS hpc7a.96xlarge instance then completes in just 123 milliseconds, using the mode where verification is cheaper.
A second mode makes verification more expensive in exchange for a cheaper proof: in this setting, proof generation takes only 2.5 seconds on the same laptop, and verification takes 467 milliseconds on the same AWS instance.
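To make the trade-off between the two modes concrete, the short Rust sketch below only does arithmetic on the timings quoted above. The `Mode` enum and both helper functions are hypothetical names introduced for this illustration and are not part of the TFHE-rs API. The sketch also assumes that proof and verification times simply add up (ignoring network transfer) and that a server verifies proofs one after another.

```rust
/// Hypothetical labels for the two proof modes described above.
/// These are NOT TFHE-rs types; they exist only for this sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode {
    /// Heavier proof, cheaper verification (6.9 s / 123 ms in the text).
    CheapVerification,
    /// Cheaper proof, heavier verification (2.5 s / 467 ms in the text).
    CheapProof,
}

impl Mode {
    /// Client-side time to encrypt and prove one FheUint64, in milliseconds.
    fn prove_ms(self) -> u64 {
        match self {
            Mode::CheapVerification => 6_900,
            Mode::CheapProof => 2_500,
        }
    }

    /// Server-side time to verify one proof, in milliseconds.
    fn verify_ms(self) -> u64 {
        match self {
            Mode::CheapVerification => 123,
            Mode::CheapProof => 467,
        }
    }
}

/// Proof time plus verification time for one ciphertext
/// (assumption: no network or queueing delay).
fn end_to_end_ms(mode: Mode) -> u64 {
    mode.prove_ms() + mode.verify_ms()
}

/// Whole proofs a server can verify per minute when it checks them
/// back to back (assumption: no batching, no overlap).
fn verifications_per_minute(mode: Mode) -> u64 {
    60_000 / mode.verify_ms()
}

fn main() {
    for mode in [Mode::CheapVerification, Mode::CheapProof] {
        println!(
            "{:?}: {} ms end to end, {} verifications per minute",
            mode,
            end_to_end_ms(mode),
            verifications_per_minute(mode)
        );
    }
}
```

Under these assumptions, the cheap-verification mode suits a server that checks many proofs, while the cheap-proof mode shortens the wait on the client.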
Enhanced GPU support
This release introduces support for signed integer operations on GPU, as well as:
- unsigned and signed scalar multiplication,
- unsigned and signed encrypted shift and rotate,
- unsigned overflowing subtraction.
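As a plaintext reference for the operations listed above, the sketch below shows the results one would expect after decryption. It assumes the encrypted operations follow the wrapping semantics of Rust's native integers, which this announcement does not spell out. It uses only the Rust standard library on 8-bit clear values and calls no TFHE-rs function. The helper names are hypothetical.

```rust
/// Scalar multiplication of an unsigned value, wrapping on overflow.
fn scalar_mul_unsigned(a: u8, scalar: u8) -> u8 {
    a.wrapping_mul(scalar)
}

/// Scalar multiplication of a signed value, wrapping on overflow.
fn scalar_mul_signed(a: i8, scalar: i8) -> i8 {
    a.wrapping_mul(scalar)
}

/// Right shift of a signed value: an arithmetic shift that keeps the sign.
/// Assumption: the shift amount is reduced modulo the bit width.
fn shift_right_signed(a: i8, amount: u32) -> i8 {
    a.wrapping_shr(amount)
}

/// Left rotation: bits leaving on the left re-enter on the right.
fn rotate_left_unsigned(a: u8, amount: u32) -> u8 {
    a.rotate_left(amount)
}

/// Subtraction that also reports whether the result wrapped around.
fn overflowing_sub_unsigned(a: u8, b: u8) -> (u8, bool) {
    a.overflowing_sub(b)
}

fn main() {
    println!("200 * 3 (u8)      = {}", scalar_mul_unsigned(200, 3));
    println!("-7 * 3 (i8)       = {}", scalar_mul_signed(-7, 3));
    println!("-8 >> 1 (i8)      = {}", shift_right_signed(-8, 1));
    println!("rotl(0b1000_0001) = {:#010b}", rotate_left_unsigned(0b1000_0001, 1));
    println!("3 - 5 (u8)        = {:?}", overflowing_sub_unsigned(3, 5));
}
```

In the encrypted setting, the overflow flag returned by the subtraction would itself be a ciphertext rather than a clear `bool`.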
Cross-language support is now possible thanks to the new C API that wraps integer arithmetic on GPU.
This release also brings performance improvements: multi-bit PBS (a.k.a. multithreaded PBS) support has been stabilized and is now recommended for GPU users, as it is significantly faster than the classical PBS. This PBS algorithm exposes more parallelism, which is why it performs better on GPU than on CPU.
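The extra parallelism can be pictured with a simplified counting model. The sketch assumes that the multi-bit variant handles the bits of the LWE secret key in groups of `group` bits, so that the sequential loop of the blind rotation shrinks while each step contains more independent work. The grouping factor, the LWE dimension of 900, and both function names are illustrative assumptions. They are not figures or APIs taken from this release.

```rust
/// Number of sequential blind-rotation steps when key bits are processed
/// `group` at a time. `group == 1` models the classical PBS.
fn sequential_steps(lwe_dimension: usize, group: usize) -> usize {
    assert!(group >= 1, "the grouping factor must be at least 1");
    // Ceiling division: a final, partially filled group still costs one step.
    (lwe_dimension + group - 1) / group
}

/// Independent terms that can be computed in parallel inside one step
/// (assumption: one term per non-empty subset of the grouped key bits).
fn parallel_terms_per_step(group: usize) -> usize {
    (1usize << group) - 1
}

fn main() {
    let lwe_dimension = 900; // illustrative value, not a TFHE-rs parameter
    for group in [1, 2, 3, 4] {
        println!(
            "group = {}: {} sequential steps, {} parallel terms per step",
            group,
            sequential_steps(lwe_dimension, group),
            parallel_terms_per_step(group)
        );
    }
}
```

In this model, a grouping factor of 3 cuts the sequential chain to a third of its length while offering seven independent terms per step. Hardware with many parallel threads, such as a GPU, can absorb that kind of workload well.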
Additionally, H100 GPUs have become increasingly easy and cheap to access with the rise of LLM training and inference, and offer much more compute throughput than the V100 GPUs targeted previously. H100 support has been enhanced in TFHE-rs v0.6, and these GPUs are now targeted in the reference benchmark results, summarized in Table 1.
On a single H100, the GPU performance is now very close to the performance of the high-end CPU used as a reference.
Miscellaneous
The latest version of TFHE-rs also includes new operations, new noise distributions and some other enhancements:
- Support of leading/trailing zeros/ones and log2;
- Implementation of checked division, returning an encrypted flag indicating whether the divisor is equal to 0;
- Improvement of multiplication speed by 8%, now running in 366 ms for 64-bit integers;
- Introduction of a counter to track the number of PBS executions;
- Support for the TUniform noise distribution has been added.
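The bit-counting operations and the checked division from the list above can be illustrated on clear values with the Rust standard library. This is a plaintext model only. The helper names are hypothetical, the flag polarity (true when the divisor is zero) is an assumption, and the quotient returned on division by zero is a placeholder chosen for this sketch rather than the value TFHE-rs produces.

```rust
/// Returns (leading zeros, trailing zeros, leading ones, trailing ones, log2)
/// for a non-zero 16-bit value. The base-2 logarithm is rounded down.
fn bit_summary(x: u16) -> (u32, u32, u32, u32, u32) {
    assert!(x != 0, "log2 is undefined for 0");
    (
        x.leading_zeros(),
        x.trailing_zeros(),
        x.leading_ones(),
        x.trailing_ones(),
        x.ilog2(),
    )
}

/// Plaintext model of a checked division: returns the quotient together
/// with a flag that is true when the divisor is zero.
fn checked_div_model(dividend: u64, divisor: u64) -> (u64, bool) {
    match dividend.checked_div(divisor) {
        Some(quotient) => (quotient, false),
        // Placeholder quotient for this sketch; only the flag is meaningful.
        None => (u64::MAX, true),
    }
}

fn main() {
    let x: u16 = 0b0000_0110_1000_0111;
    println!("bit summary of {:#018b}: {:?}", x, bit_summary(x));
    println!("42 / 5 -> {:?}", checked_div_model(42, 5));
    println!("42 / 0 -> {:?}", checked_div_model(42, 0));
}
```

In the homomorphic setting both the quotient and the flag stay encrypted, so the server cannot branch on the flag. The client inspects it after decryption.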
For the forthcoming release, the focus will shift to reducing the size of ciphertexts and introducing support for multi-GPU computations to further enhance performance.
Additional links
- Star the TFHE-rs GitHub repository to endorse our work.
- Review the TFHE-rs documentation.
- Get support on our community channels.