Concrete v2.7: GPU Wheel, Extended Function Composition and Other Improvements

July 5, 2024
Quentin Bourgerie

We’re excited to announce that Concrete v2.7 introduces the first wheel that can accelerate computations on GPUs! In this new release, we also extend the support for function composition, and add several new features in the Python frontend for the user.

GPU acceleration

For those who have been following our GitHub repository, you know that accelerating Fully Homomorphic Encryption (FHE) on GPUs has been a challenging task. After months of dedicated effort and significant progress, we are thrilled to announce that GPU acceleration for FHE is now officially available to end-users.

Concrete aims to make FHE easy to use, so that developers can leverage its power without extensive knowledge of FHE or of GPU acceleration. To use GPU acceleration, simply install the Concrete wheel that supports GPUs and add the `use_gpu` option when compiling. You can find the GPU wheel in Zama's public PyPI. There are two repositories: one for the CPU wheels (including nightlies) and one for the GPU wheels. To install the GPU wheel, you just need to specify our GPU repository in the installation command.

pip install concrete-python --index-url https://pypi.zama.ai/gpu

Once the GPU wheel is installed, set the `use_gpu` option of the compile function to `True`. The Concrete compiler and runtime will do the rest under the hood to exploit all the available CPUs and GPUs of the host.

from concrete import fhe


@fhe.compiler({"x": "encrypted"})
def myfunction(x):
    ...


# inputset: representative sample inputs, used by the compiler to infer value ranges
myfhefunction = myfunction.compile(inputset, use_gpu=True)

It’s important to note that hosts with GPUs are not always faster than CPU-only machines. The performance gain depends on both the hardware and the workload. One key factor is the amount of parallelizable work and its granularity, so workloads heavy on linear algebra, for example, tend to benefit from GPUs.

We benchmarked this feature by running CIFAR-10 on three different systems:

  • AWS hpc7a.96xlarge with 192 CPU hardware threads
  • Hyperstack n3-H100x4 with 124 hardware threads and 4 H100 GPUs
  • Hyperstack n3-H100x8 with 252 hardware threads and 8 H100 GPUs

The execution times for this benchmark, in both exact and approximate PBS modes, are as follows:

As we can see, the GPU wheel can be up to 2.5 times faster than the CPU wheel in this use case.
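Since the gain depends on the hardware and the workload, it can be worth timing your own circuit both ways before committing to a GPU host. The sketch below is only a generic timing harness: `median_seconds`, `speedup`, `run_on_cpu` and `run_on_gpu` are hypothetical names, and the two `run_*` functions are plain-Python stand-ins for calls on circuits compiled without and with `use_gpu=True`.

```python
import statistics
import time


def median_seconds(run, repeats=5):
    """Return the median wall-clock time of calling run() `repeats` times."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(cpu_seconds, gpu_seconds):
    """How many times faster the GPU run is (values below 1 mean slower)."""
    return cpu_seconds / gpu_seconds


# Stand-ins for illustration only. In practice these would wrap calls on
# circuits compiled without and with use_gpu=True.
def run_on_cpu():
    sum(i * i for i in range(20_000))


def run_on_gpu():
    sum(i * i for i in range(10_000))


cpu_time = median_seconds(run_on_cpu)
gpu_time = median_seconds(run_on_gpu)
print(f"speedup: {speedup(cpu_time, gpu_time):.2f}x")
```

Using the median rather than the mean keeps a single slow run (for example, a warm-up iteration) from skewing the comparison.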

Enhanced function composition

Our previous release, Concrete v2.6, introduced modules, which allow for the composition of multiple functions. However, the support was quite limited, as it relied on the simplest algorithm available to select the cryptographic parameters, meaning that the same set of parameters was used for the whole module. Depending on your module, this approach could significantly impact performance.

Concrete v2.7 improves module optimization by using a partition-based approach, which allows different crypto-parameters to be used for different parts of the module. By partitioning the graph based on the precision requirements of each subpart, the optimizer can treat the partitions independently and yield tighter crypto-parameters.
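To give an intuition for what partitioning by precision means, here is a toy illustration in plain Python. It is not Concrete's optimizer: the function `partition_by_precision`, the node names and the bit widths are all hypothetical, and the rule used (connected nodes with the same bit width share a partition) is a deliberate simplification.

```python
from collections import defaultdict


def partition_by_precision(bit_widths, edges):
    """Group connected nodes that share the same bit width.

    bit_widths: dict mapping node name -> required precision in bits.
    edges: iterable of (producer, consumer) pairs.
    Returns a list of sets of node names, one per partition.
    """
    neighbours = defaultdict(set)
    for src, dst in edges:
        # Only link nodes whose precision matches; every other edge
        # becomes a boundary between two partitions.
        if bit_widths[src] == bit_widths[dst]:
            neighbours[src].add(dst)
            neighbours[dst].add(src)

    partitions, seen = [], set()
    for node in bit_widths:
        if node in seen:
            continue
        # Depth-first walk over same-precision neighbours.
        group, stack = set(), [node]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbours[current] - group)
        seen |= group
        partitions.append(group)
    return partitions


# Hypothetical four-node graph: two 4-bit operations feed two 8-bit ones.
bit_widths = {"a": 4, "b": 4, "c": 8, "d": 8}
edges = [("a", "b"), ("b", "c"), ("c", "d")]
print(partition_by_precision(bit_widths, edges))
```

In this toy graph the 4-bit nodes and the 8-bit nodes end up in separate partitions, so each group could in principle receive parameters sized for its own precision instead of the largest one in the module.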

As always, this complexity is completely handled by Concrete. If you already use modules, simply update Concrete and you will benefit from the performance boost. 

If you want even tighter crypto-parameters, Concrete v2.7 now allows you to specify the actual dependencies between the functions in your modules. This enables the compiler to apply an even more aggressive partitioning, leading to faster execution.

Below is a code snippet showing how to specify dependencies between inputs and outputs. You can find more information on the wiring API here.

from concrete import fhe
from concrete.fhe import Wired, Wire, Output, Input


@fhe.module()
class Collatz:

    @fhe.function({"x": "encrypted"})
    def collatz(x):
        y = x // 2
        z = 3 * x + 1
        is_x_odd = fhe.bits(x)[0]
        # Select z when x is odd, y otherwise: b * (z - y) + y
        ans = fhe.multivariate(lambda b, x: b * x)(is_x_odd, z - y) + y
        is_one = ans == 1
        return ans, is_one

    # Only output 0 (ans) is fed back into input 0 (x); is_one is never re-used as an input.
    composition = Wired(
        [
            Wire(Output(collatz, 0), Input(collatz, 0)),
        ]
    )
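To make the wiring easier to follow, here is a clear-text mirror of the same arithmetic in plain Python, with no encryption involved. The helper names `collatz_step` and `iterate` are hypothetical; the loop simply models what the wire declares, namely that only the first output is ever fed back as the input.

```python
def collatz_step(x):
    """Clear-text mirror of the encrypted collatz function above."""
    y = x // 2
    z = 3 * x + 1
    is_x_odd = x & 1  # lowest bit, like fhe.bits(x)[0]
    ans = is_x_odd * (z - y) + y  # z when x is odd, y otherwise
    return ans, ans == 1


def iterate(x, max_steps=100):
    """Feed output 0 back into input 0 until the flag (output 1) is set."""
    steps = 0
    is_one = x == 1
    while not is_one and steps < max_steps:
        x, is_one = collatz_step(x)
        steps += 1
    return steps


print(iterate(6))  # 6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1, prints 8
```

Because the boolean flag is only ever read by the caller and never re-enters the function, declaring the single wire tells the compiler that the second output does not need parameters compatible with the input.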

Other improvements

Finally, Concrete v2.7 includes several additional minor improvements that you can find on GitHub or in the release notes. These improvements don’t require any change in the API, so we encourage you to try the latest release and enjoy the benefits without any additional effort.

Thank you for your continued support and feedback as we strive to make FHE more accessible and efficient for everyone.
