Evaluating recurrent neural networks over encrypted data using NumPy and Concrete
Fully Homomorphic Encryption (FHE) is a cryptographic technique that allows you to compute on ciphertexts (encrypted messages) without ever decrypting them. FHE programming is notoriously hard, which is why we created an experimental compiler that converts a classical NumPy program into an FHE circuit that can then be run using the Concrete FHE library.
Homomorphic NumPy
The Homomorphic NumPy (HNP) library allows you to convert functions operating on NumPy multidimensional arrays into a homomorphic equivalent: you write your computation in plain NumPy, and HNP takes care of the conversion.
In this article and in the accompanying video, we will showcase two examples using HNP. We will start with logistic regression to get familiar with the process of converting NumPy functions, then move on to a more involved use case: recurrent neural networks.
Disclaimer: HNP is an experimental tool that will not be developed further, as we are working on a new compiler with better performance and reliability. We still wanted to show how far FHE has gotten and enable the community to experiment while we work on the stable release.
Installing HNP
Before we can get into the examples, we need to install HNP using the Zama docker image. The container comes with the necessary libraries preinstalled, including Jupyter, so you can start experimenting right away.
You’re ready to go!
Homomorphic Logistic Regression
Let’s start with a simple example of how to use HNP: performing inference using a logistic regression model.
The first step is to import the libraries and define the inference function. This is straightforward since we assume the model is already trained.
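As an illustrative sketch (the weights and bias below are stand-ins for a trained model's parameters, not real values), the inference function might look like this:

```python
import numpy as np

# Stand-in parameters for an already-trained model (illustrative values only).
weights = np.array([0.35, -0.12, 0.87, 0.42])
bias = -0.23

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def predict(x):
    # Plain NumPy inference: a dot product followed by a sigmoid.
    return sigmoid(np.dot(x, weights) + bias)
```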
To compile our NumPy function into its homomorphic equivalent, we need to provide some information about the inputs, namely, the shape of the multi-dimensional array, and its bounds. The bounds are the range in which the values of the input array fall. It’s important to note that these bounds should only take into account the input, and not any computation that might occur later on (this will be taken care of by the compiler).
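As a rough sketch, assuming the HNP package is imported as hnumpy and exposes a compile_fhe function and an encrypted_ndarray input descriptor (these names are assumptions from memory; check the documentation bundled with the docker image), the compile call could look like:

```python
import hnumpy as hnp  # module and function names below are assumptions;
                      # refer to the HNP docs shipped with the docker image

h = hnp.compile_fhe(
    predict,
    {
        # Only the shape and value range of the *input* are declared here;
        # bounds of intermediate computations are inferred by the compiler.
        "x": hnp.encrypted_ndarray(bounds=(-1, 1), shape=(4,)),
    },
)
```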
FHE is currently limited in terms of precision, which means bounds have to be as tight as possible. Here, we will generate some random data between -1 and 1 and check that it will run correctly when encrypted, using the simulate method. The result can differ slightly between the simulation and the original NumPy computation, but as long as the difference doesn't exceed h.expected_precision(), the result is considered valid.
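A sketch of that check (simulate and expected_precision are the calls mentioned above; the exact tolerance comparison is an assumption):

```python
# Random test input within the declared bounds.
x = np.random.uniform(-1, 1, size=(4,))

simulated = h.simulate(x)   # simulated FHE evaluation (no encryption)
reference = predict(x)      # plain NumPy reference

# The simulation is considered valid as long as it stays within the
# precision the compiler expects for this circuit.
assert np.all(np.abs(simulated - reference) <= h.expected_precision())
```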
Next, we need to generate public and private keys for the user. In FHE, the server doing the computation doesn’t need the private key since nothing is decrypted. Instead, a public key is sent for each user of the service. Note that the compilation itself is user-independent, so you only need to compile once and it will run for any public key and user of your system. Key generation currently takes a while (tens of seconds, sometimes minutes), but it only needs to be done once.
Finally, we can run the computation. This consists of three steps: encryption of the input, evaluation of the program, and decryption of the output. In a real application, encryption and decryption are done on the user's device, while the evaluation is done server-side.
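Putting key generation and the three steps together, here is a hedged sketch of the flow. The method names below (create_context, keygen, encrypt, run, decrypt) are assumptions about the HNP API; only the encrypt_and_run shortcut mentioned next is confirmed by this article.

```python
# Key generation: done once per user (method names are assumptions).
context = h.create_context()
keys = context.keygen()

# 1. Encrypt the input (client side, requires the private key).
encrypted_x = h.encrypt(keys, x)

# 2. Evaluate the circuit (server side, public key material only).
encrypted_result = h.run(keys.public_keys, encrypted_x)

# 3. Decrypt the output (client side).
result = h.decrypt(keys, encrypted_result)
```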
For convenience when debugging, HNP also provides a shortcut to do all the steps at once: h.encrypt_and_run(keys, x)
That’s it! You have now successfully created your first homomorphic NumPy program. A more complete Logistic Regression example can be found here.
Recurrent Neural Networks
In this example, we will use a simple LSTM (long short-term memory) network for sentiment analysis, classifying a sentence as either positive or negative.
Deep learning models, and RNNs in particular, are notoriously hard to implement using FHE: it used to be impossible to evaluate non-linear activation functions homomorphically, and impossible to go more than a few layers deep because of noise accumulation in the ciphertexts. Both of these issues are solved in Concrete through a novel operator called "programmable bootstrapping", which HNP relies on heavily.
Our model is an LSTM followed by a linear layer and a sigmoid activation function. We use a pre-trained word embedding and this dataset.
For this example, we will need some additional boilerplate code, as we will be using PyTorch, but the overall compilation process remains the same. The complete notebook for this example can be found here, so we will only focus on the important parts.
First, let’s define the model in PyTorch:
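The actual listing lives in the linked notebook; the sketch below shows what such a model might look like, with illustrative layer sizes (a 50-dimensional pre-trained embedding and a 32-unit hidden state):

```python
import torch
import torch.nn as nn

class SentimentLSTM(nn.Module):
    # Illustrative sizes: pre-trained 50-dimensional word embeddings feeding
    # a single-layer LSTM, followed by a linear layer and a sigmoid.
    def __init__(self, embedding_dim=50, hidden_dim=32):
        super().__init__()
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, batch_first=True)
        self.linear = nn.Linear(hidden_dim, 1)

    def forward(self, embedded):
        # embedded: (batch, seq_len, embedding_dim), already looked up from
        # the pre-trained word embedding.
        _, (h_n, _) = self.lstm(embedded)
        return torch.sigmoid(self.linear(h_n[-1]))
```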
We will assume at this point that we have the trained model and want to compile it into its homomorphic equivalent. Our compiler can only handle NumPy computations, so we need to manually convert this PyTorch model to NumPy. Here is how to extract the learned parameters and implement the forward pass using NumPy:
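The notebook contains the real implementation; the sketch below reimplements the forward pass of the model sketched above, assuming model is the trained instance (PyTorch stores the LSTM gates in the order input, forget, cell, output):

```python
import numpy as np

# Extract the learned parameters from the trained PyTorch model as NumPy arrays.
params = {k: v.detach().numpy() for k, v in model.state_dict().items()}
W_ih = params["lstm.weight_ih_l0"]   # (4*hidden, embedding_dim)
W_hh = params["lstm.weight_hh_l0"]   # (4*hidden, hidden)
b = params["lstm.bias_ih_l0"] + params["lstm.bias_hh_l0"]
W_out = params["linear.weight"]      # (1, hidden)
b_out = params["linear.bias"]

hidden_dim = W_hh.shape[1]

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def lstm_forward(embedded):
    # embedded: (seq_len, embedding_dim) -- one already-embedded sentence.
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    for x_t in embedded:
        gates = W_ih @ x_t + W_hh @ h + b
        i, f, g, o = np.split(gates, 4)   # PyTorch gate order: i, f, g, o
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g
        h = o * np.tanh(c)
    return sigmoid(W_out @ h + b_out)
```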
Now that we have our forward pass in pure NumPy, we can compile it. We will also use a few advanced configuration options (a sketch of the compile call follows this list):
- The handselected parameter optimizer uses a pre-computed set of parameters known to work well in machine learning use cases, sacrificing some precision for faster execution.
- The apply_topological_optimization parameter should be enabled by default to ensure the FHE circuit is correctly optimized.
- The probabilistic_bounds parameter controls how large the margin of error around the data bounds can be; a larger value guarantees a larger margin, at the cost of precision.
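As a hedged sketch of the compile call: the configuration class name, argument spellings, input bounds, and the probabilistic_bounds value below are assumptions; the three option names are the ones described above.

```python
import hnumpy as hnp

# The config class name and argument spellings are assumptions; the three
# option names are the ones described in the list above.
config = hnp.config.CompilationConfig(
    parameter_optimizer="handselected",
    apply_topological_optimization=True,
    probabilistic_bounds=0.005,  # illustrative value
)

h_lstm = hnp.compile_fhe(
    lstm_forward,
    {
        # One already-embedded sentence: 5 words x 50-dimensional embeddings;
        # the bounds should cover the range of the embedding vectors.
        "embedded": hnp.encrypted_ndarray(bounds=(-3, 3), shape=(5, 50)),
    },
    config=config,
)
```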
You can play with these parameters and see how they affect the final result. Here, we limit the length of sentences to five words in order to keep the running time reasonable, but feel free to use longer sentences.
Then we generate some user keys:
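As before, a short sketch using the same assumed key-generation calls as in the logistic regression example:

```python
# Generate keys once per user; the compiled circuit itself is key-independent.
# Method names are assumptions about the HNP API.
context = h_lstm.create_context()
keys = context.keygen()
```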
We are now ready to run the evaluation. To verify that the compilation went well, we will also output some debugging info using the simulate function. Note that sentences of fewer than five words will need to be padded with zeros.
And finally, we evaluate an example:
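Here is a sketch of that evaluation. The embed_sentence helper and the embedding lookup are illustrative stand-ins (the real notebook handles tokenization and padding); encrypt_and_run is the shortcut mentioned earlier, and the shape of its return value is an assumption.

```python
def embed_sentence(sentence, max_len=5):
    # Illustrative helper: look up each word in the pre-trained embedding
    # (assumed to be a dict-like mapping word -> 50-d vector) and pad the
    # sentence with zero vectors up to max_len words.
    vectors = [embedding[word] for word in sentence.lower().split()]
    vectors += [np.zeros(50)] * (max_len - len(vectors))
    return np.stack(vectors[:max_len])

embedded = embed_sentence("what a great movie")

# Debugging check: the simulated result should match the plain NumPy one.
print("numpy    :", lstm_forward(embedded))
print("simulated:", h_lstm.simulate(embedded))

# Encrypted evaluation, using the encrypt_and_run shortcut.
score = h_lstm.encrypt_and_run(keys, embedded)
print("positive" if float(score) > 0.5 else "negative")
```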
This might take anywhere from 30 seconds to 20 minutes or more depending on your hardware, so the more CPU cores you have, the better!
Conclusion
FHE is still in its infancy, and until recently it was not practical at all. While precision and speed are still barriers to adoption, they are improving along a Moore-like law, with roughly a 10x gain in speed every 18 months. This means that by 2025, FHE should be usable everywhere on the internet, from databases to machine learning and analytics!
Let us know what you build!