Today, Zama announces the release of a new version of Concrete-ML. Notable new features are client/server APIs for deployment, new machine-learning models, improved speed, and support for Quantization Aware Training (QAT).
Client / Server APIs.
Now, our users have functions to:
- generate the keys (once, on the client);
- encrypt the data (on the client);
- run the FHE model (on the untrusted server);
- and decrypt the result (on the client).
Thanks to these APIs, models can now be deployed in a production setting. An example of how to use these functions is available in the documentation.
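Below is a minimal sketch of the full flow, assuming the deployment helpers `FHEModelDev`, `FHEModelClient` and `FHEModelServer` from `concrete.ml.deployment` as described in the documentation; exact class and method names may differ slightly between releases, so treat this as an illustration rather than a copy-paste recipe.

```python
# Sketch of the client/server deployment flow (names follow the documentation).
from pathlib import Path

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

from concrete.ml.deployment import FHEModelClient, FHEModelDev, FHEModelServer
from concrete.ml.sklearn import LogisticRegression

# Development side: train and compile a model, then save the deployment files
X, y = make_classification(n_samples=100, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression()
model.fit(X_train, y_train)
model.compile(X_train)

deployment_dir = Path("deployment")
deployment_dir.mkdir(exist_ok=True)
FHEModelDev(path_dir=str(deployment_dir), model=model).save()

# Client side: generate the keys (once) and encrypt the data
keys_dir = Path("keys")
keys_dir.mkdir(exist_ok=True)
client = FHEModelClient(path_dir=str(deployment_dir), key_dir=str(keys_dir))
client.generate_private_and_evaluation_keys()
evaluation_keys = client.get_serialized_evaluation_keys()
encrypted_input = client.quantize_encrypt_serialize(X_test[[0]])

# Untrusted server side: run the FHE model on the encrypted input
server = FHEModelServer(path_dir=str(deployment_dir))
server.load()
encrypted_result = server.run(encrypted_input, evaluation_keys)

# Client side: decrypt and de-quantize the result
prediction = client.deserialize_decrypt_dequantize(encrypted_result)
print(prediction)
```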
New machine-learning models.
Our list of available models has also been extended, notably with regressors based on DecisionTree, RandomForest and XGBoost (the classifier versions already exist in Concrete-ML). Lasso, Ridge and ElasticNet models were added as well. For more details, have a look at the built-in linear, tree-based and neural-network models in our documentation. And if a model you use is missing, create a feature request on GitHub!
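As a quick illustration, here is a sketch of one of the new regressors, assuming the scikit-learn-like API of `concrete.ml.sklearn`; the keyword that triggers FHE execution in `predict` has changed name across releases, so check the documentation for your version.

```python
# Sketch: training, compiling and running one of the new regressors.
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

from concrete.ml.sklearn import XGBRegressor

X, y = make_regression(n_samples=200, n_features=8, noise=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Train in the clear, with a small bit width for the quantized model
model = XGBRegressor(n_bits=6, n_estimators=20, max_depth=3)
model.fit(X_train, y_train)

# Compile to an FHE circuit using a representative input set
model.compile(X_train)

# Prediction on clear, quantized data (fast sanity check) ...
y_pred_clear = model.predict(X_test)

# ... and prediction in FHE (the keyword has varied across releases,
# e.g. execute_in_fhe=True in older versions, fhe="execute" in newer ones)
y_pred_fhe = model.predict(X_test[:2], fhe="execute")
```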
Quantization Aware Training.
Currently, one of the main limitations of Concrete-ML is the limited bit width of intermediate values in the computation. On this topic, we have two big updates to share:
- First (see the Concrete-Numpy blog post), Zama has been working on extending the limit to 16 bits in Concrete-Numpy. This means that 16-bit precision will soon be available in Concrete-ML, which should dramatically increase the accuracy achievable on complex datasets, especially for deep learning.
- Second, Quantization Aware Training (QAT) was added to Concrete-ML. In classical quantization (called Post-Training Quantization, or PTQ), training is done in floating point and the resulting model is then converted to integer computation. In QAT, training is performed while the weights and inputs are constrained to quantized values, making the final weights optimal under quantization. As a result, models are much more accurate at the same, extremely low, bit width.
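To make the quantization step concrete, here is a generic illustration of the uniform quantization used in PTQ; this is a textbook sketch, not Concrete-ML's internal implementation.

```python
# Generic illustration of post-training quantization: trained
# floating-point values are mapped to integers on a uniform grid.
import numpy as np


def quantize(values: np.ndarray, n_bits: int):
    """Uniformly quantize an array to n_bits unsigned integers."""
    v_min, v_max = values.min(), values.max()
    scale = (v_max - v_min) / (2**n_bits - 1)
    zero_point = np.round(-v_min / scale)
    q_values = np.round(values / scale + zero_point).astype(np.int64)
    return q_values, scale, zero_point


def dequantize(q_values: np.ndarray, scale: float, zero_point: float):
    """Approximately reconstruct the original floating-point values."""
    return (q_values - zero_point) * scale


weights = np.random.randn(10)
q, scale, zp = quantize(weights, n_bits=3)
print("max quantization error:", np.abs(weights - dequantize(q, scale, zp)).max())
```

QAT trains the network with this constraint already in place, so the weights it learns are the best ones achievable on that integer grid, instead of floats degraded after the fact.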
In particular, QAT has been integrated directly into our built-in neural network models, NeuralNetClassifier and NeuralNetRegressor, without any additional work for the user, and it shows impressive improvements:
Thanks to QAT, the FHE classifier better fits the data and produces more accurate results.
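Here is a minimal sketch of the built-in QAT neural networks, assuming the skorch-style constructor parameters documented for NeuralNetClassifier (the `module__*` names are taken from the documentation and may evolve).

```python
# Sketch: the built-in NeuralNetClassifier trains with QAT transparently.
import numpy as np
import torch.nn as nn
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

from concrete.ml.sklearn import NeuralNetClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X = X.astype(np.float32)
y = y.astype(np.int64)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = NeuralNetClassifier(
    module__n_layers=3,                    # number of fully connected layers
    module__n_w_bits=2,                    # weight bit width, learned with QAT
    module__n_a_bits=2,                    # activation bit width
    module__activation_function=nn.ReLU,
    max_epochs=20,
)

# QAT happens during fit: weights are trained under the low bit-width constraint
model.fit(X_train, y_train)

# Compile and predict, as with any other built-in model
model.compile(X_train)
y_pred = model.predict(X_test)
```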
QAT was also added to the custom model import functionality, allowing our users to directly compile models that they have quantized themselves, e.g. using third-party tools. Notably, Brevitas has been used extensively to apply QAT on MNIST and on simple real-world datasets.
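The sketch below shows what importing a custom Brevitas model could look like, assuming the `compile_brevitas_qat_model` entry point from `concrete.ml.torch.compile`; names follow the documentation, and the exact arguments and inference API may differ between releases.

```python
# Sketch: a small Brevitas QAT network compiled to FHE with Concrete-ML.
import brevitas.nn as qnn
import numpy as np
from torch import nn

from concrete.ml.torch.compile import compile_brevitas_qat_model

N_BITS = 3


class TinyQATNet(nn.Module):
    """A small fully connected network quantized with Brevitas."""

    def __init__(self, n_features: int, n_classes: int):
        super().__init__()
        self.quant_input = qnn.QuantIdentity(bit_width=N_BITS, return_quant_tensor=True)
        self.fc1 = qnn.QuantLinear(n_features, 32, bias=True, weight_bit_width=N_BITS)
        self.relu1 = qnn.QuantReLU(bit_width=N_BITS, return_quant_tensor=True)
        self.fc2 = qnn.QuantLinear(32, n_classes, bias=True, weight_bit_width=N_BITS)

    def forward(self, x):
        x = self.quant_input(x)
        x = self.relu1(self.fc1(x))
        return self.fc2(x)


torch_model = TinyQATNet(n_features=10, n_classes=2)
# ... train torch_model with a standard PyTorch training loop ...

# Compile the trained QAT model to FHE using a representative input set
input_set = np.random.uniform(-1, 1, size=(100, 10)).astype(np.float32)
quantized_module = compile_brevitas_qat_model(torch_model, input_set)

# The returned quantized module can then be used for inference, in the clear
# or in FHE (see the documentation for the exact inference API of your version).
```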
Next quarter, Zama will tackle even more complex tasks, all thanks to QAT and the extended 16-bit precision.