Concrete ML v1.5: Encrypted DataFrames and Faster Neural Networks

April 8, 2024
Andrei Stoian

Concrete ML v1.5 introduces a new DataFrame API for working with encrypted stored data. This feature extends Concrete ML's primary use case of private inference and marks another step towards confidential collaboration.

Additionally, Concrete ML v1.5 adds a new option that speeds up neural networks by 2-3 times. It comes with an improved FHE simulation mode to quickly evaluate the impact of this feature on neural network accuracy. 

Finally, a new demo shows how to securely anonymize text data to query a knowledge base using ChatGPT without revealing any personally identifiable information.

Encrypted DataFrames

DataFrames are a programming paradigm that simplifies the manipulation of tabular data. A DataFrame is a portable data container that stores heterogeneous data together with its schema, including data types and column names. DataFrames can also query and filter the stored data, much like a database engine.

DataFrames are popular in data science for storing and preprocessing data before running statistical analysis or training models. Concrete ML v1.5 takes inspiration from the popular Pandas package and gives users an API to encrypt, join, and decrypt DataFrames. This feature allows multiple parties to collaborate on encrypted stored data that serves as input to private model inference or training, marking another step towards confidential collaboration. See more details in the example workflow.
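To make the join step concrete, here is a minimal cleartext sketch of the kind of operation two parties might want to run on their combined tables. It uses plain Python rather than the Concrete ML API, and the choice of a left join, the `left_join` helper, the column names, and the data are all assumptions made for illustration. In the encrypted setting, the same logical result would be computed without either party seeing the other's values.

```python
def left_join(left_rows, right_rows, key):
    """Left join two tables (lists of dicts) on a shared key column.

    Assumes key values are unique in right_rows.
    """
    right_index = {row[key]: row for row in right_rows}
    right_columns = sorted({c for row in right_rows for c in row if c != key})
    joined = []
    for row in left_rows:
        match = right_index.get(row[key], {})
        merged = dict(row)
        for column in right_columns:
            merged[column] = match.get(column)  # None when there is no match
        joined.append(merged)
    return joined


# Hypothetical tables held by two different parties.
party_a = [{"user_id": 1, "age": 34}, {"user_id": 2, "age": 51}]
party_b = [{"user_id": 2, "income": 48000}, {"user_id": 3, "income": 61000}]

joined = left_join(party_a, party_b, "user_id")
for row in joined:
    print(row)
```

Every row of the left table is kept, and columns from the right table are filled only where the key matches.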

Faster neural network inference

Concrete ML v1.5 introduces an option that lets users trade off speed against the exactness of FHE predictions relative to the equivalent cleartext models. Enabling this option yields speed improvements of 2-3 times. For example, an FHE primitive that computes activation functions runs faster when a small amount of noise is allowed in its output. Users can choose between two modes: (1) ensure full exactness compared to cleartext models, at a lower speed, or (2) preserve model accuracy while allowing some noise in the neural network logits, with faster execution. Users can validate that accuracy is maintained through an improved FHE simulation mode.
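The reason small noise in the logits usually leaves accuracy intact is that the predicted class changes only when the noise is large enough to reorder the top logits. The toy sketch below illustrates that idea in plain Python. It is not Concrete ML's simulation mode: the Gaussian noise model, the `agreement_rate` helper, and the random logits are all assumptions made for illustration.

```python
import random


def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


def agreement_rate(logits_batch, noise_std, seed=0):
    """Fraction of samples whose predicted class is unchanged after
    Gaussian noise with standard deviation noise_std is added to each logit."""
    rng = random.Random(seed)
    unchanged = 0
    for logits in logits_batch:
        noisy = [v + rng.gauss(0.0, noise_std) for v in logits]
        unchanged += argmax(noisy) == argmax(logits)
    return unchanged / len(logits_batch)


# Hypothetical logits for 1000 samples of a 3-class model.
data_rng = random.Random(42)
logits_batch = [[data_rng.uniform(-5, 5) for _ in range(3)] for _ in range(1000)]

for noise_std in (0.0, 0.1, 1.0):
    rate = agreement_rate(logits_batch, noise_std)
    print(f"noise_std={noise_std}: agreement={rate:.3f}")
```

In practice, the comparison would be made between cleartext predictions and simulated FHE predictions on a validation set, rather than with synthetic noise.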

Secure anonymization of knowledge bases

With the release of Concrete ML v1.5, we published a new HuggingFace space that illustrates how an anonymization model can be executed privately on encrypted text data. While ChatGPT does not provide a way to answer encrypted queries, anonymization with FHE can help securely remove personally identifiable information from documents and queries that are sent to ChatGPT. With this approach, companies can build anonymized knowledge bases and use them for retrieval-augmented generation (RAG) with ChatGPT. 
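The substitution step of such a pipeline can be sketched as follows. This is a cleartext illustration only: in the demo, detecting personally identifiable information is the job of a model running on encrypted data, whereas here a simple email regex stands in for that model. The `anonymize` and `deanonymize` helpers and the `<ID_n>` placeholder scheme are assumptions, not the demo's actual implementation.

```python
import re

# Stand-in detector: matches email addresses only (assumption for illustration).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def anonymize(text, mapping=None):
    """Replace each detected identifier with a stable placeholder.

    Returns the anonymized text and the identifier-to-placeholder mapping,
    which stays on the client side.
    """
    mapping = {} if mapping is None else mapping

    def replace(match):
        original = match.group(0)
        if original not in mapping:
            mapping[original] = f"<ID_{len(mapping)}>"
        return mapping[original]

    return EMAIL.sub(replace, text), mapping


def deanonymize(text, mapping):
    """Restore the original identifiers in a response."""
    for original, placeholder in mapping.items():
        text = text.replace(placeholder, original)
    return text


query = "Did alice@example.com reply to bob@example.org? Ask alice@example.com."
anonymized, mapping = anonymize(query)
print(anonymized)
print(deanonymize(anonymized, mapping))
```

Reusing the same placeholder for repeated identifiers keeps the anonymized text coherent, so a language model can still reason about who did what without seeing the real values.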
