Dark Market with TFHE-rs

July 7, 2023

The Zama Team

During the second season of the Zama Bounty Program, we asked you to create a tutorial of a dark market application with TFHE-rs. GitHub user yagizsenal successfully completed this bounty, and this blog post is based on his contribution.


A dark market is a marketplace where buy and sell orders are not visible to the public before they are filled. Different algorithms aim to solve this problem; we are going to implement the algorithm defined in this paper with TFHE-rs.

We will first implement the algorithm in plain Rust and then we will see how to use TFHE-rs to implement the same algorithm with FHE.

In addition, we will implement a modified version of the algorithm that allows for more concurrent operations, which improves performance on hardware with multiple cores.

Specifications

Inputs

• A list of sell orders, where each sell order is defined only in terms of volume. The price is assumed to be fetched from a different source.

• A list of buy orders, where each buy order is defined only in terms of volume. The price is assumed to be fetched from a different source.

Input constraints

• The volume of each sell and buy order is within the range [1, 100].

• There are at most 500 sell orders and at most 500 buy orders.
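The constraints above can be sketched as a small cleartext check. This is only an illustration: the constant names and the `orders_are_valid` helper are hypothetical and not part of the tutorial code, and such a check can only run where the volumes are available in the clear, for example before they are encrypted.

```rust
/// Limits taken from the input constraints above.
const MIN_ORDER: u16 = 1;
const MAX_ORDER: u16 = 100;
const MAX_ORDERS: usize = 500;

/// Hypothetical helper: returns `true` when an order list respects the
/// constraints (at most 500 entries, each within [1, 100]).
fn orders_are_valid(orders: &[u16]) -> bool {
    orders.len() <= MAX_ORDERS
        && orders
            .iter()
            .all(|&volume| (MIN_ORDER..=MAX_ORDER).contains(&volume))
}
```

With these limits, `orders_are_valid(&[5, 12, 7, 4, 3])` holds, while a list containing `0` or `101`, or a list of 501 orders, is rejected.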

Outputs

There is no output returned at the end of the algorithm. Instead, the algorithm makes changes on the given input lists.

The filled volume of each order is written over its original volume in the respective list. If an order cannot be filled at all, its volume is set to zero.

Example input and output

Example 1

	Sell	Buy
Input	[ 5, 12, 7, 4, 3 ]	[ 19, 2 ]
Output	[ 5, 12, 4, 0, 0 ]	[ 19, 2 ]

The third sell order is only partially filled (4 out of 7), and the last two sell orders are zero because there is no buy volume left to match them.

Example 2

	Sell	Buy
Input	[ 3, 1, 1, 4, 2 ]	[ 5, 3, 3, 2, 4, 1 ]
Output	[ 3, 1, 1, 4, 2 ]	[ 5, 3, 3, 0, 0, 0 ]

The last three buy orders are zero because there are no sell orders left to match them.

Plain Implementation

1. Calculate the total sell volume and the total buy volume.

let total_sell_volume: u16 = sell_orders.iter().sum();
let total_buy_volume: u16 = buy_orders.iter().sum();

2. Find the total volume that will be transacted. In the paper, this amount is calculated with the formula:

(total_sell_volume > total_buy_volume) * (total_buy_volume − total_sell_volume) + total_sell_volume

When closely observed, we can see that this formula can be replaced with the min function. Therefore, we calculate this value by taking the minimum of the total sell volume and the total buy volume.

let total_volume = std::cmp::min(total_buy_volume, total_sell_volume);
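To see why the replacement is valid, the paper's formula can be written out on plain integers and compared with `min`. The sketch below is an illustration only; the function name `total_volume_formula` is hypothetical, and wrapping arithmetic is used because the subtraction underflows on unsigned integers exactly when the comparison is true.

```rust
/// The selection formula from the paper, written without branches.
/// `buy - sell` underflows exactly when `sell > buy`, so wrapping
/// arithmetic is used; adding `sell` back then lands on `buy`.
fn total_volume_formula(total_sell_volume: u16, total_buy_volume: u16) -> u16 {
    // 1 when the sell side is larger, 0 otherwise.
    let sell_is_larger = (total_sell_volume > total_buy_volume) as u16;
    sell_is_larger
        .wrapping_mul(total_buy_volume.wrapping_sub(total_sell_volume))
        .wrapping_add(total_sell_volume)
}
```

When the sell side is larger, the formula reduces to `buy - sell + sell = buy`; otherwise the first term vanishes and `sell` remains. In both cases the result is the smaller of the two totals.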

3. Beginning with the first item, start filling the sell orders one by one. Here, too, we replace the paper's formula with the min function.

let mut volume_left_to_transact = total_volume;
for sell_order in sell_orders.iter_mut() {
    let filled_amount = std::cmp::min(volume_left_to_transact, *sell_order);
    *sell_order = filled_amount;
    volume_left_to_transact -= filled_amount;
}

The filled amount of each order is indicated by modifying the input list. For example, if the first sell order is 100 and the total volume is 50, then the first sell order will be modified to 50 and the second sell order will be modified to 0.

4. Fill the buy orders in the same way.

let mut volume_left_to_transact = total_volume;
for buy_order in buy_orders.iter_mut() {
    let filled_amount = std::cmp::min(volume_left_to_transact, *buy_order);
    *buy_order = filled_amount;
    volume_left_to_transact -= filled_amount;
}

The complete algorithm in plain Rust

fn volume_match_plain(sell_orders: &mut Vec<u16>, buy_orders: &mut Vec<u16>) {
    let total_sell_volume: u16 = sell_orders.iter().sum();
    let total_buy_volume: u16 = buy_orders.iter().sum();

    let total_volume = std::cmp::min(total_buy_volume, total_sell_volume);

    let mut volume_left_to_transact = total_volume;
    for sell_order in sell_orders.iter_mut() {
        let filled_amount = std::cmp::min(volume_left_to_transact, *sell_order);
        *sell_order = filled_amount;
        volume_left_to_transact -= filled_amount;
    }

    let mut volume_left_to_transact = total_volume;
    for buy_order in buy_orders.iter_mut() {
        let filled_amount = std::cmp::min(volume_left_to_transact, *buy_order);
        *buy_order = filled_amount;
        volume_left_to_transact -= filled_amount;
    }
}
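As a quick check, the plain algorithm can be run on the two examples from the specification. The sketch below restates the same steps on slices so that it is self-contained; `volume_match` and `run_example` are hypothetical names used only for this illustration.

```rust
/// Same steps as the plain algorithm above, taking slices.
fn volume_match(sell_orders: &mut [u16], buy_orders: &mut [u16]) {
    let total_sell_volume: u16 = sell_orders.iter().sum();
    let total_buy_volume: u16 = buy_orders.iter().sum();
    let total_volume = total_sell_volume.min(total_buy_volume);

    // Both sides are filled with the same loop, each starting from the
    // full transacted volume.
    for orders in [sell_orders, buy_orders] {
        let mut volume_left_to_transact = total_volume;
        for order in orders.iter_mut() {
            let filled_amount = volume_left_to_transact.min(*order);
            *order = filled_amount;
            volume_left_to_transact -= filled_amount;
        }
    }
}

/// Runs the matching on copies of the inputs and returns the filled lists.
fn run_example(sell: &[u16], buy: &[u16]) -> (Vec<u16>, Vec<u16>) {
    let (mut sell, mut buy) = (sell.to_vec(), buy.to_vec());
    volume_match(&mut sell, &mut buy);
    (sell, buy)
}
```

Calling `run_example(&[5, 12, 7, 4, 3], &[19, 2])` reproduces the output of Example 1, and `run_example(&[3, 1, 1, 4, 2], &[5, 3, 3, 2, 4, 1])` reproduces Example 2.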

FHE Implementation

For the FHE implementation, we first start with finding the right bit size for our algorithm to work without overflows. The variables that are declared in the algorithm and their maximum values are described in the table below:

Variable	Maximum Value	Bit Size
total_sell_volume	50000	16
total_buy_volume	50000	16
total_volume	50000	16
volume_left_to_transact	50000	16
sell_order	100	7
buy_order	100	7

As we can observe from the table, we need 16 bits of message space to be able to run the algorithm without overflows. TFHE-rs provides different presets for the different bit sizes. Since we need 16 bits of message, we are going to use the integer module to implement the algorithm.
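The numbers in the table follow directly from the input constraints, as the short sketch below illustrates. The helper names are hypothetical, and the assumption that each radix block carries 2 bits of message is made only for this example; the actual value depends on the parameter set chosen in TFHE-rs.

```rust
/// Constraints from the specification.
const MAX_ORDER_VOLUME: u32 = 100;
const MAX_ORDER_COUNT: u32 = 500;

/// Assumption for illustration: each radix block carries 2 bits of message.
const MESSAGE_BITS_PER_BLOCK: u32 = 2;

/// Number of bits needed to represent `value` as an unsigned integer.
fn bits_needed(value: u32) -> u32 {
    u32::BITS - value.leading_zeros()
}

/// Number of radix blocks needed to hold `bits` bits, rounding up.
fn blocks_needed(bits: u32) -> u32 {
    (bits + MESSAGE_BITS_PER_BLOCK - 1) / MESSAGE_BITS_PER_BLOCK
}

/// Largest value any running total can reach: 500 orders of volume 100.
fn max_total_volume() -> u32 {
    MAX_ORDER_VOLUME * MAX_ORDER_COUNT
}
```

Here `max_total_volume()` is 50000, which needs 16 bits, while a single order of at most 100 needs 7 bits. Under the 2-bit-per-block assumption, 16 bits correspond to 8 blocks.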

Here are the input types of our algorithm:

• `sell_orders` is of type `Vec<tfhe::integer::RadixCiphertext>`

• `buy_orders` is of type `Vec<tfhe::integer::RadixCiphertext>`

• `server_key` is of type `tfhe::integer::ServerKey`

Now, we can start implementing the algorithm with FHE:

1. Calculate the total sell volume and the total buy volume.

let mut total_sell_volume = server_key.create_trivial_zero_radix(NUMBER_OF_BLOCKS);
for sell_order in sell_orders.iter_mut() {
    server_key.smart_add_assign(&mut total_sell_volume, sell_order);
}

let mut total_buy_volume = server_key.create_trivial_zero_radix(NUMBER_OF_BLOCKS);
for buy_order in buy_orders.iter_mut() {
    server_key.smart_add_assign(&mut total_buy_volume, buy_order);
}

2. Find the total volume that will be transacted by taking the minimum of the total sell volume and the total buy volume.

let total_volume = server_key.smart_min(&mut total_sell_volume, &mut total_buy_volume);

3. Beginning with the first item, start filling the sell and buy orders one by one. Since the code for filling buy orders and sell orders is the same, we can create a `fill_orders` closure to reduce code duplication.

let fill_orders = |orders: &mut [RadixCiphertext]| {
    let mut volume_left_to_transact = total_volume.clone();
    for mut order in orders.iter_mut() {
        let mut filled_amount = server_key.smart_min(&mut volume_left_to_transact, &mut order);
        server_key.smart_sub_assign(&mut volume_left_to_transact, &mut filled_amount);
        *order = filled_amount;
    }
};

fill_orders(sell_orders);
fill_orders(buy_orders);

The complete algorithm in TFHE-rs

use tfhe::integer::{RadixCiphertext, ServerKey};

const NUMBER_OF_BLOCKS: usize = 8;

fn volume_match_fhe(
    sell_orders: &mut [RadixCiphertext],
    buy_orders: &mut [RadixCiphertext],
    server_key: &ServerKey,
) {
    let mut total_sell_volume = server_key.create_trivial_zero_radix(NUMBER_OF_BLOCKS);
    for sell_order in sell_orders.iter_mut() {
        server_key.smart_add_assign(&mut total_sell_volume, sell_order);
    }

    let mut total_buy_volume = server_key.create_trivial_zero_radix(NUMBER_OF_BLOCKS);
    for buy_order in buy_orders.iter_mut() {
        server_key.smart_add_assign(&mut total_buy_volume, buy_order);
    }

    let total_volume = server_key.smart_min(&mut total_sell_volume, &mut total_buy_volume);

    let fill_orders = |orders: &mut [RadixCiphertext]| {
        let mut volume_left_to_transact = total_volume.clone();
        for mut order in orders.iter_mut() {
            let mut filled_amount = server_key.smart_min(&mut volume_left_to_transact, &mut order);
            server_key.smart_sub_assign(&mut volume_left_to_transact, &mut filled_amount);
            *order = filled_amount;
        }
    };

    fill_orders(sell_orders);
    fill_orders(buy_orders);
}

Optimizing the implementation

• TFHE-rs provides parallelized implementations of the operations. We can use them to speed up the algorithm. For example, we can use `smart_add_assign_parallelized` instead of `smart_add_assign`.

• We can parallelize the vector sum with Rayon and its `reduce` operation.

let parallel_vector_sum = |vec: &mut [RadixCiphertext]| {
    vec.to_vec().into_par_iter().reduce(
        || server_key.create_trivial_zero_radix(NUMBER_OF_BLOCKS),
        |mut acc: RadixCiphertext, mut ele: RadixCiphertext| {
            server_key.smart_add_parallelized(&mut acc, &mut ele)
        },
    )
};

• We can run the vector summation on `buy_orders` and `sell_orders` in parallel since these operations do not depend on each other.

let (mut total_sell_volume, mut total_buy_volume) =
    rayon::join(|| parallel_vector_sum(sell_orders), || parallel_vector_sum(buy_orders));

• We can match sell and buy orders in parallel since the matching does not depend on each other.

rayon::join(|| fill_orders(sell_orders), || fill_orders(buy_orders));
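The independence of the two sides can be illustrated on plain integers with the standard library alone. In the sketch below, `std::thread::scope` stands in for `rayon::join`, and the function names are hypothetical; it is a cleartext analogue of the idea, not the TFHE-rs code.

```rust
use std::thread;

/// Plain-integer stand-in for the encrypted fill loop.
fn fill_orders(orders: &mut [u16], total_volume: u16) {
    let mut volume_left_to_transact = total_volume;
    for order in orders.iter_mut() {
        let filled_amount = volume_left_to_transact.min(*order);
        *order = filled_amount;
        volume_left_to_transact -= filled_amount;
    }
}

/// Runs the two independent sums, then the two independent fills, on
/// separate threads. `std::thread::scope` plays the role of `rayon::join`.
fn volume_match_two_threads(sell_orders: &mut [u16], buy_orders: &mut [u16]) {
    let (total_sell_volume, total_buy_volume) = thread::scope(|s| {
        let sell = s.spawn(|| sell_orders.iter().sum::<u16>());
        let buy = s.spawn(|| buy_orders.iter().sum::<u16>());
        (sell.join().unwrap(), buy.join().unwrap())
    });
    let total_volume = total_sell_volume.min(total_buy_volume);

    // Each thread gets exclusive access to one list, so no locking is needed.
    thread::scope(|s| {
        s.spawn(|| fill_orders(sell_orders, total_volume));
        s.spawn(|| fill_orders(buy_orders, total_volume));
    });
}
```

The two sums only read their own list and the two fills only write their own list, which is why each pair can run side by side. Within a single fill, the loop is still sequential.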

Optimized algorithm

Modified Algorithm

When observed closely, there is only a small amount of concurrency introduced in the `fill_orders` part of the algorithm. The reason is that `volume_left_to_transact` is shared between all the orders and has to be modified sequentially. This means that the orders cannot be filled in parallel. If we can remove this dependency, we can fill the orders in parallel.

To do so, we look closely at what the `volume_left_to_transact` variable does in the algorithm: it is used to check whether the current order can be filled. Instead of subtracting the current order value from `volume_left_to_transact` in each iteration, we can add each order value to the next one and compare the resulting running sum with the total volume. If the running sum at an order (the sum of all orders before it plus the order itself) is not larger than the total volume to transact, the order can be filled completely and is left unchanged. Otherwise, the order is filled only with what is left of the total volume, which may be zero. We call the list of running sums the "prefix sum" of the array.

The new version of the plain `fill_orders` is as follows:

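let fill_orders = |orders: &mut [u64], prefix_sum: &[u64], total_orders: u64| {
    for (i, order) in orders.iter_mut().enumerate() {
        // Sum of all orders before this one (zero for the first order).
        let previous = if i == 0 { 0 } else { prefix_sum[i - 1] };
        if total_orders >= prefix_sum[i] {
            // The order can be filled completely: leave it unchanged.
            continue;
        } else if total_orders >= previous {
            // The order can only be filled partially.
            *order = total_orders - previous;
        } else {
            // Nothing is left for this order.
            *order = 0;
        }
    }
};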

To write this new function with FHE, we need to transform the conditional code into a mathematical expression, since FHE does not support branching on encrypted values.


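let fill_orders = |orders: &mut [u64], prefix_sum: &[u64], total_orders: u64| {
    for (i, order) in orders.iter_mut().enumerate() {
        let previous = if i == 0 { 0 } else { prefix_sum[i - 1] };
        // Volume still available when this order is reached (0 if none is left).
        let available = total_orders - std::cmp::min(total_orders, previous);
        // 1 if the order cannot be filled completely, 0 otherwise.
        let not_fully_filled = (prefix_sum[i] > total_orders) as u64;
        // Wrapping arithmetic mirrors the modular arithmetic of FHE integers.
        *order = (*order).wrapping_add(not_fully_filled.wrapping_mul(available.wrapping_sub(*order)));
    }
};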
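The prefix-sum idea can be checked on plain integers. The sketch below uses an equivalent reading of the rule: each order becomes the smaller of its own volume and the volume still available after all earlier orders. The function names are hypothetical and the code is an illustration under that reading, not the tutorial's FHE implementation.

```rust
/// Inclusive prefix sum: `result[i]` is the sum of `orders[0..=i]`.
fn inclusive_prefix_sum(orders: &[u64]) -> Vec<u64> {
    let mut running_total = 0;
    orders
        .iter()
        .map(|&order| {
            running_total += order;
            running_total
        })
        .collect()
}

/// Fills each order using only the prefix sums and the total, so every
/// index can be processed independently of the others.
fn fill_orders_with_prefix_sum(orders: &mut [u64], total_orders: u64) {
    let prefix_sum = inclusive_prefix_sum(orders);
    for (i, order) in orders.iter_mut().enumerate() {
        // Sum of all orders before this one (zero for the first order).
        let previous = if i == 0 { 0 } else { prefix_sum[i - 1] };
        // Volume still available at this order, capped by the order itself.
        *order = (*order).min(total_orders.saturating_sub(previous));
    }
}
```

For the sell orders of Example 1, `[5, 12, 7, 4, 3]` with a total volume of 21, the prefix sums are `[5, 17, 24, 28, 31]`. The third order sees 21 − 17 = 4 units left and is filled partially; the last two see nothing left and become zero.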

The new `fill_orders` function requires a prefix sum array. We are going to compute this prefix sum array in parallel, using the algorithm described here.

The sample code in the paper is written in CUDA. When we try to implement the algorithm the same way in Rust, the compiler rejects it. Although no two threads ever access the same array element (the index calculations using the `d` and `k` values never overlap), the Rust compiler cannot prove this and does not let us share the same array mutably between threads. So we change how the algorithm is implemented, but we don't change the algorithm itself.
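One way to picture the restructuring is to run the two sweeps on plain integers, splitting the array into disjoint chunks at each level instead of indexing into a shared array. The sketch below is a sequential, cleartext illustration with a hypothetical function name; because every chunk is a separate mutable slice, the same loop bodies could be handed to parallel workers without the borrow checker objecting.

```rust
/// Inclusive prefix sum via an up-sweep and a down-sweep over disjoint chunks.
fn prefix_sum_by_sweeps(values: &[u64]) -> Vec<u64> {
    if values.is_empty() {
        return Vec::new();
    }
    // Pad with zeros to a power-of-two length.
    let padded_len = values.len().next_power_of_two();
    let mut sums = values.to_vec();
    sums.resize(padded_len, 0);
    let levels = padded_len.trailing_zeros();

    // Up-sweep: the last element of each chunk absorbs the sum of its left half.
    for d in 0..levels {
        for chunk in sums.chunks_exact_mut(1 << (d + 1)) {
            let mid = (chunk.len() - 1) / 2;
            let last = chunk.len() - 1;
            chunk[last] += chunk[mid];
        }
    }

    // Down-sweep: remember the grand total, reset the root, push sums down.
    let total = sums[padded_len - 1];
    sums[padded_len - 1] = 0;
    for d in (0..levels).rev() {
        for chunk in sums.chunks_exact_mut(1 << (d + 1)) {
            let mid = (chunk.len() - 1) / 2;
            let last = chunk.len() - 1;
            let parent = chunk[last];
            chunk[last] += chunk[mid];
            chunk[mid] = parent;
        }
    }

    // `sums` now holds the exclusive scan; shifting by one makes it inclusive.
    sums.push(total);
    sums[1..=values.len()].to_vec()
}
```

For `[5, 12, 7, 4, 3]` this yields `[5, 17, 24, 28, 31]`. Within one level, no chunk reads or writes an element of another chunk, which is the property the CUDA version relies on implicitly and the chunked form makes explicit.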

Here is the modified version of the algorithm in TFHE-rs:

use rayon::prelude::*;
use std::time::Instant;
use tfhe::integer::{RadixCiphertext, ServerKey};

fn volume_match_fhe_modified(
    sell_orders: &mut [RadixCiphertext],
    buy_orders: &mut [RadixCiphertext],
    server_key: &ServerKey,
) {
    let compute_prefix_sum = |arr: &[RadixCiphertext]| {
        if arr.is_empty() {
            return arr.to_vec();
        }
        let mut prefix_sum: Vec<RadixCiphertext> = (0..arr.len().next_power_of_two())
            .into_par_iter()
            .map(|i| {
                if i < arr.len() {
                    arr[i].clone()
                } else {
                    server_key.create_trivial_zero_radix(NUMBER_OF_BLOCKS)
                }
            })
            .collect();
        // Up sweep
        for d in 0..(prefix_sum.len().ilog2() as u32) {
            prefix_sum
                .par_chunks_exact_mut(2_usize.pow(d + 1))
                .for_each(move |chunk| {
                    let length = chunk.len();
                    let mut left = chunk.get((length - 1) / 2).unwrap().clone();
                    server_key.smart_add_assign_parallelized(chunk.last_mut().unwrap(), &mut left)
                });
        }
        // Down sweep
        let last = prefix_sum.last().unwrap().clone();
        *prefix_sum.last_mut().unwrap() = server_key.create_trivial_zero_radix(NUMBER_OF_BLOCKS);
        for d in (0..(prefix_sum.len().ilog2() as u32)).rev() {
            prefix_sum
                .par_chunks_exact_mut(2_usize.pow(d + 1))
                .for_each(move |chunk| {
                    let length = chunk.len();
                    let t = chunk.last().unwrap().clone();
                    let mut left = chunk.get((length - 1) / 2).unwrap().clone();
                    server_key.smart_add_assign_parallelized(chunk.last_mut().unwrap(), &mut left);
                    chunk[(length - 1) / 2] = t;
                });
        }
        prefix_sum.push(last);
        prefix_sum[1..=arr.len()].to_vec()
    };

    println!("Creating prefix sum arrays...");
    let time = Instant::now();
    let (prefix_sum_sell_orders, prefix_sum_buy_orders) = rayon::join(
        || compute_prefix_sum(sell_orders),
        || compute_prefix_sum(buy_orders),
    );
    println!("Created prefix sum arrays in {:?}", time.elapsed());

    let fill_orders = |total_orders: &RadixCiphertext,
                        orders: &mut [RadixCiphertext],
                        prefix_sum_arr: &[RadixCiphertext]| {
        orders
            .into_par_iter()
            .enumerate()
            .for_each(move |(i, order)| {
                server_key.smart_add_assign_parallelized(
                    order,
                    &mut server_key.smart_mul_parallelized(
                        // 1 if the prefix sum up to this order exceeds the total, else 0
                        &mut server_key.smart_gt_parallelized(
                            &mut prefix_sum_arr[i].clone(),
                            &mut total_orders.clone(),
                        ),
                        &mut server_key.smart_sub_parallelized(
                            &mut server_key.smart_sub_parallelized(
                                &mut total_orders.clone(),
                                &mut server_key.smart_min_parallelized(
                                    &mut total_orders.clone(),
                                    &mut prefix_sum_arr
                                        .get(i.wrapping_sub(1)) // no predecessor for i == 0: the lookup misses and falls back to zero
                                        .unwrap_or(
                                            &server_key.create_trivial_zero_radix(NUMBER_OF_BLOCKS),
                                        )
                                        .clone(),
                                ),
                            ),
                            &mut order.clone(),
                        ),
                    ),
                );
            });
    };

    let total_buy_orders = &mut prefix_sum_buy_orders
        .last()
        .unwrap_or(&server_key.create_trivial_zero_radix(NUMBER_OF_BLOCKS))
        .clone();

    let total_sell_orders = &mut prefix_sum_sell_orders
        .last()
        .unwrap_or(&server_key.create_trivial_zero_radix(NUMBER_OF_BLOCKS))
        .clone();

    println!("Matching orders...");
    let time = Instant::now();
    rayon::join(
        || fill_orders(total_sell_orders, buy_orders, &prefix_sum_buy_orders),
        || fill_orders(total_buy_orders, sell_orders, &prefix_sum_sell_orders),
    );
    println!("Matched orders in {:?}", time.elapsed());
}

Running the tutorial

The plain, FHE and parallel FHE implementations are available here and can be run by providing respective arguments as described below.

# Runs FHE implementation
cargo run --release --package tfhe --example dark_market --features="integer internal-keycache" -- fhe

# Runs parallelized FHE implementation
cargo run --release --package tfhe --example dark_market --features="integer internal-keycache" -- fhe-parallel

# Runs modified FHE implementation
cargo run --release --package tfhe --example dark_market --features="integer internal-keycache" -- fhe-modified

# Runs plain implementation
cargo run --release --package tfhe --example dark_market --features="integer internal-keycache" -- plain

# Multiple implementations can be run within same instance
cargo run --release --package tfhe --example dark_market --features="integer internal-keycache" -- plain fhe-parallel

Conclusion

In this tutorial, we've learned how to implement the volume matching algorithm described in this paper in plain Rust and in TFHE-rs. We've identified the right bit size for our problem at hand, used operations defined in TFHE-rs, and introduced concurrency to the algorithm to increase its performance.

Additional links

Star the TFHE-rs Github repository to endorse our work.
Review the TFHE-rs documentation.
Get support on our community channels.
Learn FHE, help us advance the space and make money with The Zama Bounty Program.
