What is frequency smoothing encryption (FSE), in essence? It is a procedure that "splits" repeated plaintexts and "converts" them into different ciphertexts. Consider the most extreme case: RND, which performs this splitting at the finest possible granularity.

So the difference between RND, DET, and FSE is their granularity!

For FSE, we can always split the occurrences of each repeated plaintext into several groups. The goal is to prevent a snapshot-based, offline adversary (at least, that is the current goal) from correctly guessing the mapping from plaintexts to ciphertexts.
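To make the "splitting into groups" concrete, here is a toy sketch (the function name, the salt mechanism, and the group count are my own illustrative choices, not the actual FSE construction): each occurrence of a plaintext is assigned to one of a few random groups, so a deterministic encryption of the (plaintext, group) pair would spread repeats across several distinct ciphertexts.

```python
import secrets
from collections import Counter

def smooth_frequencies(plaintexts, num_groups=4):
    """Toy frequency-smoothing sketch (illustrative only, not a real scheme):
    each occurrence of a plaintext is tagged with a random group index, so
    repeated plaintexts are spread across up to `num_groups` distinct labels."""
    out = []
    for pt in plaintexts:
        group = secrets.randbelow(num_groups)  # pick a group uniformly at random
        # In a real scheme, (pt, group) would then be encrypted deterministically,
        # yielding different ciphertexts for different groups of the same plaintext.
        out.append((pt, group))
    return out

data = ["a", "a", "a", "a", "b", "b"]
smoothed = smooth_frequencies(data)
print(Counter(smoothed))  # the four "a"s now split across several (pt, group) labels
```

Note how the frequency of any single label is at most the original frequency, and typically much lower, which is exactly what flattens the histogram an offline adversary sees.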

So here comes the big question: how do we define a “correct guess” done by the adversary?

Most of these attacks are inherent to any OPE scheme because they do not exploit any weakness in the schemes' cryptographic security guarantees; instead, they utilize the ordering information of the plaintexts, which is revealed as per the definitional requirements of the scheme.

We use an intermediate security notion based on Window One-Wayness, introduced by [BCO11]. It measures the probability that an adversary, given a set of ciphertexts corresponding to messages chosen at random from the underlying plaintext distribution, correctly decrypts one of them.
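The one-wayness game described above can be sketched as a simple Monte Carlo estimate. All function names here (`encrypt`, `adversary`, `sample_plaintext`) are placeholders I introduce for illustration, not part of [BCO11]:

```python
import random

def one_wayness_advantage(encrypt, adversary, sample_plaintext, m=5, trials=1000):
    """Estimate the adversary's advantage in the one-wayness game: the adversary
    sees m ciphertexts of messages drawn from the plaintext distribution and
    wins if it correctly decrypts at least one of them (here, a chosen one)."""
    wins = 0
    for _ in range(trials):
        msgs = [sample_plaintext() for _ in range(m)]
        cts = [encrypt(msg) for msg in msgs]
        i, guess = adversary(cts)  # adversary outputs an index and a plaintext guess
        if guess == msgs[i]:
            wins += 1
    return wins / trials

# Demo with a trivially broken "encryption" (identity), where the adversary
# always wins -- the advantage should be 1.0.
ident = lambda pt: pt
adv = lambda cts: (0, cts[0])
dist = lambda: random.randrange(10)
print(one_wayness_advantage(ident, adv, dist))  # → 1.0
```

A secure scheme should force this advantage down toward what random guessing under the plaintext distribution achieves.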

Security definition attempt 1:

Define the list of attributes $\mathsf{L}_{attr} = \{a_1, a_2, \cdots, a_n\}$, and let $(a_1', \cdots, a_{n'}')$, $n' \leq n$, be the set of all unique elements in $\mathsf{L}_{attr}$. The histogram for the list is defined as

$$ \mathsf{Hist}(\mathsf{L}_{attr})[j] = \sum_{i \in [n]} \mathbb{I}(a_i = a_j') $$

In other words, it is simply the number of occurrences of each repeated element.
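A minimal implementation of this histogram, indexing unique elements $a_j'$ in order of first appearance:

```python
from collections import Counter

def hist(attrs):
    """Compute Hist(L_attr): the multiplicity of each unique element,
    indexed in order of first appearance (matching the a_j' above)."""
    counts = Counter(attrs)
    uniques = list(dict.fromkeys(attrs))  # unique elements, first-seen order
    return [counts[a] for a in uniques]

print(hist(["x", "y", "x", "z", "x", "y"]))  # → [3, 2, 1]
```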

Define the $\gamma$-approximate histogram as follows: a histogram $\mathsf{Hist}$ is a $\gamma$-approximation of $\mathsf{Hist}(\mathsf{L}_{attr})$ if

$$ \mathsf{Hist}[i] \in [\mathsf{Hist}(\mathsf{L}_{attr})[i] - \gamma,\ \mathsf{Hist}(\mathsf{L}_{attr})[i] + \gamma], \quad \forall i \in [n']. $$

In our case, the goal of the adversary is to correctly guess an entry of a $\gamma$-approximate histogram of the plaintext list.
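The $\gamma$-approximation condition translates directly into an entry-wise check; a small sketch (the function name is my own):

```python
def is_gamma_approximate(guess, true_hist, gamma):
    """Check the gamma-approximation condition: every entry of the guessed
    histogram is within +/- gamma of the corresponding true entry."""
    if len(guess) != len(true_hist):
        return False
    return all(abs(g - t) <= gamma for g, t in zip(guess, true_hist))

print(is_gamma_approximate([3, 2], [4, 1], gamma=1))  # → True
print(is_gamma_approximate([3, 2], [5, 1], gamma=1))  # → False
```

Under this definition, the adversary wins the guessing game as soon as one of its histogram entries lands within the $\pm\gamma$ window of the true count.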