For many applications, it will be critical that multiple users are not able to claim the same credential, at least without being detected. For example, suppose a user proves they own some funds in a certain bank account, and then later shares that their banking login to another user who then proves the same credential. An application that requires users t have some minimum collateral will need to be able to detect if the same bank account is being claimed multiple times.
Furthermore, the two users may be using their duplicated credentials for different applications, but these applications may still want to know that these users’ credentials are duplicated. Note that detecting duplicate credentials is trivial if all users must post their credentials to a public chain in an unencrypted format. However, for many kinds of credentials this public chain would leak sensitive information.
We now fix some notation. Consider a single kind of credential, where each user $i$ proves their credential using data $D_i$. We divide the (useful) parts of each $D_i$ into the triple $D_i = (d^1_i, d^2_i, d^3_i)$ where:
For example, if the data is returned from a banking website, then $d_i^1$ could be the amount of money in the bank account, $d_i^2$ the name of the account owner, and $d_i^3$ the routing number of the account. This is because users might be comfortable with the amount of money in their account being public, as nobody can link that information to them personally. Additionally, users might be comfortable sharing their name to specific applications that need this information, but will not want to share the routing number of their account to anyone.
Finally, we denote the set of all possible $D_i$ values as $\mathcal D$.
We will assume that users are not required to have continued interaction with the system after they prove and share their credentials with applications. We will call this the non-interactive assumption. Therefore, all information about a user’s credential that will be used for future duplication counting must be saved to a public database at the time that the credential is created. So, without loss of generality we can assume that for some fixed function $H$, when a user creates a credential using data $D_i$ , they must post to a public chain the data
$$ c_i = (d_i^1 ,H(d_i^2,d_i^3)) $$
as well as the signature of the attestor. When a user shows a credential to an application, they will also have to share the true value of $d_i^2$ .
For $H$ to work as a feasible solution, it must satisfy the following two conditions:
(1) $H$ sufficiently obscures $d_i^2$ and $d_i^3$ so that no sensitive information is leaked.
(2) $H$ outputs enough information so that distinct credentials are not counted as duplicates. That is, if $D_i \neq D_j$ then $c_i \neq c_j$.