DKGs in Groups

This is a short post on distributed key generation (DKG) in the context of cryptographic groups, such as elliptic curves.

Many cryptographic schemes use a secret key, and a public key. As their names suggest, one of them has to be kept private, and the other can---and should be---shared. The function these serve depends on the protocol: for signatures, the public key is used to verify signatures, and the secret key is used to create them; for encryption, the public key is used to encrypt, and the secret key is used to decrypt.

Distributed key generation is a way for a group of people to generate a public key, along with shares of the secret key. These shares should be such that nobody knows the actual secret, and yet further protocols can make use of these shares to do useful operations, such as signing, or decryption.

I've been thinking about DKGs recently, so I wanted to get a few thoughts out there as a blog post.

Key Generation

First, let's look at what kind of keys we'll be generating anyhow.

We'll start with a cryptographic group G\mathbb{G}, generated by GG, of order qq. This gives us an associated field of scalars, Fq\mathbb{F}_q. We also assume that various problems related to the discrete logarithm in G\mathbb{G} are hard. In particular, given A=aGA = a \cdot G, finding aa should be intractable.

When working with such a group, the most common setup for keys is that a scalar xFqx \in \mathbb{F}_q is the secret key, and the associated point X:=xGX := x \cdot G is the corresponding public key.

Key generation here is quite simple:

  1. Sample xRFqx \xleftarrow{R} \mathbb{F}_q.
  2. Return (x,xG)(x, x \cdot G).

Now, just generating a key to use yourself isn't all that interesting. What is more interesting is generating shares of a key for multiple people. One basic strategy here is linear sharing. We pick shares x1,,xnx_1, \ldots, x_n such that ixi=x\sum_i x_i = x.

If we have a trusted third party, then they can be the "dealer", and the distribute the shares x1,,xnx_1, \ldots, x_n, while informing the group of the public key X:=xGX := x \cdot G.

/

What's interesting is how to remove this dealer, so that a group of people can set up a key without needing to trust a third party.

A Broken Protocol

The way that they'll do this will be with some kind of protocol. This is a program that they'll run together, by doing computation locally, and exchanging messages. The goal of the protocol will be for each party PiP_i to end up with their share xix_i, along with the public key XX, which should satisfy ixiG=X\sum_i x_i \cdot G = X.

This protocol shouldn't let a party learn the other shares either. Ideally, even if participants deviate from the protocol, they shouldn't be able to learn private information or bias the result either, but we'll get to security later.

/

For now, let's just look at the basic principle behind how you'd make this work.

One simple idea is that each person can generate their share xix_i, which implicitly creates a secret ixi\sum_i x_i, shared among the participants. The "hard" part is then learning XX without revealing xx, or any of the shares. This can be done by having each person send Xi:=xiGX_i := x_i \cdot G to everyone else. You can then sum up these "shares" of the public key, getting X:=iXiX := \sum_i X_i.

More formally, the protocol would be:

  1. Each PiP_i generates xiRFqx_i \xleftarrow{R} \mathbb{F}_q.
  2. Each PiP_i sets XixiGX_i \gets x_i \cdot G.
  3. \star Each PiP_i sends XiX_i to every other party.
  4. \bullet Each PiP_i waits to receive XjX_j from every other PjP_j.
  5. Each PiP_i sets XiXiX \gets \sum_i X_i.
  6. \square Each PiP_i returns (xi,X)(x_i, X).

[!note] Note: To easily identify different parts of a protocol, I use the following notation:

Now, because computing xix_i from XiX_i is hard, this protocol is secure, for honest participants. If the parties don't deviate from the protocol, then they won't reveal extra information.

Why this is broken

However, this protocol is broken for malicious participants. The reason why is that a malicious participant gains a tremendous advantage by going "last". If they wait to receive the messages from other participants, then they can actually choose the private key xx and the public key XX.

After seeing X2,,XnX_2, \ldots, X_n, the malicious party can send XiXiX - \sum_{i} X_i, as their "share" X1X_1. This results in the final public key being XX. If the party happens to know the logarithm of XX, then they now control the key for the entire group, defeating the purpose of the key generation.

A Fixed Protocol

There are two fundamental issues with the protocol so far, with regards to malicious security:

  1. A malicious party can choose their share XiX_i after seeing all other XjX_j.
  2. A malicious party can choose thier XiX_i without knowing its discrete logarithm.

We can fix 2. by using a Maurer proof, for the function:

φ(x):=xG\varphi(x) := x \cdot G

which will force a participant to prove that they know the discrete logarithm.

We also still need to fix 1. even with this proof, although the reason why is a bit more subtle.

Basically, even if you're forced to know the discrete logarithm of the point you're submitting, it's still possible to bias the resulting public key. For example, it's possible to try a bunch of values for your share xix_i until the point XX has a particular starting bit in its binary representation. Can you get a useful bias in a reasonable amount of time? Probably not. That said, the functionality that the protocol is trying to implement says that XX is completely random, so any kind of bias introduced by a malicious participant counts as a violation of the security properties of the protocol.

To fix this, we need to have each party set their share before they learn the shares of the other parties. We can do this using commitments. A commitment hides the value underneath it, but once a commitment is sent, the value hidden underneath can't be changed. We can then make each party send a commitment to their share, and then only after receiving every other commitment will a party reveal their own share.

Since our public shares have a lot of entropy, we can commit to them by simply using a hash function.

So, the idea now is that each person will generate xix_i, set XixiGX_i \gets x_i \cdot G, and then send H(Xi)H(X_i) to everyone else. They then wait to receive these commitments before sending out XiX_i. They then check that the received shares match the commitments, and then sum them together.

Writing this out, we get the following protocol:

Round 1:

  1. Each PiP_i generates xiRFqx_i \xleftarrow{R} \mathbb{F}_q, and sets XixiGX_i \gets x_i \cdot G.
  2. Each PiP_i sets CiH(Xi)C_i \gets H(X_i).
  3. \star Each PiP_i broadcasts CiC_i to all other parties.

Round 2:

  1. \bullet Each PiP_i waits to receive CjC_j from each other PjP_j.
  2. Each PiP_i sets πiProve(φ,Xi;xi)\pi_i \gets \text{Prove}(\varphi, X_i; x_i), where:
φ(x):=xG\varphi(x) := x \cdot G
  1. \star Each PiP_i sends (Xi,πi)(X_i, \pi_i) to all other parties.

Round 3:

  1. \bullet Each PiP_i waits to receive (Xj,πj)(X_j, \pi_j) from each other PjP_j.
  2. \blacktriangle Each PiP_i asserts that H(Xj)=CjH(X_j) = C_j and that Verify(φ,πj,Xj)\text{Verify}(\varphi, \pi_j, X_j).
  3. Each PiP_i sets XPiPXiX \gets \sum_{P_i \in \mathcal{P}} X_i.
  4. \square Each PiP_i returns (xi,X)(x_i, X).

[!note] Note: At step 3, I mentioned that you need to broadcast a message. By this I mean that you need a mechanism to enforce that the same message is sent to all parties, even if the sender is malicious.

One way of accomplishing this is to have each party send all of the CjC_j they received in round 2, letting the other parties check that they received the same commitments.

And so, this is a simple protocol for doing distributed key generation with linear shares. What's neat with linear sharing is that as soon as each person generates a random value xix_i, they already have a sharing x:=ixix := \sum_i x_i, and so the hard part is just sharing XiX_i in a safe way, so that you can learn XX.

The disadvantage of this kind of sharing is that you need all of the shares in order to get the secret. This is what we'll remedy next.

Threshold Key Generation

We'd like a way to share a secret in such a way that just a subset of the shares is enough to reconstruct the secret. For example, with nn shares, we might want to be able to use just t<nt < n shares to recreate the secret.

The most common way to do this is via polynomials. The basic property we use is that given a polynomial ff with tt coefficients, it suffices to evaluate that polynomial at tt points to learn the coefficients.

This reconstruction starts with a special polynomial:

Lj,S(X):=iS,ij(Xi)(ji)L_{j,S}(X) := \prod_{i \in S, i \neq j} \frac{(X - i)}{(j - i)}

This polynomial is such that Lj(x)=0L_j(x) = 0 for all x[t]x \in [t] except jj. For jj, we have Lj(j)=1L_j(j) = 1. As an example, for the elements {0,1,2,3}\{0, 1, 2, 3\}, we'd have L1(X)L_1(X):

/

We can then reconstruct our polynomial by linear combination, given enough points:

f(X)=iSf(i)Li,S(X)f(X) = \sum_{i \in S} f(i) \cdot L_{i, S}(X)

Very often, we don't care about the polynomial, but rather it's value at 00, which is often used as our secret. This is because f(0)f(0) is just the first coefficient (depending on how you count), so it's easy to extend a secret value into a whole polynomial, by simply adding extra random coefficients.

In case you just care about a particular value, the equation above becomes:

f(0)=iSf(i)Li,S(0)f(0) = \sum_{i \in S} f(i) \cdot L_{i, S}(0)

The value Li,S(0)L_{i, S}(0) is usually written as λi\lambda_i (with SS implicit), and is known as a Lagrange coefficient.

What's interesting is that if each person has a value f(i)f(i), then they can create a linear share of f(0)f(0) locally, by multiplying their share f(i)f(i) with λi\lambda_i. This is convenient, because then the rest of a protocol making use of the shares only has to work with linear values.

So, to recap, for sharing a secret xx among nn participants, such that any tt of them can reconstruct it, you first generate a polynomial ff at random, such that f(0)=xf(0) = x, and then the shares become f(1),,f(n)f(1), \ldots, f(n).

Given tt shares, you can create linear shares by multiplying with λi\lambda_i, using the set of participants SS.

With a Trusted Dealer

We want to distribute these shares without people actually learning the polynomial, of course.

The simplest way to do this would be to trust someone to generate this polynomial, send the shares f(1),,f(n)f(1), \ldots, f(n), along with X=f(0)GX = f(0) \cdot G, and then forget the secret.

There are two natural flaws in this approach, one being that you need to trust the dealer to forget the secret, and the other being that the dealer might generate shares which don't actually correspond to polynomial evaluations, and thus don't let the participants reconstruct the secret.

Without a Dealer

A graver flaw in using a dealer is that it's just not fun.

Let's remedy this with a protocol, like we did previously with just linear shares.

Instead of having a single person act as a dealer, each person acts as a dealer. To do this, we use the same trick with linear shares. If each person generates a polynomial fif_i locally, then there's an implicit polynomial ff shared linearly among them, defined as f=ifif = \sum_i f_i.

Another neat property is that polynomial evaluation is homomorphic, i.e.

f(i)+g(i)=(f+g)(i)f(i) + g(i) = (f + g)(i)

This immediately gives us a semi-honest protocol for threshold key generation:

  1. Each PiP_i generates a random polynomial fif_i of degree tt.
  2. \star Each PiP_i sends Xifi(0)GX_i \gets f_i(0) \cdot G to every other party.
  3. \textcolor{red}{\star} Each PiP_i privately sends xijfi(j)x_i^j \gets f_i(j) to every other PjP_j.
  4. \bullet Each PiP_i waits to receive XjX_j and xjix_j^i from every other PjP_j.
  5. Each PiP_i sets XiXiX \gets \sum_i X_i, and xijxjix_i \gets \sum_j x_j^i.
  6. \square Each PiP_i returns (xi,X)(x_i, X).

The idea follows the sketch from before. After generating fif_i locally, there's an implicit polynomial ff shared linearly. Each person needs to learn f(i)f(i), and f(0)Gf(0) \cdot G is the public key. Since f=jfjf = \sum_j f_j, it suffices to let each person act as a dealer, sending fj(i)f_j(i) and fj(0)Gf_j(0) \cdot G, which can then be summed locally.

Of course, this only works if parties follow the protocol. If parties are malicious, we run into the same two issues we had with our original naive key generation protocol. Namely, a party can choose their inputs after seeing those of other parties, and a party can send a value XiX_i without knowing its discrete logarithm.

Furthermore, we also run into a new issue, namely that the shares xijx_i^j may not actually correspond to evaluations of a polynomial with the right degree, which is bad.

We'll fix the first two with commitments and zero-knowledge proofs, as before. For the new issue, we have a bit more flexibility in how we tackle it.

First Version

The first approach to ensuring that the shares are correct is to reveal the polynomial, but hidden behind a group element.

In essence, if you knew the coefficients of the polynomial, you would be able to check that your share was correct, via:

xij=?kfi,kjkx_i^j \stackrel{?}{=} \sum_k f_{i, k} j^k

where fi,kf_{i, k} is the kk-th coefficient of fif_{i}.

Now, revealing fi,kf_{i, k} is obviously bad. But, revealing fi,kGf_{i, k} \cdot G would be ok. And if we knew Fi,k=fi,kGF_{i, k} = f_{i, k} \cdot G, we could still check our equation:

xijG=?kjkFi,kx_i^j \cdot G \stackrel{?}{=} \sum_k j^k \cdot F_{i, k}

Thus, our strategy here will be to reveal these Fi,kF_{i, k} values, allowing others to check that our share was correctly distributed.

Naturally, we also need commitments, along with a zero-knowledge proof for Fi,0F_{i, 0}.

In more detail, we would have:

Round 1:

  1. Each PiP_i generates a random polynomial fif_i of degree t1t - 1.
  2. Each PiP_i sets Fi,kfi,kGF_{i, k} \gets f_{i, k} \cdot G for k[0t1]k \in [0 \ldots t - 1].
  3. \star Each PiP_i sets CiH(Fi,0)C_i \gets H(F_{i, 0}), and then broadcasts CiC_i.

Round 2:

  1. \bullet Each PiP_i waits to receive CjC_j from every PjP_j.
  2. Each PiP_i sets πiProve(φ,Fi,0;fi,0)\pi_i \gets \text{Prove}(\varphi, F_{i, 0}; f_{i, 0}), where:
φ(x):=xG\varphi(x) := x \cdot G
  1. \star Each PiP_i sends Fi,kF_{i, k} and πi\pi_i to every other PjP_j.
  2. \textcolor{red}{\star} Each PiP_i privately sends fi(j)f_i(j) to every other PjP_j.

Round 3:

  1. \bullet Each PiP_i waits to receive Fj,kF_{j, k} and πj\pi_j from every other PjP_j.
  2. \blacktriangle Each PiP_i asserts that Verify(φ,πj,Fj,0)\text{Verify}(\varphi, \pi_j, F_{j, 0}).
  3. \bullet Each PiP_i waits to receive xjix_j^i from every other PjP_j.
  4. Each PiP_i sets xijxjix_i \gets \sum_j x_j^i and XjFj,0X \gets \sum_j F_{j, 0}.
  5. Each PiP_i asserts that xi=kikjFj,kx_i = \sum_k i^k \cdot \sum_j F_{j, k}.
  6. \square Each PiP_i returns (xi,X)(x_i, X).

One extra optimization is that we aggregate our verification of our share, checking the final share xix_i against the polynomial ff, rather than each part xjix_j^i against fjf_j.

One disadvantage of the aggregation is that it prevents us from identifying that a specific person gave us a bad share. If our final share is bad, we don't know which person is at fault. You could also do an individual check, which makes attributing blame a lot easier.

Otherwise, we follow the sketch from above.

This is the most common kind of distributed key generation, at least when it comes to threshold sharings of scalars.

Second Version

Another idea is that instead of sending fi,kGf_{i, k} \cdot G, you instead send Fij:=fi(j)GF_i^j := f_i(j) \cdot G, for each other jj.

Then, you'll be able to check your share xijx_i^j via:

xijG=Fijx_i^j \cdot G = F_i^j

Now, you do still want to check that the FijF_i^j actually come from evaluating some polynomial fif_i.

This can be done with a Maurer proof, using:

φ(f):=[f(j)Gj[0,,n]]\varphi(f) := [f(j) \cdot G | j \in [0, \ldots, n]]

you also want to check that they know the secret f(0)f(0), so we include j=0j = 0 in this list of elements as well.

Now, notice that since polynomial evaluation is homomorphic, then so will φ\varphi be.

A sketch of the resulting protocol is then that you first generate fif_i, then calculate Fijfi(j)GF_i^j \gets f_i(j) \cdot G. You then send a commitment H(Fi0,,Fin)H(F_i^0, \ldots, F_i^n). After seeing everyone else's commitment, you reveal FijF_i^j, along with your proof πi\pi_i, and then privately send fi(j)f_i(j) to everybody else. You verify everyone else's proof, and then sum up your private shares and the Fj0F_j^0 values to get the public key.

This protocol is usually worse than the other protocol we saw, but might be better for large thresholds, and where you also want to know xjGx_j \cdot G, for each person's share xjx_j. This last value is useful if you want identifiable aborts in subsequent protocols using the shares.

Conclusion

This was just a brief note on distributed key generation, and hopefully of some use. Another good post on DKGs is "Walking Through Distributed Key Generation", although unfortunately it uses multiplicative group notation, but this is an affront worth tolerating to read the nice introduction to the subject.

In most situations the first protocol I showed is what you'll end up using, but I thought including the second protocol was fun, just to show that there is some flexibility in how these things are designed.