How do Bitcoin’s new Taproot signatures interact with the good old key derivation methods from BIP-32? It turns out that the answer isn’t all that straightforward.
BIP-32, Briefly
A Bitcoin key-pair is composed of a secret scalar $x$, and a public point $X = x \cdot G$ on the secp256k1 curve, where $G$ is the curve's generator. This key-pair can be used for the ECDSA signature scheme, and is used to sign transactions in the Bitcoin protocol.
BIP-32 is a standard for deriving a new Bitcoin key-pair from a pre-existing one. Given a key-pair $(x, X)$, this standard specifies a way to derive a child key-pair $(x_i, X_i)$ for some index $i$.
The essential idea is to hash in the index, along with information about the key, in order to adjust the current key-pair in a deterministic way.
In more detail, we first augment key-pairs with a public “chaining key” $c$, which is just an extra bit of public randomness. Then, to derive the child key at index $i$, we first create a 512 bit hash:
$$I = \mathrm{Hash}(c, X, i)$$
The first 256 bits are interpreted as a scalar $I_L$, and the last 256 bits are interpreted as a new chaining key $c_i$.
We then adjust our key-pair by adding in $I_L$:
$$x_i = x + I_L, \qquad X_i = X + I_L \cdot G$$
The new key-pair is then $(x_i, X_i)$, with chaining key $c_i$.
Note:
We can also include the secret key $x$ in the hash, instead of the public key $X$. This mode is called “hardened child” and is used when $i \geq 2^{31}$.
What we need to remember is that you hash in some information, including the public key, to get a new scalar, which you add to the existing secret key, adjusting the public key accordingly.
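To make the derivation concrete, here is a minimal Python sketch of it. It assumes the hash is HMAC-SHA512 keyed with the chaining key, as BIP-32 specifies, and it carries its own toy secp256k1 arithmetic so that it runs on its own. The names `derive_child`, `compress`, `point_add`, and `point_mul` are made up for this illustration, and the arithmetic is not constant time, so treat this as a way to follow the data flow rather than as wallet code.

```python
import hashlib
import hmac

# secp256k1 parameters: field prime, group order, generator.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and a[1] != b[1]:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], P - 2, P) % P
    else:
        lam = (b[1] - a[1]) * pow(b[0] - a[0], P - 2, P) % P
    x3 = (lam * lam - a[0] - b[0]) % P
    return (x3, (lam * (a[0] - x3) - a[1]) % P)


def point_mul(k, pt):
    """Double-and-add scalar multiplication (not constant time)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result


def compress(pt):
    """33-byte encoding: 0x02 (even y) or 0x03 (odd y), then the x coordinate."""
    return bytes([2 + (pt[1] & 1)]) + pt[0].to_bytes(32, "big")


def derive_child(x, X, c, i):
    """Derive (x_i, X_i, c_i); pass x=None to derive from public data only."""
    if i >= 2**31:  # hardened child: hash in the secret key
        if x is None:
            raise ValueError("hardened derivation needs the secret key")
        data = b"\x00" + x.to_bytes(32, "big")
    else:  # normal child: hash in the compressed public key
        data = compress(X)
    I = hmac.new(c, data + i.to_bytes(4, "big"), hashlib.sha512).digest()
    I_L, c_i = int.from_bytes(I[:32], "big"), I[32:]
    X_i = point_add(X, point_mul(I_L, G))
    if I_L >= N or X_i is None:
        raise ValueError("invalid child; BIP-32 says to skip this index")
    x_i = None if x is None else (x + I_L) % N
    return x_i, X_i, c_i
```

For a normal index, calling `derive_child(None, X, c, i)` yields the same child point and chaining key as the call with the secret key, which is what lets a watch-only wallet derive child public keys.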
Taproot Signatures, Briefly
BIP-340 is one of the specifications going into Bitcoin’s major “Taproot” changes. This standard introduces a new Schnorr signature scheme, which Taproot outputs use in place of the older ECDSA scheme.
What matters for this post is not how signatures are produced, but rather how keys are represented. Once again, key-pairs are represented as a scalar $x$, and a point $X = x \cdot G$ on the secp256k1 curve.
The crucial difference is in how the point is encoded.
A point on the curve (in affine coordinates) is represented by two field elements $(x, y)$. (Here $x$ is a coordinate, not the secret scalar from before.) Each of these elements, alone, occupies a full 32 bytes of space. But, because of the structure of the curve, if you know the $x$ coordinate of a point, you only need one bit of information about $y$ to recover the full point.
Bitcoin stores public keys in compressed form. We use 32 bytes for $x$, and then an extra byte for $y$: we store 0x02 if $y$ is even, and 0x03 if $y$ is odd. So, 33 bytes in total.
Taproot, on the other hand, only uses 32 bytes. It does this by only storing the $x$ coordinate, and assuming that $y$ is even. You can think of a Taproot public key as a Bitcoin key with an implicit 0x02 in front.
This means that there can potentially be a mismatch between our public key and our private key. If our public key $X$ has an odd $y$ coordinate, this means that $-X$ will have an even $y$ coordinate. Because of this, when we generate our private key, we need to potentially negate it, so that the corresponding public key has an even $y$ coordinate. Otherwise we’d lose this information, and our secret key wouldn’t match our public key.
Note:
In practice, Taproot does this adjustment when signing, and not when generating keys, but what needs to happen is the same.
Also, the reason this adjustment works is that given a point $(x, y)$, negating yields $(x, -y)$. Since the order $p$ of our field is odd, $-y = p - y$ has the opposite parity from $y$: negating an even element yields an odd element, and vice-versa.
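The two encodings and the negation trick are easy to check by hand. The sketch below is only an illustration: the helper names (`compressed`, `x_only`, `even_y_keypair`) are invented here, and the curve arithmetic is a toy, non-constant-time version written so that the snippet runs on its own.

```python
# secp256k1 parameters: field prime, group order, generator.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and a[1] != b[1]:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], P - 2, P) % P
    else:
        lam = (b[1] - a[1]) * pow(b[0] - a[0], P - 2, P) % P
    x3 = (lam * lam - a[0] - b[0]) % P
    return (x3, (lam * (a[0] - x3) - a[1]) % P)


def point_mul(k, pt):
    """Double-and-add scalar multiplication (not constant time)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result


def compressed(pt):
    """33 bytes: 0x02 (even y) or 0x03 (odd y), followed by x."""
    return bytes([2 + (pt[1] & 1)]) + pt[0].to_bytes(32, "big")


def x_only(pt):
    """32 bytes: just x, with an even y implied."""
    return pt[0].to_bytes(32, "big")


def even_y_keypair(x):
    """Return (x', X') where X' = x' * G has an even y, negating x if needed."""
    X = point_mul(x, G)
    if X[1] % 2 == 1:
        # Negating the scalar mod N negates the point: (x, y) -> (x, P - y).
        x, X = N - x, (X[0], P - X[1])
    return x, X
```

After `even_y_keypair`, the compressed encoding of the public key is always 0x02 followed by the x-only encoding, which is the "implicit 0x02" described above.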
Ambiguity
Because Taproot stores public keys differently, you have two new sources of ambiguity when trying to use BIP-32 key derivation:
- How do you hash in a Taproot public key?
- What do you do if the derived public point has an odd $y$ coordinate?
It’s also likely a bad idea to use the same key for both Schnorr and ECDSA signatures, so you’d like a way to organize wallets to avoid mixing up the two key types.
Workarounds
There are relatively obvious ways to clear up these ambiguities.
For hashing in public keys, you can do things in a “backwards-compatible” way by recovering the full point from just the $x$ coordinate, implicitly choosing the even $y$ coordinate. Then you just hash in this full point according to BIP-32. The specification needs a point on the curve, and you’re giving it exactly what it wants.
After deriving a new key, you have a child key-pair $(x_i, X_i)$.
If $X_i$ has an odd $y$ coordinate, we can use the same trick as for key generation (or signing), and negate both $x_i$ and $X_i$ so that the resulting point has an even $y$ coordinate.
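Here is a rough Python sketch of those two workarounds put together, for non-hardened children only. It is my own illustration of the idea, not a published standard: `lift_x`, `tweak`, `derive_public`, and `derive_private` are hypothetical names, the hash is assumed to be BIP-32's HMAC-SHA512 with the x-only key serialized behind an explicit 0x02 byte, and the curve arithmetic is a toy, non-constant-time version included so the snippet is self-contained.

```python
import hashlib
import hmac

# secp256k1 parameters: field prime, group order, generator.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and a[1] != b[1]:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], P - 2, P) % P
    else:
        lam = (b[1] - a[1]) * pow(b[0] - a[0], P - 2, P) % P
    x3 = (lam * lam - a[0] - b[0]) % P
    return (x3, (lam * (a[0] - x3) - a[1]) % P)


def point_mul(k, pt):
    """Double-and-add scalar multiplication (not constant time)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result


def lift_x(xonly):
    """Recover the full point from a 32-byte x-only key, choosing the even y."""
    x = int.from_bytes(xonly, "big")
    y_sq = (pow(x, 3, P) + 7) % P
    y = pow(y_sq, (P + 1) // 4, P)  # square root, valid because P % 4 == 3
    if x >= P or y * y % P != y_sq:
        raise ValueError("not a valid x coordinate")
    return (x, y if y % 2 == 0 else P - y)


def tweak(parent_xonly, c, i):
    """Hash in the parent as the compressed key 0x02 || x, as BIP-32 expects."""
    data = b"\x02" + parent_xonly + i.to_bytes(4, "big")
    I = hmac.new(c, data, hashlib.sha512).digest()
    I_L = int.from_bytes(I[:32], "big")
    if I_L >= N:
        raise ValueError("invalid child; skip this index")
    return I_L, I[32:]


def derive_public(parent_xonly, c, i):
    """Child x-only key and chaining key, from public data alone."""
    I_L, c_i = tweak(parent_xonly, c, i)
    X_i = point_add(lift_x(parent_xonly), point_mul(I_L, G))
    return X_i[0].to_bytes(32, "big"), c_i  # dropping y covers the negation


def derive_private(x, c, i):
    """Child secret and chaining key; the child point always has an even y."""
    X = point_mul(x, G)
    if X[1] % 2 == 1:  # make the parent secret match its x-only key
        x = N - x
    I_L, c_i = tweak(X[0].to_bytes(32, "big"), c, i)
    x_i = (x + I_L) % N
    if point_mul(x_i, G)[1] % 2 == 1:  # negate if the child has an odd y
        x_i = N - x_i
    return x_i, c_i
```

On the public side the negation costs nothing, since an x-only key forgets the sign of $y$ anyway; only the holder of the secret key has to track it.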
For wallet organization, this mailing list post proposes a simple adjustment. Basically, you first organize your wallet into an ECDSA half, and a Schnorr half, and then do whatever wallet organization you’d normally use from there. This makes it difficult to accidentally use one type of key for the other type of signature, because the two domains are separated so early.
Conclusion
It’s certainly possible to use BIP-32 key derivation with Taproot keys; there are just a few hurdles to clear. Thankfully, the workaround in each case is fairly obvious. That being said, it would be nice to have a short specification explicitly detailing these adjustments, so that different wallets can interoperate correctly.