Disclaimer: The views in this article are my own, and do not necessarily represent the views of my employer.
As part of the blockchain work I've been doing, I've been examining designs of the popular existing networks. Ethereum had my attention this week, and I was digging into its transaction authentication mechanisms when I found something confusing. I think it's easiest to demonstrate with a quick example.
Say I'm running a private network, and I submit a transaction to transfer value 1000 from one account to another. I can do that like this:
Once the transaction gets mined, I can see the value has moved:
And I can see the raw transaction:
A key component of blockchains is that every transaction is signed by its issuer, and meddling with the details of the transaction will be obvious to all parties. Let's verify that. Let's try to resubmit that transaction, and have it transfer value 1001, which requires modifying just one bit:
Why did we get a receipt? Surely we should have been told that transaction was invalid. Did it just get logged as a failure?
No. So did value move?
Kind of. Value was deposited in the target account, but wasn't debited from the source account. Where did it come from?
That's not the account debited in the original transaction. So, I guess it's true that we weren't able to replay that transaction with modifications. But what about this other account? It didn't sign this transaction.
This is where I had to learn the details of how transaction signing works in Ethereum. To submit a signed transaction, your client must encode a string containing: nonce, gas price, gas limit, destination address, value, contract data, and chain ID. The exact encoding is irrelevant here, but those are the components of the transaction (see this post and EIP 155 for the full details). A representation (hash) of those values is passed to an elliptic curve signing function, along with your account's private key. That function produces what is called a "recoverable signature". This recoverable signature is added to the end of the previous list, and the resulting string is a "signed raw transaction".
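For readers who want to see the signed fields laid out, here is a minimal sketch of the payload, assuming the RLP encoding and the EIP-155 field order (the six transaction fields, then the chain ID and two empty placeholders). The helper names and the field values are hypothetical. The Keccak-256 hash applied before signing is left out because it is not in Python's standard library.

```python
def int_to_bytes(n: int) -> bytes:
    """Big-endian bytes with no leading zeros; 0 becomes the empty string."""
    return n.to_bytes((n.bit_length() + 7) // 8, "big")


def length_prefix(length: int, offset: int) -> bytes:
    """RLP length header: short form up to 55 bytes, long form beyond."""
    if length <= 55:
        return bytes([offset + length])
    size = int_to_bytes(length)
    return bytes([offset + 55 + len(size)]) + size


def rlp(item) -> bytes:
    """Encode an int, a byte string, or a (nested) list of those."""
    if isinstance(item, int):
        item = int_to_bytes(item)
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item  # a single small byte is its own encoding
        return length_prefix(len(item), 0x80) + item
    payload = b"".join(rlp(x) for x in item)
    return length_prefix(len(payload), 0xC0) + payload


def signing_payload(nonce, gas_price, gas_limit, to, value, data, chain_id):
    """EIP-155 field order; the two trailing zeros stand in for r and s."""
    return rlp([nonce, gas_price, gas_limit, to, value, data, chain_id, 0, 0])


# Hypothetical values, chosen only for illustration.
common = dict(nonce=0, gas_price=20 * 10**9, gas_limit=21000,
              to=bytes.fromhex("35" * 20), data=b"", chain_id=1337)
original = signing_payload(value=1000, **common)
altered = signing_payload(value=1001, **common)

# Count how many bits separate the two payloads.
flipped_bits = sum(bin(a ^ b).count("1") for a, b in zip(original, altered))
print(original.hex())
print(altered.hex())
print("bits that differ:", flipped_bits)
```

Note that no "from" address appears anywhere in the encoded list, and that moving the value from 1000 to 1001 changes a single bit of the payload.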
A signed raw transaction can be submitted to any Ethereum node, without that node needing to know the private key of the account submitting the transaction. Notice that the list of fields included doesn't include the address of the account submitting the transaction. Anyone can recover that address using the signature and the original list of signed fields.
But there's the trouble. To get back the address of the originating account, you have to have both the signature and the original field values. If you supply different field values, you don't get, "This signature doesn't match," you get, "This transaction came from a completely different account."
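The behaviour can be reproduced with a small, self-contained sketch of ECDSA public key recovery on secp256k1. Several things here are assumptions made to keep it dependency-free: SHA-256 over `repr()` stands in for Keccak-256 over the RLP encoding, the per-signature nonce derivation is a toy (not RFC 6979), and the "address" is the last 20 bytes of a SHA-256 hash rather than a real Ethereum address. The private key and field values are hypothetical. This is for illustration only and should not be used to sign anything real.

```python
import hashlib

# secp256k1 domain parameters (the curve Ethereum signs with).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def add(a, b):
    """Add two curve points; None is the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        m = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        m = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (m * m - x1 - x2) % P
    return x3, (m * (x1 - x3) - y1) % P


def mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def digest(fields):
    """Stand-in for keccak256(rlp(fields)): SHA-256 over repr()."""
    return int.from_bytes(hashlib.sha256(repr(fields).encode()).digest(), "big")


def sign(z, priv):
    """Return (r, s, recid). Toy nonce derivation, not RFC 6979."""
    k = digest(("nonce", priv, z)) % (N - 1) + 1
    R = mul(k, G)
    r = R[0] % N
    s = pow(k, -1, N) * (z + r * priv) % N
    return r, s, R[1] & 1


def recover(z, r, s, recid):
    """Rebuild a public key from a digest and a recoverable signature."""
    y = pow((pow(r, 3, P) + 7) % P, (P + 1) // 4, P)
    if y & 1 != recid:
        y = P - y
    r_inv = pow(r, -1, N)
    return add(mul(s * r_inv % N, (r, y)), mul(-z * r_inv % N, G))


def toy_address(pub):
    """Last 20 bytes of SHA-256 over the key; real Ethereum uses Keccak-256."""
    raw = pub[0].to_bytes(32, "big") + pub[1].to_bytes(32, "big")
    return "0x" + hashlib.sha256(raw).digest()[-20:].hex()


priv = 0xC0FFEE  # hypothetical private key, illustration only
signer = toy_address(mul(priv, G))

# nonce, gas price, gas limit, destination, value, data, chain ID
fields = (0, 20 * 10**9, 21000, "0x" + "35" * 20, 1000, b"", 1337)
altered = fields[:4] + (1001,) + fields[5:]

sig = sign(digest(fields), priv)
from_original = toy_address(recover(digest(fields), *sig))
from_altered = toy_address(recover(digest(altered), *sig))

print("signer:           ", signer)
print("original recovers:", from_original)
print("altered recovers: ", from_altered)
```

Recovery never reports a failure in this sketch. It always returns some public key, so a tampered payload simply yields a different, unrelated address.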
This is what happened in the example above. If we recover the address using the original values and signature, we get the address we used to sign the original transaction:
But if we recover the address using the altered values and the signature, we get the other address:
Before you run off to tweet about this, let me say: trying to produce a set of field values to make some signature point to a particular account is not within the realm of your powers. I precalculated the account address that would match, and gave it value in my genesis block for the purposes of this demonstration. If we try again with a value two greater, we get a completely different address that has no value:
Trying to match a particular address is a process of mashing numbers hoping to accidentally hit one in 2^160. Even if you just wanted to hit any in-use address, and each person alive on Earth had their own, you'd still be looking at one in 2^127 (= 2^160 / 2^33, where 2^33 ≈ 8 billion).
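The odds quoted above can be checked with a few lines of integer arithmetic. The variable names are only for illustration, and the population figure is rounded up to the nearest power of two.

```python
address_space = 2**160  # Ethereum addresses are 160 bits long
people = 2**33          # 8,589,934,592, a little over 8 billion

# Chance of hitting any one of `people` funded addresses with a random guess.
one_in = address_space // people

print(people)
print(one_in == 2**127)
print(one_in.bit_length() - 1)  # exponent of the power of two
```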
So why do I care? Two reasons:
- Being corrected for the wrong mistake makes the protocol harder to use. The error above for the "two greater value" mismatch points a debugger toward balances, not toward signatures.
- Fixing this seems simple.
The second reason is the naïve thing to say. If it's so simple, why hasn't it been done? It's more likely that I just don't understand the domain and/or design decisions made elsewhere. I'm going to trudge on with explaining anyway, and hope it leads to my education.
I think this can be fixed by including the originating address in the details that are signed. If what was signed was instead nonce, gas price, gas limit, originating address, destination address, value, contract data, and chain ID, I think the problem would disappear entirely. We can try the same single-bit modification as last time:
But this time, we can compare the recovered address to the "from" address in the transaction. They don't match, so we can say, "This signature doesn't match."
And hey, the "attack" gets harder too. It's not as simple as just changing the from address in the transaction. If we do that, the recovered address also changes, to yet something different:
With this scheme, to find a valid transaction, you're forced to find a match for a specific address. So, as a side effect of improving usability, we also return to a collision probability of one in 2^160.
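The proposed check can be sketched end to end under the same kind of simplifying assumptions as any dependency-free demo: SHA-256 over `repr()` stands in for Keccak-256 over RLP, the nonce derivation is a toy (not RFC 6979), and addresses are the last 20 bytes of a SHA-256 hash. The function names, keys, and field values are hypothetical, and this is a sketch of the idea rather than how any Ethereum client behaves.

```python
import hashlib

# secp256k1 domain parameters.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def add(a, b):
    """Add two curve points; None is the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        m = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        m = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (m * m - x1 - x2) % P
    return x3, (m * (x1 - x3) - y1) % P


def mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def digest(obj):
    """Stand-in for keccak256(rlp(...)): SHA-256 over repr()."""
    return int.from_bytes(hashlib.sha256(repr(obj).encode()).digest(), "big")


def sign(z, priv):
    """Return (r, s, recid). Toy nonce derivation, not RFC 6979."""
    k = digest(("nonce", priv, z)) % (N - 1) + 1
    R = mul(k, G)
    r = R[0] % N
    return r, pow(k, -1, N) * (z + r * priv) % N, R[1] & 1


def recover(z, r, s, recid):
    """Rebuild a public key from a digest and a recoverable signature."""
    y = pow((pow(r, 3, P) + 7) % P, (P + 1) // 4, P)
    if y & 1 != recid:
        y = P - y
    r_inv = pow(r, -1, N)
    return add(mul(s * r_inv % N, (r, y)), mul(-z * r_inv % N, G))


def toy_address(pub):
    """Last 20 bytes of SHA-256 over the key; real Ethereum uses Keccak-256."""
    raw = pub[0].to_bytes(32, "big") + pub[1].to_bytes(32, "big")
    return "0x" + hashlib.sha256(raw).digest()[-20:].hex()


def recovered_sender(tx, sig):
    return toy_address(recover(digest(tx), *sig))


def signature_matches(tx, sig):
    """Proposed rule: the recovered address must equal the signed 'from'."""
    return recovered_sender(tx, sig) == tx["from"]


priv = 0xC0FFEE  # hypothetical signer key
me = toy_address(mul(priv, G))
victim = toy_address(mul(0xBEEF, G))  # some other hypothetical account

tx = {"nonce": 0, "gas_price": 20 * 10**9, "gas_limit": 21000, "from": me,
      "to": "0x" + "35" * 20, "value": 1000, "data": b"", "chain_id": 1337}
sig = sign(digest(tx), priv)

tampered_value = {**tx, "value": 1001}
tampered_from = {**tx, "value": 1001, "from": victim}

for label, candidate in [("original", tx), ("value changed", tampered_value),
                         ("from changed too", tampered_from)]:
    print(label, "->", signature_matches(candidate, sig))
```

Because "from" is part of what gets hashed, rewriting it to the victim's address changes the digest again, so the recovered address moves to a third, unrelated value instead of following the edit.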
Is it the case that EIPs 712 and 191 are attempting to address some of this situation, but not directly? It seems like "malleability" of ECDSA signatures, while perhaps slightly different than what is described above, is something that has caused trouble elsewhere.
Finally, thanks to the makers of two tools that helped me debug what was going on: Ethereumjs-tx, which includes a nice "from" recovery function, and Keythereum, which can extract a private key from a geth keystore.
Am I on track, or have I missed something?
Categories: Blockchain Development
Post Copyright © 2018 Bryan Fink