Patch Proposals For The Ongoing Wallet Homomoroph Attacks

I have thought about this a bit more, and removing attacker control over the _salt value I believe is the strongest patch here. Legitimate code won’t care what the ultimate hash value is, and by removing the attacker’s ability to influence the hash value we will stop seeing this attack. The salt is still needed, but it only needs to be a network-controlled nonce.

… Or if we want to keep the same call convention - we allow the salt value to be attacker controlled, and just XOR it with a Pepper value that is generated by a CSPRNG and is unique to each block. (Pepper is the name given to a salt value that is unknown to the attacker.)

Noted. I will disregard stack overflow “cryptographers” in the future. As for my credentials, I authored GPU software that makes it easier to generate addresses that match arbitrary criteria, and I have no doubt my development empowers these attackers.

There is no vulnerability. No funds are being “stolen”. “homomorphism” as you call it is intrinsic to similar addresses looking similar, and no address generation scheme is going to fix this. Enough computational work will be able to generate similar addresses, whether they are derived from EOA, CREATE, CREATE2, or some new scheme.

No it is not.

Another way to think about why there is no fix for this is to assume user behavior isn’t going to change. They are only going to eyeball the first 4 hexcharacters of an address. No matter the scheme, if there are more than 2**16 accounts, the system is no-longer idiot proof. This makes no assumptions about how the addresses are generated; there will be collisions.

Wallets can prevent users from doing dumb stuff but this is not the role of the EVM.

2 Likes

Hey WJ,

Secure software takes effort, this is not something that happens by accident. In the real world when people fall off of a banister and hurt themselves, they sue you - and the issue is fixed. Well how is this any different? The only difference I see is that usually when someone takes a tumble - they don’t loose $60,000,000.00 USD. This isn’t just isn’t one person that made a mistake, and we all know that people keep making this mistake because it is a patchable software defect, and leaving bugs unpatched reflects badly upon all of us.

This pre-computation attack is only possible because of how contract address are generated. The patch is really quite simple - but convincing the community that we deserve secure software maybe more difficult than I anticipated.

Another way to think about why there is no fix for this is to assume user behavior isn’t going to change.

If an attacker can no longer generate homomorph wallets, then people will stop sending their coins to look-alikes. I will do absolutely everything in my power to make sure this vulnerability is patched.

But that being said. I need you start treating others with respect. I have worked very hard to be where I am, and I am here to help others. Flinging personal attacks someone for trying to help isn’t going to help us make a more secure network.

No, I have already proven that it is independent of how the addresses are generated.

I recommend you check out some of my proposed solutions to the “finality problem”, such as EIP-3455 and EIP-6810.

@rook I hope you are able to understand why homomorphism is just as possible with CREATE and EOA as with CREATE2, and that your insistence that CREATE2 makes it easier is way out of proportion. I hope you will re-read the UX proposals mentioned by everyone else in this thread because they are the solution. Scams are a cat and mouse game though; the scammers will adopt a new strategy once Metamask patches this.

I am not as patient as shemnon. My patience with you was over when you brought your cReDeNtIaLs to a technical discussion as a response to why you were wrong and tried monologuing to us about the importance of security. I thought that was disrespectful but I am glad you have since edited it out of your post.

I am thinking about how you two are thinking about this problem in more detail. My concern is that you are intending to punish everyday users for not using “the correct software” or “not being careful enough” - when really the majority of people aren’t well versed in this topic - and just want to use a currency.

The best analogy for this argument is “You shouldn’t have clicked on that link” vs well maybe “We shouldn’t have XSS to begin with”… and please keep an open mind in this discussion - everyone on this tread wants our users to be safe as possible and to build the very best network - I know that. Math is the universal language after all, and I think it can help us all reason about this issue.

Is using keccak() alone for wallet generation more of a concern than than keccak(secp256k1)?

Is this a shortcut useful to anyone?

Let’s start of with keccak() - which is more commonly known as SHA3. This is hash function is very interesting, it is one of the few hash functions created specifically to be accelerated on 64bit architectures and in fact we are seeing phenomenal hash-rates from hardware acceleration. Keyword being accelerated, which is the exact opposite of the bcrypt(), and PBKDF2() key-derivation functions.

Let’s assume the attacker is modifying the salt with an O(1) operation. Now let’s assume the hacker has a modern 64bit machine at their disposal. The address is calculated as follows:
contract address=keccak(concatenate(sender address,nonce,code hash,salt))

So this is a sha3() hash of 1024bits. We will assume that the attacker isn’t using string-concatenation, but rather bit flipping the last 256bits of the message to search the key-space. Looking up the performance we can see keccakc256treed2 takes roughly 40 cycles per byte in order to calculate this hash.

That gives us:
40 cycles * 64 bytes = 2560 cycles per hash

Now lets assume our hackers are poor, and they purchased a used-server from eBay for a few thousand dollars. For the purposes of this lets say the attacker with with an AMD Epyc with 64 cores at 2.66ghz.

So that gives us
2.66 billion cycles * 64 cores = 170,240,000,000 cycles/second

Thus yielding a theoretical maximum of:
66,500,000 hashes per second

Now keep in mind - the same hardware that is used in Bitcoin mining, will have a higher hashrate using keccak/sha3. This is because the sha3 hash function was created to be accelerated by hardware - where as sha2 wasn’t.

That allows for an attacker to search a very large key-space without a lot of resources. Yes, of course this poor AMD chip isn’t going to hit its maximum today - but hardware only gets faster over time. So this calculation will be a low-ball estimate in just a few years.

Now the fun part! ECC key generation isn’t something that can be easily accelerated because the numbers are too large to fit in the CPU’s registries, which is an important feature of a KDF. So even if a manufacturer decided to make a hardware accelerated secp256k1, no processor has ever been created which can accelerate this calculation, where as sha3 was specifically designed to be accelerated. This means ECC-key generation isn’t cycle-bound - its memory-bound - and we can see this this limit affects the calculation times.

There is another curve-ball here: secp256k1 keys are harder to generate, in fact the link below claims that it is 10 times slower on average than some of the more performant curves used in TLS. The benchmark didn’t think to test secp256k1… but we can choose some other Weierstraß curve which should have very similar performance characteristics. By looking at the real-world benchmarks for ECC key generation. For the purposes of this calculation we will use the data from ecfp256h which is also 256bit Weierstraß curve.

So if it takes roughly 31,738 cycles to generate a single key - we still to calculate the sha3() hash for 32-bytes of data:

31,738 cycles + 40 cycles * 32 bytes

Yielding:
33,018 cycles per key

Meaning that we can at most generate 5,365,197.07 keys in a second… But wait, we know that this is wrong because no has ever generated 5 million secp256k1 keys in a second on any chip, thats impossible because this calculation memory-bound and not CPU-bound - which is a very useful feature to have in a KDF.

In practice, if you where to benchmark an AMD Epyc system today - I think you would get only around 40k ECC keys per second - making it 3,000 times slower than sha3() alone.

But you don’t have to take my word for it, this is all easily verifiable. In fact, here is a wonderful discussion between cryptographers about the performance differences between SHA2 and secp256k1 key generation:

In some sense you could think of this vulnerability as a KDF bypass, that allows the attacker to search the keyspace thousands of times faster. If we added some value that made the contract’s hash unpredictable (otherwise known as a ‘pepper’) - then the attacker would have to search this keyspace using secp256k1 key generation instead of simple addition or string concatenation.

Using a formal memory-hard KDF like bcrypt() would also make it more difficult to search this keyspace using any method.

A reasonable patch is as follows:

keccak(concatenate(sender address, nonce, code hash, salt ⊕ pepper))

One could make the argument that BLOCK_HASH is a suitable nonce for this purpose because it cannot be known by the attacker when the transaction is formed - and should not be influenced in a MEV-style attack because PoS validators are no longer responsible for block assembly.

Correct me if i’m wrong with any of this, I like being wrong because it gives me a opportunity to improve.

Modifying CREATE2 in Ethereum could have far-reaching implications, as it’s pivotal for ensuring predictable smart contract addresses—a feature heavily relied upon by numerous systems. Altering it might arguably fix certain aspects, but it risks breaking existing protocols and introducing more significant issues.

On another note, setting up a local Ethereum node is indeed a viable solution for deploying contracts cost-effectively. By mimicking the mainnet environment, you can deploy and test your contracts without spending a dime, achieving the same end result. Am I wrong?

@sullof

After digging into this issue a little more, I think the article is mistaken. It looks like it is primarily an off-chain pre-computation attack.

While, it nice to know the contract address ahead of time, this “feature” has become very useful in crime. Knowing what I know now, I would make this calculation strictly on-chain, and emit some kind of event to notify the developer of contract’s wallet. This is a much more straight-forward UX change then what has been proposed thus far. Adding the BLOCK_HASH would prevent the value from being computed off-chain with virtually zero overhead… but this decision is above of my pay grade.

You bring up a really good point - developers have become accustomed to predictable addressing, and that still leaves us with a few solutions. Using an official KDF like bcrypt() would dramatically reduce the rate by which homomorophs can be generated - without introducing too much overhead. Legitimate blockchain operations only use this method a handful of times, where as an attacker needs to execute it many millions of times before it becomes viable in an attack. This asymmetry in computability is very useful.

KDFs are commonly used in authentication and identity, so why not here? Hindsight is 2020, and I don’t think anyone could have predicted the fallout from this feature. Swapping out sha3() for bcrypt() when calculating the wallet would be the solution where we can have our cake and eat it too, and by all accounts we will see a measurable impact in the rate by which funds are lost.

I built the first Host-proof Hosting password manager in 2006, and a file-system-like secret manager in 2020, and I used iterations anywhere it made sense. So, I agree with the general principle. But here we are talking of blockchain, and things are very different. The opcode must consume as little gas as possible. Implementing something like bcrypt would consume so much gas that it would make create2 unusable. Often, we must choose the best possible compromise.

1 Like

Correct me if I’m wrong, but I believe contract creation would be the only time this method would be used. This would have to be a new opcode, so we are the ones that set its gas consumption.

If that is the case, the gas costs could be the same cost, or be pretty similar without breaking the bank - keep in mind the major benefit is being memory-hard, not CPU hard - And “memory hard” here is just enough to knock out FPGAs GPUs taking up even 0.5kb of memory would be more than enough to obviate specialized mining-hardware, I think most KDFs don’t even use that much - but we can tune it to something that makes sense.

We would have to run some tests, but you could use a moderate number of iterations and still make a measurable impact.

I have measured the advantage, on a NVIDIA GeForce GTX 1050 GPU, using profanity and ERADICATE2, both by johguse. For CREATE I get 5.3 MH/s, and for CREATE2 I get 182 MH/s. This is a multiplier of ~34x, which corresponds to a gain of between one and two matching hexcharacters for similar work. I don’t think this makes a huge difference, especially compared to other approaches.

Your estimate of 3000x, almost 3 hexcharacters of advantage, was based on some CPU analysis. Do you have an explanation for this discrepancy between your theory and practice?

Even if the advantage was a full 3 hexcharacters, I don’t think it’s worth slowing the EVM. The responsibility of keeping user funds safe belongs to wallets. Fool-proofing is not generally possible.

2 Likes

@wjmelements

As was mentioned prior, the article that detailed this attack is mistaken and this issue has nothing to do with create vs create2. The create opcode does not calculate a secp256k1 key, so I’m not sure what your benchmarks are trying to prove.

We are comparing the time it takes to sha3(1024 bits) key vs sha3(unique secp256k1 key). You are free to calculate the difference between these two, and by all accounts this should be more than a factor of 1,000, which is why its being used in this ongoing attack on the network.

I’m not sure why @sullof and @shemnon liked this benchmark…

Sorry that my calculations weren’t clear, but yes - what i’m saying is that calculating a secp256k1 key is hard to do in an accelerated mode, because it is memory-hard calculation, which is common in KDF construction. I think we all know hash functions are fast.

I’m looking for a vanity wallet generator to make this easier, but it looks like the eth vanity wallet generators mostly use the sha3() shortcut, and aren’t even trying to generate a keypair:

It looks like this project is generating a keypair, you are welcome to run it:

I can help your confusion. profanity does public key derivation for address mining of both CREATE addresses and EOAs. My benchmark did it for CREATE. Both methods do public key derivation. ERADICATE2 does CREATE2, and so does not do public key derivation. I measured that CREATE2 is 34x faster than CREATE, not 1000x+ faster.

The final step of address calculation is keccak; private key account addresses are a hash of their public key. In python using web3 and coincurve this is:

"0x" + Web3.keccak(private_key.public_key.format(False)[1:65]).hex()[26:66]

The CPU-based vanity generators are far behind the GPU-based ones.

Worth noting that keys generated by profanity are mostly pwned due to low initial seed entropy. I have a patch but upstream is not maintained any-longer.

I’m pretty sure if you used profanity to generate homomorphs you’d find it is slower than 34x, and that was and still is my entire point. Bitcoin doesn’t use sha3(1024bit string) to generate identities and they aren’t being exploited like we are. The only way to conduct this attack on bitcoin is using a tool like profanity.

Yes, profanity keys are weak… who’s side are you on here? Why do you care how strong the attacker’s keys are? Forcing the attacker to use both expensive and weak keys is a good thing.

Correct me if I’m wrong, but I don’t believe there is any concern with CPU consumption of the EVM because processing power is increasing, and our demands on the network are capped. (Remember when we had PoW? That didn’t seem to be a problem…)

We have developed proper KDFs that prefer CPU execution over GPU for this exact purpose. bcrypt with a cost factor of 2 or 4 would make a difference for a GPU but not a CPU. Or pbkdf2 with a small number of iterations and using the public-key as both the input and salt:

pbkdf2_sha3(public_key, public_key, 1000, 20)

I don’t think this change would even be measurable. Not fixing this issue however, is measurable and we can see its getting worse over time.

We use computational effort to ensure safety of the system as a whole. At no point did anyone recommend “slowing down” the EVM - we still forming blocks every ~12s right?

There are ways of patching this issue with using little or no resources.

For example, networks that use base58 representation of wallets instead of base16 make the computation of homomophs more expensive, this is due to the fact that base58 encodes ~5.858-bits of information per character, where as base16 can only encode 4-bits. Simply using a greater encoding density wouldn’t incur any performance impact on behalf of the network, and the community would benefit from having smaller wallet addresses. Everyone wins from this change, except of course: the attacker.

The age of a bug doesn’t make it any less valuable in an attacker’s hands. Just because the bug is well known, doesn’t make it any less effective. Just because someone you respect wrote insecure code, doesn’t mean that the author doesn’t regret doing so. Hindsight is 2020, and I’m sure that every victim will tell you that this is worth fixing. By all accounts if we do not address this problem, it will become worse.

Now there are multiple ways of addressing this issue. Punting and requiring 3rd party vendors to come up with their own patch is an one approach, but I’m not sure it is in the best interest of the community. We have these powerful computers at our command, shouldn’t they be used to make a more trustworthy system at every level?

I have a proposal to fix spoofing

1 Like