EIP proposal to fix the "billion dollar mistake" in EIPs

earizon · October 28, 2021, 10:13am

Hi all,

Before reading the rest of this post, I suggest reading the following links, which put the real risk of null pointers in practice into context:

NULL pointer billion dollar mistake
“billion dollar mistake” at “Google”

I’m studying different EIP standards (EIP-1155, EIP-1820, ERC-20) and I have observed a big design risk in the external API of most (if not all) of them. They all arbitrarily choose ZERO as the input address for minting, burning, returning a not-found result, …

It looks to me that this is an intuitive but wrong and risky decision. It follows the inertia of the EVM “default-to-ZERO” return value for data that is not found or not initialized in the internal key/value storage. The decision of the EVM / Solidity to return a “ZERO-value” is “good-enough”, since the EVM lacks any context and this convention is adequate for controlled, EVM-internal behavior. A better/safer alternative would have been to throw an exception when a key has not been initialized, but this would add complexity to the EVM implementation. (Throwing an exception is not so strange. For example, the Python “VM” does throw an exception when trying to access a non-existing key in a dictionary, and nobody would deny that Python works in the real world.) Trading safety for simplicity in controlled (EVM) environments can be considered good-enough.
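To make the contrast concrete, here is a minimal Python sketch (not EVM code). A `defaultdict(int)` stands in for storage that reads as zero when a key was never written, and a plain `dict` stands in for storage that raises. The names `balances_default_zero`, `balances_strict` and `read_strict`, and the sample keys, are made up for illustration.

```python
from collections import defaultdict

# Storage that silently reads as zero for unknown keys (EVM-like convention).
balances_default_zero = defaultdict(int)
balances_default_zero["alice"] = 10

# Storage that raises KeyError for unknown keys (plain Python dict).
balances_strict = {"alice": 10}


def read_strict(key):
    """Return the stored value, or report that the key was never initialized."""
    try:
        return balances_strict[key]
    except KeyError:
        return "not initialized"


# A typo in the key goes unnoticed with the default-to-zero convention...
typo_result = balances_default_zero["alcie"]
# ...but is surfaced explicitly by the strict mapping.
strict_result = read_strict("alcie")
```

With the default-to-zero mapping the misspelled key is indistinguishable from a real zero balance; the strict mapping forces the caller to handle the missing key.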

The problem comes when ZERO addresses are accepted as default values in external, non-controlled signed transactions. Such a transaction is built from user input that passes through some sort of custom user interface (wallet, marketplace, …) and then through a standard or custom wallet. These user interfaces and wallets can NOT be considered implementation-safe, and in general many of them will potentially be tainted by the billion dollar mistake, with none/null/undefined/empty strings being accidentally converted to a ZERO address by the external non-controlled applications/front-ends/wallets/middleware/Byzantine attackers.
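As a rough illustration of how a missing value can turn into a ZERO address before the transaction is even signed, here is a sketch of two hypothetical client-side encoders. Neither function comes from a real wallet or library; `encode_address_lenient` imitates the kind of padding helper that quietly accepts empty input, and `encode_address_strict` shows the check that would be needed instead.

```python
import string

ZERO_ADDRESS = "0x" + "00" * 20


def _strip_prefix(text):
    """Drop a leading '0x' if present."""
    return text[2:] if text.startswith("0x") else text


def encode_address_lenient(value):
    """Hypothetical buggy encoder: missing input is padded to the zero address."""
    hex_digits = _strip_prefix(value or "")
    return "0x" + hex_digits.rjust(40, "0")


def encode_address_strict(value):
    """Hypothetical strict encoder: anything but 40 hex digits is rejected."""
    if not isinstance(value, str):
        raise ValueError("address is missing")
    hex_digits = _strip_prefix(value)
    if len(hex_digits) != 40:
        raise ValueError("address must be exactly 20 bytes")
    if any(ch not in string.hexdigits for ch in hex_digits):
        raise ValueError("address contains non-hex characters")
    return "0x" + hex_digits.lower()
```

With the lenient helper, `None` and `""` both encode to `ZERO_ADDRESS`, which a contract following the zero convention would then interpret as a mint or burn.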

To give more context on the problem with non-controlled external input, I just “copy-and-paste” my last comment on the ERC-1155 discussion “issue” topic (ERC: Multi Token Standard · Issue #1155 · ethereum/EIPs · GitHub):

“Probably it is a minor issue when using strongly type-safe languages (Elm, VLang, …); the risk increases with (weakly) type-safe languages (C/C++/GoLang, Java, dotNet, TypeScript, …) and arises everywhere when using non-typed languages (JavaScript, …).”

Unfortunately most dApps are developed, due to factors outside the control of Ethereum EIPs, in JavaScript/TypeScript. Such languages are (very) far from being null-safe. Even with careful QA, any minor upgrade can introduce new nulls that are accidentally turned into an undesired “zero-address”. This could potentially result in tokens being burned or minted at random.

One way to fix this issue would be to create a (meta?) EIP about contracts being null-safe.

Contracts adhering to this interface will not accept the 0x (zero) address for any purpose (other than backward compatibility) and will define alternative default addresses (e.g. address(uint160(uint256(sha256("burn_address_dst"))))) to force client apps building the transaction to explicitly use an address that cannot be inadvertently generated by a programming error.
Contracts adhering to this interface can introduce some mechanism to warn that the 0x address is deprecated.
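For client code, the same sentinel can be derived off-chain. The following is a sketch in Python, assuming the Solidity expression above is read as "the low 160 bits of the SHA-256 digest of the label", which corresponds to the last 20 bytes of the digest. The helper name `sentinel_address` is hypothetical.

```python
import hashlib

ZERO_ADDRESS = "0x" + "00" * 20


def sentinel_address(label):
    """Low 160 bits of sha256(label), i.e. the last 20 bytes of the digest."""
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    return "0x" + digest[-20:].hex()


# Derived once and then treated as a named constant by the client.
BURN_ADDRESS = sentinel_address("burn_address_dst")
```

A client has to reference `BURN_ADDRESS` on purpose; an empty string, `None`, or a zero-initialized buffer does not produce it by accident.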

Another advantage of this approach is that it introduces a pattern for upgrading contract interfaces resiliently while maintaining backward compatibility. For example, ERC-20/1155/… can be extended functionally by just defining new hardcoded addresses as new use cases arise. In that sense the meaning of the address data becomes twofold:

It can be a real EOA / contract address, or
it can be a “command”: different EIP implementations just need to “switch-case” on the command value to add new functionality.

A simple pseudo-code would look similar to:

if (to == BURN_ADDRESS || to == address(0) /* backward compatibility */) {
    // ... burn
} else if (to == LOCK_BY_OWNER_ADDRESS) {
    // ... lock and let user unlock
} else if (to == HASH_LOCK_TYPE0_ADDRESS) {
    // ... atomic swap implementing lock type 0
} else if (to == HASH_LOCK_TYPE1_ADDRESS) {
    // ... atomic swap implementing lock type 1
} else {
    // ... ordinary transfer to a real EOA / contract address
}

HASH_LOCK_TYPE0_ADDRESS could be added to the EIP standard after initial publishing, and HASH_LOCK_TYPE1_ADDRESS can be added two years later, with no break in compatibility.

Comments and Feedback welcome!

PhABC · October 28, 2021, 3:26pm

ERC-1155 mandates that transfers revert if the recipient is the 0x0 address:

MUST revert if _to is the zero address.

Only the Transfer events on mint/burn require using the 0x0 address.

earizon · November 17, 2021, 5:08pm

Hi, sorry for the late reply (system overload). Yes, mandating a revert on 0x0 fixes the problem for the particular use case of ERC-1155 transfers. It will probably fix other scenarios too, but many others will remain undefined.

For example, without leaving the ERC-1155 EIP, an external client listening for events can be waiting for 0x0 addresses (mint/burn) to trigger some action (maybe a costly one, such as another transfer in a cross-chain atomic swap). Unfortunately, it could be the case that ERC-1155 emits a transfer with a non-0x0 address, but the external client is wrongly left (due to buggy frameworks/libraries/languages/coding skills) with an uninitialized variable that finally translates to a false 0x0, triggering an undesired action. If we force clients (through an EIP) to react to some “arbitrary-but-never-zero” “mint” or “burn” address (defined in the EIP standard), the risk of such an error is drastically decreased, since it is hard to believe that even the worst dynamic programming language (“JavaScript”) will set an uninitialized variable to an arbitrary value matching all 20 bytes of an EIP-defined address.
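A small Python sketch of that listener scenario follows. Everything in it is hypothetical: the event is a plain dict, `MINT_SENTINEL` is a placeholder constant rather than a value defined by any EIP, and both classifiers deliberately share the same bug (a missing `from` field is defaulted to zero) so that only the convention differs.

```python
ZERO_ADDRESS = "0x" + "00" * 20
# Placeholder sentinel for illustration; a real EIP would fix its own constant.
MINT_SENTINEL = "0x" + "a1" * 20


def classify_zero_convention(event):
    """Zero means 'mint': a missing sender is indistinguishable from a mint."""
    sender = event.get("from") or ZERO_ADDRESS  # buggy default
    return "mint" if sender == ZERO_ADDRESS else "transfer"


def classify_sentinel_convention(event):
    """Zero is never valid, so the same buggy default is caught."""
    sender = event.get("from") or ZERO_ADDRESS  # same buggy default
    if sender == ZERO_ADDRESS:
        raise ValueError("zero address is never valid under this convention")
    return "mint" if sender == MINT_SENTINEL else "transfer"


# An event whose 'from' field was lost somewhere in the client stack.
malformed_event = {"to": "0x" + "11" * 20}
```

Under the zero convention the malformed event is reported as a mint and could trigger the costly follow-up action; under the sentinel convention it raises instead.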

A related problem arises with functions returning true | false. I am suffering from it right now: EIP-165 returns true to indicate that an interface is implemented and false otherwise. The Truffle framework I am using for testing has a bug: queries returning “0x0000000000000000000000000000000000000000000000000000000000000000” in the JSON-RPC response are correctly parsed to false in JS, but anything else is parsed, 100% arbitrarily, to true. Unfortunately, when the call fails, the resulting error is again parsed to true (a similar bug has existed for more than 15 years in Java's Boolean.parseBoolean). The result is that wrong code throwing an exception wrongly shows up as “OK tests”.
Using some arbitrary bytes32(“IS_IMPLEMENTED”) and bytes32(“IS_NOT_IMPLEMENTED”) (or similar) would avoid this problem, since the testing client would need to explicitly compare the result against each of the two values in turn, and raise an exception otherwise.
Again, the problem is not with EIPs or Solidity but with existing (and permanently buggy) external clients and programming languages. But not taking such a common problem into consideration can be “unrealistic” in real deployments.
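The difference between the two parsing styles can be sketched in Python. `parse_bool_lenient` only mimics the behavior described above and is not Truffle's actual code; the two sentinel words assume that bytes32("…") left-aligns the ASCII text and pads it with zero bytes on the right. All function and constant names are hypothetical.

```python
ZERO_WORD = "0x" + "00" * 32


def to_bytes32_hex(text):
    """ASCII text left-aligned in 32 bytes, zero-padded on the right."""
    return "0x" + text.encode("ascii").ljust(32, b"\x00").hex()


IS_IMPLEMENTED = to_bytes32_hex("IS_IMPLEMENTED")
IS_NOT_IMPLEMENTED = to_bytes32_hex("IS_NOT_IMPLEMENTED")


def parse_bool_lenient(raw):
    """Only the all-zero word is false; anything else, even an error, is true."""
    return raw != ZERO_WORD


def parse_sentinel(raw):
    """Accept exactly two known words and reject everything else."""
    if raw == IS_IMPLEMENTED:
        return True
    if raw == IS_NOT_IMPLEMENTED:
        return False
    raise ValueError("unexpected response: " + repr(raw))


# A failed call that reaches the parser as an error string instead of a word.
error_payload = "execution reverted"
```

The lenient parser turns `error_payload` into `True`, so a failing call passes the test; the sentinel parser raises on it, and also on the all-zero word.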

lukehutch · December 16, 2021, 2:29am

@PhABC is correct: any good implementation of these contracts rejects transactions that pass address(0) into an address-typed parameter.

If address(0) is present in an emitted event, it is only a placeholder for event parameters that are not even provided in the function parameters (for example the sender for mint, or the recipient for burn).

earizon · February 4, 2022, 5:02pm

@lukehutch I agree that good/correct/carefully-developed implementations will have no problem.

The original post is concerned with implementations that are not carefully developed. Also, getting used to never-zero values opens the possibility of adding an extra “safe” attribute to type definitions in Solidity. For example, Solidity could default to reverting transactions whose input/output data (input parameters in public/external functions, output values in emitted events) is zero, unless the parameter is explicitly marked as “zeroable”, in the same way that addresses must be marked as payable. By forcing developers to think about the arbitrarily chosen addresses sent to transactions, and to map those address values to some sort of “descriptive string”, the risk decreases considerably. This increases the global security of the blockchain solution (considering the solution as the sum of under-control EVMs and out-of-control wallets, dApps, middleware, …).
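Since “zeroable” is not an existing Solidity feature, the idea can only be modeled. The Python sketch below imitates it with a decorator that rejects zero-valued keyword arguments unless the parameter is listed as zeroable. The names `non_zero_inputs`, `Reverted` and `transfer` are all hypothetical.

```python
import functools

ZERO_ADDRESS = "0x" + "00" * 20


class Reverted(Exception):
    """Stands in for a reverted transaction."""


def non_zero_inputs(zeroable=()):
    """Reject zero-valued keyword arguments unless they are listed as zeroable."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(**kwargs):
            for name, value in kwargs.items():
                if name not in zeroable and value in (0, ZERO_ADDRESS):
                    raise Reverted(f"parameter '{name}' is zero but not zeroable")
            return func(**kwargs)
        return wrapper
    return decorator


@non_zero_inputs(zeroable=("amount",))
def transfer(to, amount):
    return f"sent {amount} to {to}"
```

Here `transfer(to=ZERO_ADDRESS, amount=5)` raises `Reverted`, while a zero `amount` is allowed because it was explicitly marked.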