Immutables, invariants, and upgradability

fubuloubu · January 17, 2019, 2:41pm

For Vyper, we’ve discussed adding function-level recursion locks that would attempt to prevent mutual recursion between a set of contracts, but I think it would involve a lot of overhead and be complex enough to open up a lot of attack surface in practice.

I really like the proposal of adding a callback-safe transfer opcode because it gives the developer an explicit option to reduce their attack surface and protect themselves when a particular protocol would otherwise have safety issues. Re-entrancy is probably one of the most complex bugs possible with smart contracts, and I think it is important to provide protocol-level tools against unintended behaviors, as they would actually mitigate the problem instead of band-aiding it as the 2300 gas stipend does.

This “callback-safe” version of transfer could allow STATICCALLs back but no mutating function calls. This might also be more broadly useful as a method of calling, something like FINALCALL, that does not allow mutating calls to itself after the call is forwarded, e.g. “I don’t care what you do with this, but don’t come crawling back to me with it because I won’t be listening”.
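To make the idea concrete, here is a toy Python model of what I imagine FINALCALL doing. It is only a sketch: `ToyVM`, `final_call`, `call`, and `ReentryForbidden` are made-up names, not an existing EVM interface, and the `static` flag is taken on trust rather than enforced.

```python
class ReentryForbidden(Exception):
    """Raised when a mutating call targets a contract that issued a FINALCALL."""


class ToyVM:
    def __init__(self):
        # Contracts that have forwarded control via the hypothetical FINALCALL.
        self.sealed = set()

    def final_call(self, caller, callee):
        """Hand control to `callee`; until it returns, `caller` only accepts static calls."""
        self.sealed.add(caller)
        try:
            return callee(self)
        finally:
            self.sealed.discard(caller)

    def call(self, target, fn, static=False):
        """Ordinary call into `target`; mutating calls into a sealed target fail."""
        if target in self.sealed and not static:
            raise ReentryForbidden(target)
        return fn()


def demo():
    vm = ToyVM()
    state = {"balance": 10}

    def read_balance():          # read-only, like a STATICCALL target
        return state["balance"]

    def drain():                 # mutating function an attacker would like to re-enter
        state["balance"] = 0

    def polite_recipient(vm):
        return vm.call("bank", read_balance, static=True)

    def hostile_recipient(vm):
        return vm.call("bank", drain)

    seen = vm.final_call("bank", polite_recipient)
    try:
        vm.final_call("bank", hostile_recipient)
        blocked = False
    except ReentryForbidden:
        blocked = True
    return seen, blocked, state["balance"]
```

In this model the read-only callback succeeds, while the mutating one is rejected before it can touch the caller's state.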

I do agree with @Arachnid that this starts to break the “composable” behavior that developers tend to tout of Ethereum smart contracts, but it’s a trade of interoperability for safety that I think would be very helpful to developers.

I’ll caveat all of the above with “I am not a VM expert, and this all could be very difficult to design”.

lrettig · January 17, 2019, 5:43pm

The backwards compatibility of x86 is a helpful example, and I’m thankful that @jpitts brought it up. I’ve heard @gcolvin speak about this before as well. But lest we compare ourselves too closely to Intel, I just want to point out a glaring difference: ours is an adversarial environment where the attacker can see, and execute, code on our “machine” at will. For this reason I think we should adopt a different set of principles and priorities in our design, and safety should be an even higher priority for us.

lrettig · January 17, 2019, 5:46pm

The honest answer to your question is that IMHO we should not have just one kind of “gas.” It should be multi-dimensional, and we should try to more accurately reflect the orthogonal costs of bandwidth, storage, compute, etc. I fear that monolithic gas is too great an abstraction for a functional, safe, efficient machine.
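As a rough illustration of what multi-dimensional gas could look like, here is a minimal Python sketch. The three dimensions come from the paragraph above; the per-opcode numbers, the `GasVector` type, and the `meter` helper are invented for the example and do not reflect any real or proposed schedule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GasVector:
    compute: int = 0
    storage: int = 0
    bandwidth: int = 0

    def __add__(self, other):
        return GasVector(
            self.compute + other.compute,
            self.storage + other.storage,
            self.bandwidth + other.bandwidth,
        )

    def fee(self, prices):
        """Total fee when each dimension has its own price."""
        return (
            self.compute * prices.compute
            + self.storage * prices.storage
            + self.bandwidth * prices.bandwidth
        )


# Made-up per-opcode costs, purely for illustration.
COSTS = {
    "ADD": GasVector(compute=3),
    "SSTORE": GasVector(compute=100, storage=5000),
    "LOG": GasVector(compute=50, bandwidth=300),
}


def meter(ops):
    """Sum the cost vectors of a sequence of opcode names."""
    total = GasVector()
    for op in ops:
        total = total + COSTS[op]
    return total
```

With per-dimension prices, `meter(["ADD", "SSTORE", "LOG"]).fee(GasVector(1, 2, 3))` charges storage and bandwidth at different rates from compute, which a single gas number cannot express.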

lrettig · January 17, 2019, 5:55pm

Before this behaviour was discovered, nobody considered reducing the gas cost of SSTORE a potentially breaking change; reducing a cost is less likely to cause problems with contract execution than increasing it

What this issue taught us (yes, with the benefit of hindsight) is that even something as seemingly benign as reducing the gas cost of an opcode can have unintended knock-on effects that violate perceived “invariants.” Unless someone can generate some sort of formal proof that reducing gas costs in the future cannot have this effect, then I’m afraid we are stuck. We had months to evaluate this EIP and prepare for this hard fork and many intelligent people missed this potential issue, and the same could happen with any future change of this sort. And that’s just for reductions in gas cost - what about for other types of changes?
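The arithmetic behind the broken “invariant” is small enough to write down. The sketch below uses the 2300 gas stipend, the pre-EIP-1283 SSTORE costs, and the 200 gas EIP-1283 charges for writing to an already-dirty slot; the helper name is made up, and it deliberately ignores the cost of every other opcode the callee would need.

```python
CALL_STIPEND = 2300  # gas handed to the recipient of a plain value transfer

# SSTORE cost before EIP-1283.
SSTORE_PRE_1283 = {"zero_to_nonzero": 20000, "other": 5000}

# EIP-1283: writing to a slot already modified earlier in the same transaction.
SSTORE_EIP_1283_DIRTY = 200


def writes_affordable(gas, sstore_cost):
    """How many SSTOREs fit in `gas`, ignoring all other opcode costs."""
    return gas // sstore_cost
```

Under the old schedule `writes_affordable(CALL_STIPEND, SSTORE_PRE_1283["other"])` is zero, which is the assumption contracts relied on; with the dirty-slot price it is no longer zero.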

It seems to me that, fundamentally, we are faced with a stark choice between the following:

Violate this “tacit social contract” (as I am referring to it) and accept whatever may come of that, including hacks and breakage, or
Do not make changes to existing opcodes, no matter how benign

Versioning seems like a more practical approach, but will likely require consensus-level changes in order to function. On the other hand, it will also open the door to EWASM, which would require some kind of versioning anyway

Yes, this is promising and we need to give this more thought, but you’re right that it’s necessary for Ewasm anyway and it’s something we’ve begun to explore. We can continue that conversation here. In general I think we should not attempt to reinvent the wheel and should take as much as we can from existing package management systems such as npm, yarn, cargo, etc.

Ethernian · January 17, 2019, 6:26pm

I can understand your point about multi-dimensional gas if you are talking about opcodes sampling. Yes, the sampled gas cost for some opcode is a “sum” of “orthogonal costs of bandwidth, storage, compute, etc”.

But once the “combined” gas cost is sampled into a single number, I don’t see any reason for the user to split it back into dimensions. Multi-dimensional gas should imply a multi-dimensional gas price, but I don’t see who will need it. What would a user express by setting a network gas price higher than the computational gas price? Unclear to me.

Ethernian · January 17, 2019, 6:39pm

@lrettig, shouldn’t we extend the topic to “Immutables, invariants and upgradability”?
The objects are: smart contracts, the EVM, and the social contract around it.

Currently we have it fragmented:

there are works on contract upgradability,
there are discussions on EVM upgradability,
social contract upgrades (like gas cost changes) are not even under discussion yet

I think all this stuff is in the same domain and tightly coupled.
It is really worthy of thoughtful research and specification.

lrettig · January 17, 2019, 6:44pm

Back to your original question:

Do you think there will be no need to tune a single opcode’s cost in the future, even if the hardware changes significantly?

You’re right that multidimensional gas cost does not really help address this question. I think, yes, we probably do want to/need to be able to tune an opcode’s gas cost in the future. There are two ways we could tune:

Up, in case it’s too low, which would probably only happen to mitigate a DoS attack, which I would consider an emergency, and which in any case would definitely not increase the risk of re-entrancy
Down, in which case we might introduce a new, cheaper version of the opcode, or alternatively a new EVM version with a cheaper opcode

lrettig · January 17, 2019, 6:44pm

Agree, good point, will update the subject

Ethernian · January 17, 2019, 6:47pm

I meant re-entrance locks. Unsure whether you mean the same with recursion locks.
Re-entrance locks are simple and intuitive in Solidity, although quite expensive (exactly this issue was targeted by EIP-1283).
I am wondering what exactly you mean by “too complex / a lot of overhead” in Vyper?

lrettig · January 17, 2019, 6:57pm

Copying over some relevant posts on this topic from the other thread:

Remediations for EIP-1283 reentrancy bug

One more (much more technically challenging) solution would be to assign an EVM version and gas prices to contracts at deployment. That means that a smart contract deployed before the hard fork is always executed with the old gas prices (and the old features of the EVM).

So, we will have EVM0 (pre Constantinople) and EVM1 (Constantinople). When a new contract (running EVM1) calls anything that is deployed before that, EVM1 communicates to EVM0, and the old contract will use old gas prices and old assumptions will stay the same. This communication isn’t trivial, but since contracts have very specific interfaces it is not impossible.

Cons:

more complicated codebase and testing;

more complicated contract interaction;

codebase bloat with every hard fork;

Pros:

contracts that are already deployed will always stay the same and behave the same;

an incentive for those who can upgrade their contracts to move to the new version, because of cheaper gas, etc.

I still think that this might solve the whole class of problems like that and might be worth it in the long run, because contract behaviour would be truly immutable.

Remediations for EIP-1283 reentrancy bug

This is a reasonable idea - I’ll add it to the list. The main barrier is that it will require either a new consensus field for accounts, or some other means of communicating EVM versioning.

It’s worth noting that this doesn’t require two entirely separate EVMs, just some context that gets passed around for the current execution environment. Nodes already need most of this functionality to handle previous hard forks that have changed execution rules.
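A minimal Python sketch of that “context that gets passed around” might look like the following. It assumes each account records the EVM version it was deployed under (the hypothetical `evm_version` field stands in for the new consensus field mentioned above) and that the interpreter picks a gas schedule from it. The 5000 and 200 figures are the cost of rewriting a non-zero slot already changed earlier in the transaction, before and after EIP-1283; everything else is illustrative.

```python
from dataclasses import dataclass

# One gas schedule per EVM version; only a single illustrative entry each.
GAS_SCHEDULES = {
    0: {"SSTORE_REWRITE": 5000},  # EVM0: pre-Constantinople rules
    1: {"SSTORE_REWRITE": 200},   # EVM1: EIP-1283 net gas metering
}


@dataclass
class Account:
    address: str
    evm_version: int  # hypothetical per-account version field


def op_cost(callee: Account, op: str) -> int:
    """Charge `op` under the rules of the version the callee was deployed with."""
    return GAS_SCHEDULES[callee.evm_version][op]
```

A contract deployed before the fork keeps paying the old price even when called from a new one, so its gas-based assumptions keep holding.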

One way to handle this would be to introduce a new opcode, along these lines:

VERSION : Pops one element from the stack and changes the execution environment to the specified version. Clears stack and local memory before handing control to the new version, which begins executing at the next PC value.

Each new contract would then start with a prologue along the lines of PUSH 1 VERSION to enable the new EVM. This avoids the need to introduce new consensus data structures.
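The prologue can be sketched with a toy interpreter. This is a Python model of the behaviour described above, not a real EVM: `ToyEVM` and its three-instruction set are assumptions for the example, and version 0 stands for “legacy rules”.

```python
class ToyEVM:
    def __init__(self, code):
        self.code = code      # list of tuples, e.g. ("PUSH", 1) or ("VERSION",)
        self.pc = 0
        self.version = 0      # contracts without a prologue stay on legacy rules
        self.stack = []
        self.memory = {}

    def step(self):
        op, *args = self.code[self.pc]
        self.pc += 1
        if op == "PUSH":
            self.stack.append(args[0])
        elif op == "VERSION":
            # Pop the requested version, then clear stack and local memory
            # before execution continues at the next PC value.
            self.version = self.stack.pop()
            self.stack.clear()
            self.memory.clear()
        elif op == "STOP":
            return False
        else:
            raise ValueError(f"unknown op {op!r} in toy interpreter")
        return True

    def run(self):
        while self.pc < len(self.code) and self.step():
            pass
        return self.version
```

Running `[("PUSH", 1), ("VERSION",), ("PUSH", 7), ("STOP",)]` ends in version 1, while the same code without the two-instruction prologue stays on version 0.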

This can even be used for a transition to Web Assembly; contracts would just start with a prologue that switches the execution environment to EWASM.

Alternately, this could be a pseudo-opcode that’s only valid at the start of a contract, for simplicity reasons.

CC @mandrigin, @Arachnid, @rajeevgopalakrishna

veox · January 17, 2019, 7:20pm

No. It introduces a condition that nearly all currently-deployed contracts were written without considering. This is exactly the opposite of the rest of your post.

Operation of all contracts will have to be re-considered, and many (most?) will need to be rewritten, to answer a new question: “Who pays the rent?”

Either this, or some form of special-casing is introduced for “pre-rent” contracts; this increases system complexity a little, and incentivises “state hoarding” up until the feature is enabled (as described in this post). The latter can (probably) be worked around, but then it increases complexity greatly.

The problem is, we can’t reasonably expect both decentralisation and pay-once general storage.

Sharding (at best) delays this, and (at worst) allows a much more rapid growth.

Personally, I would much rather see exodus from the (future) Ethereum 1.x shard into rent-enabled shards, rather than the same free-for-all. At least if we’re to expect people to run PoS-enabled clients on their laptops. (Replace “shard” with “side-chain” if needed.)

Whether rent should be enabled on 1.x is (still) an open question, IMO. But it should be, eventually, somewhere. [All] costs should be internalised, otherwise the protocol will suffer a “tragedy of the commons”.

fubuloubu · January 17, 2019, 7:22pm

Same thing. I call it a “mutual recursion” issue since we protect against recursion internally (a Vyper contract cannot recursively call itself).

We were brainstorming a way to do it behind the scenes, basically some sort of bloom filter mechanism that would be efficient enough in practice (1 word per contract). We decided against it. The alternative is to track the call addresses explicitly per call, which would be very expensive.
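For readers wondering what a one-word bloom filter could look like, here is a rough Python sketch. It is not the design we discussed, only an illustration: the choice of SHA-256, three bit positions per address, and all function names are assumptions. Being a bloom filter, it can report false positives (rejecting a call that is not actually re-entrant) but never false negatives.

```python
import hashlib

WORD_BITS = 256   # one EVM word
NUM_HASHES = 3    # bit positions set per address (arbitrary choice)


def mask_for(address: bytes) -> int:
    """Bit mask with NUM_HASHES positions derived from the address."""
    digest = hashlib.sha256(address).digest()
    mask = 0
    for i in range(NUM_HASHES):
        pos = int.from_bytes(digest[2 * i:2 * i + 2], "big") % WORD_BITS
        mask |= 1 << pos
    return mask


def maybe_on_stack(filter_word: int, address: bytes) -> bool:
    """True if `address` may already be in the call chain (false positives possible)."""
    mask = mask_for(address)
    return (filter_word & mask) == mask


def enter(filter_word: int, address: bytes) -> int:
    """Return the filter to pass down the call chain, or refuse a possible re-entry."""
    if maybe_on_stack(filter_word, address):
        raise RuntimeError("possible re-entry detected")
    return filter_word | mask_for(address)
```

Because each callee receives a new word and the caller keeps its own copy, returning from a call “removes” the callee without needing deletion support in the filter.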

fubuloubu · January 17, 2019, 7:24pm

Technical note:
Could they be EVM0.1, EVM1.0, EVM1.1, etc?

lrettig · January 17, 2019, 7:48pm

Those already exist: https://github.com/ethereum/py-evm/tree/master/eth/vm/forks

jpitts · January 17, 2019, 8:04pm

It is true that a key idea of first principles thinking is to not reason based on analogy.

Different principles and priorities must be applied given that this is on a live network and all operations are costed out. Still, it is important to learn from how the PC and other platforms evolved, what principles they adopted for upgradability, and how they survived against competing ecosystems.

Every platform I have developed for or deployed, and every device we use, has defined certain known points of stability over time, but eventually most apps will break… because the platform must move forward or die. The saving grace is being able to quickly understand the context in which an app is running, breaking, potentially becoming insecure if deployed on a newer version of the platform.

In “tech talks”, conferences, network upgrades, and in this upgradability discussion, I sense that we are getting beyond copying industry and into understanding why they do what they do.

What industry players do to maintain stability for their developers:

Maintain an up-to-date specification that captures the current, full system. Clearly number & describe the milestone releases and the updates within those releases.
Delineate the key parts of the system, their versions, and how they fit together into a milestone release of the platform.
Use concise language for the categories of expected behavior for developers deploying apps targeting a certain milestone release (microprocessors and other hardware developers call them “series”). A release isn’t just a set of new features described in specs.
Point to implications and areas of risk for developers in a given milestone release that arise from other parts of the platform changing.
Clearly describe policies around “what is supported” e.g. LTS, STS. We must find a way to position this social contract in our decentralized situation, and establish what should and can feasibly be guaranteed.

Are there other ways of “platforms communicating with their devs” that I am missing?

Ethernian · January 17, 2019, 8:48pm

Why haven’t you used locks in storage, as in Solidity? Are we talking about locks programmable by devs or built-in locks provided by the language to any function?

fubuloubu · January 17, 2019, 9:28pm

Built-in.

I wasn’t aware they existed in Solidity, but I would hesitate to add the complexity.

Ethernian · January 17, 2019, 11:07pm

Re-entrance locks exist in Solidity as a pattern (modifier), not as a built-in feature. Nevertheless, they are quite simple and easy to use. For example, this one.
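For anyone who has not seen the pattern, its shape can be shown in a few lines. The sketch below is a Python analogue, not Solidity and not the linked example: a decorator plays the role of the modifier, and `Vault`, `non_reentrant`, and `ReentrantCall` are made-up names.

```python
import functools


class ReentrantCall(Exception):
    """Raised when a guarded method is entered while it is already running."""


def non_reentrant(method):
    """Decorator playing the role of a Solidity lock modifier."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if self._locked:
            raise ReentrantCall(method.__name__)
        self._locked = True           # in Solidity this would be a storage write
        try:
            return method(self, *args, **kwargs)
        finally:
            self._locked = False      # release the lock on the way out
    return wrapper


class Vault:
    def __init__(self, balance):
        self.balance = balance
        self._locked = False

    @non_reentrant
    def withdraw(self, amount, on_receive):
        if amount > self.balance:
            raise ValueError("insufficient balance")
        on_receive(amount)            # external call before the state update (the risky order)
        self.balance -= amount
```

The two writes to the lock flag on every guarded call are what make the pattern expensive on-chain.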

fubuloubu · January 17, 2019, 11:25pm

Ah, that’s what I thought. We were proposing it as a feature.

Arachnid · January 18, 2019, 2:25am

Sorry, this wasn’t really a very clear description from me. What I meant to say is that it would likely require a change to consensus data structures (specifically, the accounts struct) to record the version. The version opcode I was referring to would avoid the need for that, though.

Can you give an example?

RE locks, personally I believe these are a code smell; I’ve yet to see a contract designed with locks that couldn’t be rewritten to be safe without them. I really think they’re a bandaid developers will use to avoid having to reason about how their code works properly, and will encourage bad development practice.

That said, 1283 would have made them more affordable to use.