Since this EIP is up for discussion for Shanghai again, maybe it’s good to clarify the following to avoid misunderstandings:
It is still my position (and I suspect it is shared by the Solidity team in general) that gas savings cannot serve as justification for this EIP, and that the gas analysis in the EIP is a bit misleading. For example, the subroutine examples use RJUMP, which is independent of this EIP and would also reduce the gas costs of the non-subroutine versions if used there. Likewise, the swap for the return address can be integrated nicely into the overall stack layout of more complex functions, where a globally swap-free stack layout usually cannot be achieved anyway, so this is no major issue for us. There are further points like these, but I don't think there is a lot of merit in arguing them in detail. More importantly, any gas savings this EIP may theoretically bring will be negligible in practice for Solidity-generated bytecode, since internal function calls occur only at very low frequency in real-world Solidity-generated bytecode (due to inlining, among other reasons).
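To make the "negligible in practice" argument concrete, here is a rough back-of-the-envelope sketch. The function name and all numbers are hypothetical placeholders, not measurements of real contracts and not gas costs taken from the EIP; the only point is that the relative saving scales with how often internal calls actually occur.

```python
def relative_gas_saving(internal_calls: int, saving_per_call: int, total_gas: int) -> float:
    """Fraction of a transaction's gas saved if every internal call
    became cheaper by `saving_per_call` gas.

    All inputs are illustrative assumptions, not measured values.
    """
    if total_gas <= 0:
        raise ValueError("total_gas must be positive")
    return internal_calls * saving_per_call / total_gas


if __name__ == "__main__":
    # Hypothetical transaction: 20 internal calls, 10 gas saved per call,
    # 100,000 gas used overall.
    fraction = relative_gas_saving(internal_calls=20, saving_per_call=10, total_gas=100_000)
    print(f"{fraction:.2%}")
```

With few internal calls per transaction (for example after inlining), the numerator stays small relative to the total, whatever the exact per-call saving turns out to be.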
That is not to say that I oppose this EIP. I just want to make sure that, if it is considered, it is considered for the right reasons. In combination with EIP-4200, the subroutines of this EIP can serve to completely eliminate the need for dynamic jumps, which may have significant merit independently of gas savings. But this merit has to be argued for on its own, and different approaches like EIP-4750 have to be considered. Such arguments and considerations will be largely independent of the use in Solidity (and of any theoretical gas savings).
Also, sorry if I have missed or misunderstood any particular developments. We have quite a high workload these days and I have not found the time to review the updated EIP in full detail, but maybe the general sentiment expressed above is helpful.