Great work, @axic, @chfast, @gumb0.
As for the main subroutine: Nethermind actually already separates the top-level call from subcalls. I was planning to refactor that, but this will make me think twice.
As for the jump dest analysis: it currently has barely any performance impact, because the result is cached and computing it takes a single linear pass over the code. It may be relevant for stateless clients, but I agree with Martin that, given how simple and optimized current implementations are, any additional splitting would probably make it slower.
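
To make the "single pass plus caching" point concrete, here is a rough sketch in Python. It is not Nethermind's code. The function names and the cache size are made up for illustration, and a real client would more likely key the cache by code hash than by the raw bytes.

```python
from functools import lru_cache
from typing import FrozenSet

JUMPDEST = 0x5B
PUSH1 = 0x60
PUSH32 = 0x7F


def analyze_jumpdests(code: bytes) -> FrozenSet[int]:
    """Collect valid JUMPDEST offsets in one left-to-right pass."""
    valid = set()
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == JUMPDEST:
            valid.add(pc)
        elif PUSH1 <= op <= PUSH32:
            # Skip the immediate data: a 0x5b byte inside it is not a jump target.
            pc += op - PUSH1 + 1
        pc += 1
    return frozenset(valid)


# Hypothetical cache keyed by the bytecode itself (assumption: a real
# client would key by code hash to avoid holding the bytes as the key).
@lru_cache(maxsize=1024)
def cached_jumpdests(code: bytes) -> FrozenSet[int]:
    return analyze_jumpdests(code)
```

With this shape, repeated calls into the same contract only pay for the pass once, which is why the analysis barely shows up in profiles.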