LLVM-EVM backend is now available

stev · February 10, 2020, 3:55pm

The alpha release of the LLVM-EVM backend is available. Here’s the announcement:
https://medium.com/etc-core/announcing-evm-llvm-alpha-release-the-next-level-in-smart-contract-evolution-91a2119bd442

more programming languages for smart contracts
more toolchain support
future-proof and long-term support

fubuloubu · February 10, 2020, 5:27pm

Other benefits:

almost 20 years of advanced optimization techniques (cheaper smart contracts)
nearly 100% adoption among the security community (safer smart contracts)
no JIT bombs or other concerns from trying to adapt web technologies to our needs.

chfast · February 10, 2020, 10:23pm

This is not so simple. There are a ton of EVM-specific “builtins”, and all of them must be exposed in every frontend. I’m guessing that’s the reason there are no examples of a smart contract written in any of the mentioned programming languages.

Here is an LLVM IR example I found on the “Compiling smart contracts” page of the etclabscore/evm_llvm wiki on GitHub.

declare i256 @llvm.evm.calldataload(i256)
declare void @llvm.evm.return(i256, i256)
declare void @llvm.evm.mstore(i256, i256)

define void @main() {
entry:
  call void @llvm.evm.mstore(i256 64, i256 128)
  %0 = call i256 @llvm.evm.calldataload(i256 0)
  %1 = call i256 @llvm.evm.calldataload(i256 32)
  %2 = call i256 @add(i256 %0, i256 %1)
  call void @llvm.evm.mstore(i256 0, i256 %2)
  call void @llvm.evm.return(i256 0, i256 32)
  unreachable
}

define i256 @add(i256, i256) #0 {
  %3 = alloca i256, align 4
  %4 = alloca i256, align 4
  store i256 %0, i256* %3, align 4
  store i256 %1, i256* %4, align 4
  %5 = load i256, i256* %3, align 4
  %6 = load i256, i256* %4, align 4
  %7 = add nsw i256 %5, %6
  ret i256 %7
}

Some questions about this code:

Why is EVM memory also handled by EVM-specific intrinsics? Why not use LLVM IR memory?
Why is the @add function generated? The %2 = call i256 @add(i256 %0, i256 %1) can be simply replaced with %2 = add i256 %0, %1.
The nsw in %7 = add nsw i256 %5, %6 is wrong.
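
To make the last question concrete: the EVM ADD opcode wraps modulo 2^256, whereas LLVM's nsw flag makes the result poison whenever the signed addition overflows. A minimal Python sketch of the difference follows. The helper names (evm_add, to_signed, nsw_add_is_poison) are made up for this illustration and are not part of evm_llvm or LLVM.

```python
WORD_BITS = 256
MODULUS = 1 << WORD_BITS          # 2**256
SIGN_BIT = 1 << (WORD_BITS - 1)   # 2**255


def evm_add(a: int, b: int) -> int:
    """EVM ADD: unsigned addition that wraps modulo 2**256."""
    return (a + b) % MODULUS


def to_signed(x: int) -> int:
    """Interpret a 256-bit word as a two's-complement signed integer."""
    return x - MODULUS if x & SIGN_BIT else x


def nsw_add_is_poison(a: int, b: int) -> bool:
    """True when `add nsw i256 a, b` would yield poison (signed overflow)."""
    exact = to_signed(a) + to_signed(b)
    return not (-SIGN_BIT <= exact < SIGN_BIT)


if __name__ == "__main__":
    max_signed = SIGN_BIT - 1
    # Well defined on the EVM: the result simply wraps into the sign bit.
    print(hex(evm_add(max_signed, 1)))
    # The same operands overflow as signed values, so nsw gives poison.
    print(nsw_add_is_poison(max_signed, 1))
    # Small operands are fine either way.
    print(nsw_add_is_poison(1, 2))
```

With nsw present, an optimizer may assume the overflowing case never happens, which does not match what a contract actually sees at runtime.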

LLVM 10 is in the release process. Are you going to rebase your work on top of that or on LLVM master? Also, Rust requires a very specific LLVM revision (see rust-lang/llvm-project on GitHub, the Rust-specific fork of LLVM), so this work cannot easily be used with Rust.

lialan · February 11, 2020, 2:04am

Hey @chfast, let me answer your questions:

Yes, the intrinsics must be generated by the frontend. This follows from the EVM being a very different architecture from conventional machines: it has a storage space that other architectures do not have. You have to control storage I/O explicitly in any case. This is inevitable in any IR, unless you implicitly define the behaviour of storage I/O in some higher-level IR.
In this case I was trying to mimic the behaviour of the function dispatcher in SOLC exactly. Notice that this function is at the very top level and is close to “bare metal”. Its purpose is to initialize the execution environment for the function that actually runs (in this case, the “add” function), so it has to explicitly issue an “mstore” to location 0x40. If you were to use LLVM IR memory, the location 0x40 would not be guaranteed. The 0x40 location is fixed because it stores the stack frame pointer.
Yeah, the call can simply be replaced with an “add” instruction. My purpose there was to show a very simple program being compiled to EVM: the function dispatcher (aka the “main” function) calls a function named “add”, retrieves the returned value, and returns it. To make it less confusing, I should have used a more complicated example such as Fibonacci.
You are right again here: “nsw” is an incorrect flag that will create poison values, and it should be changed. I created the test cases using some C functions as templates… So when creating LLVM IR, please follow the manual.
Yes, back-porting is required if we need specific base LLVM versions. It is not technically difficult, just a lot of chores. We will figure out which version to back-port to first. BTW, we are not targeting Rust at the beginning, as it would be too much work for the team (so far the team consists of only one person, but you are welcome to participate!). We will start with smart contract DSLs and C-like languages.
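
As a rough illustration of what the dispatcher in the IR example above does at runtime, here is a small Python model of the three intrinsics it uses. This is only a sketch: the class and function names (TinyEvm, run_add) are invented here, memory is modelled as a growable bytearray, and gas and every other part of the EVM are ignored. It does not reflect how evm_llvm is implemented.

```python
WORD = 32            # EVM word size in bytes
MODULUS = 1 << 256


class TinyEvm:
    """Hypothetical toy model: just memory, calldata and a return buffer."""

    def __init__(self, calldata: bytes):
        self.calldata = calldata
        self.memory = bytearray()
        self.returned = None

    def _grow(self, end: int) -> None:
        # EVM memory expands on demand and is zero-initialised.
        if end > len(self.memory):
            self.memory.extend(bytes(end - len(self.memory)))

    def mstore(self, offset: int, value: int) -> None:
        self._grow(offset + WORD)
        self.memory[offset:offset + WORD] = (value % MODULUS).to_bytes(WORD, "big")

    def calldataload(self, offset: int) -> int:
        # Reads past the end of calldata are zero-padded.
        chunk = self.calldata[offset:offset + WORD].ljust(WORD, b"\x00")
        return int.from_bytes(chunk, "big")

    def ret(self, offset: int, length: int) -> None:
        self._grow(offset + length)
        self.returned = bytes(self.memory[offset:offset + length])


def main(evm: TinyEvm) -> None:
    """Mirrors @main from the IR example, with the call to @add inlined."""
    evm.mstore(64, 128)               # write 0x80 to slot 0x40, as in the IR
    a = evm.calldataload(0)
    b = evm.calldataload(32)
    evm.mstore(0, (a + b) % MODULUS)  # wrapping add, no nsw semantics
    evm.ret(0, 32)


def run_add(a: int, b: int) -> int:
    """Encode two words as calldata, run main, decode the returned word."""
    calldata = a.to_bytes(WORD, "big") + b.to_bytes(WORD, "big")
    evm = TinyEvm(calldata)
    main(evm)
    return int.from_bytes(evm.returned, "big")


if __name__ == "__main__":
    print(run_add(2, 3))  # prints 5
```

The explicit write to offset 64 is the part that LLVM-managed memory could not guarantee, since the compiler would be free to place its own allocations anywhere.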

Yup, the designs and implementations will definitely change over time. The alpha version serves as a starting point for frontend integrations and as a proof of concept of compiling to EVM using the LLVM infrastructure. Please provide more feedback; we need people like you to help find problems in the codebase and make it better over time! Thanks!

pinkiebell · February 12, 2020, 1:27pm

Great work!
This project is something I have been missing since the beginning.