mirror of
				https://github.com/ethereum/solidity
				synced 2023-10-03 13:03:40 +00:00 
			
		
		
		
	
		
			
				
	
	
		
			1289 lines
		
	
	
		
			59 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
			
		
		
	
	
			1289 lines
		
	
	
		
			59 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
| .. _yul:
 | |
| 
 | |
| ###
 | |
| Yul
 | |
| ###
 | |
| 
 | |
| .. index:: ! assembly, ! asm, ! evmasm, ! yul, julia, iulia
 | |
| 
 | |
| Yul (previously also called JULIA or IULIA) is an intermediate language that can be
 | |
| compiled to bytecode for different backends.
 | |
| 
 | |
| Support for EVM 1.0, EVM 1.5 and eWASM is planned, and it is designed to
 | |
| be a usable common denominator of all three
 | |
| platforms. It can already be used in stand-alone mode and
 | |
| for "inline assembly" inside Solidity
 | |
| and there is an experimental implementation of the Solidity compiler
 | |
| that uses Yul as an intermediate language. Yul is a good target for
 | |
| high-level optimisation stages that can benefit all target platforms equally.
 | |
| 
 | |
| Motivation and High-level Description
 | |
| =====================================
 | |
| 
 | |
| The design of Yul tries to achieve several goals:
 | |
| 
 | |
| 1. Programs written in Yul should be readable, even if the code is generated by a compiler from Solidity or another high-level language.
 | |
| 2. Control flow should be easy to understand to help in manual inspection, formal verification and optimization.
 | |
| 3. The translation from Yul to bytecode should be as straightforward as possible.
 | |
| 4. Yul should be suitable for whole-program optimization.
 | |
| 
 | |
| In order to achieve the first and second goal, Yul provides high-level constructs
 | |
| like ``for`` loops, ``if`` and ``switch`` statements and function calls. These should
 | |
| be sufficient for adequately representing the control flow for assembly programs.
 | |
| Therefore, no explicit statements for ``SWAP``, ``DUP``, ``JUMP`` and ``JUMPI``
 | |
| are provided, because the first two obfuscate the data flow
 | |
| and the last two obfuscate control flow. Furthermore, functional statements of
 | |
| the form ``mul(add(x, y), 7)`` are preferred over pure opcode statements like
 | |
| ``7 y x add mul`` because in the first form, it is much easier to see which
 | |
| operand is used for which opcode.
 | |
| 
 | |
| Even though it was designed for stack machines, Yul does not expose the complexity of the stack itself.
 | |
| The programmer or auditor should not have to worry about the stack.
 | |
| 
 | |
| The third goal is achieved by compiling the
 | |
| higher level constructs to bytecode in a very regular way.
 | |
| The only non-local operation performed
 | |
| by the assembler is name lookup of user-defined identifiers (functions, variables, ...)
 | |
| and cleanup of local variables from the stack.
 | |
| 
 | |
| To avoid confusions between concepts like values and references,
 | |
| Yul is statically typed. At the same time, there is a default type
 | |
| (usually the integer word of the target machine) that can always
 | |
| be omitted to help readability.
 | |
| 
 | |
| To keep the language simple and flexible, Yul does not have
 | |
| any built-in operations, functions or types in its pure form.
 | |
| These are added together with their semantics when specifying a dialect of Yul,
 | |
| which allows to specialize Yul to the requirements of different
 | |
| target platforms and feature sets.
 | |
| 
 | |
| Currently, there is only one specified dialect of Yul. This dialect uses
 | |
| the EVM opcodes as builtin functions
 | |
| (see below) and defines only the type ``u256``, which is the native 256-bit
 | |
| type of the EVM. Because of that, we will not provide types in the examples below.
 | |
| 
 | |
| 
 | |
| Simple Example
 | |
| ==============
 | |
| 
 | |
| The following example program is written in the EVM dialect and computes exponentiation.
 | |
| It can be compiled using ``solc --strict-assembly``. The builtin functions
 | |
| ``mul`` and ``div`` compute product and division, respectively.
 | |
| 
 | |
| .. code-block:: yul
 | |
| 
 | |
|     {
 | |
|         function power(base, exponent) -> result
 | |
|         {
 | |
|             switch exponent
 | |
|             case 0 { result := 1 }
 | |
|             case 1 { result := base }
 | |
|             default
 | |
|             {
 | |
|                 result := power(mul(base, base), div(exponent, 2))
 | |
|                 switch mod(exponent, 2)
 | |
|                     case 1 { result := mul(base, result) }
 | |
|             }
 | |
|         }
 | |
|     }
 | |
| 
 | |
| It is also possible to implement the same function using a for-loop
 | |
| instead of with recursion. Here, ``lt(a, b)`` computes whether ``a`` is less than ``b``.
 | |
| less-than comparison.
 | |
| 
 | |
| .. code-block:: yul
 | |
| 
 | |
|     {
 | |
|         function power(base, exponent) -> result
 | |
|         {
 | |
|             result := 1
 | |
|             for { let i := 0 } lt(i, exponent) { i := add(i, 1) }
 | |
|             {
 | |
|                 result := mul(result, base)
 | |
|             }
 | |
|         }
 | |
|     }
 | |
| 
 | |
| At the :ref:`end of the section <erc20yul>`, a complete implementation of
 | |
| the ERC-20 standard can be found.
 | |
| 
 | |
| 
 | |
| 
 | |
| Stand-Alone Usage
 | |
| =================
 | |
| 
 | |
| You can use Yul in its stand-alone form in the EVM dialect using the Solidity compiler.
 | |
| This will use the :ref:`Yul object notation <yul-object>` so that it is possible to refer
 | |
| to code as data to deploy contracts. This Yul mode is available for the commandline compiler
 | |
| (use ``--strict-assembly``) and for the :ref:`standard-json interface <compiler-api>`:
 | |
| 
 | |
| .. code-block:: json
 | |
| 
 | |
|     {
 | |
|         "language": "Yul",
 | |
|         "sources": { "input.yul": { "content": "{ sstore(0, 1) }" } },
 | |
|         "settings": {
 | |
|             "outputSelection": { "*": { "*": ["*"], "": [ "*" ] } },
 | |
|             "optimizer": { "enabled": true, "details": { "yul": true } }
 | |
|         }
 | |
|     }
 | |
| 
 | |
| .. warning::
 | |
| 
 | |
|     Yul is in active development and bytecode generation is only fully implemented for the EVM dialect of Yul
 | |
|     with EVM 1.0 as target.
 | |
| 
 | |
| 
 | |
| Informal Description of Yul
 | |
| ===========================
 | |
| 
 | |
| In the following, we will talk about each individual aspect
 | |
| of the Yul language. In examples, we will use the default EVM dialect.
 | |
| 
 | |
| Syntax
 | |
| ------
 | |
| 
 | |
| Yul parses comments, literals and identifiers in the same way as Solidity,
 | |
| so you can e.g. use ``//`` and ``/* */`` to denote comments.
 | |
| There is one exception: Identifiers in Yul can contain dots: ``.``.
 | |
| 
 | |
| Yul can specify "objects" that consist of code, data and sub-objects.
 | |
| Please see :ref:`Yul Objects <yul-object>` below for details on that.
 | |
| In this section, we are only concerned with the code part of such an object.
 | |
| This code part always consists of a curly-braces
 | |
| delimited block. Most tools support specifying just a code block
 | |
| where an object is expected.
 | |
| 
 | |
| Inside a code block, the following elements can be used
 | |
| (see the later sections for more details):
 | |
| 
 | |
|  - literals, i.e. ``0x123``, ``42`` or ``"abc"`` (strings up to 32 characters)
 | |
|  - calls to builtin functions, e.g. ``add(1, mload(0))``
 | |
|  - variable declarations, e.g. ``let x := 7``, ``let x := add(y, 3)`` or ``let x`` (initial value of 0 is assigned)
 | |
|  - identifiers (variables), e.g. ``add(3, x)``
 | |
|  - assignments, e.g. ``x := add(y, 3)``
 | |
|  - blocks where local variables are scoped inside, e.g. ``{ let x := 3 { let y := add(x, 1) } }``
 | |
|  - if statements, e.g. ``if lt(a, b) { sstore(0, 1) }``
 | |
|  - switch statements, e.g. ``switch mload(0) case 0 { revert() } default { mstore(0, 1) }``
 | |
|  - for loops, e.g. ``for { let i := 0} lt(i, 10) { i := add(i, 1) } { mstore(i, 7) }``
 | |
|  - function definitions, e.g. ``function f(a, b) -> c { c := add(a, b) }```
 | |
| 
 | |
| Multiple syntactical elements can follow each other simply separated by
 | |
| whitespace, i.e. there is no terminating ``;`` or newline required.
 | |
| 
 | |
| Literals
 | |
| --------
 | |
| 
 | |
| You can use integer constants in decimal or hexadecimal notation.
 | |
| When compiling for the EVM, this will be translated into an
 | |
| appropriate ``PUSHi`` instruction. In the following example,
 | |
| ``3`` and ``2`` are added resulting in 5 and then the
 | |
| bitwise ``and`` with the string "abc" is computed.
 | |
| The final value is assigned to a local variable called ``x``.
 | |
| Strings are stored left-aligned and cannot be longer than 32 bytes.
 | |
| 
 | |
| .. code-block:: yul
 | |
| 
 | |
|     let x := and("abc", add(3, 2))
 | |
| 
 | |
| Unless it is the default type, the type of a literal
 | |
| has to be specified after a colon:
 | |
| 
 | |
| .. code-block:: yul
 | |
| 
 | |
|     let x := and("abc":uint32, add(3:uint256, 2:uint256))
 | |
| 
 | |
| 
 | |
| Function Calls
 | |
| --------------
 | |
| 
 | |
| Both built-in and user-defined functions (see below) can be called
 | |
| in the same way as shown in the previous example.
 | |
| If the function returns a single value, it can be directly used
 | |
| inside an expression again. If it returns multiple values,
 | |
| they have to be assigned to local variables.
 | |
| 
 | |
| .. code-block:: yul
 | |
| 
 | |
|     mstore(0x80, add(mload(0x80), 3))
 | |
|     // Here, the user-defined function `f` returns
 | |
|     // two values. The definition of the function
 | |
|     // is missing from the example.
 | |
|     let x, y := f(1, mload(0))
 | |
| 
 | |
| For built-in functions of the EVM, functional expressions
 | |
| can be directly translated to a stream of opcodes:
 | |
| You just read the expression from right to left to obtain the
 | |
| opcodes. In the case of the first line in the example, this
 | |
| is ``PUSH1 3 PUSH1 0x80 MLOAD ADD PUSH1 0x80 MSTORE``.
 | |
| 
 | |
| For calls to user-defined functions, the arguments are also
 | |
| put on the stack from right to left and this is the order
 | |
| in which argument lists are evaluated. The return values,
 | |
| though, are expected on the stack from left to right,
 | |
| i.e. in this example, ``y`` is on top of the stack and ``x``
 | |
| is below it.
 | |
| 
 | |
| Variable Declarations
 | |
| ---------------------
 | |
| 
 | |
| You can use the ``let`` keyword to declare variables.
 | |
| A variable is only visible inside the
 | |
| ``{...}``-block it was defined in. When compiling to the EVM,
 | |
| a new stack slot is created that is reserved
 | |
| for the variable and automatically removed again when the end of the block
 | |
| is reached. You can provide an initial value for the variable.
 | |
| If you do not provide a value, the variable will be initialized to zero.
 | |
| 
 | |
| Since variables are stored on the stack, they do not directly
 | |
| influence memory or storage, but they can be used as pointers
 | |
| to memory or storage locations in the built-in functions
 | |
| ``mstore``, ``mload``, ``sstore`` and ``sload``.
 | |
| Future dialects migh introduce specific types for such pointers.
 | |
| 
 | |
| When a variable is referenced, its current value is copied.
 | |
| For the EVM, this translates to a ``DUP`` instruction.
 | |
| 
 | |
| .. code-block:: yul
 | |
| 
 | |
|     {
 | |
|         let zero := 0
 | |
|         let v := calldataload(zero)
 | |
|         {
 | |
|             let y := add(sload(v), 1)
 | |
|             v := y
 | |
|         } // y is "deallocated" here
 | |
|         sstore(v, zero)
 | |
|     } // v and zero are "deallocated" here
 | |
| 
 | |
| 
 | |
| If the declared variable should have a type different from the default type,
 | |
| you denote that following a colon. You can also declare multiple
 | |
| variables in one statement when you assign from a function call
 | |
| that returns multiple values.
 | |
| 
 | |
| .. code-block:: yul
 | |
| 
 | |
|     {
 | |
|         let zero:uint32 := 0:uint32
 | |
|         let v:uint256, t:uint32 := f()
 | |
|         let x, y := g()
 | |
|     }
 | |
| 
 | |
| Depending on the optimiser settings, the compiler can free the stack slots
 | |
| already after the variable has been used for
 | |
| the last time, even though it is still in scope.
 | |
| 
 | |
| 
 | |
| Assignments
 | |
| -----------
 | |
| 
 | |
| Variables can be assigned to after their definition using the
 | |
| ``:=`` operator. It is possible to assign multiple
 | |
| variables at the same time. For this, the number and types of the
 | |
| values have to match.
 | |
| If you want to assign the values returned from a function that has
 | |
| multiple return parameters, you have to provide multiple variables.
 | |
| 
 | |
| .. code-block:: yul
 | |
| 
 | |
|     let v := 0
 | |
|     // re-assign v
 | |
|     v := 2
 | |
|     let t := add(v, 2)
 | |
|     function f() -> a, b { }
 | |
|     // assign multiple values
 | |
|     v, t := f()
 | |
| 
 | |
| 
 | |
| If
 | |
| --
 | |
| 
 | |
| The if statement can be used for conditionally executing code.
 | |
| No "else" block can be defined. Consider using "switch" instead (see below) if
 | |
| you need multiple alternatives.
 | |
| 
 | |
| .. code-block:: yul
 | |
| 
 | |
|     if eq(value, 0) { revert(0, 0) }
 | |
| 
 | |
| The curly braces for the body are required.
 | |
| 
 | |
| Switch
 | |
| ------
 | |
| 
 | |
| You can use a switch statement as an extended version of the if statement.
 | |
| It takes the value of an expression and compares it to several literal constants.
 | |
| The branch corresponding to the matching constant is taken.
 | |
| Contrary to other programming languages, for safety reasons, control flow does
 | |
| not continue from one case to the next. There can be a fallback or default
 | |
| case called ``default`` which is taken if none of the literal constants matches.
 | |
| 
 | |
| .. code-block:: yul
 | |
| 
 | |
|     {
 | |
|         let x := 0
 | |
|         switch calldataload(4)
 | |
|         case 0 {
 | |
|             x := calldataload(0x24)
 | |
|         }
 | |
|         default {
 | |
|             x := calldataload(0x44)
 | |
|         }
 | |
|         sstore(0, div(x, 2))
 | |
|     }
 | |
| 
 | |
| The list of cases is not enclosed by curly braces, but the body of a
 | |
| case does require them.
 | |
| 
 | |
| Loops
 | |
| -----
 | |
| 
 | |
| Yul supports for-loops which consist of
 | |
| a header containing an initializing part, a condition, a post-iteration
 | |
| part and a body. The condition has to be an expression, while
 | |
| the other three are blocks. If the initializing part
 | |
| declares any variables at the top level, the scope of these variables extends to all other
 | |
| parts of the loop.
 | |
| 
 | |
| The ``break`` and ``continue`` statements can be used in the body to exit the loop
 | |
| or skip to the post-part, respectively.
 | |
| 
 | |
| The following example computes the sum of an area in memory.
 | |
| 
 | |
| .. code-block:: yul
 | |
| 
 | |
|     {
 | |
|         let x := 0
 | |
|         for { let i := 0 } lt(i, 0x100) { i := add(i, 0x20) } {
 | |
|             x := add(x, mload(i))
 | |
|         }
 | |
|     }
 | |
| 
 | |
| For loops can also be used as a replacement for while loops:
 | |
| Simply leave the initialization and post-iteration parts empty.
 | |
| 
 | |
| .. code-block:: yul
 | |
| 
 | |
|     {
 | |
|         let x := 0
 | |
|         let i := 0
 | |
|         for { } lt(i, 0x100) { } {     // while(i < 0x100)
 | |
|             x := add(x, mload(i))
 | |
|             i := add(i, 0x20)
 | |
|         }
 | |
|     }
 | |
| 
 | |
| Function Declarations
 | |
| ---------------------
 | |
| 
 | |
| Yul allows the definition of functions. These should not be confused with functions
 | |
| in Solidity since they are never part of an external interface of a contract and
 | |
| are part of a namespace separate from the one for Solidity functions.
 | |
| 
 | |
| For the EVM, Yul functions take their
 | |
| arguments (and a return PC) from the stack and also put the results onto the
 | |
| stack. User-defined functions and built-in functions are called in exactly the same way.
 | |
| 
 | |
| Functions can be defined anywhere and are visible in the block they are
 | |
| declared in. Inside a function, you cannot access local variables
 | |
| defined outside of that function.
 | |
| 
 | |
| Functions declare parameters and return variables, similar to Solidity.
 | |
| To return a value, you assign it to the return variable(s).
 | |
| 
 | |
| If you call a function that returns multiple values, you have to assign
 | |
| them to multiple variables using ``a, b := f(x)`` or ``let a, b := f(x)``.
 | |
| 
 | |
| The ``leave`` statement can be used to exit the current function. It
 | |
| works like the ``return`` statement in other languages just that it does
 | |
| not take a value to return, it just exits the functions and the function
 | |
| will return whatever values are currently assigned to the return variable(s).
 | |
| 
 | |
| Note that the EVM dialect has a built-in function called ``return`` that
 | |
| quits the full execution context (internal message call) and not just
 | |
| the current yul function.
 | |
| 
 | |
| The following example implements the power function by square-and-multiply.
 | |
| 
 | |
| .. code-block:: yul
 | |
| 
 | |
|     {
 | |
|         function power(base, exponent) -> result {
 | |
|             switch exponent
 | |
|             case 0 { result := 1 }
 | |
|             case 1 { result := base }
 | |
|             default {
 | |
|                 result := power(mul(base, base), div(exponent, 2))
 | |
|                 switch mod(exponent, 2)
 | |
|                     case 1 { result := mul(base, result) }
 | |
|             }
 | |
|         }
 | |
|     }
 | |
| 
 | |
| Specification of Yul
 | |
| ====================
 | |
| 
 | |
| This chapter describes Yul code formally. Yul code is usually placed inside Yul objects,
 | |
| which are explained in their own chapter.
 | |
| 
 | |
| .. code-block:: none
 | |
| 
 | |
|     Block = '{' Statement* '}'
 | |
|     Statement =
 | |
|         Block |
 | |
|         FunctionDefinition |
 | |
|         VariableDeclaration |
 | |
|         Assignment |
 | |
|         If |
 | |
|         Expression |
 | |
|         Switch |
 | |
|         ForLoop |
 | |
|         BreakContinue |
 | |
|         Leave
 | |
|     FunctionDefinition =
 | |
|         'function' Identifier '(' TypedIdentifierList? ')'
 | |
|         ( '->' TypedIdentifierList )? Block
 | |
|     VariableDeclaration =
 | |
|         'let' TypedIdentifierList ( ':=' Expression )?
 | |
|     Assignment =
 | |
|         IdentifierList ':=' Expression
 | |
|     Expression =
 | |
|         FunctionCall | Identifier | Literal
 | |
|     If =
 | |
|         'if' Expression Block
 | |
|     Switch =
 | |
|         'switch' Expression ( Case+ Default? | Default )
 | |
|     Case =
 | |
|         'case' Literal Block
 | |
|     Default =
 | |
|         'default' Block
 | |
|     ForLoop =
 | |
|         'for' Block Expression Block Block
 | |
|     BreakContinue =
 | |
|         'break' | 'continue'
 | |
|     Leave = 'leave'
 | |
|     FunctionCall =
 | |
|         Identifier '(' ( Expression ( ',' Expression )* )? ')'
 | |
|     Identifier = [a-zA-Z_$] [a-zA-Z_$0-9.]*
 | |
|     IdentifierList = Identifier ( ',' Identifier)*
 | |
|     TypeName = Identifier
 | |
|     TypedIdentifierList = Identifier ( ':' TypeName )? ( ',' Identifier ( ':' TypeName )? )*
 | |
|     Literal =
 | |
|         (NumberLiteral | StringLiteral | HexLiteral | TrueLiteral | FalseLiteral) ( ':' TypeName )?
 | |
|     NumberLiteral = HexNumber | DecimalNumber
 | |
|     HexLiteral = 'hex' ('"' ([0-9a-fA-F]{2})* '"' | '\'' ([0-9a-fA-F]{2})* '\'')
 | |
|     StringLiteral = '"' ([^"\r\n\\] | '\\' .)* '"'
 | |
|     TrueLiteral = 'true'
 | |
|     FalseLiteral = 'false'
 | |
|     HexNumber = '0x' [0-9a-fA-F]+
 | |
|     DecimalNumber = [0-9]+
 | |
| 
 | |
| 
 | |
| Restrictions on the Grammar
 | |
| ---------------------------
 | |
| 
 | |
| Apart from those directly imposed by the grammar, the following
 | |
| restrictions apply:
 | |
| 
 | |
| Switches must have at least one case (including the default case).
 | |
| All case values need to have the same type and distinct values.
 | |
| If all possible values of the expression type are covered, a default case is
 | |
| not allowed (i.e. a switch with a ``bool`` expression that has both a
 | |
| true and a false case do not allow a default case).
 | |
| 
 | |
| Every expression evaluates to zero or more values. Identifiers and Literals
 | |
| evaluate to exactly
 | |
| one value and function calls evaluate to a number of values equal to the
 | |
| number of return variables of the function called.
 | |
| 
 | |
| In variable declarations and assignments, the right-hand-side expression
 | |
| (if present) has to evaluate to a number of values equal to the number of
 | |
| variables on the left-hand-side.
 | |
| This is the only situation where an expression evaluating
 | |
| to more than one value is allowed.
 | |
| 
 | |
| Expressions that are also statements (i.e. at the block level) have to
 | |
| evaluate to zero values.
 | |
| 
 | |
| In all other situations, expressions have to evaluate to exactly one value.
 | |
| 
 | |
| The ``continue`` and ``break`` statements can only be used inside loop bodies
 | |
| and have to be in the same function as the loop (or both have to be at the
 | |
| top level). The ``continue`` and ``break`` statements cannot be used
 | |
| in other parts of a loop, not even when it is scoped inside a second loop's body.
 | |
| 
 | |
| The condition part of the for-loop has to evaluate to exactly one value.
 | |
| 
 | |
| The ``leave`` statement can only be used inside a function.
 | |
| 
 | |
| Functions cannot be defined anywhere inside for loop init blocks.
 | |
| 
 | |
| Literals cannot be larger than the their type. The largest type defined is 256-bit wide.
 | |
| 
 | |
| During assignments and function calls, the types of the respective values have to match.
 | |
| There is no implicit type conversion. Type conversion in general can only be achieved
 | |
| if the dialect provides an appropriate built-in function that takes a value of one
 | |
| type and returns a value of a different type.
 | |
| 
 | |
| Scoping Rules
 | |
| -------------
 | |
| 
 | |
| Scopes in Yul are tied to Blocks (exceptions are functions and the for loop
 | |
| as explained below) and all declarations
 | |
| (``FunctionDefinition``, ``VariableDeclaration``)
 | |
| introduce new identifiers into these scopes.
 | |
| 
 | |
| Identifiers are visible in
 | |
| the block they are defined in (including all sub-nodes and sub-blocks).
 | |
| 
 | |
| As an exception, the scope of the "init" part of the or-loop
 | |
| (the first block) extends across all other parts of the for loop.
 | |
| This means that variables declared in the init part (but not inside a
 | |
| block inside the init part) are visible in all other parts of the for-loop.
 | |
| 
 | |
| Identifiers declared in the other parts of the for loop respect the regular
 | |
| syntactical scoping rules.
 | |
| 
 | |
| This means a for-loop of the form ``for { I... } C { P... } { B... }`` is equivalent
 | |
| to ``{ I... for {} C { P... } { B... } }``.
 | |
| 
 | |
| 
 | |
| The parameters and return parameters of functions are visible in the
 | |
| function body and their names have to be distinct.
 | |
| 
 | |
| Variables can only be referenced after their declaration. In particular,
 | |
| variables cannot be referenced in the right hand side of their own variable
 | |
| declaration.
 | |
| Functions can be referenced already before their declaration (if they are visible).
 | |
| 
 | |
| Shadowing is disallowed, i.e. you cannot declare an identifier at a point
 | |
| where another identifier with the same name is also visible, even if it is
 | |
| not accessible.
 | |
| 
 | |
| Inside functions, it is not possible to access a variable that was declared
 | |
| outside of that function.
 | |
| 
 | |
| Formal Specification
 | |
| --------------------
 | |
| 
 | |
| We formally specify Yul by providing an evaluation function E overloaded
 | |
| on the various nodes of the AST. As builtin functions can have side effects,
 | |
| E takes two state objects and the AST node and returns two new
 | |
| state objects and a variable number of other values.
 | |
| The two state objects are the global state object
 | |
| (which in the context of the EVM is the memory, storage and state of the
 | |
| blockchain) and the local state object (the state of local variables, i.e. a
 | |
| segment of the stack in the EVM).
 | |
| 
 | |
| If the AST node is a statement, E returns the two state objects and a "mode",
 | |
| which is used for the ``break``, ``continue`` and ``leave`` statements.
 | |
| If the AST node is an expression, E returns the two state objects and
 | |
| as many values as the expression evaluates to.
 | |
| 
 | |
| 
 | |
| The exact nature of the global state is unspecified for this high level
 | |
| description. The local state ``L`` is a mapping of identifiers ``i`` to values ``v``,
 | |
| denoted as ``L[i] = v``.
 | |
| 
 | |
| For an identifier ``v``, let ``$v`` be the name of the identifier.
 | |
| 
 | |
| We will use a destructuring notation for the AST nodes.
 | |
| 
 | |
| .. code-block:: none
 | |
| 
 | |
|     E(G, L, <{St1, ..., Stn}>: Block) =
 | |
|         let G1, L1, mode = E(G, L, St1, ..., Stn)
 | |
|         let L2 be a restriction of L1 to the identifiers of L
 | |
|         G1, L2, mode
 | |
|     E(G, L, St1, ..., Stn: Statement) =
 | |
|         if n is zero:
 | |
|             G, L, regular
 | |
|         else:
 | |
|             let G1, L1, mode = E(G, L, St1)
 | |
|             if mode is regular then
 | |
|                 E(G1, L1, St2, ..., Stn)
 | |
|             otherwise
 | |
|                 G1, L1, mode
 | |
|     E(G, L, FunctionDefinition) =
 | |
|         G, L, regular
 | |
|     E(G, L, <let var_1, ..., var_n := rhs>: VariableDeclaration) =
 | |
|         E(G, L, <var_1, ..., var_n := rhs>: Assignment)
 | |
|     E(G, L, <let var_1, ..., var_n>: VariableDeclaration) =
 | |
|         let L1 be a copy of L where L1[$var_i] = 0 for i = 1, ..., n
 | |
|         G, L1, regular
 | |
|     E(G, L, <var_1, ..., var_n := rhs>: Assignment) =
 | |
|         let G1, L1, v1, ..., vn = E(G, L, rhs)
 | |
|         let L2 be a copy of L1 where L2[$var_i] = vi for i = 1, ..., n
 | |
|         G, L2, regular
 | |
|     E(G, L, <for { i1, ..., in } condition post body>: ForLoop) =
 | |
|         if n >= 1:
 | |
|             let G1, L, mode = E(G, L, i1, ..., in)
 | |
|             // mode has to be regular or leave due to the syntactic restrictions
 | |
|             if mode is leave then
 | |
|                 G1, L1 restricted to variables of L, leave
 | |
|             otherwise
 | |
|                 let G2, L2, mode = E(G1, L1, for {} condition post body)
 | |
|                 G2, L2 restricted to variables of L, mode
 | |
|         else:
 | |
|             let G1, L1, v = E(G, L, condition)
 | |
|             if v is false:
 | |
|                 G1, L1, regular
 | |
|             else:
 | |
|                 let G2, L2, mode = E(G1, L, body)
 | |
|                 if mode is break:
 | |
|                     G2, L2, regular
 | |
|                 otherwise if mode is leave:
 | |
|                     G2, L2, leave
 | |
|                 else:
 | |
|                     G3, L3, mode = E(G2, L2, post)
 | |
|                     if mode is leave:
 | |
|                         G2, L3, leave
 | |
|                     otherwise
 | |
|                         E(G3, L3, for {} condition post body)
 | |
|     E(G, L, break: BreakContinue) =
 | |
|         G, L, break
 | |
|     E(G, L, continue: BreakContinue) =
 | |
|         G, L, continue
 | |
|     E(G, L, leave: Leave) =
 | |
|         G, L, leave
 | |
|     E(G, L, <if condition body>: If) =
 | |
|         let G0, L0, v = E(G, L, condition)
 | |
|         if v is true:
 | |
|             E(G0, L0, body)
 | |
|         else:
 | |
|             G0, L0, regular
 | |
|     E(G, L, <switch condition case l1:t1 st1 ... case ln:tn stn>: Switch) =
 | |
|         E(G, L, switch condition case l1:t1 st1 ... case ln:tn stn default {})
 | |
|     E(G, L, <switch condition case l1:t1 st1 ... case ln:tn stn default st'>: Switch) =
 | |
|         let G0, L0, v = E(G, L, condition)
 | |
|         // i = 1 .. n
 | |
|         // Evaluate literals, context doesn't matter
 | |
|         let _, _, v1 = E(G0, L0, l1)
 | |
|         ...
 | |
|         let _, _, vn = E(G0, L0, ln)
 | |
|         if there exists smallest i such that vi = v:
 | |
|             E(G0, L0, sti)
 | |
|         else:
 | |
|             E(G0, L0, st')
 | |
| 
 | |
|     E(G, L, <name>: Identifier) =
 | |
|         G, L, L[$name]
 | |
|     E(G, L, <fname(arg1, ..., argn)>: FunctionCall) =
 | |
|         G1, L1, vn = E(G, L, argn)
 | |
|         ...
 | |
|         G(n-1), L(n-1), v2 = E(G(n-2), L(n-2), arg2)
 | |
|         Gn, Ln, v1 = E(G(n-1), L(n-1), arg1)
 | |
|         Let <function fname (param1, ..., paramn) -> ret1, ..., retm block>
 | |
|         be the function of name $fname visible at the point of the call.
 | |
|         Let L' be a new local state such that
 | |
|         L'[$parami] = vi and L'[$reti] = 0 for all i.
 | |
|         Let G'', L'', mode = E(Gn, L', block)
 | |
|         G'', Ln, L''[$ret1], ..., L''[$retm]
 | |
|     E(G, L, l: HexLiteral) = G, L, hexString(l),
 | |
|         where hexString decodes l from hex and left-aligns it into 32 bytes
 | |
|     E(G, L, l: StringLiteral) = G, L, utf8EncodeLeftAligned(l),
 | |
|         where utf8EncodeLeftAligned performs a utf8 encoding of l
 | |
|         and aligns it left into 32 bytes
 | |
|     E(G, L, n: HexNumber) = G, L, hex(n)
 | |
|         where hex is the hexadecimal decoding function
 | |
|     E(G, L, n: DecimalNumber) = G, L, dec(n),
 | |
|         where dec is the decimal decoding function
 | |
| 
 | |
| .. _opcodes:
 | |
| 
 | |
| EVM Dialect
 | |
| -----------
 | |
| 
 | |
| The default dialect of Yul currently is the EVM dialect for the currently selected version of the EVM.
 | |
| with a version of the EVM. The only type available in this dialect
 | |
| is ``u256``, the 256-bit native type of the Ethereum Virtual Machine.
 | |
| Since it is the default type of this dialect, it can be omitted.
 | |
| 
 | |
| The following table lists all builtin functions
 | |
| (depending on the EVM version) and provides a short description of the
 | |
| semantics of the function / opcode.
 | |
| This document does not want to be a full description of the Ethereum virtual machine.
 | |
| Please refer to a different document if you are interested in the precise semantics.
 | |
| 
 | |
| Opcodes marked with ``-`` do not return a result and all others return exactly one value.
 | |
| Opcodes marked with ``F``, ``H``, ``B``, ``C`` or ``I`` are present since Frontier, Homestead,
 | |
| Byzantium, Constantinople or Istanbul, respectively.
 | |
| 
 | |
| In the following, ``mem[a...b)`` signifies the bytes of memory starting at position ``a`` up to
 | |
| but not including position ``b`` and ``storage[p]`` signifies the storage contents at slot ``p``.
 | |
| 
 | |
| Since Yul manages local variables and control-flow,
 | |
| opcodes that interfere with these features are not available. This includes
 | |
| the ``dup`` and ``swap`` instructions as well as ``jump`` instructions, labels and the ``push`` instructions.
 | |
| 
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | Instruction             |     |   | Explanation                                                     |
 | |
| +=========================+=====+===+=================================================================+
 | |
| | stop()                  + `-` | F | stop execution, identical to return(0, 0)                       |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | add(x, y)               |     | F | x + y                                                           |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | sub(x, y)               |     | F | x - y                                                           |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | mul(x, y)               |     | F | x * y                                                           |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | div(x, y)               |     | F | x / y or 0 if y == 0                                            |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | sdiv(x, y)              |     | F | x / y, for signed numbers in two's complement, 0 if y == 0      |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | mod(x, y)               |     | F | x % y, 0 if y == 0                                              |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | smod(x, y)              |     | F | x % y, for signed numbers in two's complement, 0 if y == 0      |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | exp(x, y)               |     | F | x to the power of y                                             |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | not(x)                  |     | F | bitwise "not" of x (every bit of x is negated)                  |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | lt(x, y)                |     | F | 1 if x < y, 0 otherwise                                         |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | gt(x, y)                |     | F | 1 if x > y, 0 otherwise                                         |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | slt(x, y)               |     | F | 1 if x < y, 0 otherwise, for signed numbers in two's complement |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | sgt(x, y)               |     | F | 1 if x > y, 0 otherwise, for signed numbers in two's complement |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | eq(x, y)                |     | F | 1 if x == y, 0 otherwise                                        |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | iszero(x)               |     | F | 1 if x == 0, 0 otherwise                                        |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | and(x, y)               |     | F | bitwise "and" of x and y                                        |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | or(x, y)                |     | F | bitwise "or" of x and y                                         |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | xor(x, y)               |     | F | bitwise "xor" of x and y                                        |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | byte(n, x)              |     | F | nth byte of x, where the most significant byte is the 0th byte  |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | shl(x, y)               |     | C | logical shift left y by x bits                                  |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | shr(x, y)               |     | C | logical shift right y by x bits                                 |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | sar(x, y)               |     | C | signed arithmetic shift right y by x bits                       |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | addmod(x, y, m)         |     | F | (x + y) % m with arbitrary precision arithmetic, 0 if m == 0    |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | mulmod(x, y, m)         |     | F | (x * y) % m with arbitrary precision arithmetic, 0 if m == 0    |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | signextend(i, x)        |     | F | sign extend from (i*8+7)th bit counting from least significant  |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | keccak256(p, n)         |     | F | keccak(mem[p...(p+n)))                                          |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | pc()                    |     | F | current position in code                                        |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | pop(x)                  | `-` | F | discard value x                                                 |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | mload(p)                |     | F | mem[p...(p+32))                                                 |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | mstore(p, v)            | `-` | F | mem[p...(p+32)) := v                                            |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | mstore8(p, v)           | `-` | F | mem[p] := v & 0xff (only modifies a single byte)                |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | sload(p)                |     | F | storage[p]                                                      |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | sstore(p, v)            | `-` | F | storage[p] := v                                                 |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | msize()                 |     | F | size of memory, i.e. largest accessed memory index              |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | gas()                   |     | F | gas still available to execution                                |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | address()               |     | F | address of the current contract / execution context             |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | balance(a)              |     | F | wei balance at address a                                        |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | selfbalance()           |     | I | equivalent to balance(address()), but cheaper                   |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | caller()                |     | F | call sender (excluding ``delegatecall``)                        |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | callvalue()             |     | F | wei sent together with the current call                         |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | calldataload(p)         |     | F | call data starting from position p (32 bytes)                   |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | calldatasize()          |     | F | size of call data in bytes                                      |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | calldatacopy(t, f, s)   | `-` | F | copy s bytes from calldata at position f to mem at position t   |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | codesize()              |     | F | size of the code of the current contract / execution context    |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | codecopy(t, f, s)       | `-` | F | copy s bytes from code at position f to mem at position t       |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | extcodesize(a)          |     | F | size of the code at address a                                   |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | extcodecopy(a, t, f, s) | `-` | F | like codecopy(t, f, s) but take code at address a               |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | returndatasize()        |     | B | size of the last returndata                                     |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | returndatacopy(t, f, s) | `-` | B | copy s bytes from returndata at position f to mem at position t |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | extcodehash(a)          |     | C | code hash of address a                                          |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | create(v, p, n)         |     | F | create new contract with code mem[p...(p+n)) and send v wei     |
 | |
| |                         |     |   | and return the new address                                      |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | create2(v, p, n, s)     |     | C | create new contract with code mem[p...(p+n)) at address         |
 | |
| |                         |     |   | keccak256(0xff . this . s . keccak256(mem[p...(p+n)))           |
 | |
| |                         |     |   | and send v wei and return the new address, where ``0xff`` is a  |
 | |
| |                         |     |   | 1 byte value, ``this`` is the current contract's address        |
 | |
| |                         |     |   | as a 20 byte value and ``s`` is a big-endian 256-bit value      |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | call(g, a, v, in,       |     | F | call contract at address a with input mem[in...(in+insize))     |
 | |
| | insize, out, outsize)   |     |   | providing g gas and v wei and output area                       |
 | |
| |                         |     |   | mem[out...(out+outsize)) returning 0 on error (eg. out of gas)  |
 | |
| |                         |     |   | and 1 on success                                                |
 | |
| |                         |     |   | :ref:`See more <yul-call-return-area>`                          |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | callcode(g, a, v, in,   |     | F | identical to ``call`` but only use the code from a and stay     |
 | |
| | insize, out, outsize)   |     |   | in the context of the current contract otherwise                |
 | |
| |                         |     |   | :ref:`See more <yul-call-return-area>`                          |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | delegatecall(g, a, in,  |     | H | identical to ``callcode`` but also keep ``caller``              |
 | |
| | insize, out, outsize)   |     |   | and ``callvalue``                                               |
 | |
| |                         |     |   | :ref:`See more <yul-call-return-area>`                          |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | staticcall(g, a, in,    |     | B | identical to ``call(g, a, 0, in, insize, out, outsize)`` but do |
 | |
| | insize, out, outsize)   |     |   | not allow state modifications                                   |
 | |
| |                         |     |   | :ref:`See more <yul-call-return-area>`                          |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | return(p, s)            | `-` | F | end execution, return data mem[p...(p+s))                       |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | revert(p, s)            | `-` | B | end execution, revert state changes, return data mem[p...(p+s)) |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | selfdestruct(a)         | `-` | F | end execution, destroy current contract and send funds to a     |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | invalid()               | `-` | F | end execution with invalid instruction                          |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | log0(p, s)              | `-` | F | log without topics and data mem[p...(p+s))                      |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | log1(p, s, t1)          | `-` | F | log with topic t1 and data mem[p...(p+s))                       |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | log2(p, s, t1, t2)      | `-` | F | log with topics t1, t2 and data mem[p...(p+s))                  |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | log3(p, s, t1, t2, t3)  | `-` | F | log with topics t1, t2, t3 and data mem[p...(p+s))              |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | log4(p, s, t1, t2, t3,  | `-` | F | log with topics t1, t2, t3, t4 and data mem[p...(p+s))          |
 | |
| | t4)                     |     |   |                                                                 |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | chainid()               |     | I | ID of the executing chain (EIP 1344)                            |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | origin()                |     | F | transaction sender                                              |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | gasprice()              |     | F | gas price of the transaction                                    |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | blockhash(b)            |     | F | hash of block nr b - only for last 256 blocks excluding current |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | coinbase()              |     | F | current mining beneficiary                                      |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | timestamp()             |     | F | timestamp of the current block in seconds since the epoch       |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | number()                |     | F | current block number                                            |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | difficulty()            |     | F | difficulty of the current block                                 |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| | gaslimit()              |     | F | block gas limit of the current block                            |
 | |
| +-------------------------+-----+---+-----------------------------------------------------------------+
 | |
| 
 | |
| .. _yul-call-return-area:
 | |
| 
 | |
| .. note::
 | |
|   The ``call*`` instructions use the ``out`` and ``outsize`` parameters to define an area in memory where
 | |
|   the return data is placed. This area is written to depending on how many bytes the called contract returns.
 | |
|   If it returns more data, only the first ``outsize`` bytes are written. You can access the rest of the data
 | |
|   using the ``returndatacopy`` opcode. If it returns less data, then the remaining bytes are not touched at all.
 | |
|   You need to use the ``returndatasize`` opcode to check which part of this memory area contains the return data.
 | |
|   The remaining bytes will retain their values as of before the call. If the call fails (it returns ``0``),
 | |
|   nothing is written to that area, but you can still retrieve the failure data using ``returndatacopy``.
 | |
| 
 | |
| 
 | |
| In some internal dialects, there are additional functions:
 | |
| 
 | |
| datasize, dataoffset, datacopy
 | |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 | |
| 
 | |
| The functions ``datasize(x)``, ``dataoffset(x)`` and ``datacopy(t, f, l)``,
 | |
| are used to access other parts of a Yul object.
 | |
| 
 | |
| ``datasize`` and ``dataoffset`` can only take string literals (the names of other objects)
 | |
| as arguments and return the size and offset in the data area, respectively.
 | |
| For the EVM, the ``datacopy`` function is equivalent to ``codecopy``.
 | |
| 
 | |
| 
 | |
| setimmutable, loadimmutable
 | |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^
 | |
| 
 | |
| The functions ``setimmutable("name", value)`` and ``loadimmutable("name")`` are
 | |
| used for the immutable mechanism in Solidity and do not nicely map to pur Yul.
 | |
| The function ``setimmutable`` assumes that the runtime code of a contract
 | |
| is currently copied to memory at offsot zero. The call to ``setimmutable("name", value)``
 | |
| will store ``value`` at all points in memory that contain a call to
 | |
| ``loadimmutable("name")``.
 | |
| 
 | |
| 
 | |
| 
 | |
| .. _yul-object:
 | |
| 
 | |
| Specification of Yul Object
 | |
| ===========================
 | |
| 
 | |
| Yul objects are used to group named code and data sections.
 | |
| The functions ``datasize``, ``dataoffset`` and ``datacopy``
 | |
| can be used to access these sections from within code.
 | |
| Hex strings can be used to specify data in hex encoding,
 | |
| regular strings in native encoding. For code,
 | |
| ``datacopy`` will access its assembled binary representation.
 | |
| 
 | |
| .. code-block:: none
 | |
| 
 | |
|     Object = 'object' StringLiteral '{' Code ( Object | Data )* '}'
 | |
|     Code = 'code' Block
 | |
|     Data = 'data' StringLiteral ( HexLiteral | StringLiteral )
 | |
|     HexLiteral = 'hex' ('"' ([0-9a-fA-F]{2})* '"' | '\'' ([0-9a-fA-F]{2})* '\'')
 | |
|     StringLiteral = '"' ([^"\r\n\\] | '\\' .)* '"'
 | |
| 
 | |
| Above, ``Block`` refers to ``Block`` in the Yul code grammar explained in the previous chapter.
 | |
| 
 | |
| An example Yul Object is shown below:
 | |
| 
 | |
| .. code-block:: yul
 | |
| 
 | |
|     // A contract consists of a single object with sub-objects representing
 | |
|     // the code to be deployed or other contracts it can create.
 | |
|     // The single "code" node is the executable code of the object.
 | |
|     // Every (other) named object or data section is serialized and
 | |
|     // made accessible to the special built-in functions datacopy / dataoffset / datasize
 | |
|     // The current object, sub-objects and data items inside the current object
 | |
|     // are in scope.
 | |
|     object "Contract1" {
 | |
|         // This is the constructor code of the contract.
 | |
|         code {
 | |
|             function allocate(size) -> ptr {
 | |
|                 ptr := mload(0x40)
 | |
|                 if iszero(ptr) { ptr := 0x60 }
 | |
|                 mstore(0x40, add(ptr, size))
 | |
|             }
 | |
| 
 | |
|             // first create "Contract2"
 | |
|             let size := datasize("Contract2")
 | |
|             let offset := allocate(size)
 | |
|             // This will turn into codecopy for EVM
 | |
|             datacopy(offset, dataoffset("Contract2"), size)
 | |
|             // constructor parameter is a single number 0x1234
 | |
|             mstore(add(offset, size), 0x1234)
 | |
|             pop(create(offset, add(size, 32), 0))
 | |
| 
 | |
|             // now return the runtime object (the currently
 | |
|             // executing code is the constructor code)
 | |
|             size := datasize("runtime")
 | |
|             offset := allocate(size)
 | |
|             // This will turn into a memory->memory copy for eWASM and
 | |
|             // a codecopy for EVM
 | |
|             datacopy(offset, dataoffset("runtime"), size)
 | |
|             return(offset, size)
 | |
|         }
 | |
| 
 | |
|         data "Table2" hex"4123"
 | |
| 
 | |
|         object "runtime" {
 | |
|             code {
 | |
|                 function allocate(size) -> ptr {
 | |
|                     ptr := mload(0x40)
 | |
|                     if iszero(ptr) { ptr := 0x60 }
 | |
|                     mstore(0x40, add(ptr, size))
 | |
|                 }
 | |
| 
 | |
|                 // runtime code
 | |
| 
 | |
|                 mstore(0, "Hello, World!")
 | |
|                 return(0, 0x20)
 | |
|             }
 | |
|         }
 | |
| 
 | |
|         // Embedded object. Use case is that the outside is a factory contract,
 | |
|         // and Contract2 is the code to be created by the factory
 | |
|         object "Contract2" {
 | |
|             code {
 | |
|                 // code here ...
 | |
|             }
 | |
| 
 | |
|             object "runtime" {
 | |
|                 code {
 | |
|                     // code here ...
 | |
|                 }
 | |
|             }
 | |
| 
 | |
|             data "Table1" hex"4123"
 | |
|         }
 | |
|     }
 | |
| 
 | |
| Yul Optimizer
 | |
| =============
 | |
| 
 | |
| The Yul optimizer operates on Yul code and uses the same language for input, output and
 | |
| intermediate states. This allows for easy debugging and verification of the optimizer.
 | |
| 
 | |
| Please see the
 | |
| `documentation in the source code <https://github.com/ethereum/solidity/blob/develop/libyul/optimiser/README.md>`_
 | |
| for more details about its internals.
 | |
| 
 | |
| If you want to use Solidity in stand-alone Yul mode, you activate the optimizer using ``--optimize``:
 | |
| 
 | |
| .. code-block:: sh
 | |
| 
 | |
|     solc --strict-assembly --optimize
 | |
| 
 | |
| In Solidity mode, the Yul optimizer is activated together with the regular optimizer.
 | |
| 
 | |
| Optimization step sequence
 | |
| --------------------------
 | |
| 
 | |
| By default the Yul optimizer applies its predefined sequence of optimization steps to the generated assembly.
 | |
| You can override this sequence and supply your own using the ``--yul-optimizations`` option:
 | |
| 
 | |
| .. code-block:: sh
 | |
| 
 | |
|     solc --optimize --ir-optimized --yul-optimizations 'dhfoD[xarrscLMcCTU]uljmul'
 | |
| 
 | |
| The order of steps is significant and affects the quality of the output.
 | |
| Moreover, applying a step may uncover new optimization opportunities for others that were already
 | |
| applied so repeating steps is often beneficial.
 | |
| By enclosing part of the sequence in square brackets (``[]``) you tell the optimizer to repeatedly
 | |
| apply that part until it no longer improves the size of the resulting assembly.
 | |
| You can use brackets multiple times in a single sequence but they cannot be nested.
 | |
| 
 | |
| The following optimization steps are available:
 | |
| 
 | |
| ============ ===============================
 | |
| Abbreviation Full name
 | |
| ============ ===============================
 | |
| ``f``        ``BlockFlattener``
 | |
| ``l``        ``CircularReferencesPruner``
 | |
| ``c``        ``CommonSubexpressionEliminator``
 | |
| ``C``        ``ConditionalSimplifier``
 | |
| ``U``        ``ConditionalUnsimplifier``
 | |
| ``n``        ``ControlFlowSimplifier``
 | |
| ``D``        ``DeadCodeEliminator``
 | |
| ``v``        ``EquivalentFunctionCombiner``
 | |
| ``e``        ``ExpressionInliner``
 | |
| ``j``        ``ExpressionJoiner``
 | |
| ``s``        ``ExpressionSimplifier``
 | |
| ``x``        ``ExpressionSplitter``
 | |
| ``I``        ``ForLoopConditionIntoBody``
 | |
| ``O``        ``ForLoopConditionOutOfBody``
 | |
| ``o``        ``ForLoopInitRewriter``
 | |
| ``i``        ``FullInliner``
 | |
| ``g``        ``FunctionGrouper``
 | |
| ``h``        ``FunctionHoister``
 | |
| ``T``        ``LiteralRematerialiser``
 | |
| ``L``        ``LoadResolver``
 | |
| ``M``        ``LoopInvariantCodeMotion``
 | |
| ``r``        ``RedundantAssignEliminator``
 | |
| ``m``        ``Rematerialiser``
 | |
| ``V``        ``SSAReverser``
 | |
| ``a``        ``SSATransform``
 | |
| ``t``        ``StructuralSimplifier``
 | |
| ``u``        ``UnusedPruner``
 | |
| ``d``        ``VarDeclInitializer``
 | |
| ============ ===============================
 | |
| 
 | |
| Some steps depend on properties ensured by ``BlockFlattener``, ``FunctionGrouper``, ``ForLoopInitRewriter``.
 | |
| For this reason the Yul optimizer always applies them before applying any steps supplied by the user.
 | |
| 
 | |
| 
 | |
| .. _erc20yul:
 | |
| 
 | |
| Complete ERC20 Example
 | |
| ======================
 | |
| 
 | |
| .. code-block:: yul
 | |
| 
 | |
|     object "Token" {
 | |
|         code {
 | |
|             // Store the creator in slot zero.
 | |
|             sstore(0, caller())
 | |
| 
 | |
|             // Deploy the contract
 | |
|             datacopy(0, dataoffset("runtime"), datasize("runtime"))
 | |
|             return(0, datasize("runtime"))
 | |
|         }
 | |
|         object "runtime" {
 | |
|             code {
 | |
|                 // Protection against sending Ether
 | |
|                 require(iszero(callvalue()))
 | |
| 
 | |
|                 // Dispatcher
 | |
|                 switch selector()
 | |
|                 case 0x70a08231 /* "balanceOf(address)" */ {
 | |
|                     returnUint(balanceOf(decodeAsAddress(0)))
 | |
|                 }
 | |
|                 case 0x18160ddd /* "totalSupply()" */ {
 | |
|                     returnUint(totalSupply())
 | |
|                 }
 | |
|                 case 0xa9059cbb /* "transfer(address,uint256)" */ {
 | |
|                     transfer(decodeAsAddress(0), decodeAsUint(1))
 | |
|                     returnTrue()
 | |
|                 }
 | |
|                 case 0x23b872dd /* "transferFrom(address,address,uint256)" */ {
 | |
|                     transferFrom(decodeAsAddress(0), decodeAsAddress(1), decodeAsUint(2))
 | |
|                     returnTrue()
 | |
|                 }
 | |
|                 case 0x095ea7b3 /* "approve(address,uint256)" */ {
 | |
|                     approve(decodeAsAddress(0), decodeAsUint(1))
 | |
|                     returnTrue()
 | |
|                 }
 | |
|                 case 0xdd62ed3e /* "allowance(address,address)" */ {
 | |
|                     returnUint(allowance(decodeAsAddress(0), decodeAsAddress(1)))
 | |
|                 }
 | |
|                 case 0x40c10f19 /* "mint(address,uint256)" */ {
 | |
|                     mint(decodeAsAddress(0), decodeAsUint(1))
 | |
|                     returnTrue()
 | |
|                 }
 | |
|                 default {
 | |
|                     revert(0, 0)
 | |
|                 }
 | |
| 
 | |
|                 function mint(account, amount) {
 | |
|                     require(calledByOwner())
 | |
| 
 | |
|                     mintTokens(amount)
 | |
|                     addToBalance(account, amount)
 | |
|                     emitTransfer(0, account, amount)
 | |
|                 }
 | |
|                 function transfer(to, amount) {
 | |
|                     executeTransfer(caller(), to, amount)
 | |
|                 }
 | |
|                 function approve(spender, amount) {
 | |
|                     revertIfZeroAddress(spender)
 | |
|                     setAllowance(caller(), spender, amount)
 | |
|                     emitApproval(caller(), spender, amount)
 | |
|                 }
 | |
|                 function transferFrom(from, to, amount) {
 | |
|                     decreaseAllowanceBy(from, caller(), amount)
 | |
|                     executeTransfer(from, to, amount)
 | |
|                 }
 | |
| 
 | |
|                 function executeTransfer(from, to, amount) {
 | |
|                     revertIfZeroAddress(to)
 | |
|                     deductFromBalance(from, amount)
 | |
|                     addToBalance(to, amount)
 | |
|                     emitTransfer(from, to, amount)
 | |
|                 }
 | |
| 
 | |
| 
 | |
|                 /* ---------- calldata decoding functions ----------- */
 | |
|                 function selector() -> s {
 | |
|                     s := div(calldataload(0), 0x100000000000000000000000000000000000000000000000000000000)
 | |
|                 }
 | |
| 
 | |
|                 function decodeAsAddress(offset) -> v {
 | |
|                     v := decodeAsUint(offset)
 | |
|                     if iszero(iszero(and(v, not(0xffffffffffffffffffffffffffffffffffffffff)))) {
 | |
|                         revert(0, 0)
 | |
|                     }
 | |
|                 }
 | |
|                 function decodeAsUint(offset) -> v {
 | |
|                     let pos := add(4, mul(offset, 0x20))
 | |
|                     if lt(calldatasize(), add(pos, 0x20)) {
 | |
|                         revert(0, 0)
 | |
|                     }
 | |
|                     v := calldataload(pos)
 | |
|                 }
 | |
|                 /* ---------- calldata encoding functions ---------- */
 | |
|                 function returnUint(v) {
 | |
|                     mstore(0, v)
 | |
|                     return(0, 0x20)
 | |
|                 }
 | |
|                 function returnTrue() {
 | |
|                     returnUint(1)
 | |
|                 }
 | |
| 
 | |
|                 /* -------- events ---------- */
 | |
|                 function emitTransfer(from, to, amount) {
 | |
|                     let signatureHash := 0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef
 | |
|                     emitEvent(signatureHash, from, to, amount)
 | |
|                 }
 | |
|                 function emitApproval(from, spender, amount) {
 | |
|                     let signatureHash := 0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925
 | |
|                     emitEvent(signatureHash, from, spender, amount)
 | |
|                 }
 | |
|                 function emitEvent(signatureHash, indexed1, indexed2, nonIndexed) {
 | |
|                     mstore(0, nonIndexed)
 | |
|                     log3(0, 0x20, signatureHash, indexed1, indexed2)
 | |
|                 }
 | |
| 
 | |
|                 /* -------- storage layout ---------- */
 | |
|                 function ownerPos() -> p { p := 0 }
 | |
|                 function totalSupplyPos() -> p { p := 1 }
 | |
|                 function accountToStorageOffset(account) -> offset {
 | |
|                     offset := add(0x1000, account)
 | |
|                 }
 | |
|                 function allowanceStorageOffset(account, spender) -> offset {
 | |
|                     offset := accountToStorageOffset(account)
 | |
|                     mstore(0, offset)
 | |
|                     mstore(0x20, spender)
 | |
|                     offset := keccak256(0, 0x40)
 | |
|                 }
 | |
| 
 | |
|                 /* -------- storage access ---------- */
 | |
|                 function owner() -> o {
 | |
|                     o := sload(ownerPos())
 | |
|                 }
 | |
|                 function totalSupply() -> supply {
 | |
|                     supply := sload(totalSupplyPos())
 | |
|                 }
 | |
|                 function mintTokens(amount) {
 | |
|                     sstore(totalSupplyPos(), safeAdd(totalSupply(), amount))
 | |
|                 }
 | |
|                 function balanceOf(account) -> bal {
 | |
|                     bal := sload(accountToStorageOffset(account))
 | |
|                 }
 | |
|                 function addToBalance(account, amount) {
 | |
|                     let offset := accountToStorageOffset(account)
 | |
|                     sstore(offset, safeAdd(sload(offset), amount))
 | |
|                 }
 | |
|                 function deductFromBalance(account, amount) {
 | |
|                     let offset := accountToStorageOffset(account)
 | |
|                     let bal := sload(offset)
 | |
|                     require(lte(amount, bal))
 | |
|                     sstore(offset, sub(bal, amount))
 | |
|                 }
 | |
|                 function allowance(account, spender) -> amount {
 | |
|                     amount := sload(allowanceStorageOffset(account, spender))
 | |
|                 }
 | |
|                 function setAllowance(account, spender, amount) {
 | |
|                     sstore(allowanceStorageOffset(account, spender), amount)
 | |
|                 }
 | |
|                 function decreaseAllowanceBy(account, spender, amount) {
 | |
|                     let offset := allowanceStorageOffset(account, spender)
 | |
|                     let currentAllowance := sload(offset)
 | |
|                     require(lte(amount, currentAllowance))
 | |
|                     sstore(offset, sub(currentAllowance, amount))
 | |
|                 }
 | |
| 
 | |
|                 /* ---------- utility functions ---------- */
 | |
|                 function lte(a, b) -> r {
 | |
|                     r := iszero(gt(a, b))
 | |
|                 }
 | |
|                 function gte(a, b) -> r {
 | |
|                     r := iszero(lt(a, b))
 | |
|                 }
 | |
|                 function safeAdd(a, b) -> r {
 | |
|                     r := add(a, b)
 | |
|                     if or(lt(r, a), lt(r, b)) { revert(0, 0) }
 | |
|                 }
 | |
|                 function calledByOwner() -> cbo {
 | |
|                     cbo := eq(owner(), caller())
 | |
|                 }
 | |
|                 function revertIfZeroAddress(addr) {
 | |
|                     require(addr)
 | |
|                 }
 | |
|                 function require(condition) {
 | |
|                     if iszero(condition) { revert(0, 0) }
 | |
|                 }
 | |
|             }
 | |
|         }
 | |
|     }
 |