Mirror of https://github.com/ethereum/solidity, synced 2023-10-03 13:03:40 +00:00

Note that the Yul optimiser is still in the research phase. Because of that,
the following description might not fully reflect the current or even
planned state of the optimiser.

## Yul Optimiser

The Yul optimiser consists of several stages and components that all transform
the AST in a semantically equivalent way. The goal is to end up with code that is
either shorter, or at most marginally longer but amenable to further
optimisation steps.

The optimiser currently follows a purely greedy strategy and does not do any
backtracking.

## Disambiguator

The disambiguator takes an AST and returns a fresh copy in which every identifier
has a name that is unique within the input AST. This is a prerequisite for all other
optimiser stages. One of the benefits is that identifier lookup does not need to take
scopes into account, so the result of the analysis phase can largely be ignored.

All subsequent stages preserve the property that all names are unique. This means that
if a new identifier needs to be introduced, a new unique name is generated.

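As an illustration of the renaming described above, consider two sibling blocks that both declare a variable called ``a``. The sketch below assumes the second one is renamed by appending a numeric suffix; the concrete naming scheme (``a_1``) is hypothetical and only serves to show that names no longer depend on scope.

```yul
// Input: the name "a" is declared in two different scopes.
{
    { let a := calldataload(0) mstore(0, a) }
    { let a := calldataload(32) mstore(32, a) }
}
```

```yul
// Possible result: every declared name is unique in the whole AST.
{
    { let a := calldataload(0) mstore(0, a) }
    { let a_1 := calldataload(32) mstore(32, a_1) }
}
```

After this step a name identifies exactly one variable or function, so later stages can keep per-name information in a flat map instead of a scope tree.
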
## Function Hoister

The function hoister moves all function definitions to the end of the topmost block. This is
a semantically equivalent transformation as long as it is performed after the
disambiguation stage. The reason is that moving a definition to a higher-level block cannot
decrease its visibility, and it is impossible to reference variables defined in a different
function. Without disambiguation, a hoisted function could clash with another definition of
the same name.

The benefit of this stage is that function definitions can be looked up more easily.

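A small sketch of the hoisting step, assuming the input has already been disambiguated. The function ``f`` is defined inside a nested block and is moved to the end of the topmost block; the remaining statements keep their relative order.

```yul
// Input: f is defined in an inner block.
{
    let x := 1
    {
        function f(a) -> r { r := add(a, 1) }
        x := f(x)
    }
    mstore(0, x)
}
```

```yul
// Possible result: f now sits at the end of the topmost block.
{
    let x := 1
    {
        x := f(x)
    }
    mstore(0, x)
    function f(a) -> r { r := add(a, 1) }
}
```

The call ``f(x)`` remains valid because a function defined in an outer block is visible in all blocks nested inside it.
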
## Function Grouper

The function grouper has to be applied after the disambiguator and the function hoister.
Its effect is that all topmost elements that are not function definitions are moved
into a single block, which becomes the first statement of the root block.

After this step, a program has the following normal form:

	{ I F... }

where ``I`` is a block that does not contain any function definitions (not even recursively)
and ``F`` is a list of function definitions such that no function contains a function definition.

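To make the normal form concrete, the following sketch starts from a disambiguated and hoisted program and wraps everything that is not a function definition into one leading block. The program itself is made up for illustration.

```yul
// Input: statements and function definitions share the root block.
{
    let x := 1
    x := f(x)
    mstore(0, x)
    function f(a) -> r { r := add(a, 1) }
}
```

```yul
// Possible result in the form { I F... }.
{
    {
        let x := 1
        x := f(x)
        mstore(0, x)
    }
    function f(a) -> r { r := add(a, 1) }
}
```

Here the inner block plays the role of ``I`` and the single definition of ``f`` is the list ``F``.
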
## Functional Inliner

The functional inliner depends on the disambiguator, the function hoister and the function grouper.
It performs function inlining such that the result of the inlining is an expression. This can
only be done if two conditions hold: the body of the function to be inlined has the form
``{ r := E }``, where ``r`` is the single return value of the function and ``E`` is an expression,
and all arguments in the function call are so-called movable expressions. A movable expression is
either a literal, a variable or a function call (or EVM opcode) which does not have side-effects
and also does not depend on any side-effects.

As an example, neither ``mload`` nor ``mstore`` would be allowed: ``mstore`` has a side-effect on
memory, and the result of ``mload`` depends on earlier memory writes.

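The conditions above can be illustrated with a function whose body is a single assignment to its return variable. The sketch assumes the program is already in the grouped normal form; the function ``f`` and its body are invented for this example.

```yul
// Input: f has the form { r := E } and is called with a variable argument.
{
    {
        let a := calldataload(0)
        mstore(0, f(a))
    }
    function f(x) -> r { r := add(mul(x, 2), 1) }
}
```

```yul
// Possible result: the call is replaced by E with x substituted by a.
{
    {
        let a := calldataload(0)
        mstore(0, add(mul(a, 2), 1))
    }
    function f(x) -> r { r := add(mul(x, 2), 1) }
}
```

A call such as ``f(mload(0))`` would not qualify under the rule stated above, because the argument ``mload(0)`` is not movable.
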
## Expression Splitter

The expression splitter turns expressions like ``add(mload(x), mul(mload(y), 0x20))``
into a sequence of declarations of unique variables that are assigned sub-expressions
of that expression so that each function call has only variables or literals
as arguments.

Assuming the expression above is the value of a declaration of ``z``, it would be transformed into

    {
        let _1 := mload(y)
        let _2 := mul(_1, 0x20)
        let _3 := mload(x)
        let z := add(_3, _2)
    }

Note that this transformation does not change the order of opcodes or function calls:
arguments are evaluated from right to left, so ``mload(y)`` still executes before ``mload(x)``.

It is not applied to loop conditions, because the loop control flow does not allow
this "outlining" of the inner expressions in all cases.

The final program should be in a form such that, with the exception of loop conditions,
function calls can only appear on the right-hand side of variable declarations or
assignments, or as expression statements, and all arguments have to be constants or variables.

The benefits of this form are that it is much easier to re-order the sequence of opcodes
and it is also easier to perform function call inlining. The drawback is that
such code is much harder for humans to read.

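The exception for loop conditions can be seen in the following sketch. The loop body is split as described, while the condition keeps its nested call, since it is re-evaluated on every iteration and a declaration placed before the loop would run only once. The variable names ``_1`` and ``_2`` are hypothetical.

```yul
// Input: nested calls in both the loop condition and the loop body.
{
    for { let i := 0 } lt(i, mload(0x40)) { i := add(i, 1) } {
        sstore(i, add(mload(i), 1))
    }
}
```

```yul
// Possible result: only the body is split; the condition is left as is.
{
    for { let i := 0 } lt(i, mload(0x40)) { i := add(i, 1) } {
        let _1 := mload(i)
        let _2 := add(_1, 1)
        sstore(i, _2)
    }
}
```

The post-iteration statement ``i := add(i, 1)`` needs no change, because its call already has only a variable and a literal as arguments.
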
## Expression Joiner

This stage is the inverse of the expression splitter. It turns a sequence of
variable declarations, each of which is referenced exactly once, into a complex expression.
This stage again fully preserves the order of function calls and opcode executions.
It does not make use of any information concerning the commutativity of opcodes;
if moving the value of a variable to its place of use would change the order
of any function call or opcode execution, the transformation is not performed.

Note that the component will not move the value of a variable assignment (as opposed to
a declaration), nor the value of a variable that is referenced more than once.

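A minimal sketch of joining, using a made-up program in which each declared variable is referenced exactly once and moving its value does not reorder any call.

```yul
// Input: a and b are each referenced exactly once.
{
    let a := mload(0)
    let b := add(a, 1)
    sstore(0, b)
}
```

```yul
// Possible result: execution order is still mload, add, sstore.
{
    sstore(0, add(mload(0), 1))
}
```

By contrast, under the rule stated above the sequence ``let a := sload(0)``, ``sstore(0, 1)``, ``mstore(0, a)`` would be left unchanged: moving ``sload(0)`` to the place where ``a`` is used would make it execute after ``sstore(0, 1)``.
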
## Common Subexpression Eliminator

This step replaces a subexpression with a reference to a pre-existing variable
that currently has the same value (only if the value is movable), based
on a syntactic comparison.

This can be used to compute a local value numbering, especially if the
expression splitter was run beforehand.

The expression simplifier will be able to perform better replacements
if the common subexpression eliminator was run right before it.

Prerequisites: Disambiguator

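As a sketch of the replacement, the program below computes the same movable expression twice. The second occurrence is syntactically identical to the value currently held by ``x``, so it can be replaced by a reference to ``x``. The program is invented for illustration.

```yul
// Input: mul(a, 2) appears twice and a is not reassigned in between.
{
    let a := calldataload(0)
    let x := mul(a, 2)
    let y := mul(a, 2)
    mstore(x, y)
}
```

```yul
// Possible result: the second computation becomes a variable reference.
{
    let a := calldataload(0)
    let x := mul(a, 2)
    let y := x
    mstore(x, y)
}
```

If the repeated expression were ``mload(a)`` instead, the replacement would not be made under the rule above, since ``mload`` is not movable.
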
## Full Function Inliner

## Rematerialisation

The rematerialisation stage tries to replace variable references with the expression that
was last assigned to the variable. This is of course only beneficial if this expression
is comparatively cheap to evaluate. Furthermore, it is only semantically equivalent if
the value of the expression did not change between the point of assignment and the
point of use. The main benefit of this stage is that it can save stack slots if it
leads to a variable being eliminated completely (see the Unused Definition Pruner below),
but it can also save a DUP opcode on the EVM if the expression is very cheap.

The algorithm only allows movable expressions (see the Functional Inliner above for a definition).
Expressions that contain other variables are also disallowed if one of those variables
has been assigned to in the meantime. The stage is also not applied to variables whose
assignment and use span across loops and conditionals.

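The following sketch shows the cheapest case, a variable that holds a literal. Which expressions count as "comparatively cheap" is a cost decision that this example does not try to model; the program is made up for illustration.

```yul
// Input: p holds a literal and is referenced twice.
{
    let p := 0x40
    let v := mload(p)
    mstore(p, add(v, 0x20))
}
```

```yul
// Possible result: references to p are replaced by its value.
// References to v stay, because mload(0x40) is not movable.
{
    let p := 0x40
    let v := mload(0x40)
    mstore(0x40, add(v, 0x20))
}
```

After this step ``p`` is no longer referenced, so a later pruning step can remove its declaration. As an example of the restriction on reassigned variables: in the sequence ``let y := add(x, 1)``, ``x := 0``, ``r := mul(y, 2)``, the reference to ``y`` must not be replaced by ``add(x, 1)``, because ``x`` was assigned to in the meantime.
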
## Unused Definition Pruner

If a variable or function is not referenced, it is removed from the code.
If there are two assignments to a variable where the first one is a movable expression
and the variable is not used between the two assignments, the first assignment is removed,
provided that the second assignment is not inside a loop or conditional that does not
also contain the first one.

This step also removes movable expression statements.

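A sketch of the pruning rules, using an invented program that contains an unreferenced variable, an unreferenced function, and an assignment that is overwritten before its value is read.

```yul
// Input: "unused" and "never_called" are not referenced anywhere,
// and the first assignment to r is overwritten without being read.
{
    let a := calldataload(0)
    let unused := add(a, 1)
    mstore(0, f(a))
    function f(x) -> r {
        r := 1
        r := add(x, 2)
    }
    function never_called() -> r_1 { r_1 := 7 }
}
```

```yul
// Possible result after pruning.
{
    let a := calldataload(0)
    mstore(0, f(a))
    function f(x) -> r {
        r := add(x, 2)
    }
}
```

Both removed values (``add(a, 1)`` and the literal ``1``) are movable, so dropping them cannot change observable behaviour.
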
## Function Unifier

## Expression Simplifier

This step can only be applied to the EVM-flavoured dialect of Yul. It applies
simple rules like ``x + 0 == x`` to simplify expressions.

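Written with EVM opcodes, the rule ``x + 0 == x`` applies to calls of ``add`` whose second argument is the literal zero. A minimal sketch on an invented program:

```yul
// Input: adding zero has no effect on the value.
{
    let x := calldataload(0)
    let y := add(x, 0)
    mstore(0, y)
}
```

```yul
// Possible result: add(x, 0) is simplified to x.
{
    let x := calldataload(0)
    let y := x
    mstore(0, y)
}
```
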
## Ineffective Statement Remover

This step removes statements that have no side-effects.

## WebAssembly specific

### Main Function

Changes the topmost block to be a function with a specific name ("main") which has
neither inputs nor outputs.

Depends on the Function Grouper.

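A sketch of this step, starting from a program in the grouped normal form. To stay independent of the builtins and types of the WebAssembly dialect, the example only calls user-defined functions; ``f`` and ``g`` are hypothetical.

```yul
// Input: the leading block holds all non-function statements.
{
    {
        let v := f(1)
        g(v)
    }
    function f(a) -> r { r := a }
    function g(b) { }
}
```

```yul
// Possible result: the leading block becomes the body of main.
{
    function main() {
        let v := f(1)
        g(v)
    }
    function f(a) -> r { r := a }
    function g(b) { }
}
```
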