.. Mirror of https://github.com/ethereum/solidity (synced 2023-10-03 13:03:40 +00:00)

.. index:: optimizer, common subexpression elimination, constant propagation

*************
The Optimiser
*************

This section discusses the optimiser that was first added to Solidity,
which operates on opcode streams. For information on the new Yul-based optimiser,
please see the `README on GitHub <https://github.com/ethereum/solidity/blob/develop/libyul/optimiser/README.md>`_.

The Solidity optimiser operates on assembly. It splits the sequence of instructions into basic blocks
at ``JUMP`` and ``JUMPDEST`` instructions. Inside these blocks, the optimiser
analyses the instructions and records every modification to the stack,
memory, or storage as an expression that consists of an instruction and
a list of arguments, which are pointers to other expressions. The optimiser
uses a component called ``CommonSubexpressionEliminator`` that, amongst other
tasks, finds expressions that are always equal (on every input) and combines
them into an expression class. The optimiser first tries to find each new
expression in a list of already known expressions. If this does not work,
it simplifies the expression according to rules like
``constant + constant = sum_of_constants`` or ``X * 1 = X``. Since this is
a recursive process, we can also apply the latter rule if the second factor
is a more complex expression that is known to always evaluate to one.

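The block-splitting step described above can be sketched in a few lines of Python. This is only a toy model, not the compiler's implementation: instructions are represented by their opcode names alone, the function name ``split_basic_blocks`` is hypothetical, and treating ``JUMPI`` as a block terminator (as well as ignoring other terminators such as ``STOP``) is an assumption made for brevity.

```python
# Toy model: each instruction is just its opcode name (operands are omitted).
def split_basic_blocks(instructions):
    """Split a flat opcode list into basic blocks.

    A block ends right after a JUMP or JUMPI, and a JUMPDEST always
    starts a new block (hypothetical helper, for illustration only).
    """
    blocks, current = [], []
    for op in instructions:
        if op == "JUMPDEST" and current:
            blocks.append(current)  # close the block before the jump target
            current = []
        current.append(op)
        if op in ("JUMP", "JUMPI"):
            blocks.append(current)  # a jump is the last instruction of its block
            current = []
    if current:
        blocks.append(current)
    return blocks


code = ["PUSH", "PUSH", "ADD", "PUSH", "JUMPI",
        "PUSH", "SSTORE", "JUMPDEST", "STOP"]
blocks = split_basic_blocks(code)
```

For the sample input, the sketch yields three blocks: one ending in ``JUMPI``, the fall-through code, and one starting at ``JUMPDEST``.
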
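To make the idea of expression classes more concrete, here is a minimal sketch of the lookup-then-simplify procedure. It assumes a toy representation that is not taken from the compiler: an ``int`` is a constant, a ``str`` is an unknown input, and a tuple ``(op, left, right)`` is an operation. The names ``simplify`` and ``ExpressionClasses`` are hypothetical, and only the two rules quoted above are implemented.

```python
WORD = 2**256  # EVM arithmetic wraps around at 256 bits


def simplify(expr):
    """Recursively apply two toy rules: constant folding and X * 1 = X."""
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    left, right = simplify(left), simplify(right)  # simplify arguments first
    if isinstance(left, int) and isinstance(right, int):
        if op == "ADD":
            return (left + right) % WORD
        if op == "MUL":
            return (left * right) % WORD
    if op == "MUL" and right == 1:
        return left
    return (op, left, right)


class ExpressionClasses:
    """Assigns the same class id to expressions found to be equal."""

    def __init__(self):
        self.class_ids = {}  # expression -> class id
        self.next_id = 0

    def class_of(self, expr):
        if expr in self.class_ids:  # first: is the expression already known?
            return self.class_ids[expr]
        simplified = simplify(expr)  # otherwise: simplify and look up again
        if simplified not in self.class_ids:
            self.class_ids[simplified] = self.next_id
            self.next_id += 1
        self.class_ids[expr] = self.class_ids[simplified]
        return self.class_ids[expr]


classes = ExpressionClasses()
plain_x = classes.class_of("x")
# The second factor is not literally 1, but it always evaluates to 1.
scaled_x = classes.class_of(("MUL", "x", ("ADD", 0, 1)))
other = classes.class_of(("ADD", "x", "y"))
```

Because simplification recurses into the arguments first, ``x * (0 + 1)`` ends up in the same class as ``x``, while ``x + y`` gets a class of its own.
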
Modifications to storage and memory locations have to erase knowledge about
storage and memory locations which are not known to be different. If we first
write to location ``x`` and then to location ``y`` and both are input variables, the
second write could overwrite the first, so we do not know what is stored at ``x`` after
we wrote to ``y``. If, on the other hand, simplification of the expression ``x - y``
yields a non-zero constant, the two locations are known to differ and we can keep
our knowledge about what is stored at ``x``.

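The rule about erasing knowledge on writes can be illustrated with a small sketch. It assumes a simplified model in which a location is a pair ``(base, offset)``, that is, an unknown input plus a constant offset, so that the difference of two locations is a constant exactly when their bases match. The names ``known_to_differ`` and ``StorageKnowledge`` are hypothetical and wrap-around of offsets is ignored.

```python
def known_to_differ(loc_a, loc_b):
    """True if loc_a - loc_b simplifies to a non-zero constant.

    In this toy model the symbolic parts only cancel when both
    locations share the same base.
    """
    base_a, offset_a = loc_a
    base_b, offset_b = loc_b
    return base_a == base_b and offset_a != offset_b


class StorageKnowledge:
    def __init__(self):
        self.content = {}  # location -> value known to be stored there

    def store(self, location, value):
        # Forget every entry that the new write might have overwritten.
        self.content = {
            loc: val for loc, val in self.content.items()
            if known_to_differ(loc, location)
        }
        self.content[location] = value

    def load(self, location):
        return self.content.get(location)  # None means "unknown"


# Writing to x and then to y: y might alias x, so x becomes unknown.
aliasing = StorageKnowledge()
aliasing.store(("x", 0), 1)
aliasing.store(("y", 0), 2)

# Writing to x and then to x + 32: the difference is the constant 32.
disjoint = StorageKnowledge()
disjoint.store(("x", 0), 1)
disjoint.store(("x", 32), 2)
```

In the first case ``aliasing.load(("x", 0))`` returns ``None``, while in the second case the value written to ``x`` is still known.
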
After this process, we know which expressions have to be on the stack at
the end, and we have a list of modifications to memory and storage. This information
is stored together with the basic blocks and is used to link them. Furthermore,
knowledge about the stack, storage and memory configuration is forwarded to
the next block(s). If we know the targets of all ``JUMP`` and ``JUMPI`` instructions,
we can build a complete control flow graph of the program. If there is even a
single target we do not know (this can happen because, in principle, jump targets can
be computed from inputs), we have to erase all knowledge about the input state
of a block, as it can be the target of the unknown ``JUMP``. If the optimiser
finds a ``JUMPI`` whose condition evaluates to a constant, it transforms it
to an unconditional jump.

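The two control-flow rules from the preceding paragraph can be sketched as follows. The data layout (a dictionary of blocks with a ``jumpdest`` flag and a list of ``targets``, where ``None`` stands for a target computed from inputs) and both function names are assumptions made for this illustration. Removing a ``JUMPI`` whose condition is the constant zero is one plausible reading of "replaced depending on the value of the constant", not something the text spells out.

```python
def fold_jumpi(condition, target):
    """Rewrite a JUMPI given what is known about its condition.

    `condition` is an int if it evaluates to a constant, or None if it
    depends on inputs.
    """
    if condition is None:
        return [("JUMPI", target)]  # nothing known: keep the conditional jump
    if condition != 0:
        return [("JUMP", target)]   # always taken: unconditional jump
    return []                        # never taken: execution falls through


def blocks_with_unknown_input(blocks):
    """Names of blocks whose input state knowledge has to be erased."""
    any_unknown = any(
        target is None
        for block in blocks.values()
        for target in block["targets"]
    )
    if not any_unknown:
        return set()  # complete control flow graph: nothing to erase
    # An unknown jump could land on any block that begins with a JUMPDEST.
    return {name for name, block in blocks.items() if block["jumpdest"]}


program = {
    "entry": {"jumpdest": False, "targets": [None]},  # computed target
    "loop": {"jumpdest": True, "targets": ["exit"]},
    "exit": {"jumpdest": True, "targets": []},
}
erased = blocks_with_unknown_input(program)
```
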
As the last step, the code in each block is re-generated. The optimiser creates
a dependency graph from the expressions on the stack at the end of the block,
and it drops every operation that is not part of this graph. It generates code
that applies the modifications to memory and storage in the order they were
made in the original code (dropping modifications which were found not to be
needed). Finally, it generates all values that are required to be on the
stack in the correct place.

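The pruning part of code re-generation amounts to a reachability search in the dependency graph. The sketch below assumes a hypothetical representation in which each operation is a triple ``(name, opcode, argument_names)``; it deliberately ignores memory and storage modifications, which the process described above keeps in their original order.

```python
def regenerate(operations, final_stack):
    """Keep only the operations that the final stack depends on.

    operations:  list of (name, opcode, argument_names), in original order.
    final_stack: names of the values required on the stack at block end.
    """
    arguments = {name: args for name, _, args in operations}
    needed, todo = set(), list(final_stack)
    while todo:
        name = todo.pop()
        if name in needed or name not in arguments:
            continue  # already visited, or a value supplied by the block input
        needed.add(name)
        todo.extend(arguments[name])  # follow the dependency edges
    return [op for op in operations if op[0] in needed]


block = [
    ("a", "PUSH 1", []),
    ("b", "PUSH 2", []),
    ("c", "ADD", ["a", "b"]),
    ("d", "MUL", ["a", "a"]),  # never used by the final stack
]
kept = regenerate(block, ["c"])
```

Here the ``MUL`` is dropped because nothing that has to remain on the stack depends on it.
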
These steps are applied to each basic block, and the newly generated code
is used as a replacement if it is smaller. If a basic block is split at a
``JUMPI`` and, during the analysis, the condition evaluates to a constant,
the ``JUMPI`` is replaced depending on the value of the constant. Thus code like

::

    uint x = 7;
    data[7] = 9;
    if (data[x] != x + 2)
      return 2;
    else
      return 1;

still simplifies to code that you can compile, even though the instructions contained
a jump at the beginning of the process:

::

    data[7] = 9;
    return 1;

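The reasoning behind this particular simplification can be traced with a short Python sketch. It hard-codes the shape of the example and is in no way a general optimiser: the function ``optimise_example`` and its parameters are hypothetical, and the output is a list of source-like strings rather than opcodes.

```python
def optimise_example(x, key, value):
    """Trace the example: `data[key] = value` followed by a branch
    on `data[x] != x + 2`, with x, key and value known constants."""
    known_storage = {key: value}           # knowledge recorded by the write
    output = [f"data[{key}] = {value};"]   # the storage write itself is kept
    loaded = known_storage.get(x)          # is data[x] a known value?
    if loaded is None:
        # Condition not constant: the branch (and its jump) must stay.
        output.append(f"if (data[{x}] != {x + 2}) return 2; else return 1;")
    elif loaded != x + 2:
        output.append("return 2;")         # condition is always true
    else:
        output.append("return 1;")         # condition is always false
    return output


simplified = optimise_example(x=7, key=7, value=9)
```

With ``x = 7`` the load resolves to the constant ``9`` and ``x + 2`` also folds to ``9``, so the condition is constantly false and only the ``return 1`` branch survives.
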