Merge pull request #11360 from maurelian/patch-2
Some improvements to optimizer documentation
Commit d07c85db67

@@ -6,24 +6,24 @@ The Optimizer
*************

The Solidity compiler uses two different optimizer modules: the "old" optimizer
that operates at the opcode level and the "new" optimizer that operates on Yul IR code.

The opcode-based optimizer applies a set of `simplification rules <https://github.com/ethereum/solidity/blob/develop/libevmasm/RuleList.h>`_
to opcodes. It also combines equal code sets and removes unused code.

The Yul-based optimizer is much more powerful, because it can work across function
calls. For example, arbitrary jumps are not possible in Yul, so it is
possible to compute the side-effects of each function. Consider two function calls,
where the first does not modify storage and the second does modify storage.
If their arguments and return values do not depend on each other, we can reorder
the function calls. Similarly, if a function is
side-effect free and its result is multiplied by zero, you can remove the function
call completely.
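
As a sketch of the second case (``square`` is a hypothetical user-defined
function, not part of the compiler):

::

    // ``square`` only reads its argument and has no side-effects.
    function square(x) -> y {
        y := mul(x, x)
    }

    // ``square`` is side-effect free and its result is multiplied by zero,
    // so the optimizer can replace this whole statement with ``let r := 0``.
    let r := mul(square(calldataload(0)), 0)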

Currently, the parameter ``--optimize`` activates the opcode-based optimizer for the
generated bytecode and the Yul optimizer for the Yul code generated internally, for example for ABI coder v2.
One can use ``solc --ir-optimized --optimize`` to produce an
optimized experimental Yul IR for a Solidity source. Similarly, one can use
``solc --strict-assembly --optimize`` for a stand-alone Yul mode.

You can find more details on both optimizer modules and their optimization steps below.

@@ -32,7 +32,7 @@ Benefits of Optimizing Solidity Code
====================================

Overall, the optimizer tries to simplify complicated expressions, which reduces both code
size and execution cost, i.e., it can reduce the gas needed for contract deployment as well
as for external calls made to the contract. It also specializes or inlines functions.
Function inlining in particular can produce much bigger code, but it is often done
because it creates opportunities for further simplifications.
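
For instance (a minimal sketch; ``addOne`` is a hypothetical function), inlining
a small function can expose constant arguments that are then folded:

::

    function addOne(x) -> y {
        y := add(x, 1)
    }

    // After inlining, this becomes ``let a := add(2, 1)``,
    // which constant folding then reduces to ``let a := 3``.
    let a := addOne(2)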

@@ -41,11 +41,11 @@
Differences between Optimized and Non-Optimized Code
====================================================

Generally, the most visible difference is that constant expressions are evaluated at compile time.
When it comes to the ASM output, one can also notice a reduction of equivalent or duplicate
code blocks (compare the output of the flags ``--asm`` and ``--asm --optimize``). However,
when it comes to the Yul/intermediate representation, there can be significant
differences: for example, functions may be inlined, combined, or rewritten to eliminate
redundancies (compare the output between the flags ``--ir`` and
``--optimize --ir-optimized``).
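
As a minimal illustration at the Yul level (the exact output depends on the
compiler version and settings):

::

    // Before optimization:
    let x := add(mul(2, 3), 1)
    // After optimization, the statement is simply ``let x := 7``.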

@@ -55,7 +55,9 @@ Optimizer Parameter Runs
========================

The number of runs (``--optimize-runs``) specifies roughly how often each opcode of the
deployed code will be executed across the lifetime of the contract. This means it is a
trade-off parameter between code size (deploy cost) and code execution cost (cost after deployment).
A "runs" parameter of "1" will produce short but expensive code. In contrast, a larger "runs"
parameter will produce longer but more gas-efficient code. The maximum value of the parameter
is ``2**32-1``.

.. note::

    A common misconception is that this parameter specifies the number of iterations
    of the optimizer. This is not true: the optimizer will always run as many times as
    it can still improve the code.

@@ -65,31 +67,81 @@
Opcode-Based Optimizer Module
=============================

The opcode-based optimizer module operates on assembly code. It splits the
sequence of instructions into basic blocks at ``JUMPs`` and ``JUMPDESTs``.
Inside these blocks, the optimizer analyzes the instructions and records every modification to the stack,
memory, or storage as an expression which consists of an instruction and
a list of arguments which are pointers to other expressions.

Additionally, the opcode-based optimizer
uses a component called "CommonSubexpressionEliminator" that, amongst other
tasks, finds expressions that are always equal (on every input) and combines
them into an expression class. It first tries to find each new
expression in a list of already known expressions. If no such matches are found,
it simplifies the expression according to rules like
``constant + constant = sum_of_constants`` or ``X * 1 = X``. Since this is
a recursive process, we can also apply the latter rule if the second factor
is a more complex expression which we know always evaluates to one.
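
As a sketch of both mechanisms (the actual rule set lives in
`RuleList.h <https://github.com/ethereum/solidity/blob/develop/libevmasm/RuleList.h>`_;
this is written as Yul for readability, although the module operates on opcodes):

::

    let a := add(calldataload(4), 10)
    // The same expression class as ``a``: the eliminator can reuse the
    // already computed value instead of recomputing it.
    let b := add(calldataload(4), 10)
    // ``eq(b, b)`` always evaluates to one, so ``X * 1 = X`` reduces this
    // to just ``calldataload(36)``.
    let c := mul(calldataload(36), eq(b, b))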

Certain optimizer steps symbolically track the storage and memory locations. For example, this
information is used to compute Keccak-256 hashes that can be evaluated at compile time. Consider
the sequence:

::

    PUSH 32
    PUSH 0
    CALLDATALOAD
    PUSH 100
    DUP2
    MSTORE
    KECCAK256

or the equivalent Yul

::

    let x := calldataload(0)
    mstore(x, 100)
    let value := keccak256(x, 32)

In this case, the optimizer tracks the value at the memory location ``calldataload(0)`` and then
realizes that the Keccak-256 hash can be evaluated at compile time. This only works if there is no
other instruction that modifies memory between the ``mstore`` and the ``keccak256``. So if there is an
instruction that writes to memory (or storage), then we need to erase the knowledge of the current
memory (or storage). There is, however, an exception to this erasing, when we can easily see that
the instruction does not write to a certain location.

For example,

::

    let x := calldataload(0)
    mstore(x, 100)
    // Current knowledge: memory location x -> 100
    let y := add(x, 32)
    // The following write does not clear the knowledge that x -> 100,
    // since [y, y + 32) does not overlap [x, x + 32)
    mstore(y, 200)
    // This Keccak-256 can now be evaluated at compile time
    let value := keccak256(x, 32)

Therefore, modifications to storage and memory locations, of say location ``l``, must erase
knowledge about storage or memory locations which may be equal to ``l``. More specifically, for
storage, the optimizer has to erase all knowledge of symbolic locations that may be equal to ``l``,
and for memory, the optimizer has to erase all knowledge of symbolic locations that may not be at
least 32 bytes away. If ``m`` denotes an arbitrary location, then the decision on erasure is done
by computing the value ``sub(l, m)``. For storage, if this value evaluates to a literal that is
non-zero, then the knowledge about ``m`` will be kept. For memory, if the value evaluates to a
literal that is between ``32`` and ``2**256 - 32``, then the knowledge about ``m`` will be kept. In
all other cases, the knowledge about ``m`` will be erased.
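
A minimal sketch of the storage case:

::

    sstore(0, 100)
    // Current knowledge: storage slot 0 -> 100
    sstore(1, 200)
    // sub(1, 0) evaluates to the non-zero literal 1, so the knowledge
    // that slot 0 holds 100 is kept and this load can be replaced by 100.
    let v := sload(0)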

After this process, we know which expressions have to be on the stack at
the end, and have a list of modifications to memory and storage. This information
is stored together with the basic blocks and is used to link them. Furthermore,
knowledge about the stack, storage and memory configuration is forwarded to
the next block(s).

If we know the targets of all ``JUMP`` and ``JUMPI`` instructions,
we can build a complete control flow graph of the program. If there is only
one target we do not know (this can happen as, in principle, jump targets can
be computed from inputs), we have to erase all knowledge about the input state

@@ -108,19 +160,18 @@ stack in the correct place.

These steps are applied to each basic block and the newly generated code
is used as replacement if it is smaller. If a basic block is split at a
``JUMPI`` and, during the analysis, the condition evaluates to a constant,
the ``JUMPI`` is replaced based on the value of the constant. Thus code like

::

    uint x = 7;
    data[7] = 9;
    if (data[x] != x + 2) // this condition is never true
        return 2;
    else
        return 1;

simplifies to this:

::

    data[7] = 9;
    return 1;