Merge pull request #11360 from maurelian/patch-2
Some improvements to optimizer documentation

The Optimizer
*************

The Solidity compiler uses two different optimizer modules: the "old" optimizer
that operates at the opcode level and the "new" optimizer that operates on Yul IR code.

The opcode-based optimizer applies a set of `simplification rules <https://github.com/ethereum/solidity/blob/develop/libevmasm/RuleList.h>`_
to opcodes. It also combines equal code sets and removes unused code.

The Yul-based optimizer is much more powerful, because it can work across function
calls. For example, arbitrary jumps are not possible in Yul, so it is
possible to compute the side-effects of each function. Consider two function calls,
where the first does not modify storage and the second does modify storage.
If their arguments and return values do not depend on each other, we can reorder
the function calls. Similarly, if a function is
side-effect free and its result is multiplied by zero, you can remove the function
call completely.
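
As an illustration (a hypothetical sketch, not actual compiler output), consider a side-effect-free Yul function whose result is multiplied by zero:

::

    // f reads only calldata, so it is side-effect free
    function f(a) -> r {
        r := mul(a, calldataload(0))
    }
    let x := mul(f(7), 0)

Because ``f`` has no side-effects and a product with zero is always zero, the optimizer may rewrite the last line to ``let x := 0`` and drop the call to ``f`` entirely.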

Currently, the parameter ``--optimize`` activates the opcode-based optimizer for the
generated bytecode and the Yul optimizer for the Yul code generated internally, for example for ABI coder v2.
One can use ``solc --ir-optimized --optimize`` to produce an
optimized experimental Yul IR for a Solidity source. Similarly, one can use ``solc --strict-assembly --optimize``
for a stand-alone Yul mode.

You can find more details on both optimizer modules and their optimization steps below.

Benefits of Optimizing Solidity Code
====================================

Overall, the optimizer tries to simplify complicated expressions, which reduces both code
size and execution cost, i.e., it can reduce the gas needed for contract deployment as well as for external calls made to the contract.
It also specializes or inlines functions. Function inlining, in particular,
can produce much bigger code, but it is
often done because it creates opportunities for further simplifications.
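
As a sketch (hypothetical code, not actual compiler output), inlining followed by simplification might proceed like this:

::

    function double(a) -> r {
        r := mul(a, 2)
    }
    let x := double(3)

    // after inlining:         let x := mul(3, 2)
    // after constant folding: let x := 6

The inlined call is temporarily larger than the original, but constant folding then shrinks it to a single literal.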

Differences between Optimized and Non-Optimized Code
====================================================

Generally, the most visible difference is that constant expressions are evaluated at compile time.
When it comes to the ASM output, one can also notice a reduction of equivalent or duplicate
code blocks (compare the output of the flags ``--asm`` and ``--asm --optimize``). However,
when it comes to the Yul/intermediate representation, there can be significant
differences, for example, functions may be inlined, combined, or rewritten to eliminate
redundancies, etc. (compare the output between the flags ``--ir`` and
``--optimize --ir-optimized``).

Optimizer Parameter Runs
========================

The number of runs (``--optimize-runs``) specifies roughly how often each opcode of the
deployed code will be executed across the life-time of the contract. This means it is a
trade-off parameter between code size (deploy cost) and code execution cost (cost after deployment).
A "runs" parameter of "1" will produce short but expensive code. In contrast, a larger "runs"
parameter will produce longer but more gas efficient code. The maximum value of the parameter
is ``2**32-1``.

.. note::

Opcode-Based Optimizer Module
=============================

The opcode-based optimizer module operates on assembly code. It splits the
sequence of instructions into basic blocks at ``JUMPs`` and ``JUMPDESTs``.
Inside these blocks, the optimizer analyzes the instructions and records every modification to the stack,
memory, or storage as an expression which consists of an instruction and
a list of arguments which are pointers to other expressions.

Additionally, the opcode-based optimizer
uses a component called "CommonSubexpressionEliminator" that, amongst other
tasks, finds expressions that are always equal (on every input) and combines
them into an expression class. It first tries to find each new
expression in a list of already known expressions. If no such matches are found,
it simplifies the expression according to rules like
``constant + constant = sum_of_constants`` or ``X * 1 = X``. Since this is
a recursive process, we can also apply the latter rule if the second factor
is a more complex expression which we know always evaluates to one.
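
To illustrate this recursive rule application (a hypothetical sketch, not actual compiler output), recall that the EVM's ``exp`` with a zero exponent always yields one:

::

    // exp(y, 0) is a complex expression, but it always evaluates to one,
    // so the rule X * 1 = X applies recursively:
    let z := mul(x, exp(y, 0))
    // simplifies to
    let z := x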

Certain optimizer steps symbolically track the storage and memory locations. For example, this
information is used to compute Keccak-256 hashes that can be evaluated at compile time. Consider
the sequence:

::

    PUSH 32
    PUSH 0
    CALLDATALOAD
    PUSH 100
    DUP2
    MSTORE
    KECCAK256

or the equivalent Yul

::

    let x := calldataload(0)
    mstore(x, 100)
    let value := keccak256(x, 32)

In this case, the optimizer tracks the value at the memory location ``calldataload(0)`` and then
realizes that the Keccak-256 hash can be evaluated at compile time. This only works if there is no
other instruction that modifies memory between the ``mstore`` and the ``keccak256``. So if there is an
instruction that writes to memory (or storage), then we need to erase the knowledge of the current
memory (or storage). There is, however, an exception to this erasing, when we can easily see that
the instruction doesn't write to a certain location.

For example,

::

    let x := calldataload(0)
    mstore(x, 100)
    // Current knowledge memory location x -> 100
    let y := add(x, 32)
    // Does not clear the knowledge that x -> 100, since y does not write to [x, x + 32)
    mstore(y, 200)
    // This Keccak-256 can now be evaluated
    let value := keccak256(x, 32)

Therefore, modifications to storage and memory locations, say of a location ``l``, must erase
knowledge about storage or memory locations which may be equal to ``l``. More specifically, for
storage, the optimizer has to erase all knowledge of symbolic locations that may be equal to ``l``,
and for memory, the optimizer has to erase all knowledge of symbolic locations that may not be at
least 32 bytes away. If ``m`` denotes an arbitrary location, then this decision on erasure is done
by computing the value ``sub(l, m)``. For storage, if this value evaluates to a literal that is
non-zero, then the knowledge about ``m`` will be kept. For memory, if the value evaluates to a
literal that is between ``32`` and ``2**256 - 32``, then the knowledge about ``m`` will be kept. In
all other cases, the knowledge about ``m`` will be erased.
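
A hypothetical sketch of this erasure decision for storage (illustrative code, not actual compiler output):

::

    let x := calldataload(0)
    sstore(x, 1)
    // knowledge: storage location x -> 1
    sstore(add(x, 2), 7)
    // sub(add(x, 2), x) evaluates to the literal 2, which is non-zero,
    // so the knowledge that x -> 1 is kept
    sstore(calldataload(32), 8)
    // sub(calldataload(32), x) does not evaluate to a literal,
    // so all knowledge about x is erased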

After this process, we know which expressions have to be on the stack at
the end, and have a list of modifications to memory and storage. This information
is stored together with the basic blocks and is used to link them. Furthermore,
knowledge about the stack, storage and memory configuration is forwarded to
the next block(s).

If we know the targets of all ``JUMP`` and ``JUMPI`` instructions,
we can build a complete control flow graph of the program. If there is only
one target we do not know (this can happen as in principle, jump targets can
be computed from inputs), we have to erase all knowledge about the input state

These steps are applied to each basic block and the newly generated code
is used as replacement if it is smaller. If a basic block is split at a
``JUMPI`` and during the analysis, the condition evaluates to a constant,
the ``JUMPI`` is replaced based on the value of the constant. Thus code like

::

    uint x = 7;
    data[7] = 9;
    if (data[x] != x + 2) // this condition is never true
        return 2;
    else
        return 1;

simplifies to this:

::