Merge pull request #11360 from maurelian/patch-2

Some improvements to optimizer documentation
Harikrishnan Mulackal 2021-05-19 13:24:05 +02:00 committed by GitHub
commit d07c85db67


@ -6,24 +6,24 @@ The Optimizer
*************
The Solidity compiler uses two different optimizer modules: The "old" optimizer
that operates at the opcode level and the "new" optimizer that operates on Yul IR code.
The opcode-based optimizer applies a set of `simplification rules <https://github.com/ethereum/solidity/blob/develop/libevmasm/RuleList.h>`_
to opcodes. It also combines equal code sets and removes unused code.
The Yul-based optimizer is much more powerful, because it can work across function
calls. For example, arbitrary jumps are not possible in Yul, so it is
possible to compute the side-effects of each function. Consider two function calls,
where the first does not modify storage and the second does modify storage.
If their arguments and return values do not depend on each other, we can reorder
the function calls. Similarly, if a function is
side-effect free and its result is multiplied by zero, you can remove the function
call completely.
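As a hedged illustration (a hypothetical Yul snippet with made-up names, not taken from the documentation), this is the kind of reasoning the Yul optimizer can perform:

::

    // f has no side effects: it only reads its argument.
    function f(a) -> r {
        r := add(a, 1)
    }
    let x := calldataload(0)
    let y := mul(f(x), 0)
    // Since f is side-effect free and anything multiplied by zero is zero,
    // the optimizer may rewrite this to "let y := 0" and drop the call.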
Currently, the parameter ``--optimize`` activates the opcode-based optimizer for the
generated bytecode and the Yul optimizer for the Yul code generated internally, for example for ABI coder v2.
One can use ``solc --ir-optimized --optimize`` to produce an
optimized experimental Yul IR for a Solidity source. Similarly, one can use ``solc --strict-assembly --optimize``
for a stand-alone Yul mode.
You can find more details on both optimizer modules and their optimization steps below.
@ -32,7 +32,7 @@ Benefits of Optimizing Solidity Code
====================================
Overall, the optimizer tries to simplify complicated expressions, which reduces both code
size and execution cost, i.e., it can reduce gas needed for contract deployment as well as for external calls made to the contract.
It also specializes or inlines functions. Function inlining, in particular, can
make the code considerably larger, but it is often done because it creates
opportunities for further simplifications.
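As a quick sketch (hypothetical Yul with made-up names, not actual compiler output), inlining a small function can expose further constant folding:

::

    function double(a) -> r {
        r := mul(a, 2)
    }
    let x := double(3)
    // after inlining:         let x := mul(3, 2)
    // after constant folding: let x := 6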
@ -41,11 +41,11 @@ often done because it results in opportunities for more simplifications.
Differences between Optimized and Non-Optimized Code
====================================================
Generally, the most visible difference is that constant expressions are evaluated at compile time.
When it comes to the ASM output, one can also notice a reduction of equivalent or duplicate
code blocks (compare the output of the flags ``--asm`` and ``--asm --optimize``). However,
when it comes to the Yul/intermediate-representation, there can be significant
differences, for example, functions may be inlined, combined, or rewritten to eliminate
redundancies, etc. (compare the output between the flags ``--ir`` and
``--optimize --ir-optimized``).
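As a minimal sketch (an illustrative Yul fragment, not actual compiler output), compile-time evaluation of a constant expression works like this:

::

    let x := add(mul(2, 3), 1)
    // With the optimizer enabled, the whole expression is evaluated at
    // compile time, so the optimized IR simply contains: let x := 7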
@ -55,7 +55,9 @@ Optimizer Parameter Runs
The number of runs (``--optimize-runs``) specifies roughly how often each opcode of the
deployed code will be executed across the life-time of the contract. This means it is a
trade-off parameter between code size (deploy cost) and code execution cost (cost after deployment).
A "runs" parameter of "1" will produce short but expensive code. In contrast, a larger "runs"
parameter will produce longer but more gas efficient code. The maximum value of the parameter
is ``2**32-1``.
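As a rough, purely illustrative calculation (the gas numbers below are invented, not measurements), the break-even point between a size-optimized and a runtime-optimized build can be estimated like this:

::

    size-optimized:    deploy = 200000 gas, per call = 1000 gas
    runtime-optimized: deploy = 250000 gas, per call =  800 gas

    break-even: (250000 - 200000) / (1000 - 800) = 250 calls

Beyond roughly that many calls, the larger but more gas-efficient code is cheaper overall, which is the trade-off that ``--optimize-runs`` expresses.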
.. note::
@ -65,31 +67,81 @@ A "runs" parameter of "1" will produce short but expensive code. The largest val
Opcode-Based Optimizer Module
=============================
The opcode-based optimizer module operates on assembly code. It splits the
sequence of instructions into basic blocks at ``JUMPs`` and ``JUMPDESTs``.
Inside these blocks, the optimizer analyzes the instructions and records every modification to the stack,
memory, or storage as an expression which consists of an instruction and
a list of arguments which are pointers to other expressions.
Additionally, the opcode-based optimizer
uses a component called "CommonSubexpressionEliminator" that, amongst other
tasks, finds expressions that are always equal (on every input) and combines
them into an expression class. It first tries to find each new
expression in a list of already known expressions. If no such matches are found,
it simplifies the expression according to rules like
``constant + constant = sum_of_constants`` or ``X * 1 = X``. Since this is
a recursive process, we can also apply the latter rule if the second factor
is a more complex expression which we know always evaluates to one.
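A minimal hypothetical Yul sketch of that recursive simplification (illustrative only, not compiler output):

::

    let a := calldataload(0)
    // sub(3, 2) is first folded to the constant 1,
    // and then the rule X * 1 = X applies:
    let b := mul(a, sub(3, 2))
    // which simplifies to: let b := a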
Certain optimizer steps symbolically track the storage and memory locations. For example, this
information is used to compute Keccak-256 hashes that can be evaluated at compile time. Consider
the sequence:
::
PUSH 32
PUSH 0
CALLDATALOAD
PUSH 100
DUP2
MSTORE
KECCAK256
or the equivalent Yul
::
let x := calldataload(0)
mstore(x, 100)
let value := keccak256(x, 32)
In this case, the optimizer tracks the value at a memory location ``calldataload(0)`` and then
realizes that the Keccak-256 hash can be evaluated at compile time. This only works if there is no
other instruction that modifies memory between the ``mstore`` and ``keccak256``. So if there is an
instruction that writes to memory (or storage), then we need to erase the knowledge of the current
memory (or storage). There is, however, an exception to this erasing, when we can easily see that
the instruction doesn't write to a certain location.
For example,
::
let x := calldataload(0)
mstore(x, 100)
// Current knowledge memory location x -> 100
let y := add(x, 32)
// Does not clear the knowledge that x -> 100, since y does not write to [x, x + 32)
mstore(y, 200)
// This Keccak-256 can now be evaluated
let value := keccak256(x, 32)
Therefore, a modification to a storage or memory location, say location ``l``, must erase
knowledge about storage or memory locations which may be equal to ``l``. More specifically, for
storage, the optimizer has to erase all knowledge of symbolic locations that may be equal to ``l``,
and for memory, the optimizer has to erase all knowledge of symbolic locations that may not be at
least 32 bytes away. If ``m`` denotes an arbitrary location, then this decision on erasure is made
by computing the value ``sub(l, m)``. For storage, if this value evaluates to a literal that is
non-zero, then the knowledge about ``m`` will be kept. For memory, if the value evaluates to a
literal that is between ``32`` and ``2**256 - 32``, then the knowledge about ``m`` will be kept. In
all other cases, the knowledge about ``m`` will be erased.
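For example (a hypothetical Yul sketch, not taken from the documentation), applied to storage:

::

    let x := calldataload(0)
    sstore(x, 1)
    // knowledge: storage slot x -> 1
    sstore(add(x, 1), 2)
    // sub(add(x, 1), x) can be simplified to the non-zero literal 1,
    // so the knowledge that slot x holds 1 is kept
    sstore(calldataload(32), 3)
    // sub(calldataload(32), x) is not a literal,
    // so all knowledge about slot x is erased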
After this process, we know which expressions have to be on the stack at
the end, and have a list of modifications to memory and storage. This information
is stored together with the basic blocks and is used to link them. Furthermore,
knowledge about the stack, storage and memory configuration is forwarded to
the next block(s).
If we know the targets of all ``JUMP`` and ``JUMPI`` instructions,
we can build a complete control flow graph of the program. If there is only
one target we do not know (this can happen since, in principle, jump targets can
be computed from inputs), we have to erase all knowledge about the input state
@ -108,19 +160,18 @@ stack in the correct place.
These steps are applied to each basic block and the newly generated code
is used as a replacement if it is smaller. If a basic block is split at a
``JUMPI`` and during the analysis, the condition evaluates to a constant,
the ``JUMPI`` is replaced based on the value of the constant. Thus code like
::
uint x = 7;
data[7] = 9;
if (data[x] != x + 2) // this condition is never true
return 2;
else
return 1;
simplifies to this:
::