From 548ae18dfdadd55fcb88a649cdbfd2a559637d87 Mon Sep 17 00:00:00 2001 From: RJ Catalano Date: Fri, 9 Jun 2017 13:58:55 -0500 Subject: [PATCH 1/7] begin abi spec translation into solidity docs Signed-off-by: RJ Catalano --- docs/abi-spec.rst | 327 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 327 insertions(+) create mode 100644 docs/abi-spec.rst diff --git a/docs/abi-spec.rst b/docs/abi-spec.rst new file mode 100644 index 000000000..ee25c65fd --- /dev/null +++ b/docs/abi-spec.rst @@ -0,0 +1,327 @@ +.. index:: abi, application binary interface + +.. _ABI: + +****************************************** +Application Binary Interface Specification +****************************************** + +Basic design +============ + +We assume the Application Binary Interface (ABI) is strongly typed, known at compilation time and static. No introspection mechanism will be provided. We assert that all contracts will have the interface definitions of any contracts they call available at compile-time. + +This specification does not address contracts whose interface is dynamic or otherwise known only at run-time. Should these cases become important they can be adequately handled as facilities built within the Ethereum ecosystem. + +Function Selector +================= + +The first four bytes of the call data for a function call specifies the function to be called. It is the +first (left, high-order in big-endian) four bytes of the Keccak (SHA-3) hash of the signature of the function. The signature is defined as the canonical expression of the basic prototype, i.e. +the function name with the parenthesised list of parameter types. Parameter types are split by a single comma - no spaces are used. + +Argument Encoding +================= + +Starting from the fifth byte, the encoded arguments follow. This encoding is also used in other places, e.g. the return values and also event arguments are encoded in the same way, without the four bytes specifying the function. 
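The selector derivation described above can be sketched in Python. This is only an illustration under stated assumptions: the helper names `keccak256` and `function_selector` are hypothetical and not part of any Solidity tooling. The hash is a compact pure-Python Keccak-256 using the original Keccak padding byte `0x01`, because the `hashlib.sha3_256` function in the Python standard library implements the NIST variant with different padding and gives different digests.

```python
# Illustrative sketch only: a compact pure-Python Keccak-256 (rate 1088 bits,
# capacity 512 bits, original Keccak padding byte 0x01 rather than NIST's 0x06).
MASK64 = (1 << 64) - 1


def _rol64(value, shift):
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f1600(lanes):
    """Apply the 24-round Keccak-f[1600] permutation to a 5x5 lane matrix."""
    lfsr = 1
    for _ in range(24):
        # theta: mix column parities into every lane
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate lanes and move them to new positions
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear step applied row by row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ ((~row[(x + 1) % 5]) & row[(x + 2) % 5])
        # iota: round constant generated by an 8-bit LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data):
    """Return the 32-byte Keccak-256 digest of `data` (bytes)."""
    rate = 136  # bytes absorbed per permutation call
    padded = bytearray(data) + bytearray(rate - len(data) % rate)
    padded[len(data)] ^= 0x01  # first padding bit
    padded[-1] ^= 0x80         # last padding bit
    lanes = [[0] * 5 for _ in range(5)]
    for start in range(0, len(padded), rate):
        block = padded[start:start + rate]
        for i in range(rate // 8):
            lanes[i % 5][i // 5] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        lanes = _keccak_f1600(lanes)
    return b"".join(lanes[i][0].to_bytes(8, "little") for i in range(4))


def function_selector(name, param_types):
    """Build the canonical signature (comma-separated, no spaces) and hash it."""
    signature = name + "(" + ",".join(param_types) + ")"
    return keccak256(signature.encode("ascii"))[:4]


print("0x" + function_selector("baz", ["uint32", "bool"]).hex())
```

The printed value can be compared against the Method ID quoted for `baz(uint32,bool)` in the examples of this specification. Note that the parameter types passed in must already be canonical, e.g. `uint256` rather than `uint`.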
+ +Types +===== + +The following elementary types exist: + +- `uint<M>`: unsigned integer type of `M` bits, `0 < M <= 256`, `M % 8 == 0`. e.g. `uint32`, `uint8`, `uint256`. + +- `int<M>`: two's complement signed integer type of `M` bits, `0 < M <= 256`, `M % 8 == 0`. + +- `address`: equivalent to `uint160`, except for the assumed interpretation and language typing. + +- `uint`, `int`: synonyms for `uint256`, `int256` respectively (not to be used for computing the function selector). + +- `bool`: equivalent to `uint8` restricted to the values 0 and 1 + +- `fixed<M>x<N>`: signed fixed-point decimal number of `M` bits, `0 < M <= 256`, `M % 8 == 0`, and `0 < N <= 80`, which denotes the value `v` as `v / (10 ** N)`. + +- `ufixed<M>x<N>`: unsigned variant of `fixed<M>x<N>`. + +- `fixed`, `ufixed`: synonyms for `fixed128x19`, `ufixed128x19` respectively (not to be used for computing the function selector). + +- `bytes<M>`: binary type of `M` bytes, `0 < M <= 32`. + +- `function`: equivalent to `bytes24`: an address, followed by a function selector + +The following (fixed-size) array type exists: + +- `<type>[M]`: a fixed-length array of the given fixed-length type. + +The following non-fixed-size types exist: + +- `bytes`: dynamic sized byte sequence. + +- `string`: dynamic sized unicode string assumed to be UTF-8 encoded. + +- `<type>[]`: a variable-length array of the given fixed-length type. + +Formal Specification of the Encoding +==================================== + +We will now formally specify the encoding, such that it will have the following +properties, which are especially useful if some arguments are nested arrays: + +**Properties:** + +1. The number of reads necessary to access a value is at most the depth of the +value inside the argument array structure, i.e. four reads are needed to +retrieve `a_i[k][l][r]`. In a previous version of the ABI, the number of reads scaled +linearly with the total number of dynamic parameters in the worst case. + +2. 
The data of a variable or array element is not interleaved with other data +and it is relocatable, i.e. it only uses relative "addresses" + +We distinguish static and dynamic types. Static types are encoded in-place and dynamic types are encoded at a separately allocated location after the current block. + +**Definition:** The following types are called "dynamic": +* `bytes` +* `string` +* `T[]` for any `T` +* `T[k]` for any dynamic `T` and any `k > 0` + +All other types are called "static". + +**Definition:** `len(a)` is the number of bytes in a binary string `a`. +The type of `len(a)` is assumed to be `uint256`. + +We define `enc`, the actual encoding, as a mapping of values of the ABI types to binary strings such +that `len(enc(X))` depends on the value of `X` if and only if the type of `X` +is dynamic. + +**Definition:** For any ABI value `X`, we recursively define `enc(X)`, depending +on the type of `X` being + +- `T[k]` for any `T` and `k`: + + `enc(X) = head(X[0]) ... head(X[k-1]) tail(X[0]) ... tail(X[k-1])` + + where `head` and `tail` are defined for `X[i]` being of a static type as + `head(X[i]) = enc(X[i])` and `tail(X[i]) = ""` (the empty string) + and as + `head(X[i]) = enc(len(head(X[0]) ... head(X[k-1]) tail(X[0]) ... tail(X[i-1])))` + `tail(X[i]) = enc(X[i])` + otherwise. + + Note that in the dynamic case, `head(X[i])` is well-defined since the lengths of + the head parts only depend on the types and not the values. Its value is the offset + of the beginning of `tail(X[i])` relative to the start of `enc(X)`. + +- `T[]` where `X` has `k` elements (`k` is assumed to be of type `uint256`): + + `enc(X) = enc(k) enc([X[1], ..., X[k]])` + + i.e. it is encoded as if it were an array of static size `k`, prefixed with + the number of elements. + +- `bytes`, of length `k` (which is assumed to be of type `uint256`): + + `enc(X) = enc(k) pad_right(X)`, i.e. 
the number of bytes is encoded as a + `uint256` followed by the actual value of `X` as a byte sequence, followed by + the minimum number of zero-bytes such that `len(enc(X))` is a multiple of 32. + +- `string`: + + `enc(X) = enc(enc_utf8(X))`, i.e. `X` is utf-8 encoded and this value is interpreted as being of `bytes` type and encoded further. Note that the length used in this subsequent encoding is the number of bytes of the utf-8 encoded string, not its number of characters. + +- `uint<M>`: `enc(X)` is the big-endian encoding of `X`, padded on the higher-order (left) side with zero-bytes such that the length is a multiple of 32 bytes. +- `address`: as in the `uint160` case +- `int<M>`: `enc(X)` is the big-endian two's complement encoding of `X`, padded on the higher-order (left) side with `0xff` for negative `X` and with zero bytes for positive `X` such that the length is a multiple of 32 bytes. +- `bool`: as in the `uint8` case, where `1` is used for `true` and `0` for `false` +- `fixed<M>x<N>`: `enc(X)` is `enc(X * 2**N)` where `X * 2**N` is interpreted as an `int256`. +- `fixed`: as in the `fixed128x128` case +- `ufixed<M>x<N>`: `enc(X)` is `enc(X * 2**N)` where `X * 2**N` is interpreted as a `uint256`. +- `ufixed`: as in the `ufixed128x128` case +- `bytes<M>`: `enc(X)` is the sequence of bytes in `X` padded with zero-bytes to a length of 32. + +Note that for any `X`, `len(enc(X))` is a multiple of 32. + +## Function Selector and Argument Encoding + +All in all, a call to the function `f` with parameters `a_1, ..., a_n` is encoded as + + `function_selector(f) enc([a_1, ..., a_n])` + +and the return values `v_1, ..., v_k` of `f` are encoded as + + `enc([v_1, ..., v_k])` + +where the types of `[a_1, ..., a_n]` and `[v_1, ..., v_k]` are assumed to be +fixed-size arrays of length `n` and `k`, respectively. 
Note that strictly, +`[a_1, ..., a_n]` can be an "array" with elements of different types, but the +encoding is still well-defined as the assumed common type `T` (above) is not +actually used. + +## Examples + +Given the contract: + +```js +contract Foo { + function bar(fixed[2] xy) {} + function baz(uint32 x, bool y) returns (bool r) { r = x > 32 || y; } + function sam(bytes name, bool z, uint[] data) {} +} +``` + +Thus for our `Foo` example if we wanted to call `baz` with the parameters `69` and `true`, we would pass 68 bytes total, which can be broken down into: + +- `0xcdcd77c0`: the Method ID. This is derived as the first 4 bytes of the Keccak hash of the ASCII form of the signature `baz(uint32,bool)`. +- `0x0000000000000000000000000000000000000000000000000000000000000045`: the first parameter, a uint32 value `69` padded to 32 bytes +- `0x0000000000000000000000000000000000000000000000000000000000000001`: the second parameter - boolean `true`, padded to 32 bytes + +In total: +``` +0xcdcd77c000000000000000000000000000000000000000000000000000000000000000450000000000000000000000000000000000000000000000000000000000000001 +``` +It returns a single `bool`. If, for example, it were to return `false`, its output would be the single byte array `0x0000000000000000000000000000000000000000000000000000000000000000`, a single bool. + +If we wanted to call `bar` with the argument `[2.125, 8.5]`, we would pass 68 bytes total, broken down into: +- `0xab55044d`: the Method ID. This is derived from the signature `bar(fixed128x128[2])`. Note that `fixed` is replaced with its canonical representation `fixed128x128`. +- `0x0000000000000000000000000000000220000000000000000000000000000000`: the first part of the first parameter, a fixed128x128 value `2.125`. +- `0x0000000000000000000000000000000880000000000000000000000000000000`: the second part of the first parameter, a fixed128x128 value `8.5`. 
+ +In total: +``` +0xab55044d00000000000000000000000000000002200000000000000000000000000000000000000000000000000000000000000880000000000000000000000000000000 +``` + +If we wanted to call `sam` with the arguments `"dave"`, `true` and `[1,2,3]`, we would pass 292 bytes total, broken down into: +- `0xa5643bf2`: the Method ID. This is derived from the signature `sam(bytes,bool,uint256[])`. Note that `uint` is replaced with its canonical representation `uint256`. +- `0x0000000000000000000000000000000000000000000000000000000000000060`: the location of the data part of the first parameter (dynamic type), measured in bytes from the start of the arguments block. In this case, `0x60`. +- `0x0000000000000000000000000000000000000000000000000000000000000001`: the second parameter: boolean true. +- `0x00000000000000000000000000000000000000000000000000000000000000a0`: the location of the data part of the third parameter (dynamic type), measured in bytes. In this case, `0xa0`. +- `0x0000000000000000000000000000000000000000000000000000000000000004`: the data part of the first argument, it starts with the length of the byte array in elements, in this case, 4. +- `0x6461766500000000000000000000000000000000000000000000000000000000`: the contents of the first argument: the UTF-8 (equal to ASCII in this case) encoding of `"dave"`, padded on the right to 32 bytes. +- `0x0000000000000000000000000000000000000000000000000000000000000003`: the data part of the third argument, it starts with the length of the array in elements, in this case, 3. +- `0x0000000000000000000000000000000000000000000000000000000000000001`: the first entry of the third parameter. +- `0x0000000000000000000000000000000000000000000000000000000000000002`: the second entry of the third parameter. +- `0x0000000000000000000000000000000000000000000000000000000000000003`: the third entry of the third parameter. 
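The word-by-word breakdown of the `sam` call above follows the head/tail rule: static arguments sit in the head, dynamic arguments put an offset in the head and their data in the tail. A minimal Python sketch of that rule is given here, assuming hypothetical helper names (`enc_uint`, `enc_bytes`, `enc_uint_array`, `enc_tuple`) and taking the Method ID `0xa5643bf2` from the text rather than computing it.

```python
WORD = 32  # every encoded unit is padded to a multiple of 32 bytes


def enc_uint(value):
    """Encode an unsigned integer as one big-endian 32-byte word."""
    return value.to_bytes(WORD, "big")


def enc_bytes(data):
    """Encode dynamic `bytes`: length word, then data right-padded with zeros."""
    padding = (-len(data)) % WORD
    return enc_uint(len(data)) + data + b"\x00" * padding


def enc_uint_array(values):
    """Encode a dynamic `uint256[]`: element count, then one word per element."""
    return enc_uint(len(values)) + b"".join(enc_uint(v) for v in values)


def enc_tuple(parts):
    """Combine (is_dynamic, encoded) pairs into head and tail sections."""
    # A dynamic part occupies one offset word in the head; a static part
    # occupies its full encoding.
    head_size = sum(WORD if dyn else len(encoded) for dyn, encoded in parts)
    head, tail = b"", b""
    for is_dynamic, encoded in parts:
        if is_dynamic:
            # Offset is measured from the start of the arguments block.
            head += enc_uint(head_size + len(tail))
            tail += encoded
        else:
            head += encoded
    return head + tail


SAM_SELECTOR = bytes.fromhex("a5643bf2")  # Method ID quoted in the text above

call_data = SAM_SELECTOR + enc_tuple([
    (True, enc_bytes(b"dave")),         # bytes name
    (False, enc_uint(1)),               # bool z = true, encoded like uint8 1
    (True, enc_uint_array([1, 2, 3])),  # uint[] data
])

print(len(call_data))  # 292
print("0x" + call_data.hex())
```

The offsets `0x60` and `0xa0` fall out of the running tail length, which is why the head can be written without knowing anything but the types and the sizes of the earlier tails.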
+ +In total: +``` +0xa5643bf20000000000000000000000000000000000000000000000000000000000000060000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000a0000000000000000000000000000000000000000000000000000000000000000464617665000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000003000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000020000000000000000000000000000000000000000000000000000000000000003 +``` + + +### Use of Dynamic Types + +A call to a function with the signature `f(uint,uint32[],bytes10,bytes)` with values `(0x123, [0x456, 0x789], "1234567890", "Hello, world!")` is encoded in the following way: + +We take the first four bytes of `sha3("f(uint256,uint32[],bytes10,bytes)")`, i.e. `0x8be65246`. +Then we encode the head parts of all four arguments. For the static types `uint256` and `bytes10`, these are directly the values we want to pass, whereas for the dynamic types `uint32[]` and `bytes`, we use the offset in bytes to the start of their data area, measured from the start of the value encoding (i.e. not counting the first four bytes containing the hash of the function signature). 
These are: + + - `0x0000000000000000000000000000000000000000000000000000000000000123` (`0x123` padded to 32 bytes) + - `0x0000000000000000000000000000000000000000000000000000000000000080` (offset to start of data part of second parameter, 4*32 bytes, exactly the size of the head part) + - `0x3132333435363738393000000000000000000000000000000000000000000000` (`"1234567890"` padded to 32 bytes on the right) + - `0x00000000000000000000000000000000000000000000000000000000000000e0` (offset to start of data part of fourth parameter = offset to start of data part of first dynamic parameter + size of data part of first dynamic parameter = 4\*32 + 3\*32 (see below)) + +After this, the data part of the first dynamic argument, `[0x456, 0x789]` follows: + + - `0x0000000000000000000000000000000000000000000000000000000000000002` (number of elements of the array, 2) + - `0x0000000000000000000000000000000000000000000000000000000000000456` (first element) + - `0x0000000000000000000000000000000000000000000000000000000000000789` (second element) + +Finally, we encode the data part of the second dynamic argument, `"Hello, world!"`: + + - `0x000000000000000000000000000000000000000000000000000000000000000d` (number of elements (bytes in this case): 13) + - `0x48656c6c6f2c20776f726c642100000000000000000000000000000000000000` (`"Hello, world!"` padded to 32 bytes on the right) + +All together, the encoding is (newline after function selector and each 32-bytes for clarity): + +``` +0x8be65246 +0000000000000000000000000000000000000000000000000000000000000123 +0000000000000000000000000000000000000000000000000000000000000080 +3132333435363738393000000000000000000000000000000000000000000000 +00000000000000000000000000000000000000000000000000000000000000e0 +0000000000000000000000000000000000000000000000000000000000000002 +0000000000000000000000000000000000000000000000000000000000000456 +0000000000000000000000000000000000000000000000000000000000000789 
+000000000000000000000000000000000000000000000000000000000000000d +48656c6c6f2c20776f726c642100000000000000000000000000000000000000 +``` + + +# Events + +Events are an abstraction of the Ethereum logging/event-watching protocol. Log entries provide the contract's address, a series of up to four topics and some arbitrary length binary data. Events leverage the existing function ABI in order to interpret this (together with an interface spec) as a properly typed structure. + +Given an event name and a series of event parameters, we split them into two sub-series: those which are indexed and those which are not. Those which are indexed, which may number up to 3, are used alongside the Keccak hash of the event signature to form the topics of the log entry. Those which are not indexed form the byte array of the event. + +In effect, a log entry using this ABI is described as: + +- `address`: the address of the contract (intrinsically provided by Ethereum); +- `topics[0]`: `keccak(EVENT_NAME+"("+EVENT_ARGS.map(canonical_type_of).join(",")+")")` (`canonical_type_of` is a function that simply returns the canonical type of a given argument, e.g. for `uint indexed foo`, it would return `uint256`). If the event is declared as `anonymous` the `topics[0]` is not generated; +- `topics[n]`: `EVENT_INDEXED_ARGS[n - 1]` (`EVENT_INDEXED_ARGS` is the series of `EVENT_ARGS` that are indexed); +- `data`: `abi_serialise(EVENT_NON_INDEXED_ARGS)` (`EVENT_NON_INDEXED_ARGS` is the series of `EVENT_ARGS` that are not indexed, `abi_serialise` is the ABI serialisation function used for returning a series of typed values from a function, as described above). + +# JSON + +The JSON format for a contract's interface is given by an array of function and/or event descriptions. 
A function description is a JSON object with the fields: + +- `type`: `"function"`, `"constructor"`, or `"fallback"` (the [unnamed "default" function](http://solidity.readthedocs.io/en/develop/contracts.html#fallback-function)); +- `name`: the name of the function; +- `inputs`: an array of objects, each of which contains: + * `name`: the name of the parameter; + * `type`: the canonical type of the parameter. +- `outputs`: an array of objects similar to `inputs`, can be omitted if function doesn't return anything; +- `constant`: `true` if function is [specified to not modify blockchain state](http://solidity.readthedocs.io/en/develop/contracts.html#constant-functions); +- `payable`: `true` if function accepts ether, defaults to `false`. + +`type` can be omitted, defaulting to `"function"`. + +Constructor and fallback functions never have `name` or `outputs`. The fallback function doesn't have `inputs` either. + +Sending non-zero ether to a non-payable function will throw, so avoid doing so. + +An event description is a JSON object with fairly similar fields: + +- `type`: always `"event"` +- `name`: the name of the event; +- `inputs`: an array of objects, each of which contains: + * `name`: the name of the parameter; + * `type`: the canonical type of the parameter. + * `indexed`: `true` if the field is part of the log's topics, `false` if it is part of the log's data segment. +- `anonymous`: `true` if the event was declared as `anonymous`. 
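Because each description carries the canonical parameter types, the signature string that feeds the Keccak hash can be rebuilt from the JSON alone. The Python sketch below illustrates this under stated assumptions: the JSON literal is a made-up interface loosely based on the `baz` function and an `Event` with one indexed parameter, and the helpers `canonical_signature` and `split_event_inputs` are hypothetical names, not part of any existing tool.

```python
import json

# Hypothetical interface description following the field layout listed above.
ABI_JSON = """
[
  {"type": "function", "name": "baz",
   "inputs": [{"name": "x", "type": "uint32"}, {"name": "y", "type": "bool"}],
   "outputs": [{"name": "r", "type": "bool"}],
   "constant": false, "payable": false},
  {"type": "event", "name": "Event",
   "inputs": [{"name": "a", "type": "uint256", "indexed": true},
              {"name": "b", "type": "bytes32", "indexed": false}],
   "anonymous": false}
]
"""


def canonical_signature(entry):
    """Return name(type1,type2,...) with no spaces, as used for hashing."""
    types = ",".join(param["type"] for param in entry["inputs"])
    return entry["name"] + "(" + types + ")"


def split_event_inputs(entry):
    """Separate parameter names into topic (indexed) and data (non-indexed)."""
    indexed = [p["name"] for p in entry["inputs"] if p.get("indexed")]
    data = [p["name"] for p in entry["inputs"] if not p.get("indexed")]
    return indexed, data


for entry in json.loads(ABI_JSON):
    kind = entry.get("type", "function")  # `type` defaults to "function"
    print(kind, canonical_signature(entry))
    if kind == "event":
        print("  topics/data split:", split_event_inputs(entry))
```

Hashing the resulting string with Keccak gives the function selector (first four bytes) or, for a non-anonymous event, `topics[0]` (the full hash).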
+ +For example, + +```js +contract Test { +function Test(){ b = 0x12345678901234567890123456789012; } +event Event(uint indexed a, bytes32 b) +event Event2(uint indexed a, bytes32 b) +function foo(uint a) { Event(a, b); } +bytes32 b; +} +``` + +would result in the JSON: + +```js +[{ +"type":"event", +"inputs": [{"name":"a","type":"uint256","indexed":true},{"name":"b","type":"bytes32","indexed":false}], +"name":"Event" +}, { +"type":"event", +"inputs": [{"name":"a","type":"uint256","indexed":true},{"name":"b","type":"bytes32","indexed":false}], +"name":"Event2" +}, { +"type":"event", +"inputs": [{"name":"a","type":"uint256","indexed":true},{"name":"b","type":"bytes32","indexed":false}], +"name":"Event2" +}, { +"type":"function", +"inputs": [{"name":"a","type":"uint256"}], +"name":"foo", +"outputs": [] +}] +``` \ No newline at end of file From 3525280a7262c908cdcb6dcf190e5fc630cb0bef Mon Sep 17 00:00:00 2001 From: RJ Catalano Date: Fri, 9 Jun 2017 14:49:59 -0500 Subject: [PATCH 2/7] some small fixes to the names and underlining; still need to fix the fixed point bytecode representation Signed-off-by: RJ Catalano --- docs/abi-spec.rst | 26 +++++++++++--------------- 1 file changed, 11 insertions(+), 15 deletions(-) diff --git a/docs/abi-spec.rst b/docs/abi-spec.rst index ee25c65fd..4d721167f 100644 --- a/docs/abi-spec.rst +++ b/docs/abi-spec.rst @@ -68,15 +68,11 @@ Formal Specification of the Encoding We will now formally specify the encoding, such that it will have the following properties, which are especially useful if some arguments are nested arrays: -**Properties:** +Properties: -1. The number of reads necessary to access a value is at most the depth of the -value inside the argument array structure, i.e. four reads are needed to -retrieve `a_i[k][l][r]`. In a previous version of the ABI, the number of reads scaled -linearly with the total number of dynamic parameters in the worst case. + 1. 
The number of reads necessary to access a value is at most the depth of the value inside the argument array structure, i.e. four reads are needed to retrieve `a_i[k][l][r]`. In a previous version of the ABI, the number of reads scaled linearly with the total number of dynamic parameters in the worst case. -2. The data of a variable or array element is not interleaved with other data -and it is relocatable, i.e. it only uses relative "addresses" + 2. The data of a variable or array element is not interleaved with other data and it is relocatable, i.e. it only uses relative "addresses" We distinguish static and dynamic types. Static types are encoded in-place and dynamic types are encoded at a separately allocated location after the current block. @@ -92,8 +88,7 @@ All other types are called "static". The type of `len(a)` is assumed to be `uint256`. We define `enc`, the actual encoding, as a mapping of values of the ABI types to binary strings such -that `len(enc(X))` depends on the value of `X` if and only if the type of `X` -is dynamic. +that `len(enc(X))` depends on the value of `X` if and only if the type of `X` is dynamic. **Definition:** For any ABI value `X`, we recursively define `enc(X)`, depending on the type of `X` being @@ -158,7 +153,8 @@ fixed-size arrays of length `n` and `k`, respectively. Note that strictly, encoding is still well-defined as the assumed common type `T` (above) is not actually used. -## Examples +Examples +======== Given the contract: @@ -183,9 +179,9 @@ In total: It returns a single `bool`. If, for example, it were to return `false`, its output would be the single byte array `0x0000000000000000000000000000000000000000000000000000000000000000`, a single bool. If we wanted to call `bar` with the argument `[2.125, 8.5]`, we would pass 68 bytes total, broken down into: -- `0xab55044d`: the Method ID. This is derived from the signature `bar(fixed128x128[2])`. Note that `fixed` is replaced with its canonical representation `fixed128x128`. 
-- `0x0000000000000000000000000000000220000000000000000000000000000000`: the first part of the first parameter, a fixed128x128 value `2.125`. -- `0x0000000000000000000000000000000880000000000000000000000000000000`: the second part of the first parameter, a fixed128x128 value `8.5`. +- `0xab55044d`: the Method ID. This is derived from the signature `bar(fixed[2])`. Note that `fixed` is replaced with its canonical representation `fixed128x19`. +- `0x0000000000000000000000000000000220000000000000000000000000000000`: the first part of the first parameter, a fixed128x19 value `2.125`. +- `0x0000000000000000000000000000000880000000000000000000000000000000`: the second part of the first parameter, a fixed128x19 value `8.5`. In total: ``` @@ -209,8 +205,8 @@ In total: 0xa5643bf20000000000000000000000000000000000000000000000000000000000000060000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000a0000000000000000000000000000000000000000000000000000000000000000464617665000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000003000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000020000000000000000000000000000000000000000000000000000000000000003 ``` - -### Use of Dynamic Types +Use of Dynamic Types +==================== A call to a function with the signature `f(uint,uint32[],bytes10,bytes)` with values `(0x123, [0x456, 0x789], "1234567890", "Hello, world!")` is encoded in the following way: From a0777a7ffb146b995753722358be204055058218 Mon Sep 17 00:00:00 2001 From: chriseth Date: Mon, 12 Jun 2017 17:43:48 +0200 Subject: [PATCH 3/7] Include structs. 
--- docs/abi-spec.rst | 66 ++++++++++++++++++++++++++++++----------------- 1 file changed, 42 insertions(+), 24 deletions(-) diff --git a/docs/abi-spec.rst b/docs/abi-spec.rst index 4d721167f..fa98cfa0d 100644 --- a/docs/abi-spec.rst +++ b/docs/abi-spec.rst @@ -9,6 +9,10 @@ Application Binary Interface Specification Basic design ============ +The Application Binary Interface is the standard way to interact with contracts in the Ethereum ecosystem, both +from outside the blockchain and for contract-to-contract interaction. Data is encoded following its type, +according to this specification. + We assume the Application Binary Interface (ABI) is strongly typed, known at compilation time and static. No introspection mechanism will be provided. We assert that all contracts will have the interface definitions of any contracts they call available at compile-time. This specification does not address contracts whose interface is dynamic or otherwise known only at run-time. Should these cases become important they can be adequately handled as facilities built within the Ethereum ecosystem. @@ -62,6 +66,14 @@ The following non-fixed-size types exist: - `<type>[]`: a variable-length array of the given fixed-length type. +Types can be combined into anonymous structs by enclosing a finite non-negative number +of them inside parentheses, separated by commas: + +- `(T1,T2,...,Tn)`: anonymous struct (ordered tuple) consisting of the types `T1`, ..., `Tn`, `n >= 0` + +It is possible to form structs of structs, arrays of structs and so on. + + Formal Specification of the Encoding ==================================== @@ -93,20 +105,28 @@ that `len(enc(X))` depends on the value of `X` if and only if the type of `X` is **Definition:** For any ABI value `X`, we recursively define `enc(X)`, depending on the type of `X` being +- `(T1,...,Tk)` for `k >= 0` and any types `T1`, ..., `Tk` + + `enc(X) = head(X(0)) ... head(X(k-1)) tail(X(0)) ... 
tail(X(k-1))` + + where `X(i)` is the `ith` component of the value, and + `head` and `tail` are defined for `Ti` being a static type as + `head(X(i)) = enc(X(i))` and `tail(X(i)) = ""` (the empty string) + and as + `head(X(i)) = enc(len(head(X(0)) ... head(X(k-1)) tail(X(0)) ... tail(X(i-1))))` + `tail(X(i)) = enc(X(i))` + otherwise, i.e. if `Ti` is a dynamic type. + + Note that in the dynamic case, `head(X(i))` is well-defined since the lengths of + the head parts only depend on the types and not the values. Its value is the offset + of the beginning of `tail(X(i))` relative to the start of `enc(X)`. + - `T[k]` for any `T` and `k`: - `enc(X) = head(X[0]) ... head(X[k-1]) tail(X[0]) ... tail(X[k-1])` - - where `head` and `tail` are defined for `X[i]` being of a static type as - `head(X[i]) = enc(X[i])` and `tail(X[i]) = ""` (the empty string) - and as - `head(X[i]) = enc(len(head(X[0]) ... head(X[k-1]) tail(X[0]) ... tail(X[i-1])))` - `tail(X[i]) = enc(X[i])` - otherwise. - - Note that in the dynamic case, `head(X[i])` is well-defined since the lengths of - the head parts only depend on the types and not the values. Its value is the offset - of the beginning of `tail(X[i])` relative to the start of `enc(X)`. + `enc(X) = enc((X[0], ..., X[k-1]))` + + i.e. it is encoded as if it were an anonymous struct with `k` elements + of the same type. - `T[]` where `X` has `k` elements (`k` is assumed to be of type `uint256`): @@ -141,17 +161,13 @@ Note that for any `X`, `len(enc(X))` is a multiple of 32. All in all, a call to the function `f` with parameters `a_1, ..., a_n` is encoded as - `function_selector(f) enc([a_1, ..., a_n])` + `function_selector(f) enc((a_1, ..., a_n))` and the return values `v_1, ..., v_k` of `f` are encoded as - `enc([v_1, ..., v_k])` + `enc((v_1, ..., v_k))` -where the types of `[a_1, ..., a_n]` and `[v_1, ..., v_k]` are assumed to be -fixed-size arrays of length `n` and `k`, respectively. 
Note that strictly, -`[a_1, ..., a_n]` can be an "array" with elements of different types, but the -encoding is still well-defined as the assumed common type `T` (above) is not -actually used. +i.e. the values are combined into an anonymous struct and encoded. Examples ======== @@ -245,7 +261,8 @@ All together, the encoding is (newline after function selector and each 32-bytes ``` -# Events +Events +====== Events are an abstraction of the Ethereum logging/event-watching protocol. Log entries provide the contract's address, a series of up to four topics and some arbitrary length binary data. Events leverage the existing function ABI in order to interpret this (together with an interface spec) as a properly typed structure. @@ -258,17 +275,18 @@ In effect, a log entry using this ABI is described as: - `topics[n]`: `EVENT_INDEXED_ARGS[n - 1]` (`EVENT_INDEXED_ARGS` is the series of `EVENT_ARGS` that are indexed); - `data`: `abi_serialise(EVENT_NON_INDEXED_ARGS)` (`EVENT_NON_INDEXED_ARGS` is the series of `EVENT_ARGS` that are not indexed, `abi_serialise` is the ABI serialisation function used for returning a series of typed values from a function, as described above). -# JSON +JSON +==== The JSON format for a contract's interface is given by an array of function and/or event descriptions. A function description is a JSON object with the fields: -- `type`: `"function"`, `"constructor"`, or `"fallback"` (the [unnamed "default" function](http://solidity.readthedocs.io/en/develop/contracts.html#fallback-function)); +- `type`: `"function"`, `"constructor"`, or `"fallback"` (the :ref:`unnamed "default" function `); - `name`: the name of the function; - `inputs`: an array of objects, each of which contains: * `name`: the name of the parameter; * `type`: the canonical type of the parameter. 
- `outputs`: an array of objects similar to `inputs`, can be omitted if function doesn't return anything; -- `constant`: `true` if function is [specified to not modify blockchain state](http://solidity.readthedocs.io/en/develop/contracts.html#constant-functions); +- `constant`: `true` if function is :ref:`specified to not modify blockchain state `; - `payable`: `true` if function accepts ether, defaults to `false`. `type` can be omitted, defaulting to `"function"`. @@ -320,4 +338,4 @@ would result in the JSON: "name":"foo", "outputs": [] }] -``` \ No newline at end of file +``` From c66c5d4a21bfdd6eea8ce1114cbc700b1dc9eac2 Mon Sep 17 00:00:00 2001 From: chriseth Date: Mon, 12 Jun 2017 17:49:11 +0200 Subject: [PATCH 4/7] Change fixed number example. --- docs/abi-spec.rst | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/docs/abi-spec.rst b/docs/abi-spec.rst index fa98cfa0d..b79bb613e 100644 --- a/docs/abi-spec.rst +++ b/docs/abi-spec.rst @@ -149,10 +149,10 @@ on the type of `X` being - `address`: as in the `uint160` case - `int<M>`: `enc(X)` is the big-endian two's complement encoding of `X`, padded on the higher-order (left) side with `0xff` for negative `X` and with zero bytes for positive `X` such that the length is a multiple of 32 bytes. - `bool`: as in the `uint8` case, where `1` is used for `true` and `0` for `false` -- `fixed<M>x<N>`: `enc(X)` is `enc(X * 2**N)` where `X * 2**N` is interpreted as an `int256`. -- `fixed`: as in the `fixed128x128` case -- `ufixed<M>x<N>`: `enc(X)` is `enc(X * 2**N)` where `X * 2**N` is interpreted as a `uint256`. -- `ufixed`: as in the `ufixed128x128` case +- `fixed<M>x<N>`: `enc(X)` is `enc(X * 10**N)` where `X * 10**N` is interpreted as an `int256`. +- `fixed`: as in the `fixed128x19` case +- `ufixed<M>x<N>`: `enc(X)` is `enc(X * 10**N)` where `X * 10**N` is interpreted as a `uint256`. 
+- `ufixed`: as in the `ufixed128x19` case - `bytes`: `enc(X)` is the sequence of bytes in `X` padded with zero-bytes to a length of 32. Note that for any `X`, `len(enc(X))` is a multiple of 32. @@ -176,7 +176,7 @@ Given the contract: ```js contract Foo { - function bar(fixed[2] xy) {} + function bar(bytes3[2] xy) {} function baz(uint32 x, bool y) returns (bool r) { r = x > 32 || y; } function sam(bytes name, bool z, uint[] data) {} } @@ -194,14 +194,15 @@ In total: ``` It returns a single `bool`. If, for example, it were to return `false`, its output would be the single byte array `0x0000000000000000000000000000000000000000000000000000000000000000`, a single bool. -If we wanted to call `bar` with the argument `[2.125, 8.5]`, we would pass 68 bytes total, broken down into: -- `0xab55044d`: the Method ID. This is derived from the signature `bar(fixed[2])`. Note that `fixed` is replaced with its canonical representation `fixed128x19`. -- `0x0000000000000000000000000000000220000000000000000000000000000000`: the first part of the first parameter, a fixed128x19 value `2.125`. -- `0x0000000000000000000000000000000880000000000000000000000000000000`: the second part of the first parameter, a fixed128x19 value `8.5`. +If we wanted to call `bar` with the argument `["abc", "def"]`, we would pass 68 bytes total, broken down into: + +- `0xfce353f6`: the Method ID. This is derived from the signature `bar(bytes3[2])`. +- `0x6162630000000000000000000000000000000000000000000000000000000000`: the first part of the first parameter, a `bytes3` value `"abc"` (left-aligned). +- `0x6465660000000000000000000000000000000000000000000000000000000000`: the second part of the first parameter, a `bytes3` value `"def"` (left-aligned). 
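The `bar` breakdown above shows that fixed-size byte types are left-aligned (padded on the right), whereas integers are right-aligned (padded on the left). A short Python sketch of the difference follows; the helper names `enc_fixed_bytes` and `enc_uint` are hypothetical, and the Method ID `0xfce353f6` is taken from the text rather than computed.

```python
WORD = 32  # size of one ABI word in bytes


def enc_fixed_bytes(value):
    """Encode a fixed-size byte value of 1..32 bytes: data first, zero padding after."""
    if not 0 < len(value) <= WORD:
        raise ValueError("fixed-size byte values must hold between 1 and 32 bytes")
    return value + b"\x00" * (WORD - len(value))


def enc_uint(value):
    """Encode an unsigned integer: zero padding first, big-endian value last."""
    return value.to_bytes(WORD, "big")


BAR_SELECTOR = bytes.fromhex("fce353f6")  # Method ID quoted in the text above

# A static array of two static elements is simply the elements in sequence.
call_data = BAR_SELECTOR + b"".join(enc_fixed_bytes(v) for v in [b"abc", b"def"])

print(len(call_data))  # 68
print("0x" + call_data.hex())
print(enc_uint(69).hex())  # the same 32-byte width, but the value sits at the right
```

Since `bytes3[2]` is a static type, no offset or length word appears; the 68 bytes are the selector followed by two padded elements.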
In total: ``` -0xab55044d00000000000000000000000000000002200000000000000000000000000000000000000000000000000000000000000880000000000000000000000000000000 +0xfce353f661626300000000000000000000000000000000000000000000000000000000006465660000000000000000000000000000000000000000000000000000000000 ``` If we wanted to call `sam` with the arguments `"dave"`, `true` and `[1,2,3]`, we would pass 292 bytes total, broken down into: From 3170fd9a93f9844a155e4f0e01500ab4c9a0bb4b Mon Sep 17 00:00:00 2001 From: chriseth Date: Mon, 12 Jun 2017 17:50:03 +0200 Subject: [PATCH 5/7] Formatting of heading. --- docs/abi-spec.rst | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/abi-spec.rst b/docs/abi-spec.rst index b79bb613e..66a3befdf 100644 --- a/docs/abi-spec.rst +++ b/docs/abi-spec.rst @@ -157,7 +157,8 @@ on the type of `X` being Note that for any `X`, `len(enc(X))` is a multiple of 32. -## Function Selector and Argument Encoding +Function Selector and Argument Encoding +======================================= All in all, a call to the function `f` with parameters `a_1, ..., a_n` is encoded as From ca70d82b96de33ce168a22d38cb0ed4a5a4d2cb9 Mon Sep 17 00:00:00 2001 From: chriseth Date: Mon, 12 Jun 2017 18:33:23 +0200 Subject: [PATCH 6/7] Include abi specs in index and fix styling. --- docs/abi-spec.rst | 122 +++++++++++++++++++++++---------------------- docs/contracts.rst | 2 + docs/index.rst | 1 + 3 files changed, 65 insertions(+), 60 deletions(-) diff --git a/docs/abi-spec.rst b/docs/abi-spec.rst index 66a3befdf..607442dbe 100644 --- a/docs/abi-spec.rst +++ b/docs/abi-spec.rst @@ -111,10 +111,14 @@ on the type of `X` being where `X(i)` is the `ith` component of the value, and `head` and `tail` are defined for `Ti` being a static type as + `head(X(i)) = enc(X(i))` and `tail(X(i)) = ""` (the empty string) + and as + `head(X(i)) = enc(len(head(X(0)) ... head(X(k-1)) tail(X(0)) ... tail(X(i-1))))` `tail(X(i)) = enc(X(i))` + otherwise, i.e. 
if `Ti` is a dynamic type. Note that in the dynamic case, `head(X(i))` is well-defined since the lengths of @@ -175,13 +179,14 @@ Examples Given the contract: -```js -contract Foo { - function bar(bytes3[2] xy) {} - function baz(uint32 x, bool y) returns (bool r) { r = x > 32 || y; } - function sam(bytes name, bool z, uint[] data) {} -} -``` +:: + + contract Foo { + function bar(bytes3[2] xy) {} + function baz(uint32 x, bool y) returns (bool r) { r = x > 32 || y; } + function sam(bytes name, bool z, uint[] data) {} + } + Thus for our `Foo` example if we wanted to call `baz` with the parameters `69` and `true`, we would pass 68 bytes total, which can be broken down into: @@ -189,10 +194,10 @@ Thus for our `Foo` example if we wanted to call `baz` with the parameters `69` a - `0x0000000000000000000000000000000000000000000000000000000000000045`: the first parameter, a uint32 value `69` padded to 32 bytes - `0x0000000000000000000000000000000000000000000000000000000000000001`: the second parameter - boolean `true`, padded to 32 bytes -In total: -``` -0xcdcd77c000000000000000000000000000000000000000000000000000000000000000450000000000000000000000000000000000000000000000000000000000000001 -``` +In total:: + + 0xcdcd77c000000000000000000000000000000000000000000000000000000000000000450000000000000000000000000000000000000000000000000000000000000001 + It returns a single `bool`. If, for example, it were to return `false`, its output would be the single byte array `0x0000000000000000000000000000000000000000000000000000000000000000`, a single bool. If we wanted to call `bar` with the argument `["abc", "def"]`, we would pass 68 bytes total, broken down into: @@ -201,10 +206,9 @@ If we wanted to call `bar` with the argument `["abc", "def"]`, we would pass 68 - `0x6162630000000000000000000000000000000000000000000000000000000000`: the first part of the first parameter, a `bytes3` value `"abc"` (left-aligned). 
- `0x6465660000000000000000000000000000000000000000000000000000000000`: the second part of the first parameter, a `bytes3` value `"def"` (left-aligned). -In total: -``` -0xfce353f661626300000000000000000000000000000000000000000000000000000000006465660000000000000000000000000000000000000000000000000000000000 -``` +In total:: + + 0xfce353f661626300000000000000000000000000000000000000000000000000000000006465660000000000000000000000000000000000000000000000000000000000 If we wanted to call `sam` with the arguments `"dave"`, `true` and `[1,2,3]`, we would pass 292 bytes total, broken down into: - `0xa5643bf2`: the Method ID. This is derived from the signature `sam(bytes,bool,uint256[])`. Note that `uint` is replaced with its canonical representation `uint256`. @@ -218,10 +222,9 @@ If we wanted to call `sam` with the arguments `"dave"`, `true` and `[1,2,3]`, we - `0x0000000000000000000000000000000000000000000000000000000000000002`: the second entry of the third parameter. - `0x0000000000000000000000000000000000000000000000000000000000000003`: the third entry of the third parameter. 
-In total: -``` -0xa5643bf20000000000000000000000000000000000000000000000000000000000000060000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000a0000000000000000000000000000000000000000000000000000000000000000464617665000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000003000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000020000000000000000000000000000000000000000000000000000000000000003 -``` +In total:: + + 0xa5643bf20000000000000000000000000000000000000000000000000000000000000060000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000a0000000000000000000000000000000000000000000000000000000000000000464617665000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000003000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000020000000000000000000000000000000000000000000000000000000000000003 Use of Dynamic Types ==================== @@ -249,19 +252,18 @@ Finally, we encode the data part of the second dynamic argument, `"Hello, world! 
All together, the encoding is (newline after function selector and each 32-bytes for clarity): -``` -0x8be65246 -0000000000000000000000000000000000000000000000000000000000000123 -0000000000000000000000000000000000000000000000000000000000000080 -3132333435363738393000000000000000000000000000000000000000000000 -00000000000000000000000000000000000000000000000000000000000000e0 -0000000000000000000000000000000000000000000000000000000000000002 -0000000000000000000000000000000000000000000000000000000000000456 -0000000000000000000000000000000000000000000000000000000000000789 -000000000000000000000000000000000000000000000000000000000000000d -48656c6c6f2c20776f726c642100000000000000000000000000000000000000 -``` +:: + 0x8be65246 + 0000000000000000000000000000000000000000000000000000000000000123 + 0000000000000000000000000000000000000000000000000000000000000080 + 3132333435363738393000000000000000000000000000000000000000000000 + 00000000000000000000000000000000000000000000000000000000000000e0 + 0000000000000000000000000000000000000000000000000000000000000002 + 0000000000000000000000000000000000000000000000000000000000000456 + 0000000000000000000000000000000000000000000000000000000000000789 + 000000000000000000000000000000000000000000000000000000000000000d + 48656c6c6f2c20776f726c642100000000000000000000000000000000000000 Events ====== @@ -309,35 +311,35 @@ An event description is a JSON object with fairly similar fields: For example, -```js -contract Test { -function Test(){ b = 0x12345678901234567890123456789012; } -event Event(uint indexed a, bytes32 b) -event Event2(uint indexed a, bytes32 b) -function foo(uint a) { Event(a, b); } -bytes32 b; -} -``` +:: + + contract Test { + function Test(){ b = 0x12345678901234567890123456789012; } + event Event(uint indexed a, bytes32 b) + event Event2(uint indexed a, bytes32 b) + function foo(uint a) { Event(a, b); } + bytes32 b; + } would result in the JSON: -```js -[{ -"type":"event", -"inputs": 
[{"name":"a","type":"uint256","indexed":true},{"name":"b","type":"bytes32","indexed":false}], -"name":"Event" -}, { -"type":"event", -"inputs": [{"name":"a","type":"uint256","indexed":true},{"name":"b","type":"bytes32","indexed":false}], -"name":"Event2" -}, { -"type":"event", -"inputs": [{"name":"a","type":"uint256","indexed":true},{"name":"b","type":"bytes32","indexed":false}], -"name":"Event2" -}, { -"type":"function", -"inputs": [{"name":"a","type":"uint256"}], -"name":"foo", -"outputs": [] -}] -``` +.. code:: JSON + + [{ + "type":"event", + "inputs": [{"name":"a","type":"uint256","indexed":true},{"name":"b","type":"bytes32","indexed":false}], + "name":"Event" + }, { + "type":"event", + "inputs": [{"name":"a","type":"uint256","indexed":true},{"name":"b","type":"bytes32","indexed":false}], + "name":"Event2" + }, { + "type":"event", + "inputs": [{"name":"a","type":"uint256","indexed":true},{"name":"b","type":"bytes32","indexed":false}], + "name":"Event2" + }, { + "type":"function", + "inputs": [{"name":"a","type":"uint256"}], + "name":"foo", + "outputs": [] + }] diff --git a/docs/contracts.rst b/docs/contracts.rst index a1192d4e6..74f13cbb2 100644 --- a/docs/contracts.rst +++ b/docs/contracts.rst @@ -458,6 +458,8 @@ value types and strings. } +.. 
_constant-functions: + ****************** Constant Functions ****************** diff --git a/docs/index.rst b/docs/index.rst index 4b48b91cb..f0ec4472a 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -142,6 +142,7 @@ Contents solidity-in-depth.rst security-considerations.rst using-the-compiler.rst + abi-spec.rst style-guide.rst common-patterns.rst bugs.rst From 1d644bed3159c5557623a5f40fe101b517731a9a Mon Sep 17 00:00:00 2001 From: RJ Catalano Date: Wed, 14 Jun 2017 08:06:03 -0500 Subject: [PATCH 7/7] try to get rid of warning Signed-off-by: RJ Catalano --- docs/abi-spec.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/abi-spec.rst b/docs/abi-spec.rst index 607442dbe..e39c88617 100644 --- a/docs/abi-spec.rst +++ b/docs/abi-spec.rst @@ -323,7 +323,7 @@ For example, would result in the JSON: -.. code:: JSON +.. code:: json [{ "type":"event",