Fork of plugeth with any needed changes for statediff plugin
Go to file
Felix Lange 0042f13d47 eth/downloader: separate state sync from queue (#14460)
* eth/downloader: separate state sync from queue

Scheduling of state node downloads hogged the downloader queue lock when
new requests were scheduled. This caused timeouts for other requests.
With this change, state sync is fully independent of all other downloads
and doesn't involve the queue at all.

State sync is started and checked on in processContent. This is slightly
awkward because processContent doesn't have a select loop. Instead, the
queue is closed by an auxiliary goroutine when state sync fails. We
tried several alternatives to this but settled on the current approach
because it's the least amount of change overall.

Handling of the pivot block has changed slightly: the queue previously
prevented import of pivot block receipts before the state of the pivot
block was available. In this commit, the receipt will be imported before
the state. This causes an annoyance where the pivot block is committed
as fast block head even when state downloads fail. Stay tuned for more
updates in this area ;)

* eth/downloader: remove cancelTimeout channel

* eth/downloader: retry state requests on timeout

* eth/downloader: improve comment

* eth/downloader: mark peers idle when state sync is done

* eth/downloader: move pivot block splitting to processContent

This change also ensures that pivot block receipts aren't imported
before the pivot block itself.

* eth/downloader: limit state node retries

* eth/downloader: improve state node error handling and retry check

* eth/downloader: remove maxStateNodeRetries

It fails the sync too much.

* eth/downloader: remove last use of cancelCh in statesync.go

Fixes TestDeliverHeadersHang*Fast and (hopefully)
the weird cancellation behaviour at the end of fast sync.

* eth/downloader: fix leak in runStateSync

* eth/downloader: don't run processFullSyncContent in LightSync mode

* eth/downloader: improve comments

* eth/downloader: fix vet, megacheck

* eth/downloader: remove unrequested tasks anyway

* eth/downloader, trie: various polishes around duplicate items

This commit explicitly tracks duplicate and unexpected state
delieveries done against a trie Sync structure, also adding
there to import info logs.

The commit moves the db batch used to commit trie changes one
level deeper so its flushed after every node insertion. This
is needed to avoid a lot of duplicate retrievals caused by
inconsistencies between Sync internals and database. A better
approach is to track not-yet-written states in trie.Sync and
flush on commit, but I'm focuing on correctness first now.

The commit fixes a regression around pivot block fail count.
The counter previously was reset to 1 if and only if a sync
cycle progressed (inserted at least 1 entry to the database).
The current code reset it already if a node was delivered,
which is not stong enough, because unless it ends up written
to disk, an attacker can just loop and attack ad infinitum.

The commit also fixes a regression around state deliveries
and timeouts. The old downloader tracked if a delivery is
stale (none of the deliveries were requestedt), in which
case it didn't mark the node idle and did not send further
requests, since it signals a past timeout. The current code
did mark it idle even on stale deliveries, which eventually
caused two requests to be in flight at the same time, making
the deliveries always stale and mass duplicating retrievals
between multiple peers.

* eth/downloader: fix state request leak

This commit fixes the hang seen sometimes while doing the state
sync. The cause of the hang was a rare combination of events:
request state data from peer, peer drops and reconnects almost
immediately. This caused a new download task to be assigned to
the peer, overwriting the old one still waiting for a timeout,
which in turned leaked the requests out, never to be retried.
The fix is to ensure that a task assignment moves any pending
one back into the retry queue.

The commit also fixes a regression with peer dropping due to
stalls. The current code considered a peer stalling if they
timed out delivering 1 item. However, the downloader never
requests only one, the minimum is 2 (attempt to fine tune
estimated latency/bandwidth). The fix is simply to drop if
a timeout is detected at 2 items.

Apart from the above bugfixes, the commit contains some code
polishes I made while debugging the hang.

* core, eth, trie: support batched trie sync db writes

* trie: rename SyncMemCache to syncMemBatch
2017-06-22 15:26:03 +03:00
.github README: removed develop mentions 2016-11-30 19:39:05 +01:00
accounts accounts: fix spelling error (#14567) 2017-06-06 09:28:47 +02:00
build crypto/bn256: fix go vet false positive 2017-05-24 15:40:26 +02:00
cmd swarm/test: add integration test for 'swarm up' (#14353) 2017-06-21 14:54:23 +02:00
common core/vm, common/math: Add doc about Byte, fix format 2017-06-08 23:16:05 +02:00
compression/rle rlp, trie, contracts, compression, consensus: improve comments (#14580) 2017-06-12 14:45:17 +02:00
consensus consensus/clique: fix typo and don't add snapshot into recents again 2017-06-20 10:20:45 +08:00
console console: avoid float64 when remarshaling parameters 2017-05-02 13:42:55 +02:00
containers cotnainers/docker: fix the legacy alpine image before dropping 2017-06-01 00:21:47 +03:00
contracts rlp, trie, contracts, compression, consensus: improve comments (#14580) 2017-06-12 14:45:17 +02:00
core eth/downloader: separate state sync from queue (#14460) 2017-06-22 15:26:03 +03:00
crypto accounts/keystore, crypto: don't enforce key checks on existing keyfiles 2017-06-01 11:11:06 +03:00
eth eth/downloader: separate state sync from queue (#14460) 2017-06-22 15:26:03 +03:00
ethclient ethclient: fix TransactionByHash pending return value. (#14663) 2017-06-21 09:53:50 +02:00
ethdb trie: more node iterator improvements (#14615) 2017-06-20 18:26:09 +02:00
ethstats ethstats: reduce ethstats traffic by trottling reports 2017-06-01 00:16:19 +03:00
event all: update license information 2017-04-14 10:29:00 +02:00
internal swarm/test: add integration test for 'swarm up' (#14353) 2017-06-21 14:54:23 +02:00
les les: code refactoring (#14416) 2017-06-21 12:27:38 +02:00
light Merge pull request #14350 from fjl/trie-iterator-skip-2 2017-04-25 11:10:20 +03:00
log core, consensus: pluggable consensus engines (#3817) 2017-04-05 00:16:29 +02:00
metrics all: update light logs (and a few others) to the new model 2017-03-03 11:41:52 +02:00
miner consensus, core/*, params: metropolis preparation refactor 2017-05-18 09:05:58 +02:00
mobile mobile: add a regression test for signer recovery 2017-06-13 13:39:39 +03:00
node cmd, node: support different bootnodes, fix default light port 2017-05-10 17:51:52 +03:00
p2p discover: Changed Logging from Debug to Info (#14485) 2017-05-20 13:10:59 +02:00
params VERSION, params: begin Geth 1.6.6 release cycle 2017-06-01 22:08:19 +03:00
rlp rlp, trie, contracts, compression, consensus: improve comments (#14580) 2017-06-12 14:45:17 +02:00
rpc Merge pull request #13885 from bas-vk/rpc_generic_pubsub 2017-05-03 11:41:07 +03:00
swarm swarm/fuse: use subtests 2017-06-21 09:56:58 +02:00
tests accounts/keystore, crypto: enforce 256 bit keys on import 2017-05-23 14:58:03 +03:00
trie eth/downloader: separate state sync from queue (#14460) 2017-06-22 15:26:03 +03:00
vendor swarm/test: add integration test for 'swarm up' (#14353) 2017-06-21 14:54:23 +02:00
whisper whisper: switching to v5 + minor refactoring (#14387) 2017-04-28 11:57:15 +02:00
.dockerignore Dockerfile: support building USB on Alpine, ignore temp files 2017-02-13 18:31:09 +02:00
.gitattributes .gitattributes: add 2015-08-06 17:18:59 +02:00
.gitignore Godeps, vendor: convert dependency management to trash (#3198) 2016-10-28 19:05:01 +02:00
.mailmap all: update license information 2017-04-14 10:29:00 +02:00
.travis.yml travis, appveyor: bump to Go 1.8.3, Android NDK 14b 2017-05-25 17:05:33 +03:00
appveyor.yml travis, appveyor: bump to Go 1.8.3, Android NDK 14b 2017-05-25 17:05:33 +03:00
AUTHORS all: update license information 2017-04-14 10:29:00 +02:00
circle.yml circleci: enable docker based hive testing 2016-07-15 16:07:34 +03:00
COPYING all: update license information 2015-07-07 14:12:44 +02:00
COPYING.LESSER all: update license information 2015-07-07 14:12:44 +02:00
Dockerfile dockerfile: expose 30303/udp 2017-05-25 12:34:29 -07:00
interfaces.go all: import "context" instead of "golang.org/x/net/context" 2017-03-22 20:49:15 +01:00
Makefile core, core/types: regenerate JSON marshaling, add "hash" to headers (#13868) 2017-04-06 11:38:21 +03:00
README.md README: document new config file option (#14348) 2017-06-21 14:53:08 +02:00
VERSION VERSION, params: begin Geth 1.6.6 release cycle 2017-06-01 22:08:19 +03:00

Ethereum Go

Official golang implementation of the Ethereum protocol.

API Reference Gitter

Automated builds are available for stable releases and the unstable master branch. Binary archives are published at https://geth.ethereum.org/downloads/.

Building the source

For prerequisites and detailed build instructions please read the Installation Instructions on the wiki.

Building geth requires both a Go (version 1.7 or later) and a C compiler. You can install them using your favourite package manager. Once the dependencies are installed, run

make geth

or, to build the full suite of utilities:

make all

Executables

The go-ethereum project comes with several wrappers/executables found in the cmd directory.

Command Description
geth Our main Ethereum CLI client. It is the entry point into the Ethereum network (main-, test- or private net), capable of running as a full node (default) archive node (retaining all historical state) or a light node (retrieving data live). It can be used by other processes as a gateway into the Ethereum network via JSON RPC endpoints exposed on top of HTTP, WebSocket and/or IPC transports. geth --help and the CLI Wiki page for command line options
abigen Source code generator to convert Ethereum contract definitions into easy to use, compile-time type-safe Go packages. It operates on plain Ethereum contract ABIs with expanded functionality if the contract bytecode is also available. However it also accepts Solidity source files, making development much more streamlined. Please see our Native DApps wiki page for details.
bootnode Stripped down version of our Ethereum client implementation that only takes part in the network node discovery protocol, but does not run any of the higher level application protocols. It can be used as a lightweight bootstrap node to aid in finding peers in private networks.
disasm Bytecode disassembler to convert EVM (Ethereum Virtual Machine) bytecode into more user friendly assembly-like opcodes (e.g. `echo "6001"
evm Developer utility version of the EVM (Ethereum Virtual Machine) that is capable of running bytecode snippets within a configurable environment and execution mode. Its purpose is to allow insolated, fine-grained debugging of EVM opcodes (e.g. evm --code 60ff60ff --debug).
gethrpctest Developer utility tool to support our ethereum/rpc-test test suite which validates baseline conformity to the Ethereum JSON RPC specs. Please see the test suite's readme for details.
rlpdump Developer utility tool to convert binary RLP (Recursive Length Prefix) dumps (data encoding used by the Ethereum protocol both network as well as consensus wise) to user friendlier hierarchical representation (e.g. rlpdump --hex CE0183FFFFFFC4C304050583616263).
swarm swarm daemon and tools. This is the entrypoint for the swarm network. swarm --help for command line options and subcommands. See https://swarm-guide.readthedocs.io for swarm documentation.

Running geth

Going through all the possible command line flags is out of scope here (please consult our CLI Wiki page), but we've enumerated a few common parameter combos to get you up to speed quickly on how you can run your own Geth instance.

Full node on the main Ethereum network

By far the most common scenario is people wanting to simply interact with the Ethereum network: create accounts; transfer funds; deploy and interact with contracts. For this particular use-case the user doesn't care about years-old historical data, so we can fast-sync quickly to the current state of the network. To do so:

$ geth --fast --cache=512 console

This command will:

  • Start geth in fast sync mode (--fast), causing it to download more data in exchange for avoiding processing the entire history of the Ethereum network, which is very CPU intensive.
  • Bump the memory allowance of the database to 512MB (--cache=512), which can help significantly in sync times especially for HDD users. This flag is optional and you can set it as high or as low as you'd like, though we'd recommend the 512MB - 2GB range.
  • Start up Geth's built-in interactive JavaScript console, (via the trailing console subcommand) through which you can invoke all official web3 methods as well as Geth's own management APIs. This too is optional and if you leave it out you can always attach to an already running Geth instance with geth attach.

Full node on the Ethereum test network

Transitioning towards developers, if you'd like to play around with creating Ethereum contracts, you almost certainly would like to do that without any real money involved until you get the hang of the entire system. In other words, instead of attaching to the main network, you want to join the test network with your node, which is fully equivalent to the main network, but with play-Ether only.

$ geth --testnet --fast --cache=512 console

The --fast, --cache flags and console subcommand have the exact same meaning as above and they are equally useful on the testnet too. Please see above for their explanations if you've skipped to here.

Specifying the --testnet flag however will reconfigure your Geth instance a bit:

  • Instead of using the default data directory (~/.ethereum on Linux for example), Geth will nest itself one level deeper into a testnet subfolder (~/.ethereum/testnet on Linux). Note, on OSX and Linux this also means that attaching to a running testnet node requires the use of a custom endpoint since geth attach will try to attach to a production node endpoint by default. E.g. geth attach <datadir>/testnet/geth.ipc. Windows users are not affected by this.
  • Instead of connecting the main Ethereum network, the client will connect to the test network, which uses different P2P bootnodes, different network IDs and genesis states.

Note: Although there are some internal protective measures to prevent transactions from crossing over between the main network and test network, you should make sure to always use separate accounts for play-money and real-money. Unless you manually move accounts, Geth will by default correctly separate the two networks and will not make any accounts available between them.

Configuration

As an alternative to passing the numerous flags to the geth binary, you can also pass a configuration file via:

$ geth --config /path/to/your_config.toml

To get an idea how the file should look like you can use the dumpconfig subcommand to export your existing configuration:

$ geth --your-favourite-flags dumpconfig

Note: This works only with geth v1.6.0 and above

Docker quick start

One of the quickest ways to get Ethereum up and running on your machine is by using Docker:

docker run -d --name ethereum-node -v /Users/alice/ethereum:/root \
           -p 8545:8545 -p 30303:30303 \
           ethereum/client-go --fast --cache=512

This will start geth in fast sync mode with a DB memory allowance of 512MB just as the above command does. It will also create a persistent volume in your home directory for saving your blockchain as well as map the default ports. There is also an alpine tag available for a slim version of the image.

Programatically interfacing Geth nodes

As a developer, sooner rather than later you'll want to start interacting with Geth and the Ethereum network via your own programs and not manually through the console. To aid this, Geth has built in support for a JSON-RPC based APIs (standard APIs and Geth specific APIs). These can be exposed via HTTP, WebSockets and IPC (unix sockets on unix based platforms, and named pipes on Windows).

The IPC interface is enabled by default and exposes all the APIs supported by Geth, whereas the HTTP and WS interfaces need to manually be enabled and only expose a subset of APIs due to security reasons. These can be turned on/off and configured as you'd expect.

HTTP based JSON-RPC API options:

  • --rpc Enable the HTTP-RPC server
  • --rpcaddr HTTP-RPC server listening interface (default: "localhost")
  • --rpcport HTTP-RPC server listening port (default: 8545)
  • --rpcapi API's offered over the HTTP-RPC interface (default: "eth,net,web3")
  • --rpccorsdomain Comma separated list of domains from which to accept cross origin requests (browser enforced)
  • --ws Enable the WS-RPC server
  • --wsaddr WS-RPC server listening interface (default: "localhost")
  • --wsport WS-RPC server listening port (default: 8546)
  • --wsapi API's offered over the WS-RPC interface (default: "eth,net,web3")
  • --wsorigins Origins from which to accept websockets requests
  • --ipcdisable Disable the IPC-RPC server
  • --ipcapi API's offered over the IPC-RPC interface (default: "admin,debug,eth,miner,net,personal,shh,txpool,web3")
  • --ipcpath Filename for IPC socket/pipe within the datadir (explicit paths escape it)

You'll need to use your own programming environments' capabilities (libraries, tools, etc) to connect via HTTP, WS or IPC to a Geth node configured with the above flags and you'll need to speak JSON-RPC on all transports. You can reuse the same connection for multiple requests!

Note: Please understand the security implications of opening up an HTTP/WS based transport before doing so! Hackers on the internet are actively trying to subvert Ethereum nodes with exposed APIs! Further, all browser tabs can access locally running webservers, so malicious webpages could try to subvert locally available APIs!

Operating a private network

Maintaining your own private network is more involved as a lot of configurations taken for granted in the official networks need to be manually set up.

Defining the private genesis state

First, you'll need to create the genesis state of your networks, which all nodes need to be aware of and agree upon. This consists of a small JSON file (e.g. call it genesis.json):

{
  "config": {
        "chainId": 0,
        "homesteadBlock": 0,
        "eip155Block": 0,
        "eip158Block": 0
    },
  "alloc"      : {},
  "coinbase"   : "0x0000000000000000000000000000000000000000",
  "difficulty" : "0x20000",
  "extraData"  : "",
  "gasLimit"   : "0x2fefd8",
  "nonce"      : "0x0000000000000042",
  "mixhash"    : "0x0000000000000000000000000000000000000000000000000000000000000000",
  "parentHash" : "0x0000000000000000000000000000000000000000000000000000000000000000",
  "timestamp"  : "0x00"
}

The above fields should be fine for most purposes, although we'd recommend changing the nonce to some random value so you prevent unknown remote nodes from being able to connect to you. If you'd like to pre-fund some accounts for easier testing, you can populate the alloc field with account configs:

"alloc": {
  "0x0000000000000000000000000000000000000001": {"balance": "111111111"},
  "0x0000000000000000000000000000000000000002": {"balance": "222222222"}
}

With the genesis state defined in the above JSON file, you'll need to initialize every Geth node with it prior to starting it up to ensure all blockchain parameters are correctly set:

$ geth init path/to/genesis.json

Creating the rendezvous point

With all nodes that you want to run initialized to the desired genesis state, you'll need to start a bootstrap node that others can use to find each other in your network and/or over the internet. The clean way is to configure and run a dedicated bootnode:

$ bootnode --genkey=boot.key
$ bootnode --nodekey=boot.key

With the bootnode online, it will display an enode URL that other nodes can use to connect to it and exchange peer information. Make sure to replace the displayed IP address information (most probably [::]) with your externally accessible IP to get the actual enode URL.

Note: You could also use a full fledged Geth node as a bootnode, but it's the less recommended way.

Starting up your member nodes

With the bootnode operational and externally reachable (you can try telnet <ip> <port> to ensure it's indeed reachable), start every subsequent Geth node pointed to the bootnode for peer discovery via the --bootnodes flag. It will probably also be desirable to keep the data directory of your private network separated, so do also specify a custom --datadir flag.

$ geth --datadir=path/to/custom/data/folder --bootnodes=<bootnode-enode-url-from-above>

Note: Since your network will be completely cut off from the main and test networks, you'll also need to configure a miner to process transactions and create new blocks for you.

Running a private miner

Mining on the public Ethereum network is a complex task as it's only feasible using GPUs, requiring an OpenCL or CUDA enabled ethminer instance. For information on such a setup, please consult the EtherMining subreddit and the Genoil miner repository.

In a private network setting however, a single CPU miner instance is more than enough for practical purposes as it can produce a stable stream of blocks at the correct intervals without needing heavy resources (consider running on a single thread, no need for multiple ones either). To start a Geth instance for mining, run it with all your usual flags, extended by:

$ geth <usual-flags> --mine --minerthreads=1 --etherbase=0x0000000000000000000000000000000000000000

Which will start mining bocks and transactions on a single CPU thread, crediting all proceedings to the account specified by --etherbase. You can further tune the mining by changing the default gas limit blocks converge to (--targetgaslimit) and the price transactions are accepted at (--gasprice).

Contribution

Thank you for considering to help out with the source code! We welcome contributions from anyone on the internet, and are grateful for even the smallest of fixes!

If you'd like to contribute to go-ethereum, please fork, fix, commit and send a pull request for the maintainers to review and merge into the main code base. If you wish to submit more complex changes though, please check up with the core devs first on our gitter channel to ensure those changes are in line with the general philosophy of the project and/or get some early feedback which can make both your efforts much lighter as well as our review and merge procedures quick and simple.

Please make sure your contributions adhere to our coding guidelines:

  • Code must adhere to the official Go formatting guidelines (i.e. uses gofmt).
  • Code must be documented adhering to the official Go commentary guidelines.
  • Pull requests need to be based on and opened against the master branch.
  • Commit messages should be prefixed with the package(s) they modify.
    • E.g. "eth, rpc: make trace configs optional"

Please see the Developers' Guide for more details on configuring your environment, managing project dependencies and testing procedures.

License

The go-ethereum library (i.e. all code outside of the cmd directory) is licensed under the GNU Lesser General Public License v3.0, also included in our repository in the COPYING.LESSER file.

The go-ethereum binaries (i.e. all code inside of the cmd directory) is licensed under the GNU General Public License v3.0, also included in our repository in the COPYING file.