fix: cosmovisor binary usage for pre-upgrade (#12005)

## Description

Closes: #11931



---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] added `!` to the type prefix if API or client breaking change
- [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#pr-targeting))
- [x] provided a link to the relevant issue or specification
- [ ] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/main/docs/building-modules)
- [ ] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#testing)
- [x] added a changelog entry to `CHANGELOG.md`
- [ ] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [ ] updated the relevant documentation or specification
- [x] reviewed "Files changed" and left comments if necessary
- [ ] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed 
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)
This commit is contained in:
Julien Robert 2022-06-02 13:28:26 +02:00 committed by GitHub
parent 93f5108eb6
commit 4c3255e4e6
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
9 changed files with 52 additions and 38 deletions

View File

@ -240,14 +240,15 @@ Ref: https://keepachangelog.com/en/1.0.0/
### Bug Fixes
* (CLI) [#12075](https://github.com/cosmos/cosmos-sdk/pull/12075) Add `p2p-port` to the `gentx` and `create-validator` CLI commands to support custom P2P ports.
* [#11969](https://github.com/cosmos/cosmos-sdk/pull/11969) Fix the panic error in `x/upgrade` when `AppVersion` is not set.
* (cli) [#12075](https://github.com/cosmos/cosmos-sdk/pull/12075) Add `p2p-port` to the `gentx` and `create-validator` CLI commands to support custom P2P ports.
* (x/slashing) [\#11973](https://github.com/cosmos/cosmos-sdk/pull/11973) Update StartHeight signing info in AfterValidatorBonded hook when validator re-bonds
* [\#11969](https://github.com/cosmos/cosmos-sdk/pull/11969) Fix the panic error in `x/upgrade` when `AppVersion` is not set.
* (tests) [\#11940](https://github.com/cosmos/cosmos-sdk/pull/11940) Fix some client tests in the `x/gov` module
* [\#11772](https://github.com/cosmos/cosmos-sdk/pull/11772) Limit types.Dec length to avoid overflow.
* [\#11724](https://github.com/cosmos/cosmos-sdk/pull/11724) Fix data race issues with api.Server
* [\#11693](https://github.com/cosmos/cosmos-sdk/pull/11693) Add validation for gentx cmd.
* [\#11645](https://github.com/cosmos/cosmos-sdk/pull/11645) Fix `--home` flag ignored when running help.
* [\#11558](https://github.com/cosmos/cosmos-sdk/pull/11558) Fix `--dry-run` not working when using tx command.
* (cli) [\#11645](https://github.com/cosmos/cosmos-sdk/pull/11645) Fix `--home` flag ignored when running help.
* (cli) [\#11558](https://github.com/cosmos/cosmos-sdk/pull/11558) Fix `--dry-run` not working when using tx command.
* [\#11354](https://github.com/cosmos/cosmos-sdk/pull/11355) Added missing pagination flag for `bank q total` query.
* [\#11197](https://github.com/cosmos/cosmos-sdk/pull/11197) Signing with multisig now works with multisig address which is not in the keyring.
* (makefile) [\#11285](https://github.com/cosmos/cosmos-sdk/pull/11285) Fix lint-fix make target.

View File

@ -42,6 +42,10 @@ Ref: https://keepachangelog.com/en/1.0.0/
* [\#11823](https://github.com/cosmos/cosmos-sdk/pull/11823) Refactor `cosmovisor` CLI to use `cobra`.
* [\#11731](https://github.com/cosmos/cosmos-sdk/pull/11731) `cosmovisor version -o json` returns the cosmovisor version and the result of `simd --output json --long` in one JSON object.
### Bugfix
* [\#12005](https://github.com/cosmos/cosmos-sdk/pull/12005) Fix cosmovisor binary usage for pre-upgrade
## v1.1.0 2022-10-02
### Features

View File

@ -1,4 +1,4 @@
# Cosmosvisor
# Cosmovisor
`cosmovisor` is a small process manager for Cosmos SDK application binaries that monitors the governance module for incoming chain upgrade proposals. If it sees a proposal that gets approved, `cosmovisor` can automatically download the new binary, stop the current binary, switch from the old binary to the new one, and finally restart the node with the new binary.
@ -312,6 +312,7 @@ cp ./build/simd $DAEMON_HOME/cosmovisor/upgrades/test1/bin
```
Open a new terminal window and submit an upgrade proposal along with a deposit and a vote (these commands must be run within 20 seconds of each other):
Note, when using a `v0.46+` chain, replace `submit-proposal` by `submit-legacy-proposal`.
```sh
./build/simd tx gov submit-proposal software-upgrade test1 --title upgrade --description upgrade --upgrade-height 200 --from validator --yes

View File

@ -8,7 +8,7 @@ import (
// GetHelpText creates the help text multi-line string.
func GetHelpText() string {
return fmt.Sprintf(`Cosmosvisor - A process manager for Cosmos SDK application binaries.
return fmt.Sprintf(`Cosmovisor - A process manager for Cosmos SDK application binaries.
Cosmovisor is a wrapper for a Cosmos SDK based App (set using the required %s env variable).
It starts the App by passing all provided arguments and monitors the %s/data/upgrade-info.json

View File

@ -89,7 +89,7 @@ func (s *HelpTestSuite) setEnv(t *testing.T, env *cosmovisorHelpEnv) {
func (s *HelpTestSuite) TestGetHelpText() {
expectedPieces := []string{
"Cosmosvisor",
"Cosmovisor",
cosmovisor.EnvName, cosmovisor.EnvHome,
"https://github.com/cosmos/cosmos-sdk/tree/main/cosmovisor/README.md",
}

View File

@ -40,11 +40,12 @@ func Run(logger *zerolog.Logger, args []string, options ...RunOption) error {
}
doUpgrade, err := launcher.Run(args, runCfg.StdOut, runCfg.StdErr)
// if RestartAfterUpgrade, we launch after a successful upgrade (only condition LaunchProcess returns nil)
// if RestartAfterUpgrade, we launch after a successful upgrade (given that condition launcher.Run returns nil)
for cfg.RestartAfterUpgrade && err == nil && doUpgrade {
logger.Info().Str("app", cfg.Name).Msg("upgrade detected, relaunching")
doUpgrade, err = launcher.Run(args, runCfg.StdOut, runCfg.StdErr)
}
if doUpgrade && err == nil {
logger.Info().Msg("upgrade detected, DAEMON_RESTART_AFTER_UPGRADE is off. Verify new upgrade and start cosmovisor again.")
}

View File

@ -64,8 +64,7 @@ func (l Launcher) Run(args []string, stdout, stderr io.Writer) (bool, error) {
}
}()
needsUpdate, err := l.WaitForUpgradeOrExit(cmd)
if err != nil || !needsUpdate {
if needsUpdate, err := l.WaitForUpgradeOrExit(cmd); err != nil || !needsUpdate {
return false, err
}
@ -74,12 +73,18 @@ func (l Launcher) Run(args []string, stdout, stderr io.Writer) (bool, error) {
return false, err
}
if err := UpgradeBinary(l.logger, l.cfg, l.fw.currentInfo); err != nil {
return false, err
}
if err = l.doPreUpgrade(); err != nil {
return false, err
}
return true, nil
}
return true, DoUpgrade(l.logger, l.cfg, l.fw.currentInfo)
return false, nil
}
// WaitForUpgradeOrExit checks upgrade plan file created by the app.
@ -130,8 +135,7 @@ func (l Launcher) doBackup() error {
return fmt.Errorf("error while reading upgrade-info.json: %w", err)
}
err = json.Unmarshal(upgradeInfoFile, &uInfo)
if err != nil {
if err = json.Unmarshal(upgradeInfoFile, &uInfo); err != nil {
return err
}
@ -147,9 +151,7 @@ func (l Launcher) doBackup() error {
l.logger.Info().Time("backup start time", st).Msg("starting to take backup of data directory")
// copy the $DAEMON_HOME/data to a backup dir
err = copy.Copy(filepath.Join(l.cfg.Home, "data"), dst)
if err != nil {
if err = copy.Copy(filepath.Join(l.cfg.Home, "data"), dst); err != nil {
return fmt.Errorf("error while taking data backup: %w", err)
}
@ -161,8 +163,9 @@ func (l Launcher) doBackup() error {
return nil
}
// doPreUpgrade runs the pre-upgrade command defined by the application and handles respective error codes
// cfg contains the cosmovisor config from env var
// doPreUpgrade runs the pre-upgrade command defined by the application and handles respective error codes.
// cfg contains the cosmovisor config from env var.
// doPreUpgrade runs the new APP binary in order to process the upgrade (post-upgrade for cosmovisor).
func (l *Launcher) doPreUpgrade() error {
counter := 0
for {
@ -170,18 +173,16 @@ func (l *Launcher) doPreUpgrade() error {
return fmt.Errorf("pre-upgrade command failed. reached max attempt of retries - %d", l.cfg.PreupgradeMaxRetries)
}
err := l.executePreUpgradeCmd()
counter += 1
if err := l.executePreUpgradeCmd(); err != nil {
counter += 1
if err != nil {
if err.(*exec.ExitError).ProcessState.ExitCode() == 1 {
switch err.(*exec.ExitError).ProcessState.ExitCode() {
case 1:
l.logger.Info().Msg("pre-upgrade command does not exist. continuing the upgrade.")
return nil
}
if err.(*exec.ExitError).ProcessState.ExitCode() == 30 {
case 30:
return fmt.Errorf("pre-upgrade command failed : %w", err)
}
if err.(*exec.ExitError).ProcessState.ExitCode() == 31 {
case 31:
l.logger.Error().Err(err).Int("attempt", counter).Msg("pre-upgrade command failed. retrying")
continue
}
@ -193,19 +194,24 @@ func (l *Launcher) doPreUpgrade() error {
}
// executePreUpgradeCmd runs the pre-upgrade command defined by the application
// cfg contains the cosmosvisor config from the env vars
// cfg contains the cosmovisor config from the env vars
func (l *Launcher) executePreUpgradeCmd() error {
bin, err := l.cfg.CurrentBin()
if err != nil {
return fmt.Errorf("error while getting current binary path: %w", err)
}
result, err := exec.Command(bin, "pre-upgrade").Output()
if err != nil {
return err
}
preUpgradeCmd := exec.Command(bin, "pre-upgrade")
_, err = preUpgradeCmd.Output()
return err
l.logger.Info().Bytes("result", result).Msg("pre-upgrade result")
return nil
}
// IsSkipUpgradeHeight checks if pre-upgrade script must be run. If the height in the upgrade plan matches any of the heights provided in --safe-skip-upgrade, the script is not run
// IsSkipUpgradeHeight checks if pre-upgrade script must be run.
// If the height in the upgrade plan matches any of the heights provided in --unsafe-skip-upgrades, the script is not run.
func IsSkipUpgradeHeight(args []string, upgradeInfo upgradetypes.Plan) bool {
skipUpgradeHeights := UpgradeSkipHeights(args)
for _, h := range skipUpgradeHeights {
@ -217,7 +223,7 @@ func IsSkipUpgradeHeight(args []string, upgradeInfo upgradetypes.Plan) bool {
}
// UpgradeSkipHeights gets all the heights provided when
// simd start --unsafe-skip-upgrades <height1> <optional_height_2> ... <optional_height_N>
// simd start --unsafe-skip-upgrades <height1> <optional_height_2> ... <optional_height_N>
func UpgradeSkipHeights(args []string) []int {
var heights []int
for i, arg := range args {

View File

@ -16,16 +16,17 @@ import (
"github.com/rs/zerolog"
)
// DoUpgrade will be called after the log message has been parsed and the process has terminated.
// UpgradeBinary will be called after the log message has been parsed and the process has terminated.
// We can now make any changes to the underlying directory without interference and leave it
// in a state, so we can make a proper restart
func DoUpgrade(logger *zerolog.Logger, cfg *Config, info upgradetypes.Plan) error {
// Simplest case is to switch the link
func UpgradeBinary(logger *zerolog.Logger, cfg *Config, info upgradetypes.Plan) error {
// simplest case is to switch the link
err := EnsureBinary(cfg.UpgradeBin(info.Name))
if err == nil {
// we have the binary - do it
return cfg.SetCurrentUpgrade(info)
}
// if auto-download is disabled, we fail
if !cfg.AllowDownloadBinaries {
return fmt.Errorf("binary not present, downloading disabled: %w", err)

View File

@ -92,7 +92,7 @@ func (s *upgradeTestSuite) assertCurrentLink(cfg cosmovisor.Config, target strin
}
// TODO: test with download (and test all download functions)
func (s *upgradeTestSuite) TestDoUpgradeNoDownloadUrl() {
func (s *upgradeTestSuite) TestUpgradeBinaryNoDownloadUrl() {
home := copyTestData(s.T(), "validate")
cfg := &cosmovisor.Config{Home: home, Name: "dummyd", AllowDownloadBinaries: true}
logger := cosmovisor.NewLogger()
@ -105,7 +105,7 @@ func (s *upgradeTestSuite) TestDoUpgradeNoDownloadUrl() {
// do upgrade ignores bad files
for _, name := range []string{"missing", "nobin", "noexec"} {
info := upgradetypes.Plan{Name: name}
err = cosmovisor.DoUpgrade(logger, cfg, info)
err = cosmovisor.UpgradeBinary(logger, cfg, info)
s.Require().Error(err, name)
currentBin, err := cfg.CurrentBin()
s.Require().NoError(err)
@ -116,7 +116,7 @@ func (s *upgradeTestSuite) TestDoUpgradeNoDownloadUrl() {
for _, upgrade := range []string{"chain2", "chain3"} {
// now set it to a valid upgrade and make sure CurrentBin is now set properly
info := upgradetypes.Plan{Name: upgrade}
err = cosmovisor.DoUpgrade(logger, cfg, info)
err = cosmovisor.UpgradeBinary(logger, cfg, info)
s.Require().NoError(err)
// we should see current point to the new upgrade dir
upgradeBin := cfg.UpgradeBin(upgrade)