Defensively handle errors examining app requests. (#922)
All checks were successful
Lint Checks / Run linter (push) Successful in 34s
Publish / Build and publish (push) Successful in 1m7s
Smoke Test / Run basic test suite (push) Successful in 4m0s
Webapp Test / Run webapp test suite (push) Successful in 4m42s
Deploy Test / Run deploy test suite (push) Successful in 4m58s
Database Test / Run database hosting test on kind/k8s (push) Successful in 10m2s
Container Registry Test / Run contaier registry hosting test on kind/k8s (push) Successful in 3m44s
External Stack Test / Run external stack test suite (push) Successful in 4m26s
All checks were successful
Lint Checks / Run linter (push) Successful in 34s
Publish / Build and publish (push) Successful in 1m7s
Smoke Test / Run basic test suite (push) Successful in 4m0s
Webapp Test / Run webapp test suite (push) Successful in 4m42s
Deploy Test / Run deploy test suite (push) Successful in 4m58s
Database Test / Run database hosting test on kind/k8s (push) Successful in 10m2s
Container Registry Test / Run contaier registry hosting test on kind/k8s (push) Successful in 3m44s
External Stack Test / Run external stack test suite (push) Successful in 4m26s
Related to cerc-io/webapp-deployment-status-api#10 There are two issues in that. One is that the output probably changed recently, whether in the client or server, where no matching record is found by ID (Note this is specific to `laconic record get --id <v>` and does not seem to apply to the similar command to retrieve a record by name, `laconic name resolve <n>`). Rather than returning `[]` it is now returning `[ null ]`. This cause us to think there *was* an application record found, and we attempt to treat the `null` entry like an Application object. That's fixed by filtering out null responses, which is a good precaution for the deployer, though I think it makes sense to ask whether this new behavior by the client/server is correct. Seems suspicious. The other issue is that all the defensive checks we had in place to deal with broken/bad AppDeploymentRequests were around the _build_. This error was coming much earlier, merely when parsing and examining the request to see if it needed to be handled at all. I have now added similar defensive error handling around that portion of the code. Reviewed-on: #922 Reviewed-by: zramsay <zramsay@noreply.git.vdb.to> Co-authored-by: Thomas E Lackey <telackey@bozemanpass.com> Co-committed-by: Thomas E Lackey <telackey@bozemanpass.com>
This commit is contained in:
parent
8576137557
commit
5c275aa622
@ -274,34 +274,43 @@ def command(ctx, kube_config, laconic_config, image_registry, deployment_parent_
|
|||||||
requests_by_name = {}
|
requests_by_name = {}
|
||||||
skipped_by_name = {}
|
skipped_by_name = {}
|
||||||
for r in requests:
|
for r in requests:
|
||||||
if r.id in previous_requests and previous_requests[r.id].get("status", "") != "RETRY":
|
status = None
|
||||||
print(f"Skipping request {r.id}, we've already seen it.")
|
try:
|
||||||
continue
|
if r.id in previous_requests and previous_requests[r.id].get("status", "") != "RETRY":
|
||||||
|
print(f"Skipping request {r.id}, we've already seen it.")
|
||||||
|
continue
|
||||||
|
|
||||||
app = laconic.get_record(r.attributes.application)
|
app = laconic.get_record(r.attributes.application)
|
||||||
if not app:
|
if not app:
|
||||||
print("Skipping request %s, cannot locate app." % r.id)
|
print(f"Skipping request {r.id}, cannot locate app.")
|
||||||
continue
|
status = "SEEN"
|
||||||
|
continue
|
||||||
|
|
||||||
requested_name = r.attributes.dns
|
requested_name = r.attributes.dns
|
||||||
if not requested_name:
|
if not requested_name:
|
||||||
requested_name = generate_hostname_for_app(app)
|
requested_name = generate_hostname_for_app(app)
|
||||||
print("Generating name %s for request %s." % (requested_name, r.id))
|
print("Generating name %s for request %s." % (requested_name, r.id))
|
||||||
|
|
||||||
if requested_name in skipped_by_name or requested_name in requests_by_name:
|
if requested_name in skipped_by_name or requested_name in requests_by_name:
|
||||||
print("Ignoring request %s, it has been superseded." % r.id)
|
print("Ignoring request %s, it has been superseded." % r.id)
|
||||||
continue
|
continue
|
||||||
|
|
||||||
if skip_by_tag(r, include_tags, exclude_tags):
|
if skip_by_tag(r, include_tags, exclude_tags):
|
||||||
print("Skipping request %s, filtered by tag (include %s, exclude %s, present %s)" % (r.id,
|
print("Skipping request %s, filtered by tag (include %s, exclude %s, present %s)" % (r.id,
|
||||||
include_tags,
|
include_tags,
|
||||||
exclude_tags,
|
exclude_tags,
|
||||||
r.attributes.tags))
|
r.attributes.tags))
|
||||||
skipped_by_name[requested_name] = r
|
skipped_by_name[requested_name] = r
|
||||||
continue
|
continue
|
||||||
|
|
||||||
print("Found request %s to run application %s on %s." % (r.id, r.attributes.application, requested_name))
|
print("Found request %s to run application %s on %s." % (r.id, r.attributes.application, requested_name))
|
||||||
requests_by_name[requested_name] = r
|
requests_by_name[requested_name] = r
|
||||||
|
except Exception as e:
|
||||||
|
print(f"ERROR examining request {r.id}: " + str(e))
|
||||||
|
status = "ERROR"
|
||||||
|
finally:
|
||||||
|
if status:
|
||||||
|
dump_known_requests(state_file, [r], status)
|
||||||
|
|
||||||
# Find deployments.
|
# Find deployments.
|
||||||
deployments = laconic.app_deployments()
|
deployments = laconic.app_deployments()
|
||||||
|
@ -172,7 +172,7 @@ class LaconicRegistryClient:
|
|||||||
name_or_id,
|
name_or_id,
|
||||||
]
|
]
|
||||||
|
|
||||||
parsed = [AttrDict(r) for r in json.loads(logged_cmd(self.log_file, *args))]
|
parsed = [AttrDict(r) for r in json.loads(logged_cmd(self.log_file, *args)) if r]
|
||||||
if len(parsed):
|
if len(parsed):
|
||||||
self._add_to_cache(parsed)
|
self._add_to_cache(parsed)
|
||||||
return parsed[0]
|
return parsed[0]
|
||||||
|
Loading…
Reference in New Issue
Block a user