Compare commits


69 Commits

Author SHA1 Message Date
A. F. Dudley
0d699edef0 Merge branch '_so_push' into _so_main_push 2026-03-10 17:13:08 +00:00
A. F. Dudley
1df32635de fix(deps): bump pydantic 1.10.9 → 1.10.13 (CVE-2024-3772)
Fixes ReDoS vulnerability in pydantic email validation regex.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 17:12:47 +00:00
A. F. Dudley
dd7aeb329a Merge remote-tracking branch 'stack-orchestrator/fix/kind-mount-propagation' into _so_main_merge 2026-03-10 17:09:45 +00:00
A. F. Dudley
b129aaa9a5 Merge branch 'bar-822-kind-load-after-rebuild'
# Conflicts:
#	stack-orchestrator/stack_orchestrator/deploy/k8s/deploy_k8s.py
#	stack-orchestrator/stack_orchestrator/deploy/k8s/helpers.py
2026-03-10 16:53:55 +00:00
A. F. Dudley
fdde3be5c8 fix: add pre-commit hooks and fix all lint/type/format errors
Process bug fix: no pre-commit existed for this repo's Python code.
Added pyproject.toml with unified dependencies (ruff, mypy, ansible-lint),
.pre-commit-config.yaml with repo-based hooks (ruff) and local uv-run
hooks (mypy, ansible-lint).

Fixed 249 ruff errors (B023, B904, B006, B007, UP008, UP031, C408),
~13 mypy type errors, 11 ansible-lint violations, and ruff-format
across all Python files including stack-orchestrator subtree.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 14:56:22 +00:00
A. F. Dudley
8119b25add bar-822: replace kind load with local registry for image loading
kind load docker-image serializes the full image (docker save | ctr import),
taking 5-10 minutes per cluster recreate. Replace with a persistent local
registry (registry:2 on port 5001) that survives kind cluster deletes.

stack-orchestrator changes:
- helpers.py: replace load_images_into_kind() with ensure_local_registry(),
  connect_registry_to_kind_network(), push_images_to_local_registry()
- helpers.py: add registry mirror to containerdConfigPatches so kind nodes
  pull from localhost:5001 via the kind-registry container
- deploy_k8s.py: rewrite local container image refs to localhost:5001/...
  so containerd pulls from the registry instead of local store

Ansible changes:
- biscayne-sync-tools.yml: ensure registry container before build, then
  tag+push to local registry after build (build-container tag)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 08:37:53 +00:00
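The deploy_k8s.py ref rewrite described above amounts to a small pure transformation; a minimal sketch (the `localhost:5001` registry address is from the commit, the function and parameter names are hypothetical):

```python
REGISTRY = "localhost:5001"  # kind-registry port from the commit message

def rewrite_image_ref(image: str, local_images: set[str]) -> str:
    """Rewrite refs for locally built images so containerd inside the
    kind nodes pulls them from the local registry instead of looking in
    its own (empty) image store."""
    if image in local_images:
        return f"{REGISTRY}/{image}"
    return image  # upstream images are pulled as-is
```
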
A. F. Dudley
7f12270939 bar-6cb: fix PV claimRef, namespace race, and PVC creation resilience
Three related fixes in the k8s deployer restart/up flow:

1. Clear stale claimRefs on Released PVs (_clear_released_pv_claim_refs):
   After namespace deletion, PVs survive in Released state with claimRefs
   pointing to deleted PVC UIDs. New PVCs can't bind until the stale
   claimRef is removed. Now clears them before PVC creation.

2. Wait for namespace termination (_wait_for_namespace_deletion):
   _ensure_namespace() now detects a terminating namespace and polls
   until deletion completes (up to 120s) before creating the new one.
   Replaces the racy 5s sleep in deployment restart.

3. Resilient PVC creation: wrap each PVC creation in error handling so
   one failure doesn't prevent subsequent PVCs from being attempted.
   All errors are collected and reported together.

Closes: bar-6cb, bar-31a, bar-fec

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 08:33:45 +00:00
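The claimRef cleanup in fix 1 boils down to selecting Released PVs whose claimRef still points at a deleted PVC; a sketch of that selection over plain dicts (the real code talks to the Kubernetes API; `stale_claim_ref_pvs` is a hypothetical name):

```python
def stale_claim_ref_pvs(pvs: list[dict]) -> list[str]:
    """Names of PVs stuck in Released with a claimRef left over from a
    deleted PVC; each needs spec.claimRef cleared before a new PVC of
    the same name can bind."""
    return [
        pv["metadata"]["name"]
        for pv in pvs
        if pv.get("status", {}).get("phase") == "Released"
        and pv.get("spec", {}).get("claimRef") is not None
    ]
```

The one-off shell equivalent is roughly `kubectl patch pv <name> -p '{"spec":{"claimRef":null}}'`.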
A. F. Dudley
03a5b5e39e Merge commit '19bb90f8148833ea7ff79cba312b048abc0d790b' as 'stack-orchestrator' 2026-03-10 08:08:04 +00:00
A. F. Dudley
12339ab46e pebbles: sync 2026-03-10 08:05:41 +00:00
A. F. Dudley
6464492009 fix: check-status.py smooth in-place redraw, remove comment bars
- Use \033[H\033[J (home + clear-to-end) instead of just \033[H to
  prevent stale lines from previous frames persisting when output
  shrinks between refreshes.
- Fix cursor restore on exit: was \033[?25l (hide) instead of
  \033[?25h (show), leaving terminal with invisible cursor.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 08:04:29 +00:00
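The escape-sequence fix amounts to composing each frame as cursor-home plus clear-to-end ahead of the new content (a minimal sketch; `render_frame` is a hypothetical helper):

```python
HOME_AND_CLEAR = "\033[H\033[J"  # cursor home, then clear to end of screen
SHOW_CURSOR = "\033[?25h"        # must be emitted on exit (not \033[?25l, which hides)

def render_frame(lines: list[str]) -> str:
    """One watch-mode frame: clearing from home to end of screen means a
    shorter frame cannot leave stale lines from the previous one."""
    return HOME_AND_CLEAR + "\n".join(lines)
```
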
A. F. Dudley
9009fb0363 fix: build.sh must be executable for laconic-so build-containers
Also fix --include filter: container name uses slash (laconicnetwork/agave)
not dash (laconicnetwork-agave). The old filter silently skipped the build.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 07:25:54 +00:00
A. F. Dudley
a76431a5dd fix: spec.yml snapshot settings — retain 1, enable incrementals
MAXIMUM_SNAPSHOTS_TO_RETAIN: 1 (was 5)
NO_INCREMENTAL_SNAPSHOTS: false (was true)
Removed SNAPSHOT_INTERVAL_SLOTS override (compose default 100000 is correct)

Spec.yml overrides compose defaults, so changing compose was ineffective.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 07:18:38 +00:00
A. F. Dudley
ceea8f0572 fix: restart playbook preserves SSH agent and clears stale PV claimRefs
Two fixes for biscayne-restart.yml:

1. ansible_become_flags: "-E" on the restart task preserves SSH_AUTH_SOCK
   through sudo so laconic-so can git pull the stack repo.

2. After restart, clear claimRef on any Released PVs. laconic-so restart
   deletes the namespace (cascading to PVCs) then recreates, but the PVs
   retain stale claimRefs that prevent new PVCs from binding.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 06:37:45 +00:00
A. F. Dudley
e143bb45c7 feat: add biscayne-restart.yml for graceful restart without cluster teardown
Uses laconic-so deployment restart (GitOps) to pick up new container
images and config. Gracefully stops the validator first (scale to 0,
wait for pod termination, verify no agave processes). Preserves the
kind cluster, all data volumes, and cluster state.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 06:21:46 +00:00
A. F. Dudley
0bbc3b5a64 Merge commit '481e9d239247c01604ed9e11160abc94e9dd9eb4' as 'agave-stack' 2026-03-10 06:21:15 +00:00
A. F. Dudley
481e9d2392 Squashed 'agave-stack/' content from commit 7100d11
git-subtree-dir: agave-stack
git-subtree-split: 7100d117421bd79fb52d3dfcd85b76cf18ed0ffa
2026-03-10 06:21:15 +00:00
A. F. Dudley
7c58809cc1 chore: remove scripts/agave-container before subtree add
Moving container scripts into agave-stack subtree (correct direction).
The source of truth will be agave-stack/ in this repo, pushed out to
LaconicNetwork/agave-stack via git subtree push.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 06:21:12 +00:00
A. F. Dudley
08380ec070 fix: Dockerfile includes ip_echo_preflight.py
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 06:08:22 +00:00
A. F. Dudley
61b7f6a236 feat: ip_echo preflight tool + relay post-mortem and checklist
ip_echo_preflight.py: reimplements Solana ip_echo client protocol in
Python. Verifies UDP port reachability before snapshot download, called
from entrypoint.py. Prevents wasting hours on a snapshot only to
crash-loop on port reachability.

docs/postmortem-ashburn-relay-outbound.md: root cause analysis of the
firewalld nftables FORWARD chain blocking outbound relay traffic.

docs/ashburn-relay-checklist.md: 7-layer verification checklist for
relay path debugging.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 05:54:23 +00:00
A. F. Dudley
68edcc60c7 fix: migrate ashburn relay playbook to firewalld + iptables coexistence
Firewalld zones/policies for forwarding (Docker bridge → gre-ashburn),
iptables for Docker-specific rules (DNAT, DOCKER-USER, mangle, SNAT).
Both coexist at different netfilter priorities.

See docs/postmortem-ashburn-relay-outbound.md for root cause analysis.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 05:54:08 +00:00
A. F. Dudley
3bf87a2e9b feat: snapshot leapfrog — auto-recovery when validator falls behind
Entrypoint changes:
- Always require full + incremental before starting (retry until found)
- Check incremental freshness against convergence threshold (500 slots)
- Gap monitor thread: if validator falls >5000 slots behind for 3
  consecutive checks, graceful stop + restart with fresh incremental
- cmd_serve is now a loop: download → run → monitor → leapfrog → repeat
- --no-snapshot-fetch moved to common args (both RPC and validator modes)
- --maximum-full-snapshots-to-retain default 1 (validator deletes
  downloaded full after generating its own)
- SNAPSHOT_MAX_AGE_SLOTS default 100000 (one full snapshot generation)

snapshot_download.py refactoring:
- Extract _discover_and_benchmark() and _rolling_incremental_download()
  as shared helpers
- Restore download_incremental_for_slot() using shared helpers (downloads
  only an incremental for an existing full snapshot)
- download_best_snapshot() uses shared helpers, downloads full then
  incremental as separate operations

The leapfrog cycle: validator generates full snapshots at standard 100k
block height intervals (same slots as the rest of the network). When the
gap monitor triggers, the entrypoint loops back to maybe_download_snapshot
which finds the validator's local full, downloads a fresh network
incremental (generated every ~40s, converges within the ~11hr full
generation window), and restarts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 05:53:56 +00:00
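The gap monitor's trigger condition can be sketched as a small counter (the 5000-slot and 3-check thresholds are from the commit; the class and method names are illustrative):

```python
GAP_SLOTS = 5000     # leapfrog threshold from the commit message
TRIGGER_CHECKS = 3   # consecutive over-threshold checks required

class GapMonitor:
    """Count consecutive checks where the validator trails head by more
    than GAP_SLOTS; a single in-range check resets the streak."""

    def __init__(self) -> None:
        self.behind_count = 0

    def should_leapfrog(self, head_slot: int, validator_slot: int) -> bool:
        if head_slot - validator_slot > GAP_SLOTS:
            self.behind_count += 1
        else:
            self.behind_count = 0
        return self.behind_count >= TRIGGER_CHECKS
```
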
A. F. Dudley
cd36bfe5ee fix: check-status.py smooth in-place redraw, remove comment bars
- Overwrite lines in place instead of clear+redraw (no flicker)
- Pad lines to terminal width to clear stale characters
- Blank leftover rows when output shrinks between frames
- Hide cursor during watch mode
- Remove section comment bars
- Replace unicode checkmarks with +/x

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 01:00:36 +00:00
A. F. Dudley
e597968708 fix: recovery playbook fixes grafana PV ownership before scale-up
laconic-so creates PV hostPath dirs as root. Grafana runs as UID 472
and crashes on startup because it can't write to /var/lib/grafana.
Fix ownership inside the kind node before scaling the deployment up.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 00:57:36 +00:00
A. F. Dudley
ddbcd1a97c fix: migration playbook stops docker first, skips stale data copy
- biscayne-migrate-storage.yml: stop docker to release bind mounts
  before destroying zvol, no data copy (stale, fresh snapshot needed),
  handle partially-migrated state, restart docker at end
- biscayne-upgrade-zfs.yml: use add-apt-repository CLI (module times
  out), fix libzfs package name (libzfs4linux, not libzfs5linux), allow
  apt update warnings from stale influxdata GPG key

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 00:48:37 +00:00
A. F. Dudley
b88af2be70 feat: graceful shutdown, ZFS upgrade, storage migration, sync-tools build
- entrypoint.py: Python stays PID 1, traps SIGTERM, requests graceful exit
  via admin RPC (agave-validator exit --force) before falling back to signals
- snapshot_download.py: fix break-on-failure bug in incremental download loop
  (continue + re-probe instead of giving up)
- biscayne-upgrade-zfs.yml: upgrade ZFS 2.2.2 → 2.2.9 via arter97/zfs-lts
  PPA to fix io_uring deadlock at kernel module level
- biscayne-migrate-storage.yml: one-time migration from zvol/XFS to ZFS
  dataset (zvol workaround no longer needed with graceful shutdown + ZFS fix)
- biscayne-stop.yml: patch terminationGracePeriodSeconds to 300 before
  scaling to 0, updated docs for admin RPC shutdown
- biscayne-sync-tools.yml: fix SSH agent forwarding (vars: ansible_become),
  add --tags build-container support, add set -e to shell blocks
- biscayne-recover.yml: updated for graceful shutdown awareness
- check-status.py: add --pane flag for tmux, clean redraw in watch mode
- CLAUDE.md: update docs for ZFS dataset storage, graceful shutdown

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 07:58:37 +00:00
A. F. Dudley
173b807451 fix: check-status.py discovers cluster-id from deployment.yml
Instead of hardcoding the laconic cluster ID, namespace, deployment
name, and pod label, read cluster-id from deployment.yml on biscayne
and derive everything from it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 06:48:19 +00:00
A. F. Dudley
ed6f6bfd59 fix: check-status.py pod label selector matches actual k8s labels
The pod label is app=laconic-70ce4c4b47e23b85, not
app=laconic-70ce4c4b47e23b85-deployment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 06:46:17 +00:00
A. F. Dudley
09728a719c fix: recovery playbook is fire-and-forget, add check-status.py
The recovery playbook now exits after scaling to 1. The container
entrypoint handles snapshot download (60+ min) and validator startup
autonomously. Removed all polling/verification steps that would
time out waiting.

Added scripts/check-status.py for monitoring download progress,
validator slot, gap to mainnet, catch-up rate, and ramdisk usage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 06:39:25 +00:00
A. F. Dudley
3dc345ea7d fix: recovery playbook delegates snapshot download to container entrypoint
The container's entrypoint.py already handles snapshot freshness checks,
cleanup, download (with rolling incremental convergence), and validator
startup. Remove the host-side download and let the container do the work.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 06:28:01 +00:00
A. F. Dudley
f842aba56a fix: sync-tools playbook uses agent forwarding, not socket hunting
- Add become: false to git tasks so SSH_AUTH_SOCK survives (sudo drops it)
- Fetch explicit branch names instead of bare `git fetch origin`
- Remove the fragile `Find SSH agent socket` workaround

Requires ForwardAgent yes in SSH config (added to ~/.ssh/config).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 06:20:16 +00:00
A. F. Dudley
601f520a45 fix: add 30-min wall-clock timeout to incremental convergence loop
Without a bound, the loop runs forever if sources never serve an
incremental close enough to head (e.g. full snapshot base slot is
too old). After 30 minutes, proceed with the best incremental
available or none.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 06:11:19 +00:00
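The bounded loop reads roughly like this (a sketch: `probe` and `download` stand in for the real snapshot helpers, and the injectable `clock`/`sleep` parameters are for illustration and testing, not from the source):

```python
import time

def converge_incremental(probe, download, convergence_slots=500,
                         timeout_s=30 * 60,
                         clock=time.monotonic, sleep=time.sleep) -> bool:
    """Keep downloading fresher incrementals until one lands within
    convergence_slots of head, or the wall clock runs out; on timeout
    the caller proceeds with the best incremental found, or none."""
    deadline = clock() + timeout_s
    while clock() < deadline:
        head_slot, inc_slot = probe()   # best incremental across sources
        if inc_slot is not None:
            download(inc_slot)          # replaces any previous incremental
            if head_slot - inc_slot <= convergence_slots:
                return True
        sleep(10)
    return False
```
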
A. F. Dudley
bfde58431e feat: rolling incremental snapshot download loop
After the full snapshot downloads, continuously re-probe all fast sources
for newer incrementals until the best available is within convergence_slots
(default 500) of head. Each iteration finds the highest-slot incremental
matching our full snapshot's base slot, downloads it (replacing any previous),
and checks the gap to mainnet head.

- Extract probe_incremental() from inline re-probe code
- Add convergence_slots param to download_best_snapshot() (default 500)
- Add --convergence-slots CLI arg
- Pass SNAPSHOT_CONVERGENCE_SLOTS env var from entrypoint.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 05:33:47 +00:00
A. F. Dudley
bd38c1b791 fix: remove Ansible snapshot download, add sync-tools playbook
The container entrypoint (entrypoint.py) handles snapshot download
internally via aria2c. Ansible no longer needs to scale-to-0, download,
scale-to-1 — it just deploys and lets the container manage startup.

- biscayne-redeploy.yml: remove snapshot download section, simplify to
  teardown → wipe → deploy → verify
- biscayne-sync-tools.yml: new playbook to sync laconic-so and
  agave-stack repos on biscayne, with separate branch controls
- snapshot_download.py: re-probe for fresh incremental after full
  snapshot download completes (old incremental is stale by then)
- Switch laconic_so_branch to fix/kind-mount-propagation (has
  hostNetwork translation code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 05:14:43 +00:00
A. F. Dudley
3574e387cc fix: update playbooks to use subtree path for snapshot_download.py
scripts/agave-container/ is a git subtree of agave-stack's container-build
directory. Replaces fragile cross-repo symlink with proper subtree.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 19:13:53 +00:00
A. F. Dudley
25952b4fa7 Merge commit 'f4b3a46109a8da00fdd68d8999160ddc45dcc88a' as 'scripts/agave-container' 2026-03-08 19:13:38 +00:00
A. F. Dudley
f4b3a46109 Squashed 'scripts/agave-container/' content from commit 4b5c875
git-subtree-dir: scripts/agave-container
git-subtree-split: 4b5c875a05cbbfbde38eeb053fd5443a8a50228c
2026-03-08 19:13:38 +00:00
A. F. Dudley
ba015bf3b1 chore: remove snapshot-download.py symlink (replacing with subtree)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 19:13:34 +00:00
A. F. Dudley
078872d78d feat: add iptables playbook, symlink snapshot-download.py to agave-stack
- playbooks/biscayne-iptables.yml: manages PREROUTING DNAT and DOCKER-USER
  rules for both host IP (186.233.184.235) and relay loopback (137.239.194.65).
  Idempotent, persists via netfilter-persistent.
- scripts/snapshot-download.py: replaced standalone copy with symlink to
  agave-stack source of truth, eliminating duplication.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 19:11:24 +00:00
A. F. Dudley
ec12e6079b fix: redeploy wipe uses umount+remount instead of rm -rf
Remounting tmpfs is instant (kernel frees pages), while rm -rf on 400GB+
of accounts files traverses every inode. Recover playbook keeps rm -rf
because the kind node's bind mount prevents umount while the container
is running.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 18:45:44 +00:00
A. F. Dudley
b2342bc539 fix: switch ramdisk from /dev/ram0 to tmpfs, refactor snapshot-download.py
The /dev/ram0 + XFS + format-ramdisk.service approach was unnecessary
complexity from a migration confusion — there was no actual tmpfs bug
with io_uring. tmpfs is simpler (no format-on-boot), resizable on the
fly, and what every other Solana operator uses.

Changes:
- prepare-agave: remove format-ramdisk.service and ramdisk-accounts.service,
  use tmpfs fstab entry with size=1024G (was 600G /dev/ram0, too small)
- recover: remove ramdisk_device var (no longer needed)
- redeploy: wipe accounts by rm -rf instead of umount+mkfs
- snapshot-download.py: extract download_best_snapshot() public API for
  use by the new container entrypoint.py (in agave-stack)
- CLAUDE.md: update ramdisk docs, fix /srv/solana → /srv/kind/solana paths
- health-check: fix ramdisk path references

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 18:43:41 +00:00
A. F. Dudley
591d158e1f chore: populate pebbles with known bugs and feature requests
Issues:
- bar-a3b [P0] agave-validator crash after ~57 seconds
- bar-41a [P1] telegraf volume mounts missing from pod spec
- bar-02e [P1] zvol mount bug (closed — fixed 2026-03-08)
- bar-b04 [P2] update redeploy to use deployment prepare
- bar-b41 [P2] snapshot leapfrog recovery playbook
- bar-0b4 [P3] prepare-agave unconditionally imports relay playbook

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 06:59:07 +00:00
A. F. Dudley
974eed0c73 feat: add deployment prepare command (so-076.1)
Refactors K8sDeployer.up() into three composable methods:
- _setup_cluster_and_namespace(): kind cluster, API, namespace, ingress
- _create_infrastructure(): PVs, PVCs, ConfigMaps, Services, NodePorts
- _create_deployment(): Deployment resource (pods)

`prepare` calls the first two only — creates all cluster infrastructure
without starting pods. This eliminates the scale-to-0 workaround where
operators had to run `deployment start` then immediately scale down.

Usage: laconic-so deployment --dir <dir> prepare

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 06:56:34 +00:00
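The split can be sketched like this (the three private method names are from the commit; the stub bodies merely record calls for illustration):

```python
class K8sDeployer:
    """Sketch of the refactor: up() composes all three phases, while
    prepare() stops short of creating pods."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def _setup_cluster_and_namespace(self) -> None:
        self.calls.append("cluster")   # kind cluster, API, namespace, ingress

    def _create_infrastructure(self) -> None:
        self.calls.append("infra")     # PVs, PVCs, ConfigMaps, Services

    def _create_deployment(self) -> None:
        self.calls.append("pods")      # the Deployment resource itself

    def up(self) -> None:
        self._setup_cluster_and_namespace()
        self._create_infrastructure()
        self._create_deployment()

    def prepare(self) -> None:
        # All infrastructure, no pods: replaces the scale-to-0 workaround.
        self._setup_cluster_and_namespace()
        self._create_infrastructure()
```
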
A. F. Dudley
9c5b8e3f4e chore: initialize pebbles issue tracker
Track stack-orchestrator work items with pebbles (append-only event log).

Epic so-076: Stack composition — deploy multiple stacks into one kind cluster
with independent lifecycle management per sub-stack.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 06:56:25 +00:00
A. F. Dudley
63735a9830 fix: revert snapshot_dir, add laconic_so_branch, move kind ramdisk check
- Revert snapshot_dir to /srv/solana/snapshots — aria2c runs on the host
  where this is the direct zvol mount (always available), unlike
  /srv/kind/solana/snapshots which depends on the bind mount
- Add laconic_so_branch variable (default: main) and use it in both
  git reset commands so the branch can be overridden via -e
- Move "Verify ramdisk visible inside kind node" from preflight to after
  "Wait for deployment to exist" — the kind container may not exist
  during preflight after teardown

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 04:42:11 +00:00
A. F. Dudley
14f423ea0c fix(k8s): read existing resourceVersion/clusterIP before replace
K8s PUT (replace) operations require metadata.resourceVersion for
optimistic concurrency control. Services additionally have immutable
spec.clusterIP that must be preserved from the existing object.

On 409 conflict, all _ensure_* methods now read the existing resource
first and copy resourceVersion (and clusterIP for Services) into the
body before calling replace.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 04:32:20 +00:00
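The read-before-replace flow is the same shape for every resource type; a minimal sketch over dict bodies (the `ApiException` here is a local stand-in for the Kubernetes client's exception, carrying only a status code, and `create_or_replace` is a hypothetical name):

```python
class ApiException(Exception):
    """Stand-in for kubernetes.client.rest.ApiException."""
    def __init__(self, status: int) -> None:
        super().__init__(status)
        self.status = status

def create_or_replace(create, read, replace, body, is_service=False):
    """Try create first; on 409 Conflict, copy resourceVersion (and the
    immutable clusterIP for Services) from the live object, then PUT."""
    try:
        return create(body)
    except ApiException as e:
        if e.status != 409:
            raise
        existing = read()
        body["metadata"]["resourceVersion"] = existing["metadata"]["resourceVersion"]
        if is_service:
            body["spec"]["clusterIP"] = existing["spec"]["clusterIP"]
        return replace(body)
```
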
A. F. Dudley
fe935037f7 fix: add laconic-so update step, downgrade unified mount check to warning
- Add laconic_so_repo variable (/home/rix/stack-orchestrator) and a
  git pull task before deployment start — the editable install must be
  current or stale code causes deploy failures
- Downgrade unified mount root check from fatal assertion to debug
  warning — the mount style depends on which laconic-so version is
  deployed, and individual PV mounts (/mnt/validator-*) work fine

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 04:32:20 +00:00
A. F. Dudley
1da69cf739 fix(k8s): make deploy_k8s.py idempotent with create-or-replace semantics
All K8s resource creation in deploy_k8s.py now uses try-create, catch
ApiException(409), then replace — matching the pattern already used for
secrets in deployment_create.py. This allows `deployment start` to be
safely re-run without 409 Conflict errors.

Resources made idempotent:
- Deployment (create_namespaced_deployment → replace on 409)
- Service (create_namespaced_service → replace on 409)
- Ingress (create_namespaced_ingress → replace on 409)
- NodePort services (same as Service)
- ConfigMap (create_namespaced_config_map → replace on 409)
- PV/PVC: bare `except: pass` replaced with explicit ApiException
  catch for 404

Extracted _ensure_deployment(), _ensure_service(), _ensure_ingress(),
and _ensure_config_map() helpers to keep cyclomatic complexity in check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 04:15:03 +00:00
A. F. Dudley
ad68d505ae fix: redeploy playbook paths, tags, and idempotency
- Fix snapshot_dir: /srv/solana/snapshots → /srv/kind/solana/snapshots
  (kind node reads from the bind mount, not the zvol mount directly)
- Fix kind-internal paths: /mnt/solana/... → /mnt/validator-... to match
  actual PV hostPath layout (individual mounts, not unified)
- Add 'scale-up' tag to "Scale validator to 1" task for partial recovery
  (--tags snapshot,scale-up,verify resumes without re-running deploy)
- Make 'Start deployment' idempotent: failed_when: false + follow-up
  check so existing deployment doesn't fail the play

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 04:14:05 +00:00
A. F. Dudley
05f9acf8a0 fix: DOCKER-USER rules for inbound relay, add UDP test playbooks
Root cause: Docker FORWARD chain policy DROP blocked all DNAT'd relay
traffic (UDP/TCP 8001, UDP 9000-9025) to the kind node. The DOCKER
chain only ACCEPTs specific TCP ports (6443, 443, 80). Added ACCEPT
rules in DOCKER-USER chain which runs before all Docker chains.

Changes:
- ashburn-relay-biscayne.yml: add DOCKER-USER ACCEPT rules (inbound
  tag) and rollback cleanup
- ashburn-relay-setup.sh.j2: persist DOCKER-USER rules across reboot
- relay-inbound-udp-test.yml: controlled e2e test — listener in kind
  netns, sender from kelce, assert arrival
- relay-link-test.yml: link-by-link tcpdump captures at each hop
- relay-test-udp-listen.py, relay-test-udp-send.py: test helpers
- relay-test-ip-echo.py: full ip_echo protocol test
- inventory/kelce.yml, inventory/panic.yml: test host inventories
- test-ashburn-relay.sh: add ip_echo UDP reachability test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 02:43:31 +00:00
A. F. Dudley
cc6acd5f09 fix: default skip-cluster-management to true
Destroying the kind cluster on stop/start is almost never the intent.
The cluster holds PVs, ConfigMaps, and networking state that are
expensive to recreate. Default to preserving the cluster; pass
--perform-cluster-management explicitly when a full teardown is needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 02:41:25 +00:00
A. F. Dudley
806c1bb723 refactor: rename deployment update to deployment update-envs
The update command only patches environment variables and adds a
restart annotation. It does not update ports, volumes, configmaps,
or any other deployment spec. The old name was misleading — it
implied a full spec update, causing operators to expect changes
that never took effect.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 02:33:20 +00:00
A. F. Dudley
496c7982cb feat: end-to-end relay test scripts
Three Python scripts send real packets from the kind node through the
full relay path (biscayne → tunnel → mia-sw01 → was-sw01 → internet)
and verify responses come back via the inbound path. No indirect
counter-checking — a response proves both directions work.

- relay-test-udp.py: DNS query with sport 8001
- relay-test-tcp-sport.py: HTTP request with sport 8001
- relay-test-tcp-dport.py: TCP connect to entrypoint dport 8001 (ip_echo)
- test-ashburn-relay.sh: orchestrates from ansible controller via nsenter

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 00:43:06 +00:00
A. F. Dudley
8eac9cc87f docs: document DoubleZero agent managed config on both switches
Inventories what the DZ agent controls (tunnels, ACLs, VRFs, BGP,
route-maps, loopbacks) so we don't accidentally modify objects that
the agent will silently overwrite. Includes a "safe to modify" section
listing our own relay infrastructure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:45:36 +00:00
A. F. Dudley
b82d66eeff fix: VRF isolation for mia-sw01 relay, TCP dport mangle for ip_echo
mia-sw01: Replace PBR-based outbound routing with VRF isolation.
TCAM profile tunnel-interface-acl doesn't support PBR or traffic-policy
on tunnel interfaces. Tunnel100 now lives in VRF "relay" whose default
route sends decapsulated traffic to was-sw01 via backbone, avoiding
BCP38 drops on the ISP uplink for src 137.239.194.65.

biscayne: Add TCP dport mangle rule for ip_echo (port 8001). Without it,
outbound ip_echo probes use biscayne's real IP instead of the Ashburn
relay IP, causing entrypoints to probe the wrong address. Also fix
loopback IP idempotency (handle "already assigned" error).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:31:18 +00:00
A. F. Dudley
a02534fc11 chore: add containerlab topologies for relay testing
Ashburn relay and shred relay lab configs for local end-to-end
testing with cEOS. No secrets — only public IPs and test scripts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 22:30:03 +00:00
A. F. Dudley
9cbc115295 fix: inventory layering — playbooks use hosts:all, cross-inventory uses explicit hosts
Normal playbooks should never hardcode hostnames — that's an inventory
concern. Changed all playbooks to hosts:all. The one exception is
ashburn-relay-check.yml which legitimately spans both inventories
(switches + biscayne) and uses explicit hostnames.

Also adds:
- ashburn-relay-check.yml: full-path relay diagnostics (switches + host)
- biscayne-start.yml: start kind container and scale validator to 1
- ashburn-relay-setup.sh.j2: boot persistence script for relay state
- Direct device mounts replacing rbind (ZFS shared propagation fix)
- systemd service replacing broken if-up.d/netfilter-persistent
- PV mount path corrections (/mnt/validator-* not /mnt/solana/*)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 22:28:21 +00:00
A. F. Dudley
7f205732f2 fix(k8s): expand etcd cleanup whitelist to preserve core cluster services
_clean_etcd_keeping_certs() only preserved /registry/secrets/caddy-system,
deleting everything else including the kubernetes ClusterIP service in the
default namespace. When kind recreated the cluster with the cleaned etcd,
kube-apiserver saw existing data and skipped bootstrapping the service.
kindnet panicked on KUBERNETES_SERVICE_HOST missing, blocking all pod
networking.

Expand the whitelist to also preserve:
- /registry/services/specs/default/kubernetes
- /registry/services/endpoints/default/kubernetes

Loop over multiple prefixes instead of a single etcdctl get --prefix call.

See docs/bug-laconic-so-etcd-cleanup.md in biscayne-agave-runbook.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 17:56:13 +00:00
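The whitelist filter itself is a simple prefix check; a sketch of the selection (the prefixes are from the commit, the function name is hypothetical):

```python
PRESERVE_PREFIXES = [
    "/registry/secrets/caddy-system",
    "/registry/services/specs/default/kubernetes",
    "/registry/services/endpoints/default/kubernetes",
]

def keys_to_delete(all_keys: list[str]) -> list[str]:
    """Everything outside the preserved prefixes gets cleaned from etcd;
    the real code gathers survivors with one etcdctl get --prefix call
    per entry, since a single call cannot express an OR of prefixes."""
    return [
        key for key in all_keys
        if not any(key.startswith(p) for p in PRESERVE_PREFIXES)
    ]
```
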
A. F. Dudley
14c0f63775 feat: layer 4 invariants, mount checks, and deployment layer docs
- Rename biscayne-boot.yml → biscayne-prepare-agave.yml (layer 4)
- Document deployment layers and layer 4 invariants in playbook header
- Add zvol, ramdisk, rbind fstab management with stale entry cleanup
- Add kind node XFS verification (reads cluster-id from deployment)
- Add mount checks to health-check.yml (host mounts, kind visibility, propagation)
- Fix health-check discovery tasks with tags: [always] and non-fatal pod lookup
- Fix biscayne-redeploy.yml shell tasks missing executable: /bin/bash
- Add ansible_python_interpreter to inventory
- Update CLAUDE.md with deployment layers table and mount propagation notes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 13:08:04 +00:00
A. F. Dudley
a11d40f2f3 fix(k8s): add HostToContainer mount propagation to kind extraMounts
Without propagation, rbind submounts on the host (e.g., XFS zvol at
/srv/kind/solana) are invisible inside the kind node — it sees the
underlying filesystem (ZFS) instead. This causes agave's io_uring to
deadlock on ZFS transaction commits (D-state in dsl_dir_tempreserve_space).

HostToContainer propagation ensures host submounts propagate into the
kind node, so /mnt/solana correctly resolves to the XFS zvol.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 13:07:12 +00:00
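In kind's cluster config this corresponds to a `propagation` field on the extraMount (a fragment under the v1alpha4 schema; the host and container paths are taken from the commit):

```yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    extraMounts:
      - hostPath: /srv/kind/solana
        containerPath: /mnt/solana
        propagation: HostToContainer  # host submounts propagate into the node
```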
A. F. Dudley
b40883ef65 fix: separate switch inventory to prevent accidental targeting
Move switches.yml to inventory-switches/ so ansible.cfg's
`inventory = inventory/` only loads biscayne. Switch playbooks
must pass `-i inventory-switches/` explicitly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 10:56:48 +00:00
A. F. Dudley
4f452db6fe fix: ansible-lint production profile compliance for all playbooks
- FQCN for all modules (ansible.builtin.*)
- changed_when/failed_when on all command/shell tasks
- set -o pipefail on all shell tasks
- Add KUBECONFIG environment to health-check.yml

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 10:52:40 +00:00
A. F. Dudley
eae4c3cdff feat(k8s): per-service resource layering in deployer
Resolve container resources using layered priority:
1. spec.yml per-container override (resources.containers.<name>)
2. Compose file deploy.resources block
3. spec.yml global resources
4. DEFAULT_CONTAINER_RESOURCES fallback

This prevents monitoring sidecars from inheriting the validator's
resource requests (e.g., 256G memory). Each service gets appropriate
resources from its compose definition unless explicitly overridden.

Note: existing deployments with a global resources block in spec.yml
can remove it once compose files declare per-service defaults.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 10:26:10 +00:00
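The priority chain can be sketched as a first-non-empty lookup (function and parameter names are illustrative; only `DEFAULT_CONTAINER_RESOURCES` is named in the commit, and the fallback values here are made up):

```python
DEFAULT_CONTAINER_RESOURCES = {"memory": "1G", "cpus": "1"}  # illustrative values

def resolve_resources(name, spec_per_container, compose_deploy, spec_global):
    """First non-empty layer wins: per-container spec.yml override, the
    service's compose deploy.resources block, global spec.yml resources,
    then the built-in fallback."""
    for layer in (spec_per_container.get(name),
                  compose_deploy.get(name),
                  spec_global):
        if layer:
            return layer
    return DEFAULT_CONTAINER_RESOURCES
```

This is what keeps a monitoring sidecar from inheriting the validator's 256G request: its own compose block wins before the global spec.yml layer is consulted.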
A. F. Dudley
d36a71f13d fix: redeploy playbook handles SSH agent, git pull, config regen, stale PVs
- ansible.cfg: enable SSH agent forwarding for git operations
- biscayne-redeploy.yml: add git pull, deploy create --update, and
  clear stale PV claimRefs after namespace deletion

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 09:58:29 +00:00
A. F. Dudley
8a8b882e32 bug: deploy create doesn't auto-generate volume mappings for new pods
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 09:56:28 +00:00
A. F. Dudley
9f6e1b5da7 fix: remove auto-revert timer, use checkpoint + write memory instead
Config is committed to running-config immediately (no 5-min timer).
Safety net is the checkpoint (rollback) and the fact that startup-config
is only written with -e commit=true. A reboot reverts uncommitted changes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 01:49:25 +00:00
A. F. Dudley
742e84e3b0 feat: dedicated GRE tunnel (Tunnel100) bypassing DZ-managed Tunnel500
Root cause: the doublezero-agent on mia-sw01 manages Tunnel500's ACL
(SEC-USER-500-IN) and drops outbound gossip with src 137.239.194.65.
The agent overwrites any custom ACL entries.

Fix: create a separate GRE tunnel (Tunnel100) using mia-sw01's free
LAN IP (209.42.167.137) as tunnel source. This tunnel goes over the
ISP uplink, completely independent of the DZ overlay:
- mia-sw01: Tunnel100 src 209.42.167.137, dst 186.233.184.235
- biscayne: gre-ashburn src 186.233.184.235, dst 209.42.167.137
- Link addresses: 169.254.100.0/31

Playbook changes:
- ashburn-relay-mia-sw01: Tunnel100 + Loopback101 + SEC-VALIDATOR-100-IN
- ashburn-relay-biscayne: gre-ashburn tunnel + updated policy routing
- New template: ashburn-routing-ifup.sh.j2 for boot persistence

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 01:47:58 +00:00
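On the biscayne (Linux) side, the gre-ashburn tunnel described above corresponds to iproute2 commands along these lines (a sketch using standard `ip tunnel`/`ip addr`/`ip link` invocations; which end of the 169.254.100.0/31 pair biscayne takes, and the ttl value, are assumptions):

```python
def gre_tunnel_cmds(name: str, local: str, remote: str, link_addr: str) -> list[list[str]]:
    # Plain GRE over the ISP uplink: local/remote are the outer (public)
    # tunnel endpoints; link_addr is this end of the /31 point-to-point pair.
    return [
        ["ip", "tunnel", "add", name, "mode", "gre",
         "local", local, "remote", remote, "ttl", "255"],
        ["ip", "addr", "add", link_addr, "dev", name],
        ["ip", "link", "set", name, "up"],
    ]


# biscayne end per the commit: src 186.233.184.235, dst 209.42.167.137
cmds = gre_tunnel_cmds("gre-ashburn", "186.233.184.235", "209.42.167.137", "169.254.100.1/31")
for cmd in cmds:
    print(" ".join(cmd))
```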
A. F. Dudley
0b52fc99d7 fix: ashburn relay playbooks and document DZ tunnel ACL root cause
Playbook fixes from testing:
- ashburn-relay-biscayne: insert DNAT rules at position 1 before
  Docker's ADDRTYPE LOCAL rule (was being swallowed at position 3+)
- ashburn-relay-mia-sw01: add inbound route for 137.239.194.65 via
  egress-vrf vrf1 (nexthop only, no interface — EOS silently drops
  cross-VRF routes that specify a tunnel interface)
- ashburn-relay-was-sw01: replace PBR with static route, remove
  Loopback101

Bug doc (bug-ashburn-tunnel-port-filtering.md): root cause is the
DoubleZero agent on mia-sw01 overwrites SEC-USER-500-IN ACL, dropping
outbound gossip with src 137.239.194.65. The DZ agent controls
Tunnel500's lifecycle. Fix requires a separate GRE tunnel using
mia-sw01's free LAN IP (209.42.167.137) to bypass DZ infrastructure.

Also adds all repo docs, scripts, inventory, and remaining playbooks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 01:44:25 +00:00
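The DNAT positioning fix in the biscayne playbook can be sketched as below (the kind-node target IP and the helper names are illustrative; only the public address 137.239.194.65 comes from the commit):

```python
import subprocess


def dnat_rule(public_ip: str, target_ip: str) -> list[str]:
    # "-I PREROUTING 1" inserts at the top of the nat PREROUTING chain, so
    # this rule matches before Docker's "ADDRTYPE match dst-type LOCAL"
    # jump; appended at position 3+ the DNAT rule was never reached.
    return [
        "iptables", "-t", "nat", "-I", "PREROUTING", "1",
        "-d", f"{public_ip}/32",
        "-j", "DNAT", "--to-destination", target_ip,
    ]


def apply_rule(rule: list[str], dry_run: bool = True) -> None:
    # dry_run prints the command instead of mutating the host firewall.
    if dry_run:
        print(" ".join(rule))
    else:
        subprocess.run(rule, check=True)


apply_rule(dnat_rule("137.239.194.65", "172.18.0.2"))  # 172.18.0.2: placeholder kind node IP
```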
A. F. Dudley
6841d5e3c3 feat: ashburn validator relay playbooks
Three playbooks for routing all validator traffic through 137.239.194.65:

- was-sw01: Loopback101 + PBR redirect on Et1/1 (already applied/committed)
  Will be simplified to a static route in next iteration.

- mia-sw01: ACL permit for src 137.239.194.65 on Tunnel500 + default route
  in vrf1 via egress-vrf default to was-sw01 backbone. No PBR needed —
  per-tunnel ACLs already scope what enters vrf1.

- biscayne: DNAT inbound (137.239.194.65 → kind node), SNAT + policy
  routing outbound (validator sport 8001,9000-9025 → doublezero0 GRE).
  Inbound already applied.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 21:08:48 +00:00
A. F. Dudley
dd29257dd8 chore: snapshot mia-sw01 and was-sw01 running configs
Captured via ansible `show running-config` before applying
mia-sw01 outbound validator redirect changes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 20:45:32 +00:00
58 changed files with 3277 additions and 1323 deletions

View File

@ -1,2 +1,2 @@
Change this file to trigger running the test-database CI job
Change this file to trigger running the test-database CI job
Trigger test run

.gitignore vendored (1 changed line)
View File

@ -8,3 +8,4 @@ __pycache__
package
stack_orchestrator/data/build_tag.txt
/build
.worktrees

.pebbles/config.json Normal file (3 changed lines)
View File

@ -0,0 +1,3 @@
{
"prefix": "so"
}

.pebbles/events.jsonl Normal file (15 changed lines)
View File

@ -0,0 +1,15 @@
{"type":"create","timestamp":"2026-03-08T06:56:07.080584539Z","issue_id":"so-076","payload":{"description":"Currently laconic-so maps one stack to one deployment to one pod. All containers\nin a stack's compose files become containers in a single k8s pod. This means:\n\n- Can't upgrade doublezero without restarting agave-validator\n- Can't restart monitoring without disrupting the validator\n- Can't independently scale or lifecycle-manage components\n\nThe fix is stack composition. A meta-stack (e.g. biscayne-stack) composes\nsub-stacks (agave, doublezero, agave-monitoring), each becoming its own\nk8s Deployment with independent lifecycle.","priority":"2","title":"Stack composition: deploy multiple stacks into one kind cluster","type":"epic"}}
{"type":"create","timestamp":"2026-03-08T06:56:07.551986919Z","issue_id":"so-ab0","payload":{"description":"Add laconic-so deployment prepare that creates cluster infrastructure without pods. Already implemented, needs review.","priority":"2","title":"deployment prepare command","type":"task"}}
{"type":"create","timestamp":"2026-03-08T06:56:07.884418759Z","issue_id":"so-04f","payload":{"description":"deployment stop on ANY deployment deletes the shared kind cluster. Should only delete its own namespace.","priority":"2","title":"deployment stop should not destroy shared cluster","type":"bug"}}
{"type":"create","timestamp":"2026-03-08T06:56:08.253520249Z","issue_id":"so-370","payload":{"description":"Allow stack.yml to reference sub-stacks. Each sub-stack becomes its own k8s Deployment sharing namespace and PVs.","priority":"2","title":"Add stacks: field to stack.yml for composition","type":"task"}}
{"type":"create","timestamp":"2026-03-08T06:56:08.646764337Z","issue_id":"so-f7c","payload":{"description":"Create three independent stacks from the monolithic agave-stack. Each gets its own compose file and independent lifecycle.","priority":"2","title":"Split agave-stack into agave + doublezero + monitoring","type":"task"}}
{"type":"rename","timestamp":"2026-03-08T06:56:14.499990161Z","issue_id":"so-ab0","payload":{"new_id":"so-076.1"}}
{"type":"dep_add","timestamp":"2026-03-08T06:56:14.499992031Z","issue_id":"so-076.1","payload":{"dep_type":"parent-child","depends_on":"so-076"}}
{"type":"rename","timestamp":"2026-03-08T06:56:14.786407752Z","issue_id":"so-04f","payload":{"new_id":"so-076.2"}}
{"type":"dep_add","timestamp":"2026-03-08T06:56:14.786409842Z","issue_id":"so-076.2","payload":{"dep_type":"parent-child","depends_on":"so-076"}}
{"type":"rename","timestamp":"2026-03-08T06:56:15.058959714Z","issue_id":"so-370","payload":{"new_id":"so-076.3"}}
{"type":"dep_add","timestamp":"2026-03-08T06:56:15.058961364Z","issue_id":"so-076.3","payload":{"dep_type":"parent-child","depends_on":"so-076"}}
{"type":"rename","timestamp":"2026-03-08T06:56:15.410080785Z","issue_id":"so-f7c","payload":{"new_id":"so-076.4"}}
{"type":"dep_add","timestamp":"2026-03-08T06:56:15.410082305Z","issue_id":"so-076.4","payload":{"dep_type":"parent-child","depends_on":"so-076"}}
{"type":"dep_add","timestamp":"2026-03-08T06:56:16.313585082Z","issue_id":"so-076.3","payload":{"dep_type":"blocks","depends_on":"so-076.2"}}
{"type":"dep_add","timestamp":"2026-03-08T06:56:16.567629422Z","issue_id":"so-076.4","payload":{"dep_type":"blocks","depends_on":"so-076.3"}}

View File

@ -25,7 +25,7 @@ dependencies = [
"click>=8.1.6",
"PyYAML>=6.0.1",
"ruamel.yaml>=0.17.32",
"pydantic==1.10.9",
"pydantic==1.10.13",
"tomli==2.0.1",
"validators==0.22.0",
"kubernetes>=28.1.0",

View File

@ -6,7 +6,7 @@ python-on-whales>=0.64.0
click>=8.1.6
PyYAML>=6.0.1
ruamel.yaml>=0.17.32
pydantic==1.10.9
pydantic==1.10.13
tomli==2.0.1
validators==0.22.0
kubernetes>=28.1.0

View File

@ -1,12 +1,12 @@
# See
# https://medium.com/nerd-for-tech/how-to-build-and-distribute-a-cli-tool-with-python-537ae41d9d78
from setuptools import setup, find_packages
from setuptools import find_packages, setup
with open("README.md", "r", encoding="utf-8") as fh:
with open("README.md", encoding="utf-8") as fh:
long_description = fh.read()
with open("requirements.txt", "r", encoding="utf-8") as fh:
with open("requirements.txt", encoding="utf-8") as fh:
requirements = fh.read()
with open("stack_orchestrator/data/version.txt", "r", encoding="utf-8") as fh:
with open("stack_orchestrator/data/version.txt", encoding="utf-8") as fh:
version = fh.readlines()[-1].strip(" \n")
setup(
name="laconic-stack-orchestrator",

View File

@ -15,9 +15,11 @@
import os
from abc import ABC, abstractmethod
from stack_orchestrator.deploy.deploy import get_stack_status
from decouple import config
from stack_orchestrator.deploy.deploy import get_stack_status
def get_stack(config, stack):
if stack == "package-registry":

View File

@ -22,17 +22,19 @@
# allow re-build of either all or specific containers
import os
import sys
from decouple import config
import subprocess
import click
import sys
from pathlib import Path
from stack_orchestrator.opts import opts
from stack_orchestrator.util import include_exclude_check, stack_is_external, error_exit
import click
from decouple import config
from stack_orchestrator.base import get_npm_registry_url
from stack_orchestrator.build.build_types import BuildContext
from stack_orchestrator.build.publish import publish_image
from stack_orchestrator.build.build_util import get_containers_in_scope
from stack_orchestrator.build.publish import publish_image
from stack_orchestrator.opts import opts
from stack_orchestrator.util import error_exit, include_exclude_check, stack_is_external
# TODO: find a place for this
# epilog="Config provided either in .env or settings.ini or env vars:
@ -59,9 +61,7 @@ def make_container_build_env(
container_build_env.update({"CERC_SCRIPT_DEBUG": "true"} if debug else {})
container_build_env.update({"CERC_FORCE_REBUILD": "true"} if force_rebuild else {})
container_build_env.update(
{"CERC_CONTAINER_EXTRA_BUILD_ARGS": extra_build_args}
if extra_build_args
else {}
{"CERC_CONTAINER_EXTRA_BUILD_ARGS": extra_build_args} if extra_build_args else {}
)
docker_host_env = os.getenv("DOCKER_HOST")
if docker_host_env:
@ -81,12 +81,8 @@ def process_container(build_context: BuildContext) -> bool:
# Check if this is in an external stack
if stack_is_external(build_context.stack):
container_parent_dir = Path(build_context.stack).parent.parent.joinpath(
"container-build"
)
temp_build_dir = container_parent_dir.joinpath(
build_context.container.replace("/", "-")
)
container_parent_dir = Path(build_context.stack).parent.parent.joinpath("container-build")
temp_build_dir = container_parent_dir.joinpath(build_context.container.replace("/", "-"))
temp_build_script_filename = temp_build_dir.joinpath("build.sh")
# Now check if the container exists in the external stack.
if not temp_build_script_filename.exists():
@ -104,18 +100,13 @@ def process_container(build_context: BuildContext) -> bool:
build_command = build_script_filename.as_posix()
else:
if opts.o.verbose:
print(
f"No script file found: {build_script_filename}, "
"using default build script"
)
print(f"No script file found: {build_script_filename}, " "using default build script")
repo_dir = build_context.container.split("/")[1]
# TODO: make this less of a hack -- should be specified in
# some metadata somewhere. Check if we have a repo for this
# container. If not, set the context dir to container-build subdir
repo_full_path = os.path.join(build_context.dev_root_path, repo_dir)
repo_dir_or_build_dir = (
repo_full_path if os.path.exists(repo_full_path) else build_dir
)
repo_dir_or_build_dir = repo_full_path if os.path.exists(repo_full_path) else build_dir
build_command = (
os.path.join(build_context.container_build_dir, "default-build.sh")
+ f" {default_container_tag} {repo_dir_or_build_dir}"
@ -159,9 +150,7 @@ def process_container(build_context: BuildContext) -> bool:
default=False,
help="Publish the built images in the specified image registry",
)
@click.option(
"--image-registry", help="Specify the image registry for --publish-images"
)
@click.option("--image-registry", help="Specify the image registry for --publish-images")
@click.pass_context
def command(
ctx,
@ -185,14 +174,9 @@ def command(
if local_stack:
dev_root_path = os.getcwd()[0 : os.getcwd().rindex("stack-orchestrator")]
print(
f"Local stack dev_root_path (CERC_REPO_BASE_DIR) overridden to: "
f"{dev_root_path}"
)
print(f"Local stack dev_root_path (CERC_REPO_BASE_DIR) overridden to: " f"{dev_root_path}")
else:
dev_root_path = os.path.expanduser(
config("CERC_REPO_BASE_DIR", default="~/cerc")
)
dev_root_path = os.path.expanduser(config("CERC_REPO_BASE_DIR", default="~/cerc"))
if not opts.o.quiet:
print(f"Dev Root is: {dev_root_path}")
@ -230,10 +214,7 @@ def command(
else:
print(f"Error running build for {build_context.container}")
if not opts.o.continue_on_error:
error_exit(
"container build failed and --continue-on-error "
"not set, exiting"
)
error_exit("container build failed and --continue-on-error " "not set, exiting")
sys.exit(1)
else:
print(

View File

@ -18,15 +18,17 @@
# env vars:
# CERC_REPO_BASE_DIR defaults to ~/cerc
import importlib.resources
import os
import sys
from shutil import rmtree, copytree
from decouple import config
from shutil import copytree, rmtree
import click
import importlib.resources
from python_on_whales import docker, DockerException
from decouple import config
from python_on_whales import DockerException, docker
from stack_orchestrator.base import get_stack
from stack_orchestrator.util import include_exclude_check, get_parsed_stack_config
from stack_orchestrator.util import get_parsed_stack_config, include_exclude_check
builder_js_image_name = "cerc/builder-js:local"
@ -70,14 +72,9 @@ def command(ctx, include, exclude, force_rebuild, extra_build_args):
if local_stack:
dev_root_path = os.getcwd()[0 : os.getcwd().rindex("stack-orchestrator")]
print(
f"Local stack dev_root_path (CERC_REPO_BASE_DIR) overridden to: "
f"{dev_root_path}"
)
print(f"Local stack dev_root_path (CERC_REPO_BASE_DIR) overridden to: " f"{dev_root_path}")
else:
dev_root_path = os.path.expanduser(
config("CERC_REPO_BASE_DIR", default="~/cerc")
)
dev_root_path = os.path.expanduser(config("CERC_REPO_BASE_DIR", default="~/cerc"))
build_root_path = os.path.join(dev_root_path, "build-trees")
@ -94,9 +91,7 @@ def command(ctx, include, exclude, force_rebuild, extra_build_args):
# See: https://stackoverflow.com/a/20885799/1701505
from stack_orchestrator import data
with importlib.resources.open_text(
data, "npm-package-list.txt"
) as package_list_file:
with importlib.resources.open_text(data, "npm-package-list.txt") as package_list_file:
all_packages = package_list_file.read().splitlines()
packages_in_scope = []
@ -132,8 +127,7 @@ def command(ctx, include, exclude, force_rebuild, extra_build_args):
build_command = [
"sh",
"-c",
"cd /workspace && "
f"build-npm-package-local-dependencies.sh {npm_registry_url}",
"cd /workspace && " f"build-npm-package-local-dependencies.sh {npm_registry_url}",
]
if not dry_run:
if verbose:
@ -151,9 +145,7 @@ def command(ctx, include, exclude, force_rebuild, extra_build_args):
envs.update({"CERC_SCRIPT_DEBUG": "true"} if debug else {})
envs.update({"CERC_FORCE_REBUILD": "true"} if force_rebuild else {})
envs.update(
{"CERC_CONTAINER_EXTRA_BUILD_ARGS": extra_build_args}
if extra_build_args
else {}
{"CERC_CONTAINER_EXTRA_BUILD_ARGS": extra_build_args} if extra_build_args else {}
)
try:
docker.run(
@ -176,16 +168,10 @@ def command(ctx, include, exclude, force_rebuild, extra_build_args):
except DockerException as e:
print(f"Error executing build for {package} in container:\n {e}")
if not continue_on_error:
print(
"FATAL Error: build failed and --continue-on-error "
"not set, exiting"
)
print("FATAL Error: build failed and --continue-on-error " "not set, exiting")
sys.exit(1)
else:
print(
"****** Build Error, continuing because "
"--continue-on-error is set"
)
print("****** Build Error, continuing because " "--continue-on-error is set")
else:
print("Skipped")
@ -203,10 +189,7 @@ def _ensure_prerequisites():
# Tell the user how to build it if not
images = docker.image.list(builder_js_image_name)
if len(images) == 0:
print(
f"FATAL: builder image: {builder_js_image_name} is required "
"but was not found"
)
print(f"FATAL: builder image: {builder_js_image_name} is required " "but was not found")
print(
"Please run this command to create it: "
"laconic-so --stack build-support build-containers"

View File

@ -16,7 +16,6 @@
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping
@dataclass
@ -24,5 +23,5 @@ class BuildContext:
stack: str
container: str
container_build_dir: Path
container_build_env: Mapping[str, str]
container_build_env: dict[str, str]
dev_root_path: str

View File

@ -30,9 +30,7 @@ def get_containers_in_scope(stack: str):
# See: https://stackoverflow.com/a/20885799/1701505
from stack_orchestrator import data
with importlib.resources.open_text(
data, "container-image-list.txt"
) as container_list_file:
with importlib.resources.open_text(data, "container-image-list.txt") as container_list_file:
containers_in_scope = container_list_file.read().splitlines()
if opts.o.verbose:

View File

@ -23,20 +23,19 @@
import os
import sys
from decouple import config
import click
from pathlib import Path
import click
from decouple import config
from stack_orchestrator.build import build_containers
from stack_orchestrator.deploy.webapp.util import determine_base_container, TimedLogger
from stack_orchestrator.build.build_types import BuildContext
from stack_orchestrator.deploy.webapp.util import TimedLogger, determine_base_container
@click.command()
@click.option("--base-container")
@click.option(
"--source-repo", help="directory containing the webapp to build", required=True
)
@click.option("--source-repo", help="directory containing the webapp to build", required=True)
@click.option(
"--force-rebuild",
is_flag=True,
@ -64,13 +63,10 @@ def command(ctx, base_container, source_repo, force_rebuild, extra_build_args, t
if local_stack:
dev_root_path = os.getcwd()[0 : os.getcwd().rindex("stack-orchestrator")]
logger.log(
f"Local stack dev_root_path (CERC_REPO_BASE_DIR) overridden to: "
f"{dev_root_path}"
f"Local stack dev_root_path (CERC_REPO_BASE_DIR) overridden to: " f"{dev_root_path}"
)
else:
dev_root_path = os.path.expanduser(
config("CERC_REPO_BASE_DIR", default="~/cerc")
)
dev_root_path = os.path.expanduser(config("CERC_REPO_BASE_DIR", default="~/cerc"))
if verbose:
logger.log(f"Dev Root is: {dev_root_path}")

View File

@ -13,19 +13,19 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http:#www.gnu.org/licenses/>.
import click
from dataclasses import dataclass
import json
import platform
from dataclasses import dataclass
import click
import requests
from python_on_whales import DockerClient
from python_on_whales.components.manifest.cli_wrapper import ManifestCLI, ManifestList
from python_on_whales.utils import run
import requests
from typing import List
from stack_orchestrator.opts import opts
from stack_orchestrator.util import include_exclude_check, error_exit
from stack_orchestrator.build.build_util import get_containers_in_scope
from stack_orchestrator.opts import opts
from stack_orchestrator.util import error_exit, include_exclude_check
# Experimental fetch-container command
@ -55,7 +55,7 @@ def _local_tag_for(container: str):
# $ curl -u "my-username:my-token" -X GET \
# "https://<container-registry-hostname>/v2/cerc-io/cerc/test-container/tags/list"
# {"name":"cerc-io/cerc/test-container","tags":["202402232130","202402232208"]}
def _get_tags_for_container(container: str, registry_info: RegistryInfo) -> List[str]:
def _get_tags_for_container(container: str, registry_info: RegistryInfo) -> list[str]:
# registry looks like: git.vdb.to/cerc-io
registry_parts = registry_info.registry.split("/")
url = f"https://{registry_parts[0]}/v2/{registry_parts[1]}/{container}/tags/list"
@ -68,16 +68,15 @@ def _get_tags_for_container(container: str, registry_info: RegistryInfo) -> List
tag_info = response.json()
if opts.o.debug:
print(f"container tags list: {tag_info}")
tags_array = tag_info["tags"]
tags_array: list[str] = tag_info["tags"]
return tags_array
else:
error_exit(
f"failed to fetch tags from image registry, "
f"status code: {response.status_code}"
f"failed to fetch tags from image registry, " f"status code: {response.status_code}"
)
def _find_latest(candidate_tags: List[str]):
def _find_latest(candidate_tags: list[str]):
# Lex sort should give us the latest first
sorted_candidates = sorted(candidate_tags)
if opts.o.debug:
@ -86,8 +85,8 @@ def _find_latest(candidate_tags: List[str]):
def _filter_for_platform(
container: str, registry_info: RegistryInfo, tag_list: List[str]
) -> List[str]:
container: str, registry_info: RegistryInfo, tag_list: list[str]
) -> list[str]:
filtered_tags = []
this_machine = platform.machine()
# Translate between Python and docker platform names
@ -151,15 +150,9 @@ def _add_local_tag(remote_tag: str, registry: str, local_tag: str):
default=False,
help="Overwrite a locally built image, if present",
)
@click.option(
"--image-registry", required=True, help="Specify the image registry to fetch from"
)
@click.option(
"--registry-username", required=True, help="Specify the image registry username"
)
@click.option(
"--registry-token", required=True, help="Specify the image registry access token"
)
@click.option("--image-registry", required=True, help="Specify the image registry to fetch from")
@click.option("--registry-username", required=True, help="Specify the image registry username")
@click.option("--registry-token", required=True, help="Specify the image registry access token")
@click.pass_context
def command(
ctx,

View File

@ -14,6 +14,7 @@
# along with this program. If not, see <http:#www.gnu.org/licenses/>.
from datetime import datetime
from python_on_whales import DockerClient
from stack_orchestrator.opts import opts

View File

@ -2,12 +2,11 @@
import argparse
import os
import random
import sys
from subprocess import Popen
import psycopg
import random
from subprocess import Popen
from fabric import Connection
@ -27,27 +26,19 @@ def dump_src_db_to_file(db_host, db_port, db_user, db_password, db_name, file_na
def establish_ssh_tunnel(ssh_host, ssh_port, ssh_user, db_host, db_port):
local_port = random.randint(11000, 12000)
conn = Connection(host=ssh_host, port=ssh_port, user=ssh_user)
fw = conn.forward_local(
local_port=local_port, remote_port=db_port, remote_host=db_host
)
fw = conn.forward_local(local_port=local_port, remote_port=db_port, remote_host=db_host)
return conn, fw, local_port
def load_db_from_file(db_host, db_port, db_user, db_password, db_name, file_name):
connstr = "host=%s port=%s user=%s password=%s sslmode=disable dbname=%s" % (
db_host,
db_port,
db_user,
db_password,
db_name,
)
connstr = f"host={db_host} port={db_port} user={db_user} password={db_password} sslmode=disable dbname={db_name}"
with psycopg.connect(connstr) as conn:
with conn.cursor() as cur:
print(
f"Importing from {file_name} to {db_host}:{db_port}/{db_name}... ",
end="",
)
cur.execute(open(file_name, "rt").read())
cur.execute(open(file_name).read())
print("DONE")
@ -60,9 +51,7 @@ if __name__ == "__main__":
parser.add_argument("--src-dbpw", help="DB password", required=True)
parser.add_argument("--src-dbname", help="dbname", default="keycloak")
parser.add_argument(
"--dst-file", help="Destination filename", default="keycloak-mirror.sql"
)
parser.add_argument("--dst-file", help="Destination filename", default="keycloak-mirror.sql")
parser.add_argument("--live-import", help="run the import", action="store_true")

View File

@ -1,7 +1,8 @@
from web3.auto import w3
import ruamel.yaml as yaml
import sys
import ruamel.yaml as yaml
from web3.auto import w3
w3.eth.account.enable_unaudited_hdwallet_features()
testnet_config_path = "genesis-config.yaml"
@ -11,8 +12,6 @@ if len(sys.argv) > 1:
with open(testnet_config_path) as stream:
data = yaml.safe_load(stream)
for key, value in data["el_premine"].items():
acct = w3.eth.account.from_mnemonic(
data["mnemonic"], account_path=key, passphrase=""
)
print("%s,%s,%s" % (key, acct.address, acct.key.hex()))
for key, _value in data["el_premine"].items():
acct = w3.eth.account.from_mnemonic(data["mnemonic"], account_path=key, passphrase="")
print(f"{key},{acct.address},{acct.key.hex()}")

View File

@ -16,13 +16,14 @@
from pathlib import Path
from shutil import copy
import yaml
def create(context, extra_args):
# Our goal here is just to copy the json files for blast
yml_path = context.deployment_dir.joinpath("spec.yml")
with open(yml_path, "r") as file:
with open(yml_path) as file:
data = yaml.safe_load(file)
mount_point = data["volumes"]["blast-data"]

View File

@ -27,8 +27,6 @@ def setup(ctx):
def create(ctx, extra_args):
# Generate the JWT secret and save to its config file
secret = token_hex(32)
jwt_file_path = ctx.deployment_dir.joinpath(
"data", "mainnet_eth_config_data", "jwtsecret"
)
jwt_file_path = ctx.deployment_dir.joinpath("data", "mainnet_eth_config_data", "jwtsecret")
with open(jwt_file_path, "w+") as jwt_file:
jwt_file.write(secret)

View File

@ -13,22 +13,23 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http:#www.gnu.org/licenses/>.
from stack_orchestrator.util import get_yaml
import os
import re
import sys
from enum import Enum
from pathlib import Path
from shutil import copyfile, copytree
import tomli
from stack_orchestrator.deploy.deploy_types import (
DeployCommandContext,
LaconicStackSetupCommand,
)
from stack_orchestrator.deploy.deploy_util import VolumeMapping, run_container_command
from stack_orchestrator.deploy.deployment_context import DeploymentContext
from stack_orchestrator.deploy.stack_state import State
from stack_orchestrator.deploy.deploy_util import VolumeMapping, run_container_command
from stack_orchestrator.opts import opts
from enum import Enum
from pathlib import Path
from shutil import copyfile, copytree
import os
import sys
import tomli
import re
from stack_orchestrator.util import get_yaml
default_spec_file_content = ""
@ -80,9 +81,7 @@ def _copy_gentx_files(network_dir: Path, gentx_file_list: str):
gentx_file_path = Path(gentx_file)
copyfile(
gentx_file_path,
os.path.join(
network_dir, "config", "gentx", os.path.basename(gentx_file_path)
),
os.path.join(network_dir, "config", "gentx", os.path.basename(gentx_file_path)),
)
@ -91,7 +90,7 @@ def _remove_persistent_peers(network_dir: Path):
if not config_file_path.exists():
print("Error: config.toml not found")
sys.exit(1)
with open(config_file_path, "r") as input_file:
with open(config_file_path) as input_file:
config_file_content = input_file.read()
persistent_peers_pattern = '^persistent_peers = "(.+?)"'
replace_with = 'persistent_peers = ""'
@ -110,7 +109,7 @@ def _insert_persistent_peers(config_dir: Path, new_persistent_peers: str):
if not config_file_path.exists():
print("Error: config.toml not found")
sys.exit(1)
with open(config_file_path, "r") as input_file:
with open(config_file_path) as input_file:
config_file_content = input_file.read()
persistent_peers_pattern = r'^persistent_peers = ""'
replace_with = f'persistent_peers = "{new_persistent_peers}"'
@ -129,7 +128,7 @@ def _enable_cors(config_dir: Path):
if not config_file_path.exists():
print("Error: config.toml not found")
sys.exit(1)
with open(config_file_path, "r") as input_file:
with open(config_file_path) as input_file:
config_file_content = input_file.read()
cors_pattern = r"^cors_allowed_origins = \[]"
replace_with = 'cors_allowed_origins = ["*"]'
@ -142,13 +141,11 @@ def _enable_cors(config_dir: Path):
if not app_file_path.exists():
print("Error: app.toml not found")
sys.exit(1)
with open(app_file_path, "r") as input_file:
with open(app_file_path) as input_file:
app_file_content = input_file.read()
cors_pattern = r"^enabled-unsafe-cors = false"
replace_with = "enabled-unsafe-cors = true"
app_file_content = re.sub(
cors_pattern, replace_with, app_file_content, flags=re.MULTILINE
)
app_file_content = re.sub(cors_pattern, replace_with, app_file_content, flags=re.MULTILINE)
with open(app_file_path, "w") as output_file:
output_file.write(app_file_content)
@ -158,7 +155,7 @@ def _set_listen_address(config_dir: Path):
if not config_file_path.exists():
print("Error: config.toml not found")
sys.exit(1)
with open(config_file_path, "r") as input_file:
with open(config_file_path) as input_file:
config_file_content = input_file.read()
existing_pattern = r'^laddr = "tcp://127.0.0.1:26657"'
replace_with = 'laddr = "tcp://0.0.0.0:26657"'
@ -172,7 +169,7 @@ def _set_listen_address(config_dir: Path):
if not app_file_path.exists():
print("Error: app.toml not found")
sys.exit(1)
with open(app_file_path, "r") as input_file:
with open(app_file_path) as input_file:
app_file_content = input_file.read()
existing_pattern1 = r'^address = "tcp://localhost:1317"'
replace_with1 = 'address = "tcp://0.0.0.0:1317"'
@ -192,10 +189,7 @@ def _phase_from_params(parameters):
phase = SetupPhase.ILLEGAL
if parameters.initialize_network:
if parameters.join_network or parameters.create_network:
print(
"Can't supply --join-network or --create-network "
"with --initialize-network"
)
print("Can't supply --join-network or --create-network " "with --initialize-network")
sys.exit(1)
if not parameters.chain_id:
print("--chain-id is required")
@ -207,26 +201,17 @@ def _phase_from_params(parameters):
phase = SetupPhase.INITIALIZE
elif parameters.join_network:
if parameters.initialize_network or parameters.create_network:
print(
"Can't supply --initialize-network or --create-network "
"with --join-network"
)
print("Can't supply --initialize-network or --create-network " "with --join-network")
sys.exit(1)
phase = SetupPhase.JOIN
elif parameters.create_network:
if parameters.initialize_network or parameters.join_network:
print(
"Can't supply --initialize-network or --join-network "
"with --create-network"
)
print("Can't supply --initialize-network or --join-network " "with --create-network")
sys.exit(1)
phase = SetupPhase.CREATE
elif parameters.connect_network:
if parameters.initialize_network or parameters.join_network:
print(
"Can't supply --initialize-network or --join-network "
"with --connect-network"
)
print("Can't supply --initialize-network or --join-network " "with --connect-network")
sys.exit(1)
phase = SetupPhase.CONNECT
return phase
@ -341,8 +326,7 @@ def setup(
output3, status3 = run_container_command(
command_context,
"laconicd",
f"laconicd cometbft show-validator "
f"--home {laconicd_home_path_in_container}",
f"laconicd cometbft show-validator " f"--home {laconicd_home_path_in_container}",
mounts,
)
print(f"Node validator address: {output3}")
@ -361,23 +345,16 @@ def setup(
# Copy it into our network dir
genesis_file_path = Path(parameters.genesis_file)
if not os.path.exists(genesis_file_path):
print(
f"Error: supplied genesis file: {parameters.genesis_file} "
"does not exist."
)
print(f"Error: supplied genesis file: {parameters.genesis_file} " "does not exist.")
sys.exit(1)
copyfile(
genesis_file_path,
os.path.join(
network_dir, "config", os.path.basename(genesis_file_path)
),
os.path.join(network_dir, "config", os.path.basename(genesis_file_path)),
)
else:
# We're generating the genesis file
# First look in the supplied gentx files for the other nodes' keys
other_node_keys = _get_node_keys_from_gentx_files(
parameters.gentx_address_list
)
other_node_keys = _get_node_keys_from_gentx_files(parameters.gentx_address_list)
# Add those keys to our genesis, with balances we determine here (why?)
outputk = None
for other_node_key in other_node_keys:
@ -398,8 +375,7 @@ def setup(
output1, status1 = run_container_command(
command_context,
"laconicd",
f"laconicd genesis collect-gentxs "
f"--home {laconicd_home_path_in_container}",
f"laconicd genesis collect-gentxs " f"--home {laconicd_home_path_in_container}",
mounts,
)
if options.debug:
@ -416,8 +392,7 @@ def setup(
output2, status1 = run_container_command(
command_context,
"laconicd",
f"laconicd genesis validate-genesis "
f"--home {laconicd_home_path_in_container}",
f"laconicd genesis validate-genesis " f"--home {laconicd_home_path_in_container}",
mounts,
)
print(f"validate-genesis result: {output2}")
@ -452,9 +427,7 @@ def create(deployment_context: DeploymentContext, extra_args):
sys.exit(1)
# Copy the network directory contents into our deployment
# TODO: change this to work with non local paths
deployment_config_dir = deployment_context.deployment_dir.joinpath(
"data", "laconicd-config"
)
deployment_config_dir = deployment_context.deployment_dir.joinpath("data", "laconicd-config")
copytree(config_dir_path, deployment_config_dir, dirs_exist_ok=True)
# If supplied, add the initial persistent peers to the config file
if extra_args[1]:
@ -465,9 +438,7 @@ def create(deployment_context: DeploymentContext, extra_args):
_set_listen_address(deployment_config_dir)
# Copy the data directory contents into our deployment
# TODO: change this to work with non local paths
deployment_data_dir = deployment_context.deployment_dir.joinpath(
"data", "laconicd-data"
)
deployment_data_dir = deployment_context.deployment_dir.joinpath("data", "laconicd-data")
copytree(data_dir_path, deployment_data_dir, dirs_exist_ok=True)
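Many of the hunks above are ruff-format joining implicit string concatenation onto one line; adjacent string literals, f-strings included, are concatenated at parse time, so the one-line and two-line forms build identical command strings. A quick check (the `home` path is a stand-in value):

```python
home = "/root/.laconicd"
# Adjacent literals are concatenated at parse time, so these are identical:
joined = f"laconicd cometbft show-validator " f"--home {home}"
single = f"laconicd cometbft show-validator --home {home}"
assert joined == single
```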

View File

@ -13,12 +13,13 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
from stack_orchestrator.util import get_yaml
from pathlib import Path
from stack_orchestrator.deploy.deploy_types import DeployCommandContext
from stack_orchestrator.deploy.deploy_util import VolumeMapping, run_container_command
from stack_orchestrator.deploy.deployment_context import DeploymentContext
from stack_orchestrator.deploy.stack_state import State
from stack_orchestrator.deploy.deploy_util import VolumeMapping, run_container_command
from pathlib import Path
from stack_orchestrator.util import get_yaml
default_spec_file_content = """config:
test-variable-1: test-value-1

View File

@ -14,12 +14,13 @@
# along with this program. If not, see <http://www.gnu.org/licenses/>.
from pathlib import Path
from typing import Optional
from python_on_whales import DockerClient, DockerException
from stack_orchestrator.deploy.deployer import (
Deployer,
DeployerException,
DeployerConfigGenerator,
DeployerException,
)
from stack_orchestrator.deploy.deployment_context import DeploymentContext
from stack_orchestrator.opts import opts
@ -32,10 +33,10 @@ class DockerDeployer(Deployer):
def __init__(
self,
type: str,
deployment_context: Optional[DeploymentContext],
deployment_context: DeploymentContext | None,
compose_files: list,
compose_project_name: Optional[str],
compose_env_file: Optional[str],
compose_project_name: str | None,
compose_env_file: str | None,
) -> None:
self.docker = DockerClient(
compose_files=compose_files,
@ -53,21 +54,21 @@ class DockerDeployer(Deployer):
try:
return self.docker.compose.up(detach=detach, services=services)
except DockerException as e:
raise DeployerException(e)
raise DeployerException(e) from e
def down(self, timeout, volumes, skip_cluster_management):
if not opts.o.dry_run:
try:
return self.docker.compose.down(timeout=timeout, volumes=volumes)
except DockerException as e:
raise DeployerException(e)
raise DeployerException(e) from e
def update(self):
def update_envs(self):
if not opts.o.dry_run:
try:
return self.docker.compose.restart()
except DockerException as e:
raise DeployerException(e)
raise DeployerException(e) from e
def status(self):
if not opts.o.dry_run:
@ -75,23 +76,21 @@ class DockerDeployer(Deployer):
for p in self.docker.compose.ps():
print(f"{p.name}\t{p.state.status}")
except DockerException as e:
raise DeployerException(e)
raise DeployerException(e) from e
def ps(self):
if not opts.o.dry_run:
try:
return self.docker.compose.ps()
except DockerException as e:
raise DeployerException(e)
raise DeployerException(e) from e
def port(self, service, private_port):
if not opts.o.dry_run:
try:
return self.docker.compose.port(
service=service, private_port=private_port
)
return self.docker.compose.port(service=service, private_port=private_port)
except DockerException as e:
raise DeployerException(e)
raise DeployerException(e) from e
def execute(self, service, command, tty, envs):
if not opts.o.dry_run:
@ -100,7 +99,7 @@ class DockerDeployer(Deployer):
service=service, command=command, tty=tty, envs=envs
)
except DockerException as e:
raise DeployerException(e)
raise DeployerException(e) from e
def logs(self, services, tail, follow, stream):
if not opts.o.dry_run:
@ -109,7 +108,7 @@ class DockerDeployer(Deployer):
services=services, tail=tail, follow=follow, stream=stream
)
except DockerException as e:
raise DeployerException(e)
raise DeployerException(e) from e
def run(
self,
@ -118,10 +117,14 @@ class DockerDeployer(Deployer):
user=None,
volumes=None,
entrypoint=None,
env={},
ports=[],
env=None,
ports=None,
detach=False,
):
if ports is None:
ports = []
if env is None:
env = {}
if not opts.o.dry_run:
try:
return self.docker.run(
@ -136,9 +139,9 @@ class DockerDeployer(Deployer):
publish_all=len(ports) == 0,
)
except DockerException as e:
raise DeployerException(e)
raise DeployerException(e) from e
def run_job(self, job_name: str, release_name: Optional[str] = None):
def run_job(self, job_name: str, release_name: str | None = None):
# release_name is ignored for Docker deployments (only used for K8s/Helm)
if not opts.o.dry_run:
try:
@ -155,9 +158,7 @@ class DockerDeployer(Deployer):
)
if not job_compose_file.exists():
raise DeployerException(
f"Job compose file not found: {job_compose_file}"
)
raise DeployerException(f"Job compose file not found: {job_compose_file}")
if opts.o.verbose:
print(f"Running job from: {job_compose_file}")
@ -175,7 +176,7 @@ class DockerDeployer(Deployer):
return job_docker.compose.run(service=job_name, remove=True, tty=True)
except DockerException as e:
raise DeployerException(e)
raise DeployerException(e) from e
class DockerDeployerConfigGenerator(DeployerConfigGenerator):
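The repeated `raise DeployerException(e) from e` edits are the ruff B904 fix: re-raising inside an `except` block without `from` discards the causal link to the original exception. A minimal sketch, with a `RuntimeError` standing in for `python_on_whales.DockerException`:

```python
class DeployerException(Exception):
    pass

def compose_up():
    try:
        raise RuntimeError("docker daemon not reachable")  # stand-in for DockerException
    except RuntimeError as e:
        # With `from e`, the original exception is preserved as __cause__
        # and shown as "The above exception was the direct cause of ..."
        raise DeployerException(e) from e

try:
    compose_up()
except DeployerException as exc:
    caught = exc

assert isinstance(caught.__cause__, RuntimeError)
```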

View File

@ -15,36 +15,37 @@
# Deploys the system components using a deployer (either docker-compose or k8s)
import hashlib
import copy
import hashlib
import os
import subprocess
import sys
from dataclasses import dataclass
from importlib import resources
from typing import Optional
import subprocess
import click
from pathlib import Path
import click
from stack_orchestrator import constants
from stack_orchestrator.opts import opts
from stack_orchestrator.util import (
get_stack_path,
include_exclude_check,
get_parsed_stack_config,
global_options2,
get_dev_root_path,
stack_is_in_deployment,
resolve_compose_file,
)
from stack_orchestrator.deploy.deployer import DeployerException
from stack_orchestrator.deploy.deployer_factory import getDeployer
from stack_orchestrator.deploy.compose.deploy_docker import DockerDeployer
from stack_orchestrator.deploy.deploy_types import ClusterContext, DeployCommandContext
from stack_orchestrator.deploy.deployer import DeployerException
from stack_orchestrator.deploy.deployer_factory import getDeployer
from stack_orchestrator.deploy.deployment_context import DeploymentContext
from stack_orchestrator.deploy.deployment_create import create as deployment_create
from stack_orchestrator.deploy.deployment_create import init as deployment_init
from stack_orchestrator.deploy.deployment_create import setup as deployment_setup
from stack_orchestrator.deploy.k8s import k8s_command
from stack_orchestrator.opts import opts
from stack_orchestrator.util import (
get_dev_root_path,
get_parsed_stack_config,
get_stack_path,
global_options2,
include_exclude_check,
resolve_compose_file,
stack_is_in_deployment,
)
@click.group()
@ -52,9 +53,7 @@ from stack_orchestrator.deploy.k8s import k8s_command
@click.option("--exclude", help="don't start these components")
@click.option("--env-file", help="env file to be used")
@click.option("--cluster", help="specify a non-default cluster name")
@click.option(
"--deploy-to", help="cluster system to deploy to (compose or k8s or k8s-kind)"
)
@click.option("--deploy-to", help="cluster system to deploy to (compose or k8s or k8s-kind)")
@click.pass_context
def command(ctx, include, exclude, env_file, cluster, deploy_to):
"""deploy a stack"""
@ -93,7 +92,7 @@ def command(ctx, include, exclude, env_file, cluster, deploy_to):
def create_deploy_context(
global_context,
deployment_context: Optional[DeploymentContext],
deployment_context: DeploymentContext | None,
stack,
include,
exclude,
@ -116,9 +115,7 @@ def create_deploy_context(
# For helm chart deployments, skip compose file loading
if is_helm_chart_deployment:
cluster_context = ClusterContext(
global_context, cluster, [], [], [], None, env_file
)
cluster_context = ClusterContext(global_context, cluster, [], [], [], None, env_file)
else:
cluster_context = _make_cluster_context(
global_context, stack, include, exclude, cluster, env_file
@ -134,9 +131,7 @@ def create_deploy_context(
return DeployCommandContext(stack, cluster_context, deployer)
def up_operation(
ctx, services_list, stay_attached=False, skip_cluster_management=False
):
def up_operation(ctx, services_list, stay_attached=False, skip_cluster_management=False):
global_context = ctx.parent.parent.obj
deploy_context = ctx.obj
cluster_context = deploy_context.cluster_context
@ -182,8 +177,14 @@ def status_operation(ctx):
ctx.obj.deployer.status()
def update_operation(ctx):
ctx.obj.deployer.update()
def prepare_operation(ctx, skip_cluster_management=False):
ctx.obj.deployer.prepare(
skip_cluster_management=skip_cluster_management,
)
def update_envs_operation(ctx):
ctx.obj.deployer.update_envs()
def ps_operation(ctx):
@ -203,8 +204,7 @@ def ps_operation(ctx):
print(f"{port_mapping}", end="")
else:
print(
f"{mapping[0]['HostIp']}:{mapping[0]['HostPort']}"
f"->{port_mapping}",
f"{mapping[0]['HostIp']}:{mapping[0]['HostPort']}" f"->{port_mapping}",
end="",
)
comma = ", "
@ -254,11 +254,11 @@ def logs_operation(ctx, tail: int, follow: bool, extra_args: str):
logs_stream = ctx.obj.deployer.logs(
services=services_list, tail=tail, follow=follow, stream=True
)
for stream_type, stream_content in logs_stream:
for _stream_type, stream_content in logs_stream:
print(stream_content.decode("utf-8"), end="")
def run_job_operation(ctx, job_name: str, helm_release: Optional[str] = None):
def run_job_operation(ctx, job_name: str, helm_release: str | None = None):
global_context = ctx.parent.parent.obj
if not global_context.dry_run:
print(f"Running job: {job_name}")
@ -278,9 +278,7 @@ def up(ctx, extra_args):
@command.command()
@click.option(
"--delete-volumes/--preserve-volumes", default=False, help="delete data volumes"
)
@click.option("--delete-volumes/--preserve-volumes", default=False, help="delete data volumes")
@click.argument("extra_args", nargs=-1) # help: command: down<service1> <service2>
@click.pass_context
def down(ctx, delete_volumes, extra_args):
@ -380,14 +378,10 @@ def _make_cluster_context(ctx, stack, include, exclude, cluster, env_file):
else:
# See:
# https://stackoverflow.com/questions/25389095/python-get-path-of-root-project-structure
compose_dir = (
Path(__file__).absolute().parent.parent.joinpath("data", "compose")
)
compose_dir = Path(__file__).absolute().parent.parent.joinpath("data", "compose")
if cluster is None:
cluster = _make_default_cluster_name(
deployment, compose_dir, stack, include, exclude
)
cluster = _make_default_cluster_name(deployment, compose_dir, stack, include, exclude)
else:
_make_default_cluster_name(deployment, compose_dir, stack, include, exclude)
@ -404,9 +398,7 @@ def _make_cluster_context(ctx, stack, include, exclude, cluster, env_file):
if stack_config is not None:
# TODO: syntax check the input here
pods_in_scope = stack_config["pods"]
cluster_config = (
stack_config["config"] if "config" in stack_config else None
)
cluster_config = stack_config["config"] if "config" in stack_config else None
else:
pods_in_scope = all_pods
@ -428,43 +420,29 @@ def _make_cluster_context(ctx, stack, include, exclude, cluster, env_file):
if include_exclude_check(pod_name, include, exclude):
if pod_repository is None or pod_repository == "internal":
if deployment:
compose_file_name = os.path.join(
compose_dir, f"docker-compose-{pod_path}.yml"
)
compose_file_name = os.path.join(compose_dir, f"docker-compose-{pod_path}.yml")
else:
compose_file_name = resolve_compose_file(stack, pod_name)
else:
if deployment:
compose_file_name = os.path.join(
compose_dir, f"docker-compose-{pod_name}.yml"
)
compose_file_name = os.path.join(compose_dir, f"docker-compose-{pod_name}.yml")
pod_pre_start_command = pod.get("pre_start_command")
pod_post_start_command = pod.get("post_start_command")
script_dir = compose_dir.parent.joinpath(
"pods", pod_name, "scripts"
)
script_dir = compose_dir.parent.joinpath("pods", pod_name, "scripts")
if pod_pre_start_command is not None:
pre_start_commands.append(
os.path.join(script_dir, pod_pre_start_command)
)
pre_start_commands.append(os.path.join(script_dir, pod_pre_start_command))
if pod_post_start_command is not None:
post_start_commands.append(
os.path.join(script_dir, pod_post_start_command)
)
post_start_commands.append(os.path.join(script_dir, pod_post_start_command))
else:
# TODO: fix this code for external stack with scripts
pod_root_dir = os.path.join(
dev_root_path, pod_repository.split("/")[-1], pod["path"]
)
compose_file_name = os.path.join(
pod_root_dir, f"docker-compose-{pod_name}.yml"
)
compose_file_name = os.path.join(pod_root_dir, f"docker-compose-{pod_name}.yml")
pod_pre_start_command = pod.get("pre_start_command")
pod_post_start_command = pod.get("post_start_command")
if pod_pre_start_command is not None:
pre_start_commands.append(
os.path.join(pod_root_dir, pod_pre_start_command)
)
pre_start_commands.append(os.path.join(pod_root_dir, pod_pre_start_command))
if pod_post_start_command is not None:
post_start_commands.append(
os.path.join(pod_root_dir, pod_post_start_command)
@ -508,9 +486,7 @@ def _run_command(ctx, cluster_name, command):
command_env["CERC_SO_COMPOSE_PROJECT"] = cluster_name
if ctx.debug:
command_env["CERC_SCRIPT_DEBUG"] = "true"
command_result = subprocess.run(
command_file, shell=True, env=command_env, cwd=command_dir
)
command_result = subprocess.run(command_file, shell=True, env=command_env, cwd=command_dir)
if command_result.returncode != 0:
print(f"FATAL Error running command: {command}")
sys.exit(1)
@ -567,9 +543,7 @@ def _orchestrate_cluster_config(ctx, cluster_config, deployer, container_exec_en
# "It returned with code 1"
if "It returned with code 1" in str(error):
if ctx.verbose:
print(
"Config export script returned an error, re-trying"
)
print("Config export script returned an error, re-trying")
# If the script failed to execute
# (e.g. the file is not there) then we get:
# "It returned with code 2"

View File

@ -13,8 +13,9 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
from typing import List, Mapping, Optional
from collections.abc import Mapping
from dataclasses import dataclass
from stack_orchestrator.command_types import CommandOptions
from stack_orchestrator.deploy.deployer import Deployer
@ -23,19 +24,19 @@ from stack_orchestrator.deploy.deployer import Deployer
class ClusterContext:
# TODO: this should be in its own object not stuffed in here
options: CommandOptions
cluster: Optional[str]
compose_files: List[str]
pre_start_commands: List[str]
post_start_commands: List[str]
config: Optional[str]
env_file: Optional[str]
cluster: str | None
compose_files: list[str]
pre_start_commands: list[str]
post_start_commands: list[str]
config: str | None
env_file: str | None
@dataclass
class DeployCommandContext:
stack: str
cluster_context: ClusterContext
deployer: Optional[Deployer]
deployer: Deployer | None
@dataclass

View File

@ -13,15 +13,16 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
from typing import List, Any
from typing import Any
from stack_orchestrator.deploy.deploy_types import DeployCommandContext, VolumeMapping
from stack_orchestrator.opts import opts
from stack_orchestrator.util import (
get_parsed_stack_config,
get_yaml,
get_pod_list,
get_yaml,
resolve_compose_file,
)
from stack_orchestrator.opts import opts
def _container_image_from_service(stack: str, service: str):
@ -32,7 +33,7 @@ def _container_image_from_service(stack: str, service: str):
yaml = get_yaml()
for pod in pods:
pod_file_path = resolve_compose_file(stack, pod)
parsed_pod_file = yaml.load(open(pod_file_path, "r"))
parsed_pod_file = yaml.load(open(pod_file_path))
if "services" in parsed_pod_file:
services = parsed_pod_file["services"]
if service in services:
@ -45,7 +46,7 @@ def _container_image_from_service(stack: str, service: str):
def parsed_pod_files_map_from_file_names(pod_files):
parsed_pod_yaml_map: Any = {}
for pod_file in pod_files:
with open(pod_file, "r") as pod_file_descriptor:
with open(pod_file) as pod_file_descriptor:
parsed_pod_file = get_yaml().load(pod_file_descriptor)
parsed_pod_yaml_map[pod_file] = parsed_pod_file
if opts.o.debug:
@ -53,7 +54,7 @@ def parsed_pod_files_map_from_file_names(pod_files):
return parsed_pod_yaml_map
def images_for_deployment(pod_files: List[str]):
def images_for_deployment(pod_files: list[str]):
image_set = set()
parsed_pod_yaml_map = parsed_pod_files_map_from_file_names(pod_files)
# Find the set of images in the pods
@ -69,7 +70,7 @@ def images_for_deployment(pod_files: List[str]):
return image_set
def _volumes_to_docker(mounts: List[VolumeMapping]):
def _volumes_to_docker(mounts: list[VolumeMapping]):
# Example from doc: [("/", "/host"), ("/etc/hosts", "/etc/hosts", "rw")]
result = []
for mount in mounts:
@ -79,7 +80,7 @@ def _volumes_to_docker(mounts: List[VolumeMapping]):
def run_container_command(
ctx: DeployCommandContext, service: str, command: str, mounts: List[VolumeMapping]
ctx: DeployCommandContext, service: str, command: str, mounts: list[VolumeMapping]
):
deployer = ctx.deployer
if deployer is None:
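A sketch of what `images_for_deployment()` above collects: the set of unique `image:` values across the parsed compose files (plain dicts stand in for the loaded YAML):

```python
# Stand-in for parsed_pod_files_map_from_file_names() output
parsed_pod_yaml_map = {
    "docker-compose-a.yml": {"services": {"a": {"image": "registry/a:v1"}}},
    "docker-compose-b.yml": {"services": {"b": {"image": "registry/a:v1"}}},
}
image_set = set()
for parsed_pod_file in parsed_pod_yaml_map.values():
    for service in parsed_pod_file.get("services", {}).values():
        if "image" in service:
            image_set.add(service["image"])  # set dedupes shared images
assert image_set == {"registry/a:v1"}
```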

View File

@ -15,7 +15,6 @@
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Optional
class Deployer(ABC):
@ -28,7 +27,7 @@ class Deployer(ABC):
pass
@abstractmethod
def update(self):
def update_envs(self):
pass
@abstractmethod
@ -59,16 +58,23 @@ class Deployer(ABC):
user=None,
volumes=None,
entrypoint=None,
env={},
ports=[],
env=None,
ports=None,
detach=False,
):
pass
@abstractmethod
def run_job(self, job_name: str, release_name: Optional[str] = None):
def run_job(self, job_name: str, release_name: str | None = None):
pass
def prepare(self, skip_cluster_management):
"""Create cluster infrastructure (namespace, PVs, services) without starting pods.
Only supported for k8s deployers. Compose deployers raise an error.
"""
raise DeployerException("prepare is only supported for k8s deployments")
class DeployerException(Exception):
def __init__(self, *args: object) -> None:
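The `env=None` / `ports=None` rewrite in `run()` is the ruff B006 fix: mutable defaults (`{}`, `[]`) are created once at function definition time and shared across all calls. A minimal sketch of the sentinel pattern:

```python
def run(env=None, ports=None):
    # B006 fix: None sentinels instead of mutable defaults ({} / []),
    # which would be created once and shared across every call.
    if env is None:
        env = {}
    if ports is None:
        ports = []
    ports.append(8080)
    return env, ports

_, first = run()
_, second = run()
# Each call gets a fresh list; with `ports=[]` as the default,
# `second` would have accumulated [8080, 8080].
assert first == [8080]
assert second == [8080]
```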

View File

@ -14,14 +14,14 @@
# along with this program. If not, see <http://www.gnu.org/licenses/>.
from stack_orchestrator import constants
from stack_orchestrator.deploy.k8s.deploy_k8s import (
K8sDeployer,
K8sDeployerConfigGenerator,
)
from stack_orchestrator.deploy.compose.deploy_docker import (
DockerDeployer,
DockerDeployerConfigGenerator,
)
from stack_orchestrator.deploy.k8s.deploy_k8s import (
K8sDeployer,
K8sDeployerConfigGenerator,
)
def getDeployerConfigGenerator(type: str, deployment_context):
@ -44,10 +44,7 @@ def getDeployer(
compose_project_name,
compose_env_file,
)
elif (
type == type == constants.k8s_deploy_type
or type == constants.k8s_kind_deploy_type
):
elif type == type == constants.k8s_deploy_type or type == constants.k8s_kind_deploy_type:
return K8sDeployer(
type,
deployment_context,
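The condition `type == type == constants.k8s_deploy_type`, carried over by the reformat above, is a chained comparison: `a == b == c` means `(a == b) and (b == c)`, so the redundant first leg is always true and the test reduces to `type == constants.k8s_deploy_type`. A quick check with stand-in values:

```python
k8s_deploy_type = "k8s"  # stand-in for constants.k8s_deploy_type

t = "k8s"
# Chained comparison: (t == t) and (t == k8s_deploy_type)
assert (t == t == k8s_deploy_type) == (t == k8s_deploy_type)

t = "compose"
assert (t == t == k8s_deploy_type) == (t == k8s_deploy_type)
```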

View File

@ -13,28 +13,28 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import click
from pathlib import Path
import subprocess
import sys
import time
from pathlib import Path
import click
from stack_orchestrator import constants
from stack_orchestrator.deploy.images import push_images_operation
from stack_orchestrator.deploy.deploy import (
up_operation,
create_deploy_context,
down_operation,
ps_operation,
port_operation,
status_operation,
)
from stack_orchestrator.deploy.deploy import (
exec_operation,
logs_operation,
create_deploy_context,
update_operation,
port_operation,
prepare_operation,
ps_operation,
status_operation,
up_operation,
update_envs_operation,
)
from stack_orchestrator.deploy.deploy_types import DeployCommandContext
from stack_orchestrator.deploy.deployment_context import DeploymentContext
from stack_orchestrator.deploy.images import push_images_operation
@click.group()
@ -114,7 +114,7 @@ def up(ctx, stay_attached, skip_cluster_management, extra_args):
)
@click.option(
"--skip-cluster-management/--perform-cluster-management",
default=False,
default=True,
help="Skip cluster initialization/tear-down (only for kind-k8s deployments)",
)
@click.argument("extra_args", nargs=-1) # help: command: up <service1> <service2>
@ -125,14 +125,33 @@ def start(ctx, stay_attached, skip_cluster_management, extra_args):
up_operation(ctx, services_list, stay_attached, skip_cluster_management)
# TODO: remove legacy up command since it's an alias for stop
@command.command()
@click.option(
"--delete-volumes/--preserve-volumes", default=False, help="delete data volumes"
)
@click.option(
"--skip-cluster-management/--perform-cluster-management",
default=False,
help="Skip cluster initialization (only for kind-k8s deployments)",
)
@click.pass_context
def prepare(ctx, skip_cluster_management):
"""Create cluster infrastructure without starting pods.
Sets up the kind cluster, namespace, PVs, PVCs, ConfigMaps, Services,
and Ingresses: everything that 'start' does EXCEPT creating the
Deployment resource. No pods will be scheduled.
Use 'start --skip-cluster-management' afterward to create the Deployment
and start pods when ready.
"""
ctx.obj = make_deploy_context(ctx)
prepare_operation(ctx, skip_cluster_management)
# TODO: remove legacy up command since it's an alias for stop
@command.command()
@click.option("--delete-volumes/--preserve-volumes", default=False, help="delete data volumes")
@click.option(
"--skip-cluster-management/--perform-cluster-management",
default=True,
help="Skip cluster initialization/tear-down (only for kind-k8s deployments)",
)
@click.argument("extra_args", nargs=-1) # help: command: down <service1> <service2>
@ -146,12 +165,10 @@ def down(ctx, delete_volumes, skip_cluster_management, extra_args):
# stop is the preferred alias for down
@command.command()
@click.option(
"--delete-volumes/--preserve-volumes", default=False, help="delete data volumes"
)
@click.option("--delete-volumes/--preserve-volumes", default=False, help="delete data volumes")
@click.option(
"--skip-cluster-management/--perform-cluster-management",
default=False,
default=True,
help="Skip cluster initialization/tear-down (only for kind-k8s deployments)",
)
@click.argument("extra_args", nargs=-1) # help: command: down <service1> <service2>
@ -210,11 +227,11 @@ def status(ctx):
status_operation(ctx)
@command.command()
@command.command(name="update-envs")
@click.pass_context
def update(ctx):
def update_envs(ctx):
ctx.obj = make_deploy_context(ctx)
update_operation(ctx)
update_envs_operation(ctx)
@command.command()
@ -234,9 +251,7 @@ def run_job(ctx, job_name, helm_release):
@command.command()
@click.option("--stack-path", help="Path to stack git repo (overrides stored path)")
@click.option(
"--spec-file", help="Path to GitOps spec.yml in repo (e.g., deployment/spec.yml)"
)
@click.option("--spec-file", help="Path to GitOps spec.yml in repo (e.g., deployment/spec.yml)")
@click.option("--config-file", help="Config file to pass to deploy init")
@click.option(
"--force",
@ -270,33 +285,27 @@ def restart(ctx, stack_path, spec_file, config_file, force, expected_ip):
commands.py on each restart. Use 'deploy init' only for initial
spec generation, then customize and commit to your operator repo.
"""
from stack_orchestrator.util import get_yaml, get_parsed_deployment_spec
from stack_orchestrator.deploy.deployment_create import create_operation
from stack_orchestrator.deploy.dns_probe import verify_dns_via_probe
from stack_orchestrator.util import get_parsed_deployment_spec, get_yaml
deployment_context: DeploymentContext = ctx.obj
# Get current spec info (before git pull)
current_spec = deployment_context.spec
current_http_proxy = current_spec.get_http_proxy()
current_hostname = (
current_http_proxy[0]["host-name"] if current_http_proxy else None
)
current_hostname = current_http_proxy[0]["host-name"] if current_http_proxy else None
# Resolve stack source path
if stack_path:
stack_source = Path(stack_path).resolve()
else:
# Try to get from deployment.yml
deployment_file = (
deployment_context.deployment_dir / constants.deployment_file_name
)
deployment_file = deployment_context.deployment_dir / constants.deployment_file_name
deployment_data = get_yaml().load(open(deployment_file))
stack_source_str = deployment_data.get("stack-source")
if not stack_source_str:
print(
"Error: No stack-source in deployment.yml and --stack-path not provided"
)
print("Error: No stack-source in deployment.yml and --stack-path not provided")
print("Use --stack-path to specify the stack git repository location")
sys.exit(1)
stack_source = Path(stack_source_str)
@ -312,9 +321,7 @@ def restart(ctx, stack_path, spec_file, config_file, force, expected_ip):
# Step 1: Git pull (brings in updated spec.yml from operator's repo)
print("\n[1/4] Pulling latest code from stack repository...")
git_result = subprocess.run(
["git", "pull"], cwd=stack_source, capture_output=True, text=True
)
git_result = subprocess.run(["git", "pull"], cwd=stack_source, capture_output=True, text=True)
if git_result.returncode != 0:
print(f"Git pull failed: {git_result.stderr}")
sys.exit(1)
@ -386,17 +393,13 @@ def restart(ctx, stack_path, spec_file, config_file, force, expected_ip):
# Stop deployment
print("\n[4/4] Restarting deployment...")
ctx.obj = make_deploy_context(ctx)
down_operation(
ctx, delete_volumes=False, extra_args_list=[], skip_cluster_management=True
)
down_operation(ctx, delete_volumes=False, extra_args_list=[], skip_cluster_management=True)
# Brief pause to ensure clean shutdown
time.sleep(5)
# Namespace deletion wait is handled by _ensure_namespace() in
# the deployer — no fixed sleep needed here.
# Start deployment
up_operation(
ctx, services_list=None, stay_attached=False, skip_cluster_management=True
)
up_operation(ctx, services_list=None, stay_attached=False, skip_cluster_management=True)
print("\n=== Restart Complete ===")
print("Deployment restarted with git-tracked configuration.")

View File

@ -18,9 +18,9 @@ import os
from pathlib import Path
from stack_orchestrator import constants
from stack_orchestrator.util import get_yaml
from stack_orchestrator.deploy.stack import Stack
from stack_orchestrator.deploy.spec import Spec
from stack_orchestrator.deploy.stack import Stack
from stack_orchestrator.util import get_yaml
class DeploymentContext:
@ -58,7 +58,7 @@ class DeploymentContext:
self.stack.init_from_file(self.get_stack_file())
deployment_file_path = self.get_deployment_file()
if deployment_file_path.exists():
obj = get_yaml().load(open(deployment_file_path, "r"))
obj = get_yaml().load(open(deployment_file_path))
self.id = obj[constants.cluster_id_key]
# Handle the case of a legacy deployment with no file
# Code below is intended to match the output from _make_default_cluster_name()
@ -75,7 +75,7 @@ class DeploymentContext:
raise ValueError(f"File is not inside deployment directory: {file_path}")
yaml = get_yaml()
with open(file_path, "r") as f:
with open(file_path) as f:
yaml_data = yaml.load(f)
modifier_func(yaml_data)

View File

@ -13,44 +13,44 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import click
from importlib import util
import base64
import filecmp
import json
import os
import re
import base64
from pathlib import Path
from typing import List, Optional
import random
from shutil import copy, copyfile, copytree, rmtree
from secrets import token_hex
import re
import sys
import filecmp
import tempfile
from importlib import util
from pathlib import Path
from secrets import token_hex
from shutil import copy, copyfile, copytree, rmtree
import click
from stack_orchestrator import constants
from stack_orchestrator.opts import opts
from stack_orchestrator.util import (
get_stack_path,
get_parsed_deployment_spec,
get_parsed_stack_config,
global_options,
get_yaml,
get_pod_list,
get_pod_file_path,
pod_has_scripts,
get_pod_script_paths,
get_plugin_code_paths,
error_exit,
env_var_map_from_file,
resolve_config_dir,
get_job_list,
get_job_file_path,
)
from stack_orchestrator.deploy.spec import Spec
from stack_orchestrator.deploy.deploy_types import LaconicStackSetupCommand
from stack_orchestrator.deploy.deployer_factory import getDeployerConfigGenerator
from stack_orchestrator.deploy.deployment_context import DeploymentContext
from stack_orchestrator.deploy.spec import Spec
from stack_orchestrator.opts import opts
from stack_orchestrator.util import (
env_var_map_from_file,
error_exit,
get_job_file_path,
get_job_list,
get_parsed_deployment_spec,
get_parsed_stack_config,
get_plugin_code_paths,
get_pod_file_path,
get_pod_list,
get_pod_script_paths,
get_stack_path,
get_yaml,
global_options,
pod_has_scripts,
resolve_config_dir,
)
def _make_default_deployment_dir():
@ -66,7 +66,7 @@ def _get_ports(stack):
pod_file_path = get_pod_file_path(stack, parsed_stack, pod)
if pod_file_path is None:
continue
parsed_pod_file = yaml.load(open(pod_file_path, "r"))
parsed_pod_file = yaml.load(open(pod_file_path))
if "services" in parsed_pod_file:
for svc_name, svc in parsed_pod_file["services"].items():
if "ports" in svc:
@ -102,7 +102,7 @@ def _get_named_volumes(stack):
pod_file_path = get_pod_file_path(stack, parsed_stack, pod)
if pod_file_path is None:
continue
parsed_pod_file = yaml.load(open(pod_file_path, "r"))
parsed_pod_file = yaml.load(open(pod_file_path))
if "volumes" in parsed_pod_file:
volumes = parsed_pod_file["volumes"]
for volume in volumes.keys():
@ -132,9 +132,7 @@ def _create_bind_dir_if_relative(volume, path_string, compose_dir):
absolute_path.mkdir(parents=True, exist_ok=True)
else:
if not path.exists():
print(
f"WARNING: mount path for volume {volume} does not exist: {path_string}"
)
print(f"WARNING: mount path for volume {volume} does not exist: {path_string}")
# See:
@ -151,9 +149,7 @@ def _fixup_pod_file(pod, spec, compose_dir):
volume_spec = spec_volumes[volume]
if volume_spec:
volume_spec_fixedup = (
volume_spec
if Path(volume_spec).is_absolute()
else f".{volume_spec}"
volume_spec if Path(volume_spec).is_absolute() else f".{volume_spec}"
)
_create_bind_dir_if_relative(volume, volume_spec, compose_dir)
# this is Docker specific
@ -328,10 +324,7 @@ def _get_mapped_ports(stack: str, map_recipe: str):
else:
print("Error: bad map_recipe")
else:
print(
f"Error: --map-ports-to-host must specify one of: "
f"{port_map_recipes}"
)
print(f"Error: --map-ports-to-host must specify one of: " f"{port_map_recipes}")
sys.exit(1)
return ports
@ -356,9 +349,7 @@ def _parse_config_variables(variable_values: str):
@click.command()
@click.option("--config", help="Provide config variables for the deployment")
@click.option(
"--config-file", help="Provide config variables in a file for the deployment"
)
@click.option("--config-file", help="Provide config variables in a file for the deployment")
@click.option("--kube-config", help="Provide a config file for a k8s deployment")
@click.option(
"--image-registry",
@@ -372,9 +363,7 @@ def _parse_config_variables(variable_values: str):
"localhost-same, any-same, localhost-fixed-random, any-fixed-random",
)
@click.pass_context
def init(
ctx, config, config_file, kube_config, image_registry, output, map_ports_to_host
):
def init(ctx, config, config_file, kube_config, image_registry, output, map_ports_to_host):
stack = global_options(ctx).stack
deployer_type = ctx.obj.deployer.type
deploy_command_context = ctx.obj
@@ -421,13 +410,9 @@ def init_operation(
else:
# Check for --kube-config supplied for non-relevant deployer types
if kube_config is not None:
error_exit(
f"--kube-config is not allowed with a {deployer_type} deployment"
)
error_exit(f"--kube-config is not allowed with a {deployer_type} deployment")
if image_registry is not None:
error_exit(
f"--image-registry is not allowed with a {deployer_type} deployment"
)
error_exit(f"--image-registry is not allowed with a {deployer_type} deployment")
if default_spec_file_content:
spec_file_content.update(default_spec_file_content)
config_variables = _parse_config_variables(config)
@@ -479,9 +464,7 @@ def init_operation(
spec_file_content["configmaps"] = configmap_descriptors
if opts.o.debug:
print(
f"Creating spec file for stack: {stack} with content: {spec_file_content}"
)
print(f"Creating spec file for stack: {stack} with content: {spec_file_content}")
with open(output, "w") as output_file:
get_yaml().dump(spec_file_content, output_file)
@@ -497,7 +480,8 @@ def _generate_and_store_secrets(config_vars: dict, deployment_name: str):
Called by `deploy create` - generates fresh secrets and stores them.
Returns the generated secrets dict for reference.
"""
from kubernetes import client, config as k8s_config
from kubernetes import client
from kubernetes import config as k8s_config
secrets = {}
for name, value in config_vars.items():
@@ -526,9 +510,7 @@ def _generate_and_store_secrets(config_vars: dict, deployment_name: str):
try:
k8s_config.load_incluster_config()
except Exception:
print(
"Warning: Could not load kube config, secrets will not be stored in K8s"
)
print("Warning: Could not load kube config, secrets will not be stored in K8s")
return secrets
v1 = client.CoreV1Api()
@@ -555,7 +537,7 @@ def _generate_and_store_secrets(config_vars: dict, deployment_name: str):
return secrets
def create_registry_secret(spec: Spec, deployment_name: str) -> Optional[str]:
def create_registry_secret(spec: Spec, deployment_name: str) -> str | None:
"""Create K8s docker-registry secret from spec + environment.
Reads registry configuration from spec.yml and creates a Kubernetes
@@ -568,7 +550,8 @@ def create_registry_secret(spec: Spec, deployment_name: str) -> Optional[str]:
Returns:
The secret name if created, None if no registry config
"""
from kubernetes import client, config as k8s_config
from kubernetes import client
from kubernetes import config as k8s_config
registry_config = spec.get_image_registry_config()
if not registry_config:
@@ -585,17 +568,12 @@ def create_registry_secret(spec: Spec, deployment_name: str) -> Optional[str]:
assert token_env is not None
token = os.environ.get(token_env)
if not token:
print(
f"Warning: Registry token env var '{token_env}' not set, "
"skipping registry secret"
)
print(f"Warning: Registry token env var '{token_env}' not set, " "skipping registry secret")
return None
# Create dockerconfigjson format (Docker API uses "password" field for tokens)
auth = base64.b64encode(f"{username}:{token}".encode()).decode()
docker_config = {
"auths": {server: {"username": username, "password": token, "auth": auth}}
}
docker_config = {"auths": {server: {"username": username, "password": token, "auth": auth}}}
# Secret name derived from deployment name
secret_name = f"{deployment_name}-registry"
@@ -615,11 +593,7 @@ def create_registry_secret(spec: Spec, deployment_name: str) -> Optional[str]:
k8s_secret = client.V1Secret(
metadata=client.V1ObjectMeta(name=secret_name),
data={
".dockerconfigjson": base64.b64encode(
json.dumps(docker_config).encode()
).decode()
},
data={".dockerconfigjson": base64.b64encode(json.dumps(docker_config).encode()).decode()},
type="kubernetes.io/dockerconfigjson",
)
@@ -636,17 +610,14 @@ def create_registry_secret(spec: Spec, deployment_name: str) -> Optional[str]:
return secret_name
def _write_config_file(
spec_file: Path, config_env_file: Path, deployment_name: Optional[str] = None
):
def _write_config_file(spec_file: Path, config_env_file: Path, deployment_name: str | None = None):
spec_content = get_parsed_deployment_spec(spec_file)
config_vars = spec_content.get("config", {}) or {}
# Generate and store secrets in K8s if deployment_name provided and tokens exist
if deployment_name and config_vars:
has_generate_tokens = any(
isinstance(v, str) and GENERATE_TOKEN_PATTERN.search(v)
for v in config_vars.values()
isinstance(v, str) and GENERATE_TOKEN_PATTERN.search(v) for v in config_vars.values()
)
if has_generate_tokens:
_generate_and_store_secrets(config_vars, deployment_name)
@@ -669,13 +640,13 @@ def _write_kube_config_file(external_path: Path, internal_path: Path):
copyfile(external_path, internal_path)
def _copy_files_to_directory(file_paths: List[Path], directory: Path):
def _copy_files_to_directory(file_paths: list[Path], directory: Path):
for path in file_paths:
# Using copy to preserve the execute bit
copy(path, os.path.join(directory, os.path.basename(path)))
def _create_deployment_file(deployment_dir: Path, stack_source: Optional[Path] = None):
def _create_deployment_file(deployment_dir: Path, stack_source: Path | None = None):
deployment_file_path = deployment_dir.joinpath(constants.deployment_file_name)
cluster = f"{constants.cluster_name_prefix}{token_hex(8)}"
deployment_content = {constants.cluster_id_key: cluster}
@@ -701,9 +672,7 @@ def _check_volume_definitions(spec):
@click.command()
@click.option(
"--spec-file", required=True, help="Spec file to use to create this deployment"
)
@click.option("--spec-file", required=True, help="Spec file to use to create this deployment")
@click.option("--deployment-dir", help="Create deployment files in this directory")
@click.option(
"--update",
@@ -757,9 +726,7 @@ def create_operation(
initial_peers=None,
extra_args=(),
):
parsed_spec = Spec(
os.path.abspath(spec_file), get_parsed_deployment_spec(spec_file)
)
parsed_spec = Spec(os.path.abspath(spec_file), get_parsed_deployment_spec(spec_file))
_check_volume_definitions(parsed_spec)
stack_name = parsed_spec["stack"]
deployment_type = parsed_spec[constants.deploy_to_key]
@@ -816,9 +783,7 @@ def create_operation(
# Exclude config file to preserve deployment settings
# (XXX breaks passing config vars from spec)
exclude_patterns = ["data", "data/*", constants.config_file_name]
_safe_copy_tree(
temp_dir, deployment_dir_path, exclude_patterns=exclude_patterns
)
_safe_copy_tree(temp_dir, deployment_dir_path, exclude_patterns=exclude_patterns)
finally:
# Clean up temp dir
rmtree(temp_dir)
@@ -841,18 +806,14 @@ def create_operation(
deployment_context = DeploymentContext()
deployment_context.init(deployment_dir_path)
# Call the deployer to generate any deployer-specific files (e.g. for kind)
deployer_config_generator = getDeployerConfigGenerator(
deployment_type, deployment_context
)
deployer_config_generator = getDeployerConfigGenerator(deployment_type, deployment_context)
# TODO: make deployment_dir_path a Path above
if deployer_config_generator is not None:
deployer_config_generator.generate(deployment_dir_path)
call_stack_deploy_create(
deployment_context, [network_dir, initial_peers, *extra_args]
)
call_stack_deploy_create(deployment_context, [network_dir, initial_peers, *extra_args])
def _safe_copy_tree(src: Path, dst: Path, exclude_patterns: Optional[List[str]] = None):
def _safe_copy_tree(src: Path, dst: Path, exclude_patterns: list[str] | None = None):
"""
Recursively copy a directory tree, backing up changed files with .bak suffix.
@@ -873,11 +834,7 @@ def _safe_copy_tree(src: Path, dst: Path, exclude_patterns: Optional[List[str]]
def safe_copy_file(src_file: Path, dst_file: Path):
"""Copy file, backing up destination if it differs."""
if (
dst_file.exists()
and not dst_file.is_dir()
and not filecmp.cmp(src_file, dst_file)
):
if dst_file.exists() and not dst_file.is_dir() and not filecmp.cmp(src_file, dst_file):
os.rename(dst_file, f"{dst_file}.bak")
copy(src_file, dst_file)
@@ -903,7 +860,7 @@ def _write_deployment_files(
stack_name: str,
deployment_type: str,
include_deployment_file: bool = True,
stack_source: Optional[Path] = None,
stack_source: Path | None = None,
):
"""
Write deployment files to target directory.
@@ -931,9 +888,7 @@ def _write_deployment_files(
# Use stack_name as deployment_name for K8s secret naming
# Extract just the name part if stack_name is a path ("path/to/stack" -> "stack")
deployment_name = Path(stack_name).name.replace("_", "-")
_write_config_file(
spec_file, target_dir.joinpath(constants.config_file_name), deployment_name
)
_write_config_file(spec_file, target_dir.joinpath(constants.config_file_name), deployment_name)
# Copy any k8s config file into the target dir
if deployment_type == "k8s":
@@ -954,7 +909,7 @@ def _write_deployment_files(
pod_file_path = get_pod_file_path(stack_name, parsed_stack, pod)
if pod_file_path is None:
continue
parsed_pod_file = yaml.load(open(pod_file_path, "r"))
parsed_pod_file = yaml.load(open(pod_file_path))
extra_config_dirs = _find_extra_config_dirs(parsed_pod_file, pod)
destination_pod_dir = destination_pods_dir.joinpath(pod)
os.makedirs(destination_pod_dir, exist_ok=True)
@ -962,7 +917,7 @@ def _write_deployment_files(
print(f"extra config dirs: {extra_config_dirs}")
_fixup_pod_file(parsed_pod_file, parsed_spec, destination_compose_dir)
with open(
destination_compose_dir.joinpath("docker-compose-%s.yml" % pod), "w"
destination_compose_dir.joinpath(f"docker-compose-{pod}.yml"), "w"
) as output_file:
yaml.dump(parsed_pod_file, output_file)
@@ -986,12 +941,8 @@ def _write_deployment_files(
for configmap in parsed_spec.get_configmaps():
source_config_dir = resolve_config_dir(stack_name, configmap)
if os.path.exists(source_config_dir):
destination_config_dir = target_dir.joinpath(
"configmaps", configmap
)
copytree(
source_config_dir, destination_config_dir, dirs_exist_ok=True
)
destination_config_dir = target_dir.joinpath("configmaps", configmap)
copytree(source_config_dir, destination_config_dir, dirs_exist_ok=True)
else:
# TODO:
# This is odd - looks up config dir that matches a volume name,
@@ -1022,12 +973,10 @@ def _write_deployment_files(
for job in jobs:
job_file_path = get_job_file_path(stack_name, parsed_stack, job)
if job_file_path and job_file_path.exists():
parsed_job_file = yaml.load(open(job_file_path, "r"))
parsed_job_file = yaml.load(open(job_file_path))
_fixup_pod_file(parsed_job_file, parsed_spec, destination_compose_dir)
with open(
destination_compose_jobs_dir.joinpath(
"docker-compose-%s.yml" % job
),
destination_compose_jobs_dir.joinpath(f"docker-compose-{job}.yml"),
"w",
) as output_file:
yaml.dump(parsed_job_file, output_file)
@@ -1042,18 +991,14 @@ def _write_deployment_files(
@click.option("--node-moniker", help="Moniker for this node")
@click.option("--chain-id", help="The new chain id")
@click.option("--key-name", help="Name for new node key")
@click.option(
"--gentx-files", help="List of comma-delimited gentx filenames from other nodes"
)
@click.option("--gentx-files", help="List of comma-delimited gentx filenames from other nodes")
@click.option(
"--gentx-addresses",
type=str,
help="List of comma-delimited validator addresses for other nodes",
)
@click.option("--genesis-file", help="Genesis file for the network")
@click.option(
"--initialize-network", is_flag=True, default=False, help="Initialize phase"
)
@click.option("--initialize-network", is_flag=True, default=False, help="Initialize phase")
@click.option("--join-network", is_flag=True, default=False, help="Join phase")
@click.option("--connect-network", is_flag=True, default=False, help="Connect phase")
@click.option("--create-network", is_flag=True, default=False, help="Create phase")
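The `.dockerconfigjson` payload assembled in `create_registry_secret` above follows the standard Docker config-file shape (token carried in the `password` field, plus a pre-encoded `auth`). A minimal standalone sketch — the helper name and example values here are hypothetical, not part of the diff:

```python
import base64
import json


def make_dockerconfigjson(server: str, username: str, token: str) -> str:
    # Docker's config format repeats the credentials: a "password" field
    # holding the token, plus "auth" = base64("username:token").
    auth = base64.b64encode(f"{username}:{token}".encode()).decode()
    docker_config = {
        "auths": {server: {"username": username, "password": token, "auth": auth}}
    }
    # K8s stores this JSON base64-encoded under the ".dockerconfigjson" key
    # of a kubernetes.io/dockerconfigjson secret.
    return base64.b64encode(json.dumps(docker_config).encode()).decode()


encoded = make_dockerconfigjson("registry.example.com", "ci-bot", "s3cret")
decoded = json.loads(base64.b64decode(encoded))
print(decoded["auths"]["registry.example.com"]["username"])  # → ci-bot
```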


@@ -6,7 +6,7 @@
import secrets
import socket
import time
from typing import Optional
import requests
from kubernetes import client
@@ -15,7 +15,8 @@ def get_server_egress_ip() -> str:
"""Get this server's public egress IP via ipify."""
response = requests.get("https://api.ipify.org", timeout=10)
response.raise_for_status()
return response.text.strip()
result: str = response.text.strip()
return result
def resolve_hostname(hostname: str) -> list[str]:
@@ -27,7 +28,7 @@ def resolve_hostname(hostname: str) -> list[str]:
return []
def verify_dns_simple(hostname: str, expected_ip: Optional[str] = None) -> bool:
def verify_dns_simple(hostname: str, expected_ip: str | None = None) -> bool:
"""Simple DNS verification - check hostname resolves to expected IP.
If expected_ip not provided, uses server's egress IP.
@@ -98,9 +99,7 @@ def delete_probe_ingress(namespace: str = "default"):
"""Delete the temporary probe ingress."""
networking_api = client.NetworkingV1Api()
try:
networking_api.delete_namespaced_ingress(
name="laconic-dns-probe", namespace=namespace
)
networking_api.delete_namespaced_ingress(name="laconic-dns-probe", namespace=namespace)
except client.exceptions.ApiException:
pass # Ignore if already deleted


@@ -13,15 +13,14 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http:#www.gnu.org/licenses/>.
from typing import Set
from python_on_whales import DockerClient
from stack_orchestrator import constants
from stack_orchestrator.opts import opts
from stack_orchestrator.deploy.deployment_context import DeploymentContext
from stack_orchestrator.deploy.deploy_types import DeployCommandContext
from stack_orchestrator.deploy.deploy_util import images_for_deployment
from stack_orchestrator.deploy.deployment_context import DeploymentContext
from stack_orchestrator.opts import opts
def _image_needs_pushed(image: str):
@@ -32,9 +31,7 @@ def _image_needs_pushed(image: str):
def _remote_tag_for_image(image: str, remote_repo_url: str):
# Turns image tags of the form: foo/bar:local into remote.repo/org/bar:deploy
major_parts = image.split("/", 2)
image_name_with_version = (
major_parts[1] if 2 == len(major_parts) else major_parts[0]
)
image_name_with_version = major_parts[1] if 2 == len(major_parts) else major_parts[0]
(image_name, image_version) = image_name_with_version.split(":")
if image_version == "local":
return f"{remote_repo_url}/{image_name}:deploy"
@@ -63,18 +60,14 @@ def add_tags_to_image(remote_repo_url: str, local_tag: str, *additional_tags):
docker = DockerClient()
remote_tag = _remote_tag_for_image(local_tag, remote_repo_url)
new_remote_tags = [
_remote_tag_for_image(tag, remote_repo_url) for tag in additional_tags
]
new_remote_tags = [_remote_tag_for_image(tag, remote_repo_url) for tag in additional_tags]
docker.buildx.imagetools.create(sources=[remote_tag], tags=new_remote_tags)
def remote_tag_for_image_unique(image: str, remote_repo_url: str, deployment_id: str):
# Turns image tags of the form: foo/bar:local into remote.repo/org/bar:deploy
major_parts = image.split("/", 2)
image_name_with_version = (
major_parts[1] if 2 == len(major_parts) else major_parts[0]
)
image_name_with_version = major_parts[1] if 2 == len(major_parts) else major_parts[0]
(image_name, image_version) = image_name_with_version.split(":")
if image_version == "local":
# Salt the tag with part of the deployment id to make it unique to this
@@ -91,24 +84,20 @@ def push_images_operation(
):
# Get the list of images for the stack
cluster_context = command_context.cluster_context
images: Set[str] = images_for_deployment(cluster_context.compose_files)
images: set[str] = images_for_deployment(cluster_context.compose_files)
# Tag the images for the remote repo
remote_repo_url = deployment_context.spec.obj[constants.image_registry_key]
docker = DockerClient()
for image in images:
if _image_needs_pushed(image):
remote_tag = remote_tag_for_image_unique(
image, remote_repo_url, deployment_context.id
)
remote_tag = remote_tag_for_image_unique(image, remote_repo_url, deployment_context.id)
if opts.o.verbose:
print(f"Tagging {image} to {remote_tag}")
docker.image.tag(image, remote_tag)
# Run docker push commands to upload
for image in images:
if _image_needs_pushed(image):
remote_tag = remote_tag_for_image_unique(
image, remote_repo_url, deployment_context.id
)
remote_tag = remote_tag_for_image_unique(image, remote_repo_url, deployment_context.id)
if opts.o.verbose:
print(f"Pushing image {remote_tag}")
docker.image.push(remote_tag)
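The tag-rewriting rule shared by `_remote_tag_for_image` and `remote_tag_for_image_unique` above can be sketched on its own. The non-`local` fallback branch is outside the visible hunks, so passing the image through unchanged is an assumption (the unique variant additionally salts the tag with part of the deployment id, per the diff):

```python
def remote_tag_for_image(image: str, remote_repo_url: str) -> str:
    # Turns image tags of the form foo/bar:local into remote.repo/org/bar:deploy,
    # mirroring the split("/", 2) logic shown in the diff.
    major_parts = image.split("/", 2)
    image_name_with_version = major_parts[1] if len(major_parts) == 2 else major_parts[0]
    image_name, image_version = image_name_with_version.split(":")
    if image_version == "local":
        return f"{remote_repo_url}/{image_name}:deploy"
    # Assumption: non-":local" tags fall outside this hunk; pass through as-is.
    return image


print(remote_tag_for_image("foo/bar:local", "remote.repo/org"))  # → remote.repo/org/bar:deploy
```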


@@ -13,33 +13,31 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http:#www.gnu.org/licenses/>.
import os
import base64
import os
from typing import Any
from kubernetes import client
from typing import Any, List, Optional, Set
from stack_orchestrator.opts import opts
from stack_orchestrator.util import env_var_map_from_file
from stack_orchestrator.deploy.deploy_types import DeployEnvVars
from stack_orchestrator.deploy.deploy_util import (
images_for_deployment,
parsed_pod_files_map_from_file_names,
)
from stack_orchestrator.deploy.images import remote_tag_for_image_unique
from stack_orchestrator.deploy.k8s.helpers import (
envs_from_compose_file,
envs_from_environment_variables_map,
get_kind_pv_bind_mount_path,
merge_envs,
named_volumes_from_pod_files,
translate_sidecar_service_names,
volume_mounts_for_service,
volumes_for_pod_files,
)
from stack_orchestrator.deploy.k8s.helpers import get_kind_pv_bind_mount_path
from stack_orchestrator.deploy.k8s.helpers import (
envs_from_environment_variables_map,
envs_from_compose_file,
merge_envs,
translate_sidecar_service_names,
)
from stack_orchestrator.deploy.deploy_util import (
parsed_pod_files_map_from_file_names,
images_for_deployment,
)
from stack_orchestrator.deploy.deploy_types import DeployEnvVars
from stack_orchestrator.deploy.spec import Spec, Resources, ResourceLimits
from stack_orchestrator.deploy.images import remote_tag_for_image_unique
from stack_orchestrator.deploy.spec import ResourceLimits, Resources, Spec
from stack_orchestrator.opts import opts
from stack_orchestrator.util import env_var_map_from_file
DEFAULT_VOLUME_RESOURCES = Resources({"reservations": {"storage": "2Gi"}})
@@ -52,7 +50,7 @@ DEFAULT_CONTAINER_RESOURCES = Resources(
def to_k8s_resource_requirements(resources: Resources) -> client.V1ResourceRequirements:
def to_dict(limits: Optional[ResourceLimits]):
def to_dict(limits: ResourceLimits | None):
if not limits:
return None
@@ -72,7 +70,7 @@ def to_k8s_resource_requirements(resources: Resources) -> client.V1ResourceRequi
class ClusterInfo:
parsed_pod_yaml_map: Any
image_set: Set[str] = set()
image_set: set[str] = set()
app_name: str
environment_variables: DeployEnvVars
spec: Spec
@@ -80,14 +78,12 @@ class ClusterInfo:
def __init__(self) -> None:
pass
def int(self, pod_files: List[str], compose_env_file, deployment_name, spec: Spec):
def int(self, pod_files: list[str], compose_env_file, deployment_name, spec: Spec):
self.parsed_pod_yaml_map = parsed_pod_files_map_from_file_names(pod_files)
# Find the set of images in the pods
self.image_set = images_for_deployment(pod_files)
# Filter out None values from env file
env_vars = {
k: v for k, v in env_var_map_from_file(compose_env_file).items() if v
}
env_vars = {k: v for k, v in env_var_map_from_file(compose_env_file).items() if v}
self.environment_variables = DeployEnvVars(env_vars)
self.app_name = deployment_name
self.spec = spec
@@ -124,8 +120,7 @@ class ClusterInfo:
service = client.V1Service(
metadata=client.V1ObjectMeta(
name=(
f"{self.app_name}-nodeport-"
f"{pod_port}-{protocol.lower()}"
f"{self.app_name}-nodeport-" f"{pod_port}-{protocol.lower()}"
),
labels={"app": self.app_name},
),
@@ -145,9 +140,7 @@ class ClusterInfo:
nodeports.append(service)
return nodeports
def get_ingress(
self, use_tls=False, certificate=None, cluster_issuer="letsencrypt-prod"
):
def get_ingress(self, use_tls=False, certificate=None, cluster_issuer="letsencrypt-prod"):
# No ingress for a deployment that has no http-proxy defined, for now
http_proxy_info_list = self.spec.get_http_proxy()
ingress = None
@@ -162,9 +155,7 @@ class ClusterInfo:
tls = (
[
client.V1IngressTLS(
hosts=certificate["spec"]["dnsNames"]
if certificate
else [host_name],
hosts=certificate["spec"]["dnsNames"] if certificate else [host_name],
secret_name=certificate["spec"]["secretName"]
if certificate
else f"{self.app_name}-tls",
@@ -237,8 +228,7 @@ class ClusterInfo:
return None
service_ports = [
client.V1ServicePort(port=p, target_port=p, name=f"port-{p}")
for p in sorted(ports_set)
client.V1ServicePort(port=p, target_port=p, name=f"port-{p}") for p in sorted(ports_set)
]
service = client.V1Service(
@@ -290,9 +280,7 @@ class ClusterInfo:
volume_name=k8s_volume_name,
)
pvc = client.V1PersistentVolumeClaim(
metadata=client.V1ObjectMeta(
name=f"{self.app_name}-{volume_name}", labels=labels
),
metadata=client.V1ObjectMeta(name=f"{self.app_name}-{volume_name}", labels=labels),
spec=spec,
)
result.append(pvc)
@@ -309,9 +297,7 @@ class ClusterInfo:
continue
if not cfg_map_path.startswith("/") and self.spec.file_path is not None:
cfg_map_path = os.path.join(
os.path.dirname(str(self.spec.file_path)), cfg_map_path
)
cfg_map_path = os.path.join(os.path.dirname(str(self.spec.file_path)), cfg_map_path)
# Read in all the files at a single-level of the directory.
# This mimics the behavior of
@@ -320,9 +306,7 @@ class ClusterInfo:
for f in os.listdir(cfg_map_path):
full_path = os.path.join(cfg_map_path, f)
if os.path.isfile(full_path):
data[f] = base64.b64encode(open(full_path, "rb").read()).decode(
"ASCII"
)
data[f] = base64.b64encode(open(full_path, "rb").read()).decode("ASCII")
spec = client.V1ConfigMap(
metadata=client.V1ObjectMeta(
@@ -425,7 +409,7 @@ class ClusterInfo:
return global_resources
# TODO: put things like image pull policy into an object-scope struct
def get_deployment(self, image_pull_policy: Optional[str] = None):
def get_deployment(self, image_pull_policy: str | None = None):
containers = []
services = {}
global_resources = self.spec.get_container_resources()
@@ -453,9 +437,7 @@ class ClusterInfo:
port_str = port_str.split(":")[-1]
port = int(port_str)
container_ports.append(
client.V1ContainerPort(
container_port=port, protocol=protocol
)
client.V1ContainerPort(container_port=port, protocol=protocol)
)
if opts.o.debug:
print(f"image: {image}")
@@ -473,9 +455,7 @@ class ClusterInfo:
# Translate docker-compose service names to localhost for sidecars
# All services in the same pod share the network namespace
sibling_services = [s for s in services.keys() if s != service_name]
merged_envs = translate_sidecar_service_names(
merged_envs, sibling_services
)
merged_envs = translate_sidecar_service_names(merged_envs, sibling_services)
envs = envs_from_environment_variables_map(merged_envs)
if opts.o.debug:
print(f"Merged envs: {envs}")
@@ -488,18 +468,14 @@ class ClusterInfo:
if self.spec.get_image_registry() is not None
else image
)
volume_mounts = volume_mounts_for_service(
self.parsed_pod_yaml_map, service_name
)
volume_mounts = volume_mounts_for_service(self.parsed_pod_yaml_map, service_name)
# Handle command/entrypoint from compose file
# In docker-compose: entrypoint -> k8s command, command -> k8s args
container_command = None
container_args = None
if "entrypoint" in service_info:
entrypoint = service_info["entrypoint"]
container_command = (
entrypoint if isinstance(entrypoint, list) else [entrypoint]
)
container_command = entrypoint if isinstance(entrypoint, list) else [entrypoint]
if "command" in service_info:
cmd = service_info["command"]
container_args = cmd if isinstance(cmd, list) else cmd.split()
@@ -528,18 +504,14 @@ class ClusterInfo:
volume_mounts=volume_mounts,
security_context=client.V1SecurityContext(
privileged=self.spec.get_privileged(),
capabilities=client.V1Capabilities(
add=self.spec.get_capabilities()
)
capabilities=client.V1Capabilities(add=self.spec.get_capabilities())
if self.spec.get_capabilities()
else None,
),
resources=to_k8s_resource_requirements(container_resources),
)
containers.append(container)
volumes = volumes_for_pod_files(
self.parsed_pod_yaml_map, self.spec, self.app_name
)
volumes = volumes_for_pod_files(self.parsed_pod_yaml_map, self.spec, self.app_name)
registry_config = self.spec.get_image_registry_config()
if registry_config:
secret_name = f"{self.app_name}-registry"
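The compose-to-K8s command mapping in `get_deployment` above (compose `entrypoint` becomes the container `command`, compose `command` becomes `args`, each normalized from string to list) reduces to a small pure function. A sketch — the function name is illustrative, and note the diff uses plain `str.split()`, not shell-aware splitting:

```python
def compose_command_to_k8s(service_info: dict):
    # docker-compose "entrypoint" -> K8s container "command";
    # docker-compose "command"    -> K8s container "args".
    container_command = None
    container_args = None
    if "entrypoint" in service_info:
        entrypoint = service_info["entrypoint"]
        container_command = entrypoint if isinstance(entrypoint, list) else [entrypoint]
    if "command" in service_info:
        cmd = service_info["command"]
        # A string is whitespace-split (str.split, not shlex), as in the diff.
        container_args = cmd if isinstance(cmd, list) else cmd.split()
    return container_command, container_args


print(compose_command_to_k8s({"entrypoint": "/entry.sh", "command": "serve --port 8080"}))
# → (['/entry.sh'], ['serve', '--port', '8080'])
```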


@@ -12,43 +12,41 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http:#www.gnu.org/licenses/>.
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, cast
from kubernetes import client, config
from kubernetes.client.exceptions import ApiException
from typing import Any, Dict, List, Optional, cast
from stack_orchestrator import constants
from stack_orchestrator.deploy.deployer import Deployer, DeployerConfigGenerator
from stack_orchestrator.deploy.deployment_context import DeploymentContext
from stack_orchestrator.deploy.k8s.cluster_info import ClusterInfo
from stack_orchestrator.deploy.k8s.helpers import (
connect_registry_to_kind_network,
containers_in_pod,
create_cluster,
destroy_cluster,
load_images_into_kind,
)
from stack_orchestrator.deploy.k8s.helpers import (
install_ingress_for_kind,
wait_for_ingress_in_kind,
is_ingress_running,
)
from stack_orchestrator.deploy.k8s.helpers import (
pods_in_deployment,
containers_in_pod,
log_stream_from_string,
)
from stack_orchestrator.deploy.k8s.helpers import (
generate_kind_config,
ensure_local_registry,
generate_high_memlock_spec_json,
generate_kind_config,
install_ingress_for_kind,
is_ingress_running,
local_registry_image,
log_stream_from_string,
pods_in_deployment,
push_images_to_local_registry,
wait_for_ingress_in_kind,
)
from stack_orchestrator.deploy.k8s.cluster_info import ClusterInfo
from stack_orchestrator.opts import opts
from stack_orchestrator.deploy.deployment_context import DeploymentContext
from stack_orchestrator.util import error_exit
class AttrDict(dict):
def __init__(self, *args, **kwargs):
super(AttrDict, self).__init__(*args, **kwargs)
super().__init__(*args, **kwargs)
self.__dict__ = self
@@ -143,9 +141,7 @@ class K8sDeployer(Deployer):
else:
# Get the config file and pass to load_kube_config()
config.load_kube_config(
config_file=self.deployment_dir.joinpath(
constants.kube_config_filename
).as_posix()
config_file=self.deployment_dir.joinpath(constants.kube_config_filename).as_posix()
)
self.core_api = client.CoreV1Api()
self.networking_api = client.NetworkingV1Api()
@@ -153,10 +149,20 @@ class K8sDeployer(Deployer):
self.custom_obj_api = client.CustomObjectsApi()
def _ensure_namespace(self):
"""Create the deployment namespace if it doesn't exist."""
"""Create the deployment namespace if it doesn't exist.
If the namespace exists but is terminating (e.g., from a prior
down() call), wait for deletion to complete before creating a
fresh namespace. K8s rejects resource creation in a terminating
namespace with 403 Forbidden, so proceeding without waiting
causes PVC/ConfigMap creation failures.
"""
if opts.o.dry_run:
print(f"Dry run: would create namespace {self.k8s_namespace}")
return
self._wait_for_namespace_deletion()
try:
self.core_api.read_namespace(name=self.k8s_namespace)
if opts.o.debug:
@@ -176,6 +182,35 @@ class K8sDeployer(Deployer):
else:
raise
def _wait_for_namespace_deletion(self):
"""Block until the namespace is fully deleted, if it is terminating.
Polls every 2s for up to 120s. If the namespace does not exist
(404) or is active, returns immediately.
"""
deadline = time.monotonic() + 120
while True:
try:
ns = self.core_api.read_namespace(name=self.k8s_namespace)
except ApiException as e:
if e.status == 404:
return # Gone — ready to create
raise
phase = ns.status.phase if ns.status else None
if phase != "Terminating":
return # Active or unknown — proceed
if time.monotonic() > deadline:
error_exit(
f"Namespace {self.k8s_namespace} still terminating "
f"after 120s — cannot proceed"
)
if opts.o.debug:
print(f"Namespace {self.k8s_namespace} is terminating, " f"waiting for deletion...")
time.sleep(2)
def _delete_namespace(self):
"""Delete the deployment namespace and all resources within it."""
if opts.o.dry_run:
@@ -192,6 +227,152 @@ class K8sDeployer(Deployer):
else:
raise
def _ensure_config_map(self, cfg_map):
"""Create or replace a ConfigMap (idempotent)."""
try:
resp = self.core_api.create_namespaced_config_map(
body=cfg_map, namespace=self.k8s_namespace
)
if opts.o.debug:
print(f"ConfigMap created: {resp}")
except ApiException as e:
if e.status == 409:
existing = self.core_api.read_namespaced_config_map(
name=cfg_map.metadata.name, namespace=self.k8s_namespace
)
cfg_map.metadata.resource_version = existing.metadata.resource_version
resp = self.core_api.replace_namespaced_config_map(
name=cfg_map.metadata.name,
namespace=self.k8s_namespace,
body=cfg_map,
)
if opts.o.debug:
print(f"ConfigMap updated: {resp}")
else:
raise
def _ensure_deployment(self, deployment):
"""Create or replace a Deployment (idempotent)."""
try:
resp = cast(
client.V1Deployment,
self.apps_api.create_namespaced_deployment(
body=deployment, namespace=self.k8s_namespace
),
)
if opts.o.debug:
print("Deployment created:")
except ApiException as e:
if e.status == 409:
existing = self.apps_api.read_namespaced_deployment(
name=deployment.metadata.name,
namespace=self.k8s_namespace,
)
deployment.metadata.resource_version = existing.metadata.resource_version
resp = cast(
client.V1Deployment,
self.apps_api.replace_namespaced_deployment(
name=deployment.metadata.name,
namespace=self.k8s_namespace,
body=deployment,
),
)
if opts.o.debug:
print("Deployment updated:")
else:
raise
if opts.o.debug:
meta = resp.metadata
spec = resp.spec
if meta and spec and spec.template.spec:
containers = spec.template.spec.containers
img = containers[0].image if containers else None
print(f"{meta.namespace} {meta.name} {meta.generation} {img}")
def _ensure_service(self, service, kind: str = "Service"):
"""Create or replace a Service (idempotent).
Services have immutable fields (spec.clusterIP) that must be
preserved from the existing object on replace.
"""
try:
resp = self.core_api.create_namespaced_service(
namespace=self.k8s_namespace, body=service
)
if opts.o.debug:
print(f"{kind} created: {resp}")
except ApiException as e:
if e.status == 409:
existing = self.core_api.read_namespaced_service(
name=service.metadata.name, namespace=self.k8s_namespace
)
service.metadata.resource_version = existing.metadata.resource_version
if existing.spec.cluster_ip:
service.spec.cluster_ip = existing.spec.cluster_ip
resp = self.core_api.replace_namespaced_service(
name=service.metadata.name,
namespace=self.k8s_namespace,
body=service,
)
if opts.o.debug:
print(f"{kind} updated: {resp}")
else:
raise
def _ensure_ingress(self, ingress):
"""Create or replace an Ingress (idempotent)."""
try:
resp = self.networking_api.create_namespaced_ingress(
namespace=self.k8s_namespace, body=ingress
)
if opts.o.debug:
print(f"Ingress created: {resp}")
except ApiException as e:
if e.status == 409:
existing = self.networking_api.read_namespaced_ingress(
name=ingress.metadata.name, namespace=self.k8s_namespace
)
ingress.metadata.resource_version = existing.metadata.resource_version
resp = self.networking_api.replace_namespaced_ingress(
name=ingress.metadata.name,
namespace=self.k8s_namespace,
body=ingress,
)
if opts.o.debug:
print(f"Ingress updated: {resp}")
else:
raise
def _clear_released_pv_claim_refs(self):
"""Patch any Released PVs for this deployment to clear stale claimRefs.
After a namespace is deleted, PVCs are cascade-deleted but
cluster-scoped PVs survive in Released state with claimRefs
pointing to the now-deleted PVC UIDs. New PVCs cannot bind
to these PVs until the stale claimRef is removed.
"""
try:
pvs = self.core_api.list_persistent_volume(
label_selector=f"app={self.cluster_info.app_name}"
)
except ApiException:
return
for pv in pvs.items:
phase = pv.status.phase if pv.status else None
if phase == "Released" and pv.spec and pv.spec.claim_ref:
pv_name = pv.metadata.name
if opts.o.debug:
old_ref = pv.spec.claim_ref
print(
f"Clearing stale claimRef on PV {pv_name} "
f"(was {old_ref.namespace}/{old_ref.name})"
)
self.core_api.patch_persistent_volume(
name=pv_name,
body={"spec": {"claimRef": None}},
)
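The selection logic in `_clear_released_pv_claim_refs` reduces to a small predicate. A minimal sketch with stand-in dicts in place of kubernetes client objects (the helper name is illustrative, not part of the module):

```python
def needs_claim_ref_clear(pv: dict) -> bool:
    # A PV is patched only when it is Released AND still carries a stale claimRef
    return (
        pv.get("status", {}).get("phase") == "Released"
        and pv.get("spec", {}).get("claimRef") is not None
    )


released = {"status": {"phase": "Released"}, "spec": {"claimRef": {"namespace": "ns", "name": "pvc"}}}
bound = {"status": {"phase": "Bound"}, "spec": {"claimRef": {"namespace": "ns", "name": "pvc"}}}
print(needs_claim_ref_clear(released), needs_claim_ref_clear(bound))
# -> True False
```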
def _create_volume_data(self):
# Create the host-path-mounted PVs for this deployment
pvs = self.cluster_info.get_pvs()
@ -200,24 +381,29 @@ class K8sDeployer(Deployer):
print(f"Sending this pv: {pv}")
if not opts.o.dry_run:
try:
pv_resp = self.core_api.read_persistent_volume(
name=pv.metadata.name
)
pv_resp = self.core_api.read_persistent_volume(name=pv.metadata.name)
if pv_resp:
if opts.o.debug:
print("PVs already present:")
print(f"{pv_resp}")
continue
except: # noqa: E722
pass
except ApiException as e:
if e.status != 404:
raise
pv_resp = self.core_api.create_persistent_volume(body=pv)
if opts.o.debug:
print("PVs created:")
print(f"{pv_resp}")
# After PV creation/verification, clear stale claimRefs on any
# Released PVs so that new PVCs can bind to them.
if not opts.o.dry_run:
self._clear_released_pv_claim_refs()
# Figure out the PVCs for this deployment
pvcs = self.cluster_info.get_pvcs()
pvc_errors = []
for pvc in pvcs:
if opts.o.debug:
print(f"Sending this pvc: {pvc}")
@ -232,15 +418,27 @@ class K8sDeployer(Deployer):
print("PVCs already present:")
print(f"{pvc_resp}")
continue
except: # noqa: E722
pass
except ApiException as e:
if e.status != 404:
raise
pvc_resp = self.core_api.create_namespaced_persistent_volume_claim(
body=pvc, namespace=self.k8s_namespace
)
if opts.o.debug:
print("PVCs created:")
print(f"{pvc_resp}")
try:
pvc_resp = self.core_api.create_namespaced_persistent_volume_claim(
body=pvc, namespace=self.k8s_namespace
)
if opts.o.debug:
print("PVCs created:")
print(f"{pvc_resp}")
except ApiException as e:
pvc_name = pvc.metadata.name
print(f"Error creating PVC {pvc_name}: {e.reason}")
pvc_errors.append(pvc_name)
if pvc_errors:
error_exit(
f"Failed to create PVCs: {', '.join(pvc_errors)}. "
f"Check namespace state and PV availability."
)
# Figure out the ConfigMaps for this deployment
config_maps = self.cluster_info.get_configmaps()
@ -248,50 +446,35 @@ class K8sDeployer(Deployer):
if opts.o.debug:
print(f"Sending this ConfigMap: {cfg_map}")
if not opts.o.dry_run:
cfg_rsp = self.core_api.create_namespaced_config_map(
body=cfg_map, namespace=self.k8s_namespace
)
if opts.o.debug:
print("ConfigMap created:")
print(f"{cfg_rsp}")
self._ensure_config_map(cfg_map)
def _create_deployment(self):
# Process compose files into a Deployment
"""Create the k8s Deployment resource (which starts pods)."""
deployment = self.cluster_info.get_deployment(
image_pull_policy=None if self.is_kind() else "Always"
)
# Create the k8s objects
if self.is_kind():
self._rewrite_local_images(deployment)
if opts.o.debug:
print(f"Sending this deployment: {deployment}")
if not opts.o.dry_run:
deployment_resp = cast(
client.V1Deployment,
self.apps_api.create_namespaced_deployment(
body=deployment, namespace=self.k8s_namespace
),
)
if opts.o.debug:
print("Deployment created:")
meta = deployment_resp.metadata
spec = deployment_resp.spec
if meta and spec and spec.template.spec:
ns = meta.namespace
name = meta.name
gen = meta.generation
containers = spec.template.spec.containers
img = containers[0].image if containers else None
print(f"{ns} {name} {gen} {img}")
self._ensure_deployment(deployment)
service = self.cluster_info.get_service()
if opts.o.debug:
print(f"Sending this service: {service}")
if service and not opts.o.dry_run:
service_resp = self.core_api.create_namespaced_service(
namespace=self.k8s_namespace, body=service
)
if opts.o.debug:
print("Service created:")
print(f"{service_resp}")
def _rewrite_local_images(self, deployment):
"""Rewrite local container images to use the local registry.
Images built locally (listed in stack.yml containers) are pushed to
localhost:5001 by push_images_to_local_registry(). The k8s pod spec
must reference them at that address so containerd pulls from the
local registry instead of trying to find them in its local store.
"""
local_containers = self.deployment_context.stack.obj.get("containers", [])
if not local_containers:
return
containers = deployment.spec.template.spec.containers or []
for container in containers:
if any(c in container.image for c in local_containers):
container.image = local_registry_image(container.image)
def _find_certificate_for_host_name(self, host_name):
all_certificates = self.custom_obj_api.list_namespaced_custom_object(
@ -323,108 +506,109 @@ class K8sDeployer(Deployer):
if before < now < after:
# Check the status is Ready
for condition in status.get("conditions", []):
if "True" == condition.get(
"status"
) and "Ready" == condition.get("type"):
if "True" == condition.get("status") and "Ready" == condition.get(
"type"
):
return cert
return None
def up(self, detach, skip_cluster_management, services):
self._setup_cluster_and_namespace(skip_cluster_management)
self._create_infrastructure()
self._create_deployment()
def _setup_cluster_and_namespace(self, skip_cluster_management):
"""Create kind cluster (if needed) and namespace.
Shared by up() and prepare().
"""
self.skip_cluster_management = skip_cluster_management
if not opts.o.dry_run:
if self.is_kind() and not self.skip_cluster_management:
# Create the kind cluster (or reuse existing one)
kind_config = str(
self.deployment_dir.joinpath(constants.kind_config_filename)
)
ensure_local_registry()
kind_config = str(self.deployment_dir.joinpath(constants.kind_config_filename))
actual_cluster = create_cluster(self.kind_cluster_name, kind_config)
if actual_cluster != self.kind_cluster_name:
# An existing cluster was found, use it instead
self.kind_cluster_name = actual_cluster
# Only load locally-built images into kind
# Registry images (docker.io, ghcr.io, etc.) will be pulled by k8s
local_containers = self.deployment_context.stack.obj.get(
"containers", []
)
connect_registry_to_kind_network(self.kind_cluster_name)
local_containers = self.deployment_context.stack.obj.get("containers", [])
if local_containers:
# Filter image_set to only images matching local containers
local_images = {
img
for img in self.cluster_info.image_set
if any(c in img for c in local_containers)
}
if local_images:
load_images_into_kind(self.kind_cluster_name, local_images)
# Note: if no local containers defined, all images come from registries
push_images_to_local_registry(local_images)
self.connect_api()
# Create deployment-specific namespace for resource isolation
self._ensure_namespace()
if self.is_kind() and not self.skip_cluster_management:
# Configure ingress controller (not installed by default in kind)
# Skip if already running (idempotent for shared cluster)
if not is_ingress_running():
install_ingress_for_kind(self.cluster_info.spec.get_acme_email())
# Wait for ingress to start
# (deployment provisioning will fail unless this is done)
wait_for_ingress_in_kind()
# Create RuntimeClass if unlimited_memlock is enabled
if self.cluster_info.spec.get_unlimited_memlock():
_create_runtime_class(
constants.high_memlock_runtime,
constants.high_memlock_runtime,
)
else:
print("Dry run mode enabled, skipping k8s API connect")
# Create registry secret if configured
def _create_infrastructure(self):
"""Create PVs, PVCs, ConfigMaps, Services, Ingresses, NodePorts.
Everything except the Deployment resource (which starts pods).
Shared by up() and prepare().
"""
from stack_orchestrator.deploy.deployment_create import create_registry_secret
create_registry_secret(self.cluster_info.spec, self.cluster_info.app_name)
self._create_volume_data()
self._create_deployment()
# Create the ClusterIP service (paired with the deployment)
service = self.cluster_info.get_service()
if service and not opts.o.dry_run:
if opts.o.debug:
print(f"Sending this service: {service}")
self._ensure_service(service)
http_proxy_info = self.cluster_info.spec.get_http_proxy()
# Note: we don't support tls for kind (enabling tls causes errors)
use_tls = http_proxy_info and not self.is_kind()
certificate = (
self._find_certificate_for_host_name(http_proxy_info[0]["host-name"])
if use_tls
else None
)
if opts.o.debug:
if certificate:
print(f"Using existing certificate: {certificate}")
if opts.o.debug and certificate:
print(f"Using existing certificate: {certificate}")
ingress = self.cluster_info.get_ingress(
use_tls=use_tls, certificate=certificate
)
ingress = self.cluster_info.get_ingress(use_tls=use_tls, certificate=certificate)
if ingress:
if opts.o.debug:
print(f"Sending this ingress: {ingress}")
if not opts.o.dry_run:
ingress_resp = self.networking_api.create_namespaced_ingress(
namespace=self.k8s_namespace, body=ingress
)
if opts.o.debug:
print("Ingress created:")
print(f"{ingress_resp}")
else:
if opts.o.debug:
print("No ingress configured")
self._ensure_ingress(ingress)
elif opts.o.debug:
print("No ingress configured")
nodeports: List[client.V1Service] = self.cluster_info.get_nodeports()
nodeports: list[client.V1Service] = self.cluster_info.get_nodeports()
for nodeport in nodeports:
if opts.o.debug:
print(f"Sending this nodeport: {nodeport}")
if not opts.o.dry_run:
nodeport_resp = self.core_api.create_namespaced_service(
namespace=self.k8s_namespace, body=nodeport
)
if opts.o.debug:
print("NodePort created:")
print(f"{nodeport_resp}")
self._ensure_service(nodeport, kind="NodePort")
def prepare(self, skip_cluster_management):
"""Create cluster infrastructure without starting pods.
Sets up kind cluster, namespace, PVs, PVCs, ConfigMaps, Services,
Ingresses, and NodePorts: everything that up() does EXCEPT creating
the Deployment resource.
"""
self._setup_cluster_and_namespace(skip_cluster_management)
self._create_infrastructure()
print("Cluster infrastructure prepared (no pods started).")
def down(self, timeout, volumes, skip_cluster_management):
self.skip_cluster_management = skip_cluster_management
@ -488,7 +672,7 @@ class K8sDeployer(Deployer):
return
cert = cast(
Dict[str, Any],
dict[str, Any],
self.custom_obj_api.get_namespaced_custom_object(
group="cert-manager.io",
version="v1",
@ -504,7 +688,7 @@ class K8sDeployer(Deployer):
if lb_ingress:
ip = lb_ingress[0].ip or "?"
cert_status = cert.get("status", {})
tls = "notBefore: %s; notAfter: %s; names: %s" % (
tls = "notBefore: {}; notAfter: {}; names: {}".format(
cert_status.get("notBefore", "?"),
cert_status.get("notAfter", "?"),
ingress.spec.tls[0].hosts,
@ -545,9 +729,7 @@ class K8sDeployer(Deployer):
if c.ports:
for prt in c.ports:
ports[str(prt.container_port)] = [
AttrDict(
{"HostIp": pod_ip, "HostPort": prt.container_port}
)
AttrDict({"HostIp": pod_ip, "HostPort": prt.container_port})
]
ret.append(
@ -598,7 +780,7 @@ class K8sDeployer(Deployer):
log_data = "******* No logs available ********\n"
return log_stream_from_string(log_data)
def update(self):
def update_envs(self):
self.connect_api()
ref_deployment = self.cluster_info.get_deployment()
if not ref_deployment or not ref_deployment.metadata:
@ -609,9 +791,7 @@ class K8sDeployer(Deployer):
deployment = cast(
client.V1Deployment,
self.apps_api.read_namespaced_deployment(
name=ref_name, namespace=self.k8s_namespace
),
self.apps_api.read_namespaced_deployment(name=ref_name, namespace=self.k8s_namespace),
)
if not deployment.spec or not deployment.spec.template:
return
@ -650,14 +830,14 @@ class K8sDeployer(Deployer):
user=None,
volumes=None,
entrypoint=None,
env={},
ports=[],
env=None,
ports=None,
detach=False,
):
# We need to figure out how to do this -- check why we're being called first
pass
def run_job(self, job_name: str, helm_release: Optional[str] = None):
def run_job(self, job_name: str, helm_release: str | None = None):
if not opts.o.dry_run:
from stack_orchestrator.deploy.k8s.helm.job_runner import run_helm_job
@ -699,13 +879,9 @@ class K8sDeployerConfigGenerator(DeployerConfigGenerator):
# Must be done before generate_kind_config() which references it.
if self.deployment_context.spec.get_unlimited_memlock():
spec_content = generate_high_memlock_spec_json()
spec_file = deployment_dir.joinpath(
constants.high_memlock_spec_filename
)
spec_file = deployment_dir.joinpath(constants.high_memlock_spec_filename)
if opts.o.debug:
print(
f"Creating high-memlock spec for unlimited memlock: {spec_file}"
)
print(f"Creating high-memlock spec for unlimited memlock: {spec_file}")
with open(spec_file, "w") as output_file:
output_file.write(spec_content)

View File

@ -16,21 +16,21 @@
from pathlib import Path
from stack_orchestrator import constants
from stack_orchestrator.opts import opts
from stack_orchestrator.util import (
get_parsed_stack_config,
get_pod_list,
get_pod_file_path,
get_job_list,
get_job_file_path,
error_exit,
)
from stack_orchestrator.deploy.k8s.helm.kompose_wrapper import (
check_kompose_available,
get_kompose_version,
convert_to_helm_chart,
get_kompose_version,
)
from stack_orchestrator.opts import opts
from stack_orchestrator.util import (
error_exit,
get_job_file_path,
get_job_list,
get_parsed_stack_config,
get_pod_file_path,
get_pod_list,
get_yaml,
)
from stack_orchestrator.util import get_yaml
def _wrap_job_templates_with_conditionals(chart_dir: Path, jobs: list) -> None:
@ -88,7 +88,7 @@ def _post_process_chart(chart_dir: Path, chart_name: str, jobs: list) -> None:
# Fix Chart.yaml
chart_yaml_path = chart_dir / "Chart.yaml"
if chart_yaml_path.exists():
chart_yaml = yaml.load(open(chart_yaml_path, "r"))
chart_yaml = yaml.load(open(chart_yaml_path))
# Fix name
chart_yaml["name"] = chart_name
@ -108,9 +108,7 @@ def _post_process_chart(chart_dir: Path, chart_name: str, jobs: list) -> None:
_wrap_job_templates_with_conditionals(chart_dir, jobs)
def generate_helm_chart(
stack_path: str, spec_file: str, deployment_dir_path: Path
) -> None:
def generate_helm_chart(stack_path: str, spec_file: str, deployment_dir_path: Path) -> None:
"""
Generate a self-sufficient Helm chart from stack compose files using Kompose.
@ -152,7 +150,7 @@ def generate_helm_chart(
error_exit(f"Deployment file not found: {deployment_file}")
yaml = get_yaml()
deployment_config = yaml.load(open(deployment_file, "r"))
deployment_config = yaml.load(open(deployment_file))
cluster_id = deployment_config.get(constants.cluster_id_key)
if not cluster_id:
error_exit(f"cluster-id not found in {deployment_file}")
@ -219,10 +217,7 @@ def generate_helm_chart(
# 5. Create chart directory and invoke Kompose
chart_dir = deployment_dir_path / "chart"
print(
f"Converting {len(compose_files)} compose file(s) to Helm chart "
"using Kompose..."
)
print(f"Converting {len(compose_files)} compose file(s) to Helm chart using Kompose...")
try:
output = convert_to_helm_chart(
@ -304,9 +299,7 @@ Edit the generated template files in `templates/` to customize:
# Count generated files
template_files = (
list((chart_dir / "templates").glob("*.yaml"))
if (chart_dir / "templates").exists()
else []
list((chart_dir / "templates").glob("*.yaml")) if (chart_dir / "templates").exists() else []
)
print(f" Files: {len(template_files)} template(s) generated")

View File

@ -13,12 +13,12 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import json
import os
import subprocess
import tempfile
import os
import json
from pathlib import Path
from typing import Optional
from stack_orchestrator.util import get_yaml
@ -40,18 +40,19 @@ def get_release_name_from_chart(chart_dir: Path) -> str:
raise Exception(f"Chart.yaml not found: {chart_yaml_path}")
yaml = get_yaml()
chart_yaml = yaml.load(open(chart_yaml_path, "r"))
chart_yaml = yaml.load(open(chart_yaml_path))
if "name" not in chart_yaml:
raise Exception(f"Chart name not found in {chart_yaml_path}")
return chart_yaml["name"]
name: str = chart_yaml["name"]
return name
def run_helm_job(
chart_dir: Path,
job_name: str,
release: Optional[str] = None,
release: str | None = None,
namespace: str = "default",
timeout: int = 600,
verbose: bool = False,
@ -94,9 +95,7 @@ def run_helm_job(
print(f"Running job '{job_name}' from helm chart: {chart_dir}")
# Use helm template to render the job manifest
with tempfile.NamedTemporaryFile(
mode="w", suffix=".yaml", delete=False
) as tmp_file:
with tempfile.NamedTemporaryFile(mode="w", suffix=".yaml", delete=False) as tmp_file:
try:
# Render job template with job enabled
# Use --set-json to properly handle job names with dashes
@ -116,9 +115,7 @@ def run_helm_job(
if verbose:
print(f"Running: {' '.join(helm_cmd)}")
result = subprocess.run(
helm_cmd, check=True, capture_output=True, text=True
)
result = subprocess.run(helm_cmd, check=True, capture_output=True, text=True)
tmp_file.write(result.stdout)
tmp_file.flush()
@ -139,9 +136,7 @@ def run_helm_job(
"-n",
namespace,
]
subprocess.run(
kubectl_apply_cmd, check=True, capture_output=True, text=True
)
subprocess.run(kubectl_apply_cmd, check=True, capture_output=True, text=True)
if verbose:
print(f"Job {actual_job_name} created, waiting for completion...")
@ -164,7 +159,7 @@ def run_helm_job(
except subprocess.CalledProcessError as e:
error_msg = e.stderr if e.stderr else str(e)
raise Exception(f"Job failed: {error_msg}")
raise Exception(f"Job failed: {error_msg}") from e
finally:
# Clean up temp file
if os.path.exists(tmp_file.name):

View File

@ -13,10 +13,9 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import subprocess
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional
def check_kompose_available() -> bool:
@ -37,9 +36,7 @@ def get_kompose_version() -> str:
if not check_kompose_available():
raise Exception("kompose not found in PATH")
result = subprocess.run(
["kompose", "version"], capture_output=True, text=True, timeout=10
)
result = subprocess.run(["kompose", "version"], capture_output=True, text=True, timeout=10)
if result.returncode != 0:
raise Exception(f"Failed to get kompose version: {result.stderr}")
@ -53,7 +50,7 @@ def get_kompose_version() -> str:
def convert_to_helm_chart(
compose_files: List[Path], output_dir: Path, chart_name: Optional[str] = None
compose_files: list[Path], output_dir: Path, chart_name: str | None = None
) -> str:
"""
Invoke kompose to convert Docker Compose files to a Helm chart.
@ -71,8 +68,7 @@ def convert_to_helm_chart(
"""
if not check_kompose_available():
raise Exception(
"kompose not found in PATH. "
"Install from: https://kompose.io/installation/"
"kompose not found in PATH. Install from: https://kompose.io/installation/"
)
# Ensure output directory exists
@ -95,9 +91,7 @@ def convert_to_helm_chart(
if result.returncode != 0:
raise Exception(
f"Kompose conversion failed:\n"
f"Command: {' '.join(cmd)}\n"
f"Error: {result.stderr}"
f"Kompose conversion failed:\nCommand: {' '.join(cmd)}\nError: {result.stderr}"
)
return result.stdout

View File

@ -13,20 +13,22 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import os
import re
import subprocess
from collections.abc import Mapping
from pathlib import Path
from typing import cast
import yaml
from kubernetes import client, utils, watch
from kubernetes.client.exceptions import ApiException
import os
from pathlib import Path
import subprocess
import re
from typing import Set, Mapping, List, Optional, cast
import yaml
from stack_orchestrator.util import get_k8s_dir, error_exit
from stack_orchestrator.opts import opts
from stack_orchestrator import constants
from stack_orchestrator.deploy.deploy_util import parsed_pod_files_map_from_file_names
from stack_orchestrator.deploy.deployer import DeployerException
from stack_orchestrator import constants
from stack_orchestrator.opts import opts
from stack_orchestrator.util import error_exit, get_k8s_dir
def is_host_path_mount(volume_name: str) -> bool:
@ -77,9 +79,7 @@ def get_kind_cluster():
Uses `kind get clusters` to find existing clusters.
Returns the cluster name or None if no cluster exists.
"""
result = subprocess.run(
"kind get clusters", shell=True, capture_output=True, text=True
)
result = subprocess.run("kind get clusters", shell=True, capture_output=True, text=True)
if result.returncode != 0:
return None
@ -98,12 +98,12 @@ def _run_command(command: str):
return result
def _get_etcd_host_path_from_kind_config(config_file: str) -> Optional[str]:
def _get_etcd_host_path_from_kind_config(config_file: str) -> str | None:
"""Extract etcd host path from kind config extraMounts."""
import yaml
try:
with open(config_file, "r") as f:
with open(config_file) as f:
config = yaml.safe_load(f)
except Exception:
return None
@ -113,7 +113,8 @@ def _get_etcd_host_path_from_kind_config(config_file: str) -> Optional[str]:
extra_mounts = node.get("extraMounts", [])
for mount in extra_mounts:
if mount.get("containerPath") == "/var/lib/etcd":
return mount.get("hostPath")
host_path: str | None = mount.get("hostPath")
return host_path
return None
@ -133,8 +134,7 @@ def _clean_etcd_keeping_certs(etcd_path: str) -> bool:
db_path = Path(etcd_path) / "member" / "snap" / "db"
# Check existence using docker since etcd dir is root-owned
check_cmd = (
f"docker run --rm -v {etcd_path}:/etcd:ro alpine:3.19 "
"test -f /etcd/member/snap/db"
f"docker run --rm -v {etcd_path}:/etcd:ro alpine:3.19 test -f /etcd/member/snap/db"
)
check_result = subprocess.run(check_cmd, shell=True, capture_output=True)
if check_result.returncode != 0:
@ -148,8 +148,16 @@ def _clean_etcd_keeping_certs(etcd_path: str) -> bool:
etcd_image = "gcr.io/etcd-development/etcd:v3.5.9"
temp_dir = "/tmp/laconic-etcd-cleanup"
# Whitelist: prefixes to KEEP - everything else gets deleted
keep_prefixes = "/registry/secrets/caddy-system"
# Whitelist: prefixes to KEEP - everything else gets deleted.
# Must include core cluster resources (kubernetes service, kube-system
# secrets) or kindnet panics on restart — KUBERNETES_SERVICE_HOST is
# injected from the kubernetes ClusterIP service in default namespace.
keep_prefixes = [
"/registry/secrets/caddy-system",
"/registry/services/specs/default/kubernetes",
"/registry/services/endpoints/default/kubernetes",
]
keep_prefixes_str = " ".join(keep_prefixes)
# The etcd image is distroless (no shell). We extract the statically-linked
# etcdctl binary and run it from alpine which has shell + jq support.
@ -195,13 +203,21 @@ def _clean_etcd_keeping_certs(etcd_path: str) -> bool:
sleep 3
# Use alpine with extracted etcdctl to run commands (alpine has shell + jq)
# Export caddy secrets
# Export whitelisted keys (caddy TLS certs + core cluster services)
docker run --rm \
-v {temp_dir}:/backup \
--network container:laconic-etcd-cleanup \
$ALPINE_IMAGE sh -c \
'/backup/etcdctl get --prefix "{keep_prefixes}" -w json \
> /backup/kept.json 2>/dev/null || echo "{{}}" > /backup/kept.json'
$ALPINE_IMAGE sh -c '
apk add --no-cache jq >/dev/null 2>&1
echo "[]" > /backup/all-kvs.json
for prefix in {keep_prefixes_str}; do
/backup/etcdctl get --prefix "$prefix" -w json 2>/dev/null \
| jq ".kvs // []" >> /backup/all-kvs.json || true
done
jq -s "add" /backup/all-kvs.json \
| jq "{{kvs: .}}" > /backup/kept.json 2>/dev/null \
|| echo "{{}}" > /backup/kept.json
'
# Delete ALL registry keys
docker run --rm \
@ -321,7 +337,7 @@ def is_ingress_running() -> bool:
def wait_for_ingress_in_kind():
core_v1 = client.CoreV1Api()
for i in range(20):
for _i in range(20):
warned_waiting = False
w = watch.Watch()
for event in w.stream(
@ -348,9 +364,7 @@ def wait_for_ingress_in_kind():
def install_ingress_for_kind(acme_email: str = ""):
api_client = client.ApiClient()
ingress_install = os.path.abspath(
get_k8s_dir().joinpath(
"components", "ingress", "ingress-caddy-kind-deploy.yaml"
)
get_k8s_dir().joinpath("components", "ingress", "ingress-caddy-kind-deploy.yaml")
)
if opts.o.debug:
print("Installing Caddy ingress controller in kind cluster")
@ -384,13 +398,82 @@ def install_ingress_for_kind(acme_email: str = ""):
)
def load_images_into_kind(kind_cluster_name: str, image_set: Set[str]):
for image in image_set:
LOCAL_REGISTRY_NAME = "kind-registry"
LOCAL_REGISTRY_HOST_PORT = 5001
LOCAL_REGISTRY_CONTAINER_PORT = 5000
def ensure_local_registry():
"""Ensure a persistent local registry container is running.
The registry survives kind cluster recreates; images pushed to it
remain available without re-pushing. The caller should then connect
it to the kind Docker network (connect_registry_to_kind_network) so
kind nodes can pull from it.
"""
# Check if registry container exists (running or stopped)
check = subprocess.run(
f"docker inspect {LOCAL_REGISTRY_NAME}",
shell=True,
capture_output=True,
)
if check.returncode != 0:
# Create the registry container
result = _run_command(
f"kind load docker-image {image} --name {kind_cluster_name}"
f"docker run -d --restart=always"
f" -p {LOCAL_REGISTRY_HOST_PORT}:{LOCAL_REGISTRY_CONTAINER_PORT}"
f" --name {LOCAL_REGISTRY_NAME} registry:2"
)
if result.returncode != 0:
raise DeployerException(f"kind load docker-image failed: {result}")
raise DeployerException(f"Failed to start local registry: {result}")
print(f"Started local registry on port {LOCAL_REGISTRY_HOST_PORT}")
else:
# Ensure it's running (may have been stopped)
_run_command(f"docker start {LOCAL_REGISTRY_NAME}")
if opts.o.debug:
print("Local registry already exists, ensured running")
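Per the commit description, kind nodes reach this registry through a registry mirror added to the cluster's containerdConfigPatches. A minimal sketch of that kind config fragment (the exact TOML keys are an assumption; they vary with the containerd version kind ships):

```yaml
# Hypothetical kind config fragment: route pulls for localhost:5001
# through the kind-registry container on the shared "kind" network.
containerdConfigPatches:
- |-
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."localhost:5001"]
    endpoint = ["http://kind-registry:5000"]
```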
def connect_registry_to_kind_network(kind_cluster_name: str):
"""Connect the local registry to the kind Docker network.
Idempotent: silently succeeds if already connected.
"""
network = "kind"
result = subprocess.run(
f"docker network connect {network} {LOCAL_REGISTRY_NAME}",
shell=True,
capture_output=True,
)
if result.returncode != 0 and b"already exists" not in result.stderr:
raise DeployerException(
f"Failed to connect registry to kind network: {result.stderr.decode()}"
)
def push_images_to_local_registry(image_set: set[str]):
"""Tag and push images to the local registry.
Near-instant compared to kind load (shared filesystem, layer dedup).
"""
for image in image_set:
registry_image = local_registry_image(image)
tag_result = _run_command(f"docker tag {image} {registry_image}")
if tag_result.returncode != 0:
raise DeployerException(f"docker tag failed for {image}: {tag_result}")
push_result = _run_command(f"docker push {registry_image}")
if push_result.returncode != 0:
raise DeployerException(f"docker push failed for {registry_image}: {push_result}")
if opts.o.debug:
print(f"Pushed {registry_image} to local registry")
def local_registry_image(image: str) -> str:
"""Rewrite an image reference to use the local registry.
e.g. laconicnetwork/agave:local -> localhost:5001/laconicnetwork/agave:local
"""
return f"localhost:{LOCAL_REGISTRY_HOST_PORT}/{image}"
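As a self-contained sketch of how the helpers above compose (constants and names copied from this module, with the substring filter from _setup_cluster_and_namespace reduced to a pure function for illustration):

```python
LOCAL_REGISTRY_HOST_PORT = 5001  # matches the module constant above


def local_registry_image(image: str) -> str:
    # Prefix the image reference with the local registry address
    return f"localhost:{LOCAL_REGISTRY_HOST_PORT}/{image}"


def filter_local_images(image_set: set[str], local_containers: list[str]) -> set[str]:
    # Substring match: only locally built images are pushed to the local
    # registry; everything else is pulled from its upstream registry by k8s
    return {img for img in image_set if any(c in img for c in local_containers)}


images = {"laconicnetwork/agave:local", "docker.io/library/postgres:16"}
local = filter_local_images(images, ["agave"])
print(sorted(local_registry_image(i) for i in local))
# -> ['localhost:5001/laconicnetwork/agave:local']
```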
def pods_in_deployment(core_api: client.CoreV1Api, deployment_name: str):
@ -406,11 +489,9 @@ def pods_in_deployment(core_api: client.CoreV1Api, deployment_name: str):
return pods
def containers_in_pod(core_api: client.CoreV1Api, pod_name: str) -> List[str]:
containers: List[str] = []
pod_response = cast(
client.V1Pod, core_api.read_namespaced_pod(pod_name, namespace="default")
)
def containers_in_pod(core_api: client.CoreV1Api, pod_name: str) -> list[str]:
containers: list[str] = []
pod_response = cast(client.V1Pod, core_api.read_namespaced_pod(pod_name, namespace="default"))
if opts.o.debug:
print(f"pod_response: {pod_response}")
if not pod_response.spec or not pod_response.spec.containers:
@ -433,7 +514,7 @@ def named_volumes_from_pod_files(parsed_pod_files):
parsed_pod_file = parsed_pod_files[pod]
if "volumes" in parsed_pod_file:
volumes = parsed_pod_file["volumes"]
for volume, value in volumes.items():
for volume, _value in volumes.items():
# Volume definition looks like:
# 'laconicd-data': None
named_volumes.append(volume)
@ -465,14 +546,10 @@ def volume_mounts_for_service(parsed_pod_files, service):
mount_split = mount_string.split(":")
volume_name = mount_split[0]
mount_path = mount_split[1]
mount_options = (
mount_split[2] if len(mount_split) == 3 else None
)
mount_options = mount_split[2] if len(mount_split) == 3 else None
# For host path mounts, use sanitized name
if is_host_path_mount(volume_name):
k8s_volume_name = sanitize_host_path_to_volume_name(
volume_name
)
k8s_volume_name = sanitize_host_path_to_volume_name(volume_name)
else:
k8s_volume_name = volume_name
if opts.o.debug:
@ -511,9 +588,7 @@ def volumes_for_pod_files(parsed_pod_files, spec, app_name):
claim = client.V1PersistentVolumeClaimVolumeSource(
claim_name=f"{app_name}-{volume_name}"
)
volume = client.V1Volume(
name=volume_name, persistent_volume_claim=claim
)
volume = client.V1Volume(name=volume_name, persistent_volume_claim=claim)
result.append(volume)
# Handle host path mounts from service volumes
@ -526,15 +601,11 @@ def volumes_for_pod_files(parsed_pod_files, spec, app_name):
mount_split = mount_string.split(":")
volume_source = mount_split[0]
if is_host_path_mount(volume_source):
sanitized_name = sanitize_host_path_to_volume_name(
volume_source
)
sanitized_name = sanitize_host_path_to_volume_name(volume_source)
if sanitized_name not in seen_host_path_volumes:
seen_host_path_volumes.add(sanitized_name)
# Create hostPath volume for mount inside kind node
kind_mount_path = get_kind_host_path_mount_path(
sanitized_name
)
kind_mount_path = get_kind_host_path_mount_path(sanitized_name)
host_path_source = client.V1HostPathVolumeSource(
path=kind_mount_path, type="FileOrCreate"
)
@ -569,18 +640,18 @@ def _generate_kind_mounts(parsed_pod_files, deployment_dir, deployment_context):
deployment_id = deployment_context.id
backup_subdir = f"cluster-backups/{deployment_id}"
etcd_host_path = _make_absolute_host_path(
Path(f"./data/{backup_subdir}/etcd"), deployment_dir
)
etcd_host_path = _make_absolute_host_path(Path(f"./data/{backup_subdir}/etcd"), deployment_dir)
volume_definitions.append(
f" - hostPath: {etcd_host_path}\n" f" containerPath: /var/lib/etcd\n"
f" - hostPath: {etcd_host_path}\n"
f" containerPath: /var/lib/etcd\n"
f" propagation: HostToContainer\n"
)
pki_host_path = _make_absolute_host_path(
Path(f"./data/{backup_subdir}/pki"), deployment_dir
)
pki_host_path = _make_absolute_host_path(Path(f"./data/{backup_subdir}/pki"), deployment_dir)
volume_definitions.append(
f" - hostPath: {pki_host_path}\n" f" containerPath: /etc/kubernetes/pki\n"
f" - hostPath: {pki_host_path}\n"
f" containerPath: /etc/kubernetes/pki\n"
f" propagation: HostToContainer\n"
)
# Note these paths are relative to the location of the pod files (at present)
@ -606,21 +677,16 @@ def _generate_kind_mounts(parsed_pod_files, deployment_dir, deployment_context):
if is_host_path_mount(volume_name):
# Host path mount - add extraMount for kind
sanitized_name = sanitize_host_path_to_volume_name(
volume_name
)
sanitized_name = sanitize_host_path_to_volume_name(volume_name)
if sanitized_name not in seen_host_path_mounts:
seen_host_path_mounts.add(sanitized_name)
# Resolve path relative to compose directory
host_path = resolve_host_path_for_kind(
volume_name, deployment_dir
)
container_path = get_kind_host_path_mount_path(
sanitized_name
)
host_path = resolve_host_path_for_kind(volume_name, deployment_dir)
container_path = get_kind_host_path_mount_path(sanitized_name)
volume_definitions.append(
f" - hostPath: {host_path}\n"
f" containerPath: {container_path}\n"
f" propagation: HostToContainer\n"
)
if opts.o.debug:
print(f"Added host path mount: {host_path}")
@ -630,10 +696,7 @@ def _generate_kind_mounts(parsed_pod_files, deployment_dir, deployment_context):
print(f"volume_name: {volume_name}")
print(f"map: {volume_host_path_map}")
print(f"mount path: {mount_path}")
if (
volume_name
not in deployment_context.spec.get_configmaps()
):
if volume_name not in deployment_context.spec.get_configmaps():
if (
volume_name in volume_host_path_map
and volume_host_path_map[volume_name]
@@ -642,12 +705,11 @@ def _generate_kind_mounts(parsed_pod_files, deployment_dir, deployment_context):
volume_host_path_map[volume_name],
deployment_dir,
)
container_path = get_kind_pv_bind_mount_path(
volume_name
)
container_path = get_kind_pv_bind_mount_path(volume_name)
volume_definitions.append(
f" - hostPath: {host_path}\n"
f" containerPath: {container_path}\n"
f" propagation: HostToContainer\n"
)
return (
""
@@ -671,8 +733,7 @@ def _generate_kind_port_mappings_from_services(parsed_pod_files):
# TODO handle the complex cases
# Looks like: 80 or something more complicated
port_definitions.append(
f" - containerPort: {port_string}\n"
f" hostPort: {port_string}\n"
f" - containerPort: {port_string}\n" f" hostPort: {port_string}\n"
)
return (
""
@@ -685,9 +746,7 @@ def _generate_kind_port_mappings(parsed_pod_files):
port_definitions = []
# Map port 80 and 443 for the Caddy ingress controller (HTTPS support)
for port_string in ["80", "443"]:
port_definitions.append(
f" - containerPort: {port_string}\n hostPort: {port_string}\n"
)
port_definitions.append(f" - containerPort: {port_string}\n hostPort: {port_string}\n")
return (
""
if len(port_definitions) == 0
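The port-mapping generation above (the return expression is truncated in this view) can be sketched in isolation; the `extraPortMappings` wrapper key follows kind's cluster-config schema, while the helper itself is illustrative:

```python
def kind_port_mappings(ports: list[str]) -> str:
    # One containerPort/hostPort pair per published port; return "" when there
    # is nothing to publish so the config key is omitted entirely.
    defs = [f"  - containerPort: {p}\n    hostPort: {p}\n" for p in ports]
    if not defs:
        return ""
    return "extraPortMappings:\n" + "".join(defs)

# Map 80 and 443 for an ingress controller, as the function above does.
yml = kind_port_mappings(["80", "443"])
```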
@@ -703,7 +762,11 @@ def _generate_high_memlock_spec_mount(deployment_dir: Path):
references an absolute path.
"""
spec_path = deployment_dir.joinpath(constants.high_memlock_spec_filename).resolve()
return f" - hostPath: {spec_path}\n" f" containerPath: {spec_path}\n"
return (
f" - hostPath: {spec_path}\n"
f" containerPath: {spec_path}\n"
f" propagation: HostToContainer\n"
)
def generate_high_memlock_spec_json():
@@ -877,27 +940,37 @@ def generate_cri_base_json():
return generate_high_memlock_spec_json()
def _generate_containerd_config_patches(deployment_dir: Path, has_high_memlock: bool) -> str:
    """Generate containerdConfigPatches YAML for containerd configuration.

    Includes:
    - Local registry mirror (localhost:5001 -> http://kind-registry:5000)
    - Custom runtime handler for high-memlock (if enabled)
    """
    patches = []
    # Always configure the local registry mirror so kind nodes pull from it
    registry_plugin = f'plugins."io.containerd.grpc.v1.cri".registry.mirrors."localhost:{LOCAL_REGISTRY_HOST_PORT}"'
    endpoint = f"http://{LOCAL_REGISTRY_NAME}:{LOCAL_REGISTRY_CONTAINER_PORT}"
    patches.append(f"    [{registry_plugin}]\n" f'      endpoint = ["{endpoint}"]')
    if has_high_memlock:
        spec_path = deployment_dir.joinpath(constants.high_memlock_spec_filename).resolve()
        runtime_name = constants.high_memlock_runtime
        plugin_path = 'plugins."io.containerd.grpc.v1.cri".containerd.runtimes'
        patches.append(
            f"    [{plugin_path}.{runtime_name}]\n"
            '      runtime_type = "io.containerd.runc.v2"\n'
            f'      base_runtime_spec = "{spec_path}"'
        )
    if not patches:
        return ""
    result = "containerdConfigPatches:\n"
    for patch in patches:
        result += "  - |-\n" + patch + "\n"
    return result
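The registry-mirror patch this function always emits can be shown standalone. The sketch below hard-codes the values named in the commit message (host port 5001, container name `kind-registry`, container port 5000); in the real module these live in `LOCAL_REGISTRY_*` constants:

```python
LOCAL_REGISTRY_NAME = "kind-registry"  # docker name of the registry container
LOCAL_REGISTRY_HOST_PORT = 5001        # host port images are pushed to
LOCAL_REGISTRY_CONTAINER_PORT = 5000   # port inside the kind docker network

def registry_mirror_patch() -> str:
    # Rewrites pulls of localhost:5001/... inside kind nodes to the
    # kind-registry container reachable on the cluster's docker network.
    plugin = (
        'plugins."io.containerd.grpc.v1.cri".registry.mirrors.'
        f'"localhost:{LOCAL_REGISTRY_HOST_PORT}"'
    )
    endpoint = f"http://{LOCAL_REGISTRY_NAME}:{LOCAL_REGISTRY_CONTAINER_PORT}"
    return f'    [{plugin}]\n      endpoint = ["{endpoint}"]'

patch = registry_mirror_patch()
```

Because the registry container survives `kind delete cluster`, the image cache persists across cluster recreates, which is what replaces the slow `kind load docker-image` path.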
# Note: this makes any duplicate definition in b overwrite a
@@ -906,9 +979,7 @@ def merge_envs(a: Mapping[str, str], b: Mapping[str, str]) -> Mapping[str, str]:
return result
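The note above describes ordinary right-biased dict-merge semantics; a minimal sketch:

```python
from typing import Mapping

def merge_envs(a: Mapping[str, str], b: Mapping[str, str]) -> dict[str, str]:
    # Plain dict merge: any key present in both mappings takes b's value.
    return {**a, **b}

merged = merge_envs({"LOG_LEVEL": "info", "PORT": "80"}, {"PORT": "8080"})
```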
def _expand_shell_vars(
raw_val: str, env_map: Optional[Mapping[str, str]] = None
) -> str:
def _expand_shell_vars(raw_val: str, env_map: Mapping[str, str] | None = None) -> str:
# Expand docker-compose style variable substitution:
# ${VAR} - use VAR value or empty string
# ${VAR:-default} - use VAR value or default if unset/empty
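A self-contained sketch of this compose-style substitution, covering only the two forms listed above (the real `_expand_shell_vars` may handle more; the function name here is illustrative):

```python
import os
import re

def expand_shell_vars(raw_val: str, env_map=None) -> str:
    # ${VAR}          -> VAR's value, or "" when unset
    # ${VAR:-default} -> VAR's value, or default when unset/empty
    env = dict(os.environ if env_map is None else env_map)

    def repl(m: re.Match) -> str:
        name, default = m.group(1), m.group(2)
        val = env.get(name, "")
        return val if val else (default if default is not None else "")

    return re.sub(r"\$\{(\w+)(?::-([^}]*))?\}", repl, raw_val)

out = expand_shell_vars("${HOST:-localhost}:${PORT}", {"PORT": "8080"})
```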
@@ -933,7 +1004,7 @@ def _expand_shell_vars(
def envs_from_compose_file(
compose_file_envs: Mapping[str, str], env_map: Mapping[str, str] | None = None
) -> Mapping[str, str]:
result = {}
for env_var, env_val in compose_file_envs.items():
@@ -943,7 +1014,7 @@ envs_from_compose_file(
def translate_sidecar_service_names(
envs: Mapping[str, str], sibling_service_names: list[str]
) -> Mapping[str, str]:
"""Translate docker-compose service names to localhost for sidecar containers.
@@ -970,7 +1041,12 @@ def translate_sidecar_service_names(
# Handle URLs like: postgres://user:pass@db:5432/dbname
# and simple refs like: db:5432 or just db
pattern = rf"\b{re.escape(service_name)}(:\d+)?\b"
new_val = re.sub(pattern, lambda m: f'localhost{m.group(1) or ""}', new_val)
def _replace_with_localhost(m: re.Match[str]) -> str:
port: str = m.group(1) or ""
return "localhost" + port
new_val = re.sub(pattern, _replace_with_localhost, new_val)
result[env_var] = new_val
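The pattern above preserves an optional `:port` suffix while leaving longer identifiers such as `dbname` untouched, because `\b` cannot match inside a word. A standalone sketch (the helper name is illustrative):

```python
import re

def translate_to_localhost(value: str, service_names: list[str]) -> str:
    # Rewrite "db:5432" or bare "db" to localhost, preserving any port suffix,
    # so sidecar containers sharing a pod network can reach their siblings.
    for name in service_names:
        pattern = rf"\b{re.escape(name)}(:\d+)?\b"
        value = re.sub(pattern, lambda m: "localhost" + (m.group(1) or ""), value)
    return value

url = translate_to_localhost("postgres://user:pass@db:5432/dbname", ["db"])
```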
@@ -978,8 +1054,8 @@ def translate_sidecar_service_names(
def envs_from_environment_variables_map(
map: Mapping[str, str],
) -> list[client.V1EnvVar]:
result = []
for env_var, env_val in map.items():
result.append(client.V1EnvVar(env_var, env_val))
@@ -1010,17 +1086,13 @@ def generate_kind_config(deployment_dir: Path, deployment_context):
pod_files = [p for p in compose_file_dir.iterdir() if p.is_file()]
parsed_pod_files_map = parsed_pod_files_map_from_file_names(pod_files)
port_mappings_yml = _generate_kind_port_mappings(parsed_pod_files_map)
mounts_yml = _generate_kind_mounts(
parsed_pod_files_map, deployment_dir, deployment_context
)
mounts_yml = _generate_kind_mounts(parsed_pod_files_map, deployment_dir, deployment_context)
# Check if unlimited_memlock is enabled
unlimited_memlock = deployment_context.spec.get_unlimited_memlock()
# Generate containerdConfigPatches for RuntimeClass support
containerd_patches_yml = _generate_containerd_config_patches(
deployment_dir, unlimited_memlock
)
containerd_patches_yml = _generate_containerd_config_patches(deployment_dir, unlimited_memlock)
# Add high-memlock spec file mount if needed
if unlimited_memlock:

View File

@@ -14,19 +14,18 @@
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import typing
from pathlib import Path

import humanfriendly

from stack_orchestrator import constants
from stack_orchestrator.util import get_yaml
class ResourceLimits:
cpus: Optional[float] = None
memory: Optional[int] = None
storage: Optional[int] = None
cpus: float | None = None
memory: int | None = None
storage: int | None = None
def __init__(self, obj=None):
if obj is None:
@@ -50,8 +49,8 @@ class ResourceLimits:
class Resources:
limits: Optional[ResourceLimits] = None
reservations: Optional[ResourceLimits] = None
limits: ResourceLimits | None = None
reservations: ResourceLimits | None = None
def __init__(self, obj=None):
if obj is None:
@@ -74,9 +73,9 @@ class Resources:
class Spec:
obj: typing.Any
file_path: Optional[Path]
file_path: Path | None
def __init__(self, file_path: Optional[Path] = None, obj=None) -> None:
def __init__(self, file_path: Path | None = None, obj=None) -> None:
if obj is None:
obj = {}
self.file_path = file_path
@@ -92,13 +91,13 @@ class Spec:
return self.obj.get(item, default)
def init_from_file(self, file_path: Path):
self.obj = get_yaml().load(open(file_path, "r"))
self.obj = get_yaml().load(open(file_path))
self.file_path = file_path
def get_image_registry(self):
return self.obj.get(constants.image_registry_key)
def get_image_registry_config(self) -> typing.Optional[typing.Dict]:
def get_image_registry_config(self) -> dict | None:
"""Returns registry auth config: {server, username, token-env}.
Used for private container registries like GHCR. The token-env field
@@ -107,7 +106,8 @@ class Spec:
Note: Uses 'registry-credentials' key to avoid collision with
'image-registry' key which is for pushing images.
"""
return self.obj.get("registry-credentials")
result: dict[str, str] | None = self.obj.get("registry-credentials")
return result
def get_volumes(self):
return self.obj.get(constants.volumes_key, {})
@@ -116,9 +116,22 @@ class Spec:
return self.obj.get(constants.configmaps_key, {})
def get_container_resources(self):
return Resources(
self.obj.get(constants.resources_key, {}).get("containers", {})
)
return Resources(self.obj.get(constants.resources_key, {}).get("containers", {}))
def get_container_resources_for(self, container_name: str) -> Resources | None:
"""Look up per-container resource overrides from spec.yml.
Checks resources.containers.<container_name> in the spec. Returns None
if no per-container override exists (caller falls back to other sources).
"""
containers_block = self.obj.get(constants.resources_key, {}).get("containers", {})
if container_name in containers_block:
entry = containers_block[container_name]
# Only treat it as a per-container override if it's a dict with
# reservations/limits nested inside (not a top-level global key)
if isinstance(entry, dict) and ("reservations" in entry or "limits" in entry):
return Resources(entry)
return None
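The override lookup can be sketched against a plain dict (key names mirror the spec layout shown above; the standalone helper is illustrative, not the class's real method, and it returns the raw entry rather than a `Resources` object):

```python
def container_resources_override(spec: dict, container_name: str):
    # Only treat the entry as a per-container override when it is a dict that
    # nests "reservations" or "limits"; otherwise the caller falls back to
    # other sources (e.g. global resource settings).
    containers = spec.get("resources", {}).get("containers", {})
    entry = containers.get(container_name)
    if isinstance(entry, dict) and ("reservations" in entry or "limits" in entry):
        return entry
    return None

spec = {"resources": {"containers": {"db": {"limits": {"memory": "1G"}}, "cpus": 2}}}
override = container_resources_override(spec, "db")
```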
@@ -142,9 +155,7 @@ class Spec:
return None
def get_volume_resources(self):
return Resources(
self.obj.get(constants.resources_key, {}).get(constants.volumes_key, {})
)
return Resources(self.obj.get(constants.resources_key, {}).get(constants.volumes_key, {}))
def get_http_proxy(self):
return self.obj.get(constants.network_key, {}).get(constants.http_proxy_key, [])
@@ -167,9 +178,7 @@ class Spec:
def get_privileged(self):
return (
"true"
== str(self.obj.get(constants.security_key, {}).get("privileged", "false")).lower()
)
def get_capabilities(self):
@@ -196,9 +205,7 @@ class Spec:
Runtime class name string, or None to use default runtime.
"""
# Explicit runtime class takes precedence
explicit = self.obj.get(constants.security_key, {}).get(
constants.runtime_class_key, None
)
explicit = self.obj.get(constants.security_key, {}).get(constants.runtime_class_key, None)
if explicit:
return explicit

View File

@@ -13,8 +13,9 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import typing
from pathlib import Path
from stack_orchestrator.util import get_yaml
@@ -26,4 +27,4 @@ class Stack:
self.name = name
def init_from_file(self, file_path: Path):
self.obj = get_yaml().load(open(file_path, "r"))
self.obj = get_yaml().load(open(file_path))

View File

@@ -13,23 +13,22 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import os
from pathlib import Path
from tempfile import NamedTemporaryFile
from urllib.parse import urlparse

import click

from stack_orchestrator.deploy.deploy import create_deploy_context
from stack_orchestrator.deploy.deploy_types import DeployCommandContext
from stack_orchestrator.deploy.deployment_create import create_operation, init_operation
from stack_orchestrator.util import error_exit, global_options2
def _fixup_container_tag(deployment_dir: str, image: str):
deployment_dir_path = Path(deployment_dir)
compose_file = deployment_dir_path.joinpath(
"compose", "docker-compose-webapp-template.yml"
)
compose_file = deployment_dir_path.joinpath("compose", "docker-compose-webapp-template.yml")
# replace "cerc/webapp-container:local" in the file with our image tag
with open(compose_file) as rfile:
contents = rfile.read()
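The tag fixup is a plain string substitution over the compose file; a runnable sketch using a temp directory (the placeholder tag is the one named in the comment above, everything else is illustrative):

```python
import tempfile
from pathlib import Path

PLACEHOLDER = "cerc/webapp-container:local"

def fixup_container_tag(compose_file: Path, image: str) -> None:
    # Swap the template's placeholder image for the deployment-specific tag.
    contents = compose_file.read_text()
    compose_file.write_text(contents.replace(PLACEHOLDER, image))

with tempfile.TemporaryDirectory() as d:
    f = Path(d) / "docker-compose-webapp-template.yml"
    f.write_text(f"services:\n  webapp:\n    image: {PLACEHOLDER}\n")
    fixup_container_tag(f, "laconic-webapp/abc123:local")
    result = f.read_text()
```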
@@ -56,9 +55,7 @@ def _fixup_url_spec(spec_file_name: str, url: str):
wfile.write(contents)
def create_deployment(
ctx, deployment_dir, image, url, kube_config, image_registry, env_file
):
def create_deployment(ctx, deployment_dir, image, url, kube_config, image_registry, env_file):
# Do the equivalent of:
# 1. laconic-so --stack webapp-template deploy --deploy-to k8s init \
# --output webapp-spec.yml
@@ -117,9 +114,7 @@ def command(ctx):
"--image-registry",
help="Provide a container image registry url for this k8s cluster",
)
@click.option(
"--deployment-dir", help="Create deployment files in this directory", required=True
)
@click.option("--deployment-dir", help="Create deployment files in this directory", required=True)
@click.option("--image", help="image to deploy", required=True)
@click.option("--url", help="url to serve", required=True)
@click.option("--env-file", help="environment file for webapp")
@@ -127,6 +122,4 @@ def command(ctx):
def create(ctx, deployment_dir, image, url, kube_config, image_registry, env_file):
"""create a deployment for the specified webapp container"""
return create_deployment(
ctx, deployment_dir, image, url, kube_config, image_registry, env_file
)
return create_deployment(ctx, deployment_dir, image, url, kube_config, image_registry, env_file)

View File

@@ -21,10 +21,10 @@ import sys
import tempfile
import time
import uuid
import click
import gnupg
import yaml
from stack_orchestrator.deploy.images import remote_image_exists
from stack_orchestrator.deploy.webapp import deploy_webapp
@@ -34,16 +34,16 @@ from stack_orchestrator.deploy.webapp.util import (
    TimedLogger,
    build_container_image,
    confirm_auction,
    confirm_payment,
    deploy_to_k8s,
    file_hash,
    generate_hostname_for_app,
    hostname_for_deployment_request,
    load_known_requests,
    match_owner,
    publish_deployment,
    push_container_image,
    skip_by_tag,
)
@@ -70,9 +70,7 @@ def process_app_deployment_request(
logger.log("BEGIN - process_app_deployment_request")
# 1. look up application
app = laconic.get_record(
app_deployment_request.attributes.application, require=True
)
app = laconic.get_record(app_deployment_request.attributes.application, require=True)
assert app is not None # require=True ensures this
logger.log(f"Retrieved app record {app_deployment_request.attributes.application}")
@@ -84,9 +82,7 @@ def process_app_deployment_request(
if "allow" == fqdn_policy or "preexisting" == fqdn_policy:
fqdn = requested_name
else:
raise Exception(
f"{requested_name} is invalid: only unqualified hostnames are allowed."
)
raise Exception(f"{requested_name} is invalid: only unqualified hostnames are allowed.")
else:
fqdn = f"{requested_name}.{default_dns_suffix}"
@@ -108,8 +104,7 @@ def process_app_deployment_request(
logger.log(f"Matched DnsRecord ownership: {matched_owner}")
else:
raise Exception(
"Unable to confirm ownership of DnsRecord %s for request %s"
% (dns_lrn, app_deployment_request.id)
f"Unable to confirm ownership of DnsRecord {dns_lrn} for request {app_deployment_request.id}"
)
elif "preexisting" == fqdn_policy:
raise Exception(
@@ -144,7 +139,7 @@ def process_app_deployment_request(
env_filename = tempfile.mktemp()
with open(env_filename, "w") as file:
for k, v in env.items():
file.write("%s=%s\n" % (k, shlex.quote(str(v))))
file.write(f"{k}={shlex.quote(str(v))}\n")
# 5. determine new or existing deployment
# a. check for deployment lrn
@@ -153,8 +148,7 @@ def process_app_deployment_request(
app_deployment_lrn = app_deployment_request.attributes.deployment
if not app_deployment_lrn.startswith(deployment_record_namespace):
raise Exception(
"Deployment LRN %s is not in a supported namespace"
% app_deployment_request.attributes.deployment
f"Deployment LRN {app_deployment_request.attributes.deployment} is not in a supported namespace"
)
deployment_record = laconic.get_record(app_deployment_lrn)
@@ -165,14 +159,14 @@ def process_app_deployment_request(
# already-unique deployment id
unique_deployment_id = hashlib.md5(fqdn.encode()).hexdigest()[:16]
deployment_config_file = os.path.join(deployment_dir, "config.env")
deployment_container_tag = "laconic-webapp/%s:local" % unique_deployment_id
deployment_container_tag = f"laconic-webapp/{unique_deployment_id}:local"
app_image_shared_tag = f"laconic-webapp/{app.id}:local"
# b. check for deployment directory (create if necessary)
if not os.path.exists(deployment_dir):
if deployment_record:
raise Exception(
"Deployment record %s exists, but not deployment dir %s. "
"Please remove name." % (app_deployment_lrn, deployment_dir)
f"Deployment record {app_deployment_lrn} exists, but not deployment dir {deployment_dir}. "
"Please remove name."
)
logger.log(
f"Creating webapp deployment in: {deployment_dir} "
@@ -198,11 +192,7 @@ def process_app_deployment_request(
)
# 6. build container (if needed)
# TODO: add a comment that explains what this code is doing (not clear to me)
if (
not deployment_record
or deployment_record.attributes.application != app.id
or force_rebuild
):
if not deployment_record or deployment_record.attributes.application != app.id or force_rebuild:
needs_k8s_deploy = True
# check if the image already exists
shared_tag_exists = remote_image_exists(image_registry, app_image_shared_tag)
@@ -224,11 +214,9 @@ def process_app_deployment_request(
# )
logger.log("Tag complete")
else:
extra_build_args = [] # TODO: pull from request
extra_build_args: list[str] = [] # TODO: pull from request
logger.log(f"Building container image: {deployment_container_tag}")
build_container_image(
app, deployment_container_tag, extra_build_args, logger
)
build_container_image(app, deployment_container_tag, extra_build_args, logger)
logger.log("Build complete")
logger.log(f"Pushing container image: {deployment_container_tag}")
push_container_image(deployment_dir, logger)
@@ -287,9 +275,7 @@ def dump_known_requests(filename, requests, status="SEEN"):
@click.command()
@click.option("--kube-config", help="Provide a config file for a k8s deployment")
@click.option(
"--laconic-config", help="Provide a config file for laconicd", required=True
)
@click.option("--laconic-config", help="Provide a config file for laconicd", required=True)
@click.option(
"--image-registry",
help="Provide a container image registry url for this k8s cluster",
@@ -306,9 +292,7 @@ def dump_known_requests(filename, requests, status="SEEN"):
is_flag=True,
default=False,
)
@click.option(
"--state-file", help="File to store state about previously seen requests."
)
@click.option("--state-file", help="File to store state about previously seen requests.")
@click.option(
"--only-update-state",
help="Only update the state file, don't process any requests anything.",
@@ -331,9 +315,7 @@ def dump_known_requests(filename, requests, status="SEEN"):
help="eg, lrn://laconic/deployments",
required=True,
)
@click.option(
"--dry-run", help="Don't do anything, just report what would be done.", is_flag=True
)
@click.option("--dry-run", help="Don't do anything, just report what would be done.", is_flag=True)
@click.option(
"--include-tags",
help="Only include requests with matching tags (comma-separated).",
@@ -344,17 +326,13 @@ def dump_known_requests(filename, requests, status="SEEN"):
help="Exclude requests with matching tags (comma-separated).",
default="",
)
@click.option(
"--force-rebuild", help="Rebuild even if the image already exists.", is_flag=True
)
@click.option("--force-rebuild", help="Rebuild even if the image already exists.", is_flag=True)
@click.option(
"--recreate-on-deploy",
help="Remove and recreate deployments instead of updating them.",
is_flag=True,
)
@click.option(
"--log-dir", help="Output build/deployment logs to directory.", default=None
)
@click.option("--log-dir", help="Output build/deployment logs to directory.", default=None)
@click.option(
"--min-required-payment",
help="Requests must have a minimum payment to be processed (in alnt)",
@@ -378,9 +356,7 @@ def dump_known_requests(filename, requests, status="SEEN"):
help="The directory containing uploaded config.",
required=True,
)
@click.option(
"--private-key-file", help="The private key for decrypting config.", required=True
)
@click.option("--private-key-file", help="The private key for decrypting config.", required=True)
@click.option(
"--registry-lock-file",
help="File path to use for registry mutex lock",
@@ -435,11 +411,7 @@ def command( # noqa: C901
sys.exit(2)
if not only_update_state:
if (
not record_namespace_dns
or not record_namespace_deployments
or not dns_suffix
):
if not record_namespace_dns or not record_namespace_deployments or not dns_suffix:
print(
"--dns-suffix, --record-namespace-dns, and "
"--record-namespace-deployments are all required",
@@ -491,8 +463,7 @@ def command( # noqa: C901
if min_required_payment and not payment_address:
print(
f"Minimum payment required, but no payment address listed "
f"for deployer: {lrn}.",
f"Minimum payment required, but no payment address listed " f"for deployer: {lrn}.",
file=sys.stderr,
)
sys.exit(2)
@@ -557,26 +528,18 @@ def command( # noqa: C901
requested_name = r.attributes.dns
if not requested_name:
requested_name = generate_hostname_for_app(app)
main_logger.log(
"Generating name %s for request %s." % (requested_name, r_id)
)
main_logger.log(f"Generating name {requested_name} for request {r_id}.")
if (
requested_name in skipped_by_name
or requested_name in requests_by_name
):
main_logger.log(
"Ignoring request %s, it has been superseded." % r_id
)
if requested_name in skipped_by_name or requested_name in requests_by_name:
main_logger.log(f"Ignoring request {r_id}, it has been superseded.")
result = "SKIP"
continue
if skip_by_tag(r, include_tags, exclude_tags):
r_tags = r.attributes.tags if r.attributes else None
main_logger.log(
"Skipping request %s, filtered by tag "
"(include %s, exclude %s, present %s)"
% (r_id, include_tags, exclude_tags, r_tags)
f"Skipping request {r_id}, filtered by tag "
f"(include {include_tags}, exclude {exclude_tags}, present {r_tags})"
)
skipped_by_name[requested_name] = r
result = "SKIP"
@@ -584,8 +547,7 @@ def command( # noqa: C901
r_app = r.attributes.application if r.attributes else "unknown"
main_logger.log(
"Found pending request %s to run application %s on %s."
% (r_id, r_app, requested_name)
f"Found pending request {r_id} to run application {r_app} on {requested_name}."
)
requests_by_name[requested_name] = r
except Exception as e:
@@ -617,17 +579,14 @@ def command( # noqa: C901
requests_to_check_for_payment = []
for r in requests_by_name.values():
if r.id in cancellation_requests and match_owner(
cancellation_requests[r.id], r
):
if r.id in cancellation_requests and match_owner(cancellation_requests[r.id], r):
main_logger.log(
f"Found deployment cancellation request for {r.id} "
f"at {cancellation_requests[r.id].id}"
)
elif r.id in deployments_by_request:
main_logger.log(
f"Found satisfied request for {r.id} "
f"at {deployments_by_request[r.id].id}"
f"Found satisfied request for {r.id} " f"at {deployments_by_request[r.id].id}"
)
else:
if (
@@ -635,8 +594,7 @@ def command( # noqa: C901
and previous_requests[r.id].get("status", "") != "RETRY"
):
main_logger.log(
f"Skipping unsatisfied request {r.id} "
"because we have seen it before."
f"Skipping unsatisfied request {r.id} " "because we have seen it before."
)
else:
main_logger.log(f"Request {r.id} needs to processed.")
@@ -650,14 +608,10 @@ def command( # noqa: C901
main_logger.log(f"{r.id}: Auction confirmed.")
requests_to_execute.append(r)
else:
main_logger.log(
f"Skipping request {r.id}: unable to verify auction."
)
main_logger.log(f"Skipping request {r.id}: unable to verify auction.")
dump_known_requests(state_file, [r], status="SKIP")
else:
main_logger.log(
f"Skipping request {r.id}: not handling requests with auction."
)
main_logger.log(f"Skipping request {r.id}: not handling requests with auction.")
dump_known_requests(state_file, [r], status="SKIP")
elif min_required_payment:
main_logger.log(f"{r.id}: Confirming payment...")
@@ -671,16 +625,12 @@ def command( # noqa: C901
main_logger.log(f"{r.id}: Payment confirmed.")
requests_to_execute.append(r)
else:
main_logger.log(
f"Skipping request {r.id}: unable to verify payment."
)
main_logger.log(f"Skipping request {r.id}: unable to verify payment.")
dump_known_requests(state_file, [r], status="UNPAID")
else:
requests_to_execute.append(r)
main_logger.log(
"Found %d unsatisfied request(s) to process." % len(requests_to_execute)
)
main_logger.log(f"Found {len(requests_to_execute)} unsatisfied request(s) to process.")
if not dry_run:
for r in requests_to_execute:
@@ -700,10 +650,8 @@ def command( # noqa: C901
if not os.path.exists(run_log_dir):
os.mkdir(run_log_dir)
run_log_file_path = os.path.join(run_log_dir, f"{run_id}.log")
main_logger.log(
f"Directing deployment logs to: {run_log_file_path}"
)
run_log_file = open(run_log_file_path, "wt")
main_logger.log(f"Directing deployment logs to: {run_log_file_path}")
run_log_file = open(run_log_file_path, "w")
run_reg_client = LaconicRegistryClient(
laconic_config,
log_file=run_log_file,

View File

@@ -12,18 +12,18 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import json
import sys
import click
from stack_orchestrator.deploy.webapp.util import (
AUCTION_KIND_PROVIDER,
AttrDict,
AuctionStatus,
LaconicRegistryClient,
TimedLogger,
load_known_requests,
)
@@ -44,16 +44,13 @@ def process_app_deployment_auction(
# Check auction kind
if auction.kind != AUCTION_KIND_PROVIDER:
raise Exception(
f"Auction kind needs to be ${AUCTION_KIND_PROVIDER}, got {auction.kind}"
)
raise Exception(f"Auction kind needs to be ${AUCTION_KIND_PROVIDER}, got {auction.kind}")
if current_status == "PENDING":
# Skip if pending auction not in commit state
if auction.status != AuctionStatus.COMMIT:
logger.log(
f"Skipping pending request, auction {auction_id} "
f"status: {auction.status}"
f"Skipping pending request, auction {auction_id} " f"status: {auction.status}"
)
return "SKIP", ""
@@ -115,9 +112,7 @@ def dump_known_auction_requests(filename, requests, status="SEEN"):
@click.command()
@click.option(
"--laconic-config", help="Provide a config file for laconicd", required=True
)
@click.option("--laconic-config", help="Provide a config file for laconicd", required=True)
@click.option(
"--state-file",
help="File to store state about previously seen auction requests.",
@@ -133,9 +128,7 @@ def dump_known_auction_requests(filename, requests, status="SEEN"):
help="File path to use for registry mutex lock",
default=None,
)
@click.option(
"--dry-run", help="Don't do anything, just report what would be done.", is_flag=True
)
@click.option("--dry-run", help="Don't do anything, just report what would be done.", is_flag=True)
@click.pass_context
def command(
ctx,
@@ -198,8 +191,7 @@ def command(
continue
logger.log(
f"Found pending auction request {r.id} for application "
f"{application}."
f"Found pending auction request {r.id} for application " f"{application}."
)
# Add requests to be processed
@@ -209,9 +201,7 @@ def command(
result_status = "ERROR"
logger.log(f"ERROR: examining request {r.id}: " + str(e))
finally:
logger.log(
f"DONE: Examining request {r.id} with result {result_status}."
)
logger.log(f"DONE: Examining request {r.id} with result {result_status}.")
if result_status in ["ERROR"]:
dump_known_auction_requests(
state_file,

View File

@@ -30,9 +30,7 @@ def fatal(msg: str):
@click.command()
@click.option(
"--laconic-config", help="Provide a config file for laconicd", required=True
)
@click.option("--laconic-config", help="Provide a config file for laconicd", required=True)
@click.option(
"--app",
help="The LRN of the application to deploy.",

View File

@@ -13,28 +13,24 @@
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import base64
import sys
from urllib.parse import urlparse

import click
import yaml
from stack_orchestrator.deploy.webapp.util import LaconicRegistryClient
@click.command()
@click.option(
"--laconic-config", help="Provide a config file for laconicd", required=True
)
@click.option("--laconic-config", help="Provide a config file for laconicd", required=True)
@click.option("--api-url", help="The API URL of the deployer.", required=True)
@click.option(
"--public-key-file",
help="The public key to use. This should be a binary file.",
required=True,
)
@click.option(
"--lrn", help="eg, lrn://laconic/deployers/my.deployer.name", required=True
)
@click.option("--lrn", help="eg, lrn://laconic/deployers/my.deployer.name", required=True)
@click.option(
"--payment-address",
help="The address to which payments should be made. "
@@ -84,9 +80,7 @@ def command( # noqa: C901
}
if min_required_payment:
webapp_deployer_record["record"][
"minimumPayment"
] = f"{min_required_payment}alnt"
webapp_deployer_record["record"]["minimumPayment"] = f"{min_required_payment}alnt"
if dry_run:
yaml.dump(webapp_deployer_record, sys.stdout)

View File

@@ -1,6 +1,6 @@
import os
import time
from functools import wraps
# Define default file path for the lock
DEFAULT_LOCK_FILE_PATH = "/tmp/registry_mutex_lock_file"
@@ -17,7 +17,7 @@ def acquire_lock(client, lock_file_path, timeout):
try:
# Check if lock file exists and is potentially stale
if os.path.exists(lock_file_path):
with open(lock_file_path, "r") as lock_file:
with open(lock_file_path) as lock_file:
timestamp = float(lock_file.read().strip())
# If lock is stale, remove the lock file
@@ -25,9 +25,7 @@ def acquire_lock(client, lock_file_path, timeout):
print(f"Stale lock detected, removing lock file {lock_file_path}")
os.remove(lock_file_path)
else:
print(
f"Lock file {lock_file_path} exists and is recent, waiting..."
)
print(f"Lock file {lock_file_path} exists and is recent, waiting...")
time.sleep(LOCK_RETRY_INTERVAL)
continue
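The stale-lock handling above can be condensed into a single-attempt function. This sketch keeps the same logic (timestamp stored in the file, reaped when older than a timeout); note that the exists/remove/create sequence is not atomic, matching the original's best-effort approach, and the names are illustrative:

```python
import os
import tempfile
import time

LOCK_TIMEOUT = 600  # seconds after which an existing lock file is considered stale

def try_acquire(lock_file_path: str) -> bool:
    """One attempt: reap a stale lock file, then take the lock if it is free."""
    if os.path.exists(lock_file_path):
        with open(lock_file_path) as lock_file:
            timestamp = float(lock_file.read().strip())
        if time.time() - timestamp > LOCK_TIMEOUT:
            # Stale: the previous holder likely crashed without cleaning up.
            os.remove(lock_file_path)
        else:
            return False  # held and recent; the caller retries after a delay
    with open(lock_file_path, "w") as lock_file:
        lock_file.write(str(time.time()))
    return True

lock_path = os.path.join(tempfile.mkdtemp(), "registry.lock")
first = try_acquire(lock_path)   # no lock file yet, so it is acquired
second = try_acquire(lock_path)  # fresh lock present, so it is refused
```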

View File

@@ -12,24 +12,24 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import base64
import shutil
import sys
import tempfile
from datetime import datetime
from typing import NoReturn

import click
import gnupg
import requests
import yaml
from dotenv import dotenv_values

from stack_orchestrator.deploy.webapp.util import (
    AUCTION_KIND_PROVIDER,
    AuctionStatus,
    LaconicRegistryClient,
)
def fatal(msg: str) -> NoReturn:
@@ -38,9 +38,7 @@ def fatal(msg: str) -> NoReturn:
@click.command()
@click.option(
"--laconic-config", help="Provide a config file for laconicd", required=True
)
@click.option("--laconic-config", help="Provide a config file for laconicd", required=True)
@click.option(
"--app",
help="The LRN of the application to deploy.",
@@ -63,9 +61,7 @@ def fatal(msg: str) -> NoReturn:
"'auto' to use the deployer's minimum required payment."
),
)
@click.option(
"--use-payment", help="The TX id of an existing, unused payment", default=None
)
@click.option("--use-payment", help="The TX id of an existing, unused payment", default=None)
@click.option("--dns", help="the DNS name to request (default is autogenerated)")
@click.option(
"--dry-run",
@@ -144,9 +140,7 @@ def command( # noqa: C901
# Check auction kind
auction_kind = auction.kind if auction else None
if auction_kind != AUCTION_KIND_PROVIDER:
fatal(
f"Auction kind needs to be ${AUCTION_KIND_PROVIDER}, got {auction_kind}"
)
fatal(f"Auction kind needs to be ${AUCTION_KIND_PROVIDER}, got {auction_kind}")
# Check auction status
auction_status = auction.status if auction else None
@@ -163,14 +157,9 @@ def command( # noqa: C901
# Get deployer record for all the auction winners
for auction_winner in auction_winners:
# TODO: Match auction winner address with provider address?
deployer_records_by_owner = laconic.webapp_deployers(
{"paymentAddress": auction_winner}
)
deployer_records_by_owner = laconic.webapp_deployers({"paymentAddress": auction_winner})
if len(deployer_records_by_owner) == 0:
print(
f"WARNING: Unable to locate deployer for auction winner "
f"{auction_winner}"
)
print(f"WARNING: Unable to locate deployer for auction winner " f"{auction_winner}")
# Take first record with name set
target_deployer_record = deployer_records_by_owner[0]
@@ -196,9 +185,7 @@ def command( # noqa: C901
gpg = gnupg.GPG(gnupghome=tempdir)
# Import the deployer's public key
result = gpg.import_keys(
base64.b64decode(deployer_record.attributes.publicKey)
)
result = gpg.import_keys(base64.b64decode(deployer_record.attributes.publicKey))
if 1 != result.imported:
fatal("Failed to import deployer's public key.")
@@ -237,15 +224,9 @@ def command( # noqa: C901
if (not deployer) and len(deployer_record.names):
target_deployer = deployer_record.names[0]
app_name = (
app_record.attributes.name
if app_record and app_record.attributes
else "unknown"
)
app_name = app_record.attributes.name if app_record and app_record.attributes else "unknown"
app_version = (
app_record.attributes.version
if app_record and app_record.attributes
else "unknown"
app_record.attributes.version if app_record and app_record.attributes else "unknown"
)
deployment_request = {
"record": {
@@ -273,15 +254,11 @@ def command( # noqa: C901
deployment_request["record"]["payment"] = "DRY_RUN"
elif "auto" == make_payment:
if "minimumPayment" in deployer_record.attributes:
amount = int(
deployer_record.attributes.minimumPayment.replace("alnt", "")
)
amount = int(deployer_record.attributes.minimumPayment.replace("alnt", ""))
else:
amount = make_payment
if amount:
receipt = laconic.send_tokens(
deployer_record.attributes.paymentAddress, amount
)
receipt = laconic.send_tokens(deployer_record.attributes.paymentAddress, amount)
deployment_request["record"]["payment"] = receipt.tx.hash
print("Payment TX:", receipt.tx.hash)
elif use_payment:

View File

@@ -26,12 +26,8 @@ def fatal(msg: str) -> None:
@click.command()
@click.option(
"--laconic-config", help="Provide a config file for laconicd", required=True
)
@click.option(
"--deployer", help="The LRN of the deployer to process this request.", required=True
)
@click.option("--laconic-config", help="Provide a config file for laconicd", required=True)
@click.option("--deployer", help="The LRN of the deployer to process this request.", required=True)
@click.option(
"--deployment",
help="Deployment record (ApplicationDeploymentRecord) id of the deployment.",
@@ -44,9 +40,7 @@ def fatal(msg: str) -> None:
"'auto' to use the deployer's minimum required payment."
),
)
@click.option(
"--use-payment", help="The TX id of an existing, unused payment", default=None
)
@click.option("--use-payment", help="The TX id of an existing, unused payment", default=None)
@click.option(
"--dry-run",
help="Don't publish anything, just report what would be done.",

View File

@@ -22,6 +22,7 @@
# all or specific containers
import hashlib
import click
from dotenv import dotenv_values

View File

@@ -21,11 +21,11 @@ import sys
import click
from stack_orchestrator.deploy.webapp.util import (
TimedLogger,
LaconicRegistryClient,
TimedLogger,
confirm_payment,
match_owner,
skip_by_tag,
confirm_payment,
)
main_logger = TimedLogger(file=sys.stderr)
@@ -40,9 +40,7 @@ def process_app_removal_request(
delete_names,
webapp_deployer_record,
):
deployment_record = laconic.get_record(
app_removal_request.attributes.deployment, require=True
)
deployment_record = laconic.get_record(app_removal_request.attributes.deployment, require=True)
assert deployment_record is not None # require=True ensures this
assert deployment_record.attributes is not None
@@ -50,12 +48,10 @@ def process_app_removal_request(
assert dns_record is not None # require=True ensures this
assert dns_record.attributes is not None
deployment_dir = os.path.join(
deployment_parent_dir, dns_record.attributes.name.lower()
)
deployment_dir = os.path.join(deployment_parent_dir, dns_record.attributes.name.lower())
if not os.path.exists(deployment_dir):
raise Exception("Deployment directory %s does not exist." % deployment_dir)
raise Exception(f"Deployment directory {deployment_dir} does not exist.")
# Check if the removal request is from the owner of the DnsRecord or
# deployment record.
@@ -63,9 +59,7 @@ def process_app_removal_request(
# Or of the original deployment request.
if not matched_owner and deployment_record.attributes.request:
original_request = laconic.get_record(
deployment_record.attributes.request, require=True
)
original_request = laconic.get_record(deployment_record.attributes.request, require=True)
assert original_request is not None # require=True ensures this
matched_owner = match_owner(app_removal_request, original_request)
@@ -75,8 +69,7 @@ def process_app_removal_request(
deployment_id = deployment_record.id if deployment_record else "unknown"
request_id = app_removal_request.id if app_removal_request else "unknown"
raise Exception(
"Unable to confirm ownership of deployment %s for removal request %s"
% (deployment_id, request_id)
f"Unable to confirm ownership of deployment {deployment_id} for removal request {request_id}"
)
# TODO(telackey): Call the function directly. The easiest way to build
@@ -124,7 +117,7 @@ def process_app_removal_request(
def load_known_requests(filename):
if filename and os.path.exists(filename):
return json.load(open(filename, "r"))
return json.load(open(filename))
return {}
@@ -138,9 +131,7 @@ def dump_known_requests(filename, requests):
@click.command()
@click.option(
"--laconic-config", help="Provide a config file for laconicd", required=True
)
@click.option("--laconic-config", help="Provide a config file for laconicd", required=True)
@click.option(
"--deployment-parent-dir",
help="Create deployment directories beneath this directory",
@@ -153,9 +144,7 @@ def dump_known_requests(filename, requests):
is_flag=True,
default=False,
)
@click.option(
"--state-file", help="File to store state about previously seen requests."
)
@click.option("--state-file", help="File to store state about previously seen requests.")
@click.option(
"--only-update-state",
help="Only update the state file, don't process any requests.",
@@ -166,12 +155,8 @@ def dump_known_requests(filename, requests):
help="Delete all names associated with removed deployments.",
default=True,
)
@click.option(
"--delete-volumes/--preserve-volumes", default=True, help="delete data volumes"
)
@click.option(
"--dry-run", help="Don't do anything, just report what would be done.", is_flag=True
)
@click.option("--delete-volumes/--preserve-volumes", default=True, help="delete data volumes")
@click.option("--dry-run", help="Don't do anything, just report what would be done.", is_flag=True)
@click.option(
"--include-tags",
help="Only include requests with matching tags (comma-separated).",
@@ -245,8 +230,7 @@ def command( # noqa: C901
if min_required_payment and not payment_address:
print(
f"Minimum payment required, but no payment address listed "
f"for deployer: {lrn}.",
f"Minimum payment required, but no payment address listed " f"for deployer: {lrn}.",
file=sys.stderr,
)
sys.exit(2)
@@ -303,9 +287,7 @@ def command( # noqa: C901
continue
if not r.attributes.deployment:
r_id = r.id if r else "unknown"
main_logger.log(
f"Skipping removal request {r_id} since it was a cancellation."
)
main_logger.log(f"Skipping removal request {r_id} since it was a cancellation.")
elif r.attributes.deployment in one_per_deployment:
r_id = r.id if r else "unknown"
main_logger.log(f"Skipping removal request {r_id} since it was superseded.")
@@ -323,14 +305,12 @@ def command( # noqa: C901
)
elif skip_by_tag(r, include_tags, exclude_tags):
main_logger.log(
"Skipping removal request %s, filtered by tag "
"(include %s, exclude %s, present %s)"
% (r.id, include_tags, exclude_tags, r.attributes.tags)
f"Skipping removal request {r.id}, filtered by tag "
f"(include {include_tags}, exclude {exclude_tags}, present {r.attributes.tags})"
)
elif r.id in removals_by_request:
main_logger.log(
f"Found satisfied request for {r.id} "
f"at {removals_by_request[r.id].id}"
f"Found satisfied request for {r.id} " f"at {removals_by_request[r.id].id}"
)
elif r.attributes.deployment in removals_by_deployment:
main_logger.log(
@@ -344,8 +324,7 @@ def command( # noqa: C901
requests_to_check_for_payment.append(r)
else:
main_logger.log(
f"Skipping unsatisfied request {r.id} "
"because we have seen it before."
f"Skipping unsatisfied request {r.id} " "because we have seen it before."
)
except Exception as e:
main_logger.log(f"ERROR examining {r.id}: {e}")
@@ -370,9 +349,7 @@ def command( # noqa: C901
else:
requests_to_execute = requests_to_check_for_payment
main_logger.log(
"Found %d unsatisfied request(s) to process." % len(requests_to_execute)
)
main_logger.log(f"Found {len(requests_to_execute)} unsatisfied request(s) to process.")
if not dry_run:
for r in requests_to_execute:

View File
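Many hunks above replace printf-style `%` interpolation with f-strings (ruff rule UP031). A minimal sketch of the equivalence, using a hypothetical value in place of a real deployment path:

```python
# Hypothetical value standing in for a real deployment directory
deployment_dir = "/srv/deployments/my-app.example.com"

# Before: printf-style interpolation (flagged by ruff UP031)
old_msg = "Deployment directory %s does not exist." % deployment_dir

# After: f-string, as applied throughout these hunks
new_msg = f"Deployment directory {deployment_dir} does not exist."

assert old_msg == new_msg  # identical output, clearer source
print(new_msg)
```

The two forms render identically; the f-string just keeps the value next to the text it lands in.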

@@ -22,10 +22,10 @@ import subprocess
import sys
import tempfile
import uuid
import yaml
from enum import Enum
from typing import Any, List, Optional, TextIO
from typing import Any, TextIO
import yaml
from stack_orchestrator.deploy.webapp.registry_mutex import registry_mutex
@@ -43,17 +43,17 @@ AUCTION_KIND_PROVIDER = "provider"
class AttrDict(dict):
def __init__(self, *args: Any, **kwargs: Any) -> None:
super(AttrDict, self).__init__(*args, **kwargs)
super().__init__(*args, **kwargs)
self.__dict__ = self
def __getattribute__(self, attr: str) -> Any:
__dict__ = super(AttrDict, self).__getattribute__("__dict__")
__dict__ = super().__getattribute__("__dict__")
if attr in __dict__:
v = super(AttrDict, self).__getattribute__(attr)
v = super().__getattribute__(attr)
if isinstance(v, dict):
return AttrDict(v)
return v
return super(AttrDict, self).__getattribute__(attr)
return super().__getattribute__(attr)
def __getattr__(self, attr: str) -> Any:
# This method is called when attribute is not found
@@ -62,15 +62,13 @@ class AttrDict(dict):
class TimedLogger:
def __init__(self, id: str = "", file: Optional[TextIO] = None) -> None:
def __init__(self, id: str = "", file: TextIO | None = None) -> None:
self.start = datetime.datetime.now()
self.last = self.start
self.id = id
self.file = file
def log(
self, msg: str, show_step_time: bool = True, show_total_time: bool = False
) -> None:
def log(self, msg: str, show_step_time: bool = True, show_total_time: bool = False) -> None:
prefix = f"{datetime.datetime.utcnow()} - {self.id}"
if show_step_time:
prefix += f" - {datetime.datetime.now() - self.last} (step)"
@@ -84,11 +82,11 @@ class TimedLogger:
def load_known_requests(filename):
if filename and os.path.exists(filename):
return json.load(open(filename, "r"))
return json.load(open(filename))
return {}
def logged_cmd(log_file: Optional[TextIO], *vargs: str) -> str:
def logged_cmd(log_file: TextIO | None, *vargs: str) -> str:
result = None
try:
if log_file:
@@ -105,15 +103,14 @@ def logged_cmd(log_file: Optional[TextIO], *vargs: str) -> str:
raise err
def match_owner(
recordA: Optional[AttrDict], *records: Optional[AttrDict]
) -> Optional[str]:
def match_owner(recordA: AttrDict | None, *records: AttrDict | None) -> str | None:
if not recordA or not recordA.owners:
return None
for owner in recordA.owners:
for otherRecord in records:
if otherRecord and otherRecord.owners and owner in otherRecord.owners:
return owner
result: str | None = owner
return result
return None
@@ -147,9 +144,7 @@ class LaconicRegistryClient:
return self.cache["whoami"]
args = ["laconic", "-c", self.config_file, "registry", "account", "get"]
results = [
AttrDict(r) for r in json.loads(logged_cmd(self.log_file, *args)) if r
]
results = [AttrDict(r) for r in json.loads(logged_cmd(self.log_file, *args)) if r]
if len(results):
self.cache["whoami"] = results[0]
@@ -178,9 +173,7 @@ class LaconicRegistryClient:
"--address",
address,
]
results = [
AttrDict(r) for r in json.loads(logged_cmd(self.log_file, *args)) if r
]
results = [AttrDict(r) for r in json.loads(logged_cmd(self.log_file, *args)) if r]
if len(results):
self.cache["accounts"][address] = results[0]
return results[0]
@@ -203,9 +196,7 @@ class LaconicRegistryClient:
"--id",
id,
]
results = [
AttrDict(r) for r in json.loads(logged_cmd(self.log_file, *args)) if r
]
results = [AttrDict(r) for r in json.loads(logged_cmd(self.log_file, *args)) if r]
self._add_to_cache(results)
if len(results):
return results[0]
@@ -216,9 +207,7 @@ class LaconicRegistryClient:
def list_bonds(self):
args = ["laconic", "-c", self.config_file, "registry", "bond", "list"]
results = [
AttrDict(r) for r in json.loads(logged_cmd(self.log_file, *args)) if r
]
results = [AttrDict(r) for r in json.loads(logged_cmd(self.log_file, *args)) if r]
self._add_to_cache(results)
return results
@@ -232,12 +221,10 @@ class LaconicRegistryClient:
if criteria:
for k, v in criteria.items():
args.append("--%s" % k)
args.append(f"--{k}")
args.append(str(v))
results = [
AttrDict(r) for r in json.loads(logged_cmd(self.log_file, *args)) if r
]
results = [AttrDict(r) for r in json.loads(logged_cmd(self.log_file, *args)) if r]
# Most recent records first
results.sort(key=lambda r: r.createTime or "")
@@ -246,7 +233,7 @@ class LaconicRegistryClient:
return results
def _add_to_cache(self, records: List[AttrDict]) -> None:
def _add_to_cache(self, records: list[AttrDict]) -> None:
if not records:
return
@@ -271,9 +258,7 @@ class LaconicRegistryClient:
args = ["laconic", "-c", self.config_file, "registry", "name", "resolve", name]
parsed = [
AttrDict(r) for r in json.loads(logged_cmd(self.log_file, *args)) if r
]
parsed = [AttrDict(r) for r in json.loads(logged_cmd(self.log_file, *args)) if r]
if parsed:
self._add_to_cache(parsed)
return parsed[0]
@@ -303,9 +288,7 @@ class LaconicRegistryClient:
name_or_id,
]
parsed = [
AttrDict(r) for r in json.loads(logged_cmd(self.log_file, *args)) if r
]
parsed = [AttrDict(r) for r in json.loads(logged_cmd(self.log_file, *args)) if r]
if len(parsed):
self._add_to_cache(parsed)
return parsed[0]
@@ -356,9 +339,7 @@ class LaconicRegistryClient:
results = None
try:
results = [
AttrDict(r) for r in json.loads(logged_cmd(self.log_file, *args)) if r
]
results = [AttrDict(r) for r in json.loads(logged_cmd(self.log_file, *args)) if r]
except: # noqa: E722
pass
@@ -422,7 +403,7 @@ class LaconicRegistryClient:
record_file = open(record_fname, "w")
yaml.dump(record, record_file)
record_file.close()
print(open(record_fname, "r").read(), file=self.log_file)
print(open(record_fname).read(), file=self.log_file)
new_record_id = json.loads(
logged_cmd(
@@ -573,10 +554,10 @@ def determine_base_container(clone_dir, app_type="webapp"):
def build_container_image(
app_record: Optional[AttrDict],
app_record: AttrDict | None,
tag: str,
extra_build_args: Optional[List[str]] = None,
logger: Optional[TimedLogger] = None,
extra_build_args: list[str] | None = None,
logger: TimedLogger | None = None,
) -> None:
if app_record is None:
raise ValueError("app_record cannot be None")
@@ -649,9 +630,7 @@ def build_container_image(
)
result.check_returncode()
base_container = determine_base_container(
clone_dir, app_record.attributes.app_type
)
base_container = determine_base_container(clone_dir, app_record.attributes.app_type)
if logger:
logger.log("Building webapp ...")
@@ -696,7 +675,7 @@ def deploy_to_k8s(deploy_record, deployment_dir, recreate, logger):
if not deploy_record:
commands_to_run = ["start"]
else:
commands_to_run = ["update"]
commands_to_run = ["update-envs"]
for command in commands_to_run:
logger.log(f"Running {command} command on deployment dir: {deployment_dir}")
@@ -727,14 +706,12 @@ def publish_deployment(
if not deploy_record:
deploy_ver = "0.0.1"
else:
deploy_ver = "0.0.%d" % (
int(deploy_record.attributes.version.split(".")[-1]) + 1
)
deploy_ver = f"0.0.{int(deploy_record.attributes.version.split('.')[-1]) + 1}"
if not dns_record:
dns_ver = "0.0.1"
else:
dns_ver = "0.0.%d" % (int(dns_record.attributes.version.split(".")[-1]) + 1)
dns_ver = f"0.0.{int(dns_record.attributes.version.split('.')[-1]) + 1}"
spec = yaml.full_load(open(os.path.join(deployment_dir, "spec.yml")))
fqdn = spec["network"]["http-proxy"][0]["host-name"]
@@ -779,13 +756,9 @@ def publish_deployment(
# Set auction or payment id from request
if app_deployment_request.attributes.auction:
new_deployment_record["record"][
"auction"
] = app_deployment_request.attributes.auction
new_deployment_record["record"]["auction"] = app_deployment_request.attributes.auction
elif app_deployment_request.attributes.payment:
new_deployment_record["record"][
"payment"
] = app_deployment_request.attributes.payment
new_deployment_record["record"]["payment"] = app_deployment_request.attributes.payment
if webapp_deployer_record:
new_deployment_record["record"]["deployer"] = webapp_deployer_record.names[0]
@@ -799,9 +772,7 @@ def publish_deployment(
def hostname_for_deployment_request(app_deployment_request, laconic):
dns_name = app_deployment_request.attributes.dns
if not dns_name:
app = laconic.get_record(
app_deployment_request.attributes.application, require=True
)
app = laconic.get_record(app_deployment_request.attributes.application, require=True)
dns_name = generate_hostname_for_app(app)
elif dns_name.startswith("lrn://"):
record = laconic.get_record(dns_name, require=True)
@@ -818,7 +789,7 @@ def generate_hostname_for_app(app):
m.update(app.attributes.repository[0].encode())
else:
m.update(app.attributes.repository.encode())
return "%s-%s" % (last_part, m.hexdigest()[0:10])
return f"{last_part}-{m.hexdigest()[0:10]}"
def skip_by_tag(r, include_tags, exclude_tags):
@@ -881,16 +852,13 @@ def confirm_payment(
pay_denom = "".join([i for i in tx_amount if not i.isdigit()])
if pay_denom != "alnt":
logger.log(
f"{record.id}: {pay_denom} in tx {tx.hash} is not an expected "
"payment denomination"
f"{record.id}: {pay_denom} in tx {tx.hash} is not an expected " "payment denomination"
)
return False
pay_amount = int("".join([i for i in tx_amount if i.isdigit()]) or "0")
if pay_amount < min_amount:
logger.log(
f"{record.id}: payment amount {tx.amount} is less than minimum {min_amount}"
)
logger.log(f"{record.id}: payment amount {tx.amount} is less than minimum {min_amount}")
return False
# Check if the payment was already used on a deployment
@@ -914,9 +882,7 @@ def confirm_payment(
{"deployer": record.attributes.deployer, "payment": tx.hash}, all=True
)
if len(used):
logger.log(
f"{record.id}: payment {tx.hash} already used on deployment removal {used}"
)
logger.log(f"{record.id}: payment {tx.hash} already used on deployment removal {used}")
return False
return True
@@ -940,9 +906,7 @@ def confirm_auction(
# Cross check app against application in the auction record
requested_app = laconic.get_record(record.attributes.application, require=True)
auction_app = laconic.get_record(
auction_records_by_id[0].attributes.application, require=True
)
auction_app = laconic.get_record(auction_records_by_id[0].attributes.application, require=True)
requested_app_id = requested_app.id if requested_app else None
auction_app_id = auction_app.id if auction_app else None
if requested_app_id != auction_app_id:

View File

@@ -15,30 +15,24 @@
import click
from stack_orchestrator import opts, update, version
from stack_orchestrator.build import build_containers, build_npms, build_webapp, fetch_containers
from stack_orchestrator.command_types import CommandOptions
from stack_orchestrator.repos import setup_repositories
from stack_orchestrator.repos import fetch_stack
from stack_orchestrator.build import build_containers, fetch_containers
from stack_orchestrator.build import build_npms
from stack_orchestrator.build import build_webapp
from stack_orchestrator.deploy import deploy, deployment
from stack_orchestrator.deploy.webapp import (
run_webapp,
deploy_webapp,
deploy_webapp_from_registry,
undeploy_webapp_from_registry,
publish_webapp_deployer,
publish_deployment_auction,
handle_deployment_auction,
publish_deployment_auction,
publish_webapp_deployer,
request_webapp_deployment,
request_webapp_undeployment,
run_webapp,
undeploy_webapp_from_registry,
)
from stack_orchestrator.deploy import deploy
from stack_orchestrator import version
from stack_orchestrator.deploy import deployment
from stack_orchestrator import opts
from stack_orchestrator import update
from stack_orchestrator.repos import fetch_stack, setup_repositories
CONTEXT_SETTINGS = dict(help_option_names=["-h", "--help"])
CONTEXT_SETTINGS = {"help_option_names": ["-h", "--help"]}
@click.group(context_settings=CONTEXT_SETTINGS)

View File
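The last change in the hunk above rewrites a keyword-argument `dict(...)` call as a literal (ruff rule C408). The two forms build the same object:

```python
# Before: keyword-argument dict() call (flagged by ruff C408)
context_settings_old = dict(help_option_names=["-h", "--help"])

# After: dict literal, as now used for CONTEXT_SETTINGS
context_settings_new = {"help_option_names": ["-h", "--help"]}

assert context_settings_old == context_settings_new
```

The literal avoids a name lookup and a call, and it can express keys that are not valid identifiers.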

@@ -17,9 +17,9 @@
# CERC_REPO_BASE_DIR defaults to ~/cerc
import click
import os
import click
from decouple import config
from git import exc
@@ -36,9 +36,7 @@ from stack_orchestrator.util import error_exit
@click.pass_context
def command(ctx, stack_locator, git_ssh, check_only, pull):
"""Optionally resolve then git clone a repository with stack definitions."""
dev_root_path = os.path.expanduser(
str(config("CERC_REPO_BASE_DIR", default="~/cerc"))
)
dev_root_path = os.path.expanduser(str(config("CERC_REPO_BASE_DIR", default="~/cerc")))
if not opts.o.quiet:
print(f"Dev Root is: {dev_root_path}")
try:

View File

@@ -16,20 +16,22 @@
# env vars:
# CERC_REPO_BASE_DIR defaults to ~/cerc
import importlib.resources
import os
import sys
from decouple import config
import git
from git.exc import GitCommandError, InvalidGitRepositoryError
from typing import Any
from tqdm import tqdm
import click
import importlib.resources
import git
from decouple import config
from git.exc import GitCommandError, InvalidGitRepositoryError
from tqdm import tqdm
from stack_orchestrator.opts import opts
from stack_orchestrator.util import (
error_exit,
get_parsed_stack_config,
include_exclude_check,
error_exit,
warn_exit,
)
@@ -86,48 +88,38 @@ def _get_repo_current_branch_or_tag(full_filesystem_repo_path):
current_repo_branch_or_tag = "***UNDETERMINED***"
is_branch = False
try:
current_repo_branch_or_tag = git.Repo(
full_filesystem_repo_path
).active_branch.name
current_repo_branch_or_tag = git.Repo(full_filesystem_repo_path).active_branch.name
is_branch = True
except TypeError:
# This means that the current ref is not a branch, so possibly a tag
# Let's try to get the tag
try:
current_repo_branch_or_tag = git.Repo(
full_filesystem_repo_path
).git.describe("--tags", "--exact-match")
current_repo_branch_or_tag = git.Repo(full_filesystem_repo_path).git.describe(
"--tags", "--exact-match"
)
# Note that git is asymmetric -- the tag you told it to check out
# may not be the one you get back here (if there are multiple tags
# associated with the same commit)
except GitCommandError:
# If there is no matching branch or tag checked out, just use the current
# SHA
current_repo_branch_or_tag = (
git.Repo(full_filesystem_repo_path).commit("HEAD").hexsha
)
current_repo_branch_or_tag = git.Repo(full_filesystem_repo_path).commit("HEAD").hexsha
return current_repo_branch_or_tag, is_branch
# TODO: fix the messy arg list here
def process_repo(
pull, check_only, git_ssh, dev_root_path, branches_array, fully_qualified_repo
):
def process_repo(pull, check_only, git_ssh, dev_root_path, branches_array, fully_qualified_repo):
if opts.o.verbose:
print(f"Processing repo: {fully_qualified_repo}")
repo_host, repo_path, repo_branch = host_and_path_for_repo(fully_qualified_repo)
git_ssh_prefix = f"git@{repo_host}:"
git_http_prefix = f"https://{repo_host}/"
full_github_repo_path = (
f"{git_ssh_prefix if git_ssh else git_http_prefix}{repo_path}"
)
full_github_repo_path = f"{git_ssh_prefix if git_ssh else git_http_prefix}{repo_path}"
repoName = repo_path.split("/")[-1]
full_filesystem_repo_path = os.path.join(dev_root_path, repoName)
is_present = os.path.isdir(full_filesystem_repo_path)
(current_repo_branch_or_tag, is_branch) = (
_get_repo_current_branch_or_tag(full_filesystem_repo_path)
if is_present
else (None, None)
_get_repo_current_branch_or_tag(full_filesystem_repo_path) if is_present else (None, None)
)
if not opts.o.quiet:
present_text = (
@@ -140,10 +132,7 @@ def process_repo(
# Quick check that it's actually a repo
if is_present:
if not is_git_repo(full_filesystem_repo_path):
print(
f"Error: {full_filesystem_repo_path} does not contain "
"a valid git repository"
)
print(f"Error: {full_filesystem_repo_path} does not contain " "a valid git repository")
sys.exit(1)
else:
if pull:
@@ -190,8 +179,7 @@ def process_repo(
if branch_to_checkout:
if current_repo_branch_or_tag is None or (
current_repo_branch_or_tag
and (current_repo_branch_or_tag != branch_to_checkout)
current_repo_branch_or_tag and (current_repo_branch_or_tag != branch_to_checkout)
):
if not opts.o.quiet:
print(f"switching to branch {branch_to_checkout} in repo {repo_path}")
@@ -245,14 +233,9 @@ def command(ctx, include, exclude, git_ssh, check_only, pull, branches):
if local_stack:
dev_root_path = os.getcwd()[0 : os.getcwd().rindex("stack-orchestrator")]
print(
f"Local stack dev_root_path (CERC_REPO_BASE_DIR) overridden to: "
f"{dev_root_path}"
)
print(f"Local stack dev_root_path (CERC_REPO_BASE_DIR) overridden to: " f"{dev_root_path}")
else:
dev_root_path = os.path.expanduser(
str(config("CERC_REPO_BASE_DIR", default="~/cerc"))
)
dev_root_path = os.path.expanduser(str(config("CERC_REPO_BASE_DIR", default="~/cerc")))
if not quiet:
print(f"Dev Root is: {dev_root_path}")
@@ -265,9 +248,7 @@ def command(ctx, include, exclude, git_ssh, check_only, pull, branches):
# See: https://stackoverflow.com/a/20885799/1701505
from stack_orchestrator import data
with importlib.resources.open_text(
data, "repository-list.txt"
) as repository_list_file:
with importlib.resources.open_text(data, "repository-list.txt") as repository_list_file:
all_repos = repository_list_file.read().splitlines()
repos_in_scope = []

View File

@@ -13,16 +13,18 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http:#www.gnu.org/licenses/>.
import click
import datetime
import filecmp
import os
from pathlib import Path
import requests
import sys
import stat
import shutil
import stat
import sys
from pathlib import Path
import click
import requests
import validators
from stack_orchestrator.util import get_yaml
@@ -40,9 +42,7 @@ def _error_exit(s: str):
# Note at present this probably won't work on non-Unix based OSes like Windows
@click.command()
@click.option(
"--check-only", is_flag=True, default=False, help="only check, don't update"
)
@click.option("--check-only", is_flag=True, default=False, help="only check, don't update")
@click.pass_context
def command(ctx, check_only):
"""update shiv binary from a distribution url"""
@@ -52,7 +52,7 @@ def command(ctx, check_only):
if not config_file_path.exists():
_error_exit(f"Error: Config file: {config_file_path} not found")
yaml = get_yaml()
config = yaml.load(open(config_file_path, "r"))
config = yaml.load(open(config_file_path))
if "distribution-url" not in config:
_error_exit(f"Error: {config_key} not defined in {config_file_path}")
distribution_url = config[config_key]
@@ -61,9 +61,7 @@ def command(ctx, check_only):
_error_exit(f"ERROR: distribution url: {distribution_url} is not valid")
# Figure out the filename for ourselves
shiv_binary_path = Path(sys.argv[0])
timestamp_filename = (
f"laconic-so-download-{datetime.datetime.now().strftime('%y%m%d-%H%M%S')}"
)
timestamp_filename = f"laconic-so-download-{datetime.datetime.now().strftime('%y%m%d-%H%M%S')}"
temp_download_path = shiv_binary_path.parent.joinpath(timestamp_filename)
# Download the file to a temp filename
if ctx.obj.verbose:

View File
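Several files in this set (including the hunk above) drop the redundant `"r"` argument from `open(...)`, since text-mode read is Python's default (ruff rule UP015). A small sketch with a hypothetical temp file standing in for the real config path:

```python
import os
import tempfile

# Hypothetical config file standing in for the real config.yml path
fd, path = tempfile.mkstemp(suffix=".yml")
with os.fdopen(fd, "w") as f:
    f.write("distribution-url: https://example.test/laconic-so\n")

# Before: open(config_file_path, "r")
with open(path, "r") as cfg:
    old_text = cfg.read()

# After: "r" (text read) is the default mode, so it can be dropped
with open(path) as cfg:
    new_text = cfg.read()

assert old_text == new_text
os.remove(path)
```

Note these hunks only shorten the call; the pre-existing bare `open(...)` without a context manager is left as-is by the formatter.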

@@ -13,14 +13,17 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http:#www.gnu.org/licenses/>.
from decouple import config
import os.path
import sys
import ruamel.yaml
from collections.abc import Mapping
from pathlib import Path
from typing import NoReturn
import ruamel.yaml
from decouple import config
from dotenv import dotenv_values
from typing import Mapping, NoReturn, Optional, Set, List
from stack_orchestrator.constants import stack_file_name, deployment_file_name
from stack_orchestrator.constants import deployment_file_name, stack_file_name
def include_exclude_check(s, include, exclude):
@@ -50,14 +53,9 @@ def get_dev_root_path(ctx):
if ctx and ctx.local_stack:
# TODO: This code probably doesn't work
dev_root_path = os.getcwd()[0 : os.getcwd().rindex("stack-orchestrator")]
print(
f"Local stack dev_root_path (CERC_REPO_BASE_DIR) overridden to: "
f"{dev_root_path}"
)
print(f"Local stack dev_root_path (CERC_REPO_BASE_DIR) overridden to: " f"{dev_root_path}")
else:
dev_root_path = os.path.expanduser(
str(config("CERC_REPO_BASE_DIR", default="~/cerc"))
)
dev_root_path = os.path.expanduser(str(config("CERC_REPO_BASE_DIR", default="~/cerc")))
return dev_root_path
@@ -65,7 +63,7 @@ def get_dev_root_path(ctx):
def get_parsed_stack_config(stack):
stack_file_path = get_stack_path(stack).joinpath(stack_file_name)
if stack_file_path.exists():
return get_yaml().load(open(stack_file_path, "r"))
return get_yaml().load(open(stack_file_path))
# We try here to generate a useful diagnostic error
# First check if the stack directory is present
if stack_file_path.parent.exists():
@@ -101,10 +99,10 @@ def get_job_list(parsed_stack):
return result
def get_plugin_code_paths(stack) -> List[Path]:
def get_plugin_code_paths(stack) -> list[Path]:
parsed_stack = get_parsed_stack_config(stack)
pods = parsed_stack["pods"]
result: Set[Path] = set()
result: set[Path] = set()
for pod in pods:
if type(pod) is str:
result.add(get_stack_path(stack))
@@ -191,7 +189,7 @@ def get_job_file_path(stack, parsed_stack, job_name: str):
def get_pod_script_paths(parsed_stack, pod_name: str):
pods = parsed_stack["pods"]
result = []
if not type(pods[0]) is str:
if type(pods[0]) is not str:
for pod in pods:
if pod["name"] == pod_name:
pod_root_dir = os.path.join(
@@ -243,7 +241,7 @@ def get_k8s_dir():
def get_parsed_deployment_spec(spec_file):
spec_file_path = Path(spec_file)
try:
return get_yaml().load(open(spec_file_path, "r"))
return get_yaml().load(open(spec_file_path))
except FileNotFoundError as error:
# We try here to generate a useful diagnostic error
print(f"Error: spec file: {spec_file_path} does not exist")
@@ -293,5 +291,6 @@ def warn_exit(s) -> NoReturn:
sys.exit(0)
def env_var_map_from_file(file: Path) -> Mapping[str, Optional[str]]:
return dotenv_values(file)
def env_var_map_from_file(file: Path) -> Mapping[str, str | None]:
result: Mapping[str, str | None] = dotenv_values(file)
return result

View File

@@ -13,8 +13,9 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http:#www.gnu.org/licenses/>.
from importlib import metadata, resources
import click
from importlib import resources, metadata
@click.command()

uv.lock (2092 lines, generated, new file)

File diff suppressed because it is too large