Compare commits
No commits in common. "main" and "roysc/deployment-create-sync" have entirely different histories.
71
CLAUDE.md
@ -8,7 +8,6 @@ NEVER assume your hypotheses are true without evidence
|
||||
|
||||
ALWAYS clearly state when something is a hypothesis
|
||||
ALWAYS use evidence from the systems you're interacting with to support your claims and hypotheses
|
||||
ALWAYS run `pre-commit run --all-files` before committing changes
|
||||
|
||||
## Key Principles
|
||||
|
||||
@ -44,76 +43,6 @@ This project follows principles inspired by literate programming, where developm
|
||||
|
||||
This approach treats the human-AI collaboration as a form of **conversational literate programming** where understanding emerges through dialogue before code implementation.
|
||||
|
||||
## External Stacks Preferred
|
||||
|
||||
When creating new stacks for any reason, **use the external stack pattern** rather than adding stacks directly to this repository.
|
||||
|
||||
External stacks follow this structure:
|
||||
|
||||
```
my-stack/
└── stack-orchestrator/
    ├── stacks/
    │   └── my-stack/
    │       ├── stack.yml
    │       └── README.md
    ├── compose/
    │   └── docker-compose-my-stack.yml
    └── config/
        └── my-stack/
            └── (config files)
```
|
||||
### Usage
|
||||
|
||||
```bash
# Fetch external stack
laconic-so fetch-stack github.com/org/my-stack

# Use external stack
STACK_PATH=~/cerc/my-stack/stack-orchestrator/stacks/my-stack
laconic-so --stack $STACK_PATH deploy init --output spec.yml
laconic-so --stack $STACK_PATH deploy create --spec-file spec.yml --deployment-dir deployment
laconic-so deployment --dir deployment start
```
|
||||
### Examples
|
||||
|
||||
- `zenith-karma-stack` - Karma watcher deployment
|
||||
- `urbit-stack` - Fake Urbit ship for testing
|
||||
- `zenith-desk-stack` - Desk deployment stack
|
||||
|
||||
## Architecture: k8s-kind Deployments
|
||||
|
||||
### One Cluster Per Host
|
||||
One Kind cluster per host by design. Never request or expect separate clusters.
|
||||
|
||||
- `create_cluster()` in `helpers.py` reuses any existing cluster
|
||||
- `cluster-id` in deployment.yml is an identifier, not a cluster request
|
||||
- All deployments share: ingress controller, etcd, certificates
|
||||
|
||||
### Stack Resolution
|
||||
- External stacks detected via `Path(stack).exists()` in `util.py`
|
||||
- Config/compose resolution: external path first, then internal fallback
|
||||
- External path structure: `stack_orchestrator/data/stacks/<name>/stack.yml`
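
The resolution order in the list above can be sketched as follows. This is a minimal illustration, not the code in `util.py`: the function name `resolve_stack_dir` and the `internal_root` parameter are hypothetical, and only the `Path(stack).exists()` check comes from the notes above.

```python
from pathlib import Path


def resolve_stack_dir(stack: str, internal_root: Path) -> Path:
    """Return the directory expected to hold stack.yml for `stack`.

    Hypothetical helper: a stack argument that exists on disk is treated as
    an external stack; anything else is looked up by name under internal_root.
    """
    candidate = Path(stack)
    if candidate.exists():
        # External stack: the argument is itself a path on disk.
        return candidate.resolve()
    # Internal fallback: treat the argument as a stack name.
    return internal_root / "stacks" / stack
```

A consequence of this ordering is that a directory in the current working directory with the same name as an internal stack would shadow the internal one.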
|
||||
|
||||
### Secret Generation Implementation
|
||||
- `GENERATE_TOKEN_PATTERN` in `deployment_create.py` matches `$generate:type:length$`
|
||||
- `_generate_and_store_secrets()` creates K8s Secret
|
||||
- `cluster_info.py` adds `envFrom` with `secretRef` to containers
|
||||
- Non-secret config written to `config.env`
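
A rough sketch of how the `$generate:type:length$` token could be separated from plain config. The regular expression, the `split_config` name, and the reading of `length` as a byte count are assumptions for illustration; the actual `GENERATE_TOKEN_PATTERN` and `_generate_and_store_secrets()` may behave differently.

```python
import re
from secrets import token_hex
from typing import Dict, Tuple

# Assumed shape of the token; the real GENERATE_TOKEN_PATTERN may differ.
TOKEN_PATTERN = re.compile(r"\$generate:(\w+):(\d+)\$")


def split_config(config: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split config into (plain, secrets), generating a value for each token."""
    plain: Dict[str, str] = {}
    generated: Dict[str, str] = {}
    for key, value in config.items():
        match = TOKEN_PATTERN.fullmatch(str(value))
        if match is None:
            # Ordinary value: would be written to config.env.
            plain[key] = value
            continue
        kind, length = match.group(1), int(match.group(2))
        if kind != "hex":
            raise ValueError(f"unsupported generate type: {kind}")
        # Assumption: length counts bytes, so the hex string has 2 * length characters.
        generated[key] = token_hex(length)
    return plain, generated
```

With the spec example `SECRET_VAR: $generate:hex:32$`, the key would land in the second dictionary, which is the part destined for a K8s Secret rather than `config.env`.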
|
||||
|
||||
### Repository Cloning
|
||||
`setup-repositories --git-ssh` clones repos defined in stack.yml's `repos:` field. Requires SSH agent.
|
||||
|
||||
### Key Files (for codebase navigation)
|
||||
- `repos/setup_repositories.py`: `setup-repositories` command (git clone)
|
||||
- `deployment_create.py`: `deploy create` command, secret generation
|
||||
- `deployment.py`: `deployment start/stop/restart` commands
|
||||
- `deploy_k8s.py`: K8s deployer, cluster management calls
|
||||
- `helpers.py`: `create_cluster()`, etcd cleanup, kind operations
|
||||
- `cluster_info.py`: K8s resource generation (Deployment, Service, Ingress)
|
||||
|
||||
## Insights and Observations
|
||||
|
||||
### Design Principles
|
||||
|
||||
53
README.md
@ -71,59 +71,6 @@ The various [stacks](/stack_orchestrator/data/stacks) each contain instructions
|
||||
- [laconicd with console and CLI](stack_orchestrator/data/stacks/fixturenet-laconic-loaded)
|
||||
- [kubo (IPFS)](stack_orchestrator/data/stacks/kubo)
|
||||
|
||||
## Deployment Types
|
||||
|
||||
- **compose**: Docker Compose on local machine
|
||||
- **k8s**: External Kubernetes cluster (requires kubeconfig)
|
||||
- **k8s-kind**: Local Kubernetes via Kind - one cluster per host, shared by all deployments
|
||||
|
||||
## External Stacks
|
||||
|
||||
Stacks can live in external git repositories. Required structure:
|
||||
|
||||
```
<repo>/
  stack_orchestrator/data/
    stacks/<stack-name>/stack.yml
    compose/docker-compose-<pod-name>.yml
  deployment/spec.yml
```
|
||||
## Deployment Commands
|
||||
|
||||
```bash
# Create deployment from spec
laconic-so --stack <path> deploy create --spec-file <spec.yml> --deployment-dir <dir>

# Start (creates cluster on first run)
laconic-so deployment --dir <dir> start

# GitOps restart (git pull + redeploy, preserves data)
laconic-so deployment --dir <dir> restart

# Stop
laconic-so deployment --dir <dir> stop
```
|
||||
## spec.yml Reference
|
||||
|
||||
```yaml
stack: stack-name-or-path
deploy-to: k8s-kind
network:
  http-proxy:
    - host-name: app.example.com
      routes:
        - path: /
          proxy-to: service-name:port
  acme-email: admin@example.com
config:
  ENV_VAR: value
  SECRET_VAR: $generate:hex:32$  # Auto-generated, stored in K8s Secret
volumes:
  volume-name:
```
|
||||
## Contributing
|
||||
|
||||
See the [CONTRIBUTING.md](/docs/CONTRIBUTING.md) for developer mode install.
|
||||
|
||||
19
TODO.md
@ -7,25 +7,6 @@ We need an "update stack" command in stack orchestrator and cleaner documentatio
|
||||
|
||||
**Context**: Currently, `deploy init` generates a spec file and `deploy create` creates a deployment directory. The `deployment update` command (added by Thomas Lackey) only syncs env vars and restarts - it doesn't regenerate configurations. There's a gap in the workflow for updating stack configurations after initial deployment.
|
||||
|
||||
## Bugs
|
||||
|
||||
### `deploy create` doesn't auto-generate volume mappings for new pods
|
||||
|
||||
When a new pod is added to `stack.yml` (e.g. `monitoring`), `deploy create` does not generate default host path mappings in spec.yml for the new pod's volumes. The deployment then fails at scheduling because the PVCs don't exist.

**Expected**: `deploy create` enumerates all volumes from all compose files in the stack and generates default host paths for any that aren't already mapped in the spec.yml `volumes:` section.

**Actual**: Only volumes already in spec.yml get PVs. New volumes are silently missing, causing `FailedScheduling: persistentvolumeclaim not found`.
|
||||
|
||||
**Workaround**: Manually add volume entries to spec.yml and create host dirs.
|
||||
|
||||
**Files**: `deployment_create.py` (`_write_config_file`, volume handling)
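
The expected behavior described above might look roughly like this. It is a sketch only: `default_volume_mappings` is a hypothetical name, the compose files are assumed to be already parsed into dictionaries, and the `./data/<name>` default mirrors the one `init_operation` uses for named volumes.

```python
from typing import Dict, List, Optional


def default_volume_mappings(
    compose_files: List[dict], spec_volumes: Dict[str, Optional[str]]
) -> Dict[str, Optional[str]]:
    """Return the spec volumes plus a default host path for every unmapped volume."""
    merged = dict(spec_volumes)
    for compose in compose_files:
        # A compose file may have no top-level volumes key, or an empty one.
        for name in compose.get("volumes") or {}:
            # Existing spec entries win; only missing volumes get a default.
            merged.setdefault(name, f"./data/{name}")
    return merged
```

Keeping existing spec entries untouched matters here, since an operator may have pointed a volume at a custom host path.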
|
||||
|
||||
## Architecture Refactoring
|
||||
|
||||
### Separate Deployer from Stack Orchestrator CLI
|
||||
|
||||
76
docs/cli.md
@ -68,7 +68,7 @@ $ laconic-so build-npms --include <package-name> --force-rebuild
|
||||
|
||||
## deploy
|
||||
|
||||
The `deploy` command group manages persistent deployments. The general workflow is `deploy init` to generate a spec file, then `deploy create` to create a deployment directory from the spec, then runtime commands like `deployment start` and `deployment stop`.
|
||||
The `deploy` command group manages persistent deployments. The general workflow is `deploy init` to generate a spec file, then `deploy create` to create a deployment directory from the spec, then runtime commands like `deploy up` and `deploy down`.
|
||||
|
||||
### deploy init
|
||||
|
||||
@ -101,91 +101,35 @@ Options:
|
||||
- `--spec-file` (required): spec file to use
|
||||
- `--deployment-dir`: target directory for deployment files
|
||||
- `--update`: update an existing deployment directory, preserving data volumes and env file. Changed files are backed up with a `.bak` suffix. The deployment's `config.env` and `deployment.yml` are also preserved.
|
||||
- `--helm-chart`: generate Helm chart instead of deploying (k8s only)
|
||||
- `--network-dir`: network configuration supplied in this directory
|
||||
- `--initial-peers`: initial set of persistent peers
|
||||
|
||||
## deployment
|
||||
### deploy up
|
||||
|
||||
Runtime commands for managing a created deployment. Use `--dir` to specify the deployment directory.
|
||||
|
||||
### deployment start
|
||||
|
||||
Start a deployment (`up` is a legacy alias):
|
||||
Start a deployment:
|
||||
```
|
||||
$ laconic-so deployment --dir <deployment-dir> start
|
||||
$ laconic-so deployment --dir <deployment-dir> up
|
||||
```
|
||||
|
||||
Options:
|
||||
- `--stay-attached` / `--detatch-terminal`: attach to container stdout (default: detach)
|
||||
- `--skip-cluster-management` / `--perform-cluster-management`: skip kind cluster creation/teardown (default: perform management). Only affects k8s-kind deployments. Use this when multiple stacks share a single cluster.
|
||||
### deploy down
|
||||
|
||||
### deployment stop
|
||||
|
||||
Stop a deployment (`down` is a legacy alias):
|
||||
Stop a deployment:
|
||||
```
|
||||
$ laconic-so deployment --dir <deployment-dir> stop
|
||||
$ laconic-so deployment --dir <deployment-dir> down
|
||||
```
|
||||
Use `--delete-volumes` to also remove data volumes.
|
||||
|
||||
Options:
|
||||
- `--delete-volumes` / `--preserve-volumes`: delete data volumes on stop (default: preserve)
|
||||
- `--skip-cluster-management` / `--perform-cluster-management`: skip kind cluster teardown (default: perform management). Use this to stop a single deployment without destroying a shared cluster.
|
||||
|
||||
### deployment restart
|
||||
|
||||
Restart a deployment with GitOps-aware workflow. Pulls latest stack code, syncs the deployment directory from the git-tracked spec, and restarts services:
|
||||
```
|
||||
$ laconic-so deployment --dir <deployment-dir> restart
|
||||
```
|
||||
|
||||
See [deployment_patterns.md](deployment_patterns.md) for the recommended GitOps workflow.
|
||||
|
||||
### deployment ps
|
||||
### deploy ps
|
||||
|
||||
Show running services:
|
||||
```
|
||||
$ laconic-so deployment --dir <deployment-dir> ps
|
||||
```
|
||||
|
||||
### deployment logs
|
||||
### deploy logs
|
||||
|
||||
View service logs:
|
||||
```
|
||||
$ laconic-so deployment --dir <deployment-dir> logs
|
||||
```
|
||||
Use `-f` to follow and `-n <count>` to tail.
|
||||
|
||||
### deployment exec
|
||||
|
||||
Execute a command in a running service container:
|
||||
```
|
||||
$ laconic-so deployment --dir <deployment-dir> exec <service-name> "<command>"
|
||||
```
|
||||
|
||||
### deployment status
|
||||
|
||||
Show deployment status:
|
||||
```
|
||||
$ laconic-so deployment --dir <deployment-dir> status
|
||||
```
|
||||
|
||||
### deployment port
|
||||
|
||||
Show mapped ports for a service:
|
||||
```
|
||||
$ laconic-so deployment --dir <deployment-dir> port <service-name> <port>
|
||||
```
|
||||
|
||||
### deployment push-images
|
||||
|
||||
Push deployment images to a registry:
|
||||
```
|
||||
$ laconic-so deployment --dir <deployment-dir> push-images
|
||||
```
|
||||
|
||||
### deployment run-job
|
||||
|
||||
Run a one-time job in the deployment:
|
||||
```
|
||||
$ laconic-so deployment --dir <deployment-dir> run-job <job-name>
|
||||
```
|
||||
|
||||
@ -1,202 +0,0 @@
|
||||
# Deployment Patterns
|
||||
|
||||
## GitOps Pattern
|
||||
|
||||
For production deployments, we recommend a GitOps approach where your deployment configuration is tracked in version control.
|
||||
|
||||
### Overview
|
||||
|
||||
- **spec.yml is your source of truth**: Maintain it in your operator repository
|
||||
- **Don't regenerate on every restart**: Run `deploy init` once, then customize and commit
|
||||
- **Use restart for updates**: The restart command respects your git-tracked spec.yml
|
||||
|
||||
### Workflow
|
||||
|
||||
1. **Initial setup**: Run `deploy init` once to generate a spec.yml template
|
||||
2. **Customize and commit**: Edit spec.yml with your configuration (hostnames, resources, etc.) and commit to your operator repo
|
||||
3. **Deploy from git**: Use the committed spec.yml for deployments
|
||||
4. **Update via git**: Make changes in git, then restart to apply
|
||||
|
||||
```bash
# Initial setup (run once)
laconic-so --stack my-stack deploy init --output spec.yml

# Customize for your environment
vim spec.yml  # Set hostname, resources, etc.

# Commit to your operator repository
git add spec.yml
git commit -m "Add my-stack deployment configuration"
git push

# On deployment server: deploy from git-tracked spec
laconic-so --stack my-stack deploy create \
  --spec-file /path/to/operator-repo/spec.yml \
  --deployment-dir my-deployment

laconic-so deployment --dir my-deployment start
```
|
||||
### Updating Deployments
|
||||
|
||||
When you need to update a deployment:
|
||||
|
||||
```bash
# 1. Make changes in your operator repo
vim /path/to/operator-repo/spec.yml
git commit -am "Update configuration"
git push

# 2. On deployment server: pull and restart
cd /path/to/operator-repo && git pull
laconic-so deployment --dir my-deployment restart
```
|
||||
The `restart` command:
|
||||
- Pulls latest code from the stack repository
|
||||
- Uses your git-tracked spec.yml (does NOT regenerate from defaults)
|
||||
- Syncs the deployment directory
|
||||
- Restarts services
|
||||
|
||||
### Anti-patterns
|
||||
|
||||
**Don't do this:**
|
||||
```bash
# BAD: Regenerating spec on every deployment
laconic-so --stack my-stack deploy init --output spec.yml
laconic-so deploy create --spec-file spec.yml ...
```
|
||||
This overwrites your customizations with defaults from the stack's `commands.py`.
|
||||
|
||||
**Do this instead:**
|
||||
```bash
# GOOD: Use your git-tracked spec
git pull  # Get latest spec.yml from your operator repo
laconic-so deployment --dir my-deployment restart
```
|
||||
## Private Registry Authentication
|
||||
|
||||
For deployments using images from private container registries (e.g., GitHub Container Registry), configure authentication in your spec.yml:
|
||||
|
||||
### Configuration
|
||||
|
||||
Add a `registry-credentials` section to your spec.yml:
|
||||
|
||||
```yaml
registry-credentials:
  server: ghcr.io
  username: your-org-or-username
  token-env: REGISTRY_TOKEN
```
|
||||
**Fields:**
|
||||
- `server`: The registry hostname (e.g., `ghcr.io`, `docker.io`, `gcr.io`)
|
||||
- `username`: Registry username (for GHCR, use your GitHub username or org name)
|
||||
- `token-env`: Name of the environment variable containing your API token/PAT
|
||||
|
||||
### Token Environment Variable
|
||||
|
||||
The `token-env` pattern keeps credentials out of version control. Set the environment variable when running `deployment start`:
|
||||
|
||||
```bash
|
||||
export REGISTRY_TOKEN="your-personal-access-token"
|
||||
laconic-so deployment --dir my-deployment start
|
||||
```
|
||||
|
||||
For GHCR, create a Personal Access Token (PAT) with `read:packages` scope.
|
||||
|
||||
### Ansible Integration
|
||||
|
||||
When using Ansible for deployments, pass the token from a credentials file:
|
||||
|
||||
```yaml
- name: Start deployment
  ansible.builtin.command:
    cmd: laconic-so deployment --dir {{ deployment_dir }} start
  environment:
    REGISTRY_TOKEN: "{{ lookup('file', '~/.credentials/ghcr_token') }}"
```
|
||||
### How It Works
|
||||
|
||||
1. laconic-so reads the `registry-credentials` config from spec.yml
|
||||
2. Creates a Kubernetes `docker-registry` secret named `{deployment}-registry`
|
||||
3. The deployment's pods reference this secret for image pulls
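
To make step 2 concrete, here is a sketch of the payload a Kubernetes `docker-registry` secret carries under its `.dockerconfigjson` key. The payload layout is the standard Docker config format; the helper name `build_dockerconfigjson` and the way the token is read are assumptions, not laconic-so's actual code.

```python
import base64
import json
import os
from typing import Mapping


def build_dockerconfigjson(
    credentials: Mapping[str, str], environ: Mapping[str, str] = os.environ
) -> str:
    """Build the JSON document stored in a docker-registry secret.

    `credentials` mirrors the registry-credentials section of spec.yml.
    """
    server = credentials["server"]
    username = credentials["username"]
    # The token itself never appears in spec.yml, only the variable name.
    token = environ[credentials["token-env"]]
    auth = base64.b64encode(f"{username}:{token}".encode()).decode()
    return json.dumps(
        {"auths": {server: {"username": username, "password": token, "auth": auth}}}
    )
```

A missing environment variable raises `KeyError` in this sketch, which is one way to surface a forgotten `export REGISTRY_TOKEN=...` before any pod hits an image pull error.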
|
||||
|
||||
## Cluster and Volume Management
|
||||
|
||||
### Stopping Deployments
|
||||
|
||||
The `deployment stop` command has two important flags:
|
||||
|
||||
```bash
# Default: stops deployment, deletes cluster, PRESERVES volumes
laconic-so deployment --dir my-deployment stop

# Explicitly delete volumes (USE WITH CAUTION)
laconic-so deployment --dir my-deployment stop --delete-volumes
```
|
||||
### Volume Persistence
|
||||
|
||||
Volumes persist across cluster deletion by design. This is important because:
|
||||
- **Data survives cluster recreation**: Ledger data, databases, and other state are preserved
|
||||
- **Faster recovery**: No need to re-sync or rebuild data after cluster issues
|
||||
- **Safe cluster upgrades**: Delete and recreate cluster without data loss
|
||||
|
||||
**Only use `--delete-volumes` when:**
|
||||
- You explicitly want to start fresh with no data
|
||||
- The user specifically requests volume deletion
|
||||
- You're cleaning up a test/dev environment completely
|
||||
|
||||
### Shared Cluster Architecture
|
||||
|
||||
In kind deployments, multiple stacks share a single cluster:
|
||||
- First `deployment start` creates the cluster
|
||||
- Subsequent deployments reuse the existing cluster
|
||||
- `deployment stop` on ANY deployment deletes the shared cluster
|
||||
- Other deployments will fail until cluster is recreated
|
||||
|
||||
To stop a single deployment without affecting the cluster:
|
||||
```bash
|
||||
laconic-so deployment --dir my-deployment stop --skip-cluster-management
|
||||
```
|
||||
|
||||
## Volume Persistence in k8s-kind
|
||||
|
||||
k8s-kind has 3 storage layers:
|
||||
|
||||
- **Docker Host**: The physical server running Docker
|
||||
- **Kind Node**: A Docker container simulating a k8s node
|
||||
- **Pod Container**: Your workload
|
||||
|
||||
For k8s-kind, volumes with paths are mounted from Docker Host → Kind Node → Pod via extraMounts.
|
||||
|
||||
| spec.yml volume | Storage Location | Survives Pod Restart | Survives Cluster Restart |
|-----------------|------------------|----------------------|--------------------------|
| `vol:` (empty) | Kind Node PVC | ✅ | ❌ |
| `vol: ./data/x` | Docker Host | ✅ | ✅ |
| `vol: /abs/path` | Docker Host | ✅ | ✅ |
|
||||
**Recommendation**: Always use paths for data you want to keep. Relative paths (e.g., `./data/rpc-config`) resolve to `$DEPLOYMENT_DIR/data/rpc-config` on the Docker Host.
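
The three rows of the table can be expressed as a small path-resolution rule. This is an illustrative sketch; `resolve_volume_host_path` is a hypothetical name and not a function in stack-orchestrator.

```python
from pathlib import Path
from typing import Optional


def resolve_volume_host_path(
    value: Optional[str], deployment_dir: Path
) -> Optional[Path]:
    """Map a spec.yml volume value to its location on the Docker Host.

    Returns None for an empty value, meaning the data lives only in the
    Kind Node and does not survive cluster recreation.
    """
    if not value:
        return None
    path = Path(value)
    if path.is_absolute():
        return path
    # Relative paths are anchored at the deployment directory.
    return deployment_dir / path
```

A `None` result is the case to watch for when reviewing a spec: those volumes are the ones that look persistent but are not.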
|
||||
|
||||
### Example
|
||||
|
||||
```yaml
# In spec.yml
volumes:
  rpc-config: ./data/rpc-config  # Persists to $DEPLOYMENT_DIR/data/rpc-config
  chain-data: ./data/chain       # Persists to $DEPLOYMENT_DIR/data/chain
  temp-cache:                    # Empty = Kind Node PVC (lost on cluster delete)
```
|
||||
### The Antipattern
|
||||
|
||||
Empty-path volumes appear persistent because they survive pod restarts (data lives in Kind Node container). However, this data is lost when the kind cluster is recreated. This "false persistence" has caused data loss when operators assumed their data was safe.
@ -29,7 +29,6 @@ network_key = "network"
|
||||
http_proxy_key = "http-proxy"
|
||||
image_registry_key = "image-registry"
|
||||
configmaps_key = "configmaps"
|
||||
secrets_key = "secrets"
|
||||
resources_key = "resources"
|
||||
volumes_key = "volumes"
|
||||
security_key = "security"
|
||||
@ -45,4 +44,3 @@ unlimited_memlock_key = "unlimited-memlock"
|
||||
runtime_class_key = "runtime-class"
|
||||
high_memlock_runtime = "high-memlock"
|
||||
high_memlock_spec_filename = "high-memlock-spec.json"
|
||||
acme_email_key = "acme-email"
|
||||
|
||||
@ -1,5 +0,0 @@
|
||||
services:
|
||||
test-job:
|
||||
image: cerc/test-container:local
|
||||
entrypoint: /bin/sh
|
||||
command: ["-c", "echo 'Job completed successfully'"]
|
||||
@ -93,7 +93,6 @@ rules:
|
||||
- get
|
||||
- create
|
||||
- update
|
||||
- delete
|
||||
---
|
||||
apiVersion: rbac.authorization.k8s.io/v1
|
||||
kind: ClusterRoleBinding
|
||||
|
||||
@ -21,7 +21,7 @@ from stack_orchestrator.deploy.deploy_util import VolumeMapping, run_container_c
|
||||
from pathlib import Path
|
||||
|
||||
default_spec_file_content = """config:
|
||||
test_variable_1: test-value-1
|
||||
test-variable-1: test-value-1
|
||||
"""
|
||||
|
||||
|
||||
|
||||
@ -7,5 +7,3 @@ containers:
|
||||
- cerc/test-container
|
||||
pods:
|
||||
- test
|
||||
jobs:
|
||||
- test-job
|
||||
|
||||
@ -35,7 +35,6 @@ from stack_orchestrator.util import (
|
||||
get_dev_root_path,
|
||||
stack_is_in_deployment,
|
||||
resolve_compose_file,
|
||||
get_job_list,
|
||||
)
|
||||
from stack_orchestrator.deploy.deployer import DeployerException
|
||||
from stack_orchestrator.deploy.deployer_factory import getDeployer
|
||||
@ -131,7 +130,6 @@ def create_deploy_context(
|
||||
compose_files=cluster_context.compose_files,
|
||||
compose_project_name=cluster_context.cluster,
|
||||
compose_env_file=cluster_context.env_file,
|
||||
job_compose_files=cluster_context.job_compose_files,
|
||||
)
|
||||
return DeployCommandContext(stack, cluster_context, deployer)
|
||||
|
||||
@ -405,7 +403,7 @@ def _make_cluster_context(ctx, stack, include, exclude, cluster, env_file):
|
||||
stack_config = get_parsed_stack_config(stack)
|
||||
if stack_config is not None:
|
||||
# TODO: syntax check the input here
|
||||
pods_in_scope = stack_config.get("pods") or []
|
||||
pods_in_scope = stack_config["pods"]
|
||||
cluster_config = (
|
||||
stack_config["config"] if "config" in stack_config else None
|
||||
)
|
||||
@ -479,22 +477,6 @@ def _make_cluster_context(ctx, stack, include, exclude, cluster, env_file):
|
||||
if ctx.verbose:
|
||||
print(f"files: {compose_files}")
|
||||
|
||||
# Gather job compose files (from compose-jobs/ directory in deployment)
|
||||
job_compose_files = []
|
||||
if deployment and stack:
|
||||
stack_config = get_parsed_stack_config(stack)
|
||||
if stack_config:
|
||||
jobs = get_job_list(stack_config)
|
||||
compose_jobs_dir = stack.joinpath("compose-jobs")
|
||||
for job in jobs:
|
||||
job_file_name = os.path.join(
|
||||
compose_jobs_dir, f"docker-compose-{job}.yml"
|
||||
)
|
||||
if os.path.exists(job_file_name):
|
||||
job_compose_files.append(job_file_name)
|
||||
if ctx.verbose:
|
||||
print(f"job files: {job_compose_files}")
|
||||
|
||||
return ClusterContext(
|
||||
ctx,
|
||||
cluster,
|
||||
@ -503,7 +485,6 @@ def _make_cluster_context(ctx, stack, include, exclude, cluster, env_file):
|
||||
post_start_commands,
|
||||
cluster_config,
|
||||
env_file,
|
||||
job_compose_files=job_compose_files if job_compose_files else None,
|
||||
)
|
||||
|
||||
|
||||
|
||||
@ -29,7 +29,6 @@ class ClusterContext:
|
||||
post_start_commands: List[str]
|
||||
config: Optional[str]
|
||||
env_file: Optional[str]
|
||||
job_compose_files: Optional[List[str]] = None
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@ -34,12 +34,7 @@ def getDeployerConfigGenerator(type: str, deployment_context):
|
||||
|
||||
|
||||
def getDeployer(
|
||||
type: str,
|
||||
deployment_context,
|
||||
compose_files,
|
||||
compose_project_name,
|
||||
compose_env_file,
|
||||
job_compose_files=None,
|
||||
type: str, deployment_context, compose_files, compose_project_name, compose_env_file
|
||||
):
|
||||
if type == "compose" or type is None:
|
||||
return DockerDeployer(
|
||||
@ -59,7 +54,6 @@ def getDeployer(
|
||||
compose_files,
|
||||
compose_project_name,
|
||||
compose_env_file,
|
||||
job_compose_files=job_compose_files,
|
||||
)
|
||||
else:
|
||||
print(f"ERROR: deploy-to {type} is not valid")
|
||||
|
||||
@ -15,9 +15,7 @@
|
||||
|
||||
import click
|
||||
from pathlib import Path
|
||||
import subprocess
|
||||
import sys
|
||||
import time
|
||||
from stack_orchestrator import constants
|
||||
from stack_orchestrator.deploy.images import push_images_operation
|
||||
from stack_orchestrator.deploy.deploy import (
|
||||
@ -230,176 +228,3 @@ def run_job(ctx, job_name, helm_release):
|
||||
|
||||
ctx.obj = make_deploy_context(ctx)
|
||||
run_job_operation(ctx, job_name, helm_release)
|
||||
|
||||
|
||||
@command.command()
@click.option("--stack-path", help="Path to stack git repo (overrides stored path)")
@click.option(
    "--spec-file", help="Path to GitOps spec.yml in repo (e.g., deployment/spec.yml)"
)
@click.option("--config-file", help="Config file to pass to deploy init")
@click.option(
    "--force",
    is_flag=True,
    default=False,
    help="Skip DNS verification",
)
@click.option(
    "--expected-ip",
    help="Expected IP for DNS verification (if different from egress)",
)
@click.pass_context
def restart(ctx, stack_path, spec_file, config_file, force, expected_ip):
    """Pull latest code and restart deployment using git-tracked spec.

    GitOps workflow:
    1. Operator maintains spec.yml in their git repository
    2. This command pulls latest code (including updated spec.yml)
    3. If hostname changed, verifies DNS routes to this server
    4. Syncs deployment directory with the git-tracked spec
    5. Stops and restarts the deployment

    Data volumes are always preserved. The cluster is never destroyed.

    Stack source resolution (in order):
    1. --stack-path argument (if provided)
    2. stack-source field in deployment.yml (if stored)
    3. Error if neither available

    Note: spec.yml should be maintained in git, not regenerated from
    commands.py on each restart. Use 'deploy init' only for initial
    spec generation, then customize and commit to your operator repo.
    """
    from stack_orchestrator.util import get_yaml, get_parsed_deployment_spec
    from stack_orchestrator.deploy.deployment_create import create_operation
    from stack_orchestrator.deploy.dns_probe import verify_dns_via_probe

    deployment_context: DeploymentContext = ctx.obj

    # Get current spec info (before git pull)
    current_spec = deployment_context.spec
    current_http_proxy = current_spec.get_http_proxy()
    current_hostname = (
        current_http_proxy[0]["host-name"] if current_http_proxy else None
    )

    # Resolve stack source path
    if stack_path:
        stack_source = Path(stack_path).resolve()
    else:
        # Try to get from deployment.yml
        deployment_file = (
            deployment_context.deployment_dir / constants.deployment_file_name
        )
        deployment_data = get_yaml().load(open(deployment_file))
        stack_source_str = deployment_data.get("stack-source")
        if not stack_source_str:
            print(
                "Error: No stack-source in deployment.yml and --stack-path not provided"
            )
            print("Use --stack-path to specify the stack git repository location")
            sys.exit(1)
        stack_source = Path(stack_source_str)

    if not stack_source.exists():
        print(f"Error: Stack source path does not exist: {stack_source}")
        sys.exit(1)

    print("=== Deployment Restart ===")
    print(f"Deployment dir: {deployment_context.deployment_dir}")
    print(f"Stack source: {stack_source}")
    print(f"Current hostname: {current_hostname}")

    # Step 1: Git pull (brings in updated spec.yml from operator's repo)
    print("\n[1/4] Pulling latest code from stack repository...")
    git_result = subprocess.run(
        ["git", "pull"], cwd=stack_source, capture_output=True, text=True
    )
    if git_result.returncode != 0:
        print(f"Git pull failed: {git_result.stderr}")
        sys.exit(1)
    print(f"Git pull: {git_result.stdout.strip()}")

    # Determine spec file location
    # Priority: --spec-file argument > repo's deployment/spec.yml > deployment dir
    # Stack path is like: repo/stack_orchestrator/data/stacks/stack-name
    # So repo root is 4 parents up
    repo_root = stack_source.parent.parent.parent.parent
    if spec_file:
        # Spec file relative to repo root
        spec_file_path = repo_root / spec_file
    else:
        # Try standard GitOps location in repo
        gitops_spec = repo_root / "deployment" / "spec.yml"
        if gitops_spec.exists():
            spec_file_path = gitops_spec
        else:
            # Fall back to deployment directory
            spec_file_path = deployment_context.deployment_dir / "spec.yml"

    if not spec_file_path.exists():
        print(f"Error: spec.yml not found at {spec_file_path}")
        print("For GitOps, add spec.yml to your repo at deployment/spec.yml")
        print("Or specify --spec-file with path relative to repo root")
        sys.exit(1)

    print(f"Using spec: {spec_file_path}")

    # Parse spec to check for hostname changes
    new_spec_obj = get_parsed_deployment_spec(str(spec_file_path))
    new_http_proxy = new_spec_obj.get("network", {}).get("http-proxy", [])
    new_hostname = new_http_proxy[0]["host-name"] if new_http_proxy else None

    print(f"Spec hostname: {new_hostname}")

    # Step 2: DNS verification (only if hostname changed)
    if new_hostname and new_hostname != current_hostname:
        print(f"\n[2/4] Hostname changed: {current_hostname} -> {new_hostname}")
        if force:
            print("DNS verification skipped (--force)")
        else:
            print("Verifying DNS via probe...")
            if not verify_dns_via_probe(new_hostname):
                print(f"\nDNS verification failed for {new_hostname}")
                print("Ensure DNS is configured before restarting.")
                print("Use --force to skip this check.")
                sys.exit(1)
    else:
        print("\n[2/4] Hostname unchanged, skipping DNS verification")

    # Step 3: Sync deployment directory with spec
    print("\n[3/4] Syncing deployment directory...")
    deploy_ctx = make_deploy_context(ctx)
    create_operation(
        deployment_command_context=deploy_ctx,
        spec_file=str(spec_file_path),
        deployment_dir=str(deployment_context.deployment_dir),
        update=True,
        network_dir=None,
        initial_peers=None,
    )

    # Reload deployment context with updated spec
    deployment_context.init(deployment_context.deployment_dir)
    ctx.obj = deployment_context

    # Stop deployment
    print("\n[4/4] Restarting deployment...")
    ctx.obj = make_deploy_context(ctx)
    down_operation(
        ctx, delete_volumes=False, extra_args_list=[], skip_cluster_management=True
    )

    # Brief pause to ensure clean shutdown
    time.sleep(5)

    # Start deployment
    up_operation(
        ctx, services_list=None, stay_attached=False, skip_cluster_management=True
    )

    print("\n=== Restart Complete ===")
    print("Deployment restarted with git-tracked configuration.")
    if new_hostname and new_hostname != current_hostname:
        print(f"\nNew hostname: {new_hostname}")
        print("Caddy will automatically provision TLS certificate.")
|
||||
@ -15,12 +15,9 @@
|
||||
|
||||
import click
|
||||
from importlib import util
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
import base64
|
||||
from pathlib import Path
|
||||
from typing import List, Optional
|
||||
from typing import List
|
||||
import random
|
||||
from shutil import copy, copyfile, copytree, rmtree
|
||||
from secrets import token_hex
|
||||
@ -265,25 +262,6 @@ def call_stack_deploy_create(deployment_context, extra_args):
|
||||
imported_stack.create(deployment_context, extra_args)
|
||||
|
||||
|
||||
def call_stack_deploy_start(deployment_context):
    """Call start() hooks after k8s deployments and jobs are created.

    The start() hook receives the DeploymentContext, allowing stacks to
    create additional k8s resources (Services, etc.) in the deployment namespace.
    The namespace can be derived as f"laconic-{deployment_context.id}".
    """
    python_file_paths = _commands_plugin_paths(deployment_context.stack.name)
    for python_file_path in python_file_paths:
        if python_file_path.exists():
            spec = util.spec_from_file_location("commands", python_file_path)
            if spec is None or spec.loader is None:
                continue
            imported_stack = util.module_from_spec(spec)
            spec.loader.exec_module(imported_stack)
            if _has_method(imported_stack, "start"):
                imported_stack.start(deployment_context)
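The hook lookup in `call_stack_deploy_start` uses the standard `importlib.util` file-loading pattern. The following is a minimal, self-contained sketch of that pattern; `call_hook` and the temporary `commands.py` plugin are hypothetical names for illustration and are not part of stack-orchestrator.

```python
import tempfile
from importlib import util
from pathlib import Path


def call_hook(python_file_path: Path, hook_name: str, *args):
    """Load a module from a file path and call hook_name if it is defined.

    Hypothetical helper: it mirrors the load-then-check-then-call shape of
    the function in the diff, but is not the project's own API.
    """
    spec = util.spec_from_file_location("commands", python_file_path)
    if spec is None or spec.loader is None:
        return None
    module = util.module_from_spec(spec)
    spec.loader.exec_module(module)
    hook = getattr(module, hook_name, None)
    if callable(hook):
        return hook(*args)
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Write a throwaway plugin file that defines only a start() hook.
    plugin = Path(tmp) / "commands.py"
    plugin.write_text("def start(ctx):\n    return f'started {ctx}'\n")
    result = call_hook(plugin, "start", "demo")
    missing = call_hook(plugin, "stop", "demo")
```

A plugin that does not define the requested hook is skipped silently, which is why stacks without a `start()` function would be unaffected by the call.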
|
||||
|
||||
|
||||
# Inspect the pod yaml to find config files referenced in subdirectories
|
||||
# other than the one associated with the pod
|
||||
def _find_extra_config_dirs(parsed_pod_file, pod):
|
||||
@ -490,15 +468,9 @@ def init_operation(
|
||||
else:
|
||||
volume_descriptors[named_volume] = f"./data/{named_volume}"
|
||||
if volume_descriptors:
|
||||
# Merge with existing volumes from stack init()
|
||||
# init() volumes take precedence over compose defaults
|
||||
orig_volumes = spec_file_content.get("volumes", {})
|
||||
spec_file_content["volumes"] = {**volume_descriptors, **orig_volumes}
|
||||
spec_file_content["volumes"] = volume_descriptors
|
||||
if configmap_descriptors:
|
||||
spec_file_content["configmaps"] = configmap_descriptors
|
||||
if "k8s" in deployer_type:
|
||||
if "secrets" not in spec_file_content:
|
||||
spec_file_content["secrets"] = {}
|
||||
|
||||
if opts.o.debug:
|
||||
print(
|
||||
@ -509,180 +481,15 @@ def init_operation(
|
||||
get_yaml().dump(spec_file_content, output_file)
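The volume merge in `init_operation` (`{**volume_descriptors, **orig_volumes}`) depends on Python's dict-unpacking order: on a key collision the mapping unpacked last wins. A small sketch with made-up volume names and paths (only the merge expression comes from the diff):

```python
# Hypothetical values for illustration only.
compose_defaults = {"db-data": "./data/db-data", "logs": "./data/logs"}
init_volumes = {"db-data": "/srv/db"}

# The right-hand mapping overrides the left-hand one on shared keys,
# so entries supplied by a stack's init() replace the compose defaults.
merged = {**compose_defaults, **init_volumes}
```

Keys that appear only in the defaults are kept unchanged, so the merge never drops a volume that the compose files declare.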
|
||||
|
||||
|
||||
# Token pattern: $generate:hex:32$ or $generate:base64:16$
GENERATE_TOKEN_PATTERN = re.compile(r"\$generate:(\w+):(\d+)\$")


def _generate_and_store_secrets(config_vars: dict, deployment_name: str):
    """Generate secrets for $generate:...$ tokens and store in K8s Secret.

    Called by `deploy create` - generates fresh secrets and stores them.
    Returns the generated secrets dict for reference.
    """
    from kubernetes import client, config as k8s_config

    secrets = {}
    for name, value in config_vars.items():
        if not isinstance(value, str):
            continue
        match = GENERATE_TOKEN_PATTERN.search(value)
        if not match:
            continue

        secret_type, length = match.group(1), int(match.group(2))
        if secret_type == "hex":
            secrets[name] = token_hex(length)
        elif secret_type == "base64":
            secrets[name] = base64.b64encode(os.urandom(length)).decode()
        else:
            secrets[name] = token_hex(length)

    if not secrets:
        return secrets

    # Store in K8s Secret
    try:
        k8s_config.load_kube_config()
    except Exception:
        # Fall back to in-cluster config if available
        try:
            k8s_config.load_incluster_config()
        except Exception:
            print(
                "Warning: Could not load kube config, secrets will not be stored in K8s"
            )
            return secrets

    v1 = client.CoreV1Api()
    secret_name = f"{deployment_name}-generated-secrets"
    namespace = "default"

    secret_data = {k: base64.b64encode(v.encode()).decode() for k, v in secrets.items()}
    k8s_secret = client.V1Secret(
        metadata=client.V1ObjectMeta(name=secret_name), data=secret_data, type="Opaque"
    )

    try:
        v1.create_namespaced_secret(namespace, k8s_secret)
        num_secrets = len(secrets)
        print(f"Created K8s Secret '{secret_name}' with {num_secrets} secret(s)")
    except client.exceptions.ApiException as e:
        if e.status == 409:  # Already exists
            v1.replace_namespaced_secret(secret_name, namespace, k8s_secret)
            num_secrets = len(secrets)
            print(f"Updated K8s Secret '{secret_name}' with {num_secrets} secret(s)")
        else:
            raise

    return secrets
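The token handling can be exercised without a cluster. The sketch below isolates the `$generate:<type>:<length>$` parsing step; `generate_secret` is a hypothetical helper name, and the example assumes (as `token_hex` and `os.urandom` imply) that the length counts random bytes rather than output characters.

```python
import base64
import os
import re
from secrets import token_hex

GENERATE_TOKEN_PATTERN = re.compile(r"\$generate:(\w+):(\d+)\$")


def generate_secret(value: str):
    """Return a fresh secret for a $generate:...$ token, or None if absent.

    Hypothetical helper, not part of the diff; it only mirrors the
    per-value branch of the loop in _generate_and_store_secrets.
    """
    match = GENERATE_TOKEN_PATTERN.search(value)
    if match is None:
        return None
    secret_type, length = match.group(1), int(match.group(2))
    if secret_type == "base64":
        return base64.b64encode(os.urandom(length)).decode()
    # "hex" and any unrecognized type both fall back to hex output.
    return token_hex(length)


hex_secret = generate_secret("$generate:hex:32$")
b64_secret = generate_secret("$generate:base64:16$")
plain = generate_secret("postgres://localhost")
```

Because the length is in bytes, `$generate:hex:32$` yields a 64-character string, which may matter for consumers that expect a fixed-width value.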
|
||||
|
||||
|
||||
def create_registry_secret(spec: Spec, deployment_name: str) -> Optional[str]:
    """Create K8s docker-registry secret from spec + environment.

    Reads registry configuration from spec.yml and creates a Kubernetes
    secret of type kubernetes.io/dockerconfigjson for image pulls.

    Args:
        spec: The deployment spec containing image-registry config
        deployment_name: Name of the deployment (used for secret naming)

    Returns:
        The secret name if created, None if no registry config
    """
    from kubernetes import client, config as k8s_config

    registry_config = spec.get_image_registry_config()
    if not registry_config:
        return None

    server = registry_config.get("server")
    username = registry_config.get("username")
    token_env = registry_config.get("token-env")

    if not all([server, username, token_env]):
        return None

    # Type narrowing for pyright - we've validated these aren't None above
    assert token_env is not None
    token = os.environ.get(token_env)
    if not token:
        print(
            f"Warning: Registry token env var '{token_env}' not set, "
            "skipping registry secret"
        )
        return None

    # Create dockerconfigjson format (Docker API uses "password" field for tokens)
    auth = base64.b64encode(f"{username}:{token}".encode()).decode()
    docker_config = {
        "auths": {server: {"username": username, "password": token, "auth": auth}}
    }

    # Secret name derived from deployment name
    secret_name = f"{deployment_name}-registry"

    # Load kube config
    try:
        k8s_config.load_kube_config()
    except Exception:
        try:
            k8s_config.load_incluster_config()
        except Exception:
            print("Warning: Could not load kube config, registry secret not created")
            return None

    v1 = client.CoreV1Api()
    namespace = "default"

    k8s_secret = client.V1Secret(
        metadata=client.V1ObjectMeta(name=secret_name),
        data={
            ".dockerconfigjson": base64.b64encode(
                json.dumps(docker_config).encode()
            ).decode()
        },
        type="kubernetes.io/dockerconfigjson",
    )

    try:
        v1.create_namespaced_secret(namespace, k8s_secret)
        print(f"Created registry secret '{secret_name}' for {server}")
    except client.exceptions.ApiException as e:
        if e.status == 409:  # Already exists
            v1.replace_namespaced_secret(secret_name, namespace, k8s_secret)
            print(f"Updated registry secret '{secret_name}' for {server}")
        else:
            raise

    return secret_name
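The `.dockerconfigjson` payload involves two layers of base64: the inner `auth` field encodes `username:token`, and the whole JSON document is encoded again for the Secret's `data` map. A standalone sketch of just that encoding step follows; `build_dockerconfigjson`, the registry host, and the credentials are placeholders, not values from the project.

```python
import base64
import json


def build_dockerconfigjson(server: str, username: str, token: str) -> str:
    """Return the base64 value for a Secret's ".dockerconfigjson" key.

    Hypothetical helper that repeats the encoding shown in the diff
    without touching the Kubernetes API.
    """
    auth = base64.b64encode(f"{username}:{token}".encode()).decode()
    docker_config = {
        "auths": {server: {"username": username, "password": token, "auth": auth}}
    }
    return base64.b64encode(json.dumps(docker_config).encode()).decode()


# Placeholder inputs; decode again to inspect the structure.
encoded = build_dockerconfigjson("registry.example.com", "ci-bot", "s3cret")
decoded = json.loads(base64.b64decode(encoded))
inner_auth = base64.b64decode(decoded["auths"]["registry.example.com"]["auth"])
```

Decoding the result in this way is also a quick check when an image pull fails with an authentication error.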
|
||||
|
||||
|
||||
def _write_config_file(
|
||||
spec_file: Path, config_env_file: Path, deployment_name: Optional[str] = None
|
||||
):
|
||||
def _write_config_file(spec_file: Path, config_env_file: Path):
|
||||
spec_content = get_parsed_deployment_spec(spec_file)
|
||||
config_vars = spec_content.get("config", {}) or {}
|
||||
|
||||
# Generate and store secrets in K8s if deployment_name provided and tokens exist
|
||||
if deployment_name and config_vars:
|
||||
has_generate_tokens = any(
|
||||
isinstance(v, str) and GENERATE_TOKEN_PATTERN.search(v)
|
||||
for v in config_vars.values()
|
||||
)
|
||||
if has_generate_tokens:
|
||||
_generate_and_store_secrets(config_vars, deployment_name)
|
||||
|
||||
# Write non-secret config to config.env (exclude $generate:...$ tokens)
|
||||
# Note: we want to write an empty file even if we have no config variables
|
||||
with open(config_env_file, "w") as output_file:
|
||||
if config_vars:
|
||||
for variable_name, variable_value in config_vars.items():
|
||||
# Skip variables with generate tokens - they go to K8s Secret
|
||||
if isinstance(variable_value, str) and GENERATE_TOKEN_PATTERN.search(
|
||||
variable_value
|
||||
):
|
||||
continue
|
||||
output_file.write(f"{variable_name}={variable_value}\n")
|
||||
if "config" in spec_content and spec_content["config"]:
|
||||
config_vars = spec_content["config"]
|
||||
if config_vars:
|
||||
for variable_name, variable_value in config_vars.items():
|
||||
output_file.write(f"{variable_name}={variable_value}\n")
|
||||
|
||||
|
||||
def _write_kube_config_file(external_path: Path, internal_path: Path):
|
||||
@ -697,14 +504,11 @@ def _copy_files_to_directory(file_paths: List[Path], directory: Path):
|
||||
copy(path, os.path.join(directory, os.path.basename(path)))
|
||||
|
||||
|
||||
def _create_deployment_file(deployment_dir: Path, stack_source: Optional[Path] = None):
|
||||
def _create_deployment_file(deployment_dir: Path):
|
||||
deployment_file_path = deployment_dir.joinpath(constants.deployment_file_name)
|
||||
cluster = f"{constants.cluster_name_prefix}{token_hex(8)}"
|
||||
deployment_content = {constants.cluster_id_key: cluster}
|
||||
if stack_source:
|
||||
deployment_content["stack-source"] = str(stack_source)
|
||||
with open(deployment_file_path, "w") as output_file:
|
||||
get_yaml().dump(deployment_content, output_file)
|
||||
output_file.write(f"{constants.cluster_id_key}: {cluster}\n")
|
||||
|
||||
|
||||
def _check_volume_definitions(spec):
|
||||
@ -712,14 +516,10 @@ def _check_volume_definitions(spec):
|
||||
for volume_name, volume_path in spec.get_volumes().items():
|
||||
if volume_path:
|
||||
if not os.path.isabs(volume_path):
|
||||
# For k8s-kind: allow relative paths, they'll be resolved
|
||||
# by _make_absolute_host_path() during kind config generation
|
||||
if not spec.is_kind_deployment():
|
||||
deploy_type = spec.get_deployment_type()
|
||||
raise Exception(
|
||||
f"Relative path {volume_path} for volume "
|
||||
f"{volume_name} not supported for {deploy_type}"
|
||||
)
|
||||
raise Exception(
|
||||
f"Relative path {volume_path} for volume {volume_name} not "
|
||||
f"supported for deployment type {spec.get_deployment_type()}"
|
||||
)
|
||||
|
||||
|
||||
@click.command()
|
||||
@ -813,15 +613,11 @@ def create_operation(
|
||||
generate_helm_chart(stack_name, spec_file, deployment_dir_path)
|
||||
return # Exit early for helm chart generation
|
||||
|
||||
# Resolve stack source path for restart capability
|
||||
stack_source = get_stack_path(stack_name)
|
||||
|
||||
if update:
|
||||
# Sync mode: write to temp dir, then copy to deployment dir with backups
|
||||
temp_dir = Path(tempfile.mkdtemp(prefix="deployment-sync-"))
|
||||
try:
|
||||
# Write deployment files to temp dir
|
||||
# (skip deployment.yml to preserve cluster ID)
|
||||
# Write deployment files to temp dir (skip deployment.yml to preserve cluster ID)
|
||||
_write_deployment_files(
|
||||
temp_dir,
|
||||
Path(spec_file),
|
||||
@ -829,14 +625,12 @@ def create_operation(
|
||||
stack_name,
|
||||
deployment_type,
|
||||
include_deployment_file=False,
|
||||
stack_source=stack_source,
|
||||
)
|
||||
|
||||
# Copy from temp to deployment dir, excluding data volumes
|
||||
# and backing up changed files.
|
||||
# Exclude data/* to avoid touching user data volumes.
|
||||
# Exclude config file to preserve deployment settings
|
||||
# (XXX breaks passing config vars from spec)
|
||||
# Copy from temp to deployment dir, excluding data volumes and backing up changed files
|
||||
# Exclude data/* to avoid touching user data volumes
|
||||
# Exclude config file to preserve deployment settings (XXX breaks passing config vars
|
||||
# from spec. could warn about this or not exclude...)
|
||||
exclude_patterns = ["data", "data/*", constants.config_file_name]
|
||||
_safe_copy_tree(
|
||||
temp_dir, deployment_dir_path, exclude_patterns=exclude_patterns
|
||||
@ -853,7 +647,6 @@ def create_operation(
|
||||
stack_name,
|
||||
deployment_type,
|
||||
include_deployment_file=True,
|
||||
stack_source=stack_source,
|
||||
)
|
||||
|
||||
# Delegate to the stack's Python code
|
||||
@ -874,7 +667,7 @@ def create_operation(
|
||||
)
|
||||
|
||||
|
||||
def _safe_copy_tree(src: Path, dst: Path, exclude_patterns: Optional[List[str]] = None):
|
||||
def _safe_copy_tree(src: Path, dst: Path, exclude_patterns: List[str] = None):
|
||||
"""
|
||||
Recursively copy a directory tree, backing up changed files with .bak suffix.
|
||||
|
||||
@ -925,7 +718,6 @@ def _write_deployment_files(
|
||||
stack_name: str,
|
||||
deployment_type: str,
|
||||
include_deployment_file: bool = True,
|
||||
stack_source: Optional[Path] = None,
|
||||
):
|
||||
"""
|
||||
Write deployment files to target directory.
|
||||
@ -935,8 +727,7 @@ def _write_deployment_files(
|
||||
:param parsed_spec: Parsed spec object
|
||||
:param stack_name: Name of stack
|
||||
:param deployment_type: Type of deployment
|
||||
:param include_deployment_file: Whether to create deployment.yml (skip for update)
|
||||
:param stack_source: Path to stack source (git repo) for restart capability
|
||||
:param include_deployment_file: Whether to create deployment.yml file (skip for update)
|
||||
"""
|
||||
stack_file = get_stack_path(stack_name).joinpath(constants.stack_file_name)
|
||||
parsed_stack = get_parsed_stack_config(stack_name)
|
||||
@ -947,15 +738,10 @@ def _write_deployment_files(
|
||||
|
||||
# Create deployment file if requested
|
||||
if include_deployment_file:
|
||||
_create_deployment_file(target_dir, stack_source=stack_source)
|
||||
_create_deployment_file(target_dir)
|
||||
|
||||
# Copy any config variables from the spec file into an env file suitable for compose
|
||||
# Use stack_name as deployment_name for K8s secret naming
|
||||
# Extract just the name part if stack_name is a path ("path/to/stack" -> "stack")
|
||||
deployment_name = Path(stack_name).name.replace("_", "-")
|
||||
_write_config_file(
|
||||
spec_file, target_dir.joinpath(constants.config_file_name), deployment_name
|
||||
)
|
||||
_write_config_file(spec_file, target_dir.joinpath(constants.config_file_name))
|
||||
|
||||
# Copy any k8s config file into the target dir
|
||||
if deployment_type == "k8s":
|
||||
@ -1004,11 +790,20 @@ def _write_deployment_files(
|
||||
script_paths = get_pod_script_paths(parsed_stack, pod)
|
||||
_copy_files_to_directory(script_paths, destination_script_dir)
|
||||
|
||||
if not parsed_spec.is_kubernetes_deployment():
|
||||
if parsed_spec.is_kubernetes_deployment():
|
||||
for configmap in parsed_spec.get_configmaps():
|
||||
source_config_dir = resolve_config_dir(stack_name, configmap)
|
||||
if os.path.exists(source_config_dir):
|
||||
destination_config_dir = target_dir.joinpath(
|
||||
"configmaps", configmap
|
||||
)
|
||||
copytree(
|
||||
source_config_dir, destination_config_dir, dirs_exist_ok=True
|
||||
)
|
||||
else:
|
||||
# TODO:
|
||||
# This is odd - looks up config dir that matches a volume name,
|
||||
# then copies as a mount dir?
|
||||
# AFAICT not used by or relevant to any existing stack - roy
|
||||
# this is odd - looks up config dir that matches a volume name, then copies as a mount dir?
|
||||
# AFAICT this is not used by or relevant to any existing stack - roy
|
||||
|
||||
# TODO: We should probably only do this if the volume is marked :ro.
|
||||
for volume_name, volume_path in parsed_spec.get_volumes().items():
|
||||
@ -1026,22 +821,9 @@ def _write_deployment_files(
|
||||
dirs_exist_ok=True,
|
||||
)
|
||||
|
||||
# Copy configmap directories for k8s deployments (outside the pod loop
|
||||
# so this works for jobs-only stacks too)
|
||||
if parsed_spec.is_kubernetes_deployment():
|
||||
for configmap in parsed_spec.get_configmaps():
|
||||
source_config_dir = resolve_config_dir(stack_name, configmap)
|
||||
if os.path.exists(source_config_dir):
|
||||
destination_config_dir = target_dir.joinpath(
|
||||
"configmaps", configmap
|
||||
)
|
||||
copytree(
|
||||
source_config_dir, destination_config_dir, dirs_exist_ok=True
|
||||
)
|
||||
|
||||
# Copy the job files into the target dir
|
||||
# Copy the job files into the target dir (for Docker deployments)
|
||||
jobs = get_job_list(parsed_stack)
|
||||
if jobs:
|
||||
if jobs and not parsed_spec.is_kubernetes_deployment():
|
||||
destination_compose_jobs_dir = target_dir.joinpath("compose-jobs")
|
||||
os.makedirs(destination_compose_jobs_dir, exist_ok=True)
|
||||
for job in jobs:
|
||||
|
||||
@ -1,159 +0,0 @@
|
||||
# Copyright © 2024 Vulcanize
# SPDX-License-Identifier: AGPL-3.0

"""DNS verification via temporary ingress probe."""

import secrets
import socket
import time
from typing import Optional
import requests
from kubernetes import client


def get_server_egress_ip() -> str:
    """Get this server's public egress IP via ipify."""
    response = requests.get("https://api.ipify.org", timeout=10)
    response.raise_for_status()
    return response.text.strip()


def resolve_hostname(hostname: str) -> list[str]:
    """Resolve hostname to list of IP addresses."""
    try:
        _, _, ips = socket.gethostbyname_ex(hostname)
        return ips
    except socket.gaierror:
        return []


def verify_dns_simple(hostname: str, expected_ip: Optional[str] = None) -> bool:
    """Simple DNS verification - check hostname resolves to expected IP.

    If expected_ip not provided, uses server's egress IP.
    Returns True if hostname resolves to expected IP.
    """
    resolved_ips = resolve_hostname(hostname)
    if not resolved_ips:
        print(f"DNS FAIL: {hostname} does not resolve")
        return False

    if expected_ip is None:
        expected_ip = get_server_egress_ip()

    if expected_ip in resolved_ips:
        print(f"DNS OK: {hostname} -> {resolved_ips} (includes {expected_ip})")
        return True
    else:
        print(f"DNS WARN: {hostname} -> {resolved_ips} (expected {expected_ip})")
        return False
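The membership check at the heart of `verify_dns_simple` can be tested offline if the resolver is passed in as a parameter. This is a sketch under that assumption; the `resolver` parameter, `dns_points_at`, and the fake lookup table are illustrative additions, not part of the module above.

```python
import socket
from typing import Callable, List


def resolve_hostname(hostname: str) -> List[str]:
    """Resolve hostname to a list of IPv4 addresses (empty on failure)."""
    try:
        _, _, ips = socket.gethostbyname_ex(hostname)
        return ips
    except socket.gaierror:
        return []


def dns_points_at(
    hostname: str,
    expected_ip: str,
    resolver: Callable[[str], List[str]] = resolve_hostname,
) -> bool:
    """True if expected_ip is among the addresses the resolver returns."""
    return expected_ip in resolver(hostname)


# Fake resolver so the example runs without network access.
# 203.0.113.0/24 is a documentation-only address range.
def fake_resolver(name: str) -> List[str]:
    return {"app.example.com": ["203.0.113.7"]}.get(name, [])


ok = dns_points_at("app.example.com", "203.0.113.7", fake_resolver)
wrong_ip = dns_points_at("app.example.com", "198.51.100.1", fake_resolver)
unresolved = dns_points_at("missing.example.com", "203.0.113.7", fake_resolver)
```

A match here only shows that the record points at the expected address; it does not prove that traffic reaches the cluster, which is the gap the probe-based check is meant to close.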
|
||||
|
||||
|
||||
def create_probe_ingress(hostname: str, namespace: str = "default") -> str:
    """Create a temporary ingress for DNS probing.

    Returns the probe token that the ingress will respond with.
    """
    token = secrets.token_hex(16)

    networking_api = client.NetworkingV1Api()

    # Create a simple ingress that Caddy will pick up
    ingress = client.V1Ingress(
        metadata=client.V1ObjectMeta(
            name="laconic-dns-probe",
            annotations={
                "kubernetes.io/ingress.class": "caddy",
                "laconic.com/probe-token": token,
            },
        ),
        spec=client.V1IngressSpec(
            rules=[
                client.V1IngressRule(
                    host=hostname,
                    http=client.V1HTTPIngressRuleValue(
                        paths=[
                            client.V1HTTPIngressPath(
                                path="/.well-known/laconic-probe",
                                path_type="Exact",
                                backend=client.V1IngressBackend(
                                    service=client.V1IngressServiceBackend(
                                        name="caddy-ingress-controller",
                                        port=client.V1ServiceBackendPort(number=80),
                                    )
                                ),
                            )
                        ]
                    ),
                )
            ]
        ),
    )

    networking_api.create_namespaced_ingress(namespace=namespace, body=ingress)
    return token


def delete_probe_ingress(namespace: str = "default"):
    """Delete the temporary probe ingress."""
    networking_api = client.NetworkingV1Api()
    try:
        networking_api.delete_namespaced_ingress(
            name="laconic-dns-probe", namespace=namespace
        )
    except client.exceptions.ApiException:
        pass  # Ignore if already deleted
|
||||
|
||||
|
||||
def verify_dns_via_probe(
    hostname: str, namespace: str = "default", timeout: int = 30, poll_interval: int = 2
) -> bool:
    """Verify DNS by creating temp ingress and probing it.

    This definitively proves that traffic to the hostname reaches this cluster.

    Args:
        hostname: The hostname to verify
        namespace: Kubernetes namespace for probe ingress
        timeout: Total seconds to wait for probe to succeed
        poll_interval: Seconds between probe attempts

    Returns:
        True if probe succeeds, False otherwise
    """
    # First check DNS resolves at all
    if not resolve_hostname(hostname):
        print(f"DNS FAIL: {hostname} does not resolve")
        return False

    print(f"Creating probe ingress for {hostname}...")
    create_probe_ingress(hostname, namespace)

    try:
        # Wait for Caddy to pick up the ingress
        time.sleep(3)

        # Poll until success or timeout
        probe_url = f"http://{hostname}/.well-known/laconic-probe"
        start_time = time.time()
        last_error = None

        while time.time() - start_time < timeout:
            try:
                response = requests.get(probe_url, timeout=5)
                # For now, just verify we get a response from this cluster
                # A more robust check would verify a unique token
                if response.status_code < 500:
                    print(f"DNS PROBE OK: {hostname} routes to this cluster")
                    return True
            except requests.RequestException as e:
                last_error = e

            time.sleep(poll_interval)

        print(f"DNS PROBE FAIL: {hostname} - {last_error}")
        return False

    finally:
        print("Cleaning up probe ingress...")
        delete_probe_ingress(namespace)
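The poll-until-timeout loop in `verify_dns_via_probe` can be factored into a small generic helper. The sketch below is one possible shape, not the module's code: `poll_until` is a hypothetical name, and it uses `time.monotonic()` rather than `time.time()` so that wall-clock adjustments cannot stretch or shorten the deadline.

```python
import time
from typing import Callable


def poll_until(check: Callable[[], bool], timeout: float, poll_interval: float) -> bool:
    """Call check() repeatedly until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(poll_interval)
    return False


attempts = []


def flaky_check() -> bool:
    # Stand-in for an HTTP probe that starts succeeding on the third try.
    attempts.append(1)
    return len(attempts) >= 3


succeeded = poll_until(flaky_check, timeout=2.0, poll_interval=0.01)
never = poll_until(lambda: False, timeout=0.05, poll_interval=0.01)
```

Keeping the cleanup in a `finally` block around such a loop, as the original function does, ensures the probe ingress is deleted on both the success and failure paths.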
|
||||
@ -31,7 +31,6 @@ from stack_orchestrator.deploy.k8s.helpers import (
|
||||
envs_from_environment_variables_map,
|
||||
envs_from_compose_file,
|
||||
merge_envs,
|
||||
translate_sidecar_service_names,
|
||||
)
|
||||
from stack_orchestrator.deploy.deploy_util import (
|
||||
parsed_pod_files_map_from_file_names,
|
||||
@ -72,17 +71,15 @@ def to_k8s_resource_requirements(resources: Resources) -> client.V1ResourceRequi
|
||||
|
||||
class ClusterInfo:
|
||||
parsed_pod_yaml_map: Any
|
||||
parsed_job_yaml_map: Any
|
||||
image_set: Set[str] = set()
|
||||
app_name: str
|
||||
stack_name: str
|
||||
environment_variables: DeployEnvVars
|
||||
spec: Spec
|
||||
|
||||
def __init__(self) -> None:
|
||||
self.parsed_job_yaml_map = {}
|
||||
pass
|
||||
|
||||
def int(self, pod_files: List[str], compose_env_file, deployment_name, spec: Spec, stack_name=""):
|
||||
def int(self, pod_files: List[str], compose_env_file, deployment_name, spec: Spec):
|
||||
self.parsed_pod_yaml_map = parsed_pod_files_map_from_file_names(pod_files)
|
||||
# Find the set of images in the pods
|
||||
self.image_set = images_for_deployment(pod_files)
|
||||
@ -92,23 +89,10 @@ class ClusterInfo:
|
||||
}
|
||||
self.environment_variables = DeployEnvVars(env_vars)
|
||||
self.app_name = deployment_name
|
||||
self.stack_name = stack_name
|
||||
self.spec = spec
|
||||
if opts.o.debug:
|
||||
print(f"Env vars: {self.environment_variables.map}")
|
||||
|
||||
def init_jobs(self, job_files: List[str]):
|
||||
"""Initialize parsed job YAML map from job compose files."""
|
||||
self.parsed_job_yaml_map = parsed_pod_files_map_from_file_names(job_files)
|
||||
if opts.o.debug:
|
||||
print(f"Parsed job yaml map: {self.parsed_job_yaml_map}")
|
||||
|
||||
def _all_named_volumes(self) -> list:
|
||||
"""Return named volumes from both pod and job compose files."""
|
||||
volumes = named_volumes_from_pod_files(self.parsed_pod_yaml_map)
|
||||
volumes.extend(named_volumes_from_pod_files(self.parsed_job_yaml_map))
|
||||
return volumes
|
||||
|
||||
def get_nodeports(self):
|
||||
nodeports = []
|
||||
for pod_name in self.parsed_pod_yaml_map:
|
||||
@ -141,8 +125,7 @@ class ClusterInfo:
|
||||
name=(
|
||||
f"{self.app_name}-nodeport-"
|
||||
f"{pod_port}-{protocol.lower()}"
|
||||
),
|
||||
labels={"app": self.app_name},
|
||||
)
|
||||
),
|
||||
spec=client.V1ServiceSpec(
|
||||
type="NodePort",
|
||||
@ -225,9 +208,7 @@ class ClusterInfo:
|
||||
|
||||
ingress = client.V1Ingress(
|
||||
metadata=client.V1ObjectMeta(
|
||||
name=f"{self.app_name}-ingress",
|
||||
labels={"app": self.app_name},
|
||||
annotations=ingress_annotations,
|
||||
name=f"{self.app_name}-ingress", annotations=ingress_annotations
|
||||
),
|
||||
spec=spec,
|
||||
)
|
||||
@ -257,10 +238,7 @@ class ClusterInfo:
|
||||
]
|
||||
|
||||
service = client.V1Service(
|
||||
metadata=client.V1ObjectMeta(
|
||||
name=f"{self.app_name}-service",
|
||||
labels={"app": self.app_name},
|
||||
),
|
||||
metadata=client.V1ObjectMeta(name=f"{self.app_name}-service"),
|
||||
spec=client.V1ServiceSpec(
|
||||
type="ClusterIP",
|
||||
ports=service_ports,
|
||||
@ -272,26 +250,20 @@ class ClusterInfo:
|
||||
def get_pvcs(self):
|
||||
result = []
|
||||
spec_volumes = self.spec.get_volumes()
|
||||
named_volumes = self._all_named_volumes()
|
||||
global_resources = self.spec.get_volume_resources()
|
||||
if not global_resources:
|
||||
global_resources = DEFAULT_VOLUME_RESOURCES
|
||||
named_volumes = named_volumes_from_pod_files(self.parsed_pod_yaml_map)
|
||||
resources = self.spec.get_volume_resources()
|
||||
if not resources:
|
||||
resources = DEFAULT_VOLUME_RESOURCES
|
||||
if opts.o.debug:
|
||||
print(f"Spec Volumes: {spec_volumes}")
|
||||
print(f"Named Volumes: {named_volumes}")
|
||||
print(f"Resources: {global_resources}")
|
||||
print(f"Resources: {resources}")
|
||||
for volume_name, volume_path in spec_volumes.items():
|
||||
if volume_name not in named_volumes:
|
||||
if opts.o.debug:
|
||||
print(f"{volume_name} not in pod files")
|
||||
continue
|
||||
|
||||
# Per-volume resources override global, which overrides default.
|
||||
vol_resources = (
|
||||
self.spec.get_volume_resources_for(volume_name)
|
||||
or global_resources
|
||||
)
|
||||
|
||||
labels = {
|
||||
"app": self.app_name,
|
||||
"volume-label": f"{self.app_name}-{volume_name}",
|
||||
@ -307,7 +279,7 @@ class ClusterInfo:
|
||||
spec = client.V1PersistentVolumeClaimSpec(
|
||||
access_modes=["ReadWriteOnce"],
|
||||
storage_class_name=storage_class_name,
|
||||
resources=to_k8s_resource_requirements(vol_resources),
|
||||
resources=to_k8s_resource_requirements(resources),
|
||||
volume_name=k8s_volume_name,
|
||||
)
|
||||
pvc = client.V1PersistentVolumeClaim(
|
||||
@ -322,7 +294,7 @@ class ClusterInfo:
|
||||
def get_configmaps(self):
|
||||
result = []
|
||||
spec_configmaps = self.spec.get_configmaps()
|
||||
named_volumes = self._all_named_volumes()
|
||||
named_volumes = named_volumes_from_pod_files(self.parsed_pod_yaml_map)
|
||||
for cfg_map_name, cfg_map_path in spec_configmaps.items():
|
||||
if cfg_map_name not in named_volumes:
|
||||
if opts.o.debug:
|
||||
@ -348,7 +320,7 @@ class ClusterInfo:
|
||||
spec = client.V1ConfigMap(
|
||||
metadata=client.V1ObjectMeta(
|
||||
name=f"{self.app_name}-{cfg_map_name}",
|
||||
labels={"app": self.app_name, "configmap-label": cfg_map_name},
|
||||
labels={"configmap-label": cfg_map_name},
|
||||
),
|
||||
binary_data=data,
|
||||
)
|
||||
@ -358,10 +330,10 @@ class ClusterInfo:
|
||||
def get_pvs(self):
|
||||
result = []
|
||||
spec_volumes = self.spec.get_volumes()
|
||||
named_volumes = self._all_named_volumes()
|
||||
global_resources = self.spec.get_volume_resources()
|
||||
if not global_resources:
|
||||
global_resources = DEFAULT_VOLUME_RESOURCES
|
||||
named_volumes = named_volumes_from_pod_files(self.parsed_pod_yaml_map)
|
||||
resources = self.spec.get_volume_resources()
|
||||
if not resources:
|
||||
resources = DEFAULT_VOLUME_RESOURCES
|
||||
for volume_name, volume_path in spec_volumes.items():
|
||||
# We only need to create a volume if it is fully qualified HostPath.
|
||||
# Otherwise, we create the PVC and expect the node to allocate the volume
|
||||
@ -380,20 +352,12 @@ class ClusterInfo:
|
||||
continue
|
||||
|
||||
if not os.path.isabs(volume_path):
|
||||
# For k8s-kind, allow relative paths:
|
||||
# - PV uses /mnt/{volume_name} (path inside kind node)
|
||||
# - extraMounts resolve the relative path to Docker Host
|
||||
if not self.spec.is_kind_deployment():
|
||||
print(
|
||||
f"WARNING: {volume_name}:{volume_path} is not absolute, "
|
||||
"cannot bind volume."
|
||||
)
|
||||
continue
|
||||
print(
|
||||
f"WARNING: {volume_name}:{volume_path} is not absolute, "
|
||||
"cannot bind volume."
|
||||
)
|
||||
continue
|
||||
|
||||
vol_resources = (
|
||||
self.spec.get_volume_resources_for(volume_name)
|
||||
or global_resources
|
||||
)
|
||||
if self.spec.is_kind_deployment():
|
||||
host_path = client.V1HostPathVolumeSource(
|
||||
path=get_kind_pv_bind_mount_path(volume_name)
|
||||
@ -403,75 +367,28 @@ class ClusterInfo:
|
||||
spec = client.V1PersistentVolumeSpec(
|
||||
storage_class_name="manual",
|
||||
access_modes=["ReadWriteOnce"],
|
||||
capacity=to_k8s_resource_requirements(vol_resources).requests,
|
||||
capacity=to_k8s_resource_requirements(resources).requests,
|
||||
host_path=host_path,
|
||||
)
|
||||
pv = client.V1PersistentVolume(
|
||||
metadata=client.V1ObjectMeta(
|
||||
name=f"{self.app_name}-{volume_name}",
|
||||
labels={
|
||||
"app": self.app_name,
|
||||
"volume-label": f"{self.app_name}-{volume_name}",
|
||||
},
|
||||
labels={"volume-label": f"{self.app_name}-{volume_name}"},
|
||||
),
|
||||
spec=spec,
|
||||
)
|
||||
result.append(pv)
|
||||
return result
|
||||
|
||||
def _any_service_has_host_network(self):
|
||||
# TODO: put things like image pull policy into an object-scope struct
|
||||
def get_deployment(self, image_pull_policy: Optional[str] = None):
|
||||
containers = []
|
||||
services = {}
|
||||
resources = self.spec.get_container_resources()
|
||||
if not resources:
|
||||
resources = DEFAULT_CONTAINER_RESOURCES
|
||||
for pod_name in self.parsed_pod_yaml_map:
|
||||
pod = self.parsed_pod_yaml_map[pod_name]
|
||||
for svc in pod.get("services", {}).values():
|
||||
if svc.get("network_mode") == "host":
|
||||
return True
|
||||
return False
|
||||
|
||||
    def _resolve_container_resources(
        self, container_name: str, service_info: dict, global_resources: Resources
    ) -> Resources:
        """Resolve resources for a container using layered priority.

        Priority: spec per-container > compose deploy.resources
                  > spec global > DEFAULT
        """
        # 1. Check spec.yml for per-container override
        per_container = self.spec.get_container_resources_for(container_name)
        if per_container:
            return per_container

        # 2. Check compose service_info for deploy.resources
        deploy_block = service_info.get("deploy", {})
        compose_resources = deploy_block.get("resources", {}) if deploy_block else {}
        if compose_resources:
            return Resources(compose_resources)

        # 3. Fall back to spec.yml global (already resolved with DEFAULT fallback)
        return global_resources
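The layered priority described in the docstring (spec per-container, then compose `deploy.resources`, then spec global, then default) can be shown with plain dictionaries. This is a simplified sketch: `resolve_resources`, `DEFAULT_RESOURCES`, and the sample values are hypothetical, and the real code wraps results in the project's `Resources` type instead of returning raw dicts.

```python
from typing import Optional

# Hypothetical default; the project's DEFAULT_CONTAINER_RESOURCES may differ.
DEFAULT_RESOURCES = {"limits": {"cpus": "1", "memory": "1G"}}


def resolve_resources(
    per_container: Optional[dict],
    service_info: dict,
    global_resources: Optional[dict],
) -> dict:
    """Return the first non-empty layer in priority order."""
    if per_container:
        return per_container
    # "deploy" may be missing or explicitly null in a compose file.
    deploy_block = service_info.get("deploy") or {}
    compose_resources = deploy_block.get("resources") or {}
    if compose_resources:
        return compose_resources
    return global_resources or DEFAULT_RESOURCES


compose_service = {"deploy": {"resources": {"limits": {"memory": "2G"}}}}
spec_override = {"limits": {"memory": "4G"}}
spec_global = {"limits": {"memory": "512M"}}

from_spec = resolve_resources(spec_override, compose_service, spec_global)
from_compose = resolve_resources(None, compose_service, spec_global)
from_global = resolve_resources(None, {"deploy": None}, spec_global)
from_default = resolve_resources(None, {}, None)
```

Each layer replaces the lower ones wholesale; nothing in this scheme merges individual fields across layers.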
|
||||
|
||||
def _build_containers(
|
||||
self,
|
||||
parsed_yaml_map: Any,
|
||||
image_pull_policy: Optional[str] = None,
|
||||
) -> tuple:
|
||||
"""Build k8s container specs from parsed compose YAML.
|
||||
|
||||
Returns a tuple of (containers, init_containers, services, volumes)
|
||||
where:
|
||||
- containers: list of V1Container objects
|
||||
- init_containers: list of V1Container objects for init containers
|
||||
(compose services with label ``laconic.init-container: "true"``)
|
||||
- services: the last services dict processed (used for annotations/labels)
|
||||
- volumes: list of V1Volume objects
|
||||
"""
|
||||
containers = []
|
||||
init_containers = []
|
||||
services = {}
|
||||
global_resources = self.spec.get_container_resources()
|
||||
if not global_resources:
|
||||
global_resources = DEFAULT_CONTAINER_RESOURCES
|
||||
for pod_name in parsed_yaml_map:
|
||||
pod = parsed_yaml_map[pod_name]
|
||||
services = pod["services"]
|
||||
for service_name in services:
|
||||
container_name = service_name
|
||||
@ -509,12 +426,6 @@ class ClusterInfo:
|
||||
if "environment" in service_info
|
||||
else self.environment_variables.map
|
||||
)
|
||||
# Translate docker-compose service names to localhost for sidecars
|
||||
# All services in the same pod share the network namespace
|
||||
sibling_services = [s for s in services.keys() if s != service_name]
|
||||
merged_envs = translate_sidecar_service_names(
|
||||
merged_envs, sibling_services
|
||||
)
|
||||
envs = envs_from_environment_variables_map(merged_envs)
|
||||
if opts.o.debug:
|
||||
print(f"Merged envs: {envs}")
|
||||
@ -528,7 +439,7 @@ class ClusterInfo:
|
||||
else image
|
||||
)
|
||||
volume_mounts = volume_mounts_for_service(
|
||||
parsed_yaml_map, service_name
|
||||
self.parsed_pod_yaml_map, service_name
|
||||
)
|
||||
# Handle command/entrypoint from compose file
|
||||
# In docker-compose: entrypoint -> k8s command, command -> k8s args
|
||||
@ -542,29 +453,6 @@ class ClusterInfo:
|
||||
if "command" in service_info:
|
||||
cmd = service_info["command"]
|
||||
container_args = cmd if isinstance(cmd, list) else cmd.split()
|
||||
# Add env_from to pull secrets from K8s Secret
|
||||
secret_name = f"{self.app_name}-generated-secrets"
|
||||
env_from = [
|
||||
client.V1EnvFromSource(
|
||||
secret_ref=client.V1SecretEnvSource(
|
||||
name=secret_name,
|
||||
optional=True, # Don't fail if no secrets
|
||||
)
|
||||
)
|
||||
]
|
||||
# Mount user-declared secrets from spec.yml
|
||||
for user_secret_name in self.spec.get_secrets():
|
||||
env_from.append(
|
||||
client.V1EnvFromSource(
|
||||
secret_ref=client.V1SecretEnvSource(
|
||||
name=user_secret_name,
|
||||
optional=True,
|
||||
)
|
||||
)
|
||||
)
|
||||
container_resources = self._resolve_container_resources(
|
||||
container_name, service_info, global_resources
|
||||
)
|
||||
container = client.V1Container(
|
||||
name=container_name,
|
||||
image=image_to_use,
|
||||
@ -572,56 +460,26 @@ class ClusterInfo:
|
||||
command=container_command,
|
||||
args=container_args,
|
||||
env=envs,
|
||||
env_from=env_from,
|
||||
ports=container_ports if container_ports else None,
|
||||
volume_mounts=volume_mounts,
|
||||
security_context=client.V1SecurityContext(
|
||||
privileged=self.spec.get_privileged(),
|
||||
run_as_user=int(service_info["user"]) if "user" in service_info else None,
|
||||
capabilities=client.V1Capabilities(
|
||||
add=self.spec.get_capabilities()
|
||||
)
|
||||
if self.spec.get_capabilities()
|
||||
else None,
|
||||
),
|
||||
resources=to_k8s_resource_requirements(container_resources),
|
||||
resources=to_k8s_resource_requirements(resources),
|
||||
)
|
||||
# Services with laconic.init-container label become
|
||||
# k8s init containers instead of regular containers.
|
||||
svc_labels = service_info.get("labels", {})
|
||||
if isinstance(svc_labels, list):
|
||||
# docker-compose labels can be a list of "key=value"
|
||||
svc_labels = dict(
|
||||
item.split("=", 1) for item in svc_labels
|
||||
)
|
||||
is_init = str(
|
||||
svc_labels.get("laconic.init-container", "")
|
||||
).lower() in ("true", "1", "yes")
|
||||
if is_init:
|
||||
init_containers.append(container)
|
||||
else:
|
||||
containers.append(container)
|
||||
containers.append(container)
|
||||
volumes = volumes_for_pod_files(
|
||||
parsed_yaml_map, self.spec, self.app_name
|
||||
self.parsed_pod_yaml_map, self.spec, self.app_name
|
||||
)
|
||||
return containers, init_containers, services, volumes
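The init-container detection inside `_build_containers` can be read as a small standalone predicate. The sketch below mirrors the logic shown in the diff; the function name `is_init_container` is hypothetical, and it assumes every list-form label contains an `=` (an entry without one would make `dict()` raise `ValueError`).

```python
def is_init_container(service_info: dict) -> bool:
    """Return True when a compose service is labelled as an init container."""
    labels = service_info.get("labels", {})
    if isinstance(labels, list):
        # docker-compose also allows labels as a list of "key=value" strings.
        # Split on the first "=" only, so values may themselves contain "=".
        labels = dict(item.split("=", 1) for item in labels)
    # str() handles YAML booleans (True) as well as quoted strings ("true").
    value = str(labels.get("laconic.init-container", "")).lower()
    return value in ("true", "1", "yes")
```

Coercing through `str(...).lower()` is what lets both `laconic.init-container: "true"` and an unquoted YAML `true` select the init-container path.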
|
||||
|
||||
# TODO: put things like image pull policy into an object-scope struct
|
||||
def get_deployment(self, image_pull_policy: Optional[str] = None):
|
||||
containers, init_containers, services, volumes = self._build_containers(
|
||||
self.parsed_pod_yaml_map, image_pull_policy
|
||||
)
|
||||
registry_config = self.spec.get_image_registry_config()
|
||||
if registry_config:
|
||||
secret_name = f"{self.app_name}-registry"
|
||||
image_pull_secrets = [client.V1LocalObjectReference(name=secret_name)]
|
||||
else:
|
||||
image_pull_secrets = []
|
||||
image_pull_secrets = [client.V1LocalObjectReference(name="laconic-registry")]
|
||||
|
||||
annotations = None
|
||||
labels = {"app": self.app_name}
|
||||
if self.stack_name:
|
||||
labels["app.kubernetes.io/stack"] = self.stack_name
|
||||
affinity = None
|
||||
tolerations = None
|
||||
|
||||
@ -674,19 +532,15 @@ class ClusterInfo:
|
||||
)
|
||||
)
|
||||
|
||||
use_host_network = self._any_service_has_host_network()
|
||||
template = client.V1PodTemplateSpec(
|
||||
metadata=client.V1ObjectMeta(annotations=annotations, labels=labels),
|
||||
spec=client.V1PodSpec(
|
||||
containers=containers,
|
||||
init_containers=init_containers or None,
|
||||
image_pull_secrets=image_pull_secrets,
|
||||
volumes=volumes,
|
||||
affinity=affinity,
|
||||
tolerations=tolerations,
|
||||
runtime_class_name=self.spec.get_runtime_class(),
|
||||
host_network=use_host_network or None,
|
||||
dns_policy=("ClusterFirstWithHostNet" if use_host_network else None),
|
||||
),
|
||||
)
|
||||
spec = client.V1DeploymentSpec(
|
||||
@ -698,83 +552,7 @@ class ClusterInfo:
|
||||
deployment = client.V1Deployment(
|
||||
api_version="apps/v1",
|
||||
kind="Deployment",
|
||||
metadata=client.V1ObjectMeta(
|
||||
name=f"{self.app_name}-deployment",
|
||||
labels={"app": self.app_name, **({"app.kubernetes.io/stack": self.stack_name} if self.stack_name else {})},
|
||||
),
|
||||
metadata=client.V1ObjectMeta(name=f"{self.app_name}-deployment"),
|
||||
spec=spec,
|
||||
)
|
||||
return deployment
|
||||
|
||||
def get_jobs(self, image_pull_policy: Optional[str] = None) -> List[client.V1Job]:
|
||||
"""Build k8s Job objects from parsed job compose files.
|
||||
|
||||
Each job compose file produces a V1Job with:
|
||||
- restartPolicy: Never
|
||||
- backoffLimit: 0
|
||||
- Name: {app_name}-job-{job_name}
|
||||
"""
|
||||
if not self.parsed_job_yaml_map:
|
||||
return []
|
||||
|
||||
jobs = []
|
||||
registry_config = self.spec.get_image_registry_config()
|
||||
if registry_config:
|
||||
secret_name = f"{self.app_name}-registry"
|
||||
image_pull_secrets = [client.V1LocalObjectReference(name=secret_name)]
|
||||
else:
|
||||
image_pull_secrets = []
|
||||
|
||||
for job_file in self.parsed_job_yaml_map:
|
||||
# Build containers for this single job file
|
||||
single_job_map = {job_file: self.parsed_job_yaml_map[job_file]}
|
||||
containers, init_containers, _services, volumes = (
|
||||
self._build_containers(single_job_map, image_pull_policy)
|
||||
)
|
||||
|
||||
# Derive job name from file path: docker-compose-<name>.yml -> <name>
|
||||
base = os.path.basename(job_file)
|
||||
# Strip docker-compose- prefix and .yml suffix
|
||||
job_name = base
|
||||
if job_name.startswith("docker-compose-"):
|
||||
job_name = job_name[len("docker-compose-"):]
|
||||
if job_name.endswith(".yml"):
|
||||
job_name = job_name[: -len(".yml")]
|
||||
elif job_name.endswith(".yaml"):
|
||||
job_name = job_name[: -len(".yaml")]
|
||||
|
||||
# Use a distinct app label for job pods so they don't get
|
||||
# picked up by pods_in_deployment() which queries app={app_name}.
|
||||
pod_labels = {
|
||||
"app": f"{self.app_name}-job",
|
||||
**({"app.kubernetes.io/stack": self.stack_name} if self.stack_name else {}),
|
||||
}
|
||||
template = client.V1PodTemplateSpec(
|
||||
metadata=client.V1ObjectMeta(
|
||||
labels=pod_labels
|
||||
),
|
||||
spec=client.V1PodSpec(
|
||||
containers=containers,
|
||||
init_containers=init_containers or None,
|
||||
image_pull_secrets=image_pull_secrets,
|
||||
volumes=volumes,
|
||||
restart_policy="Never",
|
||||
),
|
||||
)
|
||||
job_spec = client.V1JobSpec(
|
||||
template=template,
|
||||
backoff_limit=0,
|
||||
)
|
||||
job_labels = {"app": self.app_name, **({"app.kubernetes.io/stack": self.stack_name} if self.stack_name else {})}
|
||||
job = client.V1Job(
|
||||
api_version="batch/v1",
|
||||
kind="Job",
|
||||
metadata=client.V1ObjectMeta(
|
||||
name=f"{self.app_name}-job-{job_name}",
|
||||
labels=job_labels,
|
||||
),
|
||||
spec=job_spec,
|
||||
)
|
||||
jobs.append(job)
|
||||
|
||||
return jobs
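The job-name derivation in `get_jobs` (compose file path to `<name>`) can be sketched as a pure function. `job_name_from_path` is a hypothetical helper name; the prefix and suffix handling follows the lines in the diff.

```python
import os


def job_name_from_path(job_file: str) -> str:
    """Map .../docker-compose-<name>.yml (or .yaml) to <name>."""
    name = os.path.basename(job_file)
    prefix = "docker-compose-"
    if name.startswith(prefix):
        name = name[len(prefix):]
    # Strip exactly one recognised extension.
    if name.endswith(".yml"):
        name = name[: -len(".yml")]
    elif name.endswith(".yaml"):
        name = name[: -len(".yaml")]
    return name


def k8s_job_name(app_name: str, job_file: str) -> str:
    """Compose the Job object name in the {app_name}-job-{job_name} form."""
    return f"{app_name}-job-{job_name_from_path(job_file)}"
```

A file that lacks the `docker-compose-` prefix keeps its base name minus the extension, so the resulting Job name is still well defined.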
|
||||
|
||||
@ -29,7 +29,6 @@ from stack_orchestrator.deploy.k8s.helpers import (
|
||||
from stack_orchestrator.deploy.k8s.helpers import (
|
||||
install_ingress_for_kind,
|
||||
wait_for_ingress_in_kind,
|
||||
is_ingress_running,
|
||||
)
|
||||
from stack_orchestrator.deploy.k8s.helpers import (
|
||||
pods_in_deployment,
|
||||
@ -95,9 +94,8 @@ class K8sDeployer(Deployer):
|
||||
type: str
|
||||
core_api: client.CoreV1Api
|
||||
apps_api: client.AppsV1Api
|
||||
batch_api: client.BatchV1Api
|
||||
networking_api: client.NetworkingV1Api
|
||||
k8s_namespace: str
|
||||
k8s_namespace: str = "default"
|
||||
kind_cluster_name: str
|
||||
skip_cluster_management: bool
|
||||
cluster_info: ClusterInfo
|
||||
@ -111,39 +109,26 @@ class K8sDeployer(Deployer):
|
||||
compose_files,
|
||||
compose_project_name,
|
||||
compose_env_file,
|
||||
job_compose_files=None,
|
||||
) -> None:
|
||||
self.type = type
|
||||
self.skip_cluster_management = False
|
||||
self.k8s_namespace = "default" # Will be overridden below if context exists
|
||||
# TODO: workaround pending refactoring above to cope with being
|
||||
# created with a null deployment_context
|
||||
if deployment_context is None:
|
||||
return
|
||||
self.deployment_dir = deployment_context.deployment_dir
|
||||
self.deployment_context = deployment_context
|
||||
self.kind_cluster_name = deployment_context.spec.get_kind_cluster_name() or compose_project_name
|
||||
# Use spec namespace if provided, otherwise derive from cluster-id
|
||||
self.k8s_namespace = deployment_context.spec.get_namespace() or f"laconic-{compose_project_name}"
|
||||
self.kind_cluster_name = compose_project_name
|
||||
self.cluster_info = ClusterInfo()
|
||||
# stack.name may be an absolute path (from spec "stack:" key after
|
||||
# path resolution). Extract just the directory basename for labels.
|
||||
raw_name = deployment_context.stack.name if deployment_context else ""
|
||||
stack_name = Path(raw_name).name if raw_name else ""
|
||||
self.cluster_info.int(
|
||||
compose_files,
|
||||
compose_env_file,
|
||||
compose_project_name,
|
||||
deployment_context.spec,
|
||||
stack_name=stack_name,
|
||||
)
|
||||
# Initialize job compose files if provided
|
||||
if job_compose_files:
|
||||
self.cluster_info.init_jobs(job_compose_files)
|
||||
if opts.o.debug:
|
||||
print(f"Deployment dir: {deployment_context.deployment_dir}")
|
||||
print(f"Compose files: {compose_files}")
|
||||
print(f"Job compose files: {job_compose_files}")
|
||||
print(f"Project name: {compose_project_name}")
|
||||
print(f"Env file: {compose_env_file}")
|
||||
print(f"Type: {type}")
|
||||
@ -161,136 +146,8 @@ class K8sDeployer(Deployer):
|
||||
self.core_api = client.CoreV1Api()
|
||||
self.networking_api = client.NetworkingV1Api()
|
||||
self.apps_api = client.AppsV1Api()
|
||||
self.batch_api = client.BatchV1Api()
|
||||
self.custom_obj_api = client.CustomObjectsApi()
|
||||
|
||||
def _ensure_namespace(self):
|
||||
"""Create the deployment namespace if it doesn't exist."""
|
||||
if opts.o.dry_run:
|
||||
print(f"Dry run: would create namespace {self.k8s_namespace}")
|
||||
return
|
||||
try:
|
||||
self.core_api.read_namespace(name=self.k8s_namespace)
|
||||
if opts.o.debug:
|
||||
print(f"Namespace {self.k8s_namespace} already exists")
|
||||
except ApiException as e:
|
||||
if e.status == 404:
|
||||
# Create the namespace
|
||||
ns = client.V1Namespace(
|
||||
metadata=client.V1ObjectMeta(
|
||||
name=self.k8s_namespace,
|
||||
labels={"app": self.cluster_info.app_name},
|
||||
)
|
||||
)
|
||||
self.core_api.create_namespace(body=ns)
|
||||
if opts.o.debug:
|
||||
print(f"Created namespace {self.k8s_namespace}")
|
||||
else:
|
||||
raise
|
||||
|
||||
def _delete_namespace(self):
|
||||
"""Delete the deployment namespace and all resources within it."""
|
||||
if opts.o.dry_run:
|
||||
print(f"Dry run: would delete namespace {self.k8s_namespace}")
|
||||
return
|
||||
try:
|
||||
self.core_api.delete_namespace(name=self.k8s_namespace)
|
||||
if opts.o.debug:
|
||||
print(f"Deleted namespace {self.k8s_namespace}")
|
||||
except ApiException as e:
|
||||
if e.status == 404:
|
||||
if opts.o.debug:
|
||||
print(f"Namespace {self.k8s_namespace} not found")
|
||||
else:
|
||||
raise
|
||||
|
||||
def _delete_resources_by_label(self, label_selector: str, delete_volumes: bool):
|
||||
"""Delete only this stack's resources from a shared namespace."""
|
||||
ns = self.k8s_namespace
|
||||
if opts.o.dry_run:
|
||||
print(f"Dry run: would delete resources with {label_selector} in {ns}")
|
||||
return
|
||||
|
||||
# Deployments
|
||||
try:
|
||||
deps = self.apps_api.list_namespaced_deployment(
|
||||
namespace=ns, label_selector=label_selector
|
||||
)
|
||||
for dep in deps.items:
|
||||
print(f"Deleting Deployment {dep.metadata.name}")
|
||||
self.apps_api.delete_namespaced_deployment(
|
||||
name=dep.metadata.name, namespace=ns
|
||||
)
|
||||
except ApiException as e:
|
||||
_check_delete_exception(e)
|
||||
|
||||
# Jobs
|
||||
try:
|
||||
jobs = self.batch_api.list_namespaced_job(
|
||||
namespace=ns, label_selector=label_selector
|
||||
)
|
||||
for job in jobs.items:
|
||||
print(f"Deleting Job {job.metadata.name}")
|
||||
self.batch_api.delete_namespaced_job(
|
||||
name=job.metadata.name, namespace=ns,
|
||||
body=client.V1DeleteOptions(propagation_policy="Background"),
|
||||
)
|
||||
except ApiException as e:
|
||||
_check_delete_exception(e)
|
||||
|
||||
# Services (NodePorts created by SO)
|
||||
try:
|
||||
svcs = self.core_api.list_namespaced_service(
|
||||
namespace=ns, label_selector=label_selector
|
||||
)
|
||||
for svc in svcs.items:
|
||||
print(f"Deleting Service {svc.metadata.name}")
|
||||
self.core_api.delete_namespaced_service(
|
||||
name=svc.metadata.name, namespace=ns
|
||||
)
|
||||
except ApiException as e:
|
||||
_check_delete_exception(e)
|
||||
|
||||
# Ingresses
|
||||
try:
|
||||
ings = self.networking_api.list_namespaced_ingress(
|
||||
namespace=ns, label_selector=label_selector
|
||||
)
|
||||
for ing in ings.items:
|
||||
print(f"Deleting Ingress {ing.metadata.name}")
|
||||
self.networking_api.delete_namespaced_ingress(
|
||||
name=ing.metadata.name, namespace=ns
|
||||
)
|
||||
except ApiException as e:
|
||||
_check_delete_exception(e)
|
||||
|
||||
# ConfigMaps
|
||||
try:
|
||||
cms = self.core_api.list_namespaced_config_map(
|
||||
namespace=ns, label_selector=label_selector
|
||||
)
|
||||
for cm in cms.items:
|
||||
print(f"Deleting ConfigMap {cm.metadata.name}")
|
||||
self.core_api.delete_namespaced_config_map(
|
||||
name=cm.metadata.name, namespace=ns
|
||||
)
|
||||
except ApiException as e:
|
||||
_check_delete_exception(e)
|
||||
|
||||
# PVCs (only if --delete-volumes)
|
||||
if delete_volumes:
|
||||
try:
|
||||
pvcs = self.core_api.list_namespaced_persistent_volume_claim(
|
||||
namespace=ns, label_selector=label_selector
|
||||
)
|
||||
for pvc in pvcs.items:
|
||||
print(f"Deleting PVC {pvc.metadata.name}")
|
||||
self.core_api.delete_namespaced_persistent_volume_claim(
|
||||
name=pvc.metadata.name, namespace=ns
|
||||
)
|
||||
except ApiException as e:
|
||||
_check_delete_exception(e)
|
||||
|
||||
def _create_volume_data(self):
|
||||
# Create the host-path-mounted PVs for this deployment
|
||||
pvs = self.cluster_info.get_pvs()
|
||||
@ -355,11 +212,6 @@ class K8sDeployer(Deployer):
|
||||
print(f"{cfg_rsp}")
|
||||
|
||||
def _create_deployment(self):
|
||||
# Skip if there are no pods to deploy (e.g. jobs-only stacks)
|
||||
if not self.cluster_info.parsed_pod_yaml_map:
|
||||
if opts.o.debug:
|
||||
print("No pods defined, skipping Deployment creation")
|
||||
return
|
||||
# Process compose files into a Deployment
|
||||
deployment = self.cluster_info.get_deployment(
|
||||
image_pull_policy=None if self.is_kind() else "Always"
|
||||
@ -397,26 +249,6 @@ class K8sDeployer(Deployer):
|
||||
print("Service created:")
|
||||
print(f"{service_resp}")
|
||||
|
||||
def _create_jobs(self):
|
||||
# Process job compose files into k8s Jobs
|
||||
jobs = self.cluster_info.get_jobs(
|
||||
image_pull_policy=None if self.is_kind() else "Always"
|
||||
)
|
||||
for job in jobs:
|
||||
if opts.o.debug:
|
||||
print(f"Sending this job: {job}")
|
||||
if not opts.o.dry_run:
|
||||
job_resp = self.batch_api.create_namespaced_job(
|
||||
body=job, namespace=self.k8s_namespace
|
||||
)
|
||||
if opts.o.debug:
|
||||
print("Job created:")
|
||||
if job_resp.metadata:
|
||||
print(
|
||||
f" {job_resp.metadata.namespace} "
|
||||
f"{job_resp.metadata.name}"
|
||||
)
|
||||
|
||||
def _find_certificate_for_host_name(self, host_name):
|
||||
all_certificates = self.custom_obj_api.list_namespaced_custom_object(
|
||||
group="cert-manager.io",
|
||||
@ -457,40 +289,22 @@ class K8sDeployer(Deployer):
|
||||
self.skip_cluster_management = skip_cluster_management
|
||||
if not opts.o.dry_run:
|
||||
if self.is_kind() and not self.skip_cluster_management:
|
||||
# Create the kind cluster (or reuse existing one)
|
||||
kind_config = str(
|
||||
self.deployment_dir.joinpath(constants.kind_config_filename)
|
||||
# Create the kind cluster
|
||||
create_cluster(
|
||||
self.kind_cluster_name,
|
||||
str(self.deployment_dir.joinpath(constants.kind_config_filename)),
|
||||
)
|
||||
actual_cluster = create_cluster(self.kind_cluster_name, kind_config)
|
||||
if actual_cluster != self.kind_cluster_name:
|
||||
# An existing cluster was found, use it instead
|
||||
self.kind_cluster_name = actual_cluster
|
||||
# Only load locally-built images into kind
|
||||
# Registry images (docker.io, ghcr.io, etc.) will be pulled by k8s
|
||||
local_containers = self.deployment_context.stack.obj.get(
|
||||
"containers", []
|
||||
# Ensure the referenced containers are copied into kind
|
||||
load_images_into_kind(
|
||||
self.kind_cluster_name, self.cluster_info.image_set
|
||||
)
|
||||
if local_containers:
|
||||
# Filter image_set to only images matching local containers
|
||||
local_images = {
|
||||
img
|
||||
for img in self.cluster_info.image_set
|
||||
if any(c in img for c in local_containers)
|
||||
}
|
||||
if local_images:
|
||||
load_images_into_kind(self.kind_cluster_name, local_images)
|
||||
# Note: if no local containers defined, all images come from registries
|
||||
self.connect_api()
|
||||
# Create deployment-specific namespace for resource isolation
|
||||
self._ensure_namespace()
|
||||
if self.is_kind() and not self.skip_cluster_management:
|
||||
# Configure ingress controller (not installed by default in kind)
|
||||
# Skip if already running (idempotent for shared cluster)
|
||||
if not is_ingress_running():
|
||||
install_ingress_for_kind(self.cluster_info.spec.get_acme_email())
|
||||
# Wait for ingress to start
|
||||
# (deployment provisioning will fail unless this is done)
|
||||
wait_for_ingress_in_kind()
|
||||
install_ingress_for_kind()
|
||||
# Wait for ingress to start
|
||||
# (deployment provisioning will fail unless this is done)
|
||||
wait_for_ingress_in_kind()
|
||||
# Create RuntimeClass if unlimited_memlock is enabled
|
||||
if self.cluster_info.spec.get_unlimited_memlock():
|
||||
_create_runtime_class(
|
||||
@ -501,14 +315,8 @@ class K8sDeployer(Deployer):
|
||||
else:
|
||||
print("Dry run mode enabled, skipping k8s API connect")
|
||||
|
||||
# Create registry secret if configured
|
||||
from stack_orchestrator.deploy.deployment_create import create_registry_secret
|
||||
|
||||
create_registry_secret(self.cluster_info.spec, self.cluster_info.app_name)
|
||||
|
||||
self._create_volume_data()
|
||||
self._create_deployment()
|
||||
self._create_jobs()
|
||||
|
||||
http_proxy_info = self.cluster_info.spec.get_http_proxy()
|
||||
# Note: we don't support tls for kind (enabling tls causes errors)
|
||||
@ -551,42 +359,107 @@ class K8sDeployer(Deployer):
|
||||
print("NodePort created:")
|
||||
print(f"{nodeport_resp}")
|
||||
|
||||
# Call start() hooks — stacks can create additional k8s resources
|
||||
if self.deployment_context:
|
||||
from stack_orchestrator.deploy.deployment_create import call_stack_deploy_start
|
||||
call_stack_deploy_start(self.deployment_context)
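The image filtering that `up` applies before calling `load_images_into_kind` can be illustrated on its own. This is a sketch under the assumption that the stack's `containers` entries are plain name strings; `select_local_images` is a hypothetical name, and the image names in the usage note are made up for illustration.

```python
from typing import Iterable, Set


def select_local_images(image_set: Iterable[str], local_containers: Iterable[str]) -> Set[str]:
    """Keep only images whose reference contains a locally built container name."""
    local = list(local_containers)
    # Substring match, as in the diff: "cerc/foo" matches "cerc/foo:local".
    return {img for img in image_set if any(c in img for c in local)}
```

For example, `select_local_images({"cerc/foo:local", "docker.io/library/redis:7"}, ["cerc/foo"])` keeps only `cerc/foo:local`, leaving the registry image to be pulled by the cluster. Because the match is a substring test, a short container name can also match unrelated images that happen to contain it.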
|
||||
|
||||
def down(self, timeout, volumes, skip_cluster_management):
|
||||
def down(self, timeout, volumes, skip_cluster_management): # noqa: C901
|
||||
self.skip_cluster_management = skip_cluster_management
|
||||
self.connect_api()
|
||||
# Delete the k8s objects
|
||||
|
||||
app_label = f"app={self.cluster_info.app_name}"
|
||||
|
||||
# PersistentVolumes are cluster-scoped (not namespaced), so delete by label
|
||||
if volumes:
|
||||
try:
|
||||
pvs = self.core_api.list_persistent_volume(
|
||||
label_selector=app_label
|
||||
)
|
||||
for pv in pvs.items:
|
||||
if opts.o.debug:
|
||||
print(f"Deleting PV: {pv.metadata.name}")
|
||||
try:
|
||||
self.core_api.delete_persistent_volume(name=pv.metadata.name)
|
||||
except ApiException as e:
|
||||
_check_delete_exception(e)
|
||||
except ApiException as e:
|
||||
# Create the host-path-mounted PVs for this deployment
|
||||
pvs = self.cluster_info.get_pvs()
|
||||
for pv in pvs:
|
||||
if opts.o.debug:
|
||||
print(f"Error listing PVs: {e}")
|
||||
print(f"Deleting this pv: {pv}")
|
||||
try:
|
||||
pv_resp = self.core_api.delete_persistent_volume(
|
||||
name=pv.metadata.name
|
||||
)
|
||||
if opts.o.debug:
|
||||
print("PV deleted:")
|
||||
print(f"{pv_resp}")
|
||||
except ApiException as e:
|
||||
_check_delete_exception(e)
|
||||
|
||||
# When namespace is explicitly set in the spec, it may be shared with
|
||||
# other stacks — delete only this stack's resources by label.
|
||||
# Otherwise the namespace is owned by this deployment, delete it entirely.
|
||||
shared_namespace = self.deployment_context.spec.get_namespace() is not None
|
||||
if shared_namespace:
|
||||
self._delete_resources_by_label(app_label, volumes)
|
||||
# Figure out the PVCs for this deployment
|
||||
pvcs = self.cluster_info.get_pvcs()
|
||||
for pvc in pvcs:
|
||||
if opts.o.debug:
|
||||
print(f"Deleting this pvc: {pvc}")
|
||||
try:
|
||||
pvc_resp = self.core_api.delete_namespaced_persistent_volume_claim(
|
||||
name=pvc.metadata.name, namespace=self.k8s_namespace
|
||||
)
|
||||
if opts.o.debug:
|
||||
print("PVCs deleted:")
|
||||
print(f"{pvc_resp}")
|
||||
except ApiException as e:
|
||||
_check_delete_exception(e)
|
||||
|
||||
# Figure out the ConfigMaps for this deployment
|
||||
cfg_maps = self.cluster_info.get_configmaps()
|
||||
for cfg_map in cfg_maps:
|
||||
if opts.o.debug:
|
||||
print(f"Deleting this ConfigMap: {cfg_map}")
|
||||
try:
|
||||
cfg_map_resp = self.core_api.delete_namespaced_config_map(
|
||||
name=cfg_map.metadata.name, namespace=self.k8s_namespace
|
||||
)
|
||||
if opts.o.debug:
|
||||
print("ConfigMap deleted:")
|
||||
print(f"{cfg_map_resp}")
|
||||
except ApiException as e:
|
||||
_check_delete_exception(e)
|
||||
|
||||
deployment = self.cluster_info.get_deployment()
|
||||
if opts.o.debug:
|
||||
print(f"Deleting this deployment: {deployment}")
|
||||
if deployment and deployment.metadata and deployment.metadata.name:
|
||||
try:
|
||||
self.apps_api.delete_namespaced_deployment(
|
||||
name=deployment.metadata.name, namespace=self.k8s_namespace
|
||||
)
|
||||
except ApiException as e:
|
||||
_check_delete_exception(e)
|
||||
|
||||
service = self.cluster_info.get_service()
|
||||
if opts.o.debug:
|
||||
print(f"Deleting service: {service}")
|
||||
if service and service.metadata and service.metadata.name:
|
||||
try:
|
||||
self.core_api.delete_namespaced_service(
|
||||
namespace=self.k8s_namespace, name=service.metadata.name
|
||||
)
|
||||
except ApiException as e:
|
||||
_check_delete_exception(e)
|
||||
|
||||
ingress = self.cluster_info.get_ingress(use_tls=not self.is_kind())
|
||||
if ingress and ingress.metadata and ingress.metadata.name:
|
||||
if opts.o.debug:
|
||||
print(f"Deleting this ingress: {ingress}")
|
||||
try:
|
||||
self.networking_api.delete_namespaced_ingress(
|
||||
name=ingress.metadata.name, namespace=self.k8s_namespace
|
||||
)
|
||||
except ApiException as e:
|
||||
_check_delete_exception(e)
|
||||
else:
|
||||
self._delete_namespace()
|
||||
if opts.o.debug:
|
||||
print("No ingress to delete")
|
||||
|
||||
nodeports: List[client.V1Service] = self.cluster_info.get_nodeports()
|
||||
for nodeport in nodeports:
|
||||
if opts.o.debug:
|
||||
print(f"Deleting this nodeport: {nodeport}")
|
||||
if nodeport.metadata and nodeport.metadata.name:
|
||||
try:
|
||||
self.core_api.delete_namespaced_service(
|
||||
namespace=self.k8s_namespace, name=nodeport.metadata.name
|
||||
)
|
||||
except ApiException as e:
|
||||
_check_delete_exception(e)
|
||||
else:
|
||||
if opts.o.debug:
|
||||
print("No nodeport to delete")
|
||||
|
||||
if self.is_kind() and not self.skip_cluster_management:
|
||||
# Destroy the kind cluster
|
||||
@ -711,20 +584,20 @@ class K8sDeployer(Deployer):
|
||||
|
||||
def logs(self, services, tail, follow, stream):
|
||||
self.connect_api()
|
||||
pods = pods_in_deployment(self.core_api, self.cluster_info.app_name, namespace=self.k8s_namespace)
|
||||
pods = pods_in_deployment(self.core_api, self.cluster_info.app_name)
|
||||
if len(pods) > 1:
|
||||
print("Warning: more than one pod in the deployment")
|
||||
if len(pods) == 0:
|
||||
log_data = "******* Pods not running ********\n"
|
||||
else:
|
||||
k8s_pod_name = pods[0]
|
||||
containers = containers_in_pod(self.core_api, k8s_pod_name, namespace=self.k8s_namespace)
|
||||
containers = containers_in_pod(self.core_api, k8s_pod_name)
|
||||
# If pod not started, logs request below will throw an exception
|
||||
try:
|
||||
log_data = ""
|
||||
for container in containers:
|
||||
container_log = self.core_api.read_namespaced_pod_log(
|
||||
k8s_pod_name, namespace=self.k8s_namespace, container=container
|
||||
k8s_pod_name, namespace="default", container=container
|
||||
)
|
||||
container_log_lines = container_log.splitlines()
|
||||
for line in container_log_lines:
|
||||
@ -736,10 +609,6 @@ class K8sDeployer(Deployer):
|
||||
return log_stream_from_string(log_data)
|
||||
|
||||
def update(self):
|
||||
if not self.cluster_info.parsed_pod_yaml_map:
|
||||
if opts.o.debug:
|
||||
print("No pods defined, skipping update")
|
||||
return
|
||||
self.connect_api()
|
||||
ref_deployment = self.cluster_info.get_deployment()
|
||||
if not ref_deployment or not ref_deployment.metadata:
|
||||
@ -800,43 +669,26 @@ class K8sDeployer(Deployer):
|
||||
|
||||
def run_job(self, job_name: str, helm_release: Optional[str] = None):
|
||||
if not opts.o.dry_run:
|
||||
from stack_orchestrator.deploy.k8s.helm.job_runner import run_helm_job
|
||||
|
||||
# Check if this is a helm-based deployment
|
||||
chart_dir = self.deployment_dir / "chart"
|
||||
if chart_dir.exists():
|
||||
from stack_orchestrator.deploy.k8s.helm.job_runner import run_helm_job
|
||||
if not chart_dir.exists():
|
||||
# TODO: Implement job support for compose-based K8s deployments
|
||||
raise Exception(
|
||||
f"Job support is only available for helm-based "
|
||||
f"deployments. Chart directory not found: {chart_dir}"
|
||||
)
|
||||
|
||||
# Run the job using the helm job runner
|
||||
run_helm_job(
|
||||
chart_dir=chart_dir,
|
||||
job_name=job_name,
|
||||
release=helm_release,
|
||||
namespace=self.k8s_namespace,
|
||||
timeout=600,
|
||||
verbose=opts.o.verbose,
|
||||
)
|
||||
else:
|
||||
# Non-Helm path: create job from ClusterInfo
|
||||
self.connect_api()
|
||||
jobs = self.cluster_info.get_jobs(
|
||||
image_pull_policy=None if self.is_kind() else "Always"
|
||||
)
|
||||
# Find the matching job by name
|
||||
target_name = f"{self.cluster_info.app_name}-job-{job_name}"
|
||||
matched_job = None
|
||||
for job in jobs:
|
||||
if job.metadata and job.metadata.name == target_name:
|
||||
matched_job = job
|
||||
break
|
||||
if matched_job is None:
|
||||
raise Exception(
|
||||
f"Job '{job_name}' not found. Available jobs: "
|
||||
f"{[j.metadata.name for j in jobs if j.metadata]}"
|
||||
)
|
||||
if opts.o.debug:
|
||||
print(f"Creating job: {target_name}")
|
||||
self.batch_api.create_namespaced_job(
|
||||
body=matched_job, namespace=self.k8s_namespace
|
||||
)
|
||||
# Run the job using the helm job runner
|
||||
run_helm_job(
|
||||
chart_dir=chart_dir,
|
||||
job_name=job_name,
|
||||
release=helm_release,
|
||||
namespace=self.k8s_namespace,
|
||||
timeout=600,
|
||||
verbose=opts.o.verbose,
|
||||
)
|
||||
|
||||
def is_kind(self):
|
||||
return self.type == "k8s-kind"
|
||||
|
||||
@ -14,13 +14,11 @@
|
||||
# along with this program. If not, see <http:#www.gnu.org/licenses/>.
|
||||
|
||||
from kubernetes import client, utils, watch
|
||||
from kubernetes.client.exceptions import ApiException
|
||||
import os
|
||||
from pathlib import Path
|
||||
import subprocess
|
||||
import re
|
||||
from typing import Set, Mapping, List, Optional, cast
|
||||
import yaml
|
||||
|
||||
from stack_orchestrator.util import get_k8s_dir, error_exit
|
||||
from stack_orchestrator.opts import opts
|
||||
@ -98,227 +96,16 @@ def _run_command(command: str):
|
||||
return result
|
||||
|
||||
|
||||
def _get_etcd_host_path_from_kind_config(config_file: str) -> Optional[str]:
|
||||
"""Extract etcd host path from kind config extraMounts."""
|
||||
import yaml
|
||||
|
||||
try:
|
||||
with open(config_file, "r") as f:
|
||||
config = yaml.safe_load(f)
|
||||
except Exception:
|
||||
return None
|
||||
|
||||
nodes = config.get("nodes", [])
|
||||
for node in nodes:
|
||||
extra_mounts = node.get("extraMounts", [])
|
||||
for mount in extra_mounts:
|
||||
if mount.get("containerPath") == "/var/lib/etcd":
|
||||
return mount.get("hostPath")
|
||||
return None
|
||||
|
||||
|
||||
def _clean_etcd_keeping_certs(etcd_path: str) -> bool:
|
||||
"""Clean persisted etcd, keeping only TLS certificates.
|
||||
|
||||
When etcd is persisted and a cluster is recreated, kind tries to install
|
||||
resources fresh but they already exist. Instead of trying to delete
|
||||
specific stale resources (blacklist), we keep only the valuable data
|
||||
(caddy TLS certs) and delete everything else (whitelist approach).
|
||||
|
||||
The etcd image is distroless (no shell), so we extract the statically-linked
|
||||
etcdctl binary and run it from alpine which has shell support.
|
||||
|
||||
Returns True if cleanup succeeded, False if no action needed or failed.
|
||||
"""
|
||||
db_path = Path(etcd_path) / "member" / "snap" / "db"
|
||||
# Check existence using docker since etcd dir is root-owned
|
||||
check_cmd = (
|
||||
f"docker run --rm -v {etcd_path}:/etcd:ro alpine:3.19 "
|
||||
"test -f /etcd/member/snap/db"
|
||||
)
|
||||
check_result = subprocess.run(check_cmd, shell=True, capture_output=True)
|
||||
if check_result.returncode != 0:
|
||||
if opts.o.debug:
|
||||
print(f"No etcd snapshot at {db_path}, skipping cleanup")
|
||||
return False
|
||||
|
||||
if opts.o.debug:
|
||||
print(f"Cleaning persisted etcd at {etcd_path}, keeping only TLS certs")
|
||||
|
||||
etcd_image = "gcr.io/etcd-development/etcd:v3.5.9"
|
||||
temp_dir = "/tmp/laconic-etcd-cleanup"
|
||||
|
||||
# Whitelist: prefixes to KEEP - everything else gets deleted
|
||||
keep_prefixes = "/registry/secrets/caddy-system"
|
||||
|
||||
# The etcd image is distroless (no shell). We extract the statically-linked
|
||||
# etcdctl binary and run it from alpine which has shell + jq support.
|
||||
cleanup_script = f"""
|
||||
set -e
|
||||
ALPINE_IMAGE="alpine:3.19"
|
||||
|
||||
# Cleanup previous runs
|
||||
docker rm -f laconic-etcd-cleanup 2>/dev/null || true
|
||||
docker rm -f etcd-extract 2>/dev/null || true
|
||||
docker run --rm -v /tmp:/tmp $ALPINE_IMAGE rm -rf {temp_dir}
|
||||
|
||||
# Create temp dir
|
||||
docker run --rm -v /tmp:/tmp $ALPINE_IMAGE mkdir -p {temp_dir}
|
||||
|
||||
# Extract etcdctl binary (it's statically linked)
|
||||
docker create --name etcd-extract {etcd_image}
|
||||
docker cp etcd-extract:/usr/local/bin/etcdctl /tmp/etcdctl-bin
|
||||
docker rm etcd-extract
|
||||
docker run --rm -v /tmp/etcdctl-bin:/src:ro -v {temp_dir}:/dst $ALPINE_IMAGE \
|
||||
sh -c "cp /src /dst/etcdctl && chmod +x /dst/etcdctl"
|
||||
|
||||
# Copy db to temp location
|
||||
docker run --rm \
|
||||
-v {etcd_path}:/etcd:ro \
|
||||
-v {temp_dir}:/tmp-work \
|
||||
$ALPINE_IMAGE cp /etcd/member/snap/db /tmp-work/etcd-snapshot.db
|
||||
|
||||
# Restore snapshot
|
||||
docker run --rm -v {temp_dir}:/work {etcd_image} \
|
||||
etcdutl snapshot restore /work/etcd-snapshot.db \
|
||||
--data-dir=/work/etcd-data --skip-hash-check 2>/dev/null
|
||||
|
||||
# Start temp etcd (runs the etcd binary, no shell needed)
|
||||
docker run -d --name laconic-etcd-cleanup \
|
||||
-v {temp_dir}/etcd-data:/etcd-data \
|
||||
-v {temp_dir}:/backup \
|
||||
{etcd_image} etcd \
|
||||
--data-dir=/etcd-data \
|
||||
--listen-client-urls=http://0.0.0.0:2379 \
|
||||
--advertise-client-urls=http://localhost:2379
|
||||
|
||||
sleep 3
|
||||
|
||||
# Use alpine with extracted etcdctl to run commands (alpine has a shell; jq is installed via apk)
|
||||
# Export caddy secrets
|
||||
docker run --rm \
|
||||
-v {temp_dir}:/backup \
|
||||
--network container:laconic-etcd-cleanup \
|
||||
$ALPINE_IMAGE sh -c \
|
||||
'/backup/etcdctl get --prefix "{keep_prefixes}" -w json \
|
||||
> /backup/kept.json 2>/dev/null || echo "{{}}" > /backup/kept.json'
|
||||
|
||||
# Delete ALL registry keys
|
||||
docker run --rm \
|
||||
-v {temp_dir}:/backup \
|
||||
--network container:laconic-etcd-cleanup \
|
||||
$ALPINE_IMAGE /backup/etcdctl del --prefix /registry
|
||||
|
||||
# Restore kept keys using jq
|
||||
docker run --rm \
|
||||
-v {temp_dir}:/backup \
|
||||
--network container:laconic-etcd-cleanup \
|
||||
$ALPINE_IMAGE sh -c '
|
||||
apk add --no-cache jq >/dev/null 2>&1
|
||||
jq -r ".kvs[] | @base64" /backup/kept.json 2>/dev/null | \
|
||||
while read encoded; do
|
||||
key=$(echo $encoded | base64 -d | jq -r ".key" | base64 -d)
|
||||
val=$(echo $encoded | base64 -d | jq -r ".value" | base64 -d)
|
||||
echo "$val" | /backup/etcdctl put "$key"
|
||||
done
|
||||
' || true
|
||||
|
||||
# Save cleaned snapshot
|
||||
docker exec laconic-etcd-cleanup \
|
||||
etcdctl snapshot save /etcd-data/cleaned-snapshot.db
|
||||
|
||||
docker stop laconic-etcd-cleanup
|
||||
docker rm laconic-etcd-cleanup
|
||||
|
||||
# Restore to temp location first to verify it works
|
||||
docker run --rm \
|
||||
-v {temp_dir}/etcd-data/cleaned-snapshot.db:/data/db:ro \
|
||||
-v {temp_dir}:/restore \
|
||||
{etcd_image} \
|
||||
etcdutl snapshot restore /data/db --data-dir=/restore/new-etcd \
|
||||
--skip-hash-check 2>/dev/null
|
||||
|
||||
# Create timestamped backup of original (kept forever)
|
||||
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
|
||||
docker run --rm -v {etcd_path}:/etcd $ALPINE_IMAGE \
|
||||
cp -a /etcd/member /etcd/member.backup-$TIMESTAMP
|
||||
|
||||
# Replace original with cleaned version
|
||||
docker run --rm -v {etcd_path}:/etcd -v {temp_dir}:/tmp-work $ALPINE_IMAGE \
|
||||
sh -c "rm -rf /etcd/member && mv /tmp-work/new-etcd/member /etcd/member"
|
||||
|
||||
# Cleanup temp files (but NOT the timestamped backup in etcd_path)
|
||||
docker run --rm -v /tmp:/tmp $ALPINE_IMAGE rm -rf {temp_dir}
|
||||
rm -f /tmp/etcdctl-bin
|
||||
"""
|
||||
|
||||
result = subprocess.run(cleanup_script, shell=True, capture_output=True, text=True)
|
||||
if result.returncode != 0:
|
||||
if opts.o.debug:
|
||||
print(f"Warning: etcd cleanup failed: {result.stderr}")
|
||||
return False
|
||||
|
||||
if opts.o.debug:
|
||||
print("Cleaned etcd, kept only TLS certificates")
|
||||
return True
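The restore step in the script above depends on `etcdctl get -w json` emitting base64-encoded keys and values, which the shell loop decodes with jq. Below is a minimal Python sketch of that decoding, assuming the JSON has the `kvs` list shape the script reads. The function name `decode_kept` and the sample data are hypothetical and are not part of this repository.

```python
import base64
import json
from typing import Dict


def decode_kept(kept_json: str) -> Dict[str, bytes]:
    """Decode the base64 key/value pairs from etcdctl JSON output."""
    doc = json.loads(kept_json)
    decoded: Dict[str, bytes] = {}
    # An empty result ("{}") has no "kvs" entry, so default to an empty list.
    for kv in doc.get("kvs", []):
        key = base64.b64decode(kv["key"]).decode("utf-8")
        # Values stay as bytes: stored objects are not guaranteed to be text.
        decoded[key] = base64.b64decode(kv.get("value", ""))
    return decoded


# Hypothetical sample shaped like the kept.json file written by the script.
sample = json.dumps({"kvs": [{
    "key": base64.b64encode(b"/registry/secrets/caddy-system/tls").decode(),
    "value": base64.b64encode(b"\x00cert-bytes").decode(),
}]})
kept = decode_kept(sample)
```

Keeping values as bytes matters if the stored objects are binary. A shell variable cannot hold NUL bytes and `echo` appends a newline, so it may be worth verifying that the shell-based restore round-trips the secrets unchanged.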
|
||||
|
||||
|
||||
def create_cluster(name: str, config_file: str):
|
||||
"""Create or reuse the single kind cluster for this host.
|
||||
|
||||
There is only one kind cluster per host by design. Multiple deployments
|
||||
share this cluster. If a cluster already exists, it is reused.
|
||||
|
||||
Args:
|
||||
name: Cluster name (used only when creating the first cluster)
|
||||
config_file: Path to kind config file (used only when creating)
|
||||
|
||||
Returns:
|
||||
The name of the cluster being used
|
||||
"""
|
||||
existing = get_kind_cluster()
|
||||
if existing:
|
||||
print(f"Using existing cluster: {existing}")
|
||||
return existing
|
||||
|
||||
# Clean persisted etcd, keeping only TLS certificates
|
||||
etcd_path = _get_etcd_host_path_from_kind_config(config_file)
|
||||
if etcd_path:
|
||||
_clean_etcd_keeping_certs(etcd_path)
|
||||
|
||||
print(f"Creating new cluster: {name}")
|
||||
result = _run_command(f"kind create cluster --name {name} --config {config_file}")
|
||||
if result.returncode != 0:
|
||||
raise DeployerException(f"kind create cluster failed: {result}")
|
||||
return name
|
||||
|
||||
|
||||
def destroy_cluster(name: str):
|
||||
_run_command(f"kind delete cluster --name {name}")
|
||||
|
||||
|
||||
def is_ingress_running() -> bool:
|
||||
"""Check if the Caddy ingress controller is already running in the cluster."""
|
||||
try:
|
||||
core_v1 = client.CoreV1Api()
|
||||
pods = core_v1.list_namespaced_pod(
|
||||
namespace="caddy-system",
|
||||
label_selector=(
|
||||
"app.kubernetes.io/name=caddy-ingress-controller,"
|
||||
"app.kubernetes.io/component=controller"
|
||||
),
|
||||
)
|
||||
for pod in pods.items:
|
||||
if pod.status and pod.status.container_statuses:
|
||||
if pod.status.container_statuses[0].ready is True:
|
||||
if opts.o.debug:
|
||||
print("Caddy ingress controller already running")
|
||||
return True
|
||||
return False
|
||||
except ApiException:
|
||||
return False
|
||||
|
||||
|
||||
def wait_for_ingress_in_kind():
|
||||
core_v1 = client.CoreV1Api()
|
||||
for i in range(20):
|
||||
@ -345,7 +132,7 @@ def wait_for_ingress_in_kind():
|
||||
error_exit("ERROR: Timed out waiting for Caddy ingress to become ready")
|
||||
|
||||
|
||||
def install_ingress_for_kind(acme_email: str = ""):
|
||||
def install_ingress_for_kind():
|
||||
api_client = client.ApiClient()
|
||||
ingress_install = os.path.abspath(
|
||||
get_k8s_dir().joinpath(
|
||||
@ -354,34 +141,7 @@ def install_ingress_for_kind(acme_email: str = ""):
|
||||
)
|
||||
if opts.o.debug:
|
||||
print("Installing Caddy ingress controller in kind cluster")
|
||||
|
||||
# Template the YAML with email before applying
|
||||
with open(ingress_install) as f:
|
||||
yaml_content = f.read()
|
||||
|
||||
if acme_email:
|
||||
yaml_content = yaml_content.replace('email: ""', f'email: "{acme_email}"')
|
||||
if opts.o.debug:
|
||||
print(f"Configured Caddy with ACME email: {acme_email}")
|
||||
|
||||
# Apply templated YAML
|
||||
yaml_objects = list(yaml.safe_load_all(yaml_content))
|
||||
utils.create_from_yaml(api_client, yaml_objects=yaml_objects)
|
||||
|
||||
# Patch ConfigMap with ACME email if provided
|
||||
if acme_email:
|
||||
if opts.o.debug:
|
||||
print(f"Configuring ACME email: {acme_email}")
|
||||
core_api = client.CoreV1Api()
|
||||
configmap = core_api.read_namespaced_config_map(
|
||||
name="caddy-ingress-controller-configmap", namespace="caddy-system"
|
||||
)
|
||||
configmap.data["email"] = acme_email
|
||||
core_api.patch_namespaced_config_map(
|
||||
name="caddy-ingress-controller-configmap",
|
||||
namespace="caddy-system",
|
||||
body=configmap,
|
||||
)
|
||||
utils.create_from_yaml(api_client, yaml_file=ingress_install)
|
||||
|
||||
|
||||
def load_images_into_kind(kind_cluster_name: str, image_set: Set[str]):
|
||||
@ -393,10 +153,10 @@ def load_images_into_kind(kind_cluster_name: str, image_set: Set[str]):
|
||||
raise DeployerException(f"kind load docker-image failed: {result}")
|
||||
|
||||
|
||||
def pods_in_deployment(core_api: client.CoreV1Api, deployment_name: str, namespace: str = "default"):
|
||||
def pods_in_deployment(core_api: client.CoreV1Api, deployment_name: str):
|
||||
pods = []
|
||||
pod_response = core_api.list_namespaced_pod(
|
||||
namespace=namespace, label_selector=f"app={deployment_name}"
|
||||
namespace="default", label_selector=f"app={deployment_name}"
|
||||
)
|
||||
if opts.o.debug:
|
||||
print(f"pod_response: {pod_response}")
|
||||
@ -406,10 +166,10 @@ def pods_in_deployment(core_api: client.CoreV1Api, deployment_name: str, namespa
|
||||
return pods
|
||||
|
||||
|
||||
def containers_in_pod(core_api: client.CoreV1Api, pod_name: str, namespace: str = "default") -> List[str]:
|
||||
def containers_in_pod(core_api: client.CoreV1Api, pod_name: str) -> List[str]:
|
||||
containers: List[str] = []
|
||||
pod_response = cast(
|
||||
client.V1Pod, core_api.read_namespaced_pod(pod_name, namespace=namespace)
|
||||
client.V1Pod, core_api.read_namespaced_pod(pod_name, namespace="default")
|
||||
)
|
||||
if opts.o.debug:
|
||||
print(f"pod_response: {pod_response}")
|
||||
@ -564,25 +324,6 @@ def _generate_kind_mounts(parsed_pod_files, deployment_dir, deployment_context):
|
||||
volume_host_path_map = _get_host_paths_for_volumes(deployment_context)
|
||||
seen_host_path_mounts = set() # Track to avoid duplicate mounts
|
||||
|
||||
# Cluster state backup for offline data recovery (unique per deployment)
|
||||
# etcd contains all k8s state; PKI certs needed to decrypt etcd offline
|
||||
deployment_id = deployment_context.id
|
||||
backup_subdir = f"cluster-backups/{deployment_id}"
|
||||
|
||||
etcd_host_path = _make_absolute_host_path(
|
||||
Path(f"./data/{backup_subdir}/etcd"), deployment_dir
|
||||
)
|
||||
volume_definitions.append(
|
||||
f" - hostPath: {etcd_host_path}\n" f" containerPath: /var/lib/etcd\n"
|
||||
)
|
||||
|
||||
pki_host_path = _make_absolute_host_path(
|
||||
Path(f"./data/{backup_subdir}/pki"), deployment_dir
|
||||
)
|
||||
volume_definitions.append(
|
||||
f" - hostPath: {pki_host_path}\n" f" containerPath: /etc/kubernetes/pki\n"
|
||||
)
|
||||
|
||||
# Note these paths are relative to the location of the pod files (at present)
|
||||
# So we need to fix up to make them correct and absolute because kind assumes
|
||||
# relative to the cwd.
|
||||
@ -942,41 +683,6 @@ def envs_from_compose_file(
|
||||
return result
|
||||
|
||||
|
||||
def translate_sidecar_service_names(
|
||||
envs: Mapping[str, str], sibling_service_names: List[str]
|
||||
) -> Mapping[str, str]:
|
||||
"""Translate docker-compose service names to localhost for sidecar containers.
|
||||
|
||||
In docker-compose, services can reference each other by name (e.g., 'db:5432').
|
||||
In Kubernetes, when multiple containers are in the same pod (sidecars), they
|
||||
share the same network namespace and must use 'localhost' instead.
|
||||
|
||||
This function replaces service name references with 'localhost' in env values.
|
||||
"""
|
||||
import re
|
||||
|
||||
if not sibling_service_names:
|
||||
return envs
|
||||
|
||||
result = {}
|
||||
for env_var, env_val in envs.items():
|
||||
if env_val is None:
|
||||
result[env_var] = env_val
|
||||
continue
|
||||
|
||||
new_val = str(env_val)
|
||||
for service_name in sibling_service_names:
|
||||
# Match service name followed by optional port (e.g., 'db:5432', 'db')
|
||||
# Handle URLs like: postgres://user:pass@db:5432/dbname
|
||||
# and simple refs like: db:5432 or just db
|
||||
pattern = rf"\b{re.escape(service_name)}(:\d+)?\b"
|
||||
new_val = re.sub(pattern, lambda m: f'localhost{m.group(1) or ""}', new_val)
|
||||
|
||||
result[env_var] = new_val
|
||||
|
||||
return result
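The substitution above can be tried in isolation. The following is a self-contained sketch that uses the same regular expression. The name `translate_names` and the sample environment are hypothetical and chosen only for illustration.

```python
import re
from typing import Dict, List, Mapping, Optional


def translate_names(
    envs: Mapping[str, Optional[str]], siblings: List[str]
) -> Dict[str, Optional[str]]:
    """Replace sibling service names (with an optional :port) by localhost."""
    result: Dict[str, Optional[str]] = {}
    for name, value in envs.items():
        if value is None:
            # Unset values are passed through untouched.
            result[name] = None
            continue
        new_value = str(value)
        for service in siblings:
            # \b anchors the match so "db" does not match inside "dbname".
            pattern = rf"\b{re.escape(service)}(:\d+)?\b"
            new_value = re.sub(
                pattern, lambda m: f"localhost{m.group(1) or ''}", new_value
            )
        result[name] = new_value
    return result


example = translate_names(
    {"DATABASE_URL": "postgres://user:pass@db:5432/dbname", "EMPTY": None},
    ["db"],
)
```

One property of this pattern to be aware of: `\b` treats `-` as a word boundary, so a value such as `my-db:5432` would also be rewritten (to `my-localhost:5432`). Whether that matters depends on the service names in a given stack.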
|
||||
|
||||
|
||||
def envs_from_environment_variables_map(
|
||||
map: Mapping[str, str]
|
||||
) -> List[client.V1EnvVar]:
|
||||
|
||||
@ -98,99 +98,25 @@ class Spec:
|
||||
def get_image_registry(self):
|
||||
return self.obj.get(constants.image_registry_key)
|
||||
|
||||
def get_image_registry_config(self) -> typing.Optional[typing.Dict]:
|
||||
"""Returns registry auth config: {server, username, token-env}.
|
||||
|
||||
Used for private container registries like GHCR. The token-env field
|
||||
specifies an environment variable containing the API token/PAT.
|
||||
|
||||
Note: Uses 'registry-credentials' key to avoid collision with
|
||||
'image-registry' key which is for pushing images.
|
||||
"""
|
||||
return self.obj.get("registry-credentials")
|
||||
|
||||
def get_volumes(self):
|
||||
return self.obj.get(constants.volumes_key, {})
|
||||
|
||||
def get_configmaps(self):
|
||||
return self.obj.get(constants.configmaps_key, {})
|
||||
|
||||
def get_secrets(self):
|
||||
return self.obj.get(constants.secrets_key, {})
|
||||
|
||||
def get_container_resources(self):
|
||||
return Resources(
|
||||
self.obj.get(constants.resources_key, {}).get("containers", {})
|
||||
)
|
||||
|
||||
def get_container_resources_for(
|
||||
self, container_name: str
|
||||
) -> typing.Optional[Resources]:
|
||||
"""Look up per-container resource overrides from spec.yml.
|
||||
|
||||
Checks resources.containers.<container_name> in the spec. Returns None
|
||||
if no per-container override exists (caller falls back to other sources).
|
||||
"""
|
||||
containers_block = self.obj.get(constants.resources_key, {}).get(
|
||||
"containers", {}
|
||||
)
|
||||
if container_name in containers_block:
|
||||
entry = containers_block[container_name]
|
||||
# Only treat it as a per-container override if it's a dict with
|
||||
# reservations/limits nested inside (not a top-level global key)
|
||||
if isinstance(entry, dict) and (
|
||||
"reservations" in entry or "limits" in entry
|
||||
):
|
||||
return Resources(entry)
|
||||
return None
|
||||
|
||||
def get_volume_resources(self):
|
||||
return Resources(
|
||||
self.obj.get(constants.resources_key, {}).get(constants.volumes_key, {})
|
||||
)
|
||||
|
||||
def get_volume_resources_for(self, volume_name: str) -> typing.Optional[Resources]:
|
||||
"""Look up per-volume resource overrides from spec.yml.
|
||||
|
||||
Supports two formats under resources.volumes:
|
||||
|
||||
Global (original):
|
||||
resources:
|
||||
volumes:
|
||||
reservations:
|
||||
storage: 5Gi
|
||||
|
||||
Per-volume (new):
|
||||
resources:
|
||||
volumes:
|
||||
my-volume:
|
||||
reservations:
|
||||
storage: 10Gi
|
||||
|
||||
Returns the per-volume Resources if found, otherwise None.
|
||||
The caller should fall back to get_volume_resources() then the default.
|
||||
"""
|
||||
vol_section = (
|
||||
self.obj.get(constants.resources_key, {}).get(constants.volumes_key, {})
|
||||
)
|
||||
if volume_name not in vol_section:
|
||||
return None
|
||||
entry = vol_section[volume_name]
|
||||
if isinstance(entry, dict) and (
|
||||
"reservations" in entry or "limits" in entry
|
||||
):
|
||||
return Resources(entry)
|
||||
return None
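The docstring above describes a fallback chain: per-volume override, then the global block, then a default. Below is a rough sketch of that chain over plain dictionaries. It assumes the literal keys `resources` and `volumes` shown in the docstring examples. `DEFAULT_STORAGE`, `per_volume_entry` and `storage_for` are hypothetical names, and the default value is invented for the example.

```python
from typing import Any, Dict, Optional

DEFAULT_STORAGE = "2Gi"  # hypothetical default, not taken from the source


def per_volume_entry(spec: Dict[str, Any], volume: str) -> Optional[Dict[str, Any]]:
    """Return the per-volume block, or None when only global keys exist."""
    section = spec.get("resources", {}).get("volumes", {})
    entry = section.get(volume)
    # A global key such as "reservations" is also a dict, so require the
    # nested reservations/limits keys before treating it as an override.
    if isinstance(entry, dict) and ("reservations" in entry or "limits" in entry):
        return entry
    return None


def storage_for(spec: Dict[str, Any], volume: str) -> str:
    """Resolve storage: per-volume override, then global, then default."""
    entry = per_volume_entry(spec, volume)
    if entry is None:
        entry = spec.get("resources", {}).get("volumes", {})
    return entry.get("reservations", {}).get("storage", DEFAULT_STORAGE)


spec = {
    "resources": {
        "volumes": {
            "reservations": {"storage": "5Gi"},
            "my-volume": {"reservations": {"storage": "10Gi"}},
        }
    }
}
```

With this spec, `storage_for(spec, "my-volume")` resolves to the per-volume value and any other volume name falls back to the global reservation. The nested-key check is what keeps a volume lookup for the name `reservations` from being mistaken for an override.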
|
||||
|
||||
def get_http_proxy(self):
|
||||
return self.obj.get(constants.network_key, {}).get(constants.http_proxy_key, [])
|
||||
|
||||
def get_namespace(self):
|
||||
return self.obj.get("namespace")
|
||||
|
||||
def get_kind_cluster_name(self):
|
||||
return self.obj.get("kind-cluster-name")
|
||||
|
||||
def get_annotations(self):
|
||||
return self.obj.get(constants.annotations_key, {})
|
||||
|
||||
@ -253,9 +179,6 @@ class Spec:
|
||||
def get_deployment_type(self):
|
||||
return self.obj.get(constants.deploy_to_key)
|
||||
|
||||
def get_acme_email(self):
|
||||
return self.obj.get(constants.network_key, {}).get(constants.acme_email_key, "")
|
||||
|
||||
def is_kubernetes_deployment(self):
|
||||
return self.get_deployment_type() in [
|
||||
constants.k8s_kind_deploy_type,
|
||||
|
||||
@ -19,7 +19,7 @@ from pathlib import Path
|
||||
from urllib.parse import urlparse
|
||||
from tempfile import NamedTemporaryFile
|
||||
|
||||
from stack_orchestrator.util import error_exit, global_options2, get_yaml
|
||||
from stack_orchestrator.util import error_exit, global_options2
|
||||
from stack_orchestrator.deploy.deployment_create import init_operation, create_operation
|
||||
from stack_orchestrator.deploy.deploy import create_deploy_context
|
||||
from stack_orchestrator.deploy.deploy_types import DeployCommandContext
|
||||
@ -41,23 +41,19 @@ def _fixup_container_tag(deployment_dir: str, image: str):
|
||||
def _fixup_url_spec(spec_file_name: str, url: str):
|
||||
# url is like: https://example.com/path
|
||||
parsed_url = urlparse(url)
|
||||
http_proxy_spec = f"""
|
||||
http-proxy:
|
||||
- host-name: {parsed_url.hostname}
|
||||
routes:
|
||||
- path: '{parsed_url.path if parsed_url.path else "/"}'
|
||||
proxy-to: webapp:80
|
||||
"""
|
||||
spec_file_path = Path(spec_file_name)
|
||||
yaml = get_yaml()
|
||||
with open(spec_file_path) as rfile:
|
||||
contents = yaml.load(rfile)
|
||||
contents.setdefault("network", {})["http-proxy"] = [
|
||||
{
|
||||
"host-name": parsed_url.hostname,
|
||||
"routes": [
|
||||
{
|
||||
"path": parsed_url.path if parsed_url.path else "/",
|
||||
"proxy-to": "webapp:80",
|
||||
}
|
||||
],
|
||||
}
|
||||
]
|
||||
contents = rfile.read()
|
||||
contents = contents + http_proxy_spec
|
||||
with open(spec_file_path, "w") as wfile:
|
||||
yaml.dump(contents, wfile)
|
||||
wfile.write(contents)
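Both variants of `_fixup_url_spec` derive the same `http-proxy` entry from a URL; they differ in whether it is merged into the parsed spec or appended as text. Here is a small sketch of the structured form on plain dictionaries, leaving out YAML loading and file I/O. The helper names `http_proxy_entry` and `set_http_proxy` are hypothetical.

```python
from typing import Any, Dict
from urllib.parse import urlparse


def http_proxy_entry(url: str, target: str = "webapp:80") -> Dict[str, Any]:
    """Build one http-proxy entry from a URL like https://example.com/path."""
    parsed = urlparse(url)
    return {
        "host-name": parsed.hostname,
        # A URL without a path component routes the root path.
        "routes": [{"path": parsed.path or "/", "proxy-to": target}],
    }


def set_http_proxy(spec: Dict[str, Any], url: str) -> Dict[str, Any]:
    """Set network.http-proxy while keeping other keys under network."""
    spec.setdefault("network", {})["http-proxy"] = [http_proxy_entry(url)]
    return spec


merged = set_http_proxy({"network": {"ports": {}}}, "https://example.com/path")
```

Merging into the existing `network` mapping avoids producing a second top-level key when the spec already contains one, which is the situation that appending raw text cannot detect.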
|
||||
|
||||
|
||||
def create_deployment(
|
||||
|
||||
@ -75,8 +75,6 @@ def get_parsed_stack_config(stack):
|
||||
|
||||
def get_pod_list(parsed_stack):
|
||||
# Handle both old and new format
|
||||
if "pods" not in parsed_stack or not parsed_stack["pods"]:
|
||||
return []
|
||||
pods = parsed_stack["pods"]
|
||||
if type(pods[0]) is str:
|
||||
result = pods
|
||||
@ -105,7 +103,7 @@ def get_job_list(parsed_stack):
|
||||
|
||||
def get_plugin_code_paths(stack) -> List[Path]:
|
||||
parsed_stack = get_parsed_stack_config(stack)
|
||||
pods = parsed_stack.get("pods") or []
|
||||
pods = parsed_stack["pods"]
|
||||
result: Set[Path] = set()
|
||||
for pod in pods:
|
||||
if type(pod) is str:
|
||||
@ -155,16 +153,15 @@ def resolve_job_compose_file(stack, job_name: str):
|
||||
if proposed_file.exists():
|
||||
return proposed_file
|
||||
# If we don't find it fall through to the internal case
|
||||
data_dir = Path(__file__).absolute().parent.joinpath("data")
|
||||
compose_jobs_base = data_dir.joinpath("compose-jobs")
|
||||
# TODO: Add internal compose-jobs directory support if needed
|
||||
# For now, jobs are expected to be in external stacks only
|
||||
compose_jobs_base = Path(stack).parent.parent.joinpath("compose-jobs")
|
||||
return compose_jobs_base.joinpath(f"docker-compose-{job_name}.yml")
|
||||
|
||||
|
||||
def get_pod_file_path(stack, parsed_stack, pod_name: str):
|
||||
pods = parsed_stack.get("pods") or []
|
||||
pods = parsed_stack["pods"]
|
||||
result = None
|
||||
if not pods:
|
||||
return result
|
||||
if type(pods[0]) is str:
|
||||
result = resolve_compose_file(stack, pod_name)
|
||||
else:
|
||||
@ -192,9 +189,9 @@ def get_job_file_path(stack, parsed_stack, job_name: str):
|
||||
|
||||
|
||||
def get_pod_script_paths(parsed_stack, pod_name: str):
|
||||
pods = parsed_stack.get("pods") or []
|
||||
pods = parsed_stack["pods"]
|
||||
result = []
|
||||
if not pods or not type(pods[0]) is str:
|
||||
if not type(pods[0]) is str:
|
||||
for pod in pods:
|
||||
if pod["name"] == pod_name:
|
||||
pod_root_dir = os.path.join(
|
||||
@ -210,9 +207,9 @@ def get_pod_script_paths(parsed_stack, pod_name: str):
|
||||
|
||||
|
||||
def pod_has_scripts(parsed_stack, pod_name: str):
|
||||
pods = parsed_stack.get("pods") or []
|
||||
pods = parsed_stack["pods"]
|
||||
result = False
|
||||
if not pods or type(pods[0]) is str:
|
||||
if type(pods[0]) is str:
|
||||
result = False
|
||||
else:
|
||||
for pod in pods:
|
||||
|
||||
@ -105,15 +105,6 @@ fi
|
||||
# Add a config file to be picked up by the ConfigMap before starting.
|
||||
echo "dbfc7a4d-44a7-416d-b5f3-29842cc47650" > $test_deployment_dir/configmaps/test-config/test_config
|
||||
|
||||
# Add secrets to the deployment spec (references a pre-existing k8s Secret by name).
|
||||
# deploy init already writes an empty 'secrets: {}' key, so we replace it
|
||||
# rather than appending (ruamel.yaml rejects duplicate keys).
|
||||
deployment_spec_file=${test_deployment_dir}/spec.yml
|
||||
sed -i 's/^secrets: {}$/secrets:\n test-secret:\n - TEST_SECRET_KEY/' ${deployment_spec_file}
|
||||
|
||||
# Get the deployment ID for kubectl queries
|
||||
deployment_id=$(cat ${test_deployment_dir}/deployment.yml | cut -d ' ' -f 2)
|
||||
|
||||
echo "deploy create output file test: passed"
|
||||
# Try to start the deployment
|
||||
$TEST_TARGET_SO deployment --dir $test_deployment_dir start
|
||||
@ -175,71 +166,12 @@ else
|
||||
delete_cluster_exit
|
||||
fi
|
||||
|
||||
# --- New feature tests: namespace, labels, jobs, secrets ---
|
||||
|
||||
# Check that the pod is in the deployment-specific namespace (not default)
|
||||
ns_pod_count=$(kubectl get pods -n laconic-${deployment_id} -l app=${deployment_id} --no-headers 2>/dev/null | wc -l)
|
||||
if [ "$ns_pod_count" -gt 0 ]; then
|
||||
echo "namespace isolation test: passed"
|
||||
else
|
||||
echo "namespace isolation test: FAILED"
|
||||
echo "Expected pod in namespace laconic-${deployment_id}"
|
||||
delete_cluster_exit
|
||||
fi
|
||||
|
||||
# Check that the stack label is set on the pod
|
||||
stack_label_count=$(kubectl get pods -n laconic-${deployment_id} -l app.kubernetes.io/stack=test --no-headers 2>/dev/null | wc -l)
|
||||
if [ "$stack_label_count" -gt 0 ]; then
|
||||
echo "stack label test: passed"
|
||||
else
|
||||
echo "stack label test: FAILED"
|
||||
delete_cluster_exit
|
||||
fi
|
||||
|
||||
# Check that the job completed successfully
|
||||
for i in {1..30}; do
|
||||
job_status=$(kubectl get job ${deployment_id}-job-test-job -n laconic-${deployment_id} -o jsonpath='{.status.succeeded}' 2>/dev/null || true)
|
||||
if [ "$job_status" == "1" ]; then
|
||||
break
|
||||
fi
|
||||
sleep 2
|
||||
done
|
||||
if [ "$job_status" == "1" ]; then
|
||||
echo "job completion test: passed"
|
||||
else
|
||||
echo "job completion test: FAILED"
|
||||
echo "Job status.succeeded: ${job_status}"
|
||||
delete_cluster_exit
|
||||
fi
|
||||
|
||||
# Check that the secrets spec results in an envFrom secretRef on the pod
|
||||
secret_ref=$(kubectl get pod -n laconic-${deployment_id} -l app=${deployment_id} \
|
||||
-o jsonpath='{.items[0].spec.containers[0].envFrom[?(@.secretRef.name=="test-secret")].secretRef.name}' 2>/dev/null || true)
|
||||
if [ "$secret_ref" == "test-secret" ]; then
|
||||
echo "secrets envFrom test: passed"
|
||||
else
|
||||
echo "secrets envFrom test: FAILED"
|
||||
echo "Expected secretRef 'test-secret', got: ${secret_ref}"
|
||||
delete_cluster_exit
|
||||
fi
|
||||
|
||||
# Stop then start again and check the volume was preserved.
|
||||
# Use --skip-cluster-management to reuse the existing kind cluster instead of
|
||||
# destroying and recreating it (which fails on CI runners due to stale etcd/certs
|
||||
# and cgroup detection issues).
|
||||
# Use --delete-volumes to clear PVs so fresh PVCs can bind on restart.
|
||||
# Bind-mount data survives on the host filesystem; provisioner volumes are recreated fresh.
|
||||
$TEST_TARGET_SO deployment --dir $test_deployment_dir stop --delete-volumes --skip-cluster-management
|
||||
# Wait for the namespace to be fully terminated before restarting.
|
||||
# Without this, 'start' fails with 403 Forbidden because the namespace
|
||||
# is still in Terminating state.
|
||||
for i in {1..60}; do
|
||||
if ! kubectl get namespace laconic-${deployment_id} 2>/dev/null | grep -q .; then
|
||||
break
|
||||
fi
|
||||
sleep 2
|
||||
done
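The loop above is a bounded poll: check a condition, sleep, and give up after a fixed number of attempts. The same pattern is sketched in Python below. `wait_until` is a hypothetical helper, and the injectable `sleep` parameter exists only so the example can run without real delays.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    attempts: int = 60,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll condition up to `attempts` times; return True once it holds."""
    for _ in range(attempts):
        if condition():
            return True
        sleep(delay)
    # Report the timeout to the caller instead of silently continuing.
    return False
```

Unlike the shell loop, which proceeds to the next command whether or not the namespace has disappeared, this sketch returns `False` on timeout so the caller can decide whether to abort.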
|
||||
$TEST_TARGET_SO deployment --dir $test_deployment_dir start --skip-cluster-management
|
||||
# Stop then start again and check the volume was preserved
|
||||
$TEST_TARGET_SO deployment --dir $test_deployment_dir stop
|
||||
# Sleep a bit just in case
|
||||
# sleep for longer to check if that's why the subsequent create cluster fails
|
||||
sleep 20
|
||||
$TEST_TARGET_SO deployment --dir $test_deployment_dir start
|
||||
wait_for_pods_started
|
||||
wait_for_log_output
|
||||
sleep 1
|
||||
@ -252,9 +184,8 @@ else
|
||||
delete_cluster_exit
|
||||
fi
|
||||
|
||||
# Provisioner volumes are destroyed when PVs are deleted (--delete-volumes on stop).
|
||||
# Unlike bind-mount volumes whose data persists on the host, provisioner storage
|
||||
# is gone, so the volume appears fresh after restart.
|
||||
# These volumes will be completely destroyed by the kind delete/create, because they lived inside
|
||||
# the kind container. So, unlike the bind-mount case, they will appear fresh after the restart.
|
||||
log_output_11=$( $TEST_TARGET_SO deployment --dir $test_deployment_dir logs )
|
||||
if [[ "$log_output_11" == *"/data2 filesystem is fresh"* ]]; then
|
||||
echo "Fresh provisioner volumes test: passed"
|
||||
|
||||
@ -206,7 +206,7 @@ fi
|
||||
# The deployment's pod should be scheduled onto node: worker3
|
||||
# Check that's what happened
|
||||
# Get the node onto which the stack pod has been deployed
|
||||
deployment_node=$(kubectl get pods -n laconic-${deployment_id} -l app=${deployment_id} -o=jsonpath='{.items..spec.nodeName}')
|
||||
deployment_node=$(kubectl get pods -l app=${deployment_id} -o=jsonpath='{.items..spec.nodeName}')
|
||||
expected_node=${deployment_id}-worker3
|
||||
echo "Stack pod deployed to node: ${deployment_node}"
|
||||
if [[ ${deployment_node} == ${expected_node} ]]; then
|
||||
|
||||
@ -1,5 +1,5 @@
|
||||
#!/usr/bin/env bash
|
||||
# TODO: handle ARM
|
||||
curl --silent -Lo ./kind https://kind.sigs.k8s.io/dl/v0.25.0/kind-linux-amd64
|
||||
curl --silent -Lo ./kind https://kind.sigs.k8s.io/dl/v0.20.0/kind-linux-amd64
|
||||
chmod +x ./kind
|
||||
mv ./kind /usr/local/bin
|
||||
|
||||
@ -1,6 +1,5 @@
|
||||
#!/usr/bin/env bash
|
||||
# TODO: handle ARM
|
||||
# Pin kubectl to match Kind's default k8s version (v1.31.x)
|
||||
curl --silent -LO "https://dl.k8s.io/release/v1.31.2/bin/linux/amd64/kubectl"
|
||||
curl --silent -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
|
||||
chmod +x ./kubectl
|
||||
mv ./kubectl /usr/local/bin
|
||||
|
||||
@ -1,53 +0,0 @@
|
||||
#!/bin/bash
|
||||
# Run a test suite locally in an isolated venv.
|
||||
#
|
||||
# Usage:
|
||||
# ./tests/scripts/run-test-local.sh <test-script>
|
||||
#
|
||||
# Examples:
|
||||
# ./tests/scripts/run-test-local.sh tests/webapp-test/run-webapp-test.sh
|
||||
# ./tests/scripts/run-test-local.sh tests/smoke-test/run-smoke-test.sh
|
||||
# ./tests/scripts/run-test-local.sh tests/k8s-deploy/run-deploy-test.sh
|
||||
#
|
||||
# The script creates a temporary venv, installs shiv, builds the laconic-so
|
||||
# package, runs the requested test, then cleans up.
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
if [ $# -lt 1 ]; then
|
||||
echo "Usage: $0 <test-script> [args...]"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
TEST_SCRIPT="$1"
|
||||
shift
|
||||
|
||||
if [ ! -f "$TEST_SCRIPT" ]; then
|
||||
echo "Error: $TEST_SCRIPT not found"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
REPO_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
|
||||
VENV_DIR=$(mktemp -d /tmp/so-test-XXXXXX)
|
||||
|
||||
cleanup() {
|
||||
echo "Cleaning up venv: $VENV_DIR"
|
||||
rm -rf "$VENV_DIR"
|
||||
}
|
||||
trap cleanup EXIT
|
||||
|
||||
cd "$REPO_DIR"
|
||||
|
||||
echo "==> Creating venv in $VENV_DIR"
|
||||
python3 -m venv "$VENV_DIR"
|
||||
source "$VENV_DIR/bin/activate"
|
||||
|
||||
echo "==> Installing shiv"
|
||||
pip install -q shiv
|
||||
|
||||
echo "==> Building laconic-so package"
|
||||
./scripts/create_build_tag_file.sh
|
||||
./scripts/build_shiv_package.sh
|
||||
|
||||
echo "==> Running: $TEST_SCRIPT $*"
|
||||
exec "./$TEST_SCRIPT" "$@"
|
||||