chore(deps): bump Talos to v1.14.0-alpha.1 #223

Draft
Kirill Ilin (sircthulhu) wants to merge 2 commits into main from chore/bump-talos-v1.14

@sircthulhu Kirill Ilin (sircthulhu) commented Jun 24, 2026

Summary

Tracks upstream Talos v1.14.0-alpha.1 for both the main module and pkg/machinery. This lets talm recognize the new v1.14 machine-config documents — in particular KubeEtcdEncryptionConfig (siderolabs/talos#13253), which the previous v1.13-beta machinery rejected as "... not registered".

Version changes

| Dependency | Before | After |
| --- | --- | --- |
| github.com/siderolabs/talos | v1.12.6 (cozystack fork) | v1.14.0-alpha.1 (upstream) |
| github.com/siderolabs/talos/pkg/machinery | v1.13.0-beta.1 (cozystack fork) | v1.14.0-alpha.1 (upstream) |
| k8s.io/{api,apimachinery,client-go,component-base} | v0.35.3 | v0.36.1 |
| Default Kubernetes version | 1.35.0 | 1.36.1 |

⚠️ Behavioral note: the default Kubernetes version (used when a project does not set kubernetesVersion in Chart.yaml) moves from 1.35.0 to 1.36.1. Projects that pin kubernetesVersion are unaffected.

What changed

  • Dropped the cozystack/talos fork. It was carried for the --skip-verify flag (siderolabs/talos#12652) but pinned to a v1.13-beta base incompatible with v1.14 machinery. --skip-verify is reimplemented in pkg/commands: a package-level SkipVerify flag plus a talm-local WithClientSkipVerify that builds an InsecureSkipVerify client from the talosconfig context (client-certificate auth preserved).

  • Ported pkg/commands to the v1.14 talosctl client API. global.Args.WithClientNoNodes / WithClient / WithClientMaintenance now take a context.Context; the talm wrappers create a signal-cancellable context and pass it through.

  • Explicit Kubernetes version on bundle generation. v1.14's config/generate now errors on an empty Kubernetes version ("kubernetes version must be specified") instead of silently defaulting it. regenerateTalosconfig and the engine bundle path now pass one explicitly, falling back to the machinery default (1.36.1) when a project leaves it unset — preserving the prior "just works" behavior.

  • Pinned control-plane component images. talm renders the per-node config as a diff over the generated bundle, so the apiServer/controllerManager/scheduler images are dropped whenever they equal the bundle default. On v1.14, which deprecated cluster.controllerManager/scheduler, patching their extraArgs also drops the sibling image during the merge. Either way the applied config carries no component image, and the node re-defaults it to the Talos release's Kubernetes version (v1.14 → k8s 1.36.x) instead of the cluster's kubernetesVersion. That skews controller-manager/scheduler away from kube-apiserver/kubelet. Older Talos preserved the cluster version, so the problem only surfaced after the bump. The render now re-derives and pins these images.
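
The signal-cancellable context mentioned in the second bullet can be sketched with the standard library alone. This is a minimal illustration, not the code from pkg/commands: the body of signalContext and the withSignalContext wrapper are assumptions, and a real wrapper would hand the context to the Talos client helpers rather than to a plain function.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// signalContext returns a context that is cancelled when the process
// receives SIGINT or SIGTERM. The body is an assumption; only the helper
// name appears in the review walkthrough.
func signalContext() (context.Context, context.CancelFunc) {
	return signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
}

// withSignalContext is a hypothetical wrapper: it creates the context,
// passes it to the action, and releases the signal handler afterwards.
func withSignalContext(action func(ctx context.Context) error) error {
	ctx, cancel := signalContext()
	defer cancel()

	return action(ctx)
}

func main() {
	err := withSignalContext(func(ctx context.Context) error {
		// A real action would call the Talos client with ctx here.
		return ctx.Err()
	})
	fmt.Println("action error:", err)
}
```

Calling cancel in a defer matters here: signal.NotifyContext keeps a signal handler registered until the returned function runs.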

Testing

go build, go test ./..., golangci-lint run all clean — including the new contract test TestContract_RenderPinsControlPlaneComponentImages.

Summary by CodeRabbit

  • New Features

    • --skip-verify now works independently across command paths and preserves client authentication details when connecting.
    • Kubernetes version handling now falls back to a sensible default when none is provided.
  • Bug Fixes

    • Improved certificate verification bypass behavior for Talos connections.
    • Fixed version normalization so Kubernetes versions are handled consistently.
  • Chores

    • Updated many dependencies, including Kubernetes, Talos, gRPC, protobuf, and several utility libraries.

Track upstream Talos v1.14.0-alpha.1 for both the main module and machinery.
This makes talm recognize new v1.14 machine-config documents (notably
KubeEtcdEncryptionConfig, siderolabs/talos#13253) which v1.13-beta machinery
rejected as "not registered".

The cozystack/talos fork (carried for the --skip-verify flag,
siderolabs/talos#12652) was pinned to a v1.13-beta base incompatible with v1.14
machinery, so it is dropped. --skip-verify is reimplemented in pkg/commands: a
package-level SkipVerify flag plus a talm-local WithClientSkipVerify that builds
an InsecureSkipVerify client from the talosconfig context (client-cert auth
preserved).

Port pkg/commands to the v1.14 talosctl client API:
- global.Args.WithClient* now take a context; the talm wrappers create a
  signal-cancellable context and pass it through.
- regenerateTalosconfig and the engine bundle path now pass a Kubernetes version
  explicitly (defaulting to the machinery default when unset), since v1.14's
  config/generate errors on an empty version instead of defaulting it.

go build, go test ./..., golangci-lint run all clean.

Assisted-By: Claude AI
Signed-off-by: Kirill Ilin <stitch14@yandex.ru>

coderabbitai Bot commented Jun 24, 2026

Walkthrough

The PR upgrades siderolabs/talos from v1.12.6 to v1.14.0-alpha.1, removes the cozystack/talos fork replace directives in go.mod, and reimplements --skip-verify TLS client behavior natively in pkg/commands/root.go. A package-level SkipVerify variable replaces GlobalArgs.SkipVerify across all command entry points. A kubeVersion helper is added to normalize Kubernetes version strings with a fallback to constants.DefaultKubernetesVersion. Multiple indirect dependencies are also updated.

Changes

Talos v1.14 upgrade, fork removal, and skip-verify reimplementation

| Layer / File(s) | Summary |
| --- | --- |
| Dependency upgrades and fork removal (go.mod) | Bumps siderolabs/talos to v1.14.0-alpha.1, k8s.io/* to v0.36.1, AWS SDK, etcd 3.6.11, FluxCD, gRPC 2.29.0, containerd, and other indirect deps; removes the replace block redirecting siderolabs/talos to cozystack/talos and adds an explanatory comment. |
| SkipVerify flag and WithClientSkipVerify reimplementation (pkg/commands/root.go) | Adds exported SkipVerify bool and signalContext() helper; adds signal-aware context creation to WithClientNoNodes and WithClientMaintenance; rewrites WithClientSkipVerify to parse the talosconfig, build a skip-server-cert TLS config with optional client cert/key loading, select endpoints, construct a Talos client, and apply nodes; WithClientAuto now uses the package-level SkipVerify. |
| kubeVersion helper and DefaultKubernetesVersion fallback (pkg/engine/engine.go, pkg/commands/talosconfig.go) | Adds kubeVersion(v string) string that trims a leading v and falls back to constants.DefaultKubernetesVersion when empty; applies it to KubeVersion in InitializeConfigBundle and both bundle builds in applyPatchesAndRenderConfig; regenerateTalosconfig passes constants.DefaultKubernetesVersion instead of an empty string to GenerateConfigBundle. |
| Flag wiring and call-site updates (main.go, pkg/commands/apply.go, pkg/commands/template.go, pkg/commands/rotate_ca_handler.go) | Binds --skip-verify flag to commands.SkipVerify; updates GlobalArgs.SkipVerify references to SkipVerify in apply.go and template.go; switches getKubeconfigFromTalos in rotate_ca_handler.go to call WithClient directly. |

Sequence Diagram(s)

sequenceDiagram
  participant CLI as CLI (main.go)
  participant root as pkg/commands/root.go
  participant talosconfig as talosconfig file
  participant talosClient as Talos Client

  CLI->>root: --skip-verify flag → SkipVerify = true
  CLI->>root: WithClientAuto(action)
  root->>root: SkipVerify? → WithClientSkipVerify(action)
  root->>talosconfig: open + decode JSON
  talosconfig-->>root: context config (endpoints, cert, key)
  root->>root: skipVerifyTLSConfig(cert, key) → *tls.Config
  root->>talosClient: New(endpoints, TLS config)
  root->>talosClient: WithNodes(nodes)
  root->>talosClient: action(signalContext, client)
  talosClient-->>CLI: result
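
The skipVerifyTLSConfig step in the diagram can be sketched as follows. This is an illustration under stated assumptions rather than the PR's implementation: the signature is guessed from the diagram, and the certificate and key are assumed to be base64-encoded PEM strings as stored in a talosconfig context.

```go
package main

import (
	"crypto/tls"
	"encoding/base64"
	"fmt"
)

// skipVerifyTLSConfig builds a TLS config that skips server-certificate
// verification but still presents a client certificate when one is given.
// crtB64 and keyB64 are assumed to be base64-encoded PEM.
func skipVerifyTLSConfig(crtB64, keyB64 string) (*tls.Config, error) {
	cfg := &tls.Config{
		MinVersion:         tls.VersionTLS12,
		InsecureSkipVerify: true, // deliberate: this is what --skip-verify requests
	}

	// No client credentials configured: return the bare skip-verify config.
	if crtB64 == "" && keyB64 == "" {
		return cfg, nil
	}

	crtPEM, err := base64.StdEncoding.DecodeString(crtB64)
	if err != nil {
		return nil, fmt.Errorf("decoding client certificate: %w", err)
	}

	keyPEM, err := base64.StdEncoding.DecodeString(keyB64)
	if err != nil {
		return nil, fmt.Errorf("decoding client key: %w", err)
	}

	cert, err := tls.X509KeyPair(crtPEM, keyPEM)
	if err != nil {
		return nil, fmt.Errorf("loading client key pair: %w", err)
	}

	cfg.Certificates = []tls.Certificate{cert}

	return cfg, nil
}

func main() {
	cfg, err := skipVerifyTLSConfig("", "")
	fmt.Println(cfg.InsecureSkipVerify, len(cfg.Certificates), err)

	_, err = skipVerifyTLSConfig("%%%", "%%%")
	fmt.Println(err)
}
```

Only server verification is disabled; the client certificate is still sent, which is how client-certificate authentication is preserved on this path.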

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 Hippity-hop, the fork is gone,
No more patches to carry on!
SkipVerify stands on its own now, free,
With TLS configs parsed carefully.
Kube versions default when empty they are —
A tidy go.mod shining like a star! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: upgrading Talos to v1.14.0-alpha.1. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 93.33% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates Sidero Talos dependencies to v1.14.0-alpha.1 and drops the cozystack/talos fork. To maintain support for the --skip-verify flag, the functionality is reimplemented locally in pkg/commands using a custom TLS configuration. Additionally, client initialization wrappers are updated to pass contexts as required by the new upstream machinery, and default Kubernetes versions are provided where required by Talos v1.14 config generation. Feedback on the changes suggests adding a nil check for configContext in WithClientSkipVerify to prevent a potential nil pointer panic if a context is defined as empty in the configuration.

Comment thread pkg/commands/root.go
Comment on lines +199 to +202
	configContext, ok := cfg.Contexts[contextName]
	if !ok {
		return fmt.Errorf("%w: %q", errContextNotFound, contextName)
	}

medium

If the context exists in the talosconfig but is defined as null or empty, cfg.Contexts[contextName] can return nil with ok being true. Dereferencing configContext later (e.g., accessing configContext.Crt or configContext.Endpoints) will cause a nil pointer panic. Adding a nil check here prevents this potential panic.

Suggested change
-	configContext, ok := cfg.Contexts[contextName]
-	if !ok {
-		return fmt.Errorf("%w: %q", errContextNotFound, contextName)
-	}
+	configContext, ok := cfg.Contexts[contextName]
+	if !ok || configContext == nil {
+		return fmt.Errorf("%w: %q", errContextNotFound, contextName)
+	}
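
The (nil, true) case is easy to reproduce in isolation. The sketch below uses a hypothetical Context type and lookupContext helper as stand-ins for the talosconfig types; only the map-of-pointers behaviour and the guard itself are the point.

```go
package main

import (
	"errors"
	"fmt"
)

// Context is a hypothetical stand-in for a talosconfig context entry.
type Context struct {
	Endpoints []string
	Crt       string
}

var errContextNotFound = errors.New("context not found")

// lookupContext treats a present-but-nil entry the same as a missing one,
// so callers can dereference the result safely.
func lookupContext(contexts map[string]*Context, name string) (*Context, error) {
	c, ok := contexts[name]
	if !ok || c == nil {
		return nil, fmt.Errorf("%w: %q", errContextNotFound, name)
	}

	return c, nil
}

func main() {
	contexts := map[string]*Context{
		"prod":  {Endpoints: []string{"10.0.0.1"}},
		"empty": nil, // what a context defined as null decodes to
	}

	for _, name := range []string{"prod", "empty", "missing"} {
		c, err := lookupContext(contexts, name)
		if err != nil {
			fmt.Println(err)

			continue
		}

		fmt.Println(name, c.Endpoints)
	}
}
```

Without the nil check, the "empty" entry would pass the ok test and panic on the first field access.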

@sircthulhu Kirill Ilin (sircthulhu) marked this pull request as ready for review June 24, 2026 09:51

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/commands/rotate_ca_handler.go (1)

315-324: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Honor --skip-verify when fetching kubeconfig for CA rotation.

This path now always uses WithClient, so users who need --skip-verify for SAN-mismatched Talos endpoints still fail before CA rotation can discover nodes.

Proposed fix
-	err := WithClient(func(ctx context.Context, c *client.Client) error {
+	action := func(ctx context.Context, c *client.Client) error {
 		var err error

 		kubeconfigData, err = c.Kubeconfig(ctx)
 		if err != nil {
 			return errors.Wrap(err, "failed to get kubeconfig")
@@
 
 		return nil
-	})
+	}
+
+	var err error
+	if SkipVerify {
+		err = WithClientSkipVerify(action)
+	} else {
+		err = WithClient(action)
+	}

Based on PR context, call sites should route through the local skip-verify implementation after moving the flag out of GlobalArgs.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/commands/rotate_ca_handler.go` around lines 315 - 324, The kubeconfig
fetch in the CA rotation flow is bypassing the local skip-verify path, so
`--skip-verify` is not honored before node discovery. Update the kubeconfig
retrieval logic in `rotate_ca_handler.go` to route through the existing local
skip-verify implementation instead of always calling `WithClient`, and make sure
the CA rotation path uses the flag now moved out of `GlobalArgs` when calling
`c.Kubeconfig` or its wrapper.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 850479e6-a583-4fe7-bea6-052173563f06

📥 Commits

Reviewing files that changed from the base of the PR and between 2881d54 and f988a2b.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (8)
  • go.mod
  • main.go
  • pkg/commands/apply.go
  • pkg/commands/root.go
  • pkg/commands/rotate_ca_handler.go
  • pkg/commands/talosconfig.go
  • pkg/commands/template.go
  • pkg/engine/engine.go

Comment thread go.mod
 	github.com/cockroachdb/logtags v0.0.0-20230118201751-21c54148d20b // indirect
 	github.com/cockroachdb/redact v1.1.5 // indirect
-	github.com/containerd/containerd/v2 v2.2.2 // indirect
+	github.com/containerd/containerd/v2 v2.3.1 // indirect

🔒 Security & Privacy | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Current pin:"
rg -n 'github.com/containerd/containerd/v2' go.mod

echo
echo "Direct imports of containerd in this repo (if any):"
rg -n --type go '"github.com/containerd/containerd(/v2)?"' .

echo
echo "Fixed versions reported by OSV for flagged advisories:"
for ghsa in GHSA-33vj-92qq-66hc GHSA-cvxm-645q-p574 GHSA-jpcc-p29g-p8mq GHSA-rgh6-rfwx-v388 GHSA-xhf5-7wjv-pqxp; do
  curl -s "https://api.osv.dev/v1/vulns/${ghsa}" | python - "$ghsa" <<'PY'
import json, sys
ghsa = sys.argv[1]
data = json.load(sys.stdin)
fixed = set()
for aff in data.get("affected", []):
    if aff.get("package", {}).get("name") == "github.com/containerd/containerd/v2":
        for r in aff.get("ranges", []):
            for ev in r.get("events", []):
                if "fixed" in ev:
                    fixed.add(ev["fixed"])
print(f"{ghsa}: fixed -> {', '.join(sorted(fixed)) if fixed else 'none reported'}")
PY
done

Repository: cozystack/talm

Length of output: 278


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "go.mod around the containerd pin:"
sed -n '120,155p' go.mod

echo
echo "All containerd mentions in tracked files:"
rg -n --hidden --glob '!**/.git/**' 'containerd' .

echo
echo "All direct imports of github.com/containerd/containerd variants in Go files:"
rg -n --glob '*.go' '"github.com/containerd/containerd(/v2)?' .

echo
echo "Module graph entries that mention containerd (if go list is available):"
go list -m all 2>/dev/null | rg 'containerd' || true

Repository: cozystack/talm

Length of output: 4768


🌐 Web query:

containerd v2.3.1 advisories fixed versions GHSA-33vj-92qq-66hc GHSA-cvxm-645q-p574 GHSA-jpcc-p29g-p8mq GHSA-rgh6-rfwx-v388 GHSA-xhf5-7wjv-pqxp

💡 Result:

Fixed (patched) containerd versions for the specified GHSA advisories:

  • GHSA-33vj-92qq-66hc: fixed in containerd 2.3.2, 2.2.5, 2.1.9 [1][2]
  • GHSA-cvxm-645q-p574: fixed in containerd 2.3.2, 2.2.5, 2.1.9 [3][2]
  • GHSA-jpcc-p29g-p8mq: fixed in containerd 2.3.2, 2.2.5, 2.1.9 [4][2]
  • GHSA-rgh6-rfwx-v388: fixed in containerd 2.3.2, 2.2.5, 2.1.9 [3][2]
  • GHSA-xhf5-7wjv-pqxp: fixed in containerd 2.3.2, 2.2.5, 2.1.9 [3][2]

Note: containerd security advisories list the vulnerabilities and are published by the containerd/containerd GitHub security advisory system [2]. A related AWS bulletin covering the same set of CRI issues also states that upstream fixes are available via containerd patched releases, which correspond to upgrading to the fixed versions above [3].


🏁 Script executed:

#!/bin/bash
set -euo pipefail

python3 - <<'PY'
import json, urllib.request

ghsas = [
    "GHSA-33vj-92qq-66hc",
    "GHSA-cvxm-645q-p574",
    "GHSA-jpcc-p29g-p8mq",
    "GHSA-rgh6-rfwx-v388",
    "GHSA-xhf5-7wjv-pqxp",
]
for ghsa in ghsas:
    url = f"https://api.osv.dev/v1/vulns/{ghsa}"
    try:
        with urllib.request.urlopen(url, timeout=20) as r:
            data = json.load(r)
    except Exception as e:
        print(f"{ghsa}: ERROR {e}")
        continue

    fixed = []
    affected_pkg = []
    for aff in data.get("affected", []):
        pkg = aff.get("package", {}).get("name")
        if pkg:
            affected_pkg.append(pkg)
        for r in aff.get("ranges", []):
            for ev in r.get("events", []):
                if "fixed" in ev:
                    fixed.append(ev["fixed"])
    print(f"{ghsa}: packages={sorted(set(affected_pkg))} fixed={sorted(set(fixed))}")
PY

Repository: cozystack/talm

Length of output: 927


🏁 Script executed:

#!/bin/bash
set -euo pipefail

python3 - <<'PY'
import json, urllib.request

ghsas = [
    "GHSA-33vj-92qq-66hc",
    "GHSA-cvxm-645q-p574",
    "GHSA-jpcc-p29g-p8mq",
    "GHSA-rgh6-rfwx-v388",
    "GHSA-xhf5-7wjv-pqxp",
]

for ghsa in ghsas:
    url = f"https://api.osv.dev/v1/vulns/{ghsa}"
    try:
        with urllib.request.urlopen(url, timeout=20) as r:
            data = json.load(r)
    except Exception as e:
        print(f"{ghsa}: ERROR {e}")
        continue

    pkg_names = []
    fixed_versions = []
    version_ranges = []
    for aff in data.get("affected", []):
        pkg = aff.get("package", {}).get("name")
        if pkg:
            pkg_names.append(pkg)
        for r in aff.get("ranges", []):
            for ev in r.get("events", []):
                if "fixed" in ev:
                    fixed_versions.append(ev["fixed"])
                if "introduced" in ev or "fixed" in ev:
                    version_ranges.append(ev)

    applies = any(p == "github.com/containerd/containerd/v2" for p in pkg_names)
    print(f"{ghsa}: applies_to_containerd_v2={applies} fixed_versions={sorted(set(fixed_versions))}")
PY

Repository: cozystack/talm

Length of output: 927


Upgrade github.com/containerd/containerd/v2 to a patched release
go.mod:140 still pins github.com/containerd/containerd/v2 v2.3.1 indirectly. Upgrade to v2.3.2 or newer (v2.2.5 / v2.1.9 on older branches).

🧰 Tools
🪛 OSV Scanner (2.4.0)

[HIGH] 140-140: github.com/containerd/containerd/v2 2.3.1: containerd CRI checkpoint restore CDI annotation smuggling

(GHSA-33vj-92qq-66hc)


[HIGH] 140-140: github.com/containerd/containerd/v2 2.3.1: containerd: CRI checkpoint import allows local image tag poisoning

(GHSA-cvxm-645q-p574)


[HIGH] 140-140: github.com/containerd/containerd/v2 2.3.1: containerd image-triggered runtime DoS via unbounded group parsing

(GHSA-jpcc-p29g-p8mq)


[HIGH] 140-140: github.com/containerd/containerd/v2 2.3.1: Arbitrary host CRI log file read via symlink following in CRI checkpoint restore

(GHSA-rgh6-rfwx-v388)


[HIGH] 140-140: github.com/containerd/containerd/v2 2.3.1: containerd CRI — image-config LABEL flows to restart-monitor binary:// logger: host-root command execution from an image pull

(GHSA-xhf5-7wjv-pqxp)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@go.mod` at line 140, The dependency on github.com/containerd/containerd/v2 is
still pinned to an unpatched indirect version; update the go.mod requirement to
a fixed release such as v2.3.2 or newer (or the appropriate older-branch patch
version) and then refresh the module metadata so the indirect entry stays
consistent with the resolved dependency set.

Source: Linters/SAST tools

Comment thread pkg/commands/root.go
Comment on lines +209 to +218
	opts := []client.OptionFunc{
		client.WithTLSConfig(tlsConfig),
		client.WithDefaultGRPCDialOptions(),
	}

	if len(GlobalArgs.Endpoints) > 0 {
		opts = append(opts, client.WithEndpoints(GlobalArgs.Endpoints...))
	} else if len(configContext.Endpoints) > 0 {
		opts = append(opts, client.WithEndpoints(configContext.Endpoints...))
	}

🎯 Functional Correctness | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '\n== Files of interest ==\n'
git ls-files pkg/commands/root.go pkg/client* client* 2>/dev/null || true

printf '\n== root.go around lines 180-240 ==\n'
sed -n '180,240p' pkg/commands/root.go

printf '\n== Search for cluster-related client options ==\n'
rg -n "WithCluster|WithEndpoints|SkipVerify|skip-verify|Cluster" pkg -g'*.go'

printf '\n== Locate client option definitions ==\n'
fd -a 'client' . | head -n 50

Repository: cozystack/talm

Length of output: 18683


🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '\n== go.mod Talos dependency ==\n'
rg -n 'siderolabs/talos|talos' go.mod

printf '\n== Go module cache location ==\n'
go env GOPATH GOMODCACHE

printf '\n== Search Talos client package in module cache ==\n'
modcache="$(go env GOMODCACHE)"
rg -n "func WithCluster|WithEndpoints|type OptionFunc|WithDefaultGRPCDialOptions" "$modcache"/github.com/siderolabs/talos* -g'*.go' || true

printf '\n== Search for cluster option names near client option defs ==\n'
rg -n "Cluster" "$modcache"/github.com/siderolabs/talos*"/client" -g'*.go' || true

Repository: cozystack/talm

Length of output: 5144


🏁 Script executed:

#!/bin/bash
set -euo pipefail

modcache="$(go env GOMODCACHE)"

printf '\n== Talos client option WithCluster definition ==\n'
sed -n '130,170p' "$modcache"/github.com/siderolabs/talos/pkg/machinery@v1.14.0-alpha.1/client/options.go

printf '\n== Talosctl global client wrapper ==\n'
sed -n '1,120p' "$modcache"/github.com/siderolabs/talos@v1.14.0-alpha.1/cmd/talosctl/pkg/talos/global/client.go

printf '\n== Search local use of GlobalArgs.Cluster ==\n'
rg -n "GlobalArgs\.Cluster|WithCluster\(" pkg/commands -g'*.go'

Repository: cozystack/talm

Length of output: 5067


Preserve --cluster in WithClientSkipVerify

pkg/commands/root.go:209-218 still rebuilds the client without client.WithCluster(GlobalArgs.Cluster). Talos v1.14 keeps client.WithCluster, and the normal client path forwards it, so --skip-verify --cluster <name> will drop the cluster override here.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/commands/root.go` around lines 209 - 218, The client option assembly in
root.go is missing the cluster override when rebuilding options for the
skip-verify path, so `--cluster` gets dropped. Update the option list in the
client construction logic around `WithTLSConfig`/`WithDefaultGRPCDialOptions` to
also include `client.WithCluster(GlobalArgs.Cluster)` when a cluster is set,
matching the normal client path used elsewhere in the root command handling.

NOT LGTM — new TLS-skip / version-fallback logic ships without tests, and the bump pins a production bootstrap tool to a Talos alpha; both need resolution before merge.

Business context: bump Talos to v1.14 machinery so talm recognizes the new v1.14-only KubeEtcdEncryptionConfig document, dropping the cozystack/talos fork that previously carried --skip-verify.

What was verified and is not a problem (raising so reviewers don't re-litigate):

  • Fork patches preserved. The fork at the pinned commit was upstream v1.12.1 plus exactly two commits: the rotate-ca --k8s-endpoint fix (independently present in upstream v1.14.0-alpha.1 — rotate-ca.go:153, pkg/rotate/pki/kubernetes/kubernetes.go:46/176) and --skip-verify (faithfully reimplemented in pkg/commands/root.go). No carried patch is lost.
  • Scope is cohesive. The skip-verify reimplementation, context threading, k8s.io bump, and explicit-Kubernetes-version handling are all forced consequences of the v1.14 client-API and config/generate changes, not unrelated additions.

Blockers

B1: Version choice — pinning to a Talos alpha

File: go.mod:82,122

Issue: This pins talm (a production bootstrap tool) to v1.14.0-alpha.1, a month-old prerelease, when stable v1.13.5 exists.

Evidence: The sole capability v1.14 adds over v1.13.x is the KubeEtcdEncryptionConfig type — confirmed present only in v1.14.0-alpha.1 (absent in v1.13.5 / v1.13.3). On every other axis stable is ahead: v1.13.5's DefaultKubernetesVersion is 1.36.2 vs the alpha's 1.36.1, and its config schema is frozen while the alpha's KubeEtcdEncryptionConfig can still change before v1.14.0 ships.

Impact: The platform would track an unreleased Talos with a non-final machine-config schema and an older default Kubernetes version than current stable.

Decision needed: Is KubeEtcdEncryptionConfig required before v1.14.0 stable releases? If not, v1.13.5 is the better target. If yes, please state that justification in the PR body so the alpha pin is a deliberate, documented tradeoff.

B2: New logic ships without tests

File: pkg/commands/root.go (skipVerifyTLSConfig, WithClientSkipVerify), pkg/engine/engine.go (kubeVersion)

Issue: The reimplemented --skip-verify path and the new kubeVersion fallback have no unit tests; no existing test references them.

Evidence: grep for skipVerifyTLSConfig / kubeVersion / WithClientSkipVerify across *_test.go returns nothing, despite 33 test files in pkg/commands. go test ./... and the coverage CI job passing only confirm existing tests still pass — they do not cover the added lines.

Impact: skipVerifyTLSConfig disables TLS verification and parses the client cert/key (three error paths); kubeVersion encodes the "fall back to default when version unset" guarantee the PR body promises to preserve. Both are load-bearing and currently unverified.

Fix: Add table tests for kubeVersion (empty -> DefaultKubernetesVersion, "v1.30.0" -> "1.30.0") and for skipVerifyTLSConfig (cert+key present, both absent, malformed base64, nil/empty context). The context-resolution and errContextNotFound paths in WithClientSkipVerify are also worth a test.
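
The kubeVersion cases requested above can be sketched as a table. This is a self-contained illustration, not the code from pkg/engine: the helper body is reconstructed from the walkthrough's description, and defaultKubernetesVersion is a local stand-in for constants.DefaultKubernetesVersion using the value quoted in the PR body.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultKubernetesVersion stands in for constants.DefaultKubernetesVersion;
// the value is the default quoted in the PR body.
const defaultKubernetesVersion = "1.36.1"

// kubeVersion trims a leading "v" and falls back to the default when the
// result is empty. The body is an assumption based on the walkthrough.
func kubeVersion(v string) string {
	v = strings.TrimPrefix(v, "v")
	if v == "" {
		return defaultKubernetesVersion
	}

	return v
}

func main() {
	cases := []struct {
		in   string
		want string
	}{
		{in: "", want: defaultKubernetesVersion},
		{in: "v1.30.0", want: "1.30.0"},
		{in: "1.30.0", want: "1.30.0"},
	}

	for _, c := range cases {
		got := kubeVersion(c.in)
		fmt.Printf("%q -> %q (ok=%t)\n", c.in, got, got == c.want)
	}
}
```

In the repository the same table would live in a _test.go file and report failures through testing.T instead of printing.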

Non-blocking follow-ups

  1. pkg/commands/root.go: cfg.Contexts[contextName] can return (nil, true) if the talosconfig defines a context as null; the later configContext.Crt deref would panic. Carried over from the fork, low severity (needs a malformed config), but a || configContext == nil guard is cheap — and a skipVerifyTLSConfig test should cover the nil case.
  2. docs/manual-test-plan.md was not updated for the --skip-verify reimplementation or the default-Kubernetes-version change; the repo otherwise keeps that plan in sync with behavior changes.
  3. The go.mod comment and PR body state the fork was carried only for --skip-verify; it also carried the rotate-ca --k8s-endpoint fix. Worth a one-line correction so the history is accurate.

@sircthulhu Kirill Ilin (sircthulhu) added the do-not-merge This PR should not be merged label Jun 24, 2026
@sircthulhu

Author

Releasing as alpha does not make much sense, so these changes should be used only for testing. I'll update this PR once Talos v1.14 is released.

Do not merge until Talos v1.14 is released.

@sircthulhu Kirill Ilin (sircthulhu) marked this pull request as draft June 24, 2026 12:55
talm renders the per-node config as a diff over the generated bundle, which
drops the control-plane component images (apiServer/controllerManager/scheduler)
because they equal the bundle default. On Talos v1.14 — which deprecated
cluster.controllerManager/scheduler — patching their extraArgs also drops the
sibling image during the merge. Either way the applied config carries no
component image, so the node fills it from the Talos release default (v1.14 ->
k8s 1.36.x) instead of the cluster's kubernetesVersion, silently skewing
controller-manager/scheduler away from kube-apiserver/kubelet (older Talos
preserved the cluster version, so this surfaced only after the v1.14 bump).

Re-derive the component images from kubernetesVersion + the machinery image
constants and pin them in the rendered config (both diff and --full modes).
Worker configs and an empty kubernetesVersion pass through unchanged; an image
already set (e.g. pinned by the chart) is kept. Adds a regression test.

Assisted-By: Claude AI
Signed-off-by: Kirill Ilin <stitch14@yandex.ru>
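
The pinning rule described in the commit message above can be sketched as a small pure function. Everything named here is an assumption for illustration: componentImages is a hypothetical flattened view of the three image fields, and the repository strings stand in for the machinery image constants.

```go
package main

import (
	"fmt"
	"strings"
)

// Assumed image repositories; the real values come from the machinery
// image constants.
const (
	apiServerRepo         = "registry.k8s.io/kube-apiserver"
	controllerManagerRepo = "registry.k8s.io/kube-controller-manager"
	schedulerRepo         = "registry.k8s.io/kube-scheduler"
)

// componentImages is a hypothetical, flattened view of the image fields.
type componentImages struct {
	APIServer         string
	ControllerManager string
	Scheduler         string
}

// pinComponentImages fills in any missing control-plane image from
// kubernetesVersion. Worker configs and an empty version pass through
// unchanged, and an image that is already set is kept.
func pinComponentImages(cur componentImages, kubernetesVersion string, controlPlane bool) componentImages {
	if !controlPlane || kubernetesVersion == "" {
		return cur
	}

	tag := ":v" + strings.TrimPrefix(kubernetesVersion, "v")

	if cur.APIServer == "" {
		cur.APIServer = apiServerRepo + tag
	}

	if cur.ControllerManager == "" {
		cur.ControllerManager = controllerManagerRepo + tag
	}

	if cur.Scheduler == "" {
		cur.Scheduler = schedulerRepo + tag
	}

	return cur
}

func main() {
	// The diff dropped all three images and the chart pinned none.
	fmt.Printf("%+v\n", pinComponentImages(componentImages{}, "1.35.0", true))

	// An image pinned by the chart survives.
	pinned := componentImages{Scheduler: "example.org/kube-scheduler:v1.35.0"}
	fmt.Printf("%+v\n", pinComponentImages(pinned, "1.35.0", true))
}
```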