After merging !137 (merged) the wordpress helmRelease on staging ended up in a Helm upgrade failed: another operation (install/upgrade/rollback) is in progress state. Suspending and resuming did not help like it usally does for kube-prom-stack.
It's weird, because we don't use the redis in Stackspin. So I wonder if it's even related to the actual upgrade process.
What usually helps is deleting the hr and reconciling the ks, but that is a manual interference. I wonder if there's anything else we can do to prevent this
Note that that particular issue was supposedly solved some time before flux release 0.20, so we shouldn't hit that. Not to say it's not related, especially the comments about helm controller being oom-killed, which we're also experiencing.
root@staging:~# dmesg -T |grep -i oom[Tue Mar 29 09:16:17 2022] helm-controller invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=969[Tue Mar 29 09:16:17 2022] oom_kill_process+0xfa/0x130[Tue Mar 29 09:16:17 2022] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name[Tue Mar 29 09:16:17 2022] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=be294b39c4d720aa5d2ea637b5968a89042c65210e3076705a4305ca620b34b3,mems_allowed=0,oom_memcg=/kubepods/burstable/podff107bca-f961-4f58-a185-ef6106cca48b,task_memcg=/kubepods/burstable/podff107bca-f961-4f58-a185-ef6106cca48b/be294b39c4d720aa5d2ea637b5968a89042c65210e3076705a4305ca620b34b3,task=helm-controller,pid=265741,uid=100[Tue Mar 29 09:16:17 2022] Memory cgroup out of memory: Killed process 265741 (helm-controller) total-vm:2825428kB, anon-rss:2090480kB, file-rss:0kB, shmem-rss:0kB, UID:100 pgtables:4280kB oom_score_adj:969[Tue Mar 29 09:16:17 2022] oom_reaper: reaped process 265741 (helm-controller), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
!134 (merged) was merged 11:02am CEST, which is 90:02 am server time (UTC). I guess 14 mins after merge the helm-controller tried to install the upgrade and OOM'ed.
No, i think this is a nice way to do it! My method is usually flux delete hr -n stackspin-apps wordpress && reconcile ks wordpress, but that feels more brutal than your way.