Pod eviction due to low resources
Adding a big file to Nextcloud, or just running the cluster for a long time, can lead to low disk space. In that case Kubernetes tries to free up resources by evicting pods, which are promptly recreated by their deployments. Since this doesn't actually free up space, the process repeats until the cluster falls apart completely. One aggravating factor is that the error logs grow faster and faster, filling the disk further and leading to even more pods being evicted.
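A quick way to see whether this is happening: evicted pods end up in the Failed phase, so you can list and clean them up with plain kubectl (the namespace below is a placeholder):

    # Evicted pods show up as Failed; list them across all namespaces.
    kubectl get pods --all-namespaces --field-selector=status.phase=Failed

    # Clean them up per namespace; the deployments keep running replacements.
    kubectl delete pods --field-selector=status.phase=Failed -n <namespace>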
In my setup the kubelet log file, located in the rancher-kubelet container, had already grown to 7.1G. It seems that no logrotate mechanism takes care of it.
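To check for the same problem, you can search the node for oversized container log files. This sketch assumes Docker's default json-file log driver, which writes under /var/lib/docker/containers:

    # Find container log files bigger than 1G.
    sudo find /var/lib/docker/containers -name '*-json.log' -size +1G -exec ls -lh {} \;

    # Truncating (rather than deleting) empties the file while keeping the
    # file handle that Docker holds open valid.
    sudo truncate -s 0 /var/lib/docker/containers/<container-id>/<container-id>-json.log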
Deleting the log file didn't really help in my case. I had to restart the cluster with RKE and then reinstall some applications, because they didn't start properly while local-storage was unavailable during start-up.
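For reference, restarting the cluster with RKE just means running rke up against the existing cluster configuration again; the file name below is RKE's default and may differ in your setup:

    # Re-run RKE against the existing cluster config to restart/reconcile it.
    rke up --config cluster.yml

    # Pods that got stuck because local-storage was missing at start-up can
    # be deleted afterwards; their deployments will recreate them.
    kubectl delete pod <stuck-pod> -n <namespace>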
In the end I added two extra args to the kubelet by modifying the cluster group vars:
    case@localhost [03:52:43 PM] [~/src/openappstack/clusters/test] [v0.3 *]
    -> % cat group_vars/all/settings.yml
    acme_staging: false
    admin_email: firstname.lastname@example.org
    cluster_dir: /home/case/src/openappstack/clusters/test
    domain: oas.example.net
    ip_address: 18.104.22.168
    local_flux: true
    release_name: test
    rke_custom_config:
      services:
        kubelet:
          extra_args:
            eviction-hard: "memory.available<100Mi,nodefs.available<1Gi,imagefs.available<1Gi"
            eviction-minimum-reclaim: "memory.available=0Mi,nodefs.available=0Mi,imagefs.available=0Gi"
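With eviction-hard set like this, the kubelet only starts evicting once free memory drops below 100Mi or free disk below 1Gi, and eviction-minimum-reclaim at 0 makes it reclaim just enough to get back above those thresholds. After applying the change you can check that the flags actually reached the kubelet; on an RKE node it runs as a Docker container named kubelet:

    # Verify the kubelet container was started with the new eviction flags.
    docker inspect kubelet | grep eviction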