Varac authored
Troubleshooting
Note: the cluster$ prompt indicates that a command should be run as root on your OAS cluster.
Upgrading
If you encounter problems when upgrading your cluster, first make sure that every new setting from ansible/group_vars/all/settings.yml.example is also present in your clusters/YOUR_CLUSTERNAME/group_vars/all/settings.yml, then rerun the installation script.
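One way to spot settings that are missing after an upgrade is to compare the top-level keys of the two files. The following is a minimal sketch, not part of OAS: the function names top_level_keys and missing_settings are hypothetical, and it assumes that settings are plain unindented "key: value" lines. Nested keys and commented-out examples are not inspected.

```shell
# Print the unindented keys of a YAML file, one per line, sorted.
# Assumption: top-level settings look like "key: value" with no leading spaces.
top_level_keys() {
  grep -E '^[A-Za-z_][A-Za-z0-9_-]*:' "$1" | cut -d: -f1 | sort -u
}

# Print every top-level key that exists in the example file ($1)
# but not in your own settings file ($2).
missing_settings() {
  top_level_keys "$1" | while read -r key; do
    grep -q "^${key}:" "$2" || echo "$key"
  done
}
```

For example, run from the root of your OAS checkout: missing_settings ansible/group_vars/all/settings.yml.example clusters/YOUR_CLUSTERNAME/group_vars/all/settings.yml. Any key it prints is a candidate to copy over before rerunning the installation script.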
HTTPS Certificates
OAS uses cert-manager to automatically fetch Let's Encrypt certificates for all deployed services. If you experience invalid SSL certificates (e.g. your browser warns you when visiting Nextcloud at https://files.YOUR.CLUSTER.DOMAIN), here's how to debug this:
Did you create your cluster using the --acme-staging argument?
Please check the resulting value of the acme_staging key in clusters/YOUR_CLUSTERNAME/group_vars/all/settings.yml. If this is set to true, certificates are fetched from the Let's Encrypt staging API, and browsers do not trust staging certificates by default.
Are all cert-manager pods in the oas namespace in the READY state?
cluster$ kubectl -n oas get pods | grep cert-manager
Are there any cm-acme-http-solver-* pods still running, indicating that there are unfinished certificate requests?
cluster$ kubectl get pods --all-namespaces | grep cm-acme-http-solver
Show the logs of the main cert-manager pod:
cluster$ kubectl -n oas logs -l "app.kubernetes.io/name=cert-manager"
You can grep for your cluster domain or for any specific subdomain to narrow down results.
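The first two checks above can be scripted. This is a rough sketch under stated assumptions rather than an official OAS tool: the helper names acme_staging_value and not_ready_pods are hypothetical, the settings file is assumed to hold acme_staging as an unindented "key: value" line, and the pod filter expects the default column layout of kubectl get pods for a single namespace (NAME, READY, STATUS, ...).

```shell
# Print the value of the top-level acme_staging key in a settings file.
# Anything after the colon is printed as-is, including a trailing comment.
acme_staging_value() {
  sed -n 's/^acme_staging:[[:space:]]*//p' "$1"
}

# Read "kubectl get pods" output on stdin and print the names of pods whose
# READY column (for example "0/1") shows fewer ready containers than expected.
# The header line, if present, is skipped.
not_ready_pods() {
  awk '$1 != "NAME" { split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}
```

On the cluster, kubectl -n oas get pods | grep cert-manager | not_ready_pods prints nothing when every cert-manager pod is ready. Note that pods which have run to completion also report fewer ready containers, so a listed pod is a hint to look closer, not proof of a failure.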
Purge OAS and install from scratch
If things ever fail beyond recovery, here's how to completely purge an OAS installation and start from scratch:
cluster$ apt purge docker-ce-cli containerd.io
cluster$ mount | grep -E '^(tmpfs.*kubelet|nsfs.*docker)' | cut -d' ' -f 3 | xargs umount
cluster$ systemctl reboot
cluster$ rm -rf /var/lib/docker /var/lib/OpenAppStack /etc/kubernetes /var/lib/etcd /var/lib/rancher /var/lib/kubelet /var/log/OpenAppStack /var/log/containers /var/log/pods
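Because the unmount step acts on whatever the filter matches, it can help to preview the selected mount points first. The sketch below wraps the same filter in a hypothetical helper, select_mount_points, which only prints the mount points and unmounts nothing. It assumes the usual mount output format, "DEVICE on MOUNTPOINT type FSTYPE (OPTIONS)", so that the third space-separated field is the mount point.

```shell
# Read "mount" output on stdin and print the mount points of tmpfs mounts
# related to kubelet and nsfs mounts related to docker. Nothing is unmounted.
select_mount_points() {
  grep -E '^(tmpfs.*kubelet|nsfs.*docker)' | cut -d' ' -f 3
}
```

Running mount | select_mount_points on the cluster lists exactly the paths that the unmount command above would pass to umount.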