diff --git a/docs/installation_instructions.rst b/docs/installation_instructions.rst index 7bc503988849062232489ad72f278bfb625107c9..a3895dd2241bb3cd0b8a5842dafa1a30187fe998 100644 --- a/docs/installation_instructions.rst +++ b/docs/installation_instructions.rst @@ -318,9 +318,11 @@ OpenAppStack. $ python -m openappstack oas.example.org install -This will take a few minutes. It installs k3s, a lightweight Kubernetes. `Flux -<https://fluxcd.io>`__ is installed to manage applications and keep them -updated automatically. + +This will take a few minutes. It installs k3s, a lightweight Kubernetes. `Flux`_ +is installed to manage applications and keep them updated automatically. + +.. _flux: https://fluxcd.io In the future, we will add commands that show you the status of the application installation. For now, just wait half an hour for everything @@ -383,3 +385,14 @@ Step 6: Validate setup Because OpenAppStack is still under development, we would like you to follow our `testing instructions <testing_instructions.html>`__ to make sure that the setup process went well. + +Step 7: Let us know! +~~~~~~~~~~~~~~~~~~~~ + +We would love to hear about your experience installing OpenAppStack. If you +encountered any problems, please create an issue in our `issue tracker +<https://open.greenhost.net/groups/openappstack/-/issues>`__. If you didn't +please still reach out as described on our `contact page +<https://openappstack.net/contact.html>`__ and tell us how you like OpenAppStack +so far. We want to be in communication with our users, and we want to help you +if you run into problems. diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md deleted file mode 100644 index b553a23559f4baf37a39a42e8bd5782b4f7e60d5..0000000000000000000000000000000000000000 --- a/docs/troubleshooting.md +++ /dev/null @@ -1,266 +0,0 @@ -# Troubleshooting - -If you run into problems, there are a few things you can do to research the -problem. This document describes what you can do. - -> **NOTE:** `cluster$` indicates that the commands should be run as root on your -> OAS machine. - -## Known issues - -Take a look if the problem you have encountered is already in our [issue -tracker](https://open.greenhost.net/groups/openappstack/-/issues). - -## Run the cli tests - -To get an overall status of your cluster you can run the tests from the -command line. - -There are two types of tests: [testinfra](https://testinfra.readthedocs.io/en/latest/) -tests, and [taiko](https://taiko.dev) tests. - -### Testinfra tests - -Testinfra tests are split into two groups, lets call them `blackbox` and -`clearbox` tests. The blackbox tests run on your provisioning machine and test -the OAS cluster from the outside. For example, the certificate check will check -if the OAS will return valid certificates for the provided services. -The clearbox tests run on the OAS host and check i.e. if docker is installed -in the right version etc. - -First, enter the `test` directory in the Git repository on your provisioning -machine. 
- - cd test - -To run the test against your cluster, first export the `CLUSTER_DIR` environment -variabel with the location of your cluster config directory: - - export CLUSTER_DIR="../clusters/CLUSTERNAME" - -Run all tests: - - py.test -s --ansible-inventory=${CLUSTER_DIR}/inventory.yml --hosts='ansible://*' - -Test all applications, that will check for: - - * proper certificate - * helm release successfully installed - * all app pods are running and healthy - -``` -pytest -s -m 'app' --connection=ansible --ansible-inventory=${CLUSTER_DIR}/inventory.yml --hosts='ansible://*' -``` - -Test a specific application: - - pytest -s -m 'app' --app="wordpress" --connection=ansible --ansible-inventory=${CLUSTER_DIR}/inventory.yml --hosts='ansible://*' - - -#### Known Issues - -- Default ssh backend for testinfra tests is `paramiko`, which doesn't work oout - of the box. It fails to connect to the host because the `ed25519` hostkey was - not verified. Therefore we need to force plain ssh:// with either - `connection=ssh` or `--hosts=ssh://…` - -### Taiko tests - -Taiko tests run in a browser and test if all the interfaces are up -and running and correctly connected to each other. They are integrated in the -`openappstack` CLI command suite. - -#### Prerequisites - -Install [taiko](https://taiko.dev): - - npm install -g taiko - -#### Usage - -To run all taiko tests, run the following command in this repository: - - python -m openappstack CLUSTERNAME test - -In the future, this command will run all tests, but now only *taiko* is -implemented. To learn more about the `test` subcommand, run: - - python -m openappstack CLUSTERNAME test --help - -You can also only run a taiko test for a specific application, i.e.: - - python -m openappstack CLUSTERNAME test --taiko-tags nextcloud - -### Advanced usage - -#### Testinfra tests - -Specify host manually: - - py.test -s --hosts='ssh://root@example.openappstack.net' - -Run only tests tagged with `prometheus`: - - py.test -s --ansible-inventory=${CLUSTER_DIR}/inventory.yml --hosts='ansible://*' -m prometheus - -Run cert test manually using the ansible inventory file: - - py.test -s --ansible-inventory=${CLUSTER_DIR}/inventory.yml --hosts='ansible://*' -m certs - -Run cert test manually against a different cluster, not configured in any -ansible inventory file, either by using pytest: - - FQDN='example.openappstack.net' py.test -sv -m 'certs' - -or directly: - - FQDN='example.openappstack.net' pytest/test_certs.py - -#### Running testinfra tests with local gitlab-runner docker executor - -Export the following environment variables like this: - - export CI_REGISTRY_IMAGE='open.greenhost.net:4567/openappstack/openappstack' - export SSH_PRIVATE_KEY="$(cat ~/.ssh/id_ed25519_oas_ci)" - export COSMOS_API_TOKEN='…' - -then: - - gitlab-runner exec docker --env CI_REGISTRY_IMAGE="$CI_REGISTRY_IMAGE" --env SSH_PRIVATE_KEY="$SSH_PRIVATE_KEY" --env COSMOS_API_TOKEN="$COSMOS_API_TOKEN" bootstrap - -#### Taiko tests - -##### Using Taiko without the OpenAppStack CLI - -Go to the `test/taiko` directory and run: - -For nextcloud & onlyoffice tests: - - export DOMAIN='oas.example.net' - export SSO_USERNAME='user1' - export SSO_USER_PW='...' - export TAIKO_TESTS='nextcloud' - taiko --observe taiko-tests.js - -You can replace `nextcloud` with `grafana` or `wordpress` to test the other -applications, or with `all` to test all applications. - -## SSH access - -You can SSH login to your VPS. 
Some programs that are available to the root user -on the VPS: - -* `kubectl`, the Kubernetes control program. The root user is connected to the - cluster automatically. -* `helm` is the "Kubernetes package manager". Use i.e. `helm ls --all-namespaces` - to see what apps are installed in your cluster. You can also use it to perform - manual upgrades; see `helm --help`. - -## Using kubectl to debug your cluster - -You can use `kubectl`, the Kubernetes control program, to find and manipulate -your Kubernetes cluster. Once you have installed `kubectl`, to get access to your -cluster with the OAS CLI: - - $ python -m openappstack oas.example.org info - -Look for these lines: - - To use kubectl with this cluster, copy-paste this in your terminal: - - export KUBECONFIG=/home/you/projects/openappstack/clusters/oas.example.org/secrets/kube_config_cluster.yml - -Copy the whole `export` line into your terminal. In *the same terminal window*, -kubectl will connect to your cluster. - -## HTTPS Certificates - -OAS uses [cert-manager](http://docs.cert-manager.io/en/latest/) to automatically -fetch [Let's Encrypt](https://letsencrypt.org/) certificates for all deployed -services. If you experience invalid SSL certificates, i.e. your browser warns you -when visiting Rocketchat (`https://chat.example.org`), here's how to -debug this. A useful resource for troubleshooting is also the official cert-manager -[Troubleshooting Issuing ACME Certificates](https://cert-manager.io/docs/faq/acme/) -documentation. - -In this example we fix a failed certificate request for `chat.example.org`. -We will start by checking if `cert-manager` is set up correctly. - -Did you create your cluster using the `--acme-staging` argument? -Please check the resulting value of the `acme_staging` key in -`clusters/YOUR_CLUSTERNAME/group_vars/all/settings.yml`. If this is set to `true`, certificates -are fetched from the [Let's Encrypt staging API](https://letsencrypt.org/docs/staging-environment/), -which can't be validated by default in your browser. - -Are all cert-manager pods in the `oas` namespace in the `READY` state ? - - $ kubectl -n oas get pods | grep cert-manager - -Are there any `cm-acme-http-solver-*` pods still running, indicating that there -are unfinished certificate requests ? - - $ kubectl get pods --all-namespaces | grep cm-acme-http-solver - -Show the logs of the main `cert-manager` pod: - - $ kubectl -n oas logs -l "app.kubernetes.io/name=cert-manager" - -You can `grep` for your cluster domain or for any specific subdomain to narrow -down results. - -Query for failed certificates, -requests, challenges or orders: - - $ kubectl get --all-namespaces certificate,certificaterequest,challenge,order | grep -iE '(false|pending)' - oas-apps certificate.cert-manager.io/oas-rocketchat False oas-rocketchat 15h - oas-apps certificaterequest.cert-manager.io/oas-rocketchat-2045852889 False 15h - oas-apps challenge.acme.cert-manager.io/oas-rocketchat-2045852889-1775447563-837515681 pending chat.example.org 15h - oas-apps order.acme.cert-manager.io/oas-rocketchat-2045852889-1775447563 pending 15h - -We see that the Rocketchat certificate resources are in a bad state since 15h. 
-
-Show certificate resource status message:
-
-    $ kubectl -n oas-apps get certificate oas-rocketchat -o jsonpath="{.status.conditions[*]['message']}"
-    Waiting for CertificateRequest "oas-rocketchat-2045852889" to complete
-
-We see that the `certificate` is waiting for the `certificaterequest`, lets
-query it's status message:
-
-    $ kubectl -n oas-apps get certificaterequest oas-rocketchat-2045852889 -o jsonpath="{.status.conditions[*]['message']}"
-    Waiting on certificate issuance from order oas-apps/oas-rocketchat-2045852889-1775447563: "pending"
-
-Show the related order resource and look at the status and events:
-
-    kubectl -n oas-apps describe order oas-rocketchat-2045852889-1775447563
-
-Show the failed challenge resource reason:
-
-    $ kubectl -n oas-apps get challenge oas-rocketchat-2045852889-1775447563-837515681 -o jsonpath='{.status.reason}'
-    Waiting for http-01 challenge propagation: wrong status code '503', expected '200'
-
-In this example, deleting the challenge fixed the issue and a proper certificate
-could get fetched:
-
-    $ kubectl -n oas-apps delete challenges.acme.cert-manager.io oas-rocketchat-2045852889-1775447563-837515681
-
-
-## Application installation fails
-
-Find applications that fail to install:
-
-    helm ls --all-namespaces | grep -i -v DEPLOYED
-    kubectl get helmreleases --all-namespaces | grep -i -v DEPLOYED
-
-Especially the nextcloud installation process is brittle and error-prone.
-Lets take it as an example how to debug the root cause.
-
-
-## Purge OAS and install from scratch
-
-If ever things fail beyond possible recovery, here's how to completely purge an OAS installation in order to start from scratch:
-
-    cluster$ /usr/local/bin/k3s-killall.sh
-    cluster$ systemctl disable k3s
-    cluster$ mount | egrep '(kubelet|nsfs|k3s)' | cut -d' ' -f 3 | xargs --no-run-if-empty -n 1 umount
-    cluster$ rm -rf /var/lib/{rancher,OpenAppStack,kubelet,cni,docker,etcd} /etc/{kubernetes,rancher} /var/log/{OpenAppStack,containers,pods} /tmp/k3s /etc/systemd/system/k3s.service
-    cluster$ systemctl reboot
diff --git a/docs/troubleshooting.rst b/docs/troubleshooting.rst
new file mode 100644
index 0000000000000000000000000000000000000000..35db24398839b80be386c63411e99547056ba67e
--- /dev/null
+++ b/docs/troubleshooting.rst
@@ -0,0 +1,418 @@
+Troubleshooting
+===============
+
+If you run into problems, there are a few things you can do to research the
+problem. This document describes what you can do.
+
+.. note::
+   ``cluster$`` indicates that the commands should be run as root on your
+   OAS machine.
+
+**We would love to hear from you!** If you have problems, please create an issue
+in our `issue tracker
+<https://open.greenhost.net/groups/openappstack/-/issues>`__ or reach out as
+described on our `contact page <https://openappstack.net/contact.html>`__. We
+want to be in communication with our users, and we want to help you if you run
+into problems.
+
+Known issues
+------------
+
+If you run into a problem, please check our `issue
+tracker <https://open.greenhost.net/groups/openappstack/-/issues>`__ to see if
+others have run into the same problem. We might have suggested a workaround or
+temporary solution in one of our issues. If your problem is not described in an
+issue, please open a new one so we can solve the problems you encounter.
+
+Run the CLI tests
+-----------------
+
+To get an overall status of your cluster you can run the tests from the
+command line.
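+
+If you just want a quick impression of the cluster state before running the
+full test suite, you can also list pods that are not in a healthy state with
+``kubectl`` (see "Using kubectl to debug your cluster" below for how to set it
+up). This is not part of the test suite, just a quick sanity check; pods in the
+``Completed`` state are fine and are filtered out here as well:
+
+.. code:: bash
+
+   kubectl get pods --all-namespaces | grep -vE '(Running|Completed)'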
+
+There are two types of tests: `testinfra
+<https://testinfra.readthedocs.io/en/latest/>`__ tests, and `Taiko
+<https://taiko.dev>`__ tests.
+
+Testinfra tests
+~~~~~~~~~~~~~~~
+
+Testinfra tests are split into two groups, let's call them *blackbox* and
+*clearbox* tests. The blackbox tests run on your provisioning machine and test
+the OAS cluster from the outside. For example, the certificate check will check
+if the OAS returns valid certificates for the provided services.
+The clearbox tests run on the OAS host and check, for example, whether Docker
+is installed in the right version. Our testinfra tests are a combination of
+blackbox and clearbox tests.
+
+First, enter the ``test`` directory in the Git repository **on your
+provisioning machine**.
+
+.. code:: bash
+
+   cd test
+
+To run the tests against your cluster, first export the ``CLUSTER_DIR``
+environment variable with the location of your cluster config directory
+(replace ``oas.example.org`` with your cluster name):
+
+.. code:: bash
+
+   export CLUSTER_DIR="../clusters/oas.example.org"
+
+Run all tests
+'''''''''''''
+
+.. code:: bash
+
+   py.test -s --ansible-inventory=${CLUSTER_DIR}/inventory.yml --hosts='ansible://*'
+
+Test all applications
+'''''''''''''''''''''
+
+This will check that:
+
+* The applications return proper certificates
+* All helm releases are successfully installed
+* All app pods are running and healthy
+
+These tests include all optional applications and will fail for optional
+applications that are not installed.
+
+.. code:: bash
+
+   pytest -s -m 'app' --connection=ansible --ansible-inventory=${CLUSTER_DIR}/inventory.yml --hosts='ansible://*'
+
+Test a specific application
+'''''''''''''''''''''''''''
+
+.. code:: bash
+
+   pytest -s -m 'app' --app="wordpress" --connection=ansible --ansible-inventory=${CLUSTER_DIR}/inventory.yml --hosts='ansible://*'
+
+Known Issues
+''''''''''''
+
+The default SSH backend for testinfra tests is ``paramiko``, which doesn't work
+out of the box. It fails to connect to the host because the ``ed25519`` hostkey
+was not verified. Therefore we need to force plain ``ssh://`` with either
+``--connection=ssh`` or ``--hosts=ssh://…``.
+
+Taiko tests
+~~~~~~~~~~~
+
+Taiko tests run in a browser and test whether all the interfaces are up
+and running and correctly connected to each other. They are integrated in the
+``openappstack`` CLI command suite.
+
+Prerequisites
+'''''''''''''
+
+Install `Taiko <https://taiko.dev>`__ on your provisioning machine:
+
+.. code:: bash
+
+   npm install -g taiko
+
+Run Taiko tests
+'''''''''''''''
+
+To run all Taiko tests, run the following command in this repository:
+
+.. code:: bash
+
+   python -m openappstack CLUSTERNAME test
+
+To learn more about the ``test`` subcommand, run:
+
+.. code:: bash
+
+   python -m openappstack CLUSTERNAME test --help
+
+You can also run the Taiko tests for a specific application only, e.g.:
+
+.. code:: bash
+
+   python -m openappstack CLUSTERNAME test --taiko-tags nextcloud
+
+Advanced usage
+--------------
+
+Testinfra tests
+~~~~~~~~~~~~~~~
+
+Specify the host manually:
+
+.. code:: bash
+
+   py.test -s --hosts='ssh://root@example.openappstack.net'
+
+Run only tests tagged with ``prometheus``:
+
+.. code:: bash
+
+   py.test -s --ansible-inventory=${CLUSTER_DIR}/inventory.yml --hosts='ansible://*' -m prometheus
+
+Run the cert test manually using the ansible inventory file:
+
+.. code:: bash
+
+   py.test -s --ansible-inventory=${CLUSTER_DIR}/inventory.yml --hosts='ansible://*' -m certs
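+
+If you are not sure which test markers (like ``app``, ``prometheus`` or
+``certs``) are available, you can use pytest's standard collection options to
+inspect the test suite without running it, for example:
+
+.. code:: bash
+
+   # List the tests that would run, without executing them:
+   py.test --collect-only -q --ansible-inventory=${CLUSTER_DIR}/inventory.yml --hosts='ansible://*'
+   # List the registered markers:
+   py.test --markers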
+
+Run the cert test manually against a different cluster that is not configured
+in any ansible inventory file, either by using pytest:
+
+.. code:: bash
+
+   FQDN='example.openappstack.net' py.test -sv -m 'certs'
+
+or directly:
+
+.. code:: bash
+
+   FQDN='example.openappstack.net' pytest/test_certs.py
+
+Running Testinfra tests with a local gitlab-runner Docker executor
+''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
+
+Export the following environment variables like this:
+
+.. code:: bash
+
+   export CI_REGISTRY_IMAGE='open.greenhost.net:4567/openappstack/openappstack'
+   export SSH_PRIVATE_KEY="$(cat ~/.ssh/id_ed25519_oas_ci)"
+   export COSMOS_API_TOKEN='…'
+
+then run:
+
+.. code:: bash
+
+   gitlab-runner exec docker --env CI_REGISTRY_IMAGE="$CI_REGISTRY_IMAGE" --env SSH_PRIVATE_KEY="$SSH_PRIVATE_KEY" --env COSMOS_API_TOKEN="$COSMOS_API_TOKEN" bootstrap
+
+Taiko tests
+~~~~~~~~~~~
+
+Using Taiko without the OpenAppStack CLI
+''''''''''''''''''''''''''''''''''''''''
+
+Go to the ``test/taiko`` directory. For the Nextcloud and ONLYOFFICE tests,
+run:
+
+.. code:: bash
+
+   export DOMAIN='oas.example.net'
+   export SSO_USERNAME='user1'
+   export SSO_USER_PW='...'
+   export TAIKO_TESTS='nextcloud'
+   taiko --observe taiko-tests.js
+
+You can replace ``nextcloud`` with ``grafana`` or ``wordpress`` to test the
+other applications, or with ``all`` to test all applications.
+
+SSH access
+----------
+
+You can SSH login to your VPS. Some programs that are available to the root
+user on the VPS:
+
+* ``kubectl``, the Kubernetes control program. The root user is connected to
+  the cluster automatically.
+* ``helm`` is the "Kubernetes package manager". Use, for example,
+  ``helm ls --all-namespaces`` to see which apps are installed in your cluster.
+  You can also use it to perform manual upgrades; see ``helm --help``.
+* ``flux`` is the `flux`_ command line tool.
+
+.. _flux: https://fluxcd.io
+
+Using kubectl to debug your cluster
+-----------------------------------
+
+You can use ``kubectl``, the Kubernetes control program, to inspect and
+manipulate your Kubernetes cluster. Once you have installed ``kubectl``, get
+access to your cluster with the OAS CLI:
+
+.. code:: bash
+
+   $ python -m openappstack oas.example.org info
+
+Look for these lines:
+
+.. code::
+
+   To use kubectl with this cluster, copy-paste this in your terminal:
+   export KUBECONFIG=/home/you/projects/openappstack/clusters/oas.example.org/kube_config_cluster.yml
+
+Copy the whole ``export`` line into your terminal. In *the same terminal
+window*, ``kubectl`` will connect to your cluster.
+
+HTTPS Certificates
+------------------
+
+OAS uses `cert-manager <https://docs.cert-manager.io/en/latest/>`__ to
+automatically fetch `Let's Encrypt <https://letsencrypt.org/>`__ certificates
+for all deployed services. If you experience invalid SSL certificates, e.g.
+your browser warns you when visiting Rocketchat (https://chat.oas.example.org),
+a useful resource for troubleshooting is the official cert-manager
+`Troubleshooting Issuing ACME Certificates
+<https://cert-manager.io/docs/faq/acme/>`__ documentation.
+
+In this example we fix a failed certificate request for
+*https://chat.oas.example.org*. We will start by checking if ``cert-manager``
+is set up correctly.
+
+First, is your cluster using the live ACME server?
+
+.. code:: bash
+
+   $ kubectl get clusterissuers -o yaml | grep 'server:'
+
+This should return ``server: https://acme-v02.api.letsencrypt.org/directory``
+and not something with the word *staging* in it.
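+
+You can also inspect the issuer resource itself. These are standard kubectl
+commands; the issuer name below is a placeholder, use the name printed by the
+first command:
+
+.. code:: bash
+
+   $ kubectl get clusterissuers
+   $ kubectl describe clusterissuer <issuer-name>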
+
+Are all cert-manager pods in the ``cert-manager`` namespace in the ``READY``
+state?
+
+.. code:: bash
+
+   $ kubectl -n cert-manager get pods
+
+Cert-manager uses a "custom resource" to keep track of your certificates, so
+you can also check the status of your certificates by running the command
+below. It returns the certificates for all applications on your system; the
+example output shows healthy certificates.
+
+.. code:: bash
+
+   $ kubectl get certificates -A
+   NAMESPACE   NAME                           READY   SECRET                         AGE
+   oas         hydra-public.tls               True    hydra-public.tls               14d
+   oas         single-sign-on-userpanel.tls   True    single-sign-on-userpanel.tls   14d
+   oas-apps    oas-nextcloud-files            True    oas-nextcloud-files            14d
+   oas-apps    oas-nextcloud-office           True    oas-nextcloud-office           14d
+   oas         grafana-tls                    True    grafana-tls                    13d
+   oas         alertmanager-tls               True    alertmanager-tls               13d
+   oas         prometheus-tls                 True    prometheus-tls                 13d
+
+If there are problems, you can check for the specific ``certificaterequests``:
+
+.. code:: bash
+
+   $ kubectl get certificaterequests -A
+
+If you still need more information, you can dig into the logs of the
+``cert-manager`` pod:
+
+.. code:: bash
+
+   $ kubectl -n oas logs -l "app.kubernetes.io/name=cert-manager"
+
+You can ``grep`` for your cluster domain or for any specific subdomain to
+narrow down the results.
+
+Example
+'''''''
+
+Query for failed certificates, certificate requests, challenges or orders:
+
+.. code:: bash
+
+   $ kubectl get --all-namespaces certificate,certificaterequest,challenge,order | grep -iE '(false|pending)'
+   oas-apps    certificate.cert-manager.io/oas-rocketchat                                       False     oas-rocketchat         15h
+   oas-apps    certificaterequest.cert-manager.io/oas-rocketchat-2045852889                     False                            15h
+   oas-apps    challenge.acme.cert-manager.io/oas-rocketchat-2045852889-1775447563-837515681    pending   chat.oas.example.org   15h
+   oas-apps    order.acme.cert-manager.io/oas-rocketchat-2045852889-1775447563                  pending                          15h
+
+We see that the Rocketchat certificate resources have been in a bad state for
+15 hours.
+
+Show the certificate resource status message:
+
+.. code:: bash
+
+   $ kubectl -n oas-apps get certificate oas-rocketchat -o jsonpath="{.status.conditions[*]['message']}"
+   Waiting for CertificateRequest "oas-rocketchat-2045852889" to complete
+
+We see that the ``certificate`` is waiting for the ``certificaterequest``, so
+let's query its status message:
+
+.. code:: bash
+
+   $ kubectl -n oas-apps get certificaterequest oas-rocketchat-2045852889 -o jsonpath="{.status.conditions[*]['message']}"
+   Waiting on certificate issuance from order oas-apps/oas-rocketchat-2045852889-1775447563: "pending"
+
+Show the related order resource and look at the status and events:
+
+.. code:: bash
+
+   $ kubectl -n oas-apps describe order oas-rocketchat-2045852889-1775447563
+
+Show the failed challenge resource reason:
+
+.. code:: bash
+
+   $ kubectl -n oas-apps get challenge oas-rocketchat-2045852889-1775447563-837515681 -o jsonpath='{.status.reason}'
+   Waiting for http-01 challenge propagation: wrong status code '503', expected '200'
+
+In this example, deleting the challenge fixed the issue and a proper
+certificate could be fetched:
+
+.. code:: bash
+
+   $ kubectl -n oas-apps delete challenges.acme.cert-manager.io oas-rocketchat-2045852889-1775447563-837515681
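+
+After deleting the challenge, cert-manager creates a new one and retries the
+validation. If you want to follow along, you can watch the certificate from
+this example until it reaches the ``READY`` state (``-w`` is kubectl's standard
+watch flag, stop it with ``Ctrl+C``):
+
+.. code:: bash
+
+   $ kubectl -n oas-apps get certificate oas-rocketchat -w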
+
+Application installation or upgrade failures
+--------------------------------------------
+
+Application installations and upgrades are managed by `flux`_. Flux uses
+``helm-controller`` to install and upgrade applications with Helm charts.
+
+An application installed with Flux consists of a ``kustomization``. This is a
+resource that defines where the information about the application is stored in
+our Git repository. The ``kustomization`` contains a ``helmrelease``, which is
+an object that represents an installation of a Helm chart. Read more about the
+difference between ``kustomizations`` and ``helmreleases`` in the `flux
+documentation <https://fluxcd.io/docs>`__.
+
+To find out if all ``kustomizations`` have been applied correctly, run the
+following Flux command on your cluster:
+
+.. code:: bash
+
+   cluster$ flux get kustomizations -A
+
+If all your ``kustomizations`` are in a ``Ready`` state, take a look at your
+``helmreleases``:
+
+.. code:: bash
+
+   cluster$ flux get helmreleases -A
+
+Often, you can resolve complications with ``kustomizations`` or
+``helmreleases`` by telling Flux to *reconcile* them:
+
+.. code:: bash
+
+   cluster$ flux reconcile helmrelease nextcloud
+
+This makes sure that the Nextcloud ``helmrelease`` is brought back into the
+state that OpenAppStack wants it to be in.
+
+Purge OAS and install from scratch
+----------------------------------
+
+If ever things fail beyond possible recovery, here's how to completely purge an
+OAS installation in order to start from scratch:
+
+.. warning::
+
+   **You will lose all your data!** This completely destroys OpenAppStack and
+   takes everything offline. If you choose to do this, you will need to
+   re-install OpenAppStack and make sure that your data is stored somewhere
+   other than the VPS that runs OpenAppStack.
+
+.. code:: bash
+
+   cluster$ /usr/local/bin/k3s-killall.sh
+   cluster$ systemctl disable k3s
+   cluster$ mount | egrep '(kubelet|nsfs|k3s)' | cut -d' ' -f 3 | xargs --no-run-if-empty -n 1 umount
+   cluster$ rm -rf /var/lib/{rancher,OpenAppStack,kubelet,cni,docker,etcd} /etc/{kubernetes,rancher} /var/log/{OpenAppStack,containers,pods} /tmp/k3s /etc/systemd/system/k3s.service
+   cluster$ systemctl reboot
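+
+After the reboot you can check that the old installation is really gone. Both
+commands below only look at the mounts and directories used in the purge steps
+above and should normally return nothing:
+
+.. code:: bash
+
+   cluster$ mount | egrep '(kubelet|nsfs|k3s)'
+   cluster$ ls -d /var/lib/rancher /etc/rancher 2>/dev/null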