Upgrading OpenAppStack
======================

Because of `problems with Helm and secret management
<https://open.greenhost.net/openappstack/openappstack/-/issues/891>`__
we had to move away from using a helm chart for secrets, and now use scripts
that run during installation to manage secrets. Because we have removed the
``oas-secrets`` helm chart, Flux will try to remove the secrets that it has
generated. **It is important that you back up these secrets before switching
from v0.6 to v0.7!**

To back up your secrets, run the following script:

.. code:: bash

   #!/usr/bin/env bash

   mkdir secrets-backup

   kubectl get secret -o yaml -n flux-system  oas-cluster-variables > secrets-backup/oas-cluster-variables.yaml
   kubectl get secret -o yaml -n flux-system  oas-wordpress-variables > secrets-backup/oas-wordpress-variables.yaml
   kubectl get secret -o yaml -n flux-system  oas-wekan-variables > secrets-backup/oas-wekan-variables.yaml
   kubectl get secret -o yaml -n flux-system  oas-single-sign-on-variables > secrets-backup/oas-single-sign-on-variables.yaml
   kubectl get secret -o yaml -n flux-system  oas-rocketchat-variables > secrets-backup/oas-rocketchat-variables.yaml
   kubectl get secret -o yaml -n flux-system  oas-kube-prometheus-stack-variables > secrets-backup/oas-kube-prometheus-stack-variables.yaml
   kubectl get secret -o yaml -n oas          oas-prometheus-basic-auth > secrets-backup/oas-prometheus-basic-auth.yaml
   kubectl get secret -o yaml -n oas          oas-alertmanager-basic-auth > secrets-backup/oas-alertmanager-basic-auth.yaml
   kubectl get secret -o yaml -n flux-system  oas-oauth-variables > secrets-backup/oas-oauth-variables.yaml
   kubectl get secret -o yaml -n flux-system  oas-nextcloud-variables > secrets-backup/oas-nextcloud-variables.yaml

This script assumes you have all applications enabled. If you do not, you might
get an error like:

.. code:: bash

   Error from server (NotFound): secrets "oas-wekan-variables" not found

This is not a problem, but it *does* mean you need to add an OAuth secret for
Wekan to the file ``secrets-backup/oas-oauth-variables.yaml``. Copy one of the
lines under ``data:``, rename the field to ``wekan_oauth_client_secret`` and
enter a different random password. Make sure to base64 encode it
(``echo -n "<your random password>" | base64``). The ``-n`` flag keeps a
trailing newline out of the encoded value.
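
As a minimal sketch of that encoding step, the two helper functions below
generate a random password and base64 encode a value without a trailing
newline. The function names ``gen_password`` and ``encode_secret`` are
hypothetical and not part of OpenAppStack.

.. code:: bash

   #!/usr/bin/env bash

   # Print a random alphanumeric password (default length: 32 characters).
   gen_password() {
     # 48 random bytes give 64 base64 characters; keep only letters and digits.
     head -c 48 /dev/urandom | base64 | tr -dc 'A-Za-z0-9' | head -c "${1:-32}"
   }

   # Print the base64 encoding of the first argument, without a trailing newline.
   encode_secret() {
     printf '%s' "$1" | base64 | tr -d '\n'
   }

Running ``encode_secret "$(gen_password)"`` prints a value that can be pasted
after ``wekan_oauth_client_secret:`` in the backup file.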

The backup script above creates a directory called ``secrets-backup`` and places
the secrets that have been generated by Helm in it as ``yaml`` files.
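
Because the shell creates each output file before ``kubectl`` runs, a secret
that was not found leaves an empty file behind. The sketch below lists those
empty files so you know which applications were not backed up. The function
name ``list_empty_backups`` is hypothetical.

.. code:: bash

   #!/usr/bin/env bash

   # Print the path of every empty .yaml file in the backup directory.
   list_empty_backups() {
     local dir="${1:-secrets-backup}" f
     for f in "$dir"/*.yaml; do
       [ -e "$f" ] || continue   # the glob matched nothing
       [ -s "$f" ] || echo "$f"  # -s is true only for non-empty files
     done
   }

Call it as ``list_empty_backups`` from the directory in which you ran the
backup script.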

Now you can upgrade your cluster by running
``kubectl edit gitrepository -n flux-system openappstack`` and setting
``spec.ref.branch`` to ``v0.7``.

Flux will now start updating your cluster to version 0.7. This process will most
likely fail, because it will remove the secrets that you just backed up. Make
sure that the ``oas-secrets`` helmrelease has been removed by running ``flux get
hr -A``. You might also see that some helmreleases start failing to be installed
because important secrets do not exist anymore. 
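
Instead of re-running ``flux get hr -A`` by hand, you could poll it until the
helmrelease is gone. The sketch below assumes the column layout shown elsewhere
in this document (namespace first, name second). The function names
``has_release`` and ``wait_until_removed`` are hypothetical.

.. code:: bash

   #!/usr/bin/env bash

   # Read `flux get hr -A` output on stdin; succeed if a release with the
   # given name is listed in the second column.
   has_release() {
     awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
   }

   # Poll until the named helmrelease no longer exists.
   # $1: release name, $2: seconds between polls (default 10).
   wait_until_removed() {
     while flux get hr -A 2>/dev/null | has_release "$1"; do
       sleep "${2:-10}"
     done
   }

Call it as ``wait_until_removed oas-secrets`` before restoring the secrets.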

As soon as the ``oas-secrets`` helmrelease does not exist anymore, you can run
the following code:

.. code:: bash

   #!/usr/bin/env bash

   # Uses https://github.com/mikefarah/yq -- install with `snap install yq`
   yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-cluster-variables.yaml | kubectl apply -f - -n flux-system
   yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-wordpress-variables.yaml | kubectl apply -f - -n flux-system
   yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-wekan-variables.yaml | kubectl apply -f - -n flux-system
   yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-single-sign-on-variables.yaml | kubectl apply -f - -n flux-system
   yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-rocketchat-variables.yaml | kubectl apply -f - -n flux-system
   yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-kube-prometheus-stack-variables.yaml | kubectl apply -f - -n flux-system
   yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-prometheus-basic-auth.yaml | kubectl apply -f - -n oas
   yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-alertmanager-basic-auth.yaml | kubectl apply -f - -n oas
   yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-oauth-variables.yaml | kubectl apply -f - -n flux-system
   yq eval 'del(.metadata.annotations,.metadata.labels,.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid)' secrets-backup/oas-nextcloud-variables.yaml | kubectl apply -f - -n flux-system

Again, this script assumes you have all applications installed. If you get the
following error, you can ignore it:

.. code:: bash

   error: error validating "STDIN": error validating data: [apiVersion not set, kind not set]; if you choose to ignore these errors, turn validation off with --validate=false

Now Flux should succeed in finishing the update. Some helmreleases or
kustomizations might have already failed because the secrets did not exist. You
can retrigger reconciliation of a failed kustomization or helmrelease using the
commands ``flux reconcile kustomization ...`` or
``flux reconcile helmrelease ...``. This can take quite a while (sometimes over
an hour), because Flux waits for some long timeouts before giving up and
restarting a reconciliation.
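
To retrigger every failed helmrelease in one go, something like the following
sketch might help. It assumes that ``flux get hr -A`` prints the namespace,
name and ready state as its first three columns, as in the output shown in this
document; other Flux versions may order the columns differently. The function
names ``failed_releases`` and ``reconcile_failed`` are hypothetical.

.. code:: bash

   #!/usr/bin/env bash

   # Read `flux get hr -A` output on stdin and print "<namespace> <name>"
   # for every release whose third column (READY) is False.
   failed_releases() {
     awk '$3 == "False" { print $1, $2 }'
   }

   # Reconcile every helmrelease that is currently not ready.
   reconcile_failed() {
     local ns name
     flux get hr -A | failed_releases | while read -r ns name; do
       # Redirect stdin so flux cannot consume the remaining list.
       flux reconcile helmrelease "$name" -n "$ns" < /dev/null
     done
   }

Kustomizations can be handled the same way with ``flux get ks -A`` and
``flux reconcile kustomization``.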

Some errors we've seen during our own upgrade process, and how to solve them
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

SSO helm upgrade failed
'''''''''''''''''''''''

.. code::

   oas         	single-sign-on        	False	Helm upgrade failed: template: single-sign-on/templates/secret-oauth2-clients.yaml:9:55: executing "single-sign-on/templates/secret-oauth2-clients.yaml" at <b64enc>: invalid value; expected string	0.2.2   	False

This means that the ``single-sign-on`` helmrelease was created with empty oauth
secrets. The secrets will get a value once the ``core`` *kustomization* is
reconciled: ``flux reconcile ks core`` should solve the problem.

If that does not solve the problem, you should check if the secret contains a
value for all the apps: 

.. code::

   # kubectl get secret -n flux-system oas-oauth-variables -o yaml
   apiVersion: v1
   data:
     grafana_oauth_client_secret: <redacted>
     nextcloud_oauth_client_secret: <redacted>
     rocketchat_oauth_client_secret: <redacted>
     userpanel_oauth_client_secret: <redacted>
     wekan_oauth_client_secret: <redacted>
     wordpress_oauth_client_secret: <redacted>
   ...

If your secret lacks one of these variables, use ``kubectl edit`` to add it.
You can use any password generator to generate a password for it. Make sure to
base64 encode the data before you enter it in the secret.
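
The check for missing variables can be sketched as a small script. The list of
applications is taken from the example output above; the function names
``oauth_keys`` and ``missing_oauth_keys`` are hypothetical.

.. code:: bash

   #!/usr/bin/env bash

   # Applications that need an OAuth client secret (from the listing above).
   EXPECTED_APPS="grafana nextcloud rocketchat userpanel wekan wordpress"

   # Print the key names stored in the oas-oauth-variables secret, one per line.
   oauth_keys() {
     kubectl get secret -n flux-system oas-oauth-variables \
       -o go-template='{{range $k, $v := .data}}{{$k}}{{"\n"}}{{end}}'
   }

   # Read key names on stdin and print every expected key that is absent.
   missing_oauth_keys() {
     local present app key
     present="$(cat)"
     for app in $EXPECTED_APPS; do
       key="${app}_oauth_client_secret"
       grep -qx "$key" <<<"$present" || echo "$key"
     done
   }

Running ``oauth_keys | missing_oauth_keys`` prints nothing when the secret is
complete.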

Loki upgrade retries exhausted
''''''''''''''''''''''''''''''

While running ``flux get helmrelease -A``, you'll see:

.. code::

   oas         	loki                  	False  	upgrade retries exhausted       	2.5.2   	False

This happens sometimes because Loki takes a long time to upgrade. Usually it is
solved by running ``flux reconcile hr loki -n oas`` again.
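
If a single retry is not enough, a small retry helper can repeat the command a
few times. This is a generic sketch; the function name ``retry`` is
hypothetical and not part of Flux.

.. code:: bash

   #!/usr/bin/env bash

   # Run a command up to N times and stop at the first success.
   # Usage: retry N command [args...]
   retry() {
     local attempts="$1" i
     shift
     for ((i = 1; i <= attempts; i++)); do
       "$@" && return 0
       echo "attempt $i/$attempts failed: $*" >&2
     done
     return 1
   }

For Loki this would be ``retry 3 flux reconcile hr loki -n oas``.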

Upgrading to 0.6.0
------------------


A few things are important when upgrading to 0.6.0:

- We now use Flux 2 and the installation procedure has been overhauled. For this
  reason we advise you to set up a completely new cluster.
- Copy your configuration details from ``settings.yaml`` to a new ``.flux.env``.
  See ``install/.flux.env.example`` and the :ref:`OpenAppStack installation
  instructions` for more information.

Please `reach out to us`_ if you are using, or plan to use OAS in
production.

Upgrading from 0.4.0 to 0.5.0
-----------------------------

Unfortunately we can’t ensure a smooth upgrade for this version either.
Please read the section below on how to do an upgrade by installing the
new OAS version from scratch after backing up your data.

Upgrading from 0.3.0 to 0.4.0
-----------------------------

There is no easy upgrade path from version 0.3.0 to version 0.4.0. As
far as we know, nobody was running OpenAppStack apart from the
developers, so we assume this is not a problem.

If you do need to upgrade, this is how you can migrate your data. Back up
all the data available under ``/var/lib/OpenAppStack/local-storage``,
create a new cluster using the installation instructions, and put the
data back. This migration procedure might not work perfectly.
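
One possible way to take that backup is a compressed ``tar`` archive, created
on the machine that holds the data. This is a sketch; the function name
``backup_local_storage`` and the default archive name are hypothetical.

.. code:: bash

   #!/usr/bin/env bash

   # Archive the local-storage directory into a gzip-compressed tarball.
   # $1: source directory, $2: output archive.
   backup_local_storage() {
     local src="${1:-/var/lib/OpenAppStack/local-storage}"
     local out="${2:-local-storage-backup.tar.gz}"
     # -C makes the paths in the archive relative to the parent directory.
     tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")"
   }

Reading ``/var/lib/OpenAppStack`` may require root privileges.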

Use ``kubectl get pvc -A`` on your old cluster to get a mapping of all
the PVC uuids (and thus their folder names in
``/var/lib/OpenAppStack/local-storage``) to the pods they are bound to.
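
It can help to save that mapping to a file before you delete the old cluster.
The sketch below assumes that the folder names contain the volume name reported
in ``.spec.volumeName``; check this against your own cluster. The function name
``save_pvc_mapping`` and the default file name are hypothetical.

.. code:: bash

   #!/usr/bin/env bash

   # Write namespace, PVC name and bound volume name of every PVC to a file.
   save_pvc_mapping() {
     local out="${1:-pvc-mapping.txt}"
     kubectl get pvc -A \
       -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,VOLUME:.spec.volumeName' \
       > "$out"
   }

Keep the resulting file together with your data backup.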

Then, delete your old OpenAppStack, and install a new one with version
number 0.4.0 or higher. You can upload your backed up data into
``/var/lib/OpenAppStack/local-storage``. All PVCs will have new unique
IDs (and thus different folder names). You have to manually match the
folders from your backup with the new folders.

Additionally, if you want to re-use your old ``settings.yaml`` file,
this data needs to be added to it:

::

   backup:
     s3:
       # Disabled by default. To enable, change to `true` and configure the
       # settings below. You'll also want to add "velero" to the enabled
       # applications a bit further in this file.
       # Finally, you'll also need to provide access credentials as
       # secrets; see the documentation:
       # https://docs.openappstack.net/en/latest/installation_instructions.html#step-2-optional-cluster-backups-using-velero
       enabled: false
       # URL of S3 service. Please use the principal domain name here, without the
       # bucket name.
       url: "https://store.greenhost.net"
       # Region of S3 service that's used for backups.
       # For some on-premise providers this may be irrelevant, but the S3
       # apparently requires it at some point.
       region: "ceph"
       # Name of the S3 bucket that backups will be stored in.
       # This has to exist already: Velero will not create it for you.
       bucket: "openappstack-backup"
       # Prefix that's added to backup filenames.
       prefix: "test-instance"

   # A whitelist of applications that will be enabled.
   enabled_applications:
     # System components, necessary for the system to function.
     - 'cert-manager'
     - 'letsencrypt-production'
     - 'letsencrypt-staging'
     - 'ingress'
     - 'local-path-provisioner'
     - 'single-sign-on'
     # The backup system Velero is disabled by default, see settings under `backup` above.
     # - 'velero'
     # Applications.
     - 'grafana'
     - 'loki'
     - 'promtail'
     - 'nextcloud'
     - 'prometheus'
     - 'rocketchat'
     - 'wordpress'

Upgrading to 0.3.0
------------------

Upgrading from versions earlier than ``0.3.0`` requires manual
intervention.

-  Move your local ``settings.yml`` file to a different location:

   ::

      cd CLUSTER_DIR
      mkdir -p ./group_vars/all/
      mv settings.yml ./group_vars/all/

-  `Flux`_ is now used to install and update applications. For that
   reason, we need you to remove all helm charts (WARNING: You will lose
   your data!):

   ::

      helm delete --purge oas-test-cert-manager oas-test-local-storage \
          oas-test-prometheus oas-test-proxy oas-test-files

   -  After removing all helm charts, you probably also want to remove
      all the ``pvc``\ s that are left behind. Flux will not re-use the
      database PVCs created for these applications. Find all the pvcs by
      running ``kubectl get pvc --namespace oas-apps`` and
      ``kubectl get pvc --namespace oas``.

.. _reach out to us: https://openappstack.net/contact.html
.. _Flux: https://fluxcd.io