Note: `cluster$` indicates that the commands should be run as root on your OAS cluster.
## Run the CLI tests
To get an overall status of your cluster, you can run the tests from the
command line.
There are two types of tests: [testinfra](https://testinfra.readthedocs.io/en/latest/)
tests, and [behave](https://behave.readthedocs.io/en/latest/) tests.
### Testinfra tests
Testinfra tests are split into two groups, let's call them `blackbox` and
`whitebox` tests. The blackbox tests run on your provisioning machine and test
the OAS cluster from the outside. For example, the certificate check verifies
that the OAS cluster returns valid certificates for the provided services.
The whitebox tests run on the OAS host itself and check, for example, whether
Docker is installed in the right version.
To run the tests against your cluster, first export the `CLUSTER_DIR` environment
variable with the location of your cluster config directory:
export CLUSTER_DIR="../clusters/CLUSTERNAME"
Run all tests:
py.test -s --ansible-inventory=${CLUSTER_DIR}/inventory.yml --hosts='ansible://*'
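If you only want to see which tests would be run, without executing them, you
can use pytest's standard `--collect-only` flag (a small sketch reusing the
inventory options from above):
py.test --collect-only --ansible-inventory=${CLUSTER_DIR}/inventory.yml --hosts='ansible://*'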
#### Advanced usage
Specify host manually:
py.test -s --hosts='ssh://root@example.openappstack.net'
Run only tests tagged with `prometheus`:
py.test -s --ansible-inventory=${CLUSTER_DIR}/inventory.yml --hosts='ansible://*' -m prometheus
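To find out which markers (like `prometheus` or `certs`) are available, pytest
can list them; the exact output depends on the markers registered in this
repository's pytest configuration:
py.test --markers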
Run cert test manually using the ansible inventory file:
py.test -s --ansible-inventory=${CLUSTER_DIR}/inventory.yml --hosts='ansible://*' -m certs
Run cert test manually against a different cluster, not configured in any
ansible inventory file, either by using pytest:
FQDN='example.openappstack.net' py.test -sv -m 'certs'
or directly:
FQDN='example.openappstack.net' pytest/test_certs.py
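Assuming the cert test simply checks whatever host name you pass in `FQDN` (an
assumption, not verified here), you can also point it at a single subdomain:
FQDN='files.example.openappstack.net' py.test -sv -m 'certs'  # hypothetical subdomain, adjust to your own cluster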
#### Known Issues
- The default ssh backend for testinfra tests is `paramiko`, which doesn't work out
  of the box. It fails to connect to the host because the `ed25519` host key was
  not verified. Therefore we need to force plain `ssh://` with either
  `connection=ssh` or `--hosts=ssh://…` (see the workaround sketch below).
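If you would rather keep the default `paramiko` backend, a possible workaround
(untested, and `example.openappstack.net` is only a placeholder for your own
host) is to add the host's `ed25519` key to your `known_hosts` first, so the
host key can be verified:
ssh-keyscan -t ed25519 example.openappstack.net >> ~/.ssh/known_hosts  # untested workaround, replace the host name with your own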
#### Running tests with local gitlab-runner docker executor
Export the following environment variables like this:
export CI_REGISTRY_IMAGE='open.greenhost.net:4567/openappstack/openappstack'
export SSH_PRIVATE_KEY="$(cat ~/.ssh/id_ed25519_oas_ci)"
export COSMOS_API_TOKEN='…'
then:
gitlab-runner exec docker --env CI_REGISTRY_IMAGE="$CI_REGISTRY_IMAGE" --env SSH_PRIVATE_KEY="$SSH_PRIVATE_KEY" --env COSMOS_API_TOKEN="$COSMOS_API_TOKEN" bootstrap
### Behave tests
Behave tests run in a headless browser and check whether all the interfaces are
up and running and correctly connected to each other. They are integrated into
the `openappstack` CLI command suite.
To run the behave tests, run the following command in this repository:
python -m openappstack CLUSTERNAME test
In the future, this command will run all tests, but for now only the *behave*
tests are implemented. To learn more about the `test` subcommand, run:
python -m openappstack CLUSTERNAME test --help
If you encounter problems when you upgrade your cluster, please first make sure
to include all potentially new values from `ansible/group_vars/all/settings.yml.example`
in your `clusters/YOUR_CLUSTERNAME/group_vars/all/settings.yml`, and then rerun the
installation.
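For example, a quick way to spot values that are new in the example file is to
diff it against your cluster's settings, using the two paths mentioned above
(run from the root of this repository):
diff -u clusters/YOUR_CLUSTERNAME/group_vars/all/settings.yml ansible/group_vars/all/settings.yml.example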
## HTTPS Certificates
OAS uses [cert-manager](http://docs.cert-manager.io/en/latest/) to automatically
fetch [Let's Encrypt](https://letsencrypt.org/) certificates for all deployed
services. If you experience invalid SSL certificates, e.g. your browser warns you
when visiting Nextcloud (`https://files.YOUR.CLUSTER.DOMAIN`), here's how to
debug this:

Did you create your cluster using the `--acme-staging` argument?
Please check the resulting value of the `acme_staging` key in
`clusters/YOUR_CLUSTERNAME/group_vars/all/settings.yml`. If this is set to `true`, certificates
are fetched from the [Let's Encrypt staging API](https://letsencrypt.org/docs/staging-environment/),
which can't be validated by default in your browser.
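For example, you can quickly check that value with:
grep acme_staging clusters/YOUR_CLUSTERNAME/group_vars/all/settings.yml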
Are all cert-manager pods in the `oas` namespace in the `READY` state?
cluster$ kubectl -n oas get pods | grep cert-manager
Are there any `cm-acme-http-solver-*` pods still running, indicating that there
are unfinished certificate requests?
cluster$ kubectl get pods --all-namespaces | grep cm-acme-http-solver
Check the logs of the cert-manager pod for errors:
cluster$ kubectl -n oas logs -l "app.kubernetes.io/name=cert-manager"
You can `grep` for your cluster domain or for any specific subdomain to narrow
down results.
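For example, to only see log lines that mention the Nextcloud subdomain used above:
cluster$ kubectl -n oas logs -l "app.kubernetes.io/name=cert-manager" | grep 'files.YOUR.CLUSTER.DOMAIN'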
## Purge OAS and install from scratch
If ever things fail beyond possible recovery, here's how to completely purge an OAS installation in order to start from scratch:
cluster$ apt purge docker-ce-cli containerd.io
cluster$ mount | egrep '^(tmpfs.*kubelet|nsfs.*docker)' | cut -d' ' -f 3 | xargs umount
cluster$ rm -rf /var/lib/docker /var/lib/OpenAppStack /etc/kubernetes /var/lib/etcd /var/lib/rancher /var/lib/kubelet /var/log/OpenAppStack /var/log/containers /var/log/pods