Commit 6ef43ef9 authored by Maarten de Waard

Merge branch 'logging_docs' into 'main'

Improve logging docs

See merge request stackspin/stackspin!667
parents d246632b 3149ca38
...@@ -61,7 +61,6 @@ html_static_path = ['_static'] ...@@ -61,7 +61,6 @@ html_static_path = ['_static']
# 'contents' # 'contents'
master_doc = 'index' master_doc = 'index'
# https://www.sphinx-doc.org/en/master/usage/extensions/autosectionlabel.html # https://www.sphinx-doc.org/en/master/usage/extensions/autosectionlabel.html
# #
# Suppress autosectionlabel extension warnings about duplicate labels, i.e. # Suppress autosectionlabel extension warnings about duplicate labels, i.e.
...@@ -69,5 +68,7 @@ master_doc = 'index' ...@@ -69,5 +68,7 @@ master_doc = 'index'
# docs/usage.rst:105: WARNING: duplicate label wordpress, other instance in docs/testing.rst # docs/usage.rst:105: WARNING: duplicate label wordpress, other instance in docs/testing.rst
# #
# https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-suppress_warnings # https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-suppress_warnings
suppress_warnings = ['autosectionlabel.*']
autosectionlabel_prefix_document = True autosectionlabel_prefix_document = True
autosectionlabel_maxdepth = 5
suppress_warnings = ['autosectionlabel.*']
...@@ -41,6 +41,7 @@ For more information, go to `the Stackspin website`_. ...@@ -41,6 +41,7 @@ For more information, go to `the Stackspin website`_.
:maxdepth: 2 :maxdepth: 2
:caption: Administration :caption: Administration
logging
maintenance maintenance
upgrading upgrading
customizing customizing
......
Logging
=======
Logs from pods and containers can be read in different ways:
- In the cluster filesystem at ``/var/log/pods/`` or
``/var/log/containers/``.
- Using `kubectl logs`_
- Querying aggregated logs with Grafana, see below.
Central log aggregation
-----------------------
We use `Promtail`_, `Loki`_ and `Grafana`_ for easy access to aggregated
logs. The `Loki documentation`_ is a good starting point for understanding
how this setup works.
There are two ways of viewing aggregated logs:
* Via the Grafana web interface
* Using the ``logcli`` command line tool
Viewing logs in Grafana
~~~~~~~~~~~~~~~~~~~~~~~
The `Using Loki in Grafana`_ guide gets you started with querying
your cluster logs with Grafana.
You will find the Loki Grafana integration on your cluster at
https://grafana.stackspin.example.org/explore together with some generic query
examples.
Please follow :ref:`logging:LogQL query examples` for more LogQL query examples.
Query logs with logcli
~~~~~~~~~~~~~~~~~~~~~~~
Please refer to `logcli`_ for installing ``logcli`` on your laptop.
Then create a port forward to your cluster using the ``kubectl`` tool:
.. code:: bash
kubectl -n stackspin port-forward pod/loki-0 3100
In another terminal you can now use ``logcli`` to query ``loki`` like this:
.. code:: bash
logcli query '{app=~".+"}'
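``logcli`` reads the Loki address from the ``LOKI_ADDR`` environment variable (or the ``--addr`` flag) and defaults to ``http://localhost:3100``, which is why the query above works without further configuration. If you forward to a different local port, a minimal sketch could look like this; the port ``3101`` is only an assumed example value:

```shell
# Assumed example: local port 3101 is forwarded to Loki's port 3100, e.g. with
#   kubectl -n stackspin port-forward pod/loki-0 3101:3100
LOCAL_PORT=3101

# logcli reads its target address from LOKI_ADDR.
export LOKI_ADDR="http://localhost:${LOCAL_PORT}"
echo "logcli will query ${LOKI_ADDR}"
```

Any ``logcli query`` run in the same shell afterwards should then use the forwarded port.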
Please follow :ref:`logging:LogQL query examples` for more LogQL query examples.
Search older messages (in this case the last week and limit the output to 2000
lines):
.. code:: bash
logcli query --since=168h --limit=2000 --forward '{app="helm-controller"}'
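The ``--since`` value above is given in hours, so one week becomes ``168h``. As a small convenience, the conversion can be sketched with a shell function; ``since_hours`` is a hypothetical helper name and not part of ``logcli``:

```shell
# Hypothetical helper: convert a number of days into the hour-based
# duration string used with --since above (7 days -> 168h).
since_hours() {
  echo "$(( $1 * 24 ))h"
}

since_hours 7    # prints 168h

# Usage with the query above (requires the port forward to be running):
#   logcli query --since="$(since_hours 7)" --limit=2000 --forward '{app="helm-controller"}'
```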
LogQL query examples
~~~~~~~~~~~~~~~~~~~~
Please also refer to the `LogQL documentation`_ and the
`log queries documentation`_.
Query all aggregated logs (unfortunately, we could not find a better way of
doing this, since LogQL always expects at least one stream label to be queried):
.. code:: PromQL
{app=~".+"}
Query all logs for a keyword:
.. code::
{app=~".+"} |= "error"
Query all k8s apps for errors using a regular expression:
.. code::
{app=~".+"} |~ `(error|fail|exception|fatal)`
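To get a feeling for which lines such a regular expression filter keeps, the same alternation can be tried locally with ``grep -E``. This is only an approximation: LogQL uses RE2 syntax while ``grep -E`` uses POSIX extended regular expressions, but for a plain alternation like this one they behave the same. The sample log lines are made up for illustration:

```shell
# Made-up sample log lines, for illustration only.
sample_logs='level=info msg="reconciliation finished"
level=error msg="failed to fetch chart"
level=warn msg="unhandled exception in worker"
level=info msg="no changes"'

# grep -E keeps lines matching the same alternation as the LogQL filter.
matches=$(printf '%s\n' "$sample_logs" | grep -E '(error|fail|exception|fatal)')
printf '%s\n' "$matches"
```

Only the second and third sample lines are printed, since they contain ``error``/``fail`` and ``exception`` respectively.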
Flux
^^^^
`Flux`_ is responsible for installing applications. It uses four
controllers:
- ``source-controller`` that tracks Helm and Git repositories like
https://open.greenhost.net/stackspin/stackspin for updates.
- ``kustomize-controller`` to deploy ``kustomizations`` that often
install ``helmreleases``.
- ``helm-controller`` to deploy the ``helmreleases``.
- ``notification-controller`` that is responsible for inbound and
outbound Flux messages.
Query all messages from the ``source-controller``:
.. code:: PromQL
{app="source-controller"}
Query all messages from ``source-controller`` and ``helm-controller``:
.. code:: PromQL
{app=~"(source-controller|helm-controller)"}
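When more controllers need to be combined, a selector of this form can be generated instead of typed by hand. The following is a minimal sketch; ``logql_apps`` is a hypothetical helper name, and it assumes the app names contain no regular expression metacharacters:

```shell
# Hypothetical helper: join app names with "|" into a LogQL stream selector.
# The function body runs in a subshell so the IFS change does not leak out.
logql_apps() (
  IFS='|'                      # "$*" joins arguments with the first IFS character
  printf '{app=~"(%s)"}\n' "$*"
)

logql_apps source-controller helm-controller
# prints {app=~"(source-controller|helm-controller)"}
```

The output can be pasted into Grafana, or passed to ``logcli query`` inside single quotes.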
``helm-controller`` messages containing ``wordpress``:
.. code:: PromQL
{app = "helm-controller"} |= "wordpress"
``helm-controller`` messages containing ``wordpress`` without
``unchanged`` events (to only show the installation messages):
.. code:: PromQL
{app = "helm-controller"} |= "wordpress" != "unchanged"
Filter out redundant ``helm-controller`` messages:
.. code:: PromQL
{app="helm-controller"} !~ `(unchanged|event=refreshed|method=Sync|component=checkpoint)`
Cert-manager
^^^^^^^^^^^^
Cert manager is responsible for requesting Let’s Encrypt TLS
certificates.
Query ``cert-manager`` messages containing ``chat``:
.. code:: PromQL
{app="cert-manager"} |= "chat"
Hydra
^^^^^
Hydra is the single sign-on system.
Show only warnings and errors from ``hydra``:
.. code:: PromQL
{container_name="hydra"} != "level=info"
Debug OAuth2 single sign-on with Zulip:
.. code:: PromQL
{container_name=~"(hydra|zulip)"}
Etc
^^^
Query Kubernetes events processed by the ``eventrouter`` app containing
``warning``:
.. code:: PromQL
{app="eventrouter"} |~ "warning"
.. _kubectl logs: https://kubernetes.io/docs/concepts/cluster-administration/logging
.. _Promtail: https://grafana.com/docs/loki/latest/clients/promtail/
.. _Loki: https://grafana.com/oss/loki/
.. _Grafana: https://grafana.com/
.. _Loki documentation: https://grafana.com/docs/loki/latest/
.. _Using Loki in Grafana: https://grafana.com/docs/grafana/latest/datasources/loki
.. _logcli: https://grafana.com/docs/loki/latest/getting-started/logcli/
.. _LogQL documentation: https://grafana.com/docs/loki/latest/logql
.. _log queries documentation: https://grafana.com/docs/loki/latest/logql/log_queries/
.. _Flux: https://fluxcd.io/
Maintenance Maintenance
=========== ===========
Logging
-------
Logs from pods and containers can be read in different ways:
- In the cluster filesystem at ``/var/log/pods/`` or
``/var/logs/containers/``.
- Using `kubectl logs`_
- Querying aggregated logs with Grafana, see below.
Central log aggregation
-----------------------
We use `Promtail`_, `Loki`_ and `Grafana`_ for easy access of aggregated
logs. The `Loki documentation`_ is a good starting point how this setup
works, and the `Using Loki in Grafana`_ gets you started with querying
your cluster logs with Grafana.
You will find the Loki Grafana integration on your cluster at
https://grafana.stackspin.example.org/explore together with some generic query
examples.
LogQL query examples
~~~~~~~~~~~~~~~~~~~~
Please also refer to the `LogQL documentation`_.
Query all aggregated logs (unfortunatly we can’t find a better way of
doing this since LogQL always expects a stream label to get queried):
.. code:: bash
logcli query '{foo!="bar"}'
Query all logs for a keyword:
.. code:: bash
logcli query '{foo!="bar"} |= "error"'
Query all k8s apps for errors using a regular expression:
.. code:: bash
logcli query '{job=~".*"} |~ "error|fail|exception|fatal"'
Flux
^^^^
`Flux`_ is responsible for installing applications. It uses four
controllers:
- ``source-controller`` that tracks Helm and Git repositories like
https://open.greenhost.net/stackspin/stackspin for updates.
- ``kustomize-controller`` to deploy ``kustomizations`` that often
install ``helmreleases``.
- ``helm-controller`` to deploy the ``helmreleases``.
- ``notification-controller`` that is responsible for inbound and
outbound flux messages
Query all messages from the ``source-controller``:
.. code:: bash
{app="source-controller"}
Query all messages from ``flux`` and ``helm-controller``:
.. code:: bash
{app=~"(source-controller|helm-controller)"}
``helm-controller`` messages containing ``wordpress``:
.. code:: bash
{app = "helm-controller"} |= "wordpress"
``helm-controller`` messages containing ``wordpress`` without
``unchanged`` events (to only show the installation messages):
.. code:: bash
{app = "helm-controller"} |= "wordpress" != "unchanged"
Filter out redundant ``helm-controller`` messages:
.. code:: bash
{ app = "helm-controller" } !~ "(unchanged | event=refreshed | method=Sync | component=checkpoint)"
Debug oauth2 single sign-on with zulip:
.. code:: bash
{container_name=~"(hydra|zulip)"}
Query kubernetes events processed by the ``eventrouter`` app containing
``warning``:
.. code:: bash
logcli query '{app="eventrouter"} |~ "warning"'
Cert-manager
^^^^^^^^^^^^
Cert manager is responsible for requesting Let’s Encrypt TLS
certificates.
Query ``cert-manager`` messages containing ``chat``:
.. code:: bash
{app="cert-manager"} |= "chat"
Hydra
^^^^^
Hydra is the single sign-on system.
Show only warnings and errors from ``hydra``:
.. code:: bash
{container_name="hydra"} != "level=info"
Backup Backup
------ ------
...@@ -204,14 +77,6 @@ following command that will apply the changes to all installed kustomizations: ...@@ -204,14 +77,6 @@ following command that will apply the changes to all installed kustomizations:
flux get -A kustomizations --no-header | awk -F' ' '{system("flux reconcile -n " $1 " kustomization " $2)}' flux get -A kustomizations --no-header | awk -F' ' '{system("flux reconcile -n " $1 " kustomization " $2)}'
.. _kubectl logs: https://kubernetes.io/docs/concepts/cluster-administration/logging
.. _Promtail: https://grafana.com/docs/loki/latest/clients/promtail/
.. _Loki: https://grafana.com/oss/loki/
.. _Grafana: https://grafana.com/
.. _Loki documentation: https://grafana.com/docs/loki/latest/
.. _Using Loki in Grafana: https://grafana.com/docs/grafana/latest/datasources/loki
.. _LogQL documentation: https://grafana.com/docs/loki/latest/logql
.. _Flux: https://fluxcd.io/
.. _Velero’s documentation: https://velero.io/docs/v1.4/ .. _Velero’s documentation: https://velero.io/docs/v1.4/
.. _reach out to us: https://stackspin.net/contact.html .. _reach out to us: https://stackspin.net/contact.html
.. _taints: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/ .. _taints: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
......
...@@ -5,7 +5,7 @@ ...@@ -5,7 +5,7 @@
# Please add developer dependencies which are not needed to install # Please add developer dependencies which are not needed to install
# Stackspin to requirements-dev.txt! # Stackspin to requirements-dev.txt!
# #
recommonmark
sphinx sphinx
sphinx-design sphinx-design
sphinx-rtd-theme sphinx-rtd-theme
recommonmark
...@@ -14,7 +14,7 @@ charset-normalizer==2.0.9 ...@@ -14,7 +14,7 @@ charset-normalizer==2.0.9
# via requests # via requests
commonmark==0.9.1 commonmark==0.9.1
# via recommonmark # via recommonmark
docutils==0.18.1 docutils==0.17.1
# via # via
# recommonmark # recommonmark
# sphinx # sphinx
...@@ -30,8 +30,10 @@ markupsafe==2.0.1 ...@@ -30,8 +30,10 @@ markupsafe==2.0.1
packaging==21.3 packaging==21.3
# via sphinx # via sphinx
pygments==2.10.0 pygments==2.10.0
# via sphinx # via
pyparsing==2.4.7 # -r requirements.in
# sphinx
pyparsing==3.0.6
# via packaging # via packaging
pytz==2021.3 pytz==2021.3
# via babel # via babel
......
...@@ -95,7 +95,7 @@ Tests a specific application ...@@ -95,7 +95,7 @@ Tests a specific application
Known Issues Known Issues
'''''''''''' ''''''''''''
The Default ssh backend for testinfra tests is ``paramiko``, which doesn't work The default ssh backend for testinfra tests is ``paramiko``, which doesn't work
out of the box. It fails to connect to the host because the ``ed25519`` hostkey out of the box. It fails to connect to the host because the ``ed25519`` hostkey
was not verified. Therefore we need to force plain ssh:// with either was not verified. Therefore we need to force plain ssh:// with either
``connection=ssh`` or ``--hosts=ssh://…`` ``connection=ssh`` or ``--hosts=ssh://…``
...@@ -190,8 +190,8 @@ then: ...@@ -190,8 +190,8 @@ then:
gitlab-runner exec docker --env CI_REGISTRY_IMAGE="$CI_REGISTRY_IMAGE" --env SSH_PRIVATE_KEY="$SSH_PRIVATE_KEY" --env COSMOS_API_TOKEN="$COSMOS_API_TOKEN" bootstrap gitlab-runner exec docker --env CI_REGISTRY_IMAGE="$CI_REGISTRY_IMAGE" --env SSH_PRIVATE_KEY="$SSH_PRIVATE_KEY" --env COSMOS_API_TOKEN="$COSMOS_API_TOKEN" bootstrap
Taiko tests Advanced Taiko tests
''''''''''' ''''''''''''''''''''
If you want to use Taiko without invoking the stackspin CLI, go to the If you want to use Taiko without invoking the stackspin CLI, go to the
``test/taiko`` directory and run: ``test/taiko`` directory and run:
......
...@@ -96,10 +96,9 @@ renamed from ``oas`` to ``stackspin``. You can choose from these options: ...@@ -96,10 +96,9 @@ renamed from ``oas`` to ``stackspin``. You can choose from these options:
Rocket.Chat Rocket.Chat
~~~~~~~~~~~ ~~~~~~~~~~~
We replaced Rocket.Chat with `Zulip <https://zulip.com>`__ in this release. We replaced Rocket.Chat with `Zulip`_ in this release.
If you want to migrate your Rocket.Chat data to your new Zulip installation If you want to migrate your Rocket.Chat data to your new `Zulip`_ installation
please refer to please refer to `Import from Rocket.Chat`_.
`Import from Rocket.Chat https://api.zulip.com/help/import-from-rocketchat`__.
Monitoring Monitoring
~~~~~~~~~~ ~~~~~~~~~~
...@@ -131,13 +130,12 @@ from v0.6 to v0.7!** ...@@ -131,13 +130,12 @@ from v0.6 to v0.7!**
.. note:: .. note::
Before you start, please ensure that you have the right ``yq`` tool installed, Before you start, please ensure that you have the right ``yq`` tool installed,
because you will need it later. There are two very different versions of because you will need it later. There are two very different versions of
``yq``. The one you need is the go based `yq from Mike Farah ``yq``. The one you need is the go based `yq from Mike Farah`_,
<http://mikefarah.github.io/yq>`_, which installs the same binary name ``yq`` which installs the same binary name as the `python-yq`_ one, while both have
as the `python-yq <https://github.com/kislyuk/yq>`_, while both have different different command sets.
command sets.
The yq needed here can be installed by running ``sudo snap install yq``, The yq needed here can be installed by running ``sudo snap install yq``,
``brew install yq`` or with other methods from the `yq installation ``brew install yq`` or with other methods from the
instructions <http://mikefarah.github.io/yq/#install>`_. `yq installation instructions`_.
If you're unsure which ``yq`` you have installed, look at the output of If you're unsure which ``yq`` you have installed, look at the output of
``yq --help`` and make sure ``eval`` shows up under ``Available Commands:``. ``yq --help`` and make sure ``eval`` shows up under ``Available Commands:``.
...@@ -394,3 +392,8 @@ intervention. ...@@ -394,3 +392,8 @@ intervention.
.. _reach out to us: https://openappstack.net/contact.html .. _reach out to us: https://openappstack.net/contact.html
.. _Flux: https://fluxcd.io .. _Flux: https://fluxcd.io
.. _yq from Mike Farah: https://mikefarah.github.io/yq
.. _yq installation instructions: https://mikefarah.github.io/yq/#install
.. _python-yq: https://github.com/kislyuk/yq
.. _Zulip: https://zulip.com
.. _Import from Rocket.Chat: https://api.zulip.com/help/import-from-rocketchat
...@@ -24,7 +24,7 @@ them access to applications, take a look at the `user panel documentation ...@@ -24,7 +24,7 @@ them access to applications, take a look at the `user panel documentation
.. note:: .. note::
If you don't see applications, make sure you have installed at least one If you don't see applications, make sure you have installed at least one
optional application in :ref:`additional_apps` of the installation procedure. optional application in :ref:`install_additional_apps` of the installation procedure.
For creating users follow the `user creation documentation For creating users follow the `user creation documentation
<https://docs.stackspin.net/projects/user-panel/en/latest/#creating-a-new-user>`_. <https://docs.stackspin.net/projects/user-panel/en/latest/#creating-a-new-user>`_.
......