    # Stackspin Dashboard

    This repo hosts the Stackspin Dashboard, both frontend and backend code.
    
    ## Project structure
    
    ### Frontend
    
    The frontend code lives in the `frontend` directory.
    
    ### Backend
    
    The backend code lives in the `backend` directory. Apart from the dashboard
    backend itself, it also contains a flask application that functions as the
    identity provider, serving the login, consent and logout endpoints for the
    OpenID Connect (OIDC) process.
    
    The application relies on the following components:
    
    
    - **Hydra**: an open-source OIDC server.
      Applications connect to Hydra to start a session with a user.
      Hydra provides the application with the username
      and other roles/claims for the application.
      Hydra is developed by Ory, which has security as one of its top priorities.
    
    - **Kratos**: the identity manager,
      which stores all the user profiles and secrets (passwords).
      Kratos is designed to work mostly between the UI (browser) and Kratos
      directly, over a public API endpoint.
      Authentication, form validation, etc. are all handled by Kratos.
      Kratos itself only provides an API, not a UI.
      Kratos provides an admin API as well,
      which is only used from the server-side flask app to create/delete users.
    
    - **MariaDB**: the login application, as well as Hydra and Kratos, needs to
      store data. This is done in a MariaDB database server.
      There is one instance with three databases.
      As all databases are very small, we do not foresee resource limitation
      problems.
    
    
    When Hydra encounters a new session/user, it has to know whether this user
    has access.
    To determine that, the user has to log in through a login application.
    This application is developed by the Stackspin team (Greenhost)
    and is part of this repository.
    It is a Python Flask application.
    The application follows flows defined in Kratos,
    and as such a lot of the interaction happens in the web browser,
    rather than server-side.
    As a result,
    the login application has a UI component which relies heavily on JavaScript.
    As this is a relatively small application,
    it is based on traditional Bootstrap + jQuery.
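
    To illustrate the server-side half of this flow: when Hydra redirects the
    browser to the login application, it appends a `login_challenge` parameter,
    and the login application calls Hydra's admin API to inspect and accept the
    login request. The sketch below is a minimal, hypothetical illustration
    using `curl`; the endpoint paths are version-dependent (Hydra 2.x prefixes
    them with `/admin`), and the host, port and subject are made up:

    ```
    HYDRA_ADMIN=http://hydra-admin:4445
    CHALLENGE="<login_challenge from the incoming request>"

    # Inspect the pending login request:
    curl -s "$HYDRA_ADMIN/admin/oauth2/auth/requests/login?login_challenge=$CHALLENGE"

    # After verifying the user (via Kratos), accept the request; Hydra returns
    # a redirect_to URL to send the browser back to:
    curl -s -X PUT \
      -H 'Content-Type: application/json' \
      -d '{"subject": "user@example.org", "remember": true}' \
      "$HYDRA_ADMIN/admin/oauth2/auth/requests/login/accept?login_challenge=$CHALLENGE"
    ```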
    
    ## Development environment
    
    The development environment is a hybrid one, where one or both of the dashboard
    frontend and backend run locally, but the rest of the cluster runs on a remote
    machine.
    
    The remote should be a regular Stackspin cluster, though preferably one that's
    dedicated to development purposes.
    
    The local dashboard frontend and/or backend can run in a docker container or
    directly ("native mode"). (At this time it's not possible to mix the two, for
    example by having the dashboard backend run directly and the frontend in a
    docker container.)
    
    The connection between the local and remote parts is set up by a tool called
    telepresence. If you want to develop the frontend, for example, telepresence
    intercepts traffic that goes to the remote's frontend pod and redirects it
    to your copy running locally on your machine; responses from your local
    frontend are routed back via the remote. This interception is invisible to
    your browser, which you simply point at the remote cluster.
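
    The `./dev.sh` script described below drives telepresence for you, but a
    hand-rolled session would look roughly like this sketch (the workload name
    and port are hypothetical; `telepresence list` shows the real ones):

    ```
    telepresence connect                          # join the cluster network
    telepresence list                             # show interceptable workloads
    telepresence intercept dashboard --port 3000  # reroute the pod's traffic
                                                  # to localhost:3000
    ```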
    
    ### Prerequisites
    
    #### Set up telepresence on your local development machine
    
    You need to do this once for every development machine you're using
    (workstation, laptop).
    
    * You need root on your machine, and at some point you have to allow
      telepresence to perform actions as root, in order to make the network
      changes for the two-way tunnel. If this is not possible or not desirable,
      you can try to run your local dashboard in a docker container instead.
    * Set `user_allow_other` in `/etc/fuse.conf` (see the sketch after this
      list). This is necessary when telepresence adds (FUSE-based) sshfs mounts
      so your local code can access volumes from the kubernetes cluster, in
      particular the one with the service account token (credentials for
      calling the kubernetes API), to let the dashboard interact with the
      cluster.
    
      - MacOS users may have to do a little extra work to get a working current
        sshfs: see [telepresence
        docs](https://www.getambassador.io/docs/telepresence-oss/latest/troubleshooting#volume-mounts-are-not-working-on-macos).
    
    * Download and install the telepresence binary on your development machine:
      https://www.getambassador.io/docs/telepresence-oss/latest/install
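
    A minimal sketch of this one-time setup on Linux (the binary installation
    itself is best taken from the install docs linked above):

    ```
    # Allow non-root processes to pass allow_other to FUSE mounts:
    echo 'user_allow_other' | sudo tee -a /etc/fuse.conf

    # After installing the telepresence binary per the docs linked above:
    telepresence version   # check that the client is installed
    ```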
    
    #### Access to development cluster
    
    You need `kubectl` and `helm` binaries, and a `kubectl` configuration file
    (often called "kubeconfig") containing credentials needed to authenticate
    against your cluster. If the `KUBECONFIG` environment variable is set and
    points to the config file, this will be picked up by the various programs.
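
    For example (the kubeconfig path is hypothetical):

    ```
    export KUBECONFIG=~/.kube/config-dev-cluster
    kubectl get nodes   # should list the cluster's nodes if access works
    ```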
    
    #### Set up telepresence on your development cluster
    
    You need to do this once for every cluster you want to use as a development cluster.
    
    * Install telepresence on your development cluster:
      ```
      telepresence helm install -f telepresence-values.yaml
      ```
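
      Once connected from your development machine, you can check that the
      traffic manager is reachable with, for example:

      ```
      telepresence status
      ```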
    
    #### Install local dependencies
    
    Before running the frontend in native mode:
    * Make sure you have nodejs installed. You may want to use [Node Version
      Manager](https://github.com/nvm-sh/nvm) to make it easy to install several
      versions side by side.
    * Install necessary javascript dependencies (will be placed in
      `frontend/node_modules`) using `./dev.sh frontend setup`.
    
    Before running the backend in native mode:
    * Make sure you have python3 installed.
    * Install necessary python dependencies (in a virtualenv in `backend/venv`)
      using `./dev.sh backend setup`.
    
    ### Run
    
    From the root `dashboard` directory, run for example `./dev.sh frontend`. This
    will set up the telepresence tunnel to the cluster, and start the dashboard
    frontend server in native mode. `./dev.sh backend` will do the same but for the
    backend. You can run both at the same time (in separate terminal windows) if
    you want to make changes to both frontend and backend.
    
    If you want to run the local dashboard in docker instead, use `./dev.sh
    frontend docker` and/or `./dev.sh backend docker`. Please note that due to a
    telepresence limitation it's not currently possible to run the frontend
    natively and the backend in docker at the same time, or vice versa.
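
    For example, to work on both parts at once in native mode (assuming the
    one-time setup above is done):

    ```
    # terminal 1
    ./dev.sh frontend

    # terminal 2
    ./dev.sh backend
    ```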
    
    #### Known issues
    
    * Running the dashboard backend locally with telepresence in docker mode
      currently doesn't work because of dns resolution issues in the docker
      container: https://github.com/telepresenceio/telepresence/issues/1492 . We
      could work around this by using a fully qualified domain name for the
      database service -- which doesn't agree with the goal of making the stackspin
      namespace variable -- or using the service env vars, but we're hoping that
      telepresence will fix this in time.
    * Telepresence intercepts traffic to a pod, but the original pod is still
      running. In the case of the backend this is sometimes problematic: for
      example, when you add database migrations, the original pod doesn't know
      about them and crashes; or with SCIM, where timer-based actions are
      performed both by your modified local instance and by the original remote
      one. There is some work in progress to allow scaling down the intercepted
      pod: https://github.com/telepresenceio/telepresence/issues/1608 .
    * If telepresence is giving errors, in particular ones about "an intercept with
      the same name already existing" on repeated runs, it may help to reset the
      telepresence state by doing `./dev.sh reset`. This will stop the local
      telepresence daemon so it can be cleanly restarted on the next try, and will
      also restart the "traffic manager" on the remote so it will discard any old
      lingering intercepts.
    
    
    ## Testing as a part of Stackspin
    
    Sometimes you may want to make more fundamental changes to the dashboard that
    might behave differently in the local development environment compared to a
    regular Stackspin instance, i.e., one that's not a local/cluster hybrid. In
    this case, you'll want to run your new version in a regular Stackspin cluster.
    
    
    To do that:
    * Push your work to an MR.
    * Set the image tags in `values.yaml` to the ones created for your branch;
      if unsure, check the available tags in the Gitlab container registry for
      the dashboard project.
    * Make sure to increase the chart version number in `Chart.yaml`, preferably
      with a suffix to denote that it's not a stable version. For example, if the
      last stable release is 1.2.3, make the version 1.2.4-myawesomefeature in your
      branch.
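
    Continuing that example, and assuming the chart lives at
    `deployment/helmchart/` as in the release process below, the version bump
    could be done like this:

    ```
    sed -i 's/^version: .*/version: 1.2.4-myawesomefeature/' deployment/helmchart/Chart.yaml
    ```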
    
    The CI pipeline should then publish your new chart version in the Gitlab helm
    chart repo for the dashboard project, but in the `unstable` channel -- the
    `stable` channel is reserved for chart versions that have been merged to the
    `main` branch.
    
    
    Once your package is published, use it by
    
    1. changing the `spec.url` field of the `flux-system/dashboard`
       `HelmRepository` object in the cluster where you want to run this, replacing
       `stable` by `unstable`; and
    2. changing the `spec.chart.spec.version` field of the `stackspin/dashboard`
       `HelmRelease` to your chart version (the one from this chart's `Chart.yaml`).
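
    Assuming `kubectl` access to a flux-managed cluster, both changes can be
    made interactively (object names as given above):

    ```
    kubectl -n flux-system edit helmrepository dashboard  # spec.url: stable -> unstable
    kubectl -n stackspin edit helmrelease dashboard       # spec.chart.spec.version: <your version>
    ```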
    
    
    ## Release process
    
    To publish a new version of the helm chart:
    
    1. Update the docker image tag in `deployment/helmchart/values.yaml` so it
       uses the new tag (to be created in a later step of this release).
    2. Update the appVersion in `deployment/helmchart/Chart.yaml` to match that new tag version.
    3. Increase the chart version in `deployment/helmchart/Chart.yaml`.
    
    4. Update `CHANGELOG.md` and/or `deployment/helmchart/CHANGELOG.md` and check
       that it includes relevant changes, including ones added by renovatebot.
    
    5. Commit and push these changes to `main`.
    
    6. Create a new git tag for the new release and push it to gitlab as well.
    
    
    The last step will trigger a CI run that will package and publish the helm chart.
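
    For example, for chart version 1.2.4 (assuming the git tag name matches the
    chart version; check existing tags if unsure):

    ```
    git tag 1.2.4
    git push origin 1.2.4
    ```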