hscloud/cluster
Sergiusz Bazanski 09a0f06d2a cluster/kube/lib/nginx: use Local traffic policy
Diff against prod:

  - live services nginx-system.ingress-nginx
  + config services nginx-system.ingress-nginx
    {
      "apiVersion": "v1",
      "kind": "Service",
      "metadata": {
        "annotations": {},
        "labels": {
          "app.kubernetes.io/name": "ingress-nginx",
          "app.kubernetes.io/part-of": "ingress-nginx"
        },
        "name": "ingress-nginx",
        "namespace": "nginx-system"
      },
      "spec": {
  -     "externalTrafficPolicy": "Cluster",
  +     "externalTrafficPolicy": "Local",
        "ports": [
          {
            "name": "ssh",
            "port": 22,
            "protocol": "TCP",
            "targetPort": 22
          },
          {
            "name": "http",
            "port": 80,
            "protocol": "TCP",
            "targetPort": 80
          },
          {
            "name": "https",
            "port": 443,
            "protocol": "TCP",
            "targetPort": 443
          }
        ],
        "selector": {
          "app.kubernetes.io/name": "ingress-nginx",
          "app.kubernetes.io/part-of": "ingress-nginx"
        },
        "type": "LoadBalancer"
      }
    }

Change-Id: I0dd66e3f1643efa975d6180cc163a265d4b484ef
2019-06-29 22:44:53 +02:00
certs *: rejigger tls certs and more 2019-04-07 00:06:23 +02:00
clustercfg cluster/clustercfg: add clustercfg-nocerts 2019-06-20 16:11:38 +02:00
kube cluster/kube/lib/nginx: use Local traffic policy 2019-06-29 22:44:53 +02:00
secrets secretstore: add implr 2019-05-18 00:15:25 +02:00
README nix/cluster-configuration: fix CNI plugins being deleted on kubelet restart 2019-06-20 12:51:51 +02:00

HSCloud Clusters
================

Current cluster: `k0.hswaw.net`

Accessing via kubectl
---------------------

There isn't yet a service for getting short-term user certificates. Instead, you'll have to get admin certificates:

    bazel run //cluster/clustercfg:clustercfg admincreds $(whoami)-admin
    kubectl get nodes

Provisioning nodes
------------------

 - bring up a new node with NixOS, running the `configuration.nix` from bootstrap (to be documented)
 - `bazel run //cluster/clustercfg:clustercfg nodestrap bc01nXX.hswaw.net`

That's it!

Ceph
====

We run Ceph via Rook. The Rook operator is running in the `ceph-rook-system` namespace. To debug Ceph issues, start by looking at its logs.

The following Ceph clusters are available:

ceph-waw1
---------

HDDs on bc01n0{1-3}. 3TB total capacity.

The following storage classes use this cluster:

 - `waw-hdd-redundant-1` - erasure coded 2.1
 - `waw-hdd-yolo-1` - unreplicated (you _will_ lose your data)
 - `waw-hdd-redundant-1-object` - erasure coded 2.1 object store
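
As a minimal sketch of how a workload might request a volume from one of the block storage classes above, the helper below prints a PersistentVolumeClaim manifest. The function name `pvc_manifest`, the claim name, the namespace and the size are all hypothetical placeholders; only the storage class name comes from the list above. The `-object` class is an object store and is not meant to be used this way.

```shell
# Hypothetical helper (not part of this repository): print a
# PersistentVolumeClaim manifest for the given storage class.
# Arguments: claim name, namespace, storage class, requested size.
pvc_manifest() {
    name="$1"; namespace="$2"; storage_class="$3"; size="$4"
    cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
  namespace: ${namespace}
spec:
  storageClassName: ${storage_class}
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: ${size}
EOF
}

# Example (placeholder claim name, namespace and size):
#   pvc_manifest example-data default waw-hdd-redundant-1 1Gi | kubectl apply -f -
```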

A dashboard is available at https://ceph-waw1.hswaw.net/. To get the admin password, run:

    kubectl -n ceph-waw1 get secret rook-ceph-dashboard-password -o yaml | grep "password:" | awk '{print $2}' | base64 --decode ; echo
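
An alternative sketch that avoids the `grep`/`awk` step by asking kubectl for the field directly through its `-o jsonpath` output. The wrapper name `secret_value` is hypothetical, and the key name `password` is assumed from the `grep "password:"` in the command above.

```shell
# Hypothetical helper: print the decoded value of one key of a Secret.
# Arguments: namespace, secret name, key within .data
secret_value() {
    namespace="$1"; secret="$2"; key="$3"
    kubectl -n "$namespace" get secret "$secret" -o "jsonpath={.data.$key}" \
        | base64 --decode
    echo  # the decoded value has no trailing newline
}

# Example:
#   secret_value ceph-waw1 rook-ceph-dashboard-password password
```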

Rados Gateway (S3) is available at https://object.ceph-waw1.hswaw.net/. To create
an object store user, consult the rook.io manual (https://rook.io/docs/rook/v0.9/ceph-object-store-user-crd.html).
The user's authentication secret is generated in the Ceph cluster namespace (`ceph-waw1`),
so it may need to be copied manually into the application namespace (see the comment in
`app/registry/prod.jsonnet`).

`tools/rook-s3cmd-config` can be used to generate a test configuration file for s3cmd.
Remember to append `:default-placement` to your region name (i.e. `waw-hdd-redundant-1-object:default-placement`).