Commit Graph

189 Commits (5e695e8f9b7496e6b6cb958fde6f2687af17d37b)

Author SHA1 Message Date
q3k 3ba5c1b591 *: docs pass
Change-Id: I87ca80d3f7728ed407071468ac233e6ad4574929
2021-03-06 22:21:28 +00:00
q3k bc0d3cb227 hackdoc: link to cs instead of gitweb
Change-Id: Ifca7a63517bceffe7ccc0452474d9d16626486de
2021-03-06 22:16:54 +00:00
q3k 0d26fc9780 cluster: disable nginx/acme
These are unused.

Change-Id: I2a428dabd0a27c060c595f5e0843d7d8d8e26dcd
2021-02-15 22:14:41 +01:00
q3k 765e369255 cluster: replace docker with containerd
This removes Docker and docker-shim from our production kubernetes, and
moves over to containerd/CRI. Docker support within Kubernetes was
always slightly shitty, and with 1.20 the integration was dropped
entirely. CRI/Containerd/runc is pretty much the new standard.

Change-Id: I98c89d5433f221b5fe766fcbef261fd72db530fe
2021-02-15 22:14:15 +01:00
q3k 4b613303b1 RFC: *: move away from rules_nixpkgs
This is an attempt to see how well we do without rules_nixpkgs.

rules_nixpkgs has the following problems:

 - complicates our build system significantly (generated external
   repository indirection for picking local/nix python and go)
 - creates builds that cannot run on production (as they are tainted by
   /nix/store libraries)
 - is not a full solution to the bazel hermeticity problem anyway, and
   we'll have to tackle that some other way (eg. by introducing proper
   C++ cross-compilation toolchains and building everything from C,
   including Python and Go)

Instead of rules_nixpkgs, we ship a shell.nix file, so NixOS users can
just:

  jane@hacker:~/hscloud $ nix-shell
  hscloud-build-chrootenv:jane@hacker:~/hscloud$ prodaccess

This shell.nix is in a way nicer, as it immediately gives you all tools
needed to access production straight away.

Change-Id: Ieceb5ae0fb4d32e87301e5c99416379cedc900c5
2021-02-15 22:11:35 +01:00
q3k 4842705406 cluster/nix: integrate with readtree
This unifies nixpkgs with the one defined in //default.nix and makes it
possible to use readTree to build the provisioners:

   nix-build -A cluster.nix.provision

   result/bin/provision

Change-Id: I68dd70b9c8869c7c0b59f5007981eac03667b862
2021-02-14 14:46:07 +00:00
q3k 225a5c7ee9 nixpkgs: bump
Fixes b/3.

Change-Id: I2f734422cdad00f78956477815c4aea645c6c49e
2021-02-14 14:43:07 +00:00
q3k 78d6f11cb2 Merge "cluster/admitomatic: allow whitelist-source-range" 2021-02-08 17:21:59 +00:00
q3k 877cf0af26 🅱️
Fixes b/8

Change-Id: I5a5779c3688451d89c0601dc913143d75048c9f6
2021-02-08 15:10:11 +00:00
q3k 943ab5b1a6 cluster/admitomatic: allow whitelist-source-range
Without this, cert-manager get stuck.

Deployed to prod.

Change-Id: I356cd44f455b6f4aecea9ae396f6a05e1a727859
2021-02-07 23:35:28 +00:00
q3k f40c9249ce cluster/kube: allow system:admin-namespaces to modify ingresses
This will permit any binding to system:admin-namespaces (eg. personal-*
namespaces, per-namespace extra admin access like matrix-0x3c) the
ability to create and updates ingresses.

Change-Id: I522896ebe290fe982d6fe46b7b1d604d22b4f72c
2021-02-07 19:24:43 +00:00
q3k 41bbf1436a cluster/kube: deploy admitomatic webhook
This has been (succesfully) tested on prod and then rolled back.

Change-Id: I22657f66b4aeaa8a0ae452035ba18a79f4549b14
2021-02-07 19:19:23 +00:00
q3k 3c5d836c56 cluster/kube: deploy admitomatic
This doesn't yet enable a webhook, but deploys admitomatic itself.

Change-Id: Id177bc8841c873031f9c196b8ff3c12dd846ba8e
2021-02-07 19:19:02 +00:00
q3k 3ab5f07c64 cluster/admitomatic: build docker image
Change-Id: I086a8b17a4dc7257de1bae3a6f0c95400af7e115
2021-02-07 19:18:53 +00:00
q3k c80321d17e Merge "cluster: add admitomatic CA/certificate" 2021-02-06 23:18:59 +00:00
q3k 04604b2aae cluster: add admitomatic CA/certificate
Change-Id: Idb32dc38b897aa266b6d2d6fd57a5e38b47db7fc
2021-02-06 17:18:58 +00:00
informatic f4a6a56662 cluster/kube/k0: add issues.hackerspace.pl crdb user
Change-Id: If78f795e0e35360b65c666e6b217037fc34a2ccf
2021-02-01 21:32:25 +01:00
informatic 3b8a43f35d cluster/kube/k0: add issues.hackerspace.pl ceph s3 user
Change-Id: If5eef3404bdc08ded88e46f45bad0f9abcdb0f1c
2021-02-01 21:19:59 +01:00
q3k c6118649ab cluster/admitomatic: finish up service
This turns admitomatic into a self-standing service that can be used as
an admission controller.

I've tested this E2E on a local k3s server, and have some early test
code for that - but that'll land up in a follow up CR, as it first needs
to be cleaned up.

Change-Id: I46da0fc49f9d1a3a1a96700a36deb82e5057249b
2021-01-31 12:18:16 +01:00
q3k 5d2c8fcda0 cluster/admitomatic: finish up ingress admission logic
This gives us nearly everything required to run the admission
controller. In addition to checking for allowed domains, we also do some
nginx-inress-controller security checks.

Change-Id: Ib187de6d2c06c58bd8c320503d4f850df2ec8abd
2021-01-31 12:18:16 +01:00
q3k 649565324b cluster/admitomatic: implement basic dns/ns filtering
This is the beginning of a validating admission controller which we will
use to permit end-users access to manage Ingresses.

This first pass implements an ingressFilter, which is the main structure
through which allowed namespace/dns combinations will be allowed. The
interface is currently via a test, but in the future this will likely be
configured via a command line, or via a serialized protobuf config.

Change-Id: I22dbed633ea8d8e1fa02c2a1598f37f02ea1b309
2021-01-30 19:19:35 +01:00
patryk edf14cc5f4 crdb: replace bc01n03 with dcr01s22, upgrade to v20.2.4
This change reflects the current production state.

Upgrade was done by going through following versions:
19.1.0 -> 19.2.12 -> 20.1.10 -> 20.2.4

Change-Id: I8b33b8116363f1a918423fd18ba3d1b5c910851c
2021-01-23 23:00:29 +01:00
patryk f3153888a8 cluster/kube: Add k0-cockroach.jsonnet, add Gitea client cert
Change-Id: Ibc5db1b0114b2540b6dc806e75e9a36cf9a3bc50
2021-01-23 15:38:50 +01:00
q3k 61f978a0a0 *: tear down ceph-waw2
It reached the stage of being crapped out so much that the OSDs spurious
IOPS killed the performance of disks colocated on the same M610 RAID
controllers. This made etcd _very_ slow, to the point of churning
through re-elections due to timeouts.

etcd/apiserver latencies, observe the difference at ~15:38:

https://object.ceph-waw3.hswaw.net/q3k-personal/4fbe8d4cfc8193cad307d487371b4e44358b931a7494aa88aff50b13fae9983c.png

I moved gerrit/* and matrix/appservice-irc-freenode PVCs to ceph-waw3 by
hand. The rest were non-critical so I removed them, they can be
recovered from benji backups if needed.

Change-Id: Iffbe87aefc06d8324a82b958a579143b7dd9914c
2021-01-22 16:26:09 +01:00
q3k 3b9ee5f1c0 ceph: bump to 14.2.16
More as-builts. This has already been bumped. Had to coax ceph-waw2 to
upgrade despite the fact that it's horribly broken.

Change-Id: Ia762f5d7d88d6420c2fc25cf199037cbccde0cb3
2021-01-19 21:45:26 +00:00
q3k 2c04c8410a rook: bump to 1.2.7
As-built: deployed to ceph-waw{2,3} already.

Change-Id: I27189b273cf72638cf2036681054832db99591da
2021-01-19 21:41:13 +01:00
q3k f684535c6e k0: remove bc01n03 from nix defs
This only affects ETCD_INITIAL_* env vars, so is is effectively a no-op.

Deployed to prod.

Change-Id: Ic9118e17b088d1b58ebaf1ac0708a1ee6fcf2c06
2021-01-19 20:20:33 +01:00
q3k cf842b0442 k0: reflect reality
This is after the monster^Wrook outage of the week two weeks ago caused
by bc01n03 dying.

Plan is to migrate ceph-waw3 to be external, yeet ceph-waw2, and extend
crdb-waw1 to another node.

Change-Id: I133af3b1171fea383b45bf06c51e48a5c40341e4
2021-01-19 20:08:26 +01:00
q3k 9708ba02ec Merge "cluster: use static addresses" 2020-12-15 18:53:54 +00:00
q3k acdd665b08 cluster: use static addresses
This disables DHCP on all k0 nodes. This change has been tentatively
deployed to bc01n01 (which is cordoned off in kube), and I will deploy
it to the rest of k0 machines once merged.

Change-Id: I96253a9d0acedb4512c877c64174992ffdb43d58
2020-12-14 19:10:52 +01:00
patryk cae7cf776f k0: add missing curly brace termination in woju's S3 user name
Change-Id: Ib2752d798f6e23493daee446a834e244f858330e
2020-11-28 14:36:48 +01:00
patryk 34668a5b7b k0: add cz3's personal s3 user
Change-Id: I51ee80eb05c34cfd8b03e15fcaefb5f235587c50
2020-11-28 13:45:25 +01:00
q3k f18a531f9b prodvider: bump to Go 1.15.5
Change-Id: I0f7999deb571aef12533f0ceee21c0283bc0bdc4
2020-11-27 09:50:09 +00:00
q3k 0754ed86a2 prodvider: fix build after k8s update, add to CI presubmit
Change-Id: I5a3794541853abd1fb16e67e285edfa29c2f5cf7
2020-11-27 09:43:47 +00:00
q3k e00fe3a448 cluster/tools/kartongips: skip tests broken by fork
These tests are broken as they depend on some test data that we
currently don't have in hscloud. They should be fixed ASAP.

Change-Id: I2571c2958cb84e145a7e3a44171685ecf43cf499
2020-11-12 00:45:15 +01:00
q3k 640336144d cluster/tools: integrate kartongips as main kubecfg tool
Change-Id: If6a6c8e9c9163f0fc25adcaa8680857fdca69cd3
2020-11-12 00:40:08 +01:00
q3k be538db63b cluster/tools/kartongips: init
This forks bitnami/kubecfg into kartongips. The rationale is that we
want to implement hscloud-specific functionality that wouldn't really be
upstreamable into kubecfg (like secret support, mulit-cluster support).

We forked off from github.com/q3k/kubecfg at commit b6817a94492c561ed61a44eeea2d92dcf2e6b8c0.

Change-Id: If5ba513905e0a86f971576fe7061a471c1d8b398
2020-11-12 00:39:34 +01:00
q3k bfe9bb0e3a k0: add woju's personal s3 user
Change-Id: I8ed5bb5428594b74460f1b89185d684cb6c26268
2020-10-27 20:50:50 +01:00
q3k e77f7717d4 k0: bump to 1.16.5
Change-Id: I548808ce4e0deb0513a1e00963f383d84b9d920c
2020-10-10 22:39:50 +02:00
q3k 1257389d3d k0: expose controller-manager and scheduler metrics
We want to be able to scrape controller-manager and scheduler metrics
into Prometheus. For that, each of them needs to:

 1) listen on a secure port
 2) have authn enabled

With this, any k8s user with the right permissions (and a bearer token
or TLS certificate) can come in and access metrics over a node's public
IP address. Access without a certificate/token gets thrown into the
system:anonymous user, which as no access to any API.

Change-Id: I267680f92f748ba63b6762e6aaba3c417446e50b
2020-10-10 16:00:15 +00:00
q3k 36224c617a clustercfg: show diff before switching to new configuration
This is mildly hacky, but lets us be more informed before we switch to a
new configuration.

Change-Id: I008f3f698db702f1e0992bd41a8d1050449d59b5
2020-10-10 16:00:11 +00:00
q3k 2e001e5046 k0: bump to 1.15.4
This notably fixes the annoying loopback issues that prevented hosts
from accessing externalip services with externalTrafficPolicy: local
from nodes that weren't running the service.

Which means, hopefuly, no more registry pull failures when
nginx-ingress gets misplaced!

Change-Id: Id4923fd0fce2e28c31a1e65518b0e984165ca9ec
2020-10-03 16:32:38 +00:00
q3k 2a223705fd cluster: bump certs
This has been deployed to k0 nodes.

Current state of cluster certificates:

cluster/certs/ca-etcd.crt
            Not After : Apr  4 17:59:00 2024 GMT
cluster/certs/ca-etcdpeer.crt
            Not After : Apr  4 17:59:00 2024 GMT
cluster/certs/ca-kube.crt
            Not After : Apr  4 17:59:00 2024 GMT
cluster/certs/ca-kubefront.crt
            Not After : Apr  4 17:59:00 2024 GMT
cluster/certs/ca-kube-prodvider.cert
            Not After : Sep  1 21:30:00 2021 GMT
cluster/certs/etcd-bc01n01.hswaw.net.cert
            Not After : Mar 28 15:53:00 2021 GMT
cluster/certs/etcd-bc01n02.hswaw.net.cert
            Not After : Mar 28 16:45:00 2021 GMT
cluster/certs/etcd-bc01n03.hswaw.net.cert
            Not After : Mar 28 15:15:00 2021 GMT
cluster/certs/etcd-calico.cert
            Not After : Mar 28 15:15:00 2021 GMT
cluster/certs/etcd-dcr01s22.hswaw.net.cert
            Not After : Oct  3 15:33:00 2021 GMT
cluster/certs/etcd-dcr01s24.hswaw.net.cert
            Not After : Oct  3 15:38:00 2021 GMT
cluster/certs/etcd-kube.cert
            Not After : Mar 28 15:15:00 2021 GMT
cluster/certs/etcdpeer-bc01n01.hswaw.net.cert
            Not After : Mar 28 15:53:00 2021 GMT
cluster/certs/etcdpeer-bc01n02.hswaw.net.cert
            Not After : Mar 28 16:45:00 2021 GMT
cluster/certs/etcdpeer-bc01n03.hswaw.net.cert
            Not After : Mar 28 15:15:00 2021 GMT
cluster/certs/etcdpeer-dcr01s22.hswaw.net.cert
            Not After : Oct  3 15:33:00 2021 GMT
cluster/certs/etcdpeer-dcr01s24.hswaw.net.cert
            Not After : Oct  3 15:38:00 2021 GMT
cluster/certs/etcd-root.cert
            Not After : Mar 28 15:15:00 2021 GMT
cluster/certs/kube-apiserver.cert
            Not After : Oct  3 15:26:00 2021 GMT
cluster/certs/kube-controllermanager.cert
            Not After : Mar 28 15:15:00 2021 GMT
cluster/certs/kubefront-apiserver.cert
            Not After : Mar 28 15:15:00 2021 GMT
cluster/certs/kube-kubelet-bc01n01.hswaw.net.cert
            Not After : Mar 28 15:53:00 2021 GMT
cluster/certs/kube-kubelet-bc01n02.hswaw.net.cert
            Not After : Mar 28 16:45:00 2021 GMT
cluster/certs/kube-kubelet-bc01n03.hswaw.net.cert
            Not After : Mar 28 15:15:00 2021 GMT
cluster/certs/kube-kubelet-dcr01s22.hswaw.net.cert
            Not After : Oct  3 15:33:00 2021 GMT
cluster/certs/kube-kubelet-dcr01s24.hswaw.net.cert
            Not After : Oct  3 15:38:00 2021 GMT
cluster/certs/kube-proxy.cert
            Not After : Mar 28 15:15:00 2021 GMT
cluster/certs/kube-scheduler.cert
            Not After : Mar 28 15:15:00 2021 GMT
cluster/certs/kube-serviceaccounts.cert
            Not After : Mar 28 15:15:00 2021 GMT

Change-Id: I94030ce78c10f7e9a0c0257d55145ef629195314
2020-10-03 16:32:32 +00:00
q3k fbe234bdb2 cluster: rename module-* into modules/*
Change-Id: I65e06f3e9cec2ba0071259eb755eddbbd1025b97
2020-10-03 14:57:30 +00:00
q3k c7de7e562f cluster: do not export metallb routes to mesh peers
This prevents metallb routes being announced from all peers to our ToR,
thereby preventing issues with traffic hitting services with
externalTrafficPolicy: local.

There still is the from-host loopback issue, but that will be fixed by
upgrading to kube 1.15.

Change-Id: Ifc9964b46840aee82d99f0b6550188550e46fe04
2020-10-03 14:56:52 +00:00
q3k f0acf16564 prodvider: use SANs in service certificates
This fixes compatibility with prodaccess tools built with Go 1.15, which
introduced 'X.509 CommonName deprecation' [1].

[1] - https://golang.org/doc/go1.15#commonname

Change-Id: I228cde3e5651a3e36f527783f2ccb4a2f6b7a8e3
2020-10-03 14:56:35 +00:00
q3k 44628f2b9e Merge "k0.hswaw.net: pass metallb through Calico" 2020-10-02 22:54:57 +00:00
q3k e7fca3acd8 ci_presubmit: init
This will be, at some point, a script to run on Gerrit presubmit (ie.
right before merge).

For now, you can manually run it to ensure that Everything At Least
Kinda Works.

Change-Id: I28b305fa81a4ca4a8e94ce4daa06fe9ae0184fe8
2020-09-25 21:15:07 +00:00
q3k a5ed644980 k0.hswaw.net: pass metallb through Calico
Previously, we had the following setup:

                          .-----------.
                          | .....     |
                        .-----------.-|
                        | dcr01s24  | |
                      .-----------.-| |
                      | dcr01s22  | | |
                  .---|-----------| |-'
    .--------.    |   |---------. | |
    | dcsw01 | <----- | metallb | |-'
    '--------'        |---------' |
                      '-----------'

Ie., each metallb on each node directly talked to dcsw01 over BGP to
announce ExternalIPs to our L3 fabric.

Now, we rejigger the configuration to instead have Calico's BIRD
instances talk BGP to dcsw01, and have metallb talk locally to Calico.

                      .-------------------------.
                      | dcr01s24                |
                      |-------------------------|
    .--------.        |---------.   .---------. |
    | dcsw01 | <----- | Calico  |<--| metallb | |
    '--------'        |---------'   '---------' |
                      '-------------------------'

This makes Calico announce our pod/service networks into our L3 fabric!

Calico and metallb talk to eachother over 127.0.0.1 (they both run with
Host Networking), but that requires one side to flip to pasive mode. We
chose to do that with Calico, by overriding its BIRD config and
special-casing any 127.0.0.1 peer to enable passive mode.

We also override Calico's Other Bird Template (bird_ipam.cfg) to fiddle
with the kernel programming filter (ie. to-kernel-routing-table filter),
where we disable programming unreachable routes. This is because routes
coming from metallb have their next-hop set to 127.0.0.1, which makes
bird mark them as unreachable. Unreachable routes in the kernel will
break local access to ExternalIPs, eg. register access from containerd.

All routes pass through without route reflectors and a full mesh as we
use eBGP over private ASNs in our fabric.

We also have to make Calico aware of metallb pools - otherwise, routes
announced by metallb end up being filtered by Calico.

This is all mildly hacky. Here's hoping that Calico will be able to some
day gain metallb-like functionality, ie. IPAM for
externalIPs/LoadBalancers/...

There seems to be however one problem with this change (but I'm not
fixing it yet as it's not critical): metallb would previously only
announce IPs from nodes that were serving that service. Now, however,
the Calico internal mesh makes those appear from every node. This can
probably be fixed by disabling local meshing, enabling route reflection
on dcsw01 (to recreate the mesh routing through dcsw01). Or, maybe by
some more hacking of the Calico BIRD config :/.

Change-Id: I3df1f6ae7fa1911dd53956ced3b073581ef0e836
2020-09-23 18:55:12 +00:00
q3k 059fdfed3b k0: add resource requests/limits to nginx, remove gitea
We just had an outage seemingly caused by N-I-C sendings tons of traffic
to gitea, which in turn caused N-I-C to balloon in memory/CPU usage.

I haven't debugged the cause of this traffic, but I have disabled the
gitea TCP forward to Stop The Bleeding.

This change reflects ad-hoc production changes.

Change-Id: I37e11609f408fa3e3fbfafafba44dc83149b90a9
2020-09-20 22:53:40 +00:00