1
0
Fork 0
Commit Graph

10 Commits (523df5c2353bec74f6c4be74713a02e170a4f079)

Author SHA1 Message Date
q3k 18084c1e86 cluster/nix: k0: enable rgw on osds
This enables radosgw wherever osds are. This should be fast and works
for us because we have little osd hosts.

Change-Id: I4ed014d2790d6c02a2ba8e775aaa1846032dee1e
2021-09-14 21:39:39 +02:00
q3k 05c4b5515b cluster/nix: symlink /sbin/lvm
This is needed by the new Rook OSD daemons.

Change-Id: I16eb24332db40a8209e7eb9747a81fa852e5cad9
2021-09-11 20:45:45 +00:00
q3k 9848e7e15f cluster: deploy NixOS-based ceph
First pass at a non-rook-managed Ceph cluster. We call it k0 instead of
ceph-waw4, as we pretty much are sure now that we will always have a
one-kube-cluster-to-one-ceph-cluster correspondence, with different Ceph
pools for different media kinds (if at all).

For now this has one mon and spinning rust OSDs. This can be iterated on
to make it less terrible with time.

See b/6 for more details.

Change-Id: Ie502a232c700af93f33fcad9fa1c57058161aa11
2021-09-11 20:33:24 +00:00
q3k 0d26fc9780 cluster: disable nginx/acme
These are unused.

Change-Id: I2a428dabd0a27c060c595f5e0843d7d8d8e26dcd
2021-02-15 22:14:41 +01:00
q3k 765e369255 cluster: replace docker with containerd
This removes Docker and docker-shim from our production kubernetes, and
moves over to containerd/CRI. Docker support within Kubernetes was
always slightly shitty, and with 1.20 the integration was dropped
entirely. CRI/Containerd/runc is pretty much the new standard.

Change-Id: I98c89d5433f221b5fe766fcbef261fd72db530fe
2021-02-15 22:14:15 +01:00
q3k acdd665b08 cluster: use static addresses
This disables DHCP on all k0 nodes. This change has been tentatively
deployed to bc01n01 (which is cordoned off in kube), and I will deploy
it to the rest of k0 machines once merged.

Change-Id: I96253a9d0acedb4512c877c64174992ffdb43d58
2020-12-14 19:10:52 +01:00
q3k e77f7717d4 k0: bump to 1.16.5
Change-Id: I548808ce4e0deb0513a1e00963f383d84b9d920c
2020-10-10 22:39:50 +02:00
q3k 1257389d3d k0: expose controller-manager and scheduler metrics
We want to be able to scrape controller-manager and scheduler metrics
into Prometheus. For that, each of them needs to:

 1) listen on a secure port
 2) have authn enabled

With this, any k8s user with the right permissions (and a bearer token
or TLS certificate) can come in and access metrics over a node's public
IP address. Access without a certificate/token gets thrown into the
system:anonymous user, which as no access to any API.

Change-Id: I267680f92f748ba63b6762e6aaba3c417446e50b
2020-10-10 16:00:15 +00:00
q3k 2e001e5046 k0: bump to 1.15.4
This notably fixes the annoying loopback issues that prevented hosts
from accessing externalip services with externalTrafficPolicy: local
from nodes that weren't running the service.

Which means, hopefuly, no more registry pull failures when
nginx-ingress gets misplaced!

Change-Id: Id4923fd0fce2e28c31a1e65518b0e984165ca9ec
2020-10-03 16:32:38 +00:00
q3k fbe234bdb2 cluster: rename module-* into modules/*
Change-Id: I65e06f3e9cec2ba0071259eb755eddbbd1025b97
2020-10-03 14:57:30 +00:00