Commit Graph

1234 Commits (924d0035fdb64b7652f534d05ee67867b764c4d3)

Author SHA1 Message Date
informatic 0e6c6720d9 Merge "app/matrix/matrix.hackerspace.pl: pin synapse media-worker container version" 2021-09-14 20:58:53 +00:00
informatic e839f95079 cluster/kube/k0: add matrix and informatic personal ceph users
Change-Id: Ied8d474709b8053e9fc339435d3ca1ca5fdfa710
2021-09-14 22:21:22 +02:00
informatic 2e191eae7b app/matrix/matrix.hackerspace.pl: pin synapse media-worker container version
We keep this pinned to an older version to prevent unneeded media container
restarts.

Change-Id: I221237d3f88720779572fd972e8ada65e829864d
2021-09-14 22:19:44 +02:00
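
(For illustration, such a pin can look roughly like the jsonnet sketch below.
The import path, the cfg/images structure and the tag are hypothetical, not
taken from the actual app/matrix config.)

  // Hypothetical override: keep the media worker on a known-good image tag
  // so that unrelated redeploys don't restart the media container.
  local matrix = import "matrix.libsonnet";

  matrix {
      cfg+: {
          images+: {
              // pinned explicitly; the rest of the stack may track a newer tag
              mediaWorker: "matrixdotorg/synapse:v1.37.1",  // tag is illustrative
          },
      },
  }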
informatic dcb131fdc2 Merge "app/matrix: appservice-irc v0.29.0 upgrade" 2021-09-14 20:19:15 +00:00
informatic 91faf5bc3d Merge "shell.nix: add missing gnupg" 2021-09-14 20:19:07 +00:00
q3k 719ab840c5 Merge changes Ia92c99e1,I4dca55a7,I4ed014d2,I96c3c18b,I08e70425
* changes:
  cluster/kube: always enable flexdriver
  cluster: k0: move ceph-waw3 to proper realm/zonegroup
  cluster/nix: k0: enable rgw on osds
  cluster: k0: upgrade to ceph 16.2.5
  cluster: k0: bump rook to 1.6
2021-09-14 19:53:17 +00:00
q3k 4b8ee32246 cluster/kube: always enable flexdriver
Documentation says [1] this is disabled by default in 1.1, but that
documentation kinda lies [2].

[1] - 235d5a384b/Documentation/flexvolume.md (ceph-flexvolume-configuration)

[2] - 64e28af741 (diff-d1eb5cba50e3770b61ccd3c730cd40514053e1da0233dfe09b5e7967e76a2a6cL424-L425)

Change-Id: Ia92c99e137ed751db62c0f56d42c4901986d0bb8
2021-09-14 21:39:39 +02:00
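
(For context: in Rook 1.x the flex driver is toggled with the operator's
ROOK_ENABLE_FLEX_DRIVER environment variable. A rough jsonnet sketch of
forcing it on follows; the surrounding operator/deployment field names are
illustrative, not the actual hscloud rook library.)

  // Illustrative mixin: set ROOK_ENABLE_FLEX_DRIVER=true on every container
  // of the Rook operator deployment, whatever the documented default is.
  {
      operator+: {
          deployment+: {
              spec+: { template+: { spec+: {
                  containers: [
                      c { env+: [{ name: "ROOK_ENABLE_FLEX_DRIVER", value: "true" }] }
                      for c in super.containers
                  ],
              } } },
          },
      },
  }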
q3k 38f72fe094 cluster: k0: move ceph-waw3 to proper realm/zonegroup
With this we can use Ceph's multi-site support to easily migrate to our
new k0 Ceph cluster.

This migration was done by using radosgw-admin to rename the existing
realm/zonegroup to the new names (hscloud and eu), and then reworking
the jsonnet so that the Rook operator would effectively do nothing.

It sounds weird that creating a bunch of CRs like
Object{Realm,ZoneGroup,Zone} would be a no-op for the operator,
but that's how Rook works - a CephObjectStore generally creates
everything that the above CRs would create too, but implicitly. Adding
the extra CRs just allows specifying extra settings, like names.

(it wasn't fully a no-op, as the rgw daemon is parametrized by
realm/zonegroup/zone names, so that had to be restarted)

We also make the radosgw serve under object.ceph-eu.hswaw.net, which
allows us to right away start using a zonegroup URL instead of the
zone-only URL.

Change-Id: I4dca55a705edb3bd28e54f50982c85720a17b877
2021-09-14 21:39:39 +02:00
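
(For illustration, the CRs in question look roughly like the sketch below.
The realm/zonegroup names are the ones from this message; the namespace and
zone name are guesses.)

  // Rough shape of the explicit multi-site objects. A CephObjectStore alone
  // creates all of this implicitly; the explicit CRs exist to pick the names.
  local ns = "ceph-waw3";  // assumed namespace

  {
      realm: {
          apiVersion: "ceph.rook.io/v1",
          kind: "CephObjectRealm",
          metadata: { name: "hscloud", namespace: ns },
      },
      zonegroup: {
          apiVersion: "ceph.rook.io/v1",
          kind: "CephObjectZoneGroup",
          metadata: { name: "eu", namespace: ns },
          spec: { realm: "hscloud" },
      },
      zone: {
          apiVersion: "ceph.rook.io/v1",
          kind: "CephObjectZone",
          metadata: { name: "waw3", namespace: ns },  // zone name is a guess
          // the real object also carries metadataPool/dataPool specs, elided here
          spec: { zoneGroup: "eu" },
      },
  }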
q3k 18084c1e86 cluster/nix: k0: enable rgw on osds
This enables radosgw wherever OSDs are. This should be fast and works
for us because we have few OSD hosts.

Change-Id: I4ed014d2790d6c02a2ba8e775aaa1846032dee1e
2021-09-14 21:39:39 +02:00
q3k 085a8ff247 cluster: k0: upgrade to ceph 16.2.5
This was fun. See b/6 for a log of how swimmingly this went.

Change-Id: I96c3c18b5d33ef86523b3506f49a390419e9ca7f
2021-09-14 21:39:39 +02:00
q3k 464fb04f39 cluster: k0: bump rook to 1.6
This is needed to get Rook to talk to an external Ceph 16/Pacific
cluster.

This is mostly a bunch of CRD/RBAC changes. Most notably, we yeet our
own CRD rewrite and just slurp in upstream CRD defs.

Change-Id: I08e7042585722ae4440f97019a5212d6cf733fcc
2021-09-14 21:39:37 +02:00
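
("Slurping in" the upstream defs can look roughly like the kubecfg-style
jsonnet below, assuming the parseYaml native that kubecfg provides; the file
path is illustrative.)

  // Illustrative: import the upstream Rook CRD manifest verbatim instead of
  // maintaining a hand-written jsonnet rewrite of it.
  local parseYaml = std.native("parseYaml");

  {
      // importstr reads the raw YAML as a string; parseYaml splits it into a
      // list of objects, one per `---` document.
      crds: parseYaml(importstr "crds.yaml"),
  }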
informatic 0f26c4afbc app/matrix: appservice-irc v0.29.0 upgrade
Change-Id: I5b09b3e861442c0b8579abdbeff8983ab1ec0208
2021-09-14 20:00:42 +02:00
informatic 0c59cb33af shell.nix: add missing gnupg
This should fix secretstore on NixOS

Change-Id: Id86b0e920bef82f08a67a84e59d37d6f8737d83f
2021-09-14 20:00:42 +02:00
informatic 5cc64bf60e Merge "app/matrix: bump synapse to 1.37.1" 2021-09-14 17:51:07 +00:00
informatic 013c159dfe Merge "shell.nix: add missing tools" 2021-09-14 16:43:21 +00:00
informatic cb9cbb3fcc shell.nix: add missing tools
Some tools were taken from the "host" shell/PATH, which crashed in certain
cases due to libc incompatibility.

Fixes b/50

Change-Id: Ie94e2c064afff6d5aa782f70e0a024365079e4c7
2021-09-14 18:37:10 +02:00
q3k 92c8dc6532 Merge "kartongips: paper over^W^Wfix CRD updates" 2021-09-12 22:11:11 +00:00
q3k 6c88de9dd7 Merge "cluster/nix: symlink /sbin/lvm" 2021-09-12 22:11:07 +00:00
q3k c793538b58 Merge "cluster: deploy NixOS-based ceph" 2021-09-12 00:56:12 +00:00
q3k 6579e842b0 kartongips: paper over^W^Wfix CRD updates
Ceph CRD updates would fail with:

  ERROR Error updating customresourcedefinitions cephclusters.ceph.rook.io: expected kind, but got map

This wasn't just https://github.com/bitnami/kubecfg/issues/259 . We pull
in the 'solution' from Pulumi
(https://github.com/pulumi/pulumi-kubernetes/pull/622) which just
retries the update via a JSON update instead, and that seems to have
worked.

We also add some better error return wrapping, which I used to debug
this issue properly.

Oof.

Change-Id: I2007a7857e44128d74760174b61b59efa58e9cbc
2021-09-11 20:54:34 +00:00
q3k 9cfc2a0e43 kube.libsonnet: refactor OpenAPI lib, support extra types
This was to be used by a Ceph CRD bump, but we ended up using upstream
yaml instead. But it's a useful change regardless.

I really should document this and write some tests.

Change-Id: I27ce94c6ebe50a4a93baa83418e8d40004755231
2021-09-11 20:49:51 +00:00
q3k 05c4b5515b cluster/nix: symlink /sbin/lvm
This is needed by the new Rook OSD daemons.

Change-Id: I16eb24332db40a8209e7eb9747a81fa852e5cad9
2021-09-11 20:45:45 +00:00
q3k 9848e7e15f cluster: deploy NixOS-based ceph
First pass at a non-rook-managed Ceph cluster. We call it k0 instead of
ceph-waw4, as we are now pretty much sure that we will always have a
one-kube-cluster-to-one-ceph-cluster correspondence, with different Ceph
pools for different media kinds (if at all).

For now this has one mon and spinning rust OSDs. This can be iterated on
to make it less terrible with time.

See b/6 for more details.

Change-Id: Ie502a232c700af93f33fcad9fa1c57058161aa11
2021-09-11 20:33:24 +00:00
q3k 1dbefed537 Merge "cluster/kube: remove ceph diff against k0 production" 2021-09-11 20:32:57 +00:00
q3k 9f639694ba Merge "kartongips: switch default diff behaviour to subset, nag users" 2021-09-11 20:18:34 +00:00
q3k 29f314b620 Merge "kartongips: implement proper diffing of aggregated ClusterRoles" 2021-09-11 20:18:28 +00:00
q3k 4f0468fa26 cluster/kube: remove ceph diff against k0 production
This now has a zero diff against prod.

location fields in CephCluster.storage.nodes seem to have been removed
from the CRD at some point. Not sure how the CRUSH tree now gets
populated, but whatever, it's been working like this for a while
already. Same for CephObjectStore.gateway.type.

The Rook Operator has been zero-scaled for a while now due to b/6.

Change-Id: I30a836f273f4c1529f60fa9297c96b7aac412f59
2021-09-11 12:43:53 +00:00
q3k 59c8149df4 kartongips: switch default diff behaviour to subset, nag users
Change-Id: I998cdf7e693f6d1ce86c7ea411f47320d72a5906
2021-09-11 12:43:50 +00:00
q3k 72d7574536 kartongips: implement proper diffing of aggregated ClusterRoles
For a while now we've had spurious diffs against Ceph on k0 because of
a ClusterRole with an aggregationRule.

The way these behave is that the config object has an empty rules list,
and instead sets an aggregationRule which combines other existing
ClusterRoles into that ClusterRole. The control plane then populates the
rules field when the object is read/acted on, which caused us to always
see a diff between the configured and live state of that ClusterRole.

This hacks together a hardcoded fix for this particular behaviour.
Porting kubecfg over to SSA would probably also fix this - but that's
too much work for now.

Change-Id: I357c1417d4023691e5809f1af23f58f364353388
2021-09-11 12:40:18 +00:00
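
(To make the behaviour concrete, an aggregated ClusterRole looks roughly
like the sketch below; the name and label are illustrative.)

  // What the config declares: no rules of its own, only an aggregation
  // selector. The apiserver fills in .rules from every ClusterRole matching
  // the selector, so a naive config-vs-live diff always flags .rules.
  {
      apiVersion: "rbac.authorization.k8s.io/v1",
      kind: "ClusterRole",
      metadata: { name: "rook-ceph-aggregated" },  // illustrative name
      aggregationRule: {
          clusterRoleSelectors: [
              { matchLabels: { "rbac.example.com/aggregate-to-rook": "true" } },
          ],
      },
      rules: [],  // left empty in config; populated server-side
  }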
q3k d592e6836d Merge "ops, cluster: consolidate NixOS provisioning" 2021-09-11 10:38:43 +00:00
implr 7f7dcd9847 Merge "nix: upgrade readTree" 2021-09-11 10:19:03 +00:00
implr 56ff18c486 nix: upgrade readTree
Change-Id: I460800dc3d8095e2ae89b8bd6ed7c5f0c90b6ccf
2021-09-11 12:18:04 +02:00
q3k b3c6770f8d ops, cluster: consolidate NixOS provisioning
This moves the diff-and-activate logic from cluster/nix/provision.nix
into ops/{provision,machines}.nix that can be used for both cluster
machines and bgpwtf machines.

The provisioning scripts now live per-NixOS-config, and anything under
ops.machines.$fqdn now has a .passthru.hscloud.provision derivation
which is that script. When run, it will attempt to deploy onto the
target machine.

There's also a top-level tool at `ops.provision` which builds all
configurations / machines and can be called with the machine name/fqdn
to call the corresponding provisioner script.

clustercfg is changed to use the new provisioning logic.

Change-Id: I258abce9e8e3db42af35af102f32ab7963046353
2021-09-10 23:55:52 +00:00
q3k 69ff6038d5 shell.nix: colorful prompt
https://object.ceph-waw3.hswaw.net/q3k-personal/815968ff10071d4192e464c91b64228e760128267311a94872006d87cbfd0bd9.png

Change-Id: Ia4eeddf045af0d0bdc962087aaeed55d11846648
2021-09-10 23:15:38 +00:00
q3k eed9afe210 Merge "bgpwtf: edge01: fix ipv4 static routing for customers" 2021-09-10 22:45:41 +00:00
arsenicum aef13358c8 personal - start
Change-Id: I0f1972a095b5a41cad727dbc37fcd454d308050d
2021-09-09 18:26:33 +02:00
q3k 81e7fbaadd bgpwtf: edge01: fix ipv4 static routing for customers
Change-Id: I9c34d12a7947c9bb25331e38ea7ee03beede7e47
2021-09-08 23:40:29 +02:00
q3k 11248d88ab bgpwtf: edge01: add new client networks, remove old q3k network, limit nscd
Batch of small changes. Already deployed.

Change-Id: Ieb4f418699f497c7013e617fd7d1827e71a7a415
2021-09-06 12:07:42 +00:00
q3k 0f11b3c850 hswaw/site: deploy
Change-Id: I3c8aff05f339f3154cb80831099482f0d97a360e
2021-09-04 21:32:30 +02:00
q3k 62e50da881 Merge "tweak blink animation & add gallery" 2021-09-04 18:41:07 +00:00
q3k 5001851808 Merge "hswaw/site: fix twitter link" 2021-09-04 18:40:50 +00:00
q3k d0c9c414cf hswaw/site: deploy
Change-Id: I2ea68f07c81859ffea99ad5b107b14876422288b
2021-09-04 18:38:42 +00:00
informatic 381514ead3 hswaw/site: fix twitter link
Change-Id: I7ec93e1cfe8ac7e4b8949d356109c060c51f187d
2021-09-02 11:07:20 +02:00
radex 41a3cfe04c tweak blink animation & add gallery
Change-Id: I1a1cd568e7982bf4e8e31f9e21897db53e59727f
2021-09-01 21:55:07 +02:00
radex d88a2e2377 improve fonts & animations
Change-Id: I2a586243035e84136b2a309dc6ce26ab21f8925d
2021-08-30 21:28:59 +02:00
q3k 717aad4ac6 hswaw/site: wip new layout
Change-Id: I4da3a668429dee42c7292accb9e24b93703f1538
2021-08-30 21:00:59 +02:00
q3k c35d52b19e *: update build_naming_convention for new rules_go
Change-Id: Ib1604a46d24969ae0110985cda156d31b7cc27aa
2021-08-30 18:21:03 +00:00
radex 38203d2dbe *: update for M1 support
preliminary pass to build site on an M1 Mac

Change-Id: I89e6ac5874bbb8db92040ec98717fc0ed3ee4455
2021-08-30 18:58:54 +02:00
q3k d0b76e62b9 WORKSPACE: remove duplicate library
Change-Id: Ia165c1a44ffb557f37e5a61d372d945016190e08
2021-08-30 18:46:23 +02:00
q3k 432fa30ded cluster/certs: bump ca-kube-prodivider
Redeployed.

Change-Id: I01110433f89df5595de0f9587508104d6091a774
2021-08-29 17:20:59 +00:00