hscloud

mirror of https://gerrit.hackerspace.pl/hscloud synced 2024-10-18 02:58:06 +00:00

Author	SHA1	Message	Date
Serge Bazanski	15e7348a0b	cluster: remove dead machines Change-Id: I3ff6680bc7212341ca626b0f560e1fe93efe3a35 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1987 Reviewed-by: ar <ar@hackerspace.pl>	2024-07-20 12:18:00 +00:00
Serge Bazanski	168f84b69b	ops: apply CVE-2024-6387 patch on critical machines Instead of waiting for backports or even rolling forward unstable, let's just patch the bug out. This has been deployed on: - dcr01s22.hswaw.net - dcr01s24.hswaw.net - dcr03s16.hswaw.net - snowflake.hswaw.net Change-Id: I0ad8ea37bd15bc9bd4e814cdf3eda7b2c47bb03e Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1988 Reviewed-by: implr <implr@hackerspace.pl>	2024-07-20 12:17:55 +00:00
Ari Gerus	39f9d171c4	h/m/snowflake, matrix: postgresql config prep for postgresql database migration from the instance running on old dell blade server. on snowflake side, mostly a copy-paste of configuration from bc01n05, from which the database instance will be migrated, with a few adjustments for newer nixpkgs/nixos. on matrix/k8s side, just a change of host. and a drive-by rename from `.hackerspace.pl` to `.hswaw.net` Change-Id: I0e78162270ebb3244078e34dee0cd4629d5598ca Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1986 Reviewed-by: q3k <q3k@hackerspace.pl>	2024-06-14 19:06:52 +00:00
Ari Gerus	ad179def49	hswaw/machines: add snowflake This adds one of the 4 new fast machines that will run various one-off workloads, initially mostly migrated off of the old dell m1000e blade chassis, such as a virtualized boston-packets. Change-Id: I4a85f8e14cd79257ad41bbe1519f33595f4e497a Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1981 Reviewed-by: q3k <q3k@hackerspace.pl>	2024-06-11 16:58:05 +00:00
noisersup	d843b782a1	hswaw/sound: add esphome integration Change-Id: I535256056aed6dfec4ddf4843203990324f49564 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1950 Reviewed-by: informatic <informatic@hackerspace.pl>	2024-04-27 20:35:55 +00:00
Piotr Dobrowolski	c8d1d51c11	hswaw/machines/printmaster: cups server box Change-Id: Ibf75d9bad789521bfab77fb17017b20030deed52 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1894 Reviewed-by: informatic <informatic@hackerspace.pl>	2024-02-28 06:55:45 +00:00
Serge Bazanski	faf8a41a83	ops/k0: bump runc to 1.1.12 (CVE-2024-21626) Change-Id: I204f0a296b600143da43b8c8e34d70d4dcb1b8aa Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1903 Reviewed-by: informatic <informatic@hackerspace.pl>	2024-02-08 12:03:49 +00:00
Serge Bazanski	1b3774b584	ops: remove reference to non-existent machine Change-Id: I0d4ea1a0d99f7b177a3fe526a7f435ea6b161bb7 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1902 Reviewed-by: informatic <informatic@hackerspace.pl>	2024-02-08 12:03:49 +00:00
Piotr Dobrowolski	ff8a50cb02	ops: colmena integration Change-Id: I18b9218f2c29a84f7fa769e1a9f561a4385578ca Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1757 Reviewed-by: q3k <q3k@hackerspace.pl>	2024-02-07 18:12:12 +00:00
Bartosz Stebel	7ab03b1083	ops/machines: bump dcr01s24 to newer nixpkgs, drop old pkg pin Dropped bc01n02 as it's long gone. Change-Id: I9aa83d33e47466ed24a3938cb1ef3e1fee42e545 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1881 Reviewed-by: q3k <q3k@hackerspace.pl>	2024-01-30 22:20:39 +00:00
Bartosz Stebel	655db5e3c6	ops/machines: bump dcr01s22 to newer nixpkgs I know the comments are wrong, I'll clean them up once we get rid of the old nixpkgs fetch completely. Change-Id: Ia64d2d0908fc834cb976afbb415c8d1283433a38 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1865 Reviewed-by: q3k <q3k@hackerspace.pl>	2024-01-24 19:24:15 +00:00
viq	30a563c49f	ops/monitoring/lib/cluster.libsonnet: scrape based on annotations This adds automatic scraping of pods and services based on presence of annotations: - prometheus.io/scrape - prometheus.io/port - prometheus.io/path Change-Id: I1c1afecc75c30278889de1f6ca0b17da69997295 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1850 Reviewed-by: implr <implr@hackerspace.pl>	2024-01-19 22:02:40 +00:00
viq	a694d21670	ops/monitoring: update components Update Prometheus, VictoriaMetrics and Grafana to latest releases, LTS where applicable Change-Id: I18e173a8c75288c341503e97d367e0f65f807b3f Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1842 Reviewed-by: q3k <q3k@hackerspace.pl>	2024-01-12 22:34:35 +00:00
Serge Bazanski	260ff1c011	ops/monitoring: scrape ceph Change-Id: Ibe2d4d2e4c562789a8849074abe6e789c95c598d Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1837 Reviewed-by: viq <viq@hackerspace.pl>	2024-01-10 14:00:14 +00:00
radex	4ffc64d97d	kube: add .volume field on PVCs and ConfigMaps Change-Id: I93eec44bd6df4ecb0044a4797faa9bf6fd26802d Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1811 Reviewed-by: q3k <q3k@hackerspace.pl>	2023-12-04 20:33:37 +00:00
radex	7a4c27d28c	kube: clean up (various) Change-Id: Idc11cf70fa7fd0360f63438270748ef1d9bad989 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1810 Reviewed-by: q3k <q3k@hackerspace.pl>	2023-12-04 20:33:31 +00:00
radex	d45584aa6d	kube: clean up SimpleIngress Rename `target_service` to `target` to mirror Service's `target`; rename `extra_paths` to `extraPaths` to follow the camelCase convention used everywhere except for a few places in kube.upstream (assumed to be a mistake) Change-Id: Icfcb70ef889e3359bf0391c465034817f4b70cce Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1809 Reviewed-by: q3k <q3k@hackerspace.pl>	2023-12-04 20:33:10 +00:00
radex	1439fde1ba	kube: standardize top.secretRefs convention Introduce a convention of declaring a secretsRefs:: object below cfg:: for containing all secretKeyRefs. The goal is to self-document all secrets that need to be created in order to deploy a service Change-Id: I3a990d54f65a288f5e748262c576d2a120efd815 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1806 Reviewed-by: q3k <q3k@hackerspace.pl>	2023-11-24 20:39:11 +00:00
radex	c995c212d2	kube: standardize on a `local top = self` convention A convention is introduced to specify `local top = self` declaration at the top of an app/service/component's jsonnet, representing the top-level object. The reasoning is as follows: - `top` is more universal/unambiguous than `app` - `top` is usually shorter than $NAME - a conventional `top` instead of $NAME (coupled with other conventions introduced) makes app jsonnets wonderfully copy-paste'able, aiding in learning and quickly building Change-Id: I7ece83ce7e97021ad98a6abb3500fb9839936811 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1805 Reviewed-by: q3k <q3k@hackerspace.pl>	2023-11-24 20:38:59 +00:00
radex	99ed6a7abb	kube: standardize on a `local ns` convention A convention is introduced to specify the kube.Namespace object in a deployment as a `local ns` instead of an `ns:` or a `namespace:` for these reasons: - non-cluster admins cannot create new namespaces, and we've been moving in the direction of specifying objects that require cluster admin permissions to apply (policies, role bindings) in //cluster/kube/k0 instead of in the app jsonnet - namespace admins CAN delete the namespace, making `kubecfg delete` unexpectedly dangerous (especially if a namespace contains more than just the contents of the file being applied - common with personal namespaces) - `.Contain()` is a common operation, and it shows up in lines that are pretty long, so `ns.Contain()` is preferable to `app.ns.Contain()` or `service.namespace.Contain()` Change-Id: Ie4ea825376dbf6faa175179054f3ee3de2253ae0 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1804 Reviewed-by: q3k <q3k@hackerspace.pl>	2023-11-24 20:38:44 +00:00
radex	36964dca3b	kube: clean up PersistentVolumeClaims There's no difference as far as jsonnet is concerned, but it may confuse newbies, as Service and SimpleIngress use double colon for its top-level kube helpers. This also removes any ambiguity as to whether this is manifested in final JSON. So we can make that a convention. Change-Id: I01ad4ea63f4d5d8ee6e5d41c79637ba186548c6f Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1803 Reviewed-by: q3k <q3k@hackerspace.pl>	2023-11-24 20:37:53 +00:00
radex	8b8f3876a9	kube: add target:: convenience field to Service Change-Id: If69116d93b6074136a36d98973e1aa997e2ebbef Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1802 Reviewed-by: q3k <q3k@hackerspace.pl>	2023-11-24 20:37:48 +00:00
Radek Pietruszewski	f28cd62c0e	*: Simplify kube.PersistentVolumeClaims Change-Id: I0a3e44de9f1c4db146fd1e493741f5fe381da3ae Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1768 Reviewed-by: q3k <q3k@hackerspace.pl>	2023-11-18 12:36:00 +00:00
Radek Pietruszewski	f5844311eb	*/kube: Add kube.SimpleIngress Change-Id: Iddcac629b9938f228dd93b32e58bb14606d5c6e5 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1745 Reviewed-by: q3k <q3k@hackerspace.pl>	2023-10-28 17:55:48 +00:00
Bartosz Stebel	9a88f28805	cluster/{machines,certs}: add dcr03s16.hswaw.net Also make dataplane-only nodes actually work: - make kubeproxy use the same package as kubelet - disable firewall Change-Id: I7babbb749656e6f75151c8eda6e3f09f3c6bff5f Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1686 Reviewed-by: q3k <q3k@hackerspace.pl>	2023-10-09 19:02:18 +00:00
Serge Bazanski	9f0e1e88f1	cluster/clustercfg: rewrite it in Go This replaces the old clustercfg script with a brand spanking new mostly-equivalent Go reimplementation. But it's not exactly the same, here are the differences: 1. No cluster deployment logic anymore - we expect everyone to use ops/ machine at this point. 2. All certs/keys are Ed25519 and do not expire by default - but support for short-lived certificates is there, and is actually more generic and reusable. Currently it's only used for admincreds. 3. Speaking of admincreds: the new admincreds automatically figure out your username. 4. admincreds also doesn't shell out to kubectl anymore, and doesn't override your default context. The generated creds can live peacefully alongside your normal prodaccess creds. 5. gencerts (the new nodestrap without deployment support) now automatically generates certs for all nodes, based on local Nix modules in ops/. 6. No secretstore support. This will be changed once we rebuild secretstore in Go. For now users are expected to manually run secretstore sync on cluster/secrets. Change-Id: Ida935f44e04fd933df125905eee10121ac078495 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1498 Reviewed-by: q3k <q3k@hackerspace.pl>	2023-06-19 22:23:52 +00:00
Piotr Dobrowolski	7e841065b0	*: post-certmanager manifests update Change-Id: I745c850268c31777c5722a9833c8152a55615aed Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1512 Reviewed-by: q3k <q3k@hackerspace.pl>	2023-06-19 21:20:44 +00:00
Serge Bazanski	f6e6abb0f5	ops: repin cluster machines to older nixpkgs checkout Change-Id: I592c689e33d81c131d389d87153900165aac19e5 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1486 Reviewed-by: q3k <q3k@hackerspace.pl>	2023-03-31 22:53:59 +00:00
Serge Bazanski	8f0842341a	ops: repin edge01.waw to old nixpkgs We accidentally bumped nixpkgs at https://gerrit.hackerspace.pl/1441 and forgot to upgrade it. We don't wanna upgrade it right now. This doesn't give us back a zero-diff, but it's close enough. Change-Id: I1a9f50df88e564cd4de76f67adfaa1e88a746f2e Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1471 Reviewed-by: patryk <patryk@hackerspace.pl>	2023-03-10 20:17:15 +00:00
Serge Bazanski	712a5dc3e3	cluster: add bc01n05.hswaw.net This will be our postgres pet machine. Change-Id: Ifff6648394ca6407fb5b5daa853f4abc42541703 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1467 Reviewed-by: q3k <q3k@hackerspace.pl>	2023-03-04 22:26:46 +00:00
Serge Bazanski	3a9562ecfd	cluster: k0: remove native ceph After installing HBJ11s and spreading out the mons we're going full Rook. Change-Id: Ia00cbe953548f06cf27343371fc67890619c8262 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1466 Reviewed-by: q3k <q3k@hackerspace.pl>	2023-03-04 22:26:39 +00:00
Serge Bazanski	ef3aab6a14	k0: host os bump wip This bumps it on bc01n01, but nowhere else yet. We have to vendor some more kubelet bits unfortunately. Change-Id: Ifb169dd9c2c19d60f88d946d065d4446141601b1 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1465 Reviewed-by: implr <implr@hackerspace.pl>	2023-03-04 22:26:14 +00:00
vuko	deeeff861e	hswaw/machines: add sound.waw.hackerspace.pl Change-Id: Id0e6a02d9ae4cf61d758713a99d21c6da0c72b66 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1401 Reviewed-by: vuko <vuko@hackerspace.pl> Reviewed-by: informatic <informatic@hackerspace.pl>	2022-10-09 19:35:18 +00:00
Serge Bazanski	957d91180a	bgpwtf: edge01: bump nixpkgs, use networkd Change-Id: I038f9518e090aecc90f464475f29c5b3c1570eff Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1339 Reviewed-by: implr <implr@hackerspace.pl>	2022-07-07 23:51:57 +00:00
Serge Bazanski	c35ea6a220	ops: inject the machine's pkgs into the machine's hscloud tree This ensures, for example, that the packages are for the correct architecture. Change-Id: If17c307fbad02ee72c6dd21a874c59514415ab2e Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1334 Reviewed-by: implr <implr@hackerspace.pl>	2022-07-07 18:10:40 +00:00
Serge Bazanski	dcdbd8425c	hswaw/machines: add tv2 Change-Id: I657c316bcc663c79b6886d5843b9de5cbf17f1c3 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1333 Reviewed-by: informatic <informatic@hackerspace.pl>	2022-07-07 18:07:18 +00:00
Serge Bazanski	5ac5e4bec3	hswaw/machines: add tv1, larrythebuilder This adds two brand new AArch64 machines: a generic builder (and instructions on how to use it) and tv1.waw, an RPi4 acting as digital signage in the space. Change-Id: I8d38344ec35f99f4b872cf9526f6e6771fbffc43 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1330 Reviewed-by: informatic <informatic@hackerspace.pl>	2022-07-06 19:49:37 +00:00
Serge Bazanski	55a486ae49	cluster: refactor nix machinery to fit //ops This is a chonky refactor that gets rid of the previous cluster-centric defs-* plain nix file setup. Now, nodes are configured individually in plain nixos modules, and are provided a view of all other nodes in the 'machines' attribute. Cluster logic is moved into modules which inspect this array to find other nodes within the same cluster. Kubernetes options are not fully clusterified yet (ie., they are still hardcoded to only provide the 'k0' cluster) but that can be fixed later. The Ceph machinery is a good example of how that can be done. The new NixOS configs are zero-diff against prod. While this is done mostly by keeping the logic, we had to keep a few newly discovered 'bugs' around by adding some temporary options which keep things as they are. These will be removed in a future CL, then introducing a diff (but no functional changes, hopefully). We also remove the nix eval from clustercfg as it was not used anymore (basically since we refactored certs at some point). Change-Id: Id79772a96249b0e6344046f96f9c2cb481c4e1f4 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1322 Reviewed-by: informatic <informatic@hackerspace.pl>	2022-06-19 11:48:52 +00:00
Piotr Dobrowolski	a13208bf9b	ops/sso: bump to latest version, roll out RSA JWT signing Bump to: https://code.hackerspace.pl/informatic/sso-v2/commit/?id=682322c98063c596d2e46f1e7844551c5a7226db This introduces (and enables) support for RSA id_tokens (that are required by oauth2_proxy for example) and fixes/improves handling of non-active members. Change-Id: Ia7d5e5ca7a2769f11f6190add78114e3b6141c6e Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1304 Reviewed-by: q3k <q3k@hackerspace.pl>	2022-05-01 08:17:57 +00:00
Piotr Dobrowolski	b6bc3e69b9	hswaw/machines/customs: upgrade to workspace nixos-unstable 2021-08-11 Change-Id: I6eb4408d40e14f24ebbe3f9f3aef0be952b44e8b Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1167 Reviewed-by: vuko <vuko@hackerspace.pl>	2021-10-20 20:58:16 +00:00
Piotr Dobrowolski	a01905ae64	hswaw/machines/customs: check in code.hackerspace.pl/vuko/customs Change-Id: Ic698cce2ef0060a54b195cf90574696b8be1eb0f Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1162 Reviewed-by: informatic <informatic@hackerspace.pl>	2021-10-20 20:58:16 +00:00
Serge Bazanski	a16af2db91	ops/machines.nix: inject workspace This makes the hscloud readTree object available as follows in NixOS modules: { config, pkgs, workspace, ... }: { environment.systemPackages = [ workspace.hswaw.laserproxy ]; } Change-Id: I9c8146f5156ffe5d06cb8408a2ce632657990d59 Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1164 Reviewed-by: q3k <q3k@hackerspace.pl>	2021-10-16 21:24:22 +00:00
Serge Bazanski	9848e7e15f	cluster: deploy NixOS-based ceph First pass at a non-rook-managed Ceph cluster. We call it k0 instead of ceph-waw4, as we pretty much are sure now that we will always have a one-kube-cluster-to-one-ceph-cluster correspondence, with different Ceph pools for different media kinds (if at all). For now this has one mon and spinning rust OSDs. This can be iterated on to make it less terrible with time. See b/6 for more details. Change-Id: Ie502a232c700af93f33fcad9fa1c57058161aa11	2021-09-11 20:33:24 +00:00
Serge Bazanski	b3c6770f8d	ops, cluster: consolidate NixOS provisioning This moves the diff-and-activate logic from cluster/nix/provision.nix into ops/{provision,machines}.nix that can be used for both cluster machines and bgpwtf machines. The provisioning scripts now live per-NixOS-config, and anything under ops.machines.$fqdn now has a .passthru.hscloud.provision derivation which is that script. When run, it will attempt to deploy onto the target machine. There's also a top-level tool at `ops.provision` which builds all configurations / machines and can be called with the machine name/fqdn to call the corresponding provisioner script. clustercfg is changed to use the new provisioning logic. Change-Id: I258abce9e8e3db42af35af102f32ab7963046353	2021-09-10 23:55:52 +00:00
Serge Bazanski	0ec06d7b75	ops: update deploy instructions to include profile set This is necessary for the NixOS EFI boot machinery to pick up the new derivation when switching to it, otherwise the machine will not boot into the newly switched configuration. Change-Id: I8b18956d2afeea09c38462f09a00c345cf86f80d	2021-04-18 18:13:33 +00:00
Serge Bazanski	a0332a75a0	ops/machines: pin edge01.waw to its current version of nixpkgs Stopgap until we finish b/3, need to deploy some changes on it without rebooting into newer nixpkgs. Change-Id: Ic2690dfcb398a419338961c8fcbc7e604298977a	2021-03-18 19:22:41 +00:00
Piotr Dobrowolski	7f8f3e9f9c	ops/sso: upgrade sso-v2 Change in sso-v2 unifies id_token and userinfo endpoint handling - now groups, nickname, email and preferred_username keys are present in id_tokens as well. https://code.hackerspace.pl/informatic/sso-v2/commit/?id=c4c810cd255a7bfcab5ced3fb88c8b311b518c34 Change-Id: Ib22994edc067fd83701590182f8096f6fca692ba	2021-02-01 17:03:27 +01:00
Serge Bazanski	9e3ca9c841	ops/sso: move jsonnets to kube/ This is in preparation for moving the sso source code into hscloud. Change-Id: I4325df617dc82c17fb4c96762743f0b70122976f	2021-01-31 15:52:06 +01:00
Serge Bazanski	cc2ff79f01	ops/monitoring: move grafana to sso. Change-Id: Ib2ecf6820454a160834db2ac212b31d9d5306972	2021-01-30 17:26:47 +01:00
q3k	d82807e024	Merge changes I84873bc3,I1eedb190 * changes: ops/monitoring: deploy grafana ops/monitoring: scrape apiserver, scheduler, and controller-manager	2021-01-30 16:22:44 +00:00

59 commits