Refactoring registry to use newer syntax/jsonnet helpers/conventions, in line with the rest of the codebase.
Change-Id: I20508c8f6ef9a2d0e8faa7de3d3b9efcf2c91af3
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/2013
Reviewed-by: q3k <q3k@hackerspace.pl>
This fixes cluster routing, which broke for some reason at some point.
It ensures cluster routes get propagated correctly across nodes.
This is a mess. We should replace this.
Change-Id: Ic749a529da620fa201ec9cd71a6a8eed664e2d0f
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/2012
Reviewed-by: radex <radex@hackerspace.pl>
These changes were already live but were not committed
Change-Id: Ib0590964ad8521d06ad2219b51751e65b6f9742f
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/2011
Reviewed-by: q3k <q3k@hackerspace.pl>
This brings the code up to date with what was already deployed
Change-Id: I8e47787df8d421857f8a011ce3d6ab29488f980a
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/2009
Reviewed-by: q3k <q3k@hackerspace.pl>
Forked Dockerfile is no longer necessary, as 0.51.0 has a newer openssl
This is the newest version of n-i-c we can use with current k8s version. v1.0.0 requires k8s at least v1.19
Change-Id: Ibb244482cef2624274817ea6c62f190587a03f97
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/2006
Reviewed-by: q3k <q3k@hackerspace.pl>
Since `ops/monitoring` operates on both `monitoring-cluster` and
`monitoring-global-k0` namespaces, working properly using the tooling
requires access to both.
While there, add access to `monitoring-external-k0` for potential
working with external targets.
Change-Id: I5f37ed306f064ffcced705609aa919b684a46235
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1834
Reviewed-by: informatic <informatic@hackerspace.pl>
This gives viq admin access to monitoring-cluster namespace to be able
to inspect what's already there and try to extend it.
Change-Id: I48eaba8db6cd6868879da33abd93607ed5de2008
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1829
Reviewed-by: q3k <q3k@hackerspace.pl>
Rename `target_service` to `target` to mirror Service's `target`; rename `extra_paths` to `extraPaths` to follow the camelCase convention used everywhere except for a few places in kube.upstream (assumed to be a mistake)
Change-Id: Icfcb70ef889e3359bf0391c465034817f4b70cce
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1809
Reviewed-by: q3k <q3k@hackerspace.pl>
There's no difference as far as jsonnet is concerned, but it may confuse newbies, as Service and SimpleIngress use double colon for its top-level kube helpers. This also removes any ambiguity as to whether this is manifested in final JSON. So we can make that a convention.
Change-Id: I01ad4ea63f4d5d8ee6e5d41c79637ba186548c6f
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1803
Reviewed-by: q3k <q3k@hackerspace.pl>
The bootloader is *not* moved yet, machine still boots off the old disk
Change-Id: I8cc92489bb06bfe9581d68503237e08fa8082c7c
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1766
Reviewed-by: q3k <q3k@hackerspace.pl>
fstrim is nice as it might prevent us from killing SSDs so fast.
A lower GC threshold for kubelet is nice as we run non-kubelet services
on these nodes, and they need their space. Notably, Ceph's mons tend to
be extremely claustrophobic, firing alerts at 70% disk usage or so.
Change-Id: I94c1787e62f82a02f107d04a87575327d3d79c01
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1724
Reviewed-by: implr <implr@hackerspace.pl>
This deploys the changes in Id64cccadcd1e109035ed09f62086772fa615dd72
and I34163bbb62ba792d359a5f5e72de1024c0109eab .
Turns out the site actually serves at new.hackerspace.pl and is being
proxy-passed from boston-packets, as that for legacy reasons still has
to live at hackerspace.pl.
Change-Id: Ieaa3e8b6f9c4ced14db83c121e30c9cbaa416b00
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1700
Reviewed-by: radex <radex@hackerspace.pl>
A little QA environment, currently without any data populated.
Change-Id: Ifbe5e97f312376ca64222a3754fe6fa29d7fda79
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1643
Reviewed-by: q3k <q3k@hackerspace.pl>
Also make dataplane-only nodes actually work:
- make kubeproxy use the same package as kubelet
- disable firewall
Change-Id: I7babbb749656e6f75151c8eda6e3f09f3c6bff5f
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1686
Reviewed-by: q3k <q3k@hackerspace.pl>
This is a mega-change, but attempting to split this up further is
probably not worth the effort.
Summary:
1. Bump up bazel, rules_go, and others.
2. Switch to new go target naming (bye bye go_default_library)
3. Move go deps to go.mod/go.sum, use make gazelle generate from that
4. Bump up Python deps a bit
And also whatever was required to actually get things to work - loads of
small useless changes.
Tested to work on NixOS and Ubuntu 20.04:
$ bazel build //...
$ bazel test //...
Change-Id: I8364bdaa1406b9ae4d0385a6b607f3e7989f98a9
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1583
Reviewed-by: q3k <q3k@hackerspace.pl>
Building jq portably is annoying, and the way we were doing it (which we
iirc stole from some google project?) sucked. Let's use a Go jq clone
instead.
This is an alternative for 1535. jq is currently used only in one
script, which could really be replaced by a Go program, but let's keep
it simple for now.
Change-Id: Ie25dffadd545df143490f510e9b75a74adf81492
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1540
Reviewed-by: palid <palid@hackerspace.pl>
Public pull ACL in the middle had priority over our more specific rules
- moving these to the top fixes common registry namespace ACLs.
Change-Id: Ia6f05cef09c0db4eb71155d2c0e2d9944b81f903
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1522
Reviewed-by: q3k <q3k@hackerspace.pl>
This replaces the old clustercfg script with a brand spanking new
mostly-equivalent Go reimplementation. But it's not exactly the same,
here are the differences:
1. No cluster deployment logic anymore - we expect everyone to use ops/
machine at this point.
2. All certs/keys are Ed25519 and do not expire by default - but
support for short-lived certificates is there, and is actually more
generic and reusable. Currently it's only used for admincreds.
3. Speaking of admincreds: the new admincreds automatically figure out
your username.
4. admincreds also doesn't shell out to kubectl anymore, and doesn't
override your default context. The generated creds can live
peacefully alongside your normal prodaccess creds.
5. gencerts (the new nodestrap without deployment support) now
automatically generates certs for all nodes, based on local Nix
modules in ops/.
6. No secretstore support. This will be changed once we rebuild
secretstore in Go. For now users are expected to manually run
secretstore sync on cluster/secrets.
Change-Id: Ida935f44e04fd933df125905eee10121ac078495
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1498
Reviewed-by: q3k <q3k@hackerspace.pl>
This completes the migration away from the old CA/cert infrastructure.
The tool which was used to generate all these certs will come next. It's
effectively a reimplementation of clustercfg in Go.
We also removed the unused kube-serviceaccounts cert, which was
generated by the old tooling for no good reason (we only need a key for
service accounts, not an actual cert...).
Change-Id: Ied9e5d8fc90c64a6b4b9fdd20c33981410c884b4
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1501
Reviewed-by: q3k <q3k@hackerspace.pl>
This finishes the regeneration of all cluster CAs/certs to be never
expiring ED25519 certs.
We still have leftovers of the old Kube CA (and it's still being
accepted in Kubernetes components). Cleaning that up is the next step.
Change-Id: I883f94fd8cef3e3b5feefdf56ee106e462bb04a9
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1500
Reviewed-by: q3k <q3k@hackerspace.pl>
This is already deployed, and it allows Kubernetes components
(temporary) freedom to use the old or new CA cert.
Change-Id: I8ac7f773a333c30fa22902b8edc327c0c700a482
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1490
Reviewed-by: q3k <q3k@hackerspace.pl>
This gets rid of cfssl for the kubernetes bits of prodvider, instead
using plain crypto/x509. This also allows to support our new fancy
ED25519 CA.
Change-Id: If677b3f4523014f56ea802b87499d1c0eb6d92e9
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1489
Reviewed-by: q3k <q3k@hackerspace.pl>
Done:
1. etcd peer CA & certs
2. etcd client CA & certs
3. kube CA (currently all components set to accept both new and old CA,
new CA called ca-kube-new)
4. kube apiserver
5. kubelet & kube-proxy
6. prodvider intermediate
TODO:
1. kubernetes controller-manager & kubernetes scheduler
2. kubefront CA
3. admitomatic?
4. undo bundle on kube CA components to fully transition away from old
CA
Change-Id: If529eeaed9a6a2063bed23c9d81c57b36b9a0115
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1487
Reviewed-by: q3k <q3k@hackerspace.pl>
This will happen at next boot via early microcode - no risk to currently
running processes.
Change-Id: I88553fa9a1350ebb80aaf978e29e8f1156783a2c
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1469
Reviewed-by: q3k <q3k@hackerspace.pl>
This will be our postgres pet machine.
Change-Id: Ifff6648394ca6407fb5b5daa853f4abc42541703
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1467
Reviewed-by: q3k <q3k@hackerspace.pl>
After installing HBJ11s and spreading out the mons we're going full
Rook.
Change-Id: Ia00cbe953548f06cf27343371fc67890619c8262
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1466
Reviewed-by: q3k <q3k@hackerspace.pl>
This bumps it on bc01n01, but nowhere else yet.
We have to vendor some more kubelet bits unfortunately.
Change-Id: Ifb169dd9c2c19d60f88d946d065d4446141601b1
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1465
Reviewed-by: implr <implr@hackerspace.pl>
the spark one has been an abandoned experiment from years ago, and
I could use a personal one right now
Change-Id: I78a706c3371d441b2f8460fd796d0cfd9a198cc6
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1464
Reviewed-by: q3k <q3k@hackerspace.pl>
This is needed for running some memory-intensive workloads, like
ElasticSearch/OpenSearch.
Change-Id: I7b00ec5faca73ec69bdbf1ca41c025d7efeae55c
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1443
Reviewed-by: implr <implr@hackerspace.pl>
This was never used and only caused scary warnings during OSDs reboots
due to lack of availability.
Change-Id: I14eacd88855bc56e06f2a61cc2d914d985330852
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1423
Reviewed-by: implr <implr@hackerspace.pl>
Leaving the CRD definitions as YAML, extracted without modifications
from the original install file - this should make upgrades simpler.
Change-Id: I7211d2711e2af014b36dd887a951abb9e1032eb9
Reviewed-on: https://gerrit.hackerspace.pl/c/hscloud/+/1179
Reviewed-by: q3k <q3k@hackerspace.pl>