Commit Graph

26 Commits (325e9476bfc9336f47db3f378799a6b242c90cf7)

Author SHA1 Message Date
q3k 5f3a5e0310 cluster/kube: emergency fixes after eviction
Some pods got evicted. Some of them broke.

  - postgres in matrix and nginx in internet because of the new policies
    (chown issues)
  - cas proxy in matrix because apparently the image was not reuploaded
    to the registry after ceph-waw1 died, and another node didn't have it
  - registry because it had a weak image pin and downgraded to some
    broken version on another node (see the sketch below)
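
A minimal sketch of the difference, as a hypothetical container spec
(image name and digest are illustrative, not from this repo):

    containers:
      - name: registry
        # weak pin: a mutable tag; a node re-pulling it can get a
        # different (here: older, broken) image
        # image: registry.example.org/app/registry:latest
        # strong pin: an immutable digest always resolves to one image
        image: registry.example.org/app/registry@sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef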

Change-Id: I836036872629843c8ede1b7f67982112c90d71f0
2019-09-25 02:58:15 +02:00
q3k 13bb1bf4e3 Get in the Cluster, Benji!
Here we introduce benji [1], a backup system based on backy2. It lets us
back up Ceph RBD objects from Rook into Wasabi, our offsite S3-compatible
storage provider.

Benji runs as a k8s CronJob, every hour at 42 minutes (see the sketch
below). It does the following:
 - runs benji-pvc-backup, which iterates over all PVCs in k8s and backs
   up their respective PVs to Wasabi
 - runs benji enforce, marking backups that fall outside our backup
   policy [2] for deletion
 - runs benji cleanup, to remove unneeded backups
 - runs a custom script to back up benji's sqlite3 database into Wasabi
   (unencrypted, but we're fine with that, as the metadata only contains
   image/pool names, i.e. Ceph PV and pool names)

[1] - https://benji-backup.me/index.html
[2] - latest3,hours48,days7,months12, which means: the latest 3 backups,
      then one backup per hour for the next 48 hours, then one per day
      for the next 7 days, then one per month for the next 12 months,
      for a total of 65 backups (deduplicated, of course)
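
A minimal sketch of that CronJob, assuming a hypothetical image and
entrypoint (only the schedule and step names come from this message; the
exact benji invocations are assumptions):

    apiVersion: batch/v1beta1        # CronJob API group of this k8s era
    kind: CronJob
    metadata:
      name: benji-backup             # hypothetical name
    spec:
      schedule: "42 * * * *"         # every hour, at 42 minutes past
      jobTemplate:
        spec:
          template:
            spec:
              restartPolicy: OnFailure
              containers:
                - name: benji
                  image: benji.example/benji:latest   # hypothetical image
                  command: ["/bin/sh", "-ce"]
                  args:
                    - |
                      benji-pvc-backup
                      benji enforce    # applies the policy from [2]
                      benji cleanup
                      # plus the custom sqlite3-to-Wasabi backup script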

We also drive-by update some docs (making them more clearly separated
into user/admin docs).

Change-Id: Ibe0942fd38bc232399c0e1eaddade3f4c98bc6b4
2019-09-02 16:33:02 +02:00
q3k 9496d9910a cluster: add nextcloud user for object store
Change-Id: Ib08be16f71ff5e1b72ca6ad436de4b12427dd407
2019-09-02 16:33:02 +02:00
q3k b13b7ffcdb prod{access,vider}: implement
Prodaccess/Prodvider allow issuing short-lived certificates for all SSO
users to access the kubernetes cluster.

Currently, all users get a personal-$username namespace in which they
have administrative rights. Otherwise, they get no access.

In addition, we define a static CRB (ClusterRoleBinding) to allow some
admins access to everything. In the future, this will be more granular.
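
A rough sketch of the resulting RBAC objects, assuming plain Kubernetes
RBAC (names are illustrative, not actual prodvider output):

    # Per-user rights: a RoleBinding in the personal-$username namespace,
    # binding the user to the built-in "admin" ClusterRole.
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: personal-admin
      namespace: personal-q3k            # hypothetical username
    subjects:
      - kind: User
        name: q3k
        apiGroup: rbac.authorization.k8s.io
    roleRef:
      kind: ClusterRole
      name: admin
      apiGroup: rbac.authorization.k8s.io
    ---
    # The static CRB: cluster-wide admin for a fixed set of users.
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: admins                       # hypothetical name
    subjects:
      - kind: User
        name: q3k
        apiGroup: rbac.authorization.k8s.io
    roleRef:
      kind: ClusterRole
      name: cluster-admin
      apiGroup: rbac.authorization.k8s.io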

We also update relevant documentation.

Change-Id: Ia18594eea8a9e5efbb3e9a25a04a28bbd6a42153
2019-08-30 23:08:18 +02:00
q3k 1fad2e5c6e bgpwtf/cccampix: draw the rest of the fucking owl
Change-Id: I49fd5906e69512e8f2d414f406edc0179522f225
2019-08-11 23:43:25 +02:00
q3k d533892efa Fix crdb-waw1
We accidentally created crdb-waw2 in
https://gerrit.hackerspace.pl/c/hscloud/+/2.

We remove it now and also backport a manual change that makes the
crdb-waw1 service public via a LoadBalancer.
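
In spirit, the backported change is a Service of type LoadBalancer; a
minimal sketch (name and labels hypothetical, port from CockroachDB
defaults):

    apiVersion: v1
    kind: Service
    metadata:
      name: crdb-waw1-public     # hypothetical name
    spec:
      type: LoadBalancer         # MetalLB hands out an external IP
      selector:
        app: crdb-waw1           # hypothetical label
      ports:
        - name: sql
          port: 26257            # CockroachDB default SQL/RPC port
          targetPort: 26257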

Change-Id: I3bbd6f01b82c6efa458cc44776f086ba36e9f20c
2019-08-11 23:42:47 +02:00
q3k d07861b7df ceph-waw1 -> ceph-waw2
Change-Id: I03d6244b9697a9efc06492114ef90cdb01e17601
2019-08-08 17:49:31 +02:00
q3k 4d61d20aec app/registry: integrate into cluster/kube
This makes a registry be automatically part of the cluster
infrastructure.

Tested by running kubecfg diff; no diffs were found (apart from
out-of-date ACLs).

Change-Id: Ic0635e789cf3fb851f410bcf2865326f1fa87545
2019-07-21 16:56:41 +02:00
q3k 184678b0f4 cluster/kube/lib/cockroachdb: clean up topology
IP addresses are not necessary in the topology definitions of a
cockroach cluster.

They were mis-committed leftovers from trying to run the cluster on
DaemonSets with hostNetworking: true.
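
With pod networking, nodes can find each other by stable DNS names, so
the topology needs no IPs; a hedged sketch of the container invocation
(all names and the version hypothetical):

    containers:
      - name: cockroachdb
        image: cockroachdb/cockroach:v19.1.1   # hypothetical version
        command:
          - /cockroach/cockroach
          - start
          # join by headless-Service DNS names, not node IPs:
          - --join=crdb-waw1-0.crdb-waw1,crdb-waw1-1.crdb-waw1
          - --logtostderr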

Change-Id: I4ef1f6ed9a745efc6b05846bc13aba9d1f8dc7c8
2019-06-22 21:18:29 +00:00
q3k c7258f4644 cluster/kube: refactor, add crdb-waw1 2019-06-21 00:24:09 +02:00
q3k c3b0f7627c cluster/kube: set operator replicas to 0 2019-06-20 16:42:19 +02:00
q3k 36cc4fb61a bazel-cache: deploy, add waw-hdd-yolo-1 ceph pool 2019-05-17 18:09:39 +02:00
informatic 5ac85c6e73 cluster/kube: refactor rook.io object store configuration 2019-04-09 21:45:32 +02:00
informatic 6da3b288dc WIP: app/registry: ceph object storage 2019-04-09 13:48:21 +02:00
informatic 3187c59a86 cluster/kube: ceph dashboard tls certificates 2019-04-02 14:44:04 +02:00
informatic 79ddbc57d9 cluster/kube: initial cert-manager implementation 2019-04-02 13:20:15 +02:00
q3k 65f3b1d8ab cluster/kube: add waw-hdd-redundant-1 pool/storageclass 2019-04-02 01:05:38 +02:00
q3k c6da127d3f cluster/kube: ceph-waw1 up 2019-04-02 00:06:13 +02:00
q3k b7fcc67f42 cluster/kube: start implementing rook 2019-04-01 18:40:50 +02:00
q3k 14cbacb81a cluster/kube/metallb: parametrize address pools 2019-04-01 18:00:44 +02:00
q3k a9c7e86687 cluster: fix metallb, add nginx ingress controller 2019-04-01 17:56:28 +02:00
q3k 1e565dc4a5 cluster: start implementing metallb 2019-01-18 09:40:59 +01:00
q3k af3be426ad cluster: deploy calico and metrics service 2019-01-17 18:57:19 +01:00
q3k 49b9a13d28 cluster: deploy coredns 2019-01-14 00:02:59 +01:00
q3k 5bebbebe3e cluster/kube: fix typo 2019-01-13 22:08:05 +01:00
q3k 4d9e72cb8c cluster/kube: init 2019-01-13 22:06:33 +01:00