hscloud

Author	SHA1	Message	Date
q3k	edeb3ccf78	hswaw/lib: add flask_spaceauth Change-Id: I3bb47bb65e739eaf27f54c07f03df18e79b398e0	2019-12-18 14:20:10 +01:00
q3k	43189235bd	third_party/py: add flask-spaceauth deps Change-Id: I3e153f8992b2a987ce2b0e1db8f869f6cca40f4b	2019-12-18 14:11:00 +01:00
q3k	ba8e79e8f4	kube-apiserver: fix cert mismatch, again This time from a bare hscloud checkout to make sure _nothing_ is fucked up. This causes no change remotely, just makes the repo reflect reality. Change-Id: Ie8db01300771268e0371c3cdaf1930c8d7cbfb1a	2019-12-17 02:13:55 +01:00
q3k	050af01b83	cluster: add q3k's new SSH key Change-Id: I872a75cc89a62c9487433fa5e8e5767953e309c9	2019-12-17 01:58:58 +01:00
q3k	31058185df	personal/q3k: 'production' openrct2 game Change-Id: I9b0fd29dd4e8a6c2cac3aaceabbdba07de0faf1b	2019-11-24 02:39:47 +01:00
q3k	262c6e0361	personal/q3k: add openrct2 Change-Id: I2526d75c577be6712342a60cc5c7c90b21d5242d	2019-11-24 02:39:47 +01:00
q3k	d0ec2c6ac7	hswaw/kube: refactor This breaks up hswaw.jsonnet into a component-per-file pattern. Change-Id: I1b83d44146ae6c3d3f7c5d02abc2c9b764cc0e8e	2019-11-21 00:08:52 +01:00
q3k	e5a956a1c8	*: bump to q3k's kubecfg, kubernetes 1.16 Change-Id: I302876d5a45cbfb63d87ad9f6ea9aaeff7bec17d	2019-11-17 22:38:40 +01:00
q3k	fd323a0f55	cluster: sync to prod Change-Id: If311f1ce44653bb54e0a10ad2fdd65685722a64d	2019-11-17 19:49:04 +01:00
q3k	96c428f7d7	nixops: fix Change-Id: I15ebde319fcae3f9771da6a549e52783e0ec4409	2019-11-17 19:00:46 +01:00
q3k	c33ebcc79f	cluster: add ceph-waw3, move metallb to bgp Change-Id: Iebf369f9a02e44be163ef4afc2e0f23c4b009898	2019-11-01 18:43:45 +01:00
q3k	e67f6fec98	cluster/secrets: really try to fix apiserver key/cert Change-Id: I6b0ea601246b665585adb040b9819344bc683e78	2019-10-31 17:36:44 +01:00
q3k	737cafd548	cluster/certs: fix kube-apiserver key/cert mismatch :/ Change-Id: I3601a18d3ab1eae4183b59be43c497cd27dfe704	2019-10-31 17:30:48 +01:00
q3k	d493ab66ca	*: add dcr01s{22,24} Change-Id: I072e825e2e1d199d9da50b9d38a9ffba68e61182	2019-10-31 17:07:50 +01:00
q3k	da67c6d3e9	hswaw/voucherchecker: detect when voucher is in cart Change-Id: Iac9a58c14b9d4faba5df0a945dd93ad269992c33	2019-10-25 19:55:05 +02:00
q3k	3502c0d840	hswaw/voucherchecker: do not care for session Change-Id: I406b0f6ce2761affabe1c3e6e37d7aefbf575f69	2019-10-25 19:26:39 +02:00
q3k	ed65c190be	hswaw/voucherchecker: init Change-Id: Id79ae9b14f61edf2f4abb3d9a60294edd6074f29	2019-10-23 21:28:19 +02:00
q3k	831a54acd9	hswaw/ldapweb: move to profile.hackerspace.pl Change-Id: I071dbd482b0eda75c5e73c53bf136010e1014abe	2019-10-20 17:38:22 +02:00
q3k	5b866624ec	hswaw: add ldap-web Change-Id: I49602ecf6001150491aae3e5fe024fb0ee7a9367	2019-10-18 14:54:36 +02:00
q3k	cccf5ec072	hswaw/kube: add cert for piorekf Change-Id: I302ced35503197522151177663c4321e858473e1	2019-10-17 19:56:15 +02:00
q3k	65273c8105	Merge "smsgw: productionize, implement kube/mirko"	2019-10-11 14:13:33 +00:00
q3k	adb17a6009	Merge "cluster: move prodvider to kubernetes.default.svc.k0.hswaw.net"	2019-10-11 14:13:28 +00:00
q3k	4836dff19b	bgpwtf/internet: fix prod diff Change-Id: Ie967ef5fbfdb479b1251e2495a28edd09864730c	2019-10-11 16:10:14 +02:00
q3k	6f773e0004	smsgw: productionize, implement kube/mirko This productionizes smsgw. We also add some jsonnet machinery to provide a unified service for Go micro/mirkoservices. This machinery provides all the nice stuff: - a deployment - a service for all your types of pots - TLS certificates for HSPKI We also update and test hspki for a new name scheme. Change-Id: I292d00f858144903cbc8fe0c1c26eb1180d636bc	2019-10-04 13:52:34 +02:00
q3k	d186e9468d	cluster: move prodvider to kubernetes.default.svc.k0.hswaw.net In https://gerrit.hackerspace.pl/c/hscloud/+/70 we accidentally introduced a split-horizon DNS situation: - k0.hswaw.net from the Internet resolves to nodes running the k8s API servers, and as such can serve API server traffic - k0.hswaw.net from the cluster returned no results This broke prodvider in two ways: - it dialed the API servers at k0.hswaw.net - even after the endpoint was moved to kubernetes.default.svc.k0.hswaw.net, the apiserver cert didn't cover that Thus, not only we had to change the prodvider endpoint but also change the APIserver certs to cover this new name. I'm not sure this should be the target fix. I think at some point we should only start referring to in-cluster services via their full (or cluster.local) names, but right now k0.hswaw.net is an exception and as such a split, and we have no way to access the internal services from the outside just yet. However, getting prodvider to work is important enough that this fix is IMO good enough for now. Change-Id: I13d0681208c66f4060acecc78b7ae14b8f8d7125	2019-10-04 13:52:34 +02:00
q3k	e31d64f265	kube: move cert-manager resources to kube.local.libsonnet This way kubernetes consumers don't have to import anything from cluster/, hopefully. We also create a small abstraction for local additions for kube.libsonnet without having to modify upstream. Change-Id: I209095781f91c8867250a647fe944370cddd67d0	2019-10-02 21:03:13 +02:00
q3k	54490d385e	cluster/coredns: add cluster fqdn top level domain This means that in addition to services being discoverable the 'classic' way: <svcname>.<namespace>.svc.cluster.local They are now discoverable as: <svcname>.<namespace>.svc.<fqdn> For instance, on k0 you can now internally resolve: $ kubectl run --rm -it foo --image=nixery.dev/shell/dnsutils bash bash-4.4# dig +short coffee-svc.default.svc.k0.hswaw.net 10.10.12.192 Change-Id: Ie6875b54ed6358f30f888ca0cd96e011520ace20	2019-10-02 20:49:13 +02:00
q3k	325e9476bf	hswaw/smsgw: implement The SMS gateway service allows consumers to subscribe to SMS messages received by a Twilio phone number. This is useful for receiving SMS auth messages. Change-Id: Ib02a4306ad0d856dd10c7ca9241d9163809e7084	2019-09-27 12:54:16 +02:00
q3k	95868eeddc	benji: back up daily instead of hourly Every benji backup seems to cycle blocks (eg. delete some and recreate them). Since wasabi has a minimum billing retention policy of 90 days, this means that every uploaded and then an hour later deleted object costs us. Currently we seem to be storing around 200G of data in wasabi for Benji but already have 600G of deleted objects. This is suboptimal. This change has already been deployed on production. Change-Id: I67302d23a1c45974fb5d51ec9a8cff28260830dc	2019-09-26 21:49:24 +00:00
q3k	47b7e850e7	dc/arista-proxy: fix by using github.com/q3k/cursedjson Change-Id: Id9657a30af8c16afe4ddde7e2ac04f4508a2fd18	2019-09-26 18:32:39 +02:00
q3k	6781f62ec4	Merge "app/radio: add support for following relays"	2019-09-25 12:06:17 +00:00
q3k	57515a2525	Merge "rules_pip: update to new version"	2019-09-25 12:05:58 +00:00
q3k	5f9b1ecd67	rules_pip: update to new version rules_pip has a new version [1] of their rule system, incompatible with the version we used, that fixes a bunch of issues, notably: - explicit tagging of repositories for PY2/PY3/PY23 support - removal of dependency on host pip (in exchange for having to vendor wheels) - higher quality tooling for locking We update to the newer version of pip_rules, rename the external repository to pydeps and move requirements.txt, the lockfile and the newly vendored wheels to third_party/, where they belong. [1] - https://github.com/apt-itude/rules_pip/issues/16 Change-Id: I1065ee2fc410e52fca2be89fcbdd4cc5a4755d55	2019-09-25 14:05:07 +02:00
q3k	2d81427410	app/radio: add support for following relays Change-Id: Ib079d657239b1bf5294ad8457370d56a0093dd6d	2019-09-25 13:59:08 +02:00
q3k	5f3a5e0310	cluster/kube: emergency fixes after eviction Some pods got evicted. Some of them broke. - postgres in matrix and nginx in internet because of the new policies (chown issues) - cas proxy in matrix because apparently the image was not reuploaded to the registry after ceph-waw1 died, and another node didn't have it - registry because it had a weak image pin and downgraded to some broken version on another node Change-Id: I836036872629843c8ede1b7f67982112c90d71f0	2019-09-25 02:58:15 +02:00
q3k	db2a2a029f	Merge "Get in the Cluster, Benji!"	2019-09-18 20:40:12 +00:00
q3k	a01c487a6e	cluster: allow insecure pods in rook-ceph-system This is required for the agent to start a socket on each host for kubelet-to-rook access. Change-Id: I78529df81185aeaacdcb494138f72f0224a029c6	2019-09-05 16:01:19 +00:00
q3k	350aa88421	Merge "cluster: add nextcloud user for object store"	2019-09-02 14:33:24 +00:00
q3k	8c009bb302	Merge "cluster: disable unauthenticated read only port on kubelets"	2019-09-02 14:33:13 +00:00
q3k	13bb1bf4e3	Get in the Cluster, Benji! Here we introduce benji [1], a backup system based on backy2. It lets us backup Ceph RBD objects from Rook into Wasabi, our offsite S3-compatible storage provider. Benji runs as a k8s CronJob, every hour at 42 minutes. It does the following: - runs benji-pvc-backup, which iterates over all PVCs in k8s, and backs up their respective PVs to Wasabi - runs benji enforce, marking backups outside our backup policy [2] as to be deleted - runs benji cleanup, to remove unneeded backups - runs a custom script to backup benji's sqlite3 database into wasabi (unencrypted, but we're fine with that - as the metadata only contains image/pool names, thus Ceph PV and pool names) [1] - https://benji-backup.me/index.html [2] - latest3,hours48,days7,months12, which means the latest 3 backups, then one backup for the next 48 hours, then one backup for the next 7 days, then one backup for the next 12 months, for a total of 65 backups (deduplicated, of course) We also drive-by update some docs (make them more separated into user/admin docs). Change-Id: Ibe0942fd38bc232399c0e1eaddade3f4c98bc6b4	2019-09-02 16:33:02 +02:00
q3k	9496d9910a	cluster: add nextcloud user for object store Change-Id: Ib08be16f71ff5e1b72ca6ad436de4b12427dd407	2019-09-02 16:33:02 +02:00
q3k	42553cd044	cluster: disable unauthenticated read only port on kubelets This port was leaking kubelet state, including information on running pods. No secrets were leaked (if they were not text-pasted into env/args), but this still shouldn't be available. As far as I can tell, nothing depends on this port, other than some enterprise load balancers that require HTTP for node 'health' checks. Change-Id: I9549b73e0168fe3ea4dce43cbe8fdc2ca4575961	2019-09-02 16:33:02 +02:00
q3k	c349ccf2fd	Merge "prodvider: clean up LDAP connections"	2019-08-31 14:57:44 +00:00
q3k	896926c921	prodvider: clean up LDAP connections Change-Id: Ic95e6d1b845832fa0fb2da51b418bcdcb8fd05c4	2019-08-31 15:00:51 +02:00
q3k	1503983c27	Merge "rook/ceph: bump"	2019-08-30 23:21:13 +00:00
q3k	ed9cf98316	Merge "prod{access,vider}: implement"	2019-08-30 23:21:09 +00:00
informatic	eabbe8a11e	app/matrix: update software components, refactor config handling Dynamic config generation based on environment variables in Synapse is no longer supported. To pass secrets to container we use a patch that implements configuration overrides via environment variables directly. (to be upstreamed...) Due to Synapse update, appservice configuration ConfigMaps don't need to be copied into Synapse /data volume anymore. Change-Id: I70e6480983bfb997362739c6ce0ec3c313320836	2019-08-30 23:21:53 +02:00
informatic	b20b366092	app/matrix: change storageclass to waw-hdd-paranoid-2 Change-Id: I757942409f4ef4da69d4cf1925d26dc758c65311	2019-08-30 23:21:53 +02:00
q3k	71a21c7693	rook/ceph: bump Change-Id: I046df292cad11650adb829cc8a73100cc1d1ecc8	2019-08-30 23:08:26 +02:00
q3k	b13b7ffcdb	prod{access,vider}: implement Prodaccess/Prodvider allow issuing short-lived certificates for all SSO users to access the kubernetes cluster. Currently, all users get a personal-$username namespace in which they have administrative rights. Otherwise, they get no access. In addition, we define a static CRB to allow some admins access to everything. In the future, this will be more granular. We also update relevant documentation. Change-Id: Ia18594eea8a9e5efbb3e9a25a04a28bbd6a42153	2019-08-30 23:08:18 +02:00
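
Commit 54490d385e (cluster/coredns) describes two internal naming schemes for a service: the classic <svcname>.<namespace>.svc.cluster.local and the new <svcname>.<namespace>.svc.<fqdn>. The following is a minimal sketch of that scheme, not code from the repository. The function name service_names and its parameters are hypothetical and only build the strings; nothing here performs DNS resolution.

```python
from typing import List


def service_names(svcname: str, namespace: str, cluster_fqdn: str) -> List[str]:
    """Build both internal DNS names for a service (hypothetical helper).

    The first name follows the classic cluster.local scheme, the second
    uses the cluster FQDN as the top-level domain, as the commit message
    describes.
    """
    return [
        f"{svcname}.{namespace}.svc.cluster.local",
        f"{svcname}.{namespace}.svc.{cluster_fqdn}",
    ]


if __name__ == "__main__":
    # Values taken from the example in the commit message (cluster k0).
    for name in service_names("coffee-svc", "default", "k0.hswaw.net"):
        print(name)
```

With the values from the commit message, the second name produced is coffee-svc.default.svc.k0.hswaw.net, which is the name queried with dig in that commit.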

705 Commits (9e3ca9c84108453dd958b365eaf56a797832a6bb)