Some pods got evicted. Some of them broke.
- postgres in matrix and nginx in internet because of the new policies
(chown issues)
- cas proxy in matrix because apparently the image was not reuploaded
to the registry after ceph-waw1 died, and another node didn't have it
- registry because it had a weak image pin and downgraded to some
broken version on another node
Change-Id: I836036872629843c8ede1b7f67982112c90d71f0
Here we introduce benji [1], a backup system based on backy2. It lets us
back up Ceph RBD objects from Rook into Wasabi, our offsite S3-compatible
storage provider.
Benji runs as a k8s CronJob at 42 minutes past every hour. It does the
following:
- runs benji-pvc-backup, which iterates over all PVCs in k8s, and backs
up their respective PVs to Wasabi
- runs benji enforce, which marks backups falling outside our backup
policy [2] for deletion
- runs benji cleanup, to remove unneeded backups
- runs a custom script to back up benji's sqlite3 database into Wasabi
(unencrypted, but we're fine with that, as the metadata only contains
image/pool names, i.e. Ceph PV and pool names)
[1] - https://benji-backup.me/index.html
[2] - latest3,hours48,days7,months12, which means the latest 3 backups,
then one backup per hour for the next 48 hours, one per day for the
next 7 days, and one per month for the next 12 months, for a total of
65 backups (deduplicated, of course)
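For illustration only, a minimal sketch of what the PVC iteration in
the first step amounts to, assuming in-cluster client-go credentials
(this is not the actual benji-pvc-backup code, and the benji invocation
itself is elided):

package main

import (
	"context"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	// Assumes the job runs in-cluster with a ServiceAccount allowed to list PVCs.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}
	// List PVCs across all namespaces ("" means all).
	pvcs, err := cs.CoreV1().PersistentVolumeClaims("").List(context.Background(), metav1.ListOptions{})
	if err != nil {
		log.Fatal(err)
	}
	for _, pvc := range pvcs.Items {
		if pvc.Spec.VolumeName == "" {
			continue // unbound claim, nothing to back up yet
		}
		// benji-pvc-backup resolves each PV to its Ceph RBD pool/image and
		// shells out to benji; that part is elided here.
		log.Printf("would back up %s/%s (pv %s)", pvc.Namespace, pvc.Name, pvc.Spec.VolumeName)
	}
}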
We also drive-by update some docs (splitting them more clearly into
user/admin docs).
Change-Id: Ibe0942fd38bc232399c0e1eaddade3f4c98bc6b4
This port was leaking kubelet state, including information on running
pods. No secrets were leaked (unless they had been pasted as plain
text into env/args), but this still shouldn't be available.
As far as I can tell, nothing depends on this port, other than some
enterprise load balancers that require HTTP for node 'health' checks.
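For context: the kubelet read-only port (10255 by default) serves an
unauthenticated pod listing, which is the kind of exposure described
above. A rough sketch (node address is a placeholder):

package main

import (
	"io"
	"log"
	"net/http"
	"os"
)

func main() {
	// 10255 is the kubelet's default read-only port. /pods returns the full
	// PodList of everything scheduled on the node, with no authentication.
	node := "node.example.internal" // placeholder, any cluster node
	resp, err := http.Get("http://" + node + ":10255/pods")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	// Dumps pod specs, including container args and env (where secrets would
	// show up if someone had pasted them in as plain text).
	io.Copy(os.Stdout, resp.Body)
}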
Change-Id: I9549b73e0168fe3ea4dce43cbe8fdc2ca4575961
Dynamic config generation based on environment variables in Synapse is
no longer supported. To pass secrets to the container, we use a patch that
implements configuration overrides via environment variables directly.
(to be upstreamed...)
Due to Synapse update, appservice configuration ConfigMaps don't need to
be copied into Synapse /data volume anymore.
Change-Id: I70e6480983bfb997362739c6ce0ec3c313320836
Prodaccess/Prodvider allow issuing short-lived certificates for all SSO
users to access the kubernetes cluster.
Currently, all users get a personal-$username namespace in which they
have administrative rights. Otherwise, they get no access.
In addition, we define a static CRB to allow some admins access to
everything. In the future, this will be more granular.
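A sketch of the per-user grant described above, using client-go's
rbac/v1 types; the names and the choice of binding the built-in "admin"
ClusterRole are illustrative, not necessarily what prodvider emits:

package main

import (
	"fmt"

	rbacv1 "k8s.io/api/rbac/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// personalBinding grants a single SSO user admin rights in their
// personal-$username namespace, and nothing else.
func personalBinding(username string) *rbacv1.RoleBinding {
	return &rbacv1.RoleBinding{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "personal-admin",
			Namespace: "personal-" + username,
		},
		// Bind the built-in "admin" ClusterRole, scoped to this namespace only.
		RoleRef: rbacv1.RoleRef{
			APIGroup: "rbac.authorization.k8s.io",
			Kind:     "ClusterRole",
			Name:     "admin",
		},
		Subjects: []rbacv1.Subject{
			{Kind: rbacv1.UserKind, APIGroup: "rbac.authorization.k8s.io", Name: username},
		},
	}
}

func main() {
	rb := personalBinding("alice") // example username
	fmt.Printf("%s/%s -> %s for %s\n", rb.Namespace, rb.Name, rb.RoleRef.Name, rb.Subjects[0].Name)
}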
We also update relevant documentation.
Change-Id: Ia18594eea8a9e5efbb3e9a25a04a28bbd6a42153
We just got this email:
We've been working with Jetstack, the authors of cert-manager, on a
series of fixes to the client. Cert-manager sometimes falls into a
traffic pattern where it sends really excessive traffic to Let's
Encrypt's servers, continuously. To mitigate this, we plan to start
blocking all traffic from cert-manager versions less than 0.8.0 (the
current semver minor release), as of November 1, 2019. Please upgrade
all of your cert-manager instances before then.
We're sending this email because this is the contact address of your
cert-manager instance at:
185.236.240.37 .
Version 0.8.0 is much better but we still observe excessive traffic in
some cases. We're working with Jetstack to improve these cases. As new
versions of cert-manager are released, we will add the non-current
versions to our block list after 3 months. We strongly encourage
cert-manager users to stay up-to-date with new versions.
Also, there is an opportunity to help both Jetstack and Let's Encrypt.
Once you've upgraded, please check the logs for your cert-manager
instances from time to time. Are they making excessive requests to Let's
Encrypt (more than, say, 10 per day over multiple days)? If so, please
share details at https://github.com/jetstack/cert-manager/issues/1948 .
Thanks,
Let's Encrypt Team
Change-Id: Ic7152150ac1c96941423878c6d4b6209e07429cf
We seem to be hitting a bug where the encryptor doesn't initialize
because the gpg binary is missing, and then crashes on .Close().
This should fix the issue, but is untested.
goroutine 70 [running]:
code.hackerspace.pl/hscloud/bgpwtf/cccampix/pgpencryptor/gpg.(*CLIEncryptor).Close(0x0)
bgpwtf/cccampix/pgpencryptor/gpg/gpg.go:144 +0x22
main.(*service).Encrypt(0xc000345e00, 0x16d13a0, 0xc00047f260, 0x1688400, 0xc00003d4a0)
bgpwtf/cccampix/pgpencryptor/main.go:132 +0x6f9
code.hackerspace.pl/hscloud/bgpwtf/cccampix/proto._PGPEncryptor_Encrypt_Handler(0x133bf00, 0xc000345e00, 0x16c6300, 0xc0000d6000, 0x2247b78, 0xc0001f8000)
bazel-out/k8-fastbuild/bin/bgpwtf/cccampix/proto/linux_amd64_stripped/ix_go_proto%/code.hackerspace.pl/hscloud/bgpwtf/cccampix/proto/ix.pb.go:1816 +0xad
google.golang.org/grpc.(*Server).processStreamingRPC(0xc000160c00, 0x16d6ce0, 0xc000161500, 0xc0001f8000, 0xc0004244e0, 0x21b00e0, 0xc0000c6ff0, 0x0, 0x0)
external/org_golang_google_grpc/server.go:1175 +0xacd
google.golang.org/grpc.(*Server).handleStream(0xc000160c00, 0x16d6ce0, 0xc000161500, 0xc0001f8000, 0xc0000c6ff0)
external/org_golang_google_grpc/server.go:1254 +0xcbe
google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc000404770, 0xc000160c00, 0x16d6ce0, 0xc000161500, 0xc0001f8000)
external/org_golang_google_grpc/server.go:690 +0x9f
created by google.golang.org/grpc.(*Server).serveStreams.func1
external/org_golang_google_grpc/server.go:688 +0xa1
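The receiver in the top frame is 0x0, i.e. Close() runs on a nil
*CLIEncryptor left behind by the failed initialization. A sketch of the
kind of guard this change amounts to (struct fields are illustrative,
not copied from gpg.go):

package gpg

import (
	"io"
	"os/exec"
)

// CLIEncryptor wraps a gpg child process. Field names here are illustrative;
// the real struct lives in bgpwtf/cccampix/pgpencryptor/gpg/gpg.go.
type CLIEncryptor struct {
	cmd   *exec.Cmd
	stdin io.WriteCloser
}

// Close is made safe to call on an encryptor that never initialized: when the
// gpg binary is missing, construction fails and the caller's deferred Close()
// runs on a nil receiver, which is what used to panic.
func (c *CLIEncryptor) Close() error {
	if c == nil || c.cmd == nil {
		return nil // nothing was started, nothing to tear down
	}
	c.stdin.Close()
	return c.cmd.Wait()
}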
Change-Id: Idd167a120e157005f44d255a61ef13dc80e8eeed
We accidentally created crdb-waw2 in
https://gerrit.hackerspace.pl/c/hscloud/+/2.
We remove it now and also backport a manual change that makes the
crdb-waw1 service public via a LoadBalancer.
Change-Id: I3bbd6f01b82c6efa458cc44776f086ba36e9f20c
This uses github.com/golang-migrate/migrate and adds a Source that
allows using go_embed data files.
We also provide a test/example.
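A hedged usage sketch of how such a Source plugs into golang-migrate
(v4 import paths and the cockroachdb driver are assumptions here; any
source.Driver implementation, including the go_embed one added in this
change, can be passed in):

package migrations

import (
	"github.com/golang-migrate/migrate/v4"
	_ "github.com/golang-migrate/migrate/v4/database/cockroachdb" // database driver, registered via import
	"github.com/golang-migrate/migrate/v4/source"
)

// MigrateUp applies all pending migrations from the given Source to the
// database at dbURL. A golang-migrate Source is an implementation of
// source.Driver (Open/Close/First/Prev/Next/ReadUp/ReadDown); the Source
// added here serves those files from go_embed_data instead of the
// filesystem.
func MigrateUp(src source.Driver, dbURL string) error {
	m, err := migrate.NewWithSourceInstance("go_embed", src, dbURL)
	if err != nil {
		return err
	}
	if err := m.Up(); err != nil && err != migrate.ErrNoChange {
		return err
	}
	return nil
}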
Change-Id: Icd2b6c7f7d0f728073b3fdf39b432b33ce61a3cd
We add a small IRR service for getting a parsed RPSL from IRRs. For now,
we only support RIPE and ARIN, and only the following attributes:
- remarks
- import
- export
Since RPSL/RFC2622 is fucking insane, there is no guarantee that the
parser, especially the import/export parser, is correct. But it should
be good enough for our use. We even throw in some tests for good
measure.
$ grpcurl -format text -plaintext -d 'as: "26625"' 127.0.0.1:4200 ix.IRR.Query
source: SOURCE_ARIN
attributes: <
import: <
expressions: <
peering: "AS6083"
actions: "pref=10"
>
filter: "ANY"
>
>
attributes: <
import: <
expressions: <
peering: "AS12491"
actions: "pref=10"
>
filter: "ANY"
>
>
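For reference, the first attributes entry above corresponds to an RPSL
import line of roughly this shape (illustrative, not the verbatim ARIN
object):

import: from AS6083 action pref=10; accept ANY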
Change-Id: I8b240ffe2cd3553a25ce33dbd3917c0aef64e804
We're starting to need our own production image instead of just a bare
Ubuntu image. For instance, octorpki will need rsync and TLS CA
bundles.
Change-Id: Ia8d9604ae8c320f858cfe8a2dc21ddcc321017ff