Commit Graph

15 Commits (cb9cbb3fccecf5768e0d6977deb8caffc7ba9456)

Author SHA1 Message Date
q3k 0ec06d7b75 ops: update deploy instructions to include profile set
This is necessary for the NixOS EFI boot machinery to pick up the new
derivation when switching to it; otherwise, the machine will not boot
into the newly switched configuration.

Change-Id: I8b18956d2afeea09c38462f09a00c345cf86f80d
2021-04-18 18:13:33 +00:00
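The deploy sequence this commit documents can be sketched roughly as follows. This is an illustration only: the attribute path, `$TARGET` host, and exact tooling are placeholders, and hscloud's actual deploy scripts may differ.

```shell
# Build the system derivation for the target machine (placeholder attr path):
SYSTEM=$(nix-build -A '...' --no-out-link)

# Copy the closure to the target machine:
nix copy --to "ssh://root@$TARGET" "$SYSTEM"

# Set the system profile -- the step this commit adds. Without it, the
# EFI boot machinery never points at the new derivation:
ssh "root@$TARGET" nix-env --profile /nix/var/nix/profiles/system --set "$SYSTEM"

# Activate the configuration and (re)install the boot entry:
ssh "root@$TARGET" "$SYSTEM/bin/switch-to-configuration" switch
```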
q3k a0332a75a0 ops/machines: pin edge01.waw to its current version of nixpkgs
Stopgap until we finish b/3, need to deploy some changes on it without
rebooting into newer nixpkgs.

Change-Id: Ic2690dfcb398a419338961c8fcbc7e604298977a
2021-03-18 19:22:41 +00:00
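Pinning a single machine to a specific nixpkgs checkout can be done with a construct along these lines; the rev and hash are placeholders, and hscloud's actual pinning mechanism may differ.

```nix
let
  # Pinned nixpkgs for edge01.waw only; other machines keep following
  # the repository-wide default.
  pinnedNixpkgs = import (builtins.fetchTarball {
    url = "https://github.com/NixOS/nixpkgs/archive/<rev>.tar.gz";
    sha256 = "<sha256>";
  }) {};
in
  pinnedNixpkgs
```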
informatic 7f8f3e9f9c ops/sso: upgrade sso-v2
A change in sso-v2 unifies id_token and userinfo endpoint handling - the
groups, nickname, email and preferred_username keys are now present in
id_tokens as well.

https://code.hackerspace.pl/informatic/sso-v2/commit/?id=c4c810cd255a7bfcab5ced3fb88c8b311b518c34

Change-Id: Ib22994edc067fd83701590182f8096f6fca692ba
2021-02-01 17:03:27 +01:00
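The practical effect of this change is that a relying party can read these claims straight out of the id_token's payload instead of calling the userinfo endpoint. A minimal sketch of inspecting an (unverified) JWT payload - the token and claim values below are made up; only the claim names come from the commit message:

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the middle (payload) segment of a JWT without verifying it."""
    payload_b64 = token.split(".")[1]
    # Re-add the base64url padding that JWT encoding strips.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


# Hypothetical id_token payload after the sso-v2 change: claims that were
# previously only served by the userinfo endpoint now appear here too.
claims = {
    "sub": "q3k",
    "groups": ["staff"],
    "nickname": "q3k",
    "email": "q3k@example.com",
    "preferred_username": "q3k",
}
payload = base64.urlsafe_b64encode(
    json.dumps(claims).encode()).rstrip(b"=").decode()
fake_token = f"header.{payload}.signature"

decoded = decode_jwt_payload(fake_token)
assert {"groups", "nickname", "email", "preferred_username"} <= set(decoded)
```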
q3k 9e3ca9c841 ops/sso: move jsonnets to kube/
This is in preparation for moving the sso source code into hscloud.

Change-Id: I4325df617dc82c17fb4c96762743f0b70122976f
2021-01-31 15:52:06 +01:00
q3k cc2ff79f01 ops/monitoring: move grafana to sso.
Change-Id: Ib2ecf6820454a160834db2ac212b31d9d5306972
2021-01-30 17:26:47 +01:00
q3k d82807e024 Merge changes I84873bc3,I1eedb190
* changes:
  ops/monitoring: deploy grafana
  ops/monitoring: scrape apiserver, scheduler, and controller-manager
2021-01-30 16:22:44 +00:00
informatic d6c97596cd ops/sso: "the hackerspace oidc/oauth2 provider" deployment
Change-Id: I092b844364ed30037eff00188dcdf5d6d3c228c5
2021-01-29 23:23:09 +01:00
q3k 4f7caf8d86 ops/monitoring: deploy grafana
This is a basic grafana instance running at:

    https://monitoring-global-dashboard.k0.hswaw.net/

It contains a data source pointing at the corresponding global Victoria
Metrics instance. There are no dashboards yet; these will be provisioned
soon via jsonnet/grafonnet.

Change-Id: I84873bc323d1727096e3ce818fae122a9af3e191
2020-12-17 22:10:31 +00:00
q3k cfc0496266 ops/monitoring: scrape apiserver, scheduler, and controller-manager
These get scraped by public IP address, retrieved via service discovery
in Prometheus (using the endpoints role on the default/kubernetes
service).

Also includes a drive-by fix to the cluster prometheus resources - the
default configuration wants at least 3GB of physical memory.

Change-Id: I1eedb19051f62b40613f69e5f0f736d5958acf42
2020-12-17 22:09:56 +00:00
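A Prometheus scrape job of the shape described above might look like the following. This is a sketch only - the actual configuration in hscloud is generated from jsonnet, and the job name is a placeholder:

```yaml
scrape_configs:
  - job_name: apiserver
    scheme: https
    tls_config:
      insecure_skip_verify: true   # sketch only; a real config should verify certs
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      # Keep only endpoints of the default/kubernetes service, i.e. the
      # apiservers' public addresses.
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name]
        action: keep
        regex: default;kubernetes
```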
q3k 7d311e9602 ops/monitoring: pull in grafonnet-7.0
Change-Id: Ie036ef767419418876a18255a5ad378f5cfa1535
2020-10-10 15:59:45 +00:00
q3k 363bf4f341 monitoring: global: implement
This creates a basic Global instance, running Victoria Metrics on k0.

Change-Id: Ib03003213d79b41cc54efe40cd2c4837f652c0f4
2020-10-06 14:28:27 +00:00
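In this design, a per-cluster Prometheus can forward its samples to the global Victoria Metrics instance via Prometheus remote write, which Victoria Metrics accepts on its /api/v1/write endpoint. The URL below is a placeholder, not the actual k0 endpoint:

```yaml
remote_write:
  - url: https://<global-victoria-metrics>/api/v1/write
```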
q3k 6abe4fa771 bgpwtf/machines: init edge01.waw
This configures our WAW edge router using NixOS. This replaces our
previous Ubuntu installation.

Change-Id: Ibd72bde66ec413164401da407c5b268ad83fd3af
2020-10-03 14:57:38 +00:00
q3k c1364e8d8a ops/monitoring: add implr to owners
This way, future reviews from him will no longer require my +2.

Change-Id: Icde1f64fe4387e92d19943d7469ce0569eb45257
2020-06-07 02:23:09 +02:00
q3k 2022ac2338 ops/monitoring: split up jsonnet, add simple docs
Change-Id: I8120958a6862411de0446896875766834457aba9
2020-06-06 17:05:15 +02:00
q3k ce81c39081 ops/metrics: basic cluster setup with prometheus
We handwavingly plan on implementing monitoring as a two-tier system:

 - a 'global' component that is responsible for global aggregation,
   long-term storage and alerting.
 - multiple 'per-cluster' components, that collect metrics from
   Kubernetes clusters and export them to the global component.

In addition, several lower tiers (collected by per-cluster components)
might also be implemented in the future - for instance, specific to some
subprojects.

Here we start sketching out some basic jsonnet structure (currently all
in a single file, with little parametrization) and a cluster-level
prometheus server that scrapes Kubernetes Node and cAdvisor metrics.

This review is mostly to get this committed as early as possible, and to
make sure that the little existing Prometheus scrape configuration is
sane.

Change-Id: If37ac3b1243b8b6f464d65fee6d53080c36f992c
2020-06-06 15:56:10 +02:00
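The cluster-level scrape configuration this commit sketches corresponds roughly to the following hand-written YAML equivalent of the jsonnet (the credential paths are the standard in-cluster service account locations; the generated config may differ in detail):

```yaml
scrape_configs:
  # Kubelet (Node) metrics.
  - job_name: kubernetes-nodes
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: node

  # Per-container cAdvisor metrics, exposed by the kubelet.
  - job_name: kubernetes-cadvisor
    scheme: https
    metrics_path: /metrics/cadvisor
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: node
```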