hscloud

cheshire

q3k ce81c39081 ops/metrics: basic cluster setup with prometheus	2020-06-06 15:56:10 +02:00

We handwavingly plan on implementing monitoring as a two-tier system:

 - a 'global' component that is responsible for global aggregation, long-term storage and alerting.
 - multiple 'per-cluster' components that collect metrics from Kubernetes clusters and export them to the global component.

In addition, several lower tiers (collected by per-cluster components) might also be implemented in the future, for instance tiers specific to some subprojects.

Here we start sketching out some basic jsonnet structure (currently all in a single file, with little parametrization) and a cluster-level prometheus server that scrapes Kubernetes Node and cAdvisor metrics.

This review is mostly to get this committed as early as possible, and to make sure that the little existing Prometheus scrape configuration is sane.

Change-Id: If37ac3b1243b8b6f464d65fee6d53080c36f992c
..
certs	cluster: bump nearly-expired certs	2020-03-28 18:01:40 +01:00
clustercfg	cluster: bump nearly-expired certs	2020-03-28 18:01:40 +01:00
doc	*: more hackdoc updates	2020-04-10 22:10:18 +02:00
kube	ops/metrics: basic cluster setup with prometheus	2020-06-06 15:56:10 +02:00
nix	cluster: bump kubelets to 1.14.3	2020-02-02 23:43:28 +01:00
prodaccess	prod{access,vider}: implement	2019-08-30 23:08:18 +02:00
prodvider	prodvider: clean up LDAP connections	2019-08-31 15:00:51 +02:00
secrets	tools/secretstore: add sync command, re-encrypt	2020-06-04 19:25:07 +00:00
tools	calico: upgrade to 3.14, fix calicoctl	2020-05-28 16:47:16 +02:00