hscloud/cluster
q3k ce81c39081 ops/metrics: basic cluster setup with prometheus
We handwavingly plan on implementing monitoring as a two-tier system:

 - a 'global' component that is responsible for global aggregation,
   long-term storage and alerting.
 - multiple 'per-cluster' components, that collect metrics from
   Kubernetes clusters and export them to the global component.

In addition, several lower tiers (collected by per-cluster components)
might also be implemented in the future - for instance, specific to some
subprojects.

Here we start sketching out some basic jsonnet structure (currently all
in a single file, with little parametrization) and a cluster-level
prometheus server that scrapes Kubernetes Node and cAdvisor metrics.
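
For orientation, a cluster-level scrape configuration of this kind could
be sketched in jsonnet roughly as follows. This is a hypothetical
illustration, not the contents of this change: the helper `nodeJob`, the
job names, the scrape interval and the choice to reach kubelets through
the API server proxy are all assumptions. The Prometheus keys
(`kubernetes_sd_configs`, `relabel_configs`, `tls_config`,
`bearer_token_file`) and the in-pod service account paths are standard.

```jsonnet
// Hypothetical sketch: all names below are assumptions, not taken from the commit.
local kubeApi = "kubernetes.default.svc:443";
local saDir = "/var/run/secrets/kubernetes.io/serviceaccount";

// Build one scrape job that discovers every Node and fetches `path`
// from its kubelet via the API server's node proxy endpoint.
local nodeJob(name, path) = {
  job_name: name,
  scheme: "https",
  tls_config: { ca_file: saDir + "/ca.crt" },
  bearer_token_file: saDir + "/token",
  kubernetes_sd_configs: [{ role: "node" }],
  relabel_configs: [
    // Copy Kubernetes node labels onto the scraped series.
    { action: "labelmap", regex: "__meta_kubernetes_node_label_(.+)" },
    // Send every scrape to the API server instead of the node itself.
    { target_label: "__address__", replacement: kubeApi },
    // Rewrite the metrics path to the per-node proxy URL.
    {
      source_labels: ["__meta_kubernetes_node_name"],
      regex: "(.+)",
      target_label: "__metrics_path__",
      replacement: "/api/v1/nodes/${1}/proxy" + path,
    },
  ],
};

// Top-level Prometheus configuration object.
{
  global: { scrape_interval: "1m" },
  scrape_configs: [
    nodeJob("cluster_node_metrics", "/metrics"),
    nodeJob("cluster_cadvisor_metrics", "/metrics/cadvisor"),
  ],
}
```

Evaluating such a file with the `jsonnet` tool yields JSON, which
Prometheus can load as its configuration since JSON is valid YAML.
Sharing one helper between the Node and cAdvisor jobs keeps the two
jobs identical apart from the metrics path.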

This review is mostly to get this committed as early as possible, and to
make sure that the little existing Prometheus scrape configuration is
sane.

Change-Id: If37ac3b1243b8b6f464d65fee6d53080c36f992c
2020-06-06 15:56:10 +02:00
certs       cluster: bump nearly-expired certs                2020-03-28 18:01:40 +01:00
clustercfg  cluster: bump nearly-expired certs                2020-03-28 18:01:40 +01:00
doc         *: more hackdoc updates                           2020-04-10 22:10:18 +02:00
kube        ops/metrics: basic cluster setup with prometheus  2020-06-06 15:56:10 +02:00
nix         cluster: bump kubelets to 1.14.3                  2020-02-02 23:43:28 +01:00
prodaccess  prod{access,vider}: implement                     2019-08-30 23:08:18 +02:00
prodvider   prodvider: clean up LDAP connections              2019-08-31 15:00:51 +02:00
secrets     tools/secretstore: add sync command, re-encrypt   2020-06-04 19:25:07 +00:00
tools       calico: upgrade to 3.14, fix calicoctl            2020-05-28 16:47:16 +02:00