hscloud monitoring
==================

Quick links
-----------

- *Old Global Dashboard*: [monitoring.hackerspace.pl](https://monitoring.hackerspace.pl) - the old monitoring system, unrelated to this one. It was configured using Chef at management.hackerspace.pl (long since dead). This setup is supposed to replace it.

Architecture
------------

The hscloud monitoring solution is two-tiered:

- at the *global* tier we run metrics aggregation, long-term storage, dashboards and alerting.
- at the *agent* tier we collect metrics from various sources (possibly even from lower-tier agents).

Every agent in the agent tier sends metrics to every global instance.

```
    .--------.   .--------.                '.
    | global |   | global |                  > - global tier
    '--------'   '--------'                .'   (contains 'global instances')
      |   '---. .---'   |
      |        X        |
      |   .---' '---.   |
      |   |         |   |
.--------------.  .--------------------.   '.
|   cluster    |  |    hswaw-proxy     |     |
| k0.hswaw.net |  | waw.hackerspace.pl |     > - agent tier
'--------------'  '--------------------'   .'   (contains 'agents')
```

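The full mesh between the two tiers can be sketched in a few lines of Python. This is only an illustration of the topology, not code from this repository: the agent names come from the diagram, while the global instance hostnames are hypothetical placeholders, since this document does not name them.

```python
from itertools import product

# Agent names are taken from the diagram; the global instance names are
# hypothetical placeholders, since this document does not name them.
AGENTS = ["k0.hswaw.net", "waw.hackerspace.pl"]
GLOBAL_INSTANCES = ["global-a.example.com", "global-b.example.com"]


def metric_flows(agents, global_instances):
    """Return every (agent, global instance) pair: a full mesh between tiers."""
    return list(product(agents, global_instances))


flows = metric_flows(AGENTS, GLOBAL_INSTANCES)
for agent, target in flows:
    print(f"{agent} -> {target}")
```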
Agent - cluster
---------------

Cluster agents are responsible for collecting Kubernetes cluster metrics. They run a Prometheus server that scrapes kubelet/cadvisor/... metrics and sends them off to the global instances.
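
As a rough sketch of what such an agent's configuration could look like, the Python below builds a minimal Prometheus config as a dictionary and prints it as JSON for inspection. It is not the configuration used in this repository. The job names, the scrape interval and the global instance URLs are assumptions made for illustration, and the TLS and authentication settings that a real kubelet scrape needs are left out. The keys used (`scrape_configs`, `kubernetes_sd_configs`, `metrics_path`, `remote_write`) are standard Prometheus configuration fields.

```python
import json


def cluster_agent_config(global_write_urls, scrape_interval="30s"):
    """Build a minimal Prometheus config dict for a cluster agent.

    Assumption: kubelet and cAdvisor are discovered through Kubernetes node
    service discovery; TLS and authentication settings are omitted here.
    """
    return {
        "global": {"scrape_interval": scrape_interval},
        "scrape_configs": [
            {
                "job_name": "kubelet",
                "kubernetes_sd_configs": [{"role": "node"}],
            },
            {
                "job_name": "cadvisor",
                "kubernetes_sd_configs": [{"role": "node"}],
                "metrics_path": "/metrics/cadvisor",
            },
        ],
        # One remote_write entry per global instance, so each one receives
        # the full metric stream.
        "remote_write": [{"url": url} for url in global_write_urls],
    }


# Hypothetical endpoints; /api/v1/write is the Prometheus remote write path
# accepted by single-node VictoriaMetrics.
urls = [
    "https://global-a.example.com/api/v1/write",
    "https://global-b.example.com/api/v1/write",
]
config = cluster_agent_config(urls)
print(json.dumps(config, indent=2))
```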

Global Instances
----------------

Global instances run VictoriaMetrics, ingest metrics from all agents, and perform long-term storage. In the future they will also run Grafana and Alertmanager.
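
Because VictoriaMetrics exposes a Prometheus-compatible query API, one way to check that a global instance is receiving data would be to run an instant query against it. The sketch below only builds the query URL and performs no network request. The hostname is a hypothetical placeholder, and whether the query endpoint is reachable from outside is an assumption this document does not confirm.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build a URL for the Prometheus-compatible instant query endpoint."""
    query_string = urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query_string}"


# Hypothetical global instance address; "up" is the per-target health
# series that Prometheus records for each scrape.
url = instant_query_url("https://global-a.example.com/", "up")
print(url)
```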