Add alerts for testnet services

Nabarun 2025-04-04 18:06:53 +05:30
parent 27412519b4
commit fb0138e975
2 changed files with 82 additions and 2 deletions


@@ -0,0 +1,64 @@
+apiVersion: 1
+groups:
+  - orgId: 1
+    name: testnet
+    folder: TestnetAlerts
+    interval: 30s
+    rules:
+      - uid: endpoint_down
+        title: endpoint_down
+        condition: condition
+        data:
+          - refId: probe_success
+            relativeTimeRange:
+              from: 600
+              to: 0
+            datasourceUid: PBFA97CFB590B2093
+            model:
+              datasource:
+                type: prometheus
+                uid: PBFA97CFB590B2093
+              editorMode: code
+              expr: probe_success{job="blackbox"}
+              instant: true
+              intervalMs: 1000
+              legendFormat: __auto
+              maxDataPoints: 43200
+              range: false
+              refId: probe_success
+          - refId: condition
+            relativeTimeRange:
+              from: 600
+              to: 0
+            datasourceUid: __expr__
+            model:
+              conditions:
+                - evaluator:
+                    params:
+                      - 0
+                      - 0
+                    type: eq
+                  operator:
+                    type: and
+                  query:
+                    params: []
+                  reducer:
+                    params: []
+                    type: avg
+                  type: query
+              datasource:
+                name: Expression
+                type: __expr__
+                uid: __expr__
+              expression: ${probe_success} == 0
+              intervalMs: 1000
+              maxDataPoints: 43200
+              refId: condition
+              type: math
+        noDataState: Alerting
+        execErrState: Alerting
+        for: 5m
+        annotations:
+          summary: Endpoint {{ $labels.instance }} is down
+        isPaused: false
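
For a quick sanity check of the `endpoint_down` condition outside Grafana, the same comparison can be run directly against Prometheus. The sketch below assumes Prometheus is reachable at `http://localhost:9090` (override with `PROM_URL`). The helper names `down_endpoints_query` and `list_down_endpoints` are hypothetical and not part of this commit. It folds the rule's two steps (the `probe_success` query and the `== 0` math expression) into a single PromQL filter, so it only approximates the condition and does not model `for: 5m` or the no-data handling.

```bash
#!/usr/bin/env bash
# Sketch only: PROM_URL and both helper names are assumptions, not part of the commit.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Build a PromQL filter that mirrors the rule:
# the probe_success query followed by the "== 0" comparison.
down_endpoints_query() {
  local job="${1:-blackbox}"
  printf 'probe_success{job="%s"} == 0' "$job"
}

# Ask the Prometheus HTTP API (instant query) for targets failing their probe now.
# An empty "result" array in the JSON response means no endpoint is currently down.
list_down_endpoints() {
  curl -sS -G "${PROM_URL}/api/v1/query" \
    --data-urlencode "query=$(down_endpoints_query "${1:-}")"
}

# Usage (needs a reachable Prometheus):
#   list_down_endpoints blackbox
```

Any series returned by `list_down_endpoints` corresponds to an `instance` label that the rule would start counting towards its 5 minute pending period.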


@@ -4,7 +4,7 @@ Instructions to setup and run monitoring stack for testnet services
 ## Create a deployment
-After completing [setup](./README.md#setup), create a spec file for the deployment, which will map the stack's ports and volumes to the host:
+Create a spec file for the deployment, which will map the stack's ports and volumes to the host:
 ```bash
 laconic-so --stack monitoring deploy init --output monitoring-testnet-spec.yml
@@ -37,7 +37,7 @@ laconic-so --stack monitoring deploy create --spec-file monitoring-testnet-spec.
 ### Prometheus scrape config
-Add the following scrape configs to prometheus config file (`monitoring-testnet-deployment/config/monitoring/prometheus/prometheus.yml`) in the deployment folder:
+- Setup the following scrape configs in prometheus config file (`monitoring-testnet-deployment/config/monitoring/prometheus/prometheus.yml`) in the deployment folder:
 ```yml
 ...
@@ -62,6 +62,22 @@ Add the following scrape configs to prometheus config file (`monitoring-testnet-
 # Example: 'host.docker.internal:3317'
 ```
+- Remove docker compose services which are not required in `monitoring-testnet-deployment/compose/docker-compose-prom-server.yml`
+  - `ethereum-chain-head-exporter`
+  - `filecoin-chain-head-exporter`
+  - `graph-node-upstream-head-exporter`
+  - `postgres-exporter`
+### Grafana dashboards
+Remove some of the existing dashboards which are not required in monitoring testnet
+```
+cd monitoring-testnet-deployment/config/monitoring/grafana/dashboards
+rm postgres-dashboard.json subgraphs-dashboard.json watcher-dashboard.json
+cd -
+```
+<!-- TODO: Check node-exporter-full.json, nodejs-app-dashboard.json -->
 ### Grafana alerts config
 Place the pre-configured alerts rules in Grafana provisioning directory: