check-health: monitor CPU load

---- ✂️ ----
🧮📈️ Health warning: CPU load

The average CPU load on MikroTik is at 76%!
---- ✂️ ----
🧮📉️ Health recovery: CPU load

The average CPU load on MikroTik decreased to 64%.
---- ✂️ ----
Christian Hesse 2023-01-20 14:24:20 +01:00
parent 2694f8d2b1
commit 75bd14267e
10 changed files with 30 additions and 7 deletions

@@ -10,6 +10,8 @@
 :global GlobalFunctionsReady;
 :while ($GlobalFunctionsReady != true) do={ :delay 500ms; }
+:global CheckHealthCPULoad;
+:global CheckHealthCPULoadNotified;
 :global CheckHealthLast;
 :global CheckHealthTemperature;
 :global CheckHealthTemperatureDeviation;
@@ -43,6 +45,20 @@
 $ScriptLock $0;
+:set CheckHealthCPULoad (($CheckHealthCPULoad * 4 + [ /system/resource/get cpu-load ] * 10) / 5);
+:if ($CheckHealthCPULoad > 750 && $CheckHealthCPULoadNotified != true) do={
+  $SendNotification2 ({ origin=$0; \
+    subject=([ $SymbolForNotification "abacus,chart-increasing" ] . "Health warning: CPU load"); \
+    message=("The average CPU load on " . $Identity . " is at " . ($CheckHealthCPULoad / 10) . "%!") });
+  :set CheckHealthCPULoadNotified true;
+}
+:if ($CheckHealthCPULoad < 650 && $CheckHealthCPULoadNotified = true) do={
+  $SendNotification2 ({ origin=$0; \
+    subject=([ $SymbolForNotification "abacus,chart-decreasing" ] . "Health recovery: CPU load"); \
+    message=("The average CPU load on " . $Identity . " decreased to " . ($CheckHealthCPULoad / 10) . "%.") });
+  :set CheckHealthCPULoadNotified false;
+}
 :foreach Voltage in=[ /system/health/find where type="V" ] do={
   :local Name [ /system/health/get $Voltage name ];
   :local Value [ /system/health/get $Voltage value ];

Binary file not shown (added, 5.9 KiB).

Binary file not shown (added, 6.2 KiB).

Binary file not shown (before: 4 KiB, after: 4 KiB).

Binary file not shown (before: 3.5 KiB, after: 3.5 KiB).

Binary file not shown (before: 3.7 KiB, after: 3.7 KiB).

Binary file not shown (before: 3.5 KiB, after: 3.5 KiB).

Binary file not shown (before: 3.5 KiB, after: 3.5 KiB).

@@ -12,32 +12,38 @@ Description
 This script is run from scheduler periodically, sending notification on
 health related events:
+* high CPU load
 * voltage jumps up or down more than configured threshold or drops below limit
 * power supply failed or recovered
 * temperature is above or below threshold
 Note that bad initial state will not trigger an event.
-Only sensors available in hardware can be checked. See what your
-hardware supports:
+Monitoring CPU load works on all devices. Other than that only sensors
+available in hardware can be checked. See what your hardware supports:
 /system/health/print;
 ### Sample notifications
+#### CPU load
+![check-health notification cpu load high](check-health.d/notification-01-cpu-load-high.avif)
+![check-health notification cpu load ok](check-health.d/notification-02-cpu-load-ok.avif)
 #### Voltage
-![check-health notification voltage](check-health.d/notification-01-voltage.avif)
+![check-health notification voltage](check-health.d/notification-03-voltage.avif)
 #### Temperature
-![check-health notification](check-health.d/notification-02-temperature-high.avif)
-![check-health notification](check-health.d/notification-03-temperature-ok.avif)
+![check-health notification temperature high](check-health.d/notification-04-temperature-high.avif)
+![check-health notification temperature ok](check-health.d/notification-05-temperature-ok.avif)
 #### PSU state
-![check-health notification](check-health.d/notification-04-psu-fail.avif)
-![check-health notification](check-health.d/notification-05-psu-ok.avif)
+![check-health notification psu fail](check-health.d/notification-06-psu-fail.avif)
+![check-health notification psu ok](check-health.d/notification-07-psu-ok.avif)
 Requirements and installation
 -----------------------------

@@ -1075,6 +1075,7 @@
 # return UTF-8 symbol for unicode name
 :set SymbolByUnicodeName do={
   :local Symbols {
+    "abacus"="\F0\9F\A7\AE";
     "alarm-clock"="\E2\8F\B0";
     "calendar"="\F0\9F\93\85";
     "chart-decreasing"="\F0\9F\93\89";
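
The values in the symbol table appear to be the UTF-8 bytes of each emoji, written as `\XX` hex escapes. As a way to check an entry such as the new `abacus` one, here is a small Python sketch that converts in both directions. The function names are hypothetical, and the decoder only handles strings made up entirely of `\XX` escapes, which is all the table above contains.

```python
import re


def to_routeros_escapes(text: str) -> str:
    """Encode text as UTF-8 and write every byte as a \\XX hex escape."""
    return "".join(f"\\{byte:02X}" for byte in text.encode("utf-8"))


def from_routeros_escapes(escaped: str) -> str:
    """Decode a string consisting only of \\XX hex escapes back to text."""
    pairs = re.findall(r"\\([0-9A-Fa-f]{2})", escaped)
    return bytes(int(pair, 16) for pair in pairs).decode("utf-8")


if __name__ == "__main__":
    # U+1F9EE ABACUS, the symbol added in this commit.
    print(to_routeros_escapes("\U0001F9EE"))
    print(from_routeros_escapes(r"\F0\9F\A7\AE"))
```

The first call prints `\F0\9F\A7\AE`, matching the table entry, and the second prints the abacus symbol itself.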