check-health: monitor free RAM

---- ✂️ ----
🗃️📉️ Health warning: free RAM

The available free RAM on MikroTik is at 18% (47MiB)!
---- ✂️ ----
🗃️📈️ Health recovery: free RAM

The available free RAM on MikroTik increased to 65% (168MiB).
---- ✂️ ----
Christian Hesse 2023-01-20 14:34:18 +01:00
parent 75bd14267e
commit 6780e1a24c
10 changed files with 36 additions and 8 deletions

@@ -12,6 +12,7 @@
 :global CheckHealthCPULoad;
 :global CheckHealthCPULoadNotified;
+:global CheckHealthFreeRAMNotified;
 :global CheckHealthLast;
 :global CheckHealthTemperature;
 :global CheckHealthTemperatureDeviation;
@@ -45,7 +46,9 @@
 $ScriptLock $0;
-:set CheckHealthCPULoad (($CheckHealthCPULoad * 4 + [ /system/resource/get cpu-load ] * 10) / 5);
+:local Resource [ /system/resource/get ];
+:set CheckHealthCPULoad (($CheckHealthCPULoad * 4 + ($Resource->"cpu-load") * 10) / 5);
 :if ($CheckHealthCPULoad > 750 && $CheckHealthCPULoadNotified != true) do={
   $SendNotification2 ({ origin=$0; \
     subject=([ $SymbolForNotification "abacus,chart-increasing" ] . "Health warning: CPU load"); \
@@ -59,6 +62,23 @@ $ScriptLock $0;
   :set CheckHealthCPULoadNotified false;
 }
+:local CheckHealthFreeRAM ($Resource->"free-memory" * 100 / $Resource->"total-memory");
+:if ($CheckHealthFreeRAM < 20 && $CheckHealthFreeRAMNotified != true) do={
+  $SendNotification2 ({ origin=$0; \
+    subject=([ $SymbolForNotification "card-file-box,chart-decreasing" ] . "Health warning: free RAM"); \
+    message=("The available free RAM on " . $Identity . " is at " . $CheckHealthFreeRAM . "% (" . \
+      ($Resource->"free-memory" / 1024 / 1024) . "MiB)!") });
+  :set CheckHealthFreeRAMNotified true;
+}
+:if ($CheckHealthFreeRAM > 30 && $CheckHealthFreeRAMNotified = true) do={
+  $SendNotification2 ({ origin=$0; \
+    subject=([ $SymbolForNotification "card-file-box,chart-increasing" ] . "Health recovery: free RAM"); \
+    message=("The available free RAM on " . $Identity . " increased to " . $CheckHealthFreeRAM . "% (" . \
+      ($Resource->"free-memory" / 1024 / 1024) . "MiB).") });
+  :set CheckHealthFreeRAMNotified false;
+}
 :foreach Voltage in=[ /system/health/find where type="V" ] do={
   :local Name [ /system/health/get $Voltage name ];
   :local Value [ /system/health/get $Voltage value ];

Binary files not shown: two images added (6.3 KiB and 6.6 KiB); five existing images changed with identical before and after sizes (4 KiB, 3.5 KiB, 3.7 KiB, 3.5 KiB, 3.5 KiB).

@@ -13,14 +13,16 @@ This script is run from scheduler periodically, sending notification on
 health related events:
 * high CPU load
+* low available free RAM
 * voltage jumps up or down more than configured threshold or drops below limit
 * power supply failed or recovered
 * temperature is above or below threshold
 Note that bad initial state will not trigger an event.
-Monitoring CPU load works on all devices. Other than that only sensors
-available in hardware can be checked. See what your hardware supports:
+Monitoring CPU load and available free RAM works on all devices. Other
+than that only sensors available in hardware can be checked. See what your
+hardware supports:
     /system/health/print;
@@ -31,19 +33,24 @@ available in hardware can be checked. See what your hardware supports:
 ![check-health notification cpu load high](check-health.d/notification-01-cpu-load-high.avif)
 ![check-health notification cpu load ok](check-health.d/notification-02-cpu-load-ok.avif)
+#### Available free RAM
+![check-health notification free ram low](check-health.d/notification-03-free-ram-low.avif)
+![check-health notification free ram ok](check-health.d/notification-04-free-ram-ok.avif)
 #### Voltage
-![check-health notification voltage](check-health.d/notification-03-voltage.avif)
+![check-health notification voltage](check-health.d/notification-05-voltage.avif)
 #### Temperature
-![check-health notification temperature high](check-health.d/notification-04-temperature-high.avif)
-![check-health notification temperature ok](check-health.d/notification-05-temperature-ok.avif)
+![check-health notification temperature high](check-health.d/notification-06-temperature-high.avif)
+![check-health notification temperature ok](check-health.d/notification-07-temperature-ok.avif)
 #### PSU state
-![check-health notification psu fail](check-health.d/notification-06-psu-fail.avif)
-![check-health notification psu ok](check-health.d/notification-07-psu-ok.avif)
+![check-health notification psu fail](check-health.d/notification-08-psu-fail.avif)
+![check-health notification psu ok](check-health.d/notification-09-psu-ok.avif)
 Requirements and installation
 -----------------------------

@@ -1078,6 +1078,7 @@
 "abacus"="\F0\9F\A7\AE";
 "alarm-clock"="\E2\8F\B0";
 "calendar"="\F0\9F\93\85";
+"card-file-box"="\F0\9F\97\83";
 "chart-decreasing"="\F0\9F\93\89";
 "chart-increasing"="\F0\9F\93\88";
 "cloud"="\E2\98\81";
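
The values in this table are UTF-8 byte sequences written as backslash-hex escapes. As a quick way to check a new entry, the following Python sketch (the helper name `decode_symbol` is hypothetical and not part of the repository) decodes such a sequence back to its character and prints the code point and Unicode name.

```python
import unicodedata


def decode_symbol(escaped):
    """Decode a backslash-hex byte sequence such as \\F0\\9F\\97\\83 as UTF-8."""
    hex_digits = escaped.replace("\\", "")
    return bytes.fromhex(hex_digits).decode("utf-8")


# The entry added in this commit.
symbol = decode_symbol(r"\F0\9F\97\83")
print(symbol, f"U+{ord(symbol):04X}", unicodedata.name(symbol))
```

A sequence that is not valid UTF-8 raises `UnicodeDecodeError`, which makes a mistyped byte easy to spot.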