“A Python script that monitors your system health (disk usage) when run as a cron job. The script also alerts the listed email recipient by sending an email when anomalies are detected, so that corrective action can be taken.”
Srimanta Panda
Technologist
Auto Bot to Monitor System Anomalies (Disk Usage)
Every day, millions of servers run relentlessly across the world. People keep deploying code with the ambition of creating amazing features to deliver to customers. As features are added and deployed, end users start using them, and the servers stay up and running for hours on end. In the process, people sometimes forget to check the health of the server, and the problem stays hidden behind feature deployments.
Then one day the server suddenly starts misbehaving, and that is what I am going to write about today. Developers start digging through the code to find the fault, whereas the root cause might be the server infrastructure itself making the application behave strangely.
The issue might be low disk storage, high CPU usage, or low memory. How do we prepare ourselves to monitor system health and handle such situations? The answer is simple: a cron job running a Python script can do the magic. Voilà!
A Simple Python script
“Here is a link to the Python script on GitHub that checks the disk usage of your server and alerts you about its status. The script currently reports only disk usage; a new version will add more features, such as memory usage and CPU status!”
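To give an idea of what such a check involves, here is a minimal sketch using only the standard library. It is not the code from the linked script; the function names `disk_usage_percent` and `exceeds_threshold` are made up for illustration, and the default of 60 simply mirrors the threshold value used later in this post.

```python
import shutil


def disk_usage_percent(partition="/"):
    """Return the used share of `partition` as a percentage (0-100)."""
    usage = shutil.disk_usage(partition)  # named tuple: total, used, free (in bytes)
    return usage.used / usage.total * 100


def exceeds_threshold(percent_used, threshold=60):
    """Return True when usage is strictly above the alert threshold."""
    return percent_used > threshold


if __name__ == "__main__":
    percent = disk_usage_percent("/")
    print(f"/ is {percent:.1f}% full; alert needed: {exceeds_threshold(percent)}")
```

The percentage computed this way can differ slightly from what `df` prints, because `df` excludes blocks reserved for the root user from its calculation.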
How?
Here are the steps for setting up the script on the server that you want to monitor.
01
Update Parameter in the Script
To keep the script simple, these parameters are hard-coded inside it. Alternatively, they can be read from environment variables.
PARTITION = '/'
This is the partition you want to keep track of. The script handles one partition right now, but it can be extended to handle multiple partitions with a few simple modifications, or you can set up a separate cron job for each partition (not recommended, though).
THRESOLD = 60
This is the threshold level of disk usage, as a percentage. If the usage exceeds this limit, the script sends an alert.
SENDER_EMAIL = '<input sender email>'
The email address that will be used for sending the email.
SENDER_PASSWORD = '<sender email password>'
The password for the sender email address. This is the tricky part, which I will explain in a later section.
ALERT_EMAILS = '<receiver email>'
The recipient email address.
SMTP_SERVER = '<smtp server for email>'
The SMTP server for the sender email. For Gmail, this is smtp.gmail.com.
SMTP_SERVER_PORT = 587
SMTP server port that will be used for sending the email.
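If you prefer environment variables over hard-coded values, the parameters above could be loaded roughly as follows. This is only a sketch: the `MONITOR_*` variable names and the `load_config` helper are assumptions made for this example and do not exist in the script. The comma-separated partition list is one possible way to cover several partitions with a single cron job.

```python
import os


def _split_list(value):
    """Split a comma-separated string into a list of non-empty, trimmed items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def load_config(env=os.environ):
    """Build a settings dict from environment variables, with fallback defaults.

    All MONITOR_* names are hypothetical and chosen for this example only.
    """
    return {
        "partitions": _split_list(env.get("MONITOR_PARTITIONS", "/")),
        "threshold": int(env.get("MONITOR_THRESHOLD", "60")),
        "sender_email": env.get("MONITOR_SENDER_EMAIL", ""),
        "sender_password": env.get("MONITOR_SENDER_PASSWORD", ""),
        "alert_emails": _split_list(env.get("MONITOR_ALERT_EMAILS", "")),
        "smtp_server": env.get("MONITOR_SMTP_SERVER", "smtp.gmail.com"),
        "smtp_port": int(env.get("MONITOR_SMTP_PORT", "587")),
    }
```

Keep in mind that cron starts jobs with a very small environment, so the variables would have to be defined in the crontab itself or in a wrapper script rather than in your shell profile.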
02
Write a Cron Job to run periodically
A cron job can be set up to run the system check periodically. For example, the following entry runs the script at minute 1 of every hour:
01 */1 * * * /home/debian-local/script/machine-alert-monitor.py
03
Mail Formatting
You can always change the formatting of the email message to your liking. You can replace this part of the code and insert your own wording for the email body.
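As an illustration of what a customized message might look like, the sketch below builds an alert with the standard `email.message.EmailMessage` class. The `build_alert` function, its parameters, the placeholder hostname, and the wording of the subject and body are all assumptions for this example, not the text used by the script.

```python
from email.message import EmailMessage


def build_alert(sender, recipients, partition, percent_used, threshold,
                hostname="my-server"):
    """Build a plain-text alert email (hypothetical wording, adjust freely)."""
    msg = EmailMessage()
    msg["Subject"] = f"[ALERT] Disk usage on {hostname}: {percent_used:.1f}%"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content(
        f"Partition {partition} on {hostname} is {percent_used:.1f}% full,\n"
        f"which is above the configured threshold of {threshold}%.\n"
        "Please free up space or extend the volume.\n"
    )
    return msg
```

Putting the host name and the measured percentage in the subject line makes the alert readable from the inbox list, without opening the message.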
04
Hint about Gmail as SMTP Server
When I started testing the script with my own email account, Google kept returning authentication failures even though I was entering the correct password. After a little investigation, I found that, for the script to work, you need to generate an app password in your Google Account. It can then be used to authenticate third-party applications that send email through Gmail.
To generate the app password, you need to enable 2-Step Verification for the Google Account. Otherwise, the option will not be available in the Google Account settings.
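Once the app password exists, sending the alert could look roughly like the sketch below. It assumes the server accepts STARTTLS on the configured port (587 in the parameters above) and that the app password is passed in place of the regular account password. The `send_alert` function and its `smtp_factory` parameter are made up for this example; the factory only exists so the function can be tried out without a real mail server.

```python
import smtplib


def send_alert(msg, smtp_server, smtp_port, sender_email, app_password,
               smtp_factory=smtplib.SMTP):
    """Send `msg` over SMTP, upgrading the connection to TLS before login."""
    with smtp_factory(smtp_server, smtp_port) as server:
        server.starttls()                         # encrypt before sending credentials
        server.login(sender_email, app_password)  # app password, not the account password
        server.send_message(msg)                  # From/To are read from the message headers
```

With Gmail, a wrong or regular account password typically surfaces here as `smtplib.SMTPAuthenticationError`, which is the failure described above.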