Why Monitor Services with Bash?
Production servers run critical services — web servers, databases, application processes — that must remain available. While enterprise monitoring solutions provide comprehensive dashboards and alerting, a simple bash script can serve as a lightweight watchdog that detects failures and takes immediate action.
Bash-based monitoring is especially useful for small environments, edge servers, or as a backup monitoring layer that operates independently from your primary monitoring stack. It requires no additional software installation and can be set up in minutes.
Checking Service Status with systemctl
On systemd-based Linux distributions (Ubuntu 16.04+, CentOS 7+, Debian 8+), the systemctl is-active command provides a simple way to check whether a service is running:
$ systemctl is-active nginx
active
$ systemctl is-active mysql
active
$ systemctl is-active some-stopped-service
inactive
The command returns a single word — active, inactive, failed, or activating — and sets the exit code accordingly. An exit code of 0 means active; any non-zero exit code means the service is not running.
systemctl is-active nginx
echo $? # 0 if active, non-zero if not
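Since systemctl is-active also prints the state as a word, a script can branch on that string as well as on the exit code. A minimal sketch, where describe_state is a hypothetical helper name (not part of systemctl):

```shell
#!/bin/bash
# Map the word that `systemctl is-active` prints to a
# human-readable description for log messages.
describe_state() {
    case "$1" in
        active)     echo "running normally" ;;
        inactive)   echo "stopped" ;;
        failed)     echo "crashed or exited with an error" ;;
        activating) echo "still starting up" ;;
        *)          echo "unknown state: $1" ;;
    esac
}

# On a live server you would feed it the real output, e.g.:
#   STATE=$(systemctl is-active nginx)
#   echo "nginx is $(describe_state "$STATE")"
describe_state "failed"   # prints: crashed or exited with an error
```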
Basic Monitoring Script
Here is a minimal script that checks a service and restarts it if it is not running:
#!/bin/bash
# monitor-service.sh - Check and restart a service if down

SERVICE="nginx"

if ! systemctl is-active --quiet "$SERVICE"; then
    echo "$(date): $SERVICE is down. Attempting restart..." >> /var/log/service-monitor.log
    systemctl restart "$SERVICE"

    # Check if restart succeeded
    sleep 2
    if systemctl is-active --quiet "$SERVICE"; then
        echo "$(date): $SERVICE restarted successfully." >> /var/log/service-monitor.log
    else
        echo "$(date): CRITICAL - $SERVICE failed to restart!" >> /var/log/service-monitor.log
    fi
fi
The --quiet flag suppresses output and only sets the exit code, making it ideal for scripting. The script logs all actions to a file for later review.
Adding Email Alerts
When a service goes down, you need to know about it immediately. Add email notifications using mailx or sendmail:
#!/bin/bash
# monitor-service-email.sh
SERVICE="nginx"
ADMIN_EMAIL="admin@example.com"
HOSTNAME=$(hostname)
LOGFILE="/var/log/service-monitor.log"

if ! systemctl is-active --quiet "$SERVICE"; then
    TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')
    echo "$TIMESTAMP: $SERVICE is down on $HOSTNAME" >> "$LOGFILE"

    # Attempt restart
    systemctl restart "$SERVICE"
    sleep 3

    if systemctl is-active --quiet "$SERVICE"; then
        SUBJECT="[RECOVERED] $SERVICE restarted on $HOSTNAME"
        BODY="$SERVICE was found inactive at $TIMESTAMP and was automatically restarted."
        echo "$TIMESTAMP: $SERVICE recovered after restart" >> "$LOGFILE"
    else
        SUBJECT="[CRITICAL] $SERVICE DOWN on $HOSTNAME"
        BODY="$SERVICE was found inactive at $TIMESTAMP. Automatic restart FAILED. Immediate attention required."
        echo "$TIMESTAMP: CRITICAL - $SERVICE restart failed" >> "$LOGFILE"
    fi

    echo "$BODY" | mailx -s "$SUBJECT" "$ADMIN_EMAIL"
fi
Make sure mailx is installed and your server can send outbound email. On Ubuntu, install it with sudo apt install mailutils.
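Before depending on email alerts in production, it helps to confirm the delivery path with a one-off test message. A small sketch (the recipient address is a placeholder to replace with your real admin mailbox):

```shell
#!/bin/bash
# Quick delivery test for the alerting path.
ADMIN_EMAIL="admin@example.com"   # placeholder - use your real address
SUBJECT="[TEST] service monitor mail check from $(hostname)"

if command -v mailx >/dev/null 2>&1; then
    echo "If you can read this, outbound mail works." | mailx -s "$SUBJECT" "$ADMIN_EMAIL"
    echo "Test message handed to mailx."
else
    echo "mailx not found - install it first (sudo apt install mailutils)."
fi
```

If the message never arrives, check the local mail log and confirm your server is allowed to make outbound SMTP connections.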
Adding Slack Alerts
Many teams prefer Slack notifications. You can send alerts to a Slack channel using an incoming webhook:
#!/bin/bash
# send-slack-alert.sh
SLACK_WEBHOOK="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
SERVICE="$1"
STATUS="$2"
HOSTNAME=$(hostname)

if [ "$STATUS" = "critical" ]; then
    COLOR="#FF0000"
    ICON=":red_circle:"
else
    COLOR="#36a64f"
    ICON=":white_check_mark:"
fi

PAYLOAD=$(cat <<EOJSON
{
  "attachments": [{
    "color": "$COLOR",
    "title": "$ICON $SERVICE on $HOSTNAME",
    "text": "Service status: $STATUS\nTime: $(date '+%Y-%m-%d %H:%M:%S')",
    "footer": "Service Monitor"
  }]
}
EOJSON
)

curl -s -X POST -H 'Content-type: application/json' --data "$PAYLOAD" "$SLACK_WEBHOOK"
Call this from your monitoring script:
# After failed restart
bash /opt/scripts/send-slack-alert.sh "$SERVICE" "critical"
# After successful restart
bash /opt/scripts/send-slack-alert.sh "$SERVICE" "recovered"
Monitoring Multiple Services
Most servers run multiple services. Here is a script that monitors a list of services:
#!/bin/bash
# monitor-all-services.sh
SERVICES=("nginx" "mysql" "php8.1-fpm" "redis-server" "postfix")
ADMIN_EMAIL="admin@example.com"
HOSTNAME=$(hostname)
LOGFILE="/var/log/service-monitor.log"
FAILED_SERVICES=()

for SERVICE in "${SERVICES[@]}"; do
    if ! systemctl is-active --quiet "$SERVICE"; then
        TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')
        echo "$TIMESTAMP: $SERVICE is down. Attempting restart..." >> "$LOGFILE"
        systemctl restart "$SERVICE"
        sleep 2

        if ! systemctl is-active --quiet "$SERVICE"; then
            echo "$TIMESTAMP: CRITICAL - $SERVICE restart failed" >> "$LOGFILE"
            FAILED_SERVICES+=("$SERVICE")
        else
            echo "$TIMESTAMP: $SERVICE recovered" >> "$LOGFILE"
        fi
    fi
done

# Send a single alert for all failed services
if [ ${#FAILED_SERVICES[@]} -gt 0 ]; then
    SUBJECT="[CRITICAL] Services DOWN on $HOSTNAME"
    BODY="The following services failed to restart:\n\n"
    for s in "${FAILED_SERVICES[@]}"; do
        BODY+=" - $s\n"
    done
    BODY+="\nImmediate attention required."
    echo -e "$BODY" | mailx -s "$SUBJECT" "$ADMIN_EMAIL"
fi
Continuous Loop Monitoring
Instead of relying on cron for periodic checks, you can run the monitoring script as a continuous loop with a sleep interval:
#!/bin/bash
# continuous-monitor.sh
SERVICE="nginx"
CHECK_INTERVAL=30   # seconds between checks
MAX_RESTARTS=3      # max restarts before stopping attempts
restart_count=0

while true; do
    if ! systemctl is-active --quiet "$SERVICE"; then
        if [ "$restart_count" -lt "$MAX_RESTARTS" ]; then
            echo "$(date): $SERVICE down. Restart attempt $((restart_count + 1))/$MAX_RESTARTS" >> /var/log/service-monitor.log
            systemctl restart "$SERVICE"
            restart_count=$((restart_count + 1))
            sleep 5
        else
            echo "$(date): $SERVICE - max restart attempts reached. Alerting." >> /var/log/service-monitor.log
            # Send critical alert here
            # Note: break exits the script; if it runs under systemd with
            # Restart=always, the monitor comes back with a fresh counter.
            break
        fi
    else
        restart_count=0   # Reset counter when service is healthy
    fi
    sleep "$CHECK_INTERVAL"
done
Run the monitor as a systemd service itself so it starts automatically on boot and is restarted if it ever exits:
# /etc/systemd/system/service-monitor.service
[Unit]
Description=Service Health Monitor
After=network.target

[Service]
Type=simple
ExecStart=/opt/scripts/continuous-monitor.sh
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
Then reload systemd and enable the monitor:
sudo systemctl daemon-reload
sudo systemctl enable service-monitor
sudo systemctl start service-monitor
Cron-Based Monitoring
For simpler setups, schedule the monitoring script to run at regular intervals using cron:
# Check services every 2 minutes
*/2 * * * * /opt/scripts/monitor-all-services.sh
# Check every minute for critical services
* * * * * /opt/scripts/monitor-service.sh
Edit the crontab with sudo crontab -e to add these entries. Make sure the script has execute permissions:
chmod +x /opt/scripts/monitor-all-services.sh
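If a check can occasionally take longer than the interval between runs, overlapping instances may race each other over restarts. Wrapping the cron entry in flock (part of util-linux on most distributions) ensures only one copy runs at a time; the lock-file path here is an arbitrary choice:

```shell
# Run under an exclusive lock; -n skips this run if the previous one is still going
*/2 * * * * /usr/bin/flock -n /tmp/service-monitor.lock /opt/scripts/monitor-all-services.sh
```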
Logging to a File
Consistent logging is essential for post-incident analysis. Include timestamps, service names, actions taken, and results in every log entry:
log_event() {
    local LEVEL="$1"
    local MESSAGE="$2"
    echo "$(date '+%Y-%m-%d %H:%M:%S') [$LEVEL] $MESSAGE" >> /var/log/service-monitor.log
}
# Usage
log_event "INFO" "Starting service check"
log_event "WARNING" "nginx is inactive"
log_event "ERROR" "nginx restart failed"
log_event "INFO" "mysql is active - OK"
Rotate the log file using logrotate to prevent it from growing indefinitely:
# /etc/logrotate.d/service-monitor
/var/log/service-monitor.log {
weekly
rotate 4
compress
missingok
notifempty
}
Understanding Exit Codes
Exit codes are fundamental to bash scripting and service monitoring. Every command returns an exit code: 0 for success, non-zero for failure. For systemctl is-active, the codes map to service states:
- 0: Service is active and running.
- 3: Service is inactive (stopped or dead).
- 4: Service unit not found (misspelled name or not installed).
Use $? to capture the exit code of the last command, or use the command directly in an if statement as shown in the scripts above.
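The mapping above can be wrapped in a small helper for clearer log messages. This is a sketch (explain_exit_code is a hypothetical name, and the codes follow the systemctl is-active table above):

```shell
#!/bin/bash
# Translate a `systemctl is-active` exit code into a
# log-friendly description.
explain_exit_code() {
    case "$1" in
        0) echo "service is active" ;;
        3) echo "service is inactive" ;;
        4) echo "no such unit" ;;
        *) echo "unexpected exit code: $1" ;;
    esac
}

# On a live system:
#   systemctl is-active --quiet nginx
#   explain_exit_code "$?"
explain_exit_code 3   # prints: service is inactive
```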
Beyond Bash: When to Upgrade
Bash-based monitoring works well for simple scenarios, but consider upgrading to a dedicated monitoring solution when you need:
- Monitoring across dozens or hundreds of servers
- Historical metrics, graphing, and trend analysis
- Complex alerting rules with escalation policies
- Application-level health checks (HTTP endpoints, database queries)
- Integration with incident management workflows
NOC.org provides continuous monitoring that covers infrastructure and application health with enterprise-grade alerting.
Summary
A bash script combined with systemctl is-active gives you a lightweight, reliable way to monitor service health on Linux servers. Start with a simple check-and-restart script, add email or Slack alerting, and scale up to monitoring multiple services with a single script. Whether you run it in a continuous loop or schedule it through cron, bash-based monitoring ensures that service failures are detected and addressed quickly. For comprehensive infrastructure monitoring across your entire environment, explore NOC.org's monitoring capabilities and review our Linux security checklist for additional hardening steps.