Monitor Service Status with a Bash Script | NOC.org

Why Monitor Services with Bash?

Production servers run critical services — web servers, databases, application processes — that must remain available. While enterprise monitoring solutions provide comprehensive dashboards and alerting, a simple bash script can serve as a lightweight watchdog that detects failures and takes immediate action.

Bash-based monitoring is especially useful for small environments, edge servers, or as a backup monitoring layer that operates independently from your primary monitoring stack. It requires no additional software installation and can be set up in minutes.

Checking Service Status with systemctl

On systemd-based Linux distributions (Ubuntu 16.04+, CentOS 7+, Debian 8+), the systemctl is-active command provides a simple way to check whether a service is running:

$ systemctl is-active nginx
active

$ systemctl is-active mysql
active

$ systemctl is-active some-stopped-service
inactive

The command returns a single word — active, inactive, failed, or activating — and sets the exit code accordingly. An exit code of 0 means active; any non-zero exit code means the service is not running.

systemctl is-active nginx
echo $?    # 0 if active, non-zero if not

Basic Monitoring Script

Here is a minimal script that checks a service and restarts it if it is not running:

#!/bin/bash
# monitor-service.sh - Check and restart a service if down

SERVICE="nginx"

if ! systemctl is-active --quiet "$SERVICE"; then
    echo "$(date): $SERVICE is down. Attempting restart..." >> /var/log/service-monitor.log
    systemctl restart "$SERVICE"

    # Check if restart succeeded
    sleep 2
    if systemctl is-active --quiet "$SERVICE"; then
        echo "$(date): $SERVICE restarted successfully." >> /var/log/service-monitor.log
    else
        echo "$(date): CRITICAL - $SERVICE failed to restart!" >> /var/log/service-monitor.log
    fi
fi

The --quiet flag suppresses output and only sets the exit code, making it ideal for scripting. The script logs all actions to a file for later review.

Adding Email Alerts

When a service goes down, you need to know about it immediately. Add email notifications using mailx or sendmail:

#!/bin/bash
# monitor-service-email.sh

SERVICE="nginx"
ADMIN_EMAIL="admin@example.com"
HOSTNAME=$(hostname)
LOGFILE="/var/log/service-monitor.log"

if ! systemctl is-active --quiet "$SERVICE"; then
    TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')
    echo "$TIMESTAMP: $SERVICE is down on $HOSTNAME" >> "$LOGFILE"

    # Attempt restart
    systemctl restart "$SERVICE"
    sleep 3

    if systemctl is-active --quiet "$SERVICE"; then
        SUBJECT="[RECOVERED] $SERVICE restarted on $HOSTNAME"
        BODY="$SERVICE was found inactive at $TIMESTAMP and was automatically restarted."
        echo "$TIMESTAMP: $SERVICE recovered after restart" >> "$LOGFILE"
    else
        SUBJECT="[CRITICAL] $SERVICE DOWN on $HOSTNAME"
        BODY="$SERVICE was found inactive at $TIMESTAMP. Automatic restart FAILED. Immediate attention required."
        echo "$TIMESTAMP: CRITICAL - $SERVICE restart failed" >> "$LOGFILE"
    fi

    echo "$BODY" | mailx -s "$SUBJECT" "$ADMIN_EMAIL"
fi

Make sure mailx is installed and your server can send outbound email. On Ubuntu, install it with sudo apt install mailutils.
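
A quick way to verify the mail path works before relying on it is to send a test message by hand. The address below is a placeholder, and the snippet checks for mailx first rather than assuming it is installed:

```shell
#!/bin/bash
# Send a test alert to confirm outbound mail works.
# admin@example.com is a placeholder address.
if command -v mailx >/dev/null 2>&1; then
    echo "Service monitor test from $(hostname)" | mailx -s "Test alert" admin@example.com
    RESULT="sent"
else
    RESULT="mailx-missing"   # install mailutils (Ubuntu) first
fi
echo "$RESULT"
```

If the message never arrives, check your MTA's logs (for example, /var/log/mail.log on Ubuntu) for delivery errors.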

Adding Slack Alerts

Many teams prefer Slack notifications. You can send alerts to a Slack channel using an incoming webhook:

#!/bin/bash
# send-slack-alert.sh

SLACK_WEBHOOK="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
SERVICE="$1"
STATUS="$2"
HOSTNAME=$(hostname)

if [ "$STATUS" = "critical" ]; then
    COLOR="#FF0000"
    ICON=":red_circle:"
else
    COLOR="#36a64f"
    ICON=":white_check_mark:"
fi

PAYLOAD=$(cat <<EOJSON
{
  "attachments": [{
    "color": "$COLOR",
    "title": "$ICON $SERVICE on $HOSTNAME",
    "text": "Service status: $STATUS\nTime: $(date '+%Y-%m-%d %H:%M:%S')",
    "footer": "Service Monitor"
  }]
}
EOJSON
)

curl -s -X POST -H 'Content-type: application/json' --data "$PAYLOAD" "$SLACK_WEBHOOK"
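
Quoting mistakes in the heredoc (an unescaped double quote in a hostname or message, say) produce invalid JSON that Slack will reject. A sketch of a pre-flight check, assuming python3 is available (jq -e . works the same way):

```shell
#!/bin/bash
# Build the same payload as the alert script, then validate it is
# well-formed JSON before posting. Values are sample inputs.
COLOR="#FF0000"
ICON=":red_circle:"
SERVICE="nginx"
STATUS="critical"
HOSTNAME=$(hostname)

PAYLOAD=$(cat <<EOJSON
{
  "attachments": [{
    "color": "$COLOR",
    "title": "$ICON $SERVICE on $HOSTNAME",
    "text": "Service status: $STATUS\nTime: $(date '+%Y-%m-%d %H:%M:%S')",
    "footer": "Service Monitor"
  }]
}
EOJSON
)

if ! command -v python3 >/dev/null 2>&1; then
    echo "python3 not available; skipping JSON check"
elif echo "$PAYLOAD" | python3 -m json.tool >/dev/null 2>&1; then
    echo "payload is valid JSON"
else
    echo "payload is NOT valid JSON"
fi
```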

Call this from your monitoring script:

# After failed restart
bash /opt/scripts/send-slack-alert.sh "$SERVICE" "critical"

# After successful restart
bash /opt/scripts/send-slack-alert.sh "$SERVICE" "recovered"

Monitoring Multiple Services

Most servers run multiple services. Here is a script that monitors a list of services:

#!/bin/bash
# monitor-all-services.sh

SERVICES=("nginx" "mysql" "php8.1-fpm" "redis-server" "postfix")
ADMIN_EMAIL="admin@example.com"
HOSTNAME=$(hostname)
LOGFILE="/var/log/service-monitor.log"
FAILED_SERVICES=()

for SERVICE in "${SERVICES[@]}"; do
    if ! systemctl is-active --quiet "$SERVICE"; then
        TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')
        echo "$TIMESTAMP: $SERVICE is down. Attempting restart..." >> "$LOGFILE"

        systemctl restart "$SERVICE"
        sleep 2

        if ! systemctl is-active --quiet "$SERVICE"; then
            echo "$TIMESTAMP: CRITICAL - $SERVICE restart failed" >> "$LOGFILE"
            FAILED_SERVICES+=("$SERVICE")
        else
            echo "$TIMESTAMP: $SERVICE recovered" >> "$LOGFILE"
        fi
    fi
done

# Send a single alert for all failed services
if [ ${#FAILED_SERVICES[@]} -gt 0 ]; then
    SUBJECT="[CRITICAL] Services DOWN on $HOSTNAME"
    BODY="The following services failed to restart:\n\n"
    for s in "${FAILED_SERVICES[@]}"; do
        BODY+="  - $s\n"
    done
    BODY+="\nImmediate attention required."
    echo -e "$BODY" | mailx -s "$SUBJECT" "$ADMIN_EMAIL"
fi
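
As the list grows, hard-coding the SERVICES array means editing the script for every change. One alternative is to read the list from a config file; the sketch below uses a temporary file so it runs anywhere, but a path like /etc/service-monitor.list is a reasonable (not standard) choice in practice:

```shell
#!/bin/bash
# Sketch: read the monitored-service list from a file, one name per
# line, allowing blank lines and '#' comments. A sample list is
# created in a temp file so the snippet is self-contained.
LIST_FILE=$(mktemp)
cat > "$LIST_FILE" <<'EOF'
# critical services
nginx
mysql
redis-server
EOF

SERVICES=()
while IFS= read -r line; do
    case "$line" in ''|\#*) continue ;; esac   # skip blanks and comments
    SERVICES+=("$line")
done < "$LIST_FILE"

echo "Monitoring ${#SERVICES[@]} services: ${SERVICES[*]}"
rm -f "$LIST_FILE"
```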

Continuous Loop Monitoring

Instead of relying on cron for periodic checks, you can run the monitoring script as a continuous loop with a sleep interval:

#!/bin/bash
# continuous-monitor.sh

SERVICE="nginx"
CHECK_INTERVAL=30  # seconds between checks
MAX_RESTARTS=3     # max restarts before stopping attempts
restart_count=0

while true; do
    if ! systemctl is-active --quiet "$SERVICE"; then
        if [ $restart_count -lt $MAX_RESTARTS ]; then
            echo "$(date): $SERVICE down. Restart attempt $((restart_count + 1))/$MAX_RESTARTS" >> /var/log/service-monitor.log
            systemctl restart "$SERVICE"
            restart_count=$((restart_count + 1))
            sleep 5
        else
            echo "$(date): $SERVICE - max restart attempts reached. Alerting." >> /var/log/service-monitor.log
            # Send critical alert here
            break
        fi
    else
        restart_count=0  # Reset counter when service is healthy
    fi
    sleep $CHECK_INTERVAL
done
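
One refinement to the loop above: instead of a fixed pause between restart attempts, back off exponentially so a crash-looping service is not hammered. The delay calculation is pure arithmetic; the sketch prints the schedule rather than sleeping so it is easy to inspect:

```shell
#!/bin/bash
# Sketch: exponential backoff between restart attempts, capped at a
# maximum delay. Uncapped sequence: 5, 10, 20, 40, 80 seconds.
BASE_DELAY=5
MAX_DELAY=60
for attempt in 1 2 3 4 5; do
    delay=$(( BASE_DELAY * (1 << (attempt - 1)) ))
    [ "$delay" -gt "$MAX_DELAY" ] && delay=$MAX_DELAY
    echo "attempt $attempt: wait ${delay}s"
done
```

In the monitoring loop, you would replace the fixed sleep 5 with sleep "$delay" and derive attempt from restart_count.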

Run the monitoring script as a systemd service itself so it starts automatically on boot and is relaunched if it ever exits:

# /etc/systemd/system/service-monitor.service
[Unit]
Description=Service Health Monitor
After=network.target

[Service]
Type=simple
ExecStart=/opt/scripts/continuous-monitor.sh
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

Reload systemd, then enable and start the monitor:

sudo systemctl daemon-reload
sudo systemctl enable service-monitor
sudo systemctl start service-monitor

Be aware that Restart=always relaunches the script after it breaks out of its loop, which resets restart_count; adjust RestartSec or the script's exit behavior if you want a hard cap on restart attempts.

Cron-Based Monitoring

For simpler setups, schedule the monitoring script to run at regular intervals using cron:

# Check services every 2 minutes
*/2 * * * * /opt/scripts/monitor-all-services.sh

# Check every minute for critical services
* * * * * /opt/scripts/monitor-service.sh

Edit the crontab with sudo crontab -e to add these entries. Make sure the script has execute permissions:

chmod +x /opt/scripts/monitor-all-services.sh
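
Cron discards script output by default (or mails it to the local user, depending on the mail setup). Redirecting both streams into a log file keeps failures of the monitor itself visible; the log path below is an example:

```shell
# Capture the monitor's own stdout and stderr so cron-time failures are visible
*/2 * * * * /opt/scripts/monitor-all-services.sh >> /var/log/service-monitor-cron.log 2>&1
```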

Logging to a File

Consistent logging is essential for post-incident analysis. Include timestamps, service names, actions taken, and results in every log entry:

log_event() {
    local LEVEL="$1"
    local MESSAGE="$2"
    echo "$(date '+%Y-%m-%d %H:%M:%S') [$LEVEL] $MESSAGE" >> /var/log/service-monitor.log
}

# Usage
log_event "INFO" "Starting service check"
log_event "WARNING" "nginx is inactive"
log_event "ERROR" "nginx restart failed"
log_event "INFO" "mysql is active - OK"
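
With a consistent [LEVEL] field, the log is easy to filter during an incident review. A self-contained sketch using a temporary sample log (in practice, point the same commands at /var/log/service-monitor.log):

```shell
#!/bin/bash
# Review the log: count error entries and show the most recent lines.
# A sample log is written to a temp file so the snippet runs anywhere.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
2024-05-01 10:00:00 [INFO] Starting service check
2024-05-01 10:02:00 [ERROR] nginx restart failed
2024-05-01 10:04:00 [INFO] nginx recovered
EOF

ERRORS=$(grep -c '\[ERROR\]' "$LOG")
echo "error entries: $ERRORS"
tail -n 2 "$LOG"
rm -f "$LOG"
```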

Rotate the log file using logrotate to prevent it from growing indefinitely:

# /etc/logrotate.d/service-monitor
/var/log/service-monitor.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
}

Understanding Exit Codes

Exit codes are fundamental to bash scripting and service monitoring. Every command returns an exit code that indicates success (0) or failure (non-zero). For systemctl is-active, an exit code of 0 means the service is active; any non-zero code means it is not. The closely related systemctl status command uses LSB-style exit codes that distinguish the failure cases:

  • 0: Service is active and running.
  • 3: Service is not running (stopped or dead).
  • 4: Service unit not found (misspelled name or not installed).

Use $? to capture the exit code of the last command, or use the command directly in an if statement as shown in the scripts above.
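
To make the capture concrete, here is a minimal sketch in which the false builtin stands in for systemctl is-active --quiet nginx, so it runs on any machine:

```shell
#!/bin/bash
# Capture and branch on an exit code. check_service simulates an
# inactive service by returning non-zero (via `false`).
check_service() { false; }

check_service
rc=$?
if [ "$rc" -eq 0 ]; then
    STATUS="active"
else
    STATUS="not active (exit code $rc)"
fi
echo "$STATUS"
```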

Beyond Bash: When to Upgrade

Bash-based monitoring works well for simple scenarios, but consider upgrading to a dedicated monitoring solution when you need:

  • Monitoring across dozens or hundreds of servers
  • Historical metrics, graphing, and trend analysis
  • Complex alerting rules with escalation policies
  • Application-level health checks (HTTP endpoints, database queries)
  • Integration with incident management workflows

NOC.org provides continuous monitoring that covers infrastructure and application health with enterprise-grade alerting.

Summary

A bash script combined with systemctl is-active gives you a lightweight, reliable way to monitor service health on Linux servers. Start with a simple check-and-restart script, add email or Slack alerting, and scale up to monitoring multiple services with a single script. Whether you run it in a continuous loop or schedule it through cron, bash-based monitoring ensures that service failures are detected and addressed quickly. For comprehensive infrastructure monitoring across your entire environment, explore NOC.org's monitoring capabilities and review our Linux security checklist for additional hardening steps.
