Bash Scripting – Complete Beginner to Advanced Guide
CHAPTER 15 Intermediate

System Administration Automation

Updated: May 16, 2026
30 min read

1. Introduction

The theoretical foundation of shell scripting syntax—loops, variables, and conditions—is complete. It is time to transition from writing academic exercises to engineering production-grade IT infrastructure. A system administrator's daily workload is defined by routine, repetitive maintenance: provisioning new user accounts, purging old log files, checking if hard drives are full, and ensuring critical web services haven't crashed. If performed manually, this is drudgery. In this chapter, we will synthesize our shell skills to solve real-world operational problems. We will architect fully functional Unix scripts designed for Bulk User Management, Disk Space Monitoring, and Automated Service Health Checks.

2. Learning Objectives

By the end of this chapter, you will be able to:
  • Synthesize loops and conditions into multi-stage administrative tools.
  • Write a script to autonomously read a text file and provision multiple user accounts.
  • Write a monitoring script utilizing df, awk, and mathematical conditions to detect full hard drives.
  • Write a self-healing script that checks the status of services and restarts them if they fail.
  • Generate automated, timestamped system health reports.

3. Bulk User Management Script

When a company hires 20 new engineers, you do not type useradd 20 times. You create a text file containing their usernames, and you write a script to read that file line by line.

The Workflow:

  1. The script reads new_hires.txt.
  2. It loops through each name.
  3. It checks if the user already exists (to prevent errors).
  4. If they don't exist, it creates the user.

sh
#!/bin/sh

# Ensure script is run as root
if [ "$(id -u)" -ne 0 ]; then
  echo "Please run as root (sudo)" >&2
  exit 1
fi

FILE="new_hires.txt"

# Loop through the file line by line.
# IFS= and -r keep each line exactly as written (no trimming, no backslash processing).
while IFS= read -r USERNAME; do
    # Skip blank lines so useradd is never called with an empty name
    [ -z "$USERNAME" ] && continue

    # Check if user exists by searching /etc/passwd quietly
    if grep -q "^$USERNAME:" /etc/passwd; then
        echo "User $USERNAME already exists. Skipping."
    else
        echo "Creating user $USERNAME..."
        useradd -m -s /bin/bash "$USERNAME"
    fi
done < "$FILE"

*(Note the syntax at the very bottom: < "$FILE". This redirects the text file into the while loop's standard input, so each read call consumes the next line.)*
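
Before running the loop as root, it can help to rehearse it against a copy of a passwd-style file. The following is a minimal sketch, not part of the original script: the function name plan_users and its two arguments are hypothetical. It reuses the same read loop and grep check but only reports what would happen, and it never calls useradd.

```shell
#!/bin/sh
# Hypothetical helper: plan_users HIRES_FILE [PASSWD_FILE]
# Prints "skip NAME" or "create NAME" for each non-empty line.
# This is a dry run: it never calls useradd.
plan_users() {
    hires_file=$1
    passwd_file=${2:-/etc/passwd}
    while IFS= read -r username; do
        # Ignore blank lines
        [ -z "$username" ] && continue
        if grep -q "^$username:" "$passwd_file"; then
            echo "skip $username"
        else
            echo "create $username"
        fi
    done < "$hires_file"
}
```

Calling plan_users new_hires.txt prints one line per name, which makes it easy to review the plan before the real script touches any accounts.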

4. Disk Space Monitoring Alert

A full disk is one of the most common causes of server outages. We need a script that runs via cron every hour, checks the root partition /, and alerts us if usage reaches 90% of capacity.

The Workflow:

  1. Run df -h /.
  2. Use awk and sed to extract just the raw percentage number.
  3. Compare the number against our 90 threshold.

sh
#!/bin/sh

THRESHOLD=90

# 1. df -h / (Get disk info)
# 2. awk '{print $5}' (Grab the 5th column: Use%)
# 3. tail -n 1 (Grab the actual data row, ignore the header)
# 4. sed 's/%//g' (Strip the '%' sign so the shell can do math on the raw number)
USAGE=$(df -h / | awk '{print $5}' | tail -n 1 | sed 's/%//g')

if [ "$USAGE" -ge "$THRESHOLD" ]; then
    echo "CRITICAL WARNING: Disk space is at ${USAGE}%!"
    # In a real environment, you would use curl here to send a Slack or Email alert.
else
    echo "Disk space healthy at ${USAGE}%."
fi
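
The parsing step and the comparison can also be split into small functions, which makes them testable against canned df output. This is a minimal sketch under stated assumptions: the names parse_usage and over_threshold are hypothetical, and the awk program folds the tail and sed stages of the pipeline above into one step.

```shell
#!/bin/sh
# Hypothetical helper: reads df-style output on stdin and prints the
# Use% of the last data row as a bare integer (header skipped, '%' stripped).
parse_usage() {
    awk 'NR > 1 { gsub(/%/, "", $5); value = $5 } END { print value }'
}

# Hypothetical helper: over_threshold USAGE THRESHOLD
# Exit status 0 when USAGE is at or above THRESHOLD.
over_threshold() {
    [ "$1" -ge "$2" ]
}

# Example wiring (df -P requests the POSIX one-line-per-filesystem format):
# USAGE=$(df -P / | parse_usage)
# over_threshold "$USAGE" 90 && echo "CRITICAL: / is at ${USAGE}%"
```

Keeping the parser separate from the df call means a change in df's output format only has to be handled in one place.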

5. Self-Healing Service Monitor

A web server (nginx or apache2) might crash due to a temporary memory spike. We don't want to wake up at 3:00 AM to type systemctl restart nginx. We will build a script that does it for us autonomously.

The Workflow:

  1. Check the active status of nginx.
  2. If the response is not active, restart the service.
  3. Log the intervention.

sh
#!/bin/sh

SERVICE="nginx"
LOG="/var/log/service_monitor.log"
TIMESTAMP=$(date +"%Y-%m-%d %H:%M:%S")

# Check if the service is running (Suppress stdout)
systemctl is-active --quiet "$SERVICE"

# A non-zero exit code means the service is not active (crashed or stopped)
if [ $? -ne 0 ]; then
    echo "[$TIMESTAMP] ALERT: $SERVICE crashed! Attempting restart..." >> "$LOG"
    systemctl restart "$SERVICE"

    # Check again to see if the restart worked
    systemctl is-active --quiet "$SERVICE"
    if [ $? -eq 0 ]; then
        echo "[$TIMESTAMP] SUCCESS: $SERVICE recovered." >> "$LOG"
    else
        echo "[$TIMESTAMP] FATAL: $SERVICE failed to recover! Manual intervention required." >> "$LOG"
    fi
fi
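
The same check, restart, and log sequence can be wrapped in a function so that one script covers several services. The following is a sketch only: the function name heal_service and its arguments are hypothetical, and it tests the exit status of systemctl directly in the if statement, which is equivalent to checking $? on the next line.

```shell
#!/bin/sh
# Hypothetical helper: heal_service SERVICE LOGFILE
# Restarts SERVICE only when it is not active and appends one log line per event.
# Returns 1 if the service is still not active after the restart attempt.
heal_service() {
    service=$1
    log=$2

    # Nothing to do when the service is already active
    if systemctl is-active --quiet "$service"; then
        return 0
    fi

    echo "[$(date +"%Y-%m-%d %H:%M:%S")] ALERT: $service is not active. Attempting restart..." >> "$log"
    systemctl restart "$service"

    # Check again to see if the restart worked
    if systemctl is-active --quiet "$service"; then
        echo "[$(date +"%Y-%m-%d %H:%M:%S")] SUCCESS: $service recovered." >> "$log"
    else
        echo "[$(date +"%Y-%m-%d %H:%M:%S")] FATAL: $service failed to recover." >> "$log"
        return 1
    fi
}

# Example wiring for several services (names are placeholders):
# for svc in nginx apache2; do heal_service "$svc" /var/log/service_monitor.log; done
```

Because a healthy service returns early without writing anything, running the function repeatedly from cron leaves the log untouched until something actually fails.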

6. Diagrams/Visual Suggestions

*Visual Concept: The Self-Healing Architecture* Draw a cycle diagram. Top Node: Cron Daemon (Fires every 5 mins). Arrow down to a script box: Monitor Script: is NGINX active? Arrow splits:
  • YES -> Arrow loops harmlessly back to the top.
  • NO -> Arrow points to a red box: Execute: systemctl restart.
Arrow from red box points to a log file icon: Write recovery to the log file (/var/log/service_monitor.log). This visualizes the concept of autonomous remediation in modern systems administration.

7. Best Practices

  • Idempotency: A professional administrative script should be "Idempotent." This means you can run the script 1 time or 1,000 times, and it will safely result in the exact same state without causing errors or creating duplicates. Notice how our User Management script explicitly checks if grep -q "^$USERNAME:" /etc/passwd first. Without that check, running the script twice would trigger a barrage of messy useradd errors.
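
The same check-before-change pattern applies to files and directories. Below is a minimal sketch, assuming a hypothetical helper named ensure_line and a placeholder directory path; it illustrates the idea and is not a complete configuration tool.

```shell
#!/bin/sh
# Hypothetical helper: ensure_line LINE FILE
# Appends LINE to FILE only when an identical whole line is not already present.
ensure_line() {
    line=$1
    file=$2
    # -x: match the whole line, -F: treat LINE as a fixed string, -q: no output
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# mkdir -p is idempotent on its own: it succeeds even if the directory exists.
# mkdir -p /opt/myapp/logs    # placeholder path
```

Running ensure_line twice with the same arguments leaves exactly one copy of the line in the file, which is the property the definition above describes.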

8. Common Mistakes

  • Assuming cron has root privileges: A cron job runs with the privileges of the user who owns the crontab. A script executing useradd or systemctl restart requires administrative rights, so if a standard user schedules it using their personal crontab -e, the privileged commands fail, and the errors are easy to miss because cron has no terminal to print them to. System administration scripts MUST be scheduled in the root user's crontab (sudo crontab -e).

9. Mini Project: Automated System Report

Let's combine everything into a daily digest report.
  1. nano system_report.sh
  2. Write the code:
sh
#!/bin/sh
REPORT="/tmp/daily_report.txt"

echo "=== DAILY SYSTEM REPORT ===" > "$REPORT"
echo "Date: $(date)" >> "$REPORT"
echo "Hostname: $(hostname)" >> "$REPORT"
echo "" >> "$REPORT"

echo "--- UPTIME & LOAD ---" >> "$REPORT"
uptime >> "$REPORT"
echo "" >> "$REPORT"

echo "--- MEMORY USAGE ---" >> "$REPORT"
free -m >> "$REPORT"

echo "Report generated successfully."
cat "$REPORT"
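
As the report grows, the repeated echo and redirect lines can be folded into one function. This is a sketch under stated assumptions: the helper name add_section is hypothetical, and the date-stamped file name in the comments is only one possible naming scheme.

```shell
#!/bin/sh
# Hypothetical helper: add_section REPORT TITLE COMMAND [ARGS...]
# Appends a titled section containing COMMAND's output, followed by a blank line.
add_section() {
    report=$1
    title=$2
    shift 2
    {
        echo "--- $title ---"
        "$@" 2>&1
        echo ""
    } >> "$report"
}

# Example wiring with a date-stamped file name (path is a placeholder):
# REPORT="/tmp/daily_report_$(date +%Y-%m-%d).txt"
# add_section "$REPORT" "DISK USAGE" df -h /
# add_section "$REPORT" "MEMORY USAGE" free -m
```

Adding a new section then takes one line, and each day's report lands in its own file instead of overwriting the previous one.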

10. Practice Exercises

  1. Analyze the Disk Space Monitoring script. Explain the specific text processing function of the sed 's/%//g' command, and why it is mathematically necessary before the if statement can execute.
  2. What is "Idempotency" in scripting? Provide an example of how you would make a directory-creation command (mkdir) idempotent.

11. MCQs with Answers

Question 1

When writing a script to automate the creation of users from a text file, which command structure is best suited for reading the file line-by-line directly into a variable without requiring external tools?

Question 2

An administrator writes a self-healing script to check if the Apache web server is running. Which systemctl flag is utilized to cleanly return a 0 (Success) or 1 (Failure) exit code without printing massive amounts of status text to the terminal screen?

12. Interview Questions

  • Q: You wrote a script to monitor disk space. You ran ./monitor.sh manually and it worked perfectly. You scheduled it in cron to run every hour. An hour later, the server crashed because the disk filled up to 100%, and your script never alerted you. Explain the environmental issue regarding the df, awk, and sed commands within the cron environment that likely caused the failure. *(Hint: Chapter 13).*
  • Q: Explain the concept of "Idempotent" scripting in Unix administration. Walk me through how you would ensure a shell script tasked with appending a specific configuration line to /etc/ssh/sshd_config is strictly idempotent.
  • Q: A junior engineer is tasked with monitoring a critical background service. They write an infinite while true loop that checks the service status every 1 second. Explain the severe architectural flaw in this design and why scheduling a script via cron is vastly superior for service monitoring.

13. FAQs

Q: Can a Shell script send me an email or a Slack message if a server crashes? A: Yes. For email, you can pipe a string into the mail command (e.g., echo "Server Down" | mail -s "Alert" admin@corp.com). For Slack, you use the curl command to send a JSON payload via an HTTP POST request directly to a Slack Webhook URL. It only takes one line of code!
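
A minimal sketch of the Slack case follows. The function names slack_payload and send_slack are hypothetical, the webhook URL is whatever Slack issued for your workspace, and the escaping covers only backslashes and double quotes, so multi-line messages are out of scope here.

```shell
#!/bin/sh
# Hypothetical helper: slack_payload MESSAGE
# Builds the JSON body for a plain-text Slack message.
# Only backslashes and double quotes are escaped; newlines are not handled.
slack_payload() {
    escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
    printf '{"text":"%s"}' "$escaped"
}

# Hypothetical helper: send_slack WEBHOOK_URL MESSAGE
# Posts the payload to the webhook with curl.
send_slack() {
    curl -sS -X POST -H 'Content-type: application/json' \
        --data "$(slack_payload "$2")" "$1"
}

# Example wiring (SLACK_WEBHOOK_URL is an assumed environment variable):
# send_slack "$SLACK_WEBHOOK_URL" "Disk space is at 92% on $(hostname)"
```

Building the payload in its own function keeps the quoting logic in one place and lets you inspect the JSON before anything is sent.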

14. Summary

In Chapter 15, we transitioned from academic syntax into practical systems engineering. We constructed idempotent bulk-processing workflows, parsing text files to autonomously generate fleets of user accounts. We deployed complex text-processing pipelines (df | awk | sed) to translate raw system metrics into integer variables, allowing mathematical thresholds to trigger critical disk-space alerts. Finally, we engineered a self-healing service architecture, utilizing $? exit codes to detect and automatically remediate crashed system daemons without human intervention.

15. Next Chapter Recommendation

Your scripts are powerful, but powerful scripts can cause catastrophic damage if they encounter unexpected data. You must learn to trace their logic and handle errors gracefully. Proceed to Chapter 16: Error Handling and Debugging.
