Linux Command Line – Complete Beginner to Advanced Guide
CHAPTER 16 Beginner

Monitoring and System Logs

Updated: May 16, 2026
25 min read


1. Introduction

When a Linux server crashes in the middle of the night, it doesn't leave a sticky note explaining what happened. However, the operating system keeps meticulous records: kernel messages, service errors, and login attempts are written to log files and to the systemd journal. The ability to read these digital black boxes is what separates a junior operator from a senior system administrator. In this chapter, we will master the diagnostic tools required to check the vitals of the machine. We will monitor RAM consumption with free, assess system load with uptime, read kernel messages with dmesg, and query the systemd journal using journalctl.

2. Learning Objectives

By the end of this chapter, you will be able to:
  • Check system load averages and how long the machine has been running using uptime.
  • Monitor active memory (RAM) and swap space using free.
  • Investigate hardware and driver failures using dmesg.
  • Navigate traditional log files stored in /var/log (e.g., syslog, auth.log).
  • Query and filter the modern centralized logging daemon using journalctl.

3. Checking System Vitals (uptime and free)

Before digging into complex text logs, you must check the basic biological vitals of the server.

1. The uptime Command (CPU Load):

bash
uptime
# Output: 14:05:12 up 34 days,  2:14,  1 user,  load average: 0.45, 0.50, 0.61

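The most critical part of this output is the load average: three numbers covering the last 1 minute, 5 minutes, and 15 minutes. Each is the average number of processes that were running or waiting to run (on Linux, this also counts processes blocked in uninterruptible I/O), so always read it relative to the number of CPU cores.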

  • If you have a 1-core CPU, a load of 1.00 means the CPU is at 100% capacity.
  • If the load is 5.00 on a 1-core CPU, the server is suffocating; tasks are waiting in a massive line.
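
The core-count rule above can be checked on a live system. The following is a minimal sketch, assuming a Linux machine where nproc (GNU coreutils) and /proc/loadavg are available; load_per_core is a hypothetical helper name introduced here for illustration, not a standard command.

```shell
#!/bin/sh
# load_per_core is a hypothetical helper (not a standard tool).
# It divides a load average ($1) by a core count ($2).
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

cores=$(nproc)                           # number of CPU cores available
load1=$(cut -d ' ' -f 1 /proc/loadavg)   # first field: 1-minute load average

echo "cores=$cores load1=$load1 per-core=$(load_per_core "$load1" "$cores")"
```

A per-core value that stays above 1.00 suggests tasks are queuing for CPU time (or, on Linux, waiting on disk I/O).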

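2. The free Command (RAM):

When a database or other memory-hungry service dies unexpectedly, running out of RAM is one of the first causes to rule out: when memory is exhausted, the kernel's out-of-memory (OOM) killer may terminate a process to free some.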

bash
free -m

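The -m flag displays the output in mebibytes (the default unit is kibibytes). In the Mem row, the available column is the best estimate of how much memory new programs can use, because it accounts for cache the kernel can reclaim. Also pay close attention to the Swap row. Swap space is an overflow area on disk (a partition or a file) that the kernel uses to hold memory pages when physical RAM runs short. Because disk is far slower than RAM, a server that is actively swapping will run extremely slowly.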
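
To turn the Swap row into a single number, the output of free can be parsed. This is a minimal sketch, assuming the procps version of free with English row labels (hence LC_ALL=C); swap_used_percent is a hypothetical helper name, not part of any package.

```shell
#!/bin/sh
# swap_used_percent is a hypothetical helper (not part of procps).
# It reads `free -m` style text on stdin and prints swap usage as a
# whole-number percentage, or 0 when no swap is configured.
swap_used_percent() {
    awk '$1 == "Swap:" {
             if ($2 > 0) { printf "%d\n", ($3 * 100) / $2 }
             else        { print 0 }
         }'
}

# LC_ALL=C keeps the row labels in English so the "Swap:" match works.
if command -v free >/dev/null 2>&1; then
    LC_ALL=C free -m | swap_used_percent
fi
```

A value that keeps climbing over time is a stronger warning sign than a small, stable amount of swap in use.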

4. Hardware and Boot Logs (dmesg)

If you plug a new USB drive into a server and nothing happens, how do you know if the kernel even detected it? You use the dmesg command (the name is usually read as "diagnostic messages"). dmesg prints the kernel ring buffer: the low-level messages that the Linux kernel and its drivers log about hardware and boot events.

bash
# Print the kernel log and pipe it to tail to see the most recent events
# (on many systems, reading the kernel log requires root)
sudo dmesg | tail -n 20

If a hard drive is physically dying, the kernel's I/O errors for that device will appear here, often highlighted in red on a color terminal.
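
For the USB scenario above, the kernel log can be narrowed to the relevant lines. This is a minimal sketch, assuming the util-linux version of dmesg (its -T option prints human-readable timestamps); usb_events is a hypothetical helper name, and the root check is a conservative assumption because some systems allow unprivileged reads.

```shell
#!/bin/sh
# usb_events is a hypothetical helper (not a standard tool).
# It keeps USB-related lines from kernel log text on stdin and
# prints the last N of them (default 10).
usb_events() {
    grep -i 'usb' | tail -n "${1:-10}"
}

# Reading the kernel log often requires root, so only try it as root.
if [ "$(id -u)" -eq 0 ]; then
    dmesg -T | usb_events 5
else
    echo "Run as root (for example via sudo) to read the kernel log."
fi
```

To look for problems in general rather than one device class, util-linux dmesg also accepts --level=err,warn to show only errors and warnings.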

5. Traditional Text Logs (/var/log)

Historically, Linux applications were programmed to write their own text files into the /var/log directory. You must know where these are:
  • /var/log/syslog (Ubuntu) or /var/log/messages (CentOS): The general "junk drawer" of system events.
  • /var/log/auth.log (Ubuntu) or /var/log/secure (CentOS): Records authentication events, including successful and failed SSH logins and sudo usage.
  • *(You view these files using the cat, less, or tail -f tools we learned in Chapter 5).*
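
Because the file names differ between distributions, a script can probe for whichever one exists. This is a minimal sketch; first_readable is a hypothetical helper name, and the two paths are the ones listed above.

```shell
#!/bin/sh
# first_readable is a hypothetical helper (not a standard tool).
# It prints the first argument that is a readable regular file and
# returns 0, or returns 1 if none of the candidates can be read.
first_readable() {
    for candidate in "$@"; do
        if [ -f "$candidate" ] && [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Show the 20 newest authentication events on either distribution family.
if auth_log=$(first_readable /var/log/auth.log /var/log/secure); then
    tail -n 20 "$auth_log"
else
    echo "No readable auth log found (try sudo, or use journalctl)."
fi
```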

6. The Modern Engine (journalctl)

Modern Linux distributions (anything using systemd) collect logs in a centralized, structured, binary journal managed by systemd-journald. Many distributions still run a syslog daemon alongside it, which is why the text files above continue to exist. You cannot read the journal with cat. You must query it using the journalctl command.

journalctl is incredibly powerful because it allows you to filter logs precisely by unit, priority, and time range:

bash
# Show logs for a specific service (e.g., the SSH service; the unit is named sshd on CentOS)
sudo journalctl -u ssh

# Show all messages of priority "err" or more severe logged since yesterday
sudo journalctl --since "yesterday" -p err

# Follow the central log live in real-time
sudo journalctl -f
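
The filters can also be combined in a single query. This is a minimal sketch, using only the -u, -p, --since, and --no-pager options of journalctl; unit_errors is a hypothetical wrapper name, and the default window of "today" is an assumption chosen for illustration.

```shell
#!/bin/sh
# unit_errors is a hypothetical wrapper (not part of systemd).
# $1 = unit name, $2 = start of the time window (default: today).
# -p err selects priority "err" and anything more severe.
unit_errors() {
    journalctl --no-pager -u "$1" -p err --since "${2:-today}"
}

# Only run the query where journalctl is installed.
if command -v journalctl >/dev/null 2>&1; then
    unit_errors ssh "1 hour ago" | tail -n 20
fi
```

Adding --no-pager makes the output safe to pipe into other tools, which is why the wrapper sets it unconditionally.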

7. Diagrams/Visual Suggestions

*Visual Concept: The Logging Funnel* Draw three separate icons: A Web Server, the SSH Service, and the Hardware Kernel. Draw arrows from all three funneling into a central, glowing cylinder labeled systemd Journal. Draw an arrow coming out of the bottom of the cylinder into a magnifying glass labeled journalctl (Query Tool). This visualizes the modern shift from fragmented text files to centralized, queryable logging architecture.

8. Best Practices

  • Persistent Journals: By default on some systems, the journal is kept only under /run/log/journal, which lives in RAM (volatile memory). This means if the server crashes and reboots, the log explaining *why* it crashed is erased! With journald's default Storage=auto setting, administrators should ensure the directory /var/log/journal exists so that systemd writes the logs permanently to disk, allowing post-crash forensic analysis.
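
Whether a machine keeps its journal across reboots can be checked by looking for the /var/log/journal directory. This is a minimal sketch, assuming journald runs with its default Storage=auto setting (an explicit Storage= value in journald.conf would override what the directories suggest); journal_storage_mode is a hypothetical helper, and its optional root-prefix argument exists only so it can be tried against a scratch directory.

```shell
#!/bin/sh
# journal_storage_mode is a hypothetical helper (not part of systemd).
# $1 = optional path prefix, empty by default (the real filesystem root).
journal_storage_mode() {
    root="${1:-}"
    if [ -d "$root/var/log/journal" ]; then
        echo "persistent"      # journald writes to disk
    elif [ -d "$root/run/log/journal" ]; then
        echo "volatile"        # journal lives in RAM only
    else
        echo "unknown"
    fi
}

echo "journal storage: $(journal_storage_mode)"

# One commonly documented way to switch to persistent storage (as root):
#   mkdir -p /var/log/journal
#   systemctl restart systemd-journald
```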

9. Common Mistakes

  • Ignoring the -u flag in journalctl: Beginners often type journalctl and are overwhelmed by tens of thousands of lines of system noise, concluding that the tool is useless. journalctl is a database query tool. If you want to fix a broken NGINX web server, use the -u (unit) flag: journalctl -u nginx. This filters out everything except the messages logged by that service.

10. Mini Project: Forensic Security Audit

Let's see who is attacking your machine right now:
  1. Type sudo less /var/log/auth.log (or /var/log/secure on CentOS).
  2. Press Shift + G to instantly jump to the very bottom (most recent) part of the file.
  3. Look for lines that say Failed password for invalid user. On an internet-facing server, these are typically automated bots attempting to brute-force your SSH port.
  4. Now, let's use the modern tool. Type: sudo journalctl -u ssh --since "1 hour ago" (the unit is named sshd on CentOS).
  5. You just queried the journal for every SSH event that occurred in the last 60 minutes.
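
The manual search in step 3 can be summarized per source address. This is a minimal sketch, assuming the usual OpenSSH message layout in which the address follows the word "from"; failed_logins_by_ip is a hypothetical helper name.

```shell
#!/bin/sh
# failed_logins_by_ip is a hypothetical helper (not a standard tool).
# It reads auth log text on stdin and prints "COUNT ADDRESS" lines,
# highest count first, for every "Failed password" entry.
failed_logins_by_ip() {
    awk '/Failed password/ {
             for (i = 1; i < NF; i++)
                 if ($i == "from") count[$(i + 1)]++
         }
         END { for (ip in count) print count[ip], ip }' | sort -rn
}

# Print the five busiest sources from whichever log file is readable.
for log in /var/log/auth.log /var/log/secure; do
    if [ -r "$log" ]; then
        failed_logins_by_ip < "$log" | head -n 5
    fi
done
```

Reading these files usually requires root, so the loop prints nothing when run as an unprivileged user.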

11. Practice Exercises

  1. Analyze the output of the uptime command. If a server has 4 CPU cores, and the 1-minute load average is 2.50, is the CPU currently overloaded?
  2. Explain the functional difference between reading logs via less /var/log/syslog versus querying logs via journalctl.

12. MCQs with Answers

Question 1

A server application is repeatedly crashing. You suspect the operating system is completely exhausting its physical RAM. Which command will instantly display the total, used, and available system memory in Megabytes?

Answer: free -m

Question 2

You need to investigate hardware-level errors generated by the Linux kernel during the boot sequence regarding a failing network interface card. Which command isolates and displays these specific kernel ring buffer messages?

Answer: dmesg

13. Interview Questions

  • Q: A web developer states the server is acting sluggish. You run the free -m command and notice that the "Swap" memory usage is steadily increasing. Explain what "Swap" memory physically is, and why its usage causes severe system latency.
  • Q: Contrast the legacy /var/log directory structure with the modern systemd logging architecture. Why must an administrator use the journalctl command instead of cat or less when interacting with systemd logs?
  • Q: Explain the significance of the three "Load Average" numbers provided by the uptime command. How do you interpret these numbers in relation to the physical number of CPU cores on the motherboard?

14. FAQs

Q: Do these massive log files eventually fill up the entire hard drive?
A: They would, but Linux distributions ship an automated utility called logrotate, which is normally run once a day by cron or a systemd timer. On the schedule set in its configuration (daily or weekly, for example), it renames the current log file, compresses older copies into small .gz files (for example, syslog.1 and syslog.2.gz on Ubuntu), and starts a brand new, empty syslog file. It automatically deletes the oldest copies once the configured number of rotations is reached, which prevents disk exhaustion.
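
Rotated copies remain searchable after compression. This is a minimal sketch, assuming the zgrep script that ships with gzip, which reads both plain and .gz files; search_logs is a hypothetical helper name.

```shell
#!/bin/sh
# search_logs is a hypothetical helper (not a standard tool).
# Usage: search_logs PATTERN FILE...
# -h suppresses file names so the output is just the matching lines.
search_logs() {
    pattern="$1"
    shift
    zgrep -h -e "$pattern" "$@"
}

# Search the current syslog and every rotated copy, if readable.
if [ -r /var/log/syslog ]; then
    search_logs 'error' /var/log/syslog* | tail -n 10
fi
```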

15. Summary

In Chapter 16, we learned to interpret the diagnostic language of the operating system. We utilized uptime to calculate CPU load averages and free -m to monitor RAM and Swap saturation, establishing the baseline metrics for performance troubleshooting. We explored the deep hardware diagnostics of the kernel via dmesg and navigated the traditional plaintext security logs housed in /var/log. Most importantly, we bridged the gap to modern Linux administration by mastering the journalctl query engine, allowing us to filter, slice, and extract critical forensic data from the centralized systemd database.

16. Next Chapter Recommendation

You can monitor the server and investigate attacks in the logs. Now, it is time to actively block the attackers. Proceed to Chapter 17: Linux Security Basics.
