
3.5.5 Monitoring running services

Why Monitoring Services Is Different from Monitoring Processes

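System monitoring in general focuses on CPU, memory, disk, and overall process activity. Monitoring services is more specific: it looks at whether a particular service (an SSH daemon, a web server, cron) is actually doing its job.

Monitoring services usually means answering questions like: Is the service running? Has it failed or restarted recently? Does it respond to real requests? Is it using an unusual amount of CPU or memory?

This chapter focuses on practical ways to answer these questions and set up basic checks and alerts.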

Checking Service Status with systemd

Most modern distributions use systemd. For system services, systemctl is your primary tool.

Basic status checks

To check if a service is active:

bash
systemctl status ssh
systemctl status nginx
systemctl status cron

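Key fields to pay attention to are Loaded (whether the unit file was found and whether the service is enabled at boot), Active (for example active (running), inactive (dead), or failed, together with how long it has been in that state), Main PID, and the most recent log lines at the bottom of the output.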

For a quick one-line status:

bash
systemctl is-active ssh
systemctl is-enabled ssh

These are useful in scripts where you only care about a simple answer.
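
As a minimal sketch of that scripting use, the function below combines both commands into a single summary line. The function name service_summary is hypothetical; the sketch relies only on the fact that each subcommand prints a one-word state.

```shell
#!/bin/bash
# Hypothetical helper (not part of systemd): print one summary line for a unit.
service_summary() {
    local unit="$1" active enabled
    # Each command prints a single word such as "active" or "enabled".
    # A non-zero exit status is normal for inactive or disabled units,
    # so only the printed word is used here.
    active="$(systemctl is-active "$unit" 2>/dev/null)"
    enabled="$(systemctl is-enabled "$unit" 2>/dev/null)"
    printf '%s: active=%s enabled=%s\n' "$unit" "${active:-unknown}" "${enabled:-unknown}"
}
```

Calling service_summary ssh prints one line that is easy to grep or to collect from many hosts.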

Listing and filtering services

To see all loaded services:

bash
systemctl list-units --type=service

To see failed services only:

bash
systemctl --failed --type=service

To filter by name:

bash
systemctl list-units --type=service | grep ssh

This helps spot services that have crashed or failed to start.
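
For use in a script, the failed-unit listing can be reduced to bare unit names. The sketch below assumes the --state, --plain, and --no-legend options of systemctl list-units; the function name failed_services is made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: print the names of failed service units, one per line.
failed_services() {
    # --plain and --no-legend drop the status bullets, header, and footer,
    # leaving one unit per line with the unit name in the first column.
    systemctl list-units --type=service --state=failed --plain --no-legend |
        awk '{ print $1 }'
}
```

An empty result means no service is currently in the failed state.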

Monitoring Service Logs for Problems

Service health issues often show up first in logs. With systemd, use journalctl to view them.

Viewing logs for a specific service

bash
journalctl -u ssh.service
journalctl -u nginx.service

Useful options:

Follow new log entries as they arrive (-f):

bash
journalctl -u ssh.service -f

Show only entries from a recent time window (--since):

bash
journalctl -u nginx.service --since "1 hour ago"

When monitoring, look for error messages, failed start attempts, and the same message repeating at short intervals.
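
One way to turn log reading into a quick check is to count recent high-priority entries. This is a rough sketch: the function name recent_error_count and the one-hour default are assumptions, and a multi-line message is counted once per line.

```shell
#!/bin/bash
# Hypothetical helper: count journal lines of priority "err" or worse
# that a unit logged within a recent time window.
recent_error_count() {
    local unit="$1" window="${2:-1 hour ago}"
    # -p err keeps priorities emerg through err; -q suppresses informational
    # notices; -o cat prints only the message text.
    journalctl -u "$unit" -p err --since "$window" --no-pager -q -o cat | wc -l
}
```

For example, recent_error_count nginx.service "10 minutes ago" gives a number that a script can compare against a threshold.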

Spotting frequent restarts

Use systemctl for a quick overview:

bash
systemctl status nginx

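Look at the Active line, which shows how long the service has been in its current state, and at the Main PID, which changes each time the service is restarted. A service that has only been up for a few seconds on a long-running host has probably just restarted.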

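Or use journalctl to search for restart messages. When a service hits its start rate limit, systemd logs "Start request repeated too quickly", which the following search matches: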

bash
journalctl -u nginx.service | grep -i "start request"

Frequent restarts may indicate a crash loop, misconfiguration, or missing dependencies.
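
Instead of searching log text, a script can read the restart counter that systemd keeps for each unit. The sketch below assumes a systemd version that exposes the NRestarts property and supports systemctl show --value; the function name and the default limit of 3 are arbitrary choices for illustration.

```shell
#!/bin/bash
# Hypothetical helper: warn when a unit has been auto-restarted too often.
check_restarts() {
    local unit="$1" limit="${2:-3}" count
    count="$(systemctl show "$unit" --property=NRestarts --value)"
    # Treat a missing or non-numeric value as unknown rather than guessing.
    case "$count" in
        ''|*[!0-9]*) echo "UNKNOWN: no restart count for $unit"; return 3 ;;
    esac
    if [ "$count" -gt "$limit" ]; then
        echo "WARNING: $unit restarted $count times (limit $limit)"
        return 1
    fi
    echo "OK: $unit restarted $count times"
}
```

The counter only covers automatic restarts performed by systemd, so a manual systemctl restart does not show up in it.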

Simple Command-Line Health Checks

Checking that a service process is running does not guarantee it’s healthy. Basic service-level checks often involve talking to the service over the network or via its command interface.

Checking network services (HTTP, SSH, etc.)

For services that listen on TCP ports, connect to the port and check that you get a sensible response. Examples:

bash
# Check simple HTTP response
curl -I http://localhost
# Check HTTPS (ignoring certificate issues)
curl -kI https://localhost
# Test if a TCP port is reachable (e.g., SSH on port 22)
nc -zv localhost 22

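Interpretation: an HTTP status line such as 200 OK (or a redirect such as 301) means the web server is answering requests. "Connection refused" means nothing is listening on that port. A timeout usually points to a firewall rule or a hung service. For nc, a success message means the port accepted the connection, which shows the service is listening but not that it works correctly.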
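
The HTTP check can be wrapped so that a script gets a clear pass or fail. This is a sketch under stated assumptions: the function name http_ok, the 5-second timeout, and treating any 2xx or 3xx status as healthy are choices made for this example.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if a URL answers with a 2xx or 3xx status.
http_ok() {
    local url="$1" code
    # -s: silent; -o /dev/null: discard the body; -w: print only the status
    # code; --max-time: give up after 5 seconds so a hung server is detected.
    # curl prints 000 when it received no HTTP response at all.
    code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")"
    case "$code" in
        2??|3??) echo "OK - $url returned $code"; return 0 ;;
        *)       echo "CRITICAL - $url returned ${code:-no response}"; return 2 ;;
    esac
}
```

A call such as http_ok http://localhost can then be used directly in an if statement.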

Using service-specific status commands

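Some services ship their own status or health commands. Examples include pg_isready for PostgreSQL, mysqladmin ping for MySQL or MariaDB, and redis-cli ping for Redis.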

These often provide more accurate health information than just checking the process.

Resource Usage of Services

A service might be “running” but misbehaving due to resource issues (high CPU, memory leaks, etc.). You can connect process-level monitoring tools to specific services.

Using top/htop with service names

Start top:

bash
top

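Then filter by command name (e.g., sshd, nginx, postgres). With htop, you can press F4 to filter by name, F5 to switch to a tree view that groups child processes under their parent, and F6 to choose the sort column.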

This helps you watch how much CPU/memory a service is consuming over time.

Linking systemd services to their processes

To see the main PID and children of a service:

bash
systemctl status apache2

Or use:

bash
systemd-cgls

This shows a tree of control groups, letting you see which processes belong to which service.

On some distributions, systemd-cgtop gives a live view of resource usage by service:

bash
systemd-cgtop

You’ll see CPU and memory consumed per unit (service), useful for spotting resource hogs.
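
For a single number in a script, the per-unit memory figure can also be read from systemctl show. The sketch below assumes memory accounting is enabled for the unit; when it is not, the MemoryCurrent property is reported as "[not set]". The function name service_memory_mib is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print a unit's current memory use in MiB, or "n/a".
service_memory_mib() {
    local bytes
    bytes="$(systemctl show "$1" --property=MemoryCurrent --value)"
    case "$bytes" in
        # Empty or non-numeric (for example "[not set]"): accounting is off.
        ''|*[!0-9]*) echo "n/a" ;;
        # MemoryCurrent is in bytes; integer division rounds down.
        *) echo $(( bytes / 1048576 )) ;;
    esac
}
```

Logging this value every few minutes is a simple way to notice a slow memory leak.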

Automatic Restarts and Watchdogs

Monitoring often goes hand-in-hand with automatic recovery. systemd can be configured to restart services and act as a basic watchdog.

systemd service restart options

Within a service’s unit file (typically in /usr/lib/systemd/system or /etc/systemd/system), you may see options like:

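ini
[Unit]
StartLimitIntervalSec=60
StartLimitBurst=5

[Service]
Restart=on-failure
RestartSec=5s

Key directives:

Restart=on-failure restarts the service when it exits with an error, is killed by a signal, or times out, but not after a clean stop.

RestartSec=5s waits five seconds before each restart attempt.

StartLimitIntervalSec=60 and StartLimitBurst=5 together allow at most five starts within 60 seconds. After that, systemd stops trying and leaves the unit in the failed state. In current systemd versions these two settings are documented under the [Unit] section rather than [Service].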

For an existing service, you can check these settings with:

bash
systemctl cat nginx.service

While configuration details belong in service-management chapters, from a monitoring perspective, you must understand whether a failed service will be automatically restarted or not.
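
To answer that question without reading the unit file by eye, the effective setting can be queried as a property. This is a minimal sketch; the function name restart_policy is made up, and it only inspects Restart=, not the start limits.

```shell
#!/bin/bash
# Hypothetical helper: report whether systemd will restart a unit by itself.
restart_policy() {
    local unit="$1" policy
    # The Restart property holds the effective setting, for example "no",
    # "on-failure", or "always", after drop-in files have been applied.
    policy="$(systemctl show "$unit" --property=Restart --value)"
    if [ -z "$policy" ] || [ "$policy" = "no" ]; then
        echo "$unit: no automatic restart"
        return 1
    fi
    echo "$unit: automatic restart ($policy)"
}
```

Running restart_policy nginx.service for each critical service shows which failures need a human and which systemd will retry on its own.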

systemd watchdogs

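Some services support systemd’s watchdog mechanism: when WatchdogSec= is set in the unit file, the service must send a keep-alive notification (WATCHDOG=1 via sd_notify) within that interval. If the notifications stop, systemd treats the service as failed and, depending on its Restart= setting, restarts it.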

From a monitoring perspective, services with watchdogs can detect hangs, not just crashes.

Simple Scripting for Service Monitoring

For small systems, you might build basic checks using shell scripts and cron, before moving to full monitoring suites.

Checking service status in a script

A very simple example:

bash
#!/bin/bash
SERVICE="nginx"
if ! systemctl is-active --quiet "$SERVICE"; then
    echo "$(date): $SERVICE is not running!" >> /var/log/service-monitor.log
    # Optional: try to restart
    systemctl start "$SERVICE"
fi

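Key ideas: systemctl is-active --quiet prints nothing and reports the result through its exit status, so it fits directly into an if statement. Every failure is logged with a timestamp. The restart attempt is optional, since restarting blindly can hide the underlying problem. The script needs root privileges to write under /var/log and to start the service.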
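
The same pattern extends to several services at once. The following is a sketch, not a finished tool: the function name check_services is hypothetical, the log path is simply reused from the example above, and the restart attempt is left out on purpose.

```shell
#!/bin/bash
# Sketch: check several services in one run and log each one that is down.
# LOG_FILE can be overridden from the environment; the default is a placeholder.
LOG_FILE="${LOG_FILE:-/var/log/service-monitor.log}"

check_services() {
    local unit failed=0
    for unit in "$@"; do
        if ! systemctl is-active --quiet "$unit"; then
            echo "$(date): $unit is not running!" >> "$LOG_FILE"
            failed=$((failed + 1))
        fi
    done
    # Return non-zero if any service was down, so a caller can react.
    [ "$failed" -eq 0 ]
}
```

A call such as check_services ssh nginx cron returns success only when all three are active.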

Running checks periodically with cron

You can schedule the script with cron. Because it writes to /var/log and may start a service, it needs root privileges, so add it to root's crontab:

bash
sudo crontab -e

Add:

cron
*/5 * * * * /usr/local/bin/check-nginx.sh

This runs the script every 5 minutes. For more advanced scheduling and logging, see the automation and cron chapters.

Integrating with Monitoring Systems

Larger environments usually rely on dedicated monitoring tools. While setup details belong elsewhere, it’s important to understand what they typically check for each service.

Common types of service checks

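Monitoring systems (Nagios, Icinga, Zabbix, Prometheus-based stacks, etc.) often perform unit or process checks (is the service running?), port checks (is it accepting connections?), protocol-level checks (does a real request return the expected answer?), and resource checks (CPU, memory, restarts).

From a service-monitoring point of view, you’ll often combine several of these for one service, so that an alert shows not only that something is wrong but also at which layer.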

Using check scripts as plugins

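Many monitoring tools allow you to register custom scripts that return a one-line status message on standard output and an exit code. In the Nagios plugin convention, 0 means OK, 1 means WARNING, 2 means CRITICAL, and 3 means UNKNOWN.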

For example:

bash
#!/bin/bash
if systemctl is-active --quiet nginx; then
    echo "OK - nginx is running"
    exit 0
else
    echo "CRITICAL - nginx is not running"
    exit 2
fi

This bridges simple systemctl checks with a full monitoring and alerting system.
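
A slightly fuller variant can distinguish transitional states from hard failures. This sketch uses the 0/1/2/3 exit codes of the Nagios plugin convention; the mapping of systemd states to severities is a choice made for this example, not something systemd or any monitoring tool prescribes, and the function name check_unit is hypothetical.

```shell
#!/bin/bash
# Sketch: map the state printed by "systemctl is-active" to a plugin result.
check_unit() {
    local unit="$1" state
    state="$(systemctl is-active "$unit" 2>/dev/null)"
    case "$state" in
        active)
            echo "OK - $unit is $state"; return 0 ;;
        activating|reloading|deactivating)
            echo "WARNING - $unit is $state"; return 1 ;;
        failed|inactive)
            echo "CRITICAL - $unit is $state"; return 2 ;;
        *)
            echo "UNKNOWN - $unit state '${state:-none}'"; return 3 ;;
    esac
}
```

In a plugin script, the last lines would be check_unit nginx followed by exit $? so that the monitoring system sees the exit code.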

Practical Service Monitoring Checklist

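For each important service on a system, ensure you can answer: Is it running, and is it enabled at boot? Where are its logs, and what do the recent entries say? Does it respond correctly to a real request? How much CPU and memory does it normally use? Will it be restarted automatically if it fails, and who is told when that happens?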

Focusing on these points gives you an effective, practical approach to monitoring running services, even before deploying more advanced monitoring stacks.
