Keeping Your Linux Machines Happy: Automated System Reports That Actually Get Read

Transform the tedious chore of system monitoring into a "set it and forget it" operation.
Let's be honest: manually checking your Linux systems is about as enjoyable as watching paint dry in a server room. You know you should do it regularly, but who has time to SSH into every machine just to run df -h for the hundredth time this week?

Fear not, fellow sysadmin! You're about to set up a proper automated system that'll send you lovely email reports without you having to lift a finger. Think of it as your Linux boxes dropping you a friendly note saying, "All good here, boss!" or perhaps more urgently, "Help, I'm drowning in log files!"

Why Bother Automating This Faff?

Manual system checks are like manually counting sheep. They are tedious, error-prone, and likely to send you to sleep. Automation brings consistency, saves precious time (which you can spend on more important things like arguing about systemd), and gives you that warm fuzzy feeling of proactive monitoring.

Plus, getting system reports straight to your inbox means you can check on your servers whilst having your morning brew, rather than frantically logging in when something's already gone tits up.

What This Clever Little Script Actually Does

The Python script I share with you in this article is like a diligent intern who never complains and always remembers to check:

  • Hostname and IP address (so you know which machine is having a wobble)
  • OS version and kernel details (because knowing your vintage matters)
  • CPU and memory usage (the classic "is it actually working or just pretending?")
  • Disk space availability (before your logs eat everything)
  • Running processes (who's hogging all the RAM this time?)
  • Docker container health (because containers are like cats — they look fine until suddenly they're not)

Once it's gathered all this intel, it packages it up nicely and fires it off via SendGrid's email API. Job done.

What You'll Need Before We Start

Before diving in, make sure you've got:

  • A Linux system with Python 3.7 or newer installed, since the script relies on f-strings and subprocess.run(capture_output=True) (if you haven't got Python, what are you even doing?)
  • A SendGrid account and API key (they're rather good at delivering emails, unlike that one mail server we don't talk about)
  • Basic knowledge of Python or shell scripting (don't worry, it's not rocket science)
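  • Docker, if you want the container sections filled in (the script only calls the docker CLI, and simply reports "Docker is not running or not installed" when it can't find an active Docker service)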

The Script That Does All The Hard Work

Here's the script:

#!/usr/bin/env python3
"""
Linux System Report Script
Gathers system information and sends it via SendGrid email.
"""

import subprocess
import sys
from datetime import datetime
import sendgrid
from sendgrid.helpers.mail import Mail

def load_config():
    """Load configuration from dot files."""
    config = {}
    config_files = {
        'api_key': '.apikey',
        'from_email': '.from_email', 
        'to_email': '.to_email'
    }
    
    for key, filename in config_files.items():
        try:
            with open(filename, 'r') as f:
                config[key] = f.read().strip()
        except FileNotFoundError:
            print(f"Error: Configuration file '{filename}' not found.")
            sys.exit(1)
        except Exception as e:
            print(f"Error reading '{filename}': {str(e)}")
            sys.exit(1)
    
    return config

def run_command(command):
    """Run a shell command and return its output, handling errors gracefully."""
    try:
        result = subprocess.run(command, shell=True, capture_output=True, text=True, timeout=30)
        if result.returncode == 0:
            return result.stdout.strip()
        else:
            return f"Command failed: {result.stderr.strip()}"
    except subprocess.TimeoutExpired:
        return "Command timed out"
    except Exception as e:
        return f"Error running command: {str(e)}"

def get_system_info():
    """Collect comprehensive system information."""
    info = {}
    
    # Basic system info
    info['hostname'] = run_command('hostname')
    info['ip_addresses'] = run_command('hostname -I')
    info['os_release'] = run_command('cat /etc/os-release | grep PRETTY_NAME')
    info['kernel'] = run_command('uname -r')
    info['uptime'] = run_command('uptime')
    
    # CPU info (condensed)
    cpu_info = run_command('lscpu | grep -E "Model name|CPU\\(s\\):|Thread"')
    info['cpu_info'] = cpu_info
    
    # Memory usage
    info['memory'] = run_command('free -h')
    
    # Disk usage
    info['disk'] = run_command('df -h --type=ext4 --type=xfs --type=btrfs')
    
    # Top processes by memory usage
    info['top_processes'] = run_command('ps aux --sort=-%mem | head -10')
    
    # Load averages
    info['load'] = run_command('cat /proc/loadavg')
    
    # Docker info (only if Docker is running)
    docker_running = run_command('systemctl is-active docker 2>/dev/null')
    if docker_running == "active":
        # Get container health status
        containers = run_command('docker ps --format "table {{.Names}}\\t{{.Status}}"')
        # Docker prints the table header even when nothing is running, so require a data row
        has_containers = "NAMES" in containers and len(containers.splitlines()) > 1
        info['docker_containers'] = containers if has_containers else "No containers running"
        
        # Get container stats (brief)
        stats = run_command('timeout 5 docker stats --no-stream --format "table {{.Name}}\\t{{.CPUPerc}}\\t{{.MemUsage}}"')
        has_stats = "NAME" in stats and len(stats.splitlines()) > 1
        info['docker_stats'] = stats if has_stats else "No container statistics available"
    else:
        info['docker_containers'] = "Docker is not running or not installed"
        info['docker_stats'] = "N/A"
    
    # System temperature (if available)
    temp = run_command('sensors 2>/dev/null | grep "Core 0" | head -1')
    info['temperature'] = temp if temp else "Temperature sensors not available"
    
    return info

def format_report(info):
    """Format the system information into a readable email."""
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    
    report = f"""Linux System Report - {timestamp}
{'=' * 50}

BASIC SYSTEM INFO
-----------------
Hostname: {info['hostname']}
IP Addresses: {info['ip_addresses']}
OS: {info['os_release'].replace('PRETTY_NAME=', '').strip('"')}
Kernel: {info['kernel']}
Uptime: {info['uptime']}
Temperature: {info['temperature']}

CPU INFORMATION
---------------
{info['cpu_info']}

LOAD AVERAGES
-------------
{info['load']}

MEMORY USAGE
------------
{info['memory']}

DISK USAGE
----------
{info['disk']}

TOP PROCESSES (by memory usage)
-------------------------------
{info['top_processes']}

DOCKER CONTAINERS
-----------------
{info['docker_containers']}

DOCKER STATISTICS
-----------------
{info['docker_stats']}

Report generated at: {timestamp}
"""
    return report

def send_email(report_content, api_key, from_email, to_email):
    """Send the system report via SendGrid."""
    try:
        sg = sendgrid.SendGridAPIClient(api_key=api_key)
        
        hostname = run_command('hostname')
        subject = f"System Report: {hostname} - {datetime.now().strftime('%Y-%m-%d %H:%M')}"
        
        message = Mail(
            from_email=from_email,
            to_emails=to_email,
            subject=subject,
            plain_text_content=report_content
        )
        
        response = sg.send(message)
        print(f"Email sent successfully! Status code: {response.status_code}")
        return True
        
    except Exception as e:
        print(f"Error sending email: {str(e)}")
        return False

def main():
    """Main function to orchestrate the system report generation and sending."""
    
    # Load configuration from dot files
    print("Loading configuration...")
    config = load_config()
    
    print("Gathering system information...")
    system_info = get_system_info()
    
    print("Formatting report...")
    report = format_report(system_info)
    
    print("Sending email...")
    success = send_email(report, config['api_key'], config['from_email'], config['to_email'])
    
    if success:
        print("System report sent successfully!")
    else:
        print("Failed to send system report.")
        sys.exit(1)

if __name__ == "__main__":
    main()
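
A small aside on the OS line: the script shells out to cat and grep for PRETTY_NAME, then tidies the result up in format_report. If you'd rather skip the pipeline, here's a rough pure-Python sketch of the same lookup. The function name read_pretty_name is my own invention for illustration and isn't part of the script above.

```python
def read_pretty_name(path="/etc/os-release"):
    """Return the PRETTY_NAME value from an os-release style file, or None."""
    try:
        with open(path) as f:
            for line in f:
                # Lines look like KEY=value or KEY="value"
                key, sep, value = line.strip().partition("=")
                if sep and key == "PRETTY_NAME":
                    return value.strip("\"'")
    except OSError:
        # File missing or unreadable: let the caller decide what to show
        return None
    return None


if __name__ == "__main__":
    print(read_pretty_name() or "OS release information not available")
```

Returning None rather than an error string keeps "nothing found" easy to tell apart from a real value, which is a design choice for this sketch rather than how run_command behaves.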

Getting This Beauty Up and Running

1. Save the Script

Pop the script into a file called system_report.py and make it executable:

chmod +x system_report.py

2. Install the Dependencies

You'll need the SendGrid Python package:

pip3 install sendgrid

3. Create Configuration Files

Instead of hardcoding sensitive information (rookie mistake!), create secure dot files for your configuration:

# Create your SendGrid API key file
echo "your_actual_sendgrid_api_key_here" > .apikey

# Create your sender email file (placeholder address, use your verified sender)
echo "reports@example.com" > .from_email

# Create your recipient email file (placeholder address, use your own inbox)
echo "you@example.com" > .to_email

Important: Secure these files so only you can read them:

chmod 600 .apikey .from_email .to_email

This ensures that only the file owner (you) can read and write these sensitive configuration files.
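
If you're the forgetful type, the script could check those permissions itself before reading anything. What follows is only a sketch under my own assumptions: the helper name is_private is hypothetical, and it treats any group or other permission bit as too open, which matches the chmod 600 advice above.

```python
import os
import stat


def is_private(path):
    """Return True if only the file's owner has any access (e.g. mode 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Any group or "other" permission bit means someone else can get at the file
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


if __name__ == "__main__":
    for name in (".apikey", ".from_email", ".to_email"):
        if os.path.exists(name) and not is_private(name):
            print(f"Warning: {name} is accessible to other users; run: chmod 600 {name}")
```

Whether to warn or refuse to run is up to you; a warning is the gentler option while you're still getting things working.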

4. Take It for a Spin

Run the script to see if it behaves:

python3 system_report.py

5. Automate the Whole Thing with Cron

Because manual is for muggles, set up a cron job:

crontab -e

Add this line to get a daily report at 8 AM (adjust to taste):

0 8 * * * cd /path/to/your/script && /usr/bin/python3 system_report.py

For extra points, you might want weekly detailed reports and daily brief ones. Just create different versions of the script or add command-line arguments.
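
The command-line argument route might look something like the sketch below. Everything in it is an assumption for illustration: the --mode flag, the BRIEF_SECTIONS list, and the select_sections helper don't exist in the script above, and the keys simply mirror the ones get_system_info() fills in.

```python
import argparse

# Hypothetical choice of sections for the short daily report
BRIEF_SECTIONS = ["hostname", "uptime", "load", "memory", "disk"]


def parse_args(argv=None):
    """Parse the (hypothetical) --mode flag; defaults to the full report."""
    parser = argparse.ArgumentParser(description="Send a Linux system report by email.")
    parser.add_argument(
        "--mode",
        choices=["brief", "detailed"],
        default="detailed",
        help="brief: key health figures only; detailed: everything",
    )
    return parser.parse_args(argv)


def select_sections(info, mode):
    """Return only the entries of info that belong in the chosen mode."""
    if mode == "brief":
        return {key: value for key, value in info.items() if key in BRIEF_SECTIONS}
    return dict(info)


if __name__ == "__main__":
    args = parse_args()
    sample = {"hostname": "web01", "kernel": "6.1.0", "load": "0.10 0.20 0.30 1/200 1234"}
    print(select_sections(sample, args.mode))
```

With something like this wired in, one cron entry could pass --mode brief every morning and a second could pass --mode detailed once a week. Note that format_report would also need adjusting to cope with missing keys, which this sketch leaves out.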

Making It Even Better

Want to jazz it up further? Consider adding:

  • Alerts for critical thresholds (free disk space below 10%, memory usage above 90%)
  • Historical trending (store data and compare with previous reports)
  • Slack or Teams integration (because not everyone lives in their email)
  • Custom metrics for your specific applications
  • HTML formatting for prettier emails (though plain text has its charm)
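
To give a flavour of the first idea, here's a minimal sketch of threshold alerts using only the standard library. The function names and the two constants are mine, the thresholds are the 10% and 90% figures from the list, and memory usage is estimated from MemTotal and MemAvailable in /proc/meminfo, which is one reasonable definition among several.

```python
import shutil

DISK_FREE_MIN_PCT = 10.0  # alert when free disk space drops below this
MEM_USED_MAX_PCT = 90.0   # alert when memory usage climbs above this


def disk_free_percent(path="/"):
    """Percentage of free space on the filesystem containing path."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total * 100


def memory_used_percent(meminfo_path="/proc/meminfo"):
    """Estimate memory usage from MemTotal and MemAvailable (both in kB)."""
    values = {}
    with open(meminfo_path) as f:
        for line in f:
            key, _, rest = line.partition(":")
            if key in ("MemTotal", "MemAvailable"):
                values[key] = int(rest.split()[0])
    return (1 - values["MemAvailable"] / values["MemTotal"]) * 100


def find_alerts(disk_free_pct, mem_used_pct):
    """Return a list of human-readable alerts; empty means all is well."""
    alerts = []
    if disk_free_pct < DISK_FREE_MIN_PCT:
        alerts.append(f"Disk space low: {disk_free_pct:.1f}% free")
    if mem_used_pct > MEM_USED_MAX_PCT:
        alerts.append(f"Memory usage high: {mem_used_pct:.1f}% used")
    return alerts


if __name__ == "__main__":
    for alert in find_alerts(disk_free_percent("/"), memory_used_percent()):
        print(alert)
```

One way to use it would be to prefix the email subject with something like "ALERT" whenever the list isn't empty, so the worrying reports stand out in your inbox.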

Some Thoughts

This script transforms the tedious chore of system monitoring into a "set it and forget it" operation. Your future self will thank you when you spot that rogue process eating all your RAM before it brings down your entire application.

Remember, the best monitoring system is the one you actually use. Start simple, get it working, then gradually add bells and whistles as needed.

Got any clever enhancements or war stories about system monitoring gone wrong? I'd love to hear about them. After all, we're all in this together, trying to keep our Linux boxes happy and our bosses none the wiser about that time we almost filled up the root partition with log files.

Now go forth and automate responsibly! 🐧

This article is also posted on Medium.