Posts Tagged ‘devops’

Building a self-healing WordPress monitoring shell script using systemd, Apache and MariaDB: simple automation for Linux server auto recovery

Friday, May 22nd, 2026


Running a WordPress website in production is not only about publishing content; it is also about keeping the server healthy 24/7 to achieve good yearly website uptime and, if needed, meet an SLA.

Even on stable Linux systems, occasional service failures are common, for reasons like:

  • Apache crashes or stops responding (due to a bug or whatever)
  • The MariaDB database server becomes unstable after heavy load (or server overheating)
  • WordPress auto updates leave the site stuck in maintenance mode (until manually fixed)
  • Network outages (a DHCP server malfunction or IP / MAC conflicts can disrupt the network)

Plenty of other things can go wrong, but a website infrastructure running on a Linux server, which basically depends on a web server (Apache), a database (MariaDB / PostgreSQL or whatever other service) and WordPress, tends to face a common set of issues that require a sysadmin to take simple steps to resolve.
Temporary outages can become more or less permanent without proper monitoring and automatic recovery procedures.

In the age of clouds and automation, reducing outages is key to success!

To reduce downtime and avoid manual intervention there are a lot of things a sysadmin can do, but many traditional options are neglected by or unknown to the new and knowledgeable SREs (Site Reliability Engineers), most of whom seem to be Gen-Z 🙂

Thus an alternative to the new ways of working is to keep to the old standards and use a lightweight self-healing Bash monitoring script for a WordPress based site / blog. I use such a script myself as I have a self-hosted infrastructure, so I decided to share it in the hope that someone can benefit from it.

The server-health-check-restore-wp-apache-mariadb.sh script continuously checks:

  • Apache health state
  • MariaDB availability
  • HTTP response code status equals 200 ( OK )
  • WordPress maintenance mode (that the site is not stuck in it)

As auto-healing steps it then:

  • Restarts any failed services it finds
  • Cleans up stuck .maintenance WordPress files
  • Reboots the entire server after repeated database failures

This approach provides a simple but highly effective watchdog mechanism without needing complex monitoring software.

1. The server-health-check-restore-wp-apache-mariadb.sh automation self-healing script

 

$ cat /usr/local/bin/server-health-check-restore-wp-apache-mariadb.sh

#!/bin/bash

URL="https://www.pc-freak.net/blog/"
MAINT_FILE="/var/www/blog/.maintenance"
KEYWORD="Briefly unavailable for scheduled maintenance"

APACHE_SERVICE="apache2"
MARIADB_SERVICE="mariadb"

MAX_DB_RESTARTS=5
RESTART_COUNT_FILE="/var/run/mariadb_restart_count"

log() {
    echo "$(date): $1"
}

# --- Apache check ---
if ! systemctl is-active --quiet "$APACHE_SERVICE"; then
    log "Apache is not running. Restarting…"
    systemctl restart "$APACHE_SERVICE"
    sleep 5
fi

# --- MariaDB check ---
if ! systemctl is-active --quiet "$MARIADB_SERVICE"; then
    log "MariaDB is not running."

    # Read restart count
    if [ -f "$RESTART_COUNT_FILE" ]; then
        RESTART_COUNT=$(cat "$RESTART_COUNT_FILE")
    else
        RESTART_COUNT=0
    fi

    RESTART_COUNT=$((RESTART_COUNT + 1))
    echo "$RESTART_COUNT" > "$RESTART_COUNT_FILE"

    log "Restart attempt $RESTART_COUNT of $MAX_DB_RESTARTS"
    systemctl restart "$MARIADB_SERVICE"
    sleep 10

    # Re-check MariaDB
    if ! systemctl is-active --quiet "$MARIADB_SERVICE"; then
        log "MariaDB still unhealthy after restart."

        if [ "$RESTART_COUNT" -ge "$MAX_DB_RESTARTS" ]; then
            log "MariaDB failed $MAX_DB_RESTARTS times. Rebooting server!"
            rm -f "$RESTART_COUNT_FILE"
            /sbin/reboot
            exit 0
        fi

        exit 0
    fi
else
    # MariaDB healthy → reset counter
    if [ -f "$RESTART_COUNT_FILE" ]; then
        log "MariaDB is healthy again. Resetting restart counter."
        rm -f "$RESTART_COUNT_FILE"
    fi
fi

# --- HTTP sanity check ---
HTTP_CODE=$(curl -L --max-redirs 5 -s -o /dev/null -w "%{http_code}" --max-time 10 "$URL")

if [[ "$HTTP_CODE" != "200" ]]; then
    log "Site returned HTTP $HTTP_CODE. Skipping WordPress maintenance cleanup."
    exit 0
fi

# --- WordPress maintenance check ---
PAGE_CONTENT=$(curl -L --max-redirs 5 -s --max-time 10 "$URL")

if echo "$PAGE_CONTENT" | grep -qi "$KEYWORD"; then
    if [ -f "$MAINT_FILE" ]; then
        rm -f "$MAINT_FILE"
        log "WordPress maintenance file removed."
    else
        log "Maintenance message detected, but .maintenance file not found."
    fi
else
    log "Site healthy. No maintenance mode detected."
fi

1.1. Make script executable

Store the script somewhere under /usr/local/bin/ and make it executable:

# chmod +x /usr/local/bin/server-health-check-restore-wp-apache-mariadb.sh

1.2. Schedule it to run via Cron job

Run the script, let's say, every 5 minutes with cron and make it write to a log file:

# crontab -u root -e

*/5 * * * * /usr/local/bin/server-health-check-restore-wp-apache-mariadb.sh >> /var/log/wp_healthcheck.log 2>&1
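
Cron starts a new run every 5 minutes whether or not the previous one has finished, so a guard against overlapping runs may be worth adding. Here is a minimal sketch of one way to do it; the lock path and the function names are made up for illustration and are not part of the original script:

```shell
#!/bin/bash
# Hypothetical lock location; any path writable by the cron user works.
LOCK_DIR="${LOCK_DIR:-/var/run/wp_healthcheck.lock}"

# mkdir is atomic: it succeeds for exactly one caller and fails when the
# directory already exists, which makes it usable as a simple lock.
acquire_lock() {
    mkdir "$LOCK_DIR" 2>/dev/null
}

# Remove the lock again so the next cron run can proceed.
release_lock() {
    rmdir "$LOCK_DIR" 2>/dev/null
}
```

Near the top of the health-check script this could be used as `acquire_lock || exit 0` followed by `trap release_lock EXIT`, so that a second run simply exits while the first one is still busy.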

2. What This Script Actually Does

The script acts like a mini watchdog daemon.

Instead of relying on heavyweight enterprise monitoring systems, it uses:

systemctl, curl and grep combined with simple scripting logic.

The advantage of this simple solution is that it is easy to maintain, transparent and highly portable, as it will run on virtually every systemd-based Linux server / VPS without the need to install anything additional.

2.1 Apache Health Checks

The first section checks whether Apache is running:

# systemctl is-active --quiet apache2

If Apache is down, the script automatically restarts it:

# systemctl restart apache2

This solves temporary crashes caused by:

  • memory exhaustion
  • bad PHP workers
  • failed reloads
  • temporary kernel pressure

A short sleep delay gives Apache time to recover before additional checks continue.

2.2. MariaDB Recovery Script Logic

The database layer is more critical than Apache.

A web server can recover instantly, but repeated MariaDB crashes often indicate:

  • corrupted tables
  • exhausted RAM
  • deadlocks
  • disk problems
  • InnoDB failures

Because of that, the script implements a restart counter.

2.3. Restart Counter Logic

The counter is stored in file:

/var/run/mariadb_restart_count

Every failed startup increments the counter:

RESTART_COUNT=$((RESTART_COUNT + 1))

If MariaDB recovers successfully, the counter is deleted.

This prevents accidental reboot loops.
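
The counter handling can also be sketched as two small helpers. This is only an illustration under my own assumptions: the function names are hypothetical, and unlike the original script it treats a missing or corrupted counter file as 0 rather than trusting its content:

```shell
#!/bin/bash
# Print the stored restart count; a missing or non-numeric file counts as 0.
read_restart_count() {
    local file="$1" value=""
    if [ -f "$file" ]; then
        value=$(cat "$file")
    fi
    case "$value" in
        ''|*[!0-9]*) echo 0 ;;
        *)           echo "$((10#$value))" ;;   # 10# avoids octal surprises
    esac
}

# Increment the counter in the file and print the new value.
bump_restart_count() {
    local file="$1" count
    count=$(( $(read_restart_count "$file") + 1 ))
    echo "$count" > "$file"
    echo "$count"
}
```

With these, the MariaDB branch could call `RESTART_COUNT=$(bump_restart_count "$RESTART_COUNT_FILE")` and compare the result against MAX_DB_RESTARTS as before.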

2.4. Automatic Server Reboot after too many failed recovery attempts

If MariaDB fails too many times:

MAX_DB_RESTARTS=5

the script escalates to a full system reboot:

/sbin/reboot

2.5. Why reboot on continuous service failure?

A reboot will not always help, though in some cases it does make things better. If you have multiple servers running the same set of services (Apache and MySQL) with HAProxy or another load balancer in front, this logic fits well. A reboot is most likely to help when:

  • kernel resources are exhausted
  • filesystem locks remain stuck
  • memory fragmentation becomes severe
  • hardware drivers misbehave

A clean reboot can recover the machine faster than manual debugging during production outages !

This kind of script can be especially useful on:

  • rarely maintained Linux / VPS servers
  • unattended cloud instances
  • remote hosting environments

2.6. HTTP Sanity Check

After validating services, the script checks whether the website actually responds correctly.

$ curl -L --max-redirs 5

The script expects the normal return code:

HTTP 200

Anything else:

  • 500 errors
  • redirect loops
  • gateway failures
  • CDN problems

will stop the maintenance cleanup logic.

This prevents accidental removal of WordPress maintenance files during unrelated outages.
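
The script fetches the page twice, once for the status code and once for the content. A possible variant, sketched here with hypothetical helper names, asks curl to append the status code after the body so that a single request is enough. Note that curl reports the code as 000 when it gets no HTTP response at all:

```shell
#!/bin/bash
# Fetch a URL once; -w makes curl print the status code on its own
# final line after the response body.
fetch_with_status() {
    curl -L --max-redirs 5 -s --max-time 10 -w '\n%{http_code}' "$1"
}

# Split the combined output: the last line is the status code,
# everything before it is the page body.
response_status() { printf '%s\n' "$1" | tail -n 1; }
response_body()   { printf '%s\n' "$1" | sed '$d'; }
```

Usage would look like `RESPONSE=$(fetch_with_status "$URL")`, then `HTTP_CODE=$(response_status "$RESPONSE")` and `PAGE_CONTENT=$(response_body "$RESPONSE")`.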

2.7. Automatic WordPress Maintenance Mode Recovery

One of the most annoying WordPress problems happens during failed updates.

WordPress creates a file in its install directory (say /var/www/blog/):

.maintenance

If the update crashes, the file remains forever and the site displays:

“Briefly unavailable for scheduled maintenance.”

The script detects this message directly from the webpage content with grep:

$ grep -qi "$KEYWORD"

If detected, it removes the stale file automatically:

rm -f "$MAINT_FILE"

This instantly restores the site without requiring manual SSH intervention.
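
One thing to keep in mind: WordPress normally serves its maintenance page with an HTTP 503 status rather than 200, so it may be worth deciding on the status code and the page content together. Below is a rough sketch of such a decision helper; the function name is hypothetical and the accepted status codes are my assumption, so adapt them to what your site really returns:

```shell
#!/bin/bash
KEYWORD="Briefly unavailable for scheduled maintenance"

# Succeeds only when the page shows the maintenance message and the
# status code is one expected for that page (200 or 503). Any other
# status is treated as an unrelated outage and cleanup is skipped.
maintenance_cleanup_wanted() {
    local status="$1" body="$2"
    case "$status" in
        200|503) ;;
        *) return 1 ;;
    esac
    printf '%s' "$body" | grep -qi "$KEYWORD"
}
```

It could be called as `maintenance_cleanup_wanted "$HTTP_CODE" "$PAGE_CONTENT" && rm -f "$MAINT_FILE"`.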

3. Why the simple script approach works well and is a good idea

This setup has several advantages; a key one is that it is extremely lightweight.

No additional complication from trendy stuff like:

  • Docker stack
  • Zabbix
  • Kubernetes
  • Prometheus
  • external monitoring agents etc.

Everything is handled with simple, well-known native Linux tools.

3.1. It is Easy to Debug

Everything is plain Bash.

No hidden automation layers.

Every action is visible and understandable.

3.2. Production Friendly

The script tolerates:

  • temporary outages
  • service crashes
  • failed WordPress upgrades

without requiring administrator interaction.

4. Possible future script Improvements

There are many ways to extend the script further; here are a few ideas.

4.1. Add Email Notifications

Send alerts when:

  • services restart
  • reboot occurs
  • maintenance mode is detected

4.2. Add Disk Space Monitoring

Automatically detect:

  • full disks
  • inode exhaustion
  • backup growth
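
A disk check along these lines could be sketched as follows. The function names and the 90% default threshold are assumptions chosen for illustration:

```shell
#!/bin/bash
# Print the used-space percentage (without the % sign) of the filesystem
# holding the given path. df -P gives stable, POSIX-style columns.
disk_usage_percent() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Same idea for inodes; -i is supported by GNU df on Linux. Some
# filesystems report "-" here instead of a number.
inode_usage_percent() {
    df -Pi "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Succeeds when disk usage is at or above the threshold (default 90).
disk_is_full() {
    local used
    used=$(disk_usage_percent "$1")
    [ "$used" -ge "${2:-90}" ]
}
```

In the health-check script a line such as `disk_is_full /var 90 && log "Disk almost full on /var"` would be enough to get a warning into the log.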

4.3. Add simple MySQL Query Health Checks

Instead of only checking the service state:

mysqladmin ping

could validate actual database responsiveness.
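
A minimal sketch of such a check is shown here. It assumes that credentials come from root's ~/.my.cnf or from unix-socket authentication, which the original script does not show, and the function name is hypothetical:

```shell
#!/bin/bash
# Client binary; overridable so the check can be pointed at a different
# client name or a wrapper script.
MYSQLADMIN="${MYSQLADMIN:-mysqladmin}"

# Succeeds only if the server answers a ping within 5 seconds.
db_responds() {
    "$MYSQLADMIN" --connect-timeout=5 ping >/dev/null 2>&1
}
```

The MariaDB section could then test `systemctl is-active` and `db_responds` together, and treat a running but unresponsive service the same way as a stopped one.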

4.4. Introduce systemd Integration

Instead of cron-based execution, the script could be run natively by systemd if you use a:

  • systemd timer
  • systemd service
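
As a rough sketch of that idea, the helper below writes a oneshot service and a matching timer. The unit names (wp-healthcheck.service / wp-healthcheck.timer) and the 2 minute boot delay are my own choices, not something the original setup defines:

```shell
#!/bin/bash
# Write a oneshot service and a matching timer into the given directory
# (normally /etc/systemd/system). Unit names are hypothetical.
write_healthcheck_units() {
    local dir="$1"
    cat > "$dir/wp-healthcheck.service" <<'EOF'
[Unit]
Description=WordPress self-healing health check

[Service]
Type=oneshot
ExecStart=/usr/local/bin/server-health-check-restore-wp-apache-mariadb.sh
EOF
    cat > "$dir/wp-healthcheck.timer" <<'EOF'
[Unit]
Description=Run the WordPress health check every 5 minutes

[Timer]
OnBootSec=2min
OnUnitActiveSec=5min

[Install]
WantedBy=timers.target
EOF
}
```

After `write_healthcheck_units /etc/systemd/system`, running `systemctl daemon-reload` and `systemctl enable --now wp-healthcheck.timer` would activate it, and the script output would then be readable with `journalctl -u wp-healthcheck.service` instead of a separate log file.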

Closing Summary

In many cases, simple Linux automation still beats overengineered solutions.

Overcomplicating monitoring is a trend today, especially in big companies; however, for home-brew small projects on a small budget, sometimes the best server automation is the simplest one.
As shown above, a few lines of Bash can improve uptime and reduce operational headaches.

For small-to-medium WordPress / website deployments, this kind of self-healing “watchdog” gives you reliability, simplicity, transparency and relatively quick recovery from crashes without bringing in any unnecessary infrastructure complexity. This setup also works with zero human interaction, and if combined with a simple Slack / Discord monitoring Python script you can sleep better.

 

Building a 10-Server FreeBSD Jail Cluster Running a LAMP (Linux / Apache / MySQL / Perl / PHP / Python) Stack

Wednesday, March 25th, 2026


Virtualization and workload isolation are foundational to modern infrastructure.
While most teams today default to container platforms like Docker and orchestration systems such as Kubernetes, an older and highly capable alternative exists in the form of jails from FreeBSD.

FreeBSD jails provide lightweight OS-level isolation, allowing multiple independent userland environments to run on a single host. Introduced long before containers became mainstream, jails were designed with a strong focus on security, simplicity, and performance.
Despite their maturity and robustness, they are less commonly used today, largely due to the rapid rise of container ecosystems and cloud-native tooling.

Choosing between jails and containers is not simply a matter of “old vs new,” but rather a trade-off between control and simplicity versus portability and ecosystem support.

Short Comparison of FreeBSD jails and Containers ( Pros and Cons )

Advantages of FreeBSD Jails

a. Strong, simple isolation

Jails provide a clear and tightly integrated security boundary within the FreeBSD kernel. Their design is straightforward, reducing the risk of misconfiguration compared to layered container security models.


b. High performance

Because jails operate very close to the base system, they deliver near-native performance with minimal overhead—especially beneficial for networking and I/O-heavy workloads.

c. Operational simplicity

There are fewer moving parts (easier to maintain and debug):

  • No separate container runtime
  • No image layers
  • No complex orchestration requirements

This makes jails appealing for stable, long-running systems.

d. Predictability and stability

FreeBSD’s conservative design philosophy results in systems that are highly stable over long periods, which is ideal for infrastructure roles like storage or networking.

Disadvantages of FreeBSD Jails

a. Limited portability

Not necessarily a huge disadvantage, but still: jails are tied to FreeBSD. Unlike containers, they cannot be easily moved across different operating systems or cloud platforms.


b. Smaller ecosystem

FreeBSD jails have no full equivalent of:

  • Container registries (like Docker Hub)
  • Massive orchestration ecosystems (similar things have to be done with scripts and customizations)
  • Broad third-party integrations

This can slow down development and deployment workflows a bit, though for mature applications that are already well tuned for jails it may not be a real problem.

Note that although this is a con, it can also be a pro: once you have tuned an app for jails, it becomes easier to maintain.

c. Less automation tooling

While tools exist, they are not as standardized or widely adopted as container-based CI/CD pipelines.

d. Harder to find people for it
 

Most developers and DevOps engineers are trained in container technologies, making hiring and collaboration easier in container-based environments. However, for senior hard-core sysadmins and system engineers that can also be an advantage, as not many people have in-depth insight into both FreeBSD and its jails.

This guide walks through a practical, production-style setup: 10 FreeBSD servers, each running isolated jails that host a classic LAMP stack (Linux, here replaced by FreeBSD, Apache, MySQL/MariaDB, PHP).
Companies or individuals who choose FreeBSD jails usually aim for repeatability, clean architecture and operational sanity, not just getting it to run once, and that is the focus here.

Architecture Overview of sample FBSD Cluster

Our Goal:

  • 10 physical or virtual servers
  • Each server runs multiple jails
  • Each jail runs a LAMP app instance
  • Load balancing across nodes (to get a High Availability cluster-like setup)

Host Setup:

  • 2 × load balancer nodes (nginx or HAProxy)
  • 6 × application nodes (Apache + PHP in jails)
  • 2 × database nodes (MariaDB primary/replica)

All systems run FreeBSD, using native jails for isolation.

1. Base FreeBSD Installation (All 10 Servers)

Install FreeBSD on each machine (minimal install is fine).

Update system:

# freebsd-update fetch install
# pkg update && pkg upgrade -y

Install base tools:

# pkg install -y sudo vim bash git

2. Install Jail Management tool (iocage)

We’ll use iocage, a modern jail manager.

# pkg install -y iocage
# sysrc iocage_enable="YES"
# service iocage start

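Create a ZFS pool for the jails (recommended). Skip this step if the host already has a pool, and note that /dev/da0 here stands for a spare, unused disk: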

# zpool create zroot /dev/da0

Initialize iocage:

# iocage activate zroot
# iocage fetch

3. Create a Reusable Jail Template

Instead of building each jail manually, create a golden template.

# iocage create -n lamp-template -r 13.2-RELEASE vnet=on ip4_addr="vnet0|10.0.0.10/24" boot=off
# iocage start lamp-template
# iocage console lamp-template

4. Install LAMP Stack Inside the Jail

Inside the jail:

4.1. Install Apache

# pkg install -y apache24
# sysrc apache24_enable="YES"

4.2. Install MariaDB

# pkg install -y mariadb106-server
# sysrc mysql_enable="YES"

Initialize DB:

# service mysql-server start
# mysql_secure_installation

4.3. Install PHP from pre-compiled packages

# pkg install -y php82 mod_php82 php82-mysqli php82-mbstring php82-opcache


Configure Apache to use PHP:

# echo 'LoadModule php_module libexec/apache24/libphp.so' >> /usr/local/etc/apache24/httpd.conf
# echo 'AddType application/x-httpd-php .php' >> /usr/local/etc/apache24/httpd.conf

5. Test LAMP Stack works OK

Create a test file:

# echo "<?php phpinfo(); ?>" > /usr/local/www/apache24/data/index.php

Start services:

# service apache24 start

Visit the jail IP in a browser (Firefox / Chrome) and confirm that the PHP info page is displayed.

6. Convert Template into Clones

Stop the jail and snapshot it:

# iocage stop lamp-template
# iocage snapshot lamp-template -n base

Clone for production:

# iocage clone lamp-template -n app01 ip4_addr="vnet0|10.0.0.21/24"
# iocage clone lamp-template -n app02 ip4_addr="vnet0|10.0.0.22/24"

Repeat across servers and, once everything works, create a small shell script to run as a cron job to automate backups.

Each server might run 5 up to 20 jails depending on resources.
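
With that many jails per server, typing the clone commands by hand gets tedious. The sketch below only prints the commands so they can be reviewed first. The function name is hypothetical, and the address scheme (10.0.0.<20 + n>) simply continues the two examples above, which is an assumption about how the addresses are laid out:

```shell
#!/bin/bash
# Print the iocage clone commands for app jails numbered first..last.
# Nothing is executed; the output is meant to be reviewed before use.
print_clone_commands() {
    local first="$1" last="$2" n
    for n in $(seq "$first" "$last"); do
        printf 'iocage clone lamp-template -n app%02d ip4_addr="vnet0|10.0.0.%d/24"\n' \
            "$n" "$((20 + n))"
    done
}
```

For example, `print_clone_commands 1 5` lists the commands for app01 to app05, and piping that output to `sh` would run them once they look right.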

7. Networking Between Jails

Use VNET for proper isolation:

Enable bridge on host:

# ifconfig bridge0 create
# ifconfig bridge0 addm em0 up

Assign jail interfaces automatically via iocage.

8. Load Balancing Layer

On 2 dedicated nodes, install nginx:

# pkg install -y nginx
# sysrc nginx_enable="YES"

Example config:

http {
    upstream backend {
        server 10.0.0.21;
        server 10.0.0.22;
        server 10.0.1.21;
        server 10.0.1.22;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend;
        }
    }
}

9. Database Strategy

You have a few options to choose from:

a. Use Centralized DB

  • Dedicated DB jails on 2 nodes
  • Primary + replica

b. Use Per-node DB (simpler)

  • Each jail has its own MariaDB
  • Use app-level replication if needed

10. Automation Across 10 Servers

Use tools like:

  • Ansible
  • SSH scripts
  • ZFS replication

Example (a simple loop that visits the hosts one after another); alternatively, handle updates with Ansible playbooks or Puppet:

# for host in server{1..10}; do
  ssh "$host" "pkg update"
done
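
A plain for loop like this handles one host at a time. If real parallel execution is wanted, a sketch with background jobs could look like the following; the function name and the SSH_CMD override are hypothetical and exist only to make dry runs possible:

```shell
#!/bin/bash
# Remote runner; overridable for dry runs (e.g. SSH_CMD=echo).
SSH_CMD="${SSH_CMD:-ssh}"

# Start the same command on every host in the background, then wait for
# all of them. Returns non-zero if any host failed.
run_on_hosts() {
    local cmd="$1" host pid failed=0
    local pids=()
    shift
    for host in "$@"; do
        # stdin comes from /dev/null so background ssh cannot swallow input
        "$SSH_CMD" "$host" "$cmd" </dev/null &
        pids+=("$!")
    done
    for pid in "${pids[@]}"; do
        wait "$pid" || failed=1
    done
    return "$failed"
}
```

A call such as `run_on_hosts "pkg update" server{1..10}` would then update all ten servers at once. Output from the hosts arrives interleaved, so for anything longer it is probably nicer to redirect each host to its own log file.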

A few more operational tips to consider

a. Tune the setup / manage resources

  • Limit jail CPU/memory using rctl
  • Avoid overcommitting RAM

b. Use Centralized Logging

c. Do regular jail Backups

  • Use ZFS snapshots to back up each of the jails (-r also covers the jail's child datasets):

# zfs snapshot -r zroot/iocage/jails/app01@backup
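
For regular backups from cron a fixed snapshot name will collide on the second run, so a date in the name helps. Here is a small sketch that only prints the commands; the function names, the "backup-" prefix and the date format are illustrative choices, and -r is used so the child datasets of each jail are included:

```shell
#!/bin/bash
# Build a snapshot name such as zroot/iocage/jails/app01@backup-2026-03-25.
# The second argument (the day) defaults to today's date.
snapshot_name() {
    local jail="$1" day="${2:-$(date +%Y-%m-%d)}"
    printf 'zroot/iocage/jails/%s@backup-%s\n' "$jail" "$day"
}

# Print (not run) one recursive snapshot command per jail.
print_snapshot_commands() {
    local jail
    for jail in "$@"; do
        printf 'zfs snapshot -r %s\n' "$(snapshot_name "$jail")"
    done
}
```

Running `print_snapshot_commands app01 app02 | sh` from a daily cron job would then create one dated snapshot per jail.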

d. Tighten Security

  • Disable root SSH
  • Use PF firewall on host
  • Keep jails minimal

e. Do a Further Scaling Strategy

  • Add more servers -> replicate template
  • Add more jails -> clone snapshots
  • Scale horizontally via load balancer

Summary and Last Thoughts

When to choose FreeBSD jails and when containers

  • Use jails when you control the infrastructure, need maximum efficiency, and value simplicity (e.g., appliances, CDNs, storage systems).
  • Use containers when portability, scalability, and integration with modern DevOps workflows are critical.

This setup plays to the strengths of FreeBSD jails:

1. Performance: near-native speed
2. Isolation: strong and predictable
3. Simplicity: fewer layers than container stacks

FreeBSD jails remain a powerful and efficient isolation mechanism, particularly well-suited for controlled, performance-sensitive environments. Containers, however, dominate in modern application deployment due to their flexibility and ecosystem. The choice ultimately depends on whether you prioritize system-level control or platform-level convenience.

You won’t get the ecosystem of tools like Docker or Kubernetes, but you gain control, stability, and efficiency, which is a big part of why companies like Netflix still rely on FreeBSD in critical infrastructure.

 

Create a user and password on Linux non-interactively and add it to sudo: a tiny DevOps script

Thursday, September 20th, 2018

A common task for sysadmins who manage a multitude of servers remotely via Secure Shell is to add a user and assign a password using a script. For me this was sometimes necessary to set up some system users and create access for university users on 10 / 20 testing Linux servers.

Nowadays this task of adding a user to a list of remote servers and granting the new user superuser permissions through /etc/sudoers is practiced heavily by the so-called DevOps engineers (just another business word for senior system administrators with good scripting skills and a little bit of development experience; same game, different name).

DevOps system integration engineers use this non-interactive add-user-via-SSH approach in cloud environments to prepare a superuser (granted root permissions through /etc/sudoers) that is later used for, let's say, deployment on a few hundred servers of LAMP (Linux + Apache + MySQL + PHP), LEMP (Linux + NGINX + MySQL + PHP) or the HAProxy software load balancer balancing MySQL clusters / Nginx application servers / JIRAs etc., through a playbook with a deployment automation tool such as Ansible.

Well, enough talk; here are the few lines of code which create a user locally:
 

linux:~# apt-get install --yes sudo
linux:~# useradd devops --home /home/devops -s /bin/bash
linux:~# mkdir /home/devops
linux:~# chown -R devops:devops /home/devops
linux:~# echo 'devops:testpass' | chpasswd


Though these lines could easily be invoked by passing them as arguments via ssh, it is often unhandy to run them on remote hosts this way, because some of the hosts might already have the user with sudo permissions granted.

Thus a much better way to do things is to use the script below, first uploading it to the remote servers with scp and then running it via ssh in a loop:

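while read -r host; do
    # copy the script to the remote host, then run it there as root
    scp create_user_noninteractive_and_add_to_sudoers.sh "root@$host:/root/"
    # -n keeps ssh from swallowing the rest of servers_list.txt on stdin
    ssh -n "root@$host" "bash /root/create_user_noninteractive_and_add_to_sudoers.sh"
done < servers_list.txt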


Where servers_list.txt contains a list of remote IPs, one per line, and create_user_noninteractive_and_add_to_sudoers.sh is the following script:

#!/bin/bash
# Create new user/group and add nopasswd login to sudoers
# Author: Georgi Georgiev
# has to be run as root - sudo devops
# hipo@www.pc-freak.net

u_id='devops';
g_id='devops';
pass='testpass';
sudoers_f='/etc/sudoers';

# Install sudo only if the package is not installed yet
check_install_sudo ()  {
if [ -z "$(dpkg --get-selections | cut -f1 | grep -E '^sudo')" ]; then
apt-get install --yes sudo
else
        printf "Nothing to do sudo installed\n";
fi
}

# Create the user (if missing) and grant it passwordless sudo
check_install_user () {

if [ "$(grep -c "^$u_id:" /etc/passwd)" -eq 0 ]; then
useradd $u_id --home /home/$u_id
mkdir /home/$u_id
chown -R $u_id:$g_id /home/$u_id
echo "$u_id:$pass" | chpasswd
cp -rpf /etc/bash.bashrc /home/$u_id
if [ "$(sed -n "/$u_id/p" $sudoers_f | wc -l)" -eq "0" ]; then
echo "$u_id ALL=(ALL) NOPASSWD: ALL" >> $sudoers_f
else
        echo "$u_id existing. Exiting ..";
        exit 1;
fi

else
        echo "Will do nothing because $u_id exists";
fi

}

check_install_sudo;
check_install_user;
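
Appending directly to /etc/sudoers works, but a typo there can lock everybody out of sudo. A somewhat safer variant, sketched below, writes the rule into a drop-in file under /etc/sudoers.d, which Debian's default sudoers includes. The function name is hypothetical and the directory is a parameter, so it can be tried against a scratch directory first:

```shell
#!/bin/bash
# Write a NOPASSWD rule for one user as a drop-in file instead of
# appending to /etc/sudoers. The second argument is the target
# directory and defaults to /etc/sudoers.d.
write_sudoers_dropin() {
    local user="$1" dir="${2:-/etc/sudoers.d}"
    local file="$dir/$user"
    printf '%s ALL=(ALL) NOPASSWD: ALL\n' "$user" > "$file"
    chmod 0440 "$file"    # sudo expects drop-in files to be read-only
}
```

After `write_sudoers_dropin devops`, running `visudo -cf /etc/sudoers.d/devops` checks the syntax of the new file before anybody depends on it.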


By the way, this was the simplest task given to me by a company where I applied for a DevOps System Engineer position, so I hope this will help someone else too.

P.S. If you prefer shell scripts (even though they are much harder, more time consuming etc.) as a means of automation, as an alternative to Ansible / Chef, I suggest you check out and perhaps try to do the task with http://fuckingshellscripts.org 🙂