Posts Tagged ‘disk space’

Analyze disk space usage in Linux / BSD with du / find and filelight /qdirstat / baobab GUI disk usage analyzers to check what takes up your disk space on Unix like OSes

Friday, April 21st, 2023

linux-how-to-find-out-what-files-and-directories-has-occupied-all-your-disk-space-partition-from-console-and-GUI_du-find-filelight-baobab-qdirstat-duff-linux-450x450

If you're a desktop Linux or BSD UNIX user and the space on your hard disk, external SSD or flash drive starts to mysteriously disappear, you will want to find out what has eaten it. The cause can be a crashing application that rapidly writes error / warning messages to its logs and quickly fills the disk, big BitTorrent downloads forgotten in your BitTorrent client, or a complete mirror of a large website. Whatever the reason, once the root directory ( / ) gets fully or nearly filled up, the OS and applications become slow to respond, so you definitely want to check what kind of data has filled up the disk.

For the regular *nix user finding out what is filling the disk is a trivial task with find / du -hsc *, but as people have different habits in using find and du, I'll show the most common ways I use these two command line tools to track down low disk space issues, for the sake of comparison.
Others who have better or easier ways to do it are very welcome to share them with me in the comments.
 

1. Finding large files on hard disk with find Linux command tool
 

host:~# find /home -type f -printf "%s\t%p\n" | sort -n | tail -10
2100000000    /home/hipo/Downloads/MameUIfx incl. ROMs/MameUIfx incl. ROMs-6.bin
2100000000    /home/hipo/Downloads/MameUIfx incl. ROMs/MameUIfx incl. ROMs-7.bin
2100000000    /home/hipo/Downloads/MameUIfx incl. ROMs/MameUIfx incl. ROMs-8.bin
2100000000    /home/hipo/Downloads/MameUIfx incl. ROMs/MameUIfx incl. ROMs-9.bin
2815424080    /home/hipo/.thunderbird/h3dasfii.default/ImapMail/imap.gmail.com/INBOX
2925584895    /home/hipo/Documents/.git/objects/pack/pack-8590b069cad26ac0af7560fb42b51fa9bfe41050.pack
4336918067    /home/hipo/Games/Mames_4GB-compilation-best-arcade-games-of-your-14_04_2021.tar.gz
6109003776    /home/hipo/VirtualBox VMs/CentOS/CentOS.vdi
23599251456    /home/hipo/VirtualBox VMs/Windows 7/Windows 7.vdi
33913044992    /home/hipo/VirtualBox VMs/Windows 10/Windows 10.vdi
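The pipeline above can be wrapped into a small helper so it is easier to reuse on other directories. This is only a minimal sketch, assuming GNU find and coreutils; the function name largest_files is made up for illustration. Compared to the one-liner it adds -xdev (stay on the filesystem the directory lives on) and hides "Permission denied" noise:

```shell
# largest_files DIR [COUNT]: print the COUNT biggest regular files under DIR,
# smallest first, as "<size in bytes><TAB><path>" (same idea as the one-liner).
largest_files() {
    dir=${1:-.}
    count=${2:-10}
    # -xdev keeps find on the filesystem DIR lives on (other mounts are skipped)
    find "$dir" -xdev -type f -printf '%s\t%p\n' 2>/dev/null |
        sort -n |
        tail -n "$count"
}

# Example: largest_files /home 10
```

Staying on one filesystem matters when / is the partition that is full and /home or network shares are mounted below it.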

I rarely use find on desktops and use it more when I have to do some kind of data usage analysis on servers. For my Linux home computer and other Linux desktop machines, or for just a small, non-exhaustive analysis, the du command is much more appropriate.


2. Finding the files and directories that occupy the most space, sorted in Megabytes and Gigabytes, with du
 

  • Check the top 10 files and directories measured in megabytes that are hanging in a directory

pcfkreak:~# du -hsc /home/hipo/*|grep 'M\s'|sort -rn|head -n 10
956M    /home/hipo/last_dump1.sql
711M    /home/hipo/hipod
571M    /home/hipo/from-thinkpad_r61
453M    /home/hipo/ultimate-edition-themes
432M    /home/hipo/metasploit-framework
355M    /home/hipo/output-upgrade.txt
333M    /home/hipo/Плот
209M    /home/hipo/Work-New.tar.gz
98M    /home/hipo/DOOM64
90M    /home/hipo/mp3

  • Get the top 10 largest files and directories measured in Gigabytes that are space hungry and eating up your space

pcfkreak:~# du -hsc /home/hipo/*|grep 'G\s'|sort -rn|head -n 10
156G    total
60G    /home/hipo/VirtualBox VMs
37G    /home/hipo/Downloads
18G    /home/hipo/Desktop
11G    /home/hipo/Games
7.4G    /home/hipo/ownCloud
7.1G    /home/hipo/Документи
4.6G    /home/hipo/music
2.9G    /home/hipo/root
2.8G    /home/hipo/Documents
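The two greps above split the listing by unit. A minimal sketch of a single listing, assuming GNU coreutils (sort -h understands the K / M / G suffixes that du -h prints); the function name biggest_entries is hypothetical:

```shell
# biggest_entries DIR [COUNT]: list the COUNT largest files / directories
# directly inside DIR, biggest first, with human-readable sizes.
biggest_entries() {
    dir=${1:-.}
    count=${2:-10}
    # sort -rh orders 956M below 2.8G correctly, so no per-unit grep is needed
    du -sh -- "$dir"/* 2>/dev/null | sort -rh | head -n "$count"
}

# Example: biggest_entries /home/hipo 10
```

Note that the * glob skips hidden entries such as ~/.cache, just like the commands above do.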


If you still want to work on the console terminal but don't want to type too much, you can use the ncdu (ncurses) text tool. Install it with:

# apt install --yes ncdu


https://www.pc-freak.net/images/ncdu-gnu-linux-debian-screenshot.png

For the laziest ones, or complete Linux newbies who don't want to spend time typing, learning or using text commands and software, you can also check what has eaten up your disk space with GUI tools.

There are at least 3 tools I'm aware of for checking in a graphical interface what has occupied your disk space on Linux / BSD:

3. Filelight GUI disk usage analysis Linux tool

For those using KDE or preferring a shiny GUI that captures the eye, filelight is perhaps the tool of choice to get a summary analysis of your directory structures and file usage on a laptop or desktop *nix OS.

unix-desktop:~# apt-cache show filelight|grep -i description-en -A 7
Description-en: show where your diskspace is being used
 Filelight allows you to understand your disk usage by graphically
 representing your filesystem as a set of concentric, segmented rings.
 .
 It is like a pie-chart, but the segments nest, allowing you to see both
 which directories take up all your space, and which directories
 and files inside those directories are the real culprits.
Description-md5: 397ff9a469e07a772f22460c66b66875


To use it simply go ahead and install it with apt or yum / dnf or whatever Linux package manager your distro uses:

unix-desktop:~# apt-get install --yes filelight

filelight-show-where-disk-space-is-being-used-graphically-tool-linux

4. GNOME Disk Usage Analyzer Baobab GUI tool

For users of the GNOME / MATE / Budgie / Cinnamon graphical environments, baobab should be the program to use, as it is built on the same GTK toolkit those desktops use.

unix-desktop:~# apt-cache show baobab|grep -i description-en -A10
Description-en: GNOME disk usage analyzer
 Disk Usage Analyzer is a graphical, menu-driven application to analyse
 disk usage in a GNOME environment. It can easily scan either the whole
 filesystem tree, or a specific user-requested directory branch (local or
 remote).
 .
 It also auto-detects in real-time any changes made to your home
 directory as far as any mounted/unmounted device. Disk Usage Analyzer
 also provides a full graphical treemap window for each selected folder.
Description-md5: 5f6072b89ebb1dc83433fa7658814dc6
Homepage: https://wiki.gnome.org/Apps/Baobab

 

gnome-disk-analyzer-baobab-tool-screenshot-of-hard-disk-directory-locations-sorted-by-size

5. Qdirstat graphical application to show where your disk space has gone on Linux

QDirStat is perhaps a well known tool for tracking disk space issues on Linux desktop hosts, known by the hardcore lovers of the KDE / LXDE / LXQt / DDE graphical environments. It comes from the KDE world (it is the successor of KDirStat) and uses the infamous Qt library. I personally don't like it and don't put it on machines I use, because I never use KDE and don't want to waste my disk space on additional libraries such as Qt, which historically was not totally free in terms of licensing and even now is offered under both free (GPL / LGPL) and non-free (Qt commercial) licenses.

unix-desktop:~# apt-cache show qdirstat|grep -i description-en -A10
Description-en: Qt-based directory statistics
 QDirStat is a graphical application to show where your disk space has gone and
 to help you to clean it up.
 .
 QDirStat has a number of new features compared to KDirStat. To name a few:
  * Multi-selection in both the tree and the treemap.
  * Unlimited number of user-defined cleanup actions.
  * Properly show errors of cleanup actions (and their output, if desired).
  * File categories (MIME types) and their treemap color are now configurable.
  * Exclude rules for directories are easily configurable.
  * Desktop-agnostic; no longer relies on KDE or any other specific desktop.


qdirstat-linux-screenshot-show-what-directory-uses-most-hard-disk-space

That shiny fused graphic is actually a representation of all directories, the bigger the directory the bigger its area, and if one hovers over the colorful map a text with the directory or file name and its size will appear. Though the graphical representation is really c00l, to me it is a bit unreadable, thus I prefer and recommend the other two GUI tools, filelight or baobab, instead.

6. Finding duplicate files on Linux system with duff command tool

Talking about big unknown left-over files on your hard drives, it is appropriate to mention one more tool here that is a console one but very useful to anyone willing to get rid of old duplicate files that are hanging around on the disk. Sometimes such copies are produced while copying large amounts of files from place to place, or simply by mistake while copying photo / video files from your smartphone to a Linux desktop etc.

This is where the duff command line utility might be super beneficial for you.

unix-desktop:~# apt-cache show duff|grep -i description-en -A3
Description-en: Duplicate file finder
 Duff is a command-line utility for identifying duplicates in a given set of
 files.  It attempts to be usably fast and uses the SHA family of message
 digests as a part of the comparisons.

Using the duff tool is very straightforward. To see all the duplicate files hanging in a directory, let's say your home folder, run:

unix-desktop:~#  duff -rP /home/hipo

/home/hipo/music/var/Quake II Soundtrack – Kill Ratio.mp3
/home/hipo/mp3/Quake II Soundtrack – Kill Ratio.mp3
2 files in cluster 44 (7913472 bytes, digest 98f38be49e2ffcbf90927f9357b3e24a81d5a649)
/home/hipo/music/var/HYPODIL_01-Scakauec.mp3
/home/hipo/mp3/HYPODIL_01-Scakauec.mp3
2 files in cluster 45 (2807808 bytes, digest ce9067ce1f132fc096a5044845c7fac73e99c0ed)
/home/hipo/music/var/Quake II Suondtrack – March Of The Stoggs.mp3
/home/hipo/mp3/Quake II Suondtrack – March Of The Stoggs.mp3
2 files in cluster 46 (3506176 bytes, digest efcc401b4ebda9b0b2367aceb8e334c8ba1a357d)
/home/hipo/music/var/Quake II Suondtrack – Quad Machine.mp3
/home/hipo/mp3/Quake II Suondtrack – Quad Machine.mp3
2 files in cluster 47 (7917568 bytes, digest 0905c1d790654016c2ecf2949f78d47a870c3822)
/home/hipo/music/var/Cyberpunk Group – Futureshock!.mp3
/home/hipo/mp3/Cyberpunk Group – Futureshock!.mp3

-r (Recursively search into all specified directories.)

-P (Don't follow any symbolic links. This overrides any previous -H or -L option. This is the default. Note that this only applies to directories, as symbolic links to files are never followed.)

7. Deleting duplicate files with duff

If you're absolutely sure you know what you're doing and you have a backup in case something gets messed up during the duplicate deletions, then to get rid of, let's say, any duplicate picture files found by duff, run something like:

# duff -e0 -r /home/hipo/Pictures/ | xargs -0 rm

!!! Please note that deleting with duff is for those who absolutely know what they're doing and have a recent backup of their data. Deleting the wrong data by mistake with the tool might put you back in the first grade and you'll be the only one to blame  🙂 !!!
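Before deleting anything it helps to preview which files count as the "extra" copies. The following is not duff, only a rough stand-in sketched with find, sha256sum and awk (GNU tools assumed, function name excess_duplicates invented here). It prints every file whose content matches a file already seen and deletes nothing:

```shell
# excess_duplicates DIR: print every file under DIR whose content is identical
# to a file already seen; one copy per group of duplicates stays unlisted.
excess_duplicates() {
    find "$1" -type f -print0 |
        xargs -0 -r sha256sum |
        LC_ALL=C sort |
        # a SHA-256 hex digest is 64 characters, followed by two separator
        # characters, so the file path starts at column 67
        awk 'seen[substr($0, 1, 64)]++ { print substr($0, 67) }'
}

# Example (review the list by eye before removing anything):
#   excess_duplicates /home/hipo/Pictures
```

Limitations of the sketch: all empty files hash alike and are therefore listed, and file names containing a backslash or newline are printed in sha256sum's escaped form.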

Wrap it Up

A disk filling up with unknown large files is a problem that happens often. For the unlazy on Linux / BSD / Mac OS and other UNIX like OS-es the easiest way to resolve it is to use find or du with some one-liner command. For the lazy, Windows-addicted graphical users the filelight, qdirstat or baobab GUI disk usage analysis tools are there.
If you have a lot of files and many of them are duplicates, you can use duff to check them out, remove all unneeded duplicates and save space.
Hope this article was helpful for someone.
That's all folks, enjoy your data prophylactics. If you know any other good, easy command or GUI tools or hints for disk space prophylactics, please share.

How to disable Windows pagefile.sys and hiberfil.sys to temporary or permamently save disk space if space is critically low

Monday, March 28th, 2022

howto-pagefile-hiberfil.sys-remove-reduce-increase-increase-size-windows-logo

Sometimes you have to work with Windows 7 / 8 / 10 PCs etc. that have a very small C:\ partition,
or at other times, for whatever reason, the disk got filled up over time and has only a few megabytes left.
This totally breaks Windows performance, as the OS becomes terribly sluggish and even
simple things such as opening an Internet browser (Chrome / Firefox / Opera) or Windows Explorer stall the PC.

You might of course try to use something like the SpaceSniffer tool (a great tool to find lost disk space on a PC; a short description of it is found in my previous article on how to
delete temporary Internet files and folders to speed up and free disk space)
or use CCleaner to clean up the PC a bit.
Sometimes this is not enough though, or it is not possible to do at all, because the main
partition C:\ is already far too low on space (only 30-50MB are available on the HDD) or the physical or virtual machine containing the OS is filled with important data
and you can't risk removing anything, including temporary Internet files, browsing cookies … whatever.

Let's say you are the guy chosen by fate as sysadmin to face this uneasy situation and you have no easy
way to add disk space from another partition with free space and cannot add a new SATA hard drive
or SSD drive. What should you do?
 

The solution: wipe off pagefile.sys and hiberfil.sys

Usually every Windows installation has a pagefile.sys and hiberfil.sys.

  • pagefile.sys – is the default file that is used as a swap file once the machine runs out of memory. For Unix / Linux users' better understanding, pagefile.sys is the equivalent of Linux's swap partition. Of course, as the pagefile is a file and not a separate partition, swapping in Windows is perhaps generally worse than in Linux.
  • hiberfil.sys – is used to store the machine's state on hibernation (for those who use the feature)


Depending on the amount of RAM configured on the OS, pagefile.sys could take up 5 – 8 GB, hanging around there doing nothing but occupying space. Thus a temporary workaround that could free you some space, even though it will degrade performance, is to seriously reduce the preconfigured default size of pagefile.sys (which usually is 1.5 times the active memory on the OS, hence if you have 4GB of RAM you would have a 6 Gigabyte pagefile.sys). On servers and production machines this is not a good solution; it is meant for ordinary user machines where you temporarily need to free space for some other important task.

Another possibility, especially on laptops and mobile devices running a Windows OS, is to disable hiberfil.sys; read below how this is done.


The temporary solution here is to simply free space by either reducing pagefile.sys or completely disabling it


1. Disable pagefile.sys on Windows XP, Windows 7 / 8 / 10 / 11


The GUI to disable the pagefile is quite similar across all NT based Windows OS-es; the only difference is that newer versions of Windows have slightly more options.


1.1 Disable pagefile on Windows XP


Quickest way is to find pagefile.sys settings from GUI menus

1. Computer (My Computer) – right click mouse
2. Properties (System Properties will appear)
3. Advanced (tab) 
4. Settings
5. Advanced (tab)
6. Change button

windows-xp-pagefile-disable-screenshot

1.2 Disable pagefile on Windows 7

 

advanced-system-settings-control-panel-system-and-security-screenshot

windows-system-properties-screenshot-properties-advanced-change-Virtual-memory-pagefile-screenshot

system-properties-performance-options
 

Once applied you'll be required to reboot the PC

How-to-turn-off-Virtual-Memory-Paging_File-in-Windows-7-restart

 

1.3 Disable / Increase / Decrease pagefile.sys on Windows 10 / Win 11
 

open-system-properties-advanced-win10

win10-performance-options-menu-screenshot

configure-virtual-memory-win-10-screenshot


1.4 Make Windows clear pagefile.sys on shutdown

On home PCs it might be a useful thing to clear up (nullify) pagefile.sys on shutdown; that could save you some disk space on every reboot, until the file grows again to its configured maximum.

Run

regedit

Modify the ClearPageFileAtShutdown value (set it to 1) in the registry key at location

 

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management

windows-clean-up-pagefile-sys-file-on-shutdown-or-reboot-registry-editor-value-screenshot

You can also apply the value via a registry file; you can get the Enable ClearPageFile at shutdown .reg file here.
 

2. Manipulating pagefile.sys size and file delete from command line with wmic tool 

For scripting purposes you might want to use wmic pagefile, which can increase / decrease or delete the file without the GUI. That is very helpful if you have to administer a Windows Domain (Active Directory).
 

[hipo.WINDOWS-PC] ➤ wmic pagefile /?

PAGEFILE – Virtual memory file swapping management.

HINT: BNF for Alias usage.
(<alias> [WMIObject] | [] | [] ) [].

USAGE:

PAGEFILE ASSOC []
PAGEFILE CREATE <assign list>
PAGEFILE DELETE
PAGEFILE GET [] []
PAGEFILE LIST [] []

 

[hipo.WINDOWS-PC] ➤ wmic pagefile
AllocatedBaseSize  Caption          CurrentUsage  Description      InstallDate                Name             PeakUsage  Status  TempPageFile
4709               C:\pagefile.sys  499           C:\pagefile.sys  20200912061902.938000+180  C:\pagefile.sys  525                FALSE

 

[hipo.WINDOWS-PC] ➤ wmic pagefile list /format:list

AllocatedBaseSize=4709
CurrentUsage=499
Description=C:\pagefile.sys
InstallDate=20200912061902.938000+180
Name=C:\pagefile.sys
PeakUsage=525
Status=
TempPageFile=FALSE

wmic-pagefile-command-line-tool-for-windows-default-output-screenshot

 

  • To change the Initial Size or Maximum Size of Pagefile use:
     

➤ wmic pagefileset where name="C:\\pagefile.sys" set InitialSize=2048,MaximumSize=2048

  • To move the pagefile / change location of pagefile to less occupied disk drive partition (i.e. D:\ drive)

     

     

    Sometimes you might have multiple drives on the PC and some of them might have many gigabytes free, while the main drive C:\ is fully occupied due to bad drive organization at initial install. In that case a good workaround to save space, so you can work normally with the server, is to temporarily or permanently move the pagefile to another drive.

wmic pagefileset where name="D:\\pagefile.sys" set InitialSize=2048,MaximumSize=2048


!!! CONSIDER !!!

If you have the option to move pagefile.sys, for best performance it is advisable to place the file on another physical disk, preferably a Solid State Drive. Spinning hard disks are too slow, and the reduced disk Input / Output throughput will lead to degraded performance whenever memory is short (i.e. pagefile.sys is actively read from and written to).
 

  • To delete pagefile.sys 
     

➤ wmic pagefileset where name="C:\\pagefile.sys" delete

 

If for some reason you prefer not to use wmic but the simple del command, you can also delete pagefile.sys as follows:

Remove the file's default "Hidden" and "System" attributes, which are set for security reasons as it is a system file usually not touched by the user. This will save you from "permission denied" errors:
 

➤ attrib -s -h %systemdrive%\pagefile.sys


Delete the file:
 

➤ del /a /q %systemdrive%\pagefile.sys


3. Disable hibernation on Windows 7 / 8 and Win 10 / 11

Disabling the hibernation file hiberfil.sys can also free up some space, especially if hibernation has been actively used before and the file is filled with data. Of course, that is more common on notebooks.
Windows hibernation has significantly improved over time, though I didn't have a very pleasant experience with it in the past and I prefer to disable it just in case.
 

3.1 Disable Windows 7 / 8 / 10 / 11 hibernation from GUI 

Disable it through:

Control Panel -> All Control Panel Items -> Power Options -> Edit Plan Settings -> Change advanced power settings


 as shown in the screenshot below:

Windqows-power-options-Advanced-settings-Allow-Hybrid-sleep-option-menu-screenshot

 

3.2 Disable Windows 7 / 8 / 10 / 11 hibernation from command line

Disabling hibernation from the command line is done through the powercfg.exe command. To disable it
if you're short of disk space and want to reclaim the space it takes:

run as Administrator in the Windows command line (cmd.exe)
 

powercfg.exe /hibernate off

If you later need to switch on hibernation
 

powercfg.exe /hibernate on


disable-hiberfile-windows-screenshot

3.3 Disable Windows hibernation on legacy Windows XP

On XP to disable hibernation open

1. Power Options Properties
2. Select Hibernate
3. Select Enable Hibernation to clear the checkbox and disable Hibernation mode. 
4. Select OK to apply the change.

Close the Power Options Properties box. 

enable-disable-hibernate-windows-xp-menu-screenshot

To sum it up

We have learned some basics on Windows swapping and hibernation, and I've tried to give some insight on how these files, if misconfigured, could lead to degraded Windows performance. In any case, as of 2022, using an SSD to store both files is a best practice, and for machines that have plenty of memory always try to completely disable / remove the files. It was shown how to manage pagefile.sys and hiberfil.sys across different versions of Windows, both from the GUI and via the command line, as well as how you can configure pagefile.sys to be cleared on PC shutdown or reboot.
 

No space left on device with free disk space / Why no space left on device while there is plenty of disk space on drive – Running out of Inodes

Tuesday, November 17th, 2015

no_space_left-on-device-while-there-is-disk-space-running-out-of-file-inodes-unix_linux_file_system_diagram.gif

 

On one of the servers I administer, the websites started showing some MySQL database table corruption errors like:
 

 

Table './database_name/site_news_list_com' is marked as crashed and last (automatic?) repair failed

The server is using Oracle MySQL server community stable edition on Debian GNU / Linux 6.0, so I first thought the server had crashed during work, either due to some bug in MySQL or due to some PHP cron job that did something messy. Thus, to fix the crashed tables, I tried the mysqlcheck tool, which has helped pretty well many times when there were database / table corruptions. I ran the following set of mysqlcheck commands as root (superuser) in a bash shell after logging in through SSH:

server:~# /usr/bin/mysqlcheck --defaults-extra-file=/etc/mysql/debian.cnf --check --all-databases -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'` >> /var/log/cronwork.log
server:~# /usr/bin/mysqlcheck --defaults-extra-file=/etc/mysql/debian.cnf --analyze --all-databases -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'` >> /var/log/cronwork.log
server:~# /usr/bin/mysqlcheck --defaults-extra-file=/etc/mysql/debian.cnf --auto-repair --optimize --all-databases -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'` >> /var/log/cronwork.log
server:~# /usr/bin/mysqlcheck --defaults-extra-file=/etc/mysql/debian.cnf --optimize --all-databases -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'` >> /var/log/cronwork.log


In order for the above commands to work, I've created /root/.my.cnf containing my root (mysql CLI) username and password, e.g. the file has content like below:

 

[client]
user=root
password=MySecretPassword8821238

 

Btw, a good note here: it's generally a good idea (if you want to have consistent MySQL databases) to execute these checks automatically via a cron job a few times a month (here on the 1st, 5th, 10th, 15th, 20th and 25th). I have the following in root's crontab:

 

crontab -u root -l |grep -i mysqlcheck
04 06 5,10,15,20,25,1 * * /usr/bin/mysqlcheck --defaults-extra-file=/etc/mysql/debian.cnf --check --all-databases --silent -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'` >> /var/log/cronwork.log
07 06 5,10,15,20,25,1 * * /usr/bin/mysqlcheck --defaults-extra-file=/etc/mysql/debian.cnf --analyze --all-databases --silent -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'` >> /var/log/cronwork.log
12 06 5,10,15,20,25,1 * * /usr/bin/mysqlcheck --defaults-extra-file=/etc/mysql/debian.cnf --auto-repair --optimize --all-databases --silent -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'` >> /var/log/cronwork.log
17 06 5,10,15,20,25,1 * * /usr/bin/mysqlcheck --defaults-extra-file=/etc/mysql/debian.cnf --optimize --all-databases --silent -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'` >> /var/log/cronwork.log


Strangely, I got a lot of errors that some .MYI / .MYD / .frm temp files, necessary for the MySQL tables recovery, can't be written inside /home/mysql/database_name

That was pretty weird and I thought there might be some issue with permissions causing the inability to write, due to some bug or something, so I went straight and checked the /home/mysql/database_name permissions, e.g.:

 

server:/home/mysql/database_name# ls -ld soccerfame
drwx------ 2 mysql mysql 36864 Nov 17 12:00 soccerfame
server:/home/mysql/database_name# ls -al1|head -n 10
total 1979012
drwx------ 2 mysql mysql 36864 Nov 17 12:00 .
drwx------ 36 mysql mysql 4096 Nov 17 11:12 ..
-rw-rw---- 1 mysql mysql 8712 Nov 17 10:26 1_campaigns_diez.frm
-rw-rw---- 1 mysql mysql 14672 Jul 8 18:57 1_campaigns_diez.MYD
-rw-rw---- 1 mysql mysql 1024 Nov 17 11:38 1_campaigns_diez.MYI
-rw-rw---- 1 mysql mysql 8938 Nov 17 10:26 1_campaigns.frm
-rw-rw---- 1 mysql mysql 8738 Nov 17 10:26 1_campaigns_logs.frm
-rw-rw---- 1 mysql mysql 883404 Nov 16 22:01 1_campaigns_logs.MYD
-rw-rw---- 1 mysql mysql 330752 Nov 17 11:38 1_campaigns_logs.MYI


As seen from the above output, all was perfect with permissions, so it had to be something else. I decided to try to create a random file with the touch command inside the /home/mysql/database_name directory:

 

touch /home/mysql/database_name/somefile-to-test-writtability.txt
touch: cannot touch ‘/scr1/data/somefile-to-test-writtability.txt’: No space left on device


Then logically I thought the ext4 partition holding /home/mysql/ had filled up because of the crashed SQL database or a bug, thus I checked with the disk free command df whether there was enough space on the server:

server:~# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/md1 20G 7.6G 11G 42% /
udev 10M 0 10M 0% /dev
tmpfs 13G 1.3G 12G 10% /run
tmpfs 32G 0 32G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 32G 0 32G 0% /sys/fs/cgroup
/dev/md2 256G 134G 110G 55% /home

Well, that's weird! Obviously only 55% of the disk space was used (134G) and 110G was still available, which was more than enough, so I got totally puzzled why files couldn't be written.

Then, very logically, I thought the /home partition might have been remounted read-only because the SSD on the server was failing, so I checked for errors in dmesg, i.e.:

 

server:~# dmesg|grep -i error


Also checked how exactly was partition mounted, to check whether it is (RO) read-only:

 

server:~# mount -l|grep -i /home
/dev/md2 on /home type ext4 (rw,relatime,discard,data=ordered)
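The same read-only check can be scripted instead of eyeballing the mount output. A minimal sketch, assuming a Linux system where /proc/mounts is available; the function name is_readonly_mount and its optional second argument (an alternative mounts file, handy for testing) are invented for this example:

```shell
# is_readonly_mount MOUNTPOINT [MOUNTS_FILE]: succeed (exit 0) only if the
# last entry for MOUNTPOINT in the mounts table carries the "ro" option.
is_readonly_mount() {
    mountpoint=$1
    mounts_file=${2:-/proc/mounts}
    awk -v mp="$mountpoint" '
        $2 == mp {
            found = 1; ro = 0
            n = split($4, opts, ",")          # 4th field: mount options
            for (i = 1; i <= n; i++) if (opts[i] == "ro") ro = 1
        }
        END { if (found && ro) exit 0; exit 1 }
    ' "$mounts_file"
}

# Example:
#   is_readonly_mount /home && echo "/home is mounted read-only"
```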


Now everything became even weirder, as the disk obviously kept claiming no space left on device, while in reality there was plenty of disk space.

Then, after some quick research on the internet for no space left on device with free disk space, I came across this great superuser.com thread, which made me realize the partition had run out of inodes; that's why no new file inodes could be assigned and therefore the Linux kernel was refusing to write the file on the ext4 partition.

For those who haven't heard of Linux partition inodes, here is a link to Wikipedia and a quick quote:

 

In a Unix-style file system, the inode is a data structure used to represent a filesystem object, which can be one of various things including a file or a directory. Each inode stores the attributes and disk block location(s) of the filesystem object's data.[1] Filesystem object attributes may include manipulation metadata (e.g. change,[2] access, modify time), as well as owner and permission data (e.g. group-id, user-id, permissions).[3]
Directories are lists of names assigned to inodes. The directory contains an entry for itself, its parent, and each of its children.


Once I understood it was the inodes, I checked how many of them were occupied with the command:

 

server:~# df -i /home
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/md2 17006592 17006592 0 100% /home


You see, there were 0 (zero) free file inodes on the server, and that was the reason for no space left on device while there was actually free disk space.
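Since df -h alone does not reveal this condition, it can be worth watching inode usage as well. Below is a minimal sketch, assuming a df that supports -P and -i (GNU coreutils does); the function names inode_use_pct and warn_inodes and the default threshold of 90 are arbitrary choices for illustration:

```shell
# inode_use_pct: read `df -Pi` output on stdin and print "<IUse%> <mountpoint>"
# (percent sign stripped) for every filesystem that reports inode usage.
inode_use_pct() {
    awk 'NR > 1 && $5 ~ /^[0-9]+%$/ { sub(/%/, "", $5); print $5, $6 }'
}

# warn_inodes [THRESHOLD]: name the mount points at or above THRESHOLD percent
warn_inodes() {
    threshold=${1:-90}
    df -Pi | inode_use_pct |
        awk -v t="$threshold" '$1 >= t { print $2 " inode usage " $1 "%" }'
}

# Example: warn_inodes 95
```

Filesystems that do not report inode counts (df shows "-" in the IUse% column) are simply skipped, and mount points containing spaces are not handled by this sketch.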

To clean up (free) some inodes on the partition, the first thing I did was to delete all old logs inside /home and files I positively knew were not necessary. Then, to find which directories were allocating the most inodes, I used (run from the top directory of the partition, here /home):

 

server:~# find . -xdev -type f | cut -d "/" -f 2 | sort | uniq -c | sort -n


If you're on a regular old fashioned IDE hard drive and not an SSD, or you have too many files inside, this command will take really long …

Therefore a better solution might be to first:

a) Try to find root folders with large inodes count:

for i in /home/*; do echo "$i"; find "$i" | wc -l; done


You should get output like:

 

/home/new_website
606692
/home/common
73
/home/pcfreak
5661
/home/hipo
33
/home/blog
13570
/home/log
123
/home/lost+found
1

b) Then, once you know the directory allocating the most inodes, run the command again on it to see which sub-directories hold the most files (eating the partition's inodes):

 

for i in /home/new_website/*; do echo "$i"; find "$i" | wc -l; done
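The per-directory loop from steps a) and b) can be wrapped into a helper that copes with spaces in names and sorts the result. A minimal sketch (the function name inodes_per_dir is made up here); it counts directory entries, which approximates inode usage since hard links are counted more than once:

```shell
# inodes_per_dir DIR: count filesystem entries (files, directories, symlinks)
# under each immediate subdirectory of DIR and print "<count><TAB><path>",
# with the busiest directory last.
inodes_per_dir() {
    for sub in "$1"/*/; do
        [ -d "$sub" ] || continue          # the glob matched nothing
        n=$(find "$sub" -xdev | wc -l)
        printf '%s\t%s\n' "$((n))" "$sub"  # $((n)) strips any padding from wc
    done | sort -n
}

# Example: inodes_per_dir /home
```

Running it first on /home and then on the biggest directory it reports narrows the search down in two or three rounds.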

 

One usual large folder which could free you some inodes is the Linux kernel source / headers, but in my case it was simply a lot of tiny old logs that had been written on the system for a few years without any cleaning.

I deleted the contents of the log and cache directories, in my case /home/new_website/{log,cache}:

server:~# rm -rf /home/new_website/log/*
server:~# rm -rf /home/new_website/cache/*
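With hundreds of thousands of files in one directory, rm -rf dir/* can fail with "Argument list too long", because the shell expands * into one huge command line. A hedged alternative, sketched with GNU find and a made-up function name purge_old_files, deletes file by file and only touches files older than a chosen number of days:

```shell
# purge_old_files DIR DAYS: delete regular files under DIR that were last
# modified more than DAYS days ago. find removes the files one by one, so
# there is no long argument list for the shell to build.
purge_old_files() {
    find "$1" -xdev -type f -mtime +"$2" -delete
}

# Example: purge_old_files /home/new_website/log 30
# For a dry run, replace -delete with -print inside the function.
```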

 

 

a) Then I stopped the Apache webserver, to prevent it from using the MySQL databases while the database repair was running, and restarted MySQL:
 

server:~# /etc/init.d/apache2 stop
..
server:~# /etc/init.d/mysql restart
Restarting MySQL server
..


b) And re-issued the MySQL check / repair / optimize database commands:
 

 

mysqlcheck --defaults-extra-file=/etc/mysql/debian.cnf --check --all-databases -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'` >> /var/log/cronwork.log

mysqlcheck --defaults-extra-file=/etc/mysql/debian.cnf --analyze --all-databases -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'` >> /var/log/cronwork.log

mysqlcheck --defaults-extra-file=/etc/mysql/debian.cnf --auto-repair --optimize --all-databases -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'` >> /var/log/cronwork.log

mysqlcheck --defaults-extra-file=/etc/mysql/debian.cnf --optimize --all-databases -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'` >> /var/log/cronwork.log

c) And finally started the Apache webserver again:
 

server:~# /etc/init.d/apache2 start


Some inodes got freed up:
 

server:~# df -i /home
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/md2 17006592 16797196 209396 99% /home


And hooray, by God's Grace and with the help of the prayers of The most Holy Theotokos (Virgin) Mary, the websites started working again!

Fix MySQL ibdata file size – ibdata1 file growing too large, preventing ibdata1 from eating all your server disk space

Thursday, April 2nd, 2015

fix-solve-mysql-ibdata-file-size-ibdata1-file-growing-too-large-and-preventing-ibdata1-from-eating-all-your-disk-space-innodb-vs-myisam

If you're a webhosting company hosting dozens of various websites that use MySQL with the InnoDB engine as a backend, you've probably already experienced the annoying problem of MySQL's ibdata1 growing too large, eating all of the server's disk space and triggering low disk space alerts. An ibdata1 file taking up hundreds of gigabytes is likely to be encountered on virtually all Linux distributions which run a default MySQL server <= MySQL 5.6 (with the default distro-shipped my.cnf). The incremental growth of ibdata1 usually appears due to an application software bug in how it queries the database. In theory there is no size limit for ibdata1 except the maximum file size set by the filesystem (and there is no limiting option set in my.cnf), meaning it is quite possible that under certain conditions ibdata1 grows over time and happily fills up your server's LVM (storage) drive partitions.

Unfortunately there is no way to shrink the ibdata1 file and only known work around (I found) is to set innodb_file_per_table option in my.cnf to force the MySQL server create separate *.ibd files under datadir (my.cnf variable) for each freshly created InnoDB table.
 

1. Checking size of ibdata1 file

On Debian / Ubuntu and other deb based Linux servers datadir is /var/lib/mysql, so the file is /var/lib/mysql/ibdata1

server:~# du -hsc /var/lib/mysql/ibdata1
45G     /var/lib/mysql/ibdata1
45G     total


2. Checking info about Databases and Innodb storage Engine

server:~# mysql -u root -p
password:

mysql> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| bible              |
| blog               |
| blog-sezoni        |
| blogmonastery      |
| daniel             |
| ezmlm              |
| flash-games        |


Next step is to get some understanding about how many existing InnoDB tables are present within Database server:

 

mysql> SELECT COUNT(1) EngineCount,engine FROM information_schema.tables WHERE table_schema NOT IN ('information_schema','performance_schema','mysql') GROUP BY engine;
+-------------+--------+
| EngineCount | engine |
+-------------+--------+
|         131 | InnoDB |
|           5 | MEMORY |
|         584 | MyISAM |
+-------------+--------+
3 rows in set (0.02 sec)

To get some more statistics on the InnoDB variables set on the SQL server:
 

mysqladmin -u root -p'Your-Server-Password' var | grep innodb


Here is also how to find which tables use the InnoDB engine:

mysql> SELECT table_schema, table_name
    -> FROM INFORMATION_SCHEMA.TABLES
    -> WHERE engine = 'innodb';

+--------------+--------------------------+
| table_schema | table_name               |
+--------------+--------------------------+
| blog         | wp_blc_filters           |
| blog         | wp_blc_instances         |
| blog         | wp_blc_links             |
| blog         | wp_blc_synch             |
| blog         | wp_likes                 |
| blog         | wp_wpx_logs              |
| blog-sezoni  | wp_likes                 |
| icanga_web   | cronk                    |
| icanga_web   | cronk_category           |
| icanga_web   | cronk_category_cronk     |
| icanga_web   | cronk_principal_category |
| icanga_web   | cronk_principal_cronk    |


3. Check and Stop any Web / Mail / DNS service using MySQL

server:~# ps -efl |grep -E 'apache|nginx|dovecot|bind|radius|postfix' |grep -v grep

Stop any of the services found. Afterwards the above cmd should return empty output (e.g. Apache / Nginx / Postfix / Radius / Dovecot / DNS etc. services are properly stopped on the server).

4. Create Backup dump all MySQL tables with mysqldump

Next step is to create a full backup dump of all current MySQL databases (with mysqldump):

server:~# mysqldump --opt --allow-keywords --add-drop-table --all-databases --events -u root -p > dump.sql
server:~# du -hsc /root/dump.sql
940M    dump.sql
940M    total

 

If you have free space on an external backup server or on remotely mounted storage (NFS or SAN), it is a good idea to also make a full binary copy of the MySQL data (just in case something goes wrong with the above SQL dump). Copy the respective directory, which depends on the Linux distro and the install location of the SQL binary files (set in my.cnf).
To check where the MySQL binary database data is stored (check in my.cnf):

server:~# grep -i datadir /etc/mysql/my.cnf
datadir         = /var/lib/mysql

If the server is CentOS / RHEL / Fedora (RPM based), substitute /etc/mysql/my.cnf in the above grep cmd line with /etc/my.cnf
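
If you have to run the same check on a mix of deb and RPM based servers, the lookup can be wrapped in a small shell function. This is just a sketch under my own assumptions: the name find_datadir is made up, and it only understands a plain datadir = /path line in the first readable file it is given (no !include handling):

```shell
#!/bin/sh
# find_datadir (hypothetical helper): prints the datadir value found in the
# first readable config file among the paths passed as arguments.
find_datadir() {
    for cnf in "$@"; do
        [ -r "$cnf" ] || continue
        # match "datadir = /path"; lines commented out with # do not match
        sed -n 's/^[[:space:]]*datadir[[:space:]]*=[[:space:]]*//p' "$cnf" | head -n 1
        return 0
    done
    echo "no readable my.cnf found" >&2
    return 1
}
```

A typical call would be find_datadir /etc/mysql/my.cnf /etc/my.cnf, which tries the Debian location first and then the RPM one.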

If you're on Debian / Ubuntu:

server:~# /etc/init.d/mysql stop
server:~# cp -rpfv /var/lib/mysql /root/mysql-data-backup

Once the above copy completes, start MySQL again (/etc/init.d/mysql start) and DROP all databases except mysql, which stores the existing MySQL users / passwords, access grants and host permissions, and the information_schema system schema

5. Drop All databases except mysql and information_schema

server:~# mysql -u root -p
password:

 

mysql> SHOW DATABASES;

DROP DATABASE blog;
DROP DATABASE sessions;
DROP DATABASE wordpress;
DROP DATABASE micropcfreak;
DROP DATABASE statusnet;

          etc. etc.

ACHTUNG !!! DON'T execute DROP DATABASE mysql; or DROP DATABASE information_schema; !!! This might damage your user permissions to databases
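
With dozens of databases, typing every DROP by hand is error prone, so the statements can be generated instead. The following is a minimal sketch and not something I ran on the server above: gen_drop_statements is a hypothetical helper name, and the list of system schemas it skips (mysql, information_schema, performance_schema, sys) is my assumption about what a given MySQL version ships with. The names are backtick-quoted because databases with a hyphen in the name (like blog-sezoni) cannot be dropped unquoted:

```shell
#!/bin/sh
# gen_drop_statements (hypothetical helper): reads database names on stdin,
# one per line, and prints a DROP DATABASE statement for each one that is
# not a MySQL system schema.
gen_drop_statements() {
    grep -v -x -E 'mysql|information_schema|performance_schema|sys' |
    while IFS= read -r db; do
        if [ -n "$db" ]; then
            printf 'DROP DATABASE `%s`;\n' "$db"
        fi
    done
}

# Example with a few of the database names listed earlier
printf '%s\n' information_schema blog blog-sezoni mysql | gen_drop_statements
```

In real use the input would come from something like mysql -N -u root -p -e 'SHOW DATABASES', with the output saved to a file and reviewed by eye before it is fed back to the mysql client.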

6. Stop MySQL server and add innodb_file_per_table and a few more settings to prevent ibdata1 from growing infinitely in future

server:~# /etc/init.d/mysql stop

server:~# vim /etc/mysql/my.cnf
[mysqld]
innodb_file_per_table
innodb_flush_method=O_DIRECT
innodb_log_file_size=1G
innodb_buffer_pool_size=4G

Delete the files taking up too much space: ibdata1, ib_logfile0 and ib_logfile1

server:~# cd /var/lib/mysql/
server:~#  rm -f ibdata1 ib_logfile0 ib_logfile1
server:~# /etc/init.d/mysql start
server:~# /etc/init.d/mysql stop
server:~# /etc/init.d/mysql start
server:~# ps ax |grep -i mysql

 

As MySQL was started again by the last init script call, the above ps command should now list the freshly started MySQL instance (mysqld processes).
 

7. Re-Import previously dumped SQL databases with mysql cli client

server:~# cd /root/
server:~# mysql -u root -p < dump.sql

Hopefully the import should go fine, and if no errors are experienced the new data should be in.

Alternatively, if your database is too big and you want to import it in less time to reduce SQL downtime, import the database with:

server:~# mysql -u root -p
password:
mysql>  SET FOREIGN_KEY_CHECKS=0;
mysql> SOURCE /root/dump.sql;
mysql> SET FOREIGN_KEY_CHECKS=1;

 

If something goes wrong with the import for some reason, you can always stop MySQL and copy the SQL binary files back from /root/mysql-data-backup/ to /var/lib/mysql/
 

8. Connect to mysql and check whether databases are listable and re-check ibdata file size

Once imported, log in with the mysql cli and check whether the databases are there with:

server:~# mysql -u root -p
mysql> SHOW DATABASES;

Next let's see what the current size of ibdata1, ib_logfile0 and ib_logfile1 is:
 

server:~# du -hsc /var/lib/mysql/{ibdata1,ib_logfile0,ib_logfile1}
19M     /var/lib/mysql/ibdata1
1,1G    /var/lib/mysql/ib_logfile0
1,1G    /var/lib/mysql/ib_logfile1
2,1G    total

From now on ibdata1 will still grow, but it will mainly contain table metadata and InnoDB internal data (such as undo logs). Each InnoDB table will exist outside of ibdata1.
To better understand what I mean, let's say you have an InnoDB table named blogdb.mytable.
If you go into /var/lib/mysql/blogdb, you will see two files
representing the table:

  •     mytable.frm (Storage Engine Header)
  •     mytable.ibd (Home of Table Data and Table Indexes for blogdb.mytable)

The layout will now be like that for each InnoDB table in each of the stored MySQL databases, instead of everything going into ibdata1.
MySQL 5.6+ admins can relax, as innodb_file_per_table is enabled by default in newer MySQL releases.


Now, to make sure your websites are working, take a few URLs of hosted websites that use any of the imported databases and just browse them.
In my case ibdata1 was 45GB; after cleaning it up I managed to save 43 GB of disk space!!!

Enjoy the disk saving! 🙂

Manually deleting spam comments from WordPress blogs and websites to free disk space and optimize MySQL

Monday, November 24th, 2014

If you're a web-hosting company or a web-developer using WordPress to build multitudes of customer blogs, or just an independent blogger or sys-admin with the task of optimizing a server's MySQL allocated storage / performance on myriads of WordPress installs, a good tip that would help is to remove the wp_comments records marked as spam.

Even though sites might be protected from the thousands of spam messages caught daily by the WP anti-spam plugin Akismet, the caught spam messages are forwarded by Akismet to WP's Spam filter and kept in the wp_comments table, with the comment_approved column set to 'spam'.

Therefore you will certainly gain from freeing the disk space uselessly allocated by spam messages in the current MySQL server storage dir (/var/lib/mysql or /usr/local/mysql/data, the directory where my.cnf tells the server to keep its binary data: .MYI, .MYD, .frm files), and you will also save a lot of disk space by excluding the useless spam messages from the daily SQL backup archives.

Here is how to manually remove spam comments from a WordPress blog stored in a database (wp_blog1):

mysql> use wp_blog1;
mysql> describe wp_comments;
+----------------------+---------------------+------+-----+---------------------+----------------+
| Field                | Type                | Null | Key | Default             | Extra          |
+----------------------+---------------------+------+-----+---------------------+----------------+
| comment_ID           | bigint(20) unsigned | NO   | PRI | NULL                | auto_increment |
| comment_post_ID      | bigint(20) unsigned | NO   | MUL | 0                   |                |
| comment_author       | tinytext            | NO   |     | NULL                |                |
| comment_author_email | varchar(100)        | NO   |     |                     |                |
| comment_author_url   | varchar(200)        | NO   |     |                     |                |
| comment_author_IP    | varchar(100)        | NO   |     |                     |                |
| comment_date         | datetime            | NO   |     | 0000-00-00 00:00:00 |                |
| comment_date_gmt     | datetime            | NO   | MUL | 0000-00-00 00:00:00 |                |
| comment_content      | text                | NO   |     | NULL                |                |
| comment_karma        | int(11)             | NO   |     | 0                   |                |
| comment_approved     | varchar(20)         | NO   | MUL | 1                   |                |
| comment_agent        | varchar(255)        | NO   |     |                     |                |
| comment_type         | varchar(20)         | NO   |     |                     |                |
| comment_parent       | bigint(20) unsigned | NO   | MUL | 0                   |                |
| user_id              | bigint(20) unsigned | NO   |     | 0                   |                |
+----------------------+---------------------+------+-----+---------------------+----------------+


The most common and quick way, useful for scripting (when you have to do it for multiple blogs with separate dbs), is to delete all comments filed as 'Spam'.

To delete all messages which were filed by Akismet's spam filter as being spam with high probability, issue from the mysql cli interface:

DELETE FROM wp_comments WHERE comment_approved = 'spam';
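
For the scripting case with multiple blogs in separate dbs, the same statement can be generated per database from the shell. This is only a rough sketch: gen_spam_cleanup is a hypothetical helper name, wp_blog2 is a made-up second database, and the default wp_ table prefix is assumed for every blog:

```shell
#!/bin/sh
# gen_spam_cleanup (hypothetical helper): for every database name given as
# an argument, print the spam DELETE statement qualified with that database.
# Assumes the default wp_ table prefix.
gen_spam_cleanup() {
    for db in "$@"; do
        printf "DELETE FROM \`%s\`.wp_comments WHERE comment_approved = 'spam';\n" "$db"
    done
}

# Example: wp_blog1 is the db used above, wp_blog2 is a made-up second one
gen_spam_cleanup wp_blog1 wp_blog2
```

The generated lines can be reviewed and then piped to mysql -u root -p. Blogs installed with a custom table prefix would need the table name adjusted.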


For regular (non-spam) messages the value of the comment_approved field is 0 or 1: 0 if the comment is still unread and waiting for approval, and 1 if the comment has been read and approved.
If a WordPress gets heavily hammered with mainly spam, so the probability that an unapproved message is something other than spam is low, and you want to delete every message still waiting for approval, use the following SQL query:

DELETE FROM wp_comments WHERE comment_approved = 0;

Another, not very common, thing you might want to do is delete all approved comments:

DELETE FROM wp_comments WHERE comment_approved = 1;

For blogs installed long ago and unmaintained for a long time (with garbage content), it is very likely that 99% of the messages are spam. If there are already >= 100 000 spam messages, you don't have the time to inspect them all just to rescue some 1000 legitimate ones, and you want to completely delete all WordPress comments for a blog in one SQL query, use:

TRUNCATE wp_comments;

Another scenario: if you know a blog was maintained until a certain date and its comments were inspected, and then it was left unmaintained for a few years without any spam detect and clear plugin like Akismet, it's worth deleting all comments posted since the date the WordPress site stopped being maintained:

DELETE FROM wp_comments WHERE comment_date > '2008-11-20 05:00:10' AND comment_date <= '2014-11-24 00:30:00';

Check your GNU / Linux Desktop for all used “Evil” Non-free ( proprietary ) Software with VRMS

Wednesday, June 26th, 2013


If you want to be strict on using only Free Software (free as in freedom), just like Richard Stallman, you will be happy to know there is a tool in Linux called Virtual Richard Stallman ( vrms: report of installed non-free software ) 🙂

On launch vrms simply lists all software and software documentation installed on Debian GNU / Linux that is not under a 100% free software / GPL compatible license. This is software installed from the non-free and contrib Debian package repositories, or otherwise not sticking to the standards of the Debian Free Software Guidelines. Of course living with 100% free software is only for the hard core free software evangelists, and there is rarely someone who can use a computer on a daily basis without some bits of proprietary software like flashplugin-nonfree, Skype, rar, unrar. I tried living on 100% free software for a while but didn't succeed, because some non-free software is still a must in order not to detach from the "Digital Society". Living on only free software is not easy, especially if you want to have normal multimedia stuff on the Desktop. Anyways, even if you don't plan to purge your non-free software, vrms is useful to list what non-free software is installed on the PC.

noah:~# apt-cache show vrms|grep -i description

Description-en: virtual Richard M. Stallman
 The vrms program will analyze the set of currently-installed packages
 on a Debian-based system, and report all of the packages from the
 non-free and contrib trees which are currently installed.
 .

Install vrms with:

noah:~# apt-get install --yes vrms

 

Reading package lists… Done
Building dependency tree      
Reading state information… Done
The following packages were automatically installed and are no longer required:
  liboggkate1 xulrunner-10.0
Use 'apt-get autoremove' to remove them.
The following NEW packages will be installed:
  vrms
0 upgraded, 1 newly installed, 0 to remove and 101 not upgraded.
Need to get 0 B/13.0 kB of archives.
After this operation, 102 kB of additional disk space will be used.
Retrieving bug reports… Done
Parsing Found/Fixed information… Done
Selecting previously unselected package vrms.
(Reading database … 226672 files and directories currently installed.)
Unpacking vrms (from …/apt/archives/vrms_1.16_all.deb) …
Processing triggers for man-db …
Setting up vrms (1.16) …

  Below is a list of all non-free software installed on my Debian 7 Thinkpad:

noah:~# vrms

                Non-free packages installed on noah

acroread                            Adobe Acrobat Reader: Portable Document Format file vi
acroread-data                       data files for acroread
acroread-dictionary-en              English dictionary for for acroread
acroread-escript                    Adobe EScript Plug-In
acroread-l10n-en                    English language package for acroread
firmware-iwlwifi                    Binary firmware for Intel PRO/Wireless 3945 and 802.11
frogatto-data                       2D platformer game starring a quixotic frog
mame                                Multiple Arcade Machine Emulator (MAME)
mame-tools                          Tools for MAME and MESS
mess                                Multi Emulator Super System (MESS)
mess-data                           Data files for the Multi Emulator Super System (MESS)
mozilla-acroread                    Adobe Acrobat(R) Reader plugin for mozilla / konqueror
nikto                               web server security scanner
opera                               Fast and secure web browser and Internet suite
rar                                 Archiver for .rar files
skype                               Skype
teamviewer                          TeamViewer (Remote Control Application)
unrar                               Unarchiver for .rar files (non-free version)
xmame-tools                         Transitional package for mame-tools

                Contrib packages installed on noah

cbedic                              Text-mode Bulgarian/English Dictionary
dosemu                              DOS Emulator for Linux
flashplugin-nonfree                 Adobe Flash Player – browser plugin
frogatto                            2D platformer game starring a quixotic frog
gnome-video-arcade                  Simple MAME frontend
mess-desktop-entries                Desktop entries for MESS ROMs
ttf-mscorefonts-installer           Installer for Microsoft TrueType core fonts
winetricks                          package manager for WINE to install software easily

     Contrib packages with status other than installed on noah

gxmame                              ( dei)  GTK XMame frontend

  19 non-free packages, 0.8% of 2531 installed packages.
  9 contrib packages, 0.4% of 2531 installed packages.

 

If you want to go the Stallman way and be a 100% Free Software user, go free and purge all "evil" non-free software 🙂 Issue:

# for i in $(vrms -q|grep -v 'Contrib packages'|grep -v 'Non-free'|awk '{ print $1 }' | awk 'NF'); \
do \
apt-get remove --yes $i; dpkg --purge $i; done
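
Before purging anything it is worth looking at the exact package names the loop is going to receive. The sketch below is an alternative way to pull them out of the report; it assumes the vrms output layout shown above (package lines start with the package name, section headers start with a capital letter or are indented, summary lines end with "installed packages."). The function name vrms_package_names is made up for this example:

```shell
#!/bin/sh
# vrms_package_names (hypothetical helper): reads a vrms report on stdin and
# prints only the package names, one per line.
vrms_package_names() {
    awk '
        /installed packages\.$/ { next }             # skip the summary lines
        /^[a-z0-9][a-z0-9+.-]+[ \t]/ { print $1 }    # lines starting with a package name
    '
}
```

Running vrms | vrms_package_names gives a plain list that can be checked by eye and only then handed to apt-get remove.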

 

Creating data backups on Debian and Ubuntu servers with Bacula professional backup tool

Wednesday, April 17th, 2013


1. Install Bacula Backup System

root@pcfreak:~# apt-cache show bacula |grep -i description -A 5

Description: network backup, recovery and verification – meta-package
 Bacula is a set of programs to manage backup, recovery and verification
 of computer data across a network of computers of different kinds.
 .
 It is efficient and relatively easy to use, while offering many advanced
 storage management features that make it easy to find and recover lost or
 damaged files. Due to its modular design, Bacula is scalable from small
 single computer systems to networks of hundreds of machines.
 .

root@pcfreak:~# apt-get install bacula

 

Reading package lists… Done
Building dependency tree      
Reading state information… Done
The following extra packages will be installed:
  bacula-client bacula-common bacula-common-sqlite3 bacula-console bacula-director-common bacula-director-sqlite3 bacula-fd bacula-sd
  bacula-sd-sqlite3 bacula-server bacula-traymonitor libsqlite0 mt-st mtx sqlite sqlite3
Suggested packages:
  bacula-doc dds2tar scsitools sg3-utils kde gnome-desktop-environment sqlite-doc sqlite3-doc
The following NEW packages will be installed:
  bacula bacula-client bacula-common bacula-common-sqlite3 bacula-console bacula-director-common bacula-director-sqlite3 bacula-fd bacula-sd
  bacula-sd-sqlite3 bacula-server bacula-traymonitor libsqlite0 mt-st mtx sqlite sqlite3
0 upgraded, 17 newly installed, 0 to remove and 0 not upgraded.
2 not fully installed or removed.
Need to get 2,859 kB of archives.
After this operation, 6,992 kB of additional disk space will be used.
Do you want to continue [Y/n]? Y
Get:1 http://security.debian.org/ squeeze/updates/main bacula-common amd64 5.0.2-2.2+squeeze1 [637 kB]
Get:2 http://security.debian.org/ squeeze/updates/main bacula-common-sqlite3 amd64 5.0.2-2.2+squeeze1 [102 kB]
Get:3 http://security.debian.org/ squeeze/updates/main bacula-console amd64 5.0.2-2.2+squeeze1 [67.6 kB]
Get:4 http://security.debian.org/ squeeze/updates/main bacula-director-common amd64 5.0.2-2.2+squeeze1 [56.6 kB]
Get:5 http://security.debian.org/ squeeze/updates/main bacula-director-sqlite3 amd64 5.0.2-2.2+squeeze1 [308 kB]
Get:6 http://security.debian.org/ squeeze/updates/main bacula-sd amd64 5.0.2-2.2+squeeze1 [459 kB]
Get:7 http://security.debian.org/ squeeze/updates/main bacula-sd-sqlite3 amd64 5.0.2-2.2+squeeze1 [435 kB]
Get:8 http://security.debian.org/ squeeze/updates/main bacula-server all 5.0.2-2.2+squeeze1 [48.5 kB]
Get:9 http://security.debian.org/ squeeze/updates/main bacula-fd amd64 5.0.2-2.2+squeeze1 [124 kB]
Get:10 http://security.debian.org/ squeeze/updates/main bacula-client all 5.0.2-2.2+squeeze1 [48.5 kB]
Get:11 http://security.debian.org/ squeeze/updates/main bacula all 5.0.2-2.2+squeeze1 [1,030 B]
Get:12 http://security.debian.org/ squeeze/updates/main bacula-traymonitor amd64 5.0.2-2.2+squeeze1 [70.0 kB]
Get:13 http://ftp.uk.debian.org/debian/ squeeze/main sqlite3 amd64 3.7.3-1 [100 kB]
Get:14 http://ftp.uk.debian.org/debian/ squeeze/main libsqlite0 amd64 2.8.17-6 [188 kB]
Get:15 http://ftp.uk.debian.org/debian/ squeeze/main sqlite amd64 2.8.17-6 [22.0 kB]
Get:16 http://ftp.uk.debian.org/debian/ squeeze/main mtx amd64 1.3.12-3 [154 kB]
Get:17 http://ftp.uk.debian.org/debian/ squeeze/main mt-st amd64 1.1-4 [35.6 kB]                                                            
Fetched 2,859 kB in 6s (471 kB/s)                                                                                                           
Selecting previously deselected package bacula-common.
(Reading database … 86693 files and directories currently installed.)
Unpacking bacula-common (from …/bacula-common_5.0.2-2.2+squeeze1_amd64.deb) …
Adding user 'bacula'… Ok.
Selecting previously deselected package bacula-common-sqlite3.
Unpacking bacula-common-sqlite3 (from …/bacula-common-sqlite3_5.0.2-2.2+squeeze1_amd64.deb) …
Selecting previously deselected package bacula-console.
Unpacking bacula-console (from …/bacula-console_5.0.2-2.2+squeeze1_amd64.deb) …
Processing triggers for man-db …
Setting up bacula-common (5.0.2-2.2+squeeze1) …
Selecting previously deselected package bacula-director-common.
(Reading database … 86860 files and directories currently installed.)
Unpacking bacula-director-common (from …/bacula-director-common_5.0.2-2.2+squeeze1_amd64.deb) …
Selecting previously deselected package sqlite3.
Unpacking sqlite3 (from …/sqlite3_3.7.3-1_amd64.deb) …
Selecting previously deselected package libsqlite0.
Unpacking libsqlite0 (from …/libsqlite0_2.8.17-6_amd64.deb) …
Selecting previously deselected package sqlite.
Unpacking sqlite (from …/sqlite_2.8.17-6_amd64.deb) …
Selecting previously deselected package bacula-director-sqlite3.
Unpacking bacula-director-sqlite3 (from …/bacula-director-sqlite3_5.0.2-2.2+squeeze1_amd64.deb) …
Selecting previously deselected package mtx.
Unpacking mtx (from …/mtx_1.3.12-3_amd64.deb) …
Selecting previously deselected package bacula-sd.
Unpacking bacula-sd (from …/bacula-sd_5.0.2-2.2+squeeze1_amd64.deb) …
Selecting previously deselected package bacula-sd-sqlite3.
Unpacking bacula-sd-sqlite3 (from …/bacula-sd-sqlite3_5.0.2-2.2+squeeze1_amd64.deb) …
Selecting previously deselected package bacula-server.
Unpacking bacula-server (from …/bacula-server_5.0.2-2.2+squeeze1_all.deb) …
Selecting previously deselected package bacula-fd.
Unpacking bacula-fd (from …/bacula-fd_5.0.2-2.2+squeeze1_amd64.deb) …
Selecting previously deselected package bacula-client.
Unpacking bacula-client (from …/bacula-client_5.0.2-2.2+squeeze1_all.deb) …
Selecting previously deselected package bacula.
Unpacking bacula (from …/bacula_5.0.2-2.2+squeeze1_all.deb) …
Selecting previously deselected package bacula-traymonitor.
Unpacking bacula-traymonitor (from …/bacula-traymonitor_5.0.2-2.2+squeeze1_amd64.deb) …
Selecting previously deselected package mt-st.
Unpacking mt-st (from …/archives/mt-st_1.1-4_amd64.deb) …
Processing triggers for man-db …
Setting up acct (6.5.4-2.1) …
Setting up bacula-director-common (5.0.2-2.2+squeeze1) …
Setting up bacula-director-sqlite3 (5.0.2-2.2+squeeze1) …
config: Running dbc_go bacula-director-sqlite3 configure
Stopping Bacula Director…:.
 *** Checking type of existing DB at /var/lib/bacula/bacula.db: None
 *** Will create new database at this location.
dbconfig-common: writing config to /etc/dbconfig-common/bacula-director-sqlite3.conf

Creating config file /etc/dbconfig-common/bacula-director-sqlite3.conf with new version
creating database bacula.db: success.
verifying database bacula.db exists: success.
populating database via sql…  done.
Processing configuration…Ok.
Starting Bacula Director…:.
Setting up bacula-sd (5.0.2-2.2+squeeze1) …
Starting Bacula Storage daemon…:.
Setting up acct (6.5.4-2.1) …
insserv: warning: script 'K02courier-imap' missing LSB tags and overrides
insserv: script iptables: service skeleton already provided!
insserv: warning: script 'courier-imap' missing LSB tags and overrides
Turning on process accounting, file set to '/var/log/account/pacct'.
Done..
Setting up bacula-sd-sqlite3 (5.0.2-2.2+squeeze1) …
Setting up bacula-server (5.0.2-2.2+squeeze1) …
Setting up bacula-fd (5.0.2-2.2+squeeze1) …
Starting Bacula File daemon…:.
Setting up bacula-client (5.0.2-2.2+squeeze1) …
Setting up bacula (5.0.2-2.2+squeeze1) …
Setting up proftpd-basic (1.3.3a-6squeeze6) …
Starting ftp server: proftpd.
Setting up mt-st (1.1-4) …
update-alternatives: using /bin/mt-st to provide /bin/mt (mt) in auto mode.
 

 

Once installed you will have 3 processes running in the background used by the Bacula backup system (bacula-dir, bacula-sd and bacula-fd):
root@pcfreak:~# ps ax |grep -i bacula|grep -v grep
6044 ? Ssl 0:00 /usr/sbin/bacula-dir -c /etc/bacula/bacula-dir.conf -u bacula -g bacula
6089 ? Ssl 0:00 /usr/sbin/bacula-sd -c /etc/bacula/bacula-sd.conf -u bacula -g tape
6167 ? Ssl 0:00 /usr/sbin/bacula-fd -c /etc/bacula/bacula-fd.conf

Here is what each of them does:

a) Bacula-dir or Bacula-Director is the main Bacula Backup system component. Bacula-dir controls the whole backup system and the other 2 daemons, Bacula-FD and Bacula-SD.

b) Bacula-fd (Bacula File Daemon) acts as the interface between the Bacula network backup system and the filesystems to be backed up: it is responsible for reading / writing / verifying the files to be backed up / verified / restored. Network transfer can optionally be compressed.

c) Bacula-sd (Bacula Storage Daemon) acts as the interface between the Bacula network backup system and the Tape Drive or filesystem where backups will be stored.

Each of the 3 processes bacula-dir, bacula-fd and bacula-sd has its own init script in /etc/init.d/, e.g.:

# /etc/init.d/bacula-director
# /etc/init.d/bacula-fd
# /etc/init.d/bacula-sd

2. Configuring Bacula Backup System

Configuring Bacula is done via configuration files located in /etc/bacula

root@pcfreak:~# cd /etc/bacula
root@pcfreak:/etc/bacula# ls -1
bacula-dir.conf
bacula-fd.conf
bacula-fd.conf.dist
bacula-sd.conf
bacula-sd.conf.dist
bconsole.conf
common_default_passwords
scripts/
tray-monitor.conf

3. Defining what needs to be backed up

Here is a short description of the most important configuration blocks in Bacula's main config bacula-dir.conf:
 

1. Director resource defines the Director’s parameters. Name, Password, WorkingDirectory, and PidDirectory must be set. QueryFile specifies where the Director can find the SQL queries.

2. Job defines a backup or restore to perform. You will need at least one job per client. To simplify configuration of similar clients, create a common JobDefs resource and refer to it from within a Job. For example, if you have one set of defaults for desktops and another set for servers, you can create a Desktop and Server (these names are arbitrary and set with the Name attribute) JobDefs and refer to those two collections of settings from a Job.

3. Schedule resource is referred to within a Job to allow it to occur automatically.

4. FileSet resource defines which files are to be backed up. You can both Include and Exclude files.

5. Each Client resource details the clients that this Director can back up.

6. Storage resource specifies the storage daemon available to the Director.

7. Pool identifies a set of storage volumes (tapes/files) that Bacula can write data to. Each Pool can be configured to use different sets of tapes for different jobs.

8. Catalog resource defines Bacula catalog (database) to be used.

9. Messages resource captures where to send messages and which messages to send.
 

a) Defining directories to be backed up

Defining what needs to be backed up is done through bacula-dir.conf ( /etc/bacula/bacula-dir.conf ). In the file there is a FileSet section, where the dirs to be backed up have to be included. The below config defines backup of the /usr/sbin, /root, /etc, /usr and /var directories (note that /usr already covers /usr/sbin):
 

# List of files to be backed up
FileSet {
  Name = "Full Set"
  Include {
    Options {
      signature = MD5
    }
#   
#  Put your list of files here, preceded by 'File =', one per line
#    or include an external list with:
#
#    File = <file-name
#
#  Note: / backs up everything on the root partition.
#    if you have other partitions such as /usr or /home
#    you will probably want to add them too.
#
#  By default this is defined to point to the Bacula binary
#    directory to give a reasonable FileSet to backup to
#    disk storage during initial testing.
#
    File = /usr/sbin
    File = /root
    File = /etc
    File = /usr
    File = /var

  }
}
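
A quick sanity check after editing the FileSet is to list the paths Bacula has been told to back up and see whether they all exist on disk. Below is a minimal sketch, assuming one File = entry per line as in the config above; list_fileset_paths and check_fileset_paths are hypothetical helper names of mine and not part of Bacula:

```shell
#!/bin/sh
# list_fileset_paths (hypothetical helper): prints the value of every
# uncommented "File = ..." line in the given Bacula config file,
# with optional surrounding double quotes removed.
list_fileset_paths() {
    sed -n '/^[[:space:]]*File[[:space:]]*=/{
        s/^[^=]*=[[:space:]]*//
        s/^"//
        s/"[[:space:]]*$//
        p
    }' "$1"
}

# check_fileset_paths (hypothetical helper): reports listed paths that do
# not exist on this machine.
check_fileset_paths() {
    list_fileset_paths "$1" | while IFS= read -r path; do
        if [ ! -e "$path" ]; then
            echo "missing: $path"
        fi
    done
}
```

Calling check_fileset_paths /etc/bacula/bacula-dir.conf prints nothing when every listed path exists. Note that it goes through all File = lines in the file, so entries from other FileSet sections are checked as well.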

b) Defining where to store back ups

All configuration of where Bacula will store created backups is done through /etc/bacula/bacula-sd.conf

There are a few configurations that need to be tuned according to your own needs; below I paste them from the config:
 

Storage {                             # definition of myself
  Name = pcfreak-sd
  SDPort = 9103                  # Director's port     
  WorkingDirectory = "/var/lib/bacula"
  Pid Directory = "/var/run/bacula"
  Maximum Concurrent Jobs = 20
  SDAddress = 127.0.0.1
}

Device {
  Name = FileStorage
  Media Type = File
  Archive Device = /nonexistant/path/to/file/archive/dir
  LabelMedia = yes;                   # lets Bacula label unlabeled media
  Random Access = Yes;
  AutomaticMount = yes;               # when device opened, read it
  RemovableMedia = no;
  AlwaysOpen = no;
}

Messages {
  Name = Standard
  director = pcfreak-dir = all

}

 

Storage: sets the working directory where temporary backups are created at backup creation time; default is /var/lib/bacula

Device: defines the exact directory where backups will be stored once created; usually this is a directory with a hard disk mounted specially for backups. Bacula default is /nonexistant/path/to/file/archive/dir

Messages: configures where and what kind of messages are sent on Bacula operations

c) Configuring Bacula to create backups via network

Whether Bacula acts only on the server's localhost, or binds to a network IP and is visible to store backups via network, is configured in Bacula-FD (Bacula File Daemon).

By default it listens on localhost (127.0.0.1). Bacula-FD configuration is done in /etc/bacula/bacula-fd.conf. The most important section, configuring where Bacula listens, is named FileDaemon.
 

#
# "Global" File daemon configuration specifications
#
FileDaemon {                          # this is me
  Name = pcfreak-fd
  FDport = 9102                  # where we listen for the director
  WorkingDirectory = /var/lib/bacula
  Pid Directory = /var/run/bacula
  Maximum Concurrent Jobs = 20
  FDAddress = 127.0.0.1
}
 

 

By commenting out FDAddress, Bacula will listen on all configured addresses, including the external IP configured on lan interface eth0

4. Managing Bacula Command Line Interface (bconsole)

Managing bacula interactively is done through bconsole (Bacula's Management Console) command.

root@pcfreak:~# bconsole

Connecting to Director localhost:9101
1000 OK: pcfreak-dir Version: 5.0.2 (28 April 2010)
Enter a period to cancel a command.
*
*help
  Command       Description
  =======       ===========
  add           Add media to a pool
  autodisplay   Autodisplay console messages
  automount     Automount after label
  cancel        Cancel a job
  create        Create DB Pool from resource
  delete        Delete volume, pool or job
  disable       Disable a job
  enable        Enable a job
  estimate      Performs FileSet estimate, listing gives full listing
  exit          Terminate Bconsole session
  gui           Non-interactive gui mode
  help          Print help on specific command
  label         Label a tape
  list          List objects from catalog
  llist         Full or long list like list command
  messages      Display pending messages
  memory        Print current memory usage
  mount         Mount storage
  prune         Prune expired records from catalog
  purge         Purge records from catalog
  python        Python control commands
  quit          Terminate Bconsole session
  query         Query catalog
  restore       Restore files
  relabel       Relabel a tape
  release       Release storage
  reload        Reload conf file
  run           Run a job
  status        Report status
  setdebug      Sets debug level
  setip         Sets new client address — if authorized
  show          Show resource records
  sqlquery      Use SQL to query catalog
  time          Print current time
  trace         Turn on/off trace to file
  unmount       Unmount storage
  umount        Umount – for old-time Unix guys, see unmount
  update        Update volume, pool or stats
  use           Use catalog xxx
  var           Does variable expansion
  version       Print Director version
  wait          Wait until no jobs are running

When at a prompt, entering a period cancels the command.

You have messages.
*
 

When run, bconsole launches another process, bacula-console.

root@pcfreak:~# ps ax |grep -i bacula-console|grep -v grep
13959 pts/5 Sl+ 0:00 /usr/sbin/bacula-console -c /etc/bacula/bconsole.conf

Communication between the Bacula processes goes over 4 paths, which use three TCP/IP ports:

a) Communication from bconsole to Bacula (bacula-dir) is through port number 9101
b) Communication from bacula-dir to bacula-sd is done using port number 9103
c) bacula-dir talks to bacula-fd via port number 9102
d) Messages between bacula-fd and bacula-sd go via port number 9103

All of these ports listen only on localhost (127.0.0.1), so external malicious users cannot reach Bacula remotely.
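
If you want to double check the localhost-only claim on your own machine, here is a minimal sketch. It assumes a Linux host where ss -ltn (from iproute2) is available; the function name bacula_exposed_listeners is my own invention and not part of Bacula. It reads the listener table from standard input and prints any Bacula port bound to something other than loopback:

```shell
# Hypothetical helper (not part of Bacula): reads "ss -ltn" output on stdin and
# prints listeners on ports 9101-9103 whose local address is not loopback.
bacula_exposed_listeners() {
    # Field 4 of "ss -ltn" output is the local address:port
    awk '$4 ~ /:(9101|9102|9103)$/ && $4 !~ /^(127\.|\[::1\])/ { print $4 }'
}

# Usage on a live system:
#   ss -ltn | bacula_exposed_listeners
```

No output means all three ports are bound to loopback only; any printed address is reachable from outside and worth a look in the Bacula configs.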

a) Some essential commands while in the bconsole shell

*show pools
Pool: name=Default PoolType=Backup
      use_cat=1 use_once=0 cat_files=1
      max_vols=0 auto_prune=1 VolRetention=1 year
      VolUse=0 secs recycle=1 LabelFormat=*None*
      CleaningPrefix=*None* LabelType=0
      RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0
      MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0
      MigTime=0 secs MigHiBytes=0 MigLoBytes=0
      JobRetention=0 secs FileRetention=0 secs
Pool: name=File PoolType=Backup
      use_cat=1 use_once=0 cat_files=1
      max_vols=100 auto_prune=1 VolRetention=1 year
      VolUse=0 secs recycle=1 LabelFormat=*None*
      CleaningPrefix=*None* LabelType=0
      RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0
      MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=53687091200
      MigTime=0 secs MigHiBytes=0 MigLoBytes=0
      JobRetention=0 secs FileRetention=0 secs
Pool: name=Scratch PoolType=Backup
      use_cat=1 use_once=0 cat_files=1
      max_vols=0 auto_prune=1 VolRetention=1 year
      VolUse=0 secs recycle=1 LabelFormat=*None*
      CleaningPrefix=*None* LabelType=0
      RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0
      MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0
      MigTime=0 secs MigHiBytes=0 MigLoBytes=0
      JobRetention=0 secs FileRetention=0 secs
You have messages.

*status
Status available for:
     1: Director
     2: Storage
     3: Client
     4: All
Select daemon type for status (1-4):

*label
Automatically selected Catalog: MyCatalog
Using Catalog "MyCatalog"
Automatically selected Storage: File
Enter new Volume name:

*messages
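
bconsole also reads its commands from standard input, which is handy for cron jobs and scripts. A minimal sketch follows; the bacula_cmd wrapper and the BCONSOLE variable are hypothetical names of mine, and I assume bconsole is in PATH and finds its default configuration file:

```shell
# BCONSOLE is a hypothetical override, e.g. BCONSOLE="bconsole -c /etc/bacula/bconsole.conf"
BCONSOLE="${BCONSOLE:-bconsole}"

bacula_cmd() {
    # Send one console command per argument, then "quit" to end the session
    printf '%s\n' "$@" quit | $BCONSOLE
}

# Usage:
#   bacula_cmd "status dir" "list jobs"
```

Each argument is passed as one console command, exactly as if it were typed at the * prompt.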

b) Restoring Backups with bconsole

Restoring from backups is done with the restore command:

*restore
Automatically selected Catalog: MyCatalog
Using Catalog "MyCatalog"

First you select one or more JobIds that contain files
to be restored. You will be presented several methods
of specifying the JobIds. Then you will be allowed to
select which files from those JobIds are to be restored.

To select the JobIds, you have the following choices:
     1: List last 20 Jobs run
     2: List Jobs where a given File is saved
     3: Enter list of comma separated JobIds to select
     4: Enter SQL list command
     5: Select the most recent backup for a client
     6: Select backup for a client before a specified time
     7: Enter a list of files to restore
     8: Enter a list of files to restore before a specified time
     9: Find the JobIds of the most recent backup for a client
    10: Find the JobIds for a backup for a client before a specified time
    11: Enter a list of directories to restore for found JobIds
    12: Select full restore to a specified Job date
    13: Cancel
Select item:  (1-13):

 

Bacula can also write backups to tapes, which are still heavily used for backing up data in some banks, airports and other organizations where data is crucial.

Bacula is not among the easiest backup systems, but for backup administrators who work with Linux and FreeBSD it is great. Its scalability allows building very robust and complex backup schemes, which are hardly achievable with simpler backup tools like rsnapshot or rsync.
 

Create Easy Data Backups with Rsnapshot back-up tool on GNU / Linux

Monday, April 15th, 2013

 

Backing up information on Linux servers is an essential part of the routine system administrator job. Thus I decided to write, for those interested, about how one can easily create backups of important data with a tiny tool called rsnapshot, which I previously used to make periodic incremental data backups on a few of the Debian Linux servers I manage. In case you wonder why use rsnapshot and not just rsync, the reasons are two:
a. Rsnapshot is very easy to configure and use, and you don't need a deep understanding of rsync's numerous options to use it.
b. Rsnapshot supports incremental data backups, saving a lot of disk space on the backup host.

 

 

 

Having mentioned incremental data backups, for some the term might be new, so I will briefly explain what incremental data backups are.

Incremental data backups store a new copy only of those files, among the ones scheduled for backup, which have changed or have been newly added to the backed-up directories since the previous backup. Incremental backups are often desirable as they consume minimal storage space and are quicker to perform than periodically archiving all the data (full backups). rsync also supports incremental backups, but configuring it to do so requires extra time on reading and understanding how they work, so I personally prefer the simplicity rsnapshot brings.

1. Installing rsnapshot with apt-get

Here is the rsnapshot Debian package description:

debian:~#  apt-cache show rsnapshot|grep -i description -A 5

 

Description: local and remote filesystem snapshot utility
 rsnapshot is an rsync-based filesystem snapshot utility. It can take
 incremental backups of local and remote filesystems for any number of
 machines. rsnapshot makes extensive use of hard links, so disk space is
 only used when absolutely necessary.
Homepage: http://www.rsnapshot.org/

As you can read from the description, rsnapshot is a frontend command which uses rsync to make data backups.
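
The "extensive use of hard links" from the description is what keeps snapshots small: a file that did not change between two snapshots is stored on disk once and only linked into each snapshot directory. Here is a toy illustration of that idea in plain shell; it is a sketch of the principle and not rsnapshot's actual code, and the directory names merely imitate rsnapshot's layout:

```shell
#!/bin/sh
# Toy illustration of the hard-link trick, not rsnapshot's actual code.
demo=$(mktemp -d)
mkdir "$demo/hourly.0" "$demo/hourly.1"
echo "unchanged data" > "$demo/hourly.0/file.txt"

# Hard-link the file into the older snapshot instead of copying it
ln "$demo/hourly.0/file.txt" "$demo/hourly.1/file.txt"

# Both names point to the same inode, so the data is stored only once
inode_first=$(ls -i "$demo/hourly.0/file.txt" | awk '{print $1}')
inode_second=$(ls -i "$demo/hourly.1/file.txt" | awk '{print $1}')
if [ "$inode_first" = "$inode_second" ]; then
    echo "same inode: the file occupies disk space once"
fi

rm -rf "$demo"
```

Only files that actually change between runs get a fresh copy, which is why a whole directory of snapshots takes little more space than a single full copy.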

rsnapshot is installed with:

 debian:~# apt-get install --yes rsnapshot

Reading package lists… Done
Building dependency tree      
Reading state information… Done
The following NEW packages will be installed:
  rsnapshot
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/140 kB of archives.
After this operation, 598 kB of additional disk space will be used.
Selecting previously deselected package rsnapshot.
(Reading database … 87026 files and directories currently installed.)
Unpacking rsnapshot (from …/rsnapshot_1.3.1-1_all.deb) … –
Processing triggers for man-db …
Setting up rsnapshot (1.3.1-1) …

2. Rsnapshot package content and documentation

Once installed, here are the files contained in the rsnapshot deb package:

debian:~# dpkg -L rsnapshot

 

/.
/usr
/usr/share
/usr/share/doc-base
/usr/share/doc-base/rsnapshot
/usr/share/doc
/usr/share/doc/rsnapshot
/usr/share/doc/rsnapshot/TODO
/usr/share/doc/rsnapshot/changelog.gz
/usr/share/doc/rsnapshot/Upgrading_from_1.1.gz
/usr/share/doc/rsnapshot/examples
/usr/share/doc/rsnapshot/examples/rsnapshot.conf.default.gz
/usr/share/doc/rsnapshot/examples/utils
/usr/share/doc/rsnapshot/examples/utils/backup_mysql.sh
/usr/share/doc/rsnapshot/examples/utils/mysqlbackup.pl
/usr/share/doc/rsnapshot/examples/utils/random_file_verify.sh
/usr/share/doc/rsnapshot/examples/utils/rsnapreport.pl.gz
/usr/share/doc/rsnapshot/examples/utils/make_cvs_snapshot.sh
/usr/share/doc/rsnapshot/examples/utils/backup_pgsql.sh
/usr/share/doc/rsnapshot/examples/utils/rsnapshotdb
/usr/share/doc/rsnapshot/examples/utils/rsnapshotdb/CHANGES.txt
/usr/share/doc/rsnapshot/examples/utils/rsnapshotdb/rsnapshotDB.pl.gz
/usr/share/doc/rsnapshot/examples/utils/rsnapshotdb/INSTALL.txt
/usr/share/doc/rsnapshot/examples/utils/rsnapshotdb/TODO.txt
/usr/share/doc/rsnapshot/examples/utils/rsnapshotdb/rsnapshotDB.xsd
/usr/share/doc/rsnapshot/examples/utils/rsnapshotdb/rsnapshotDB.conf.sample
/usr/share/doc/rsnapshot/examples/utils/rsnapshotdb/README.txt
/usr/share/doc/rsnapshot/examples/utils/rsnapshot-copy
/usr/share/doc/rsnapshot/examples/utils/backup_rsnapshot_cvsroot.sh
/usr/share/doc/rsnapshot/examples/utils/backup_dpkg.sh
/usr/share/doc/rsnapshot/examples/utils/sign_packages.sh
/usr/share/doc/rsnapshot/examples/utils/mkmakefile.sh
/usr/share/doc/rsnapshot/examples/utils/rsnaptar
/usr/share/doc/rsnapshot/examples/utils/rsnapshot_invert.sh
/usr/share/doc/rsnapshot/examples/utils/rsnapshot_if_mounted.sh
/usr/share/doc/rsnapshot/examples/utils/README
/usr/share/doc/rsnapshot/examples/utils/debug_moving_files.sh
/usr/share/doc/rsnapshot/examples/utils/backup_smb_share.sh
/usr/share/doc/rsnapshot/README.gz
/usr/share/doc/rsnapshot/changelog.Debian.gz
/usr/share/doc/rsnapshot/copyright
/usr/share/doc/rsnapshot/README.Debian
/usr/share/doc/rsnapshot/html
/usr/share/doc/rsnapshot/html/rsnapshot-HOWTO.en.html
/usr/share/doc/rsnapshot/NEWS.Debian.gz
/usr/share/lintian
/usr/share/lintian/overrides
/usr/share/lintian/overrides/rsnapshot
/usr/share/man
/usr/share/man/man1
/usr/share/man/man1/rsnapshot.1.gz
/usr/share/man/man1/rsnapshot-diff.1.gz
/usr/bin
/usr/bin/rsnapshot-diff
/usr/bin/rsnapshot
/var
/var/cache
/var/cache/rsnapshot
/etc
/etc/cron.d
/etc/cron.d/rsnapshot
/etc/rsnapshot.conf
/etc/logrotate.d
/etc/logrotate.d/rsnapshot

To get a basic idea of rsnapshot, how it can be configured and run manually, and how it can be set up to run periodically via a cron job, the README shipped with the package is a good starting point.

debian:~# zless /usr/share/doc/rsnapshot/README.gz
....

It is also useful to check the program documentation in HTML, if you have some text browser installed, e.g. lynx or links:

debian:~# links /usr/share/doc/rsnapshot/html/rsnapshot-HOWTO.en.html

Note that much of the information in rsnapshot-HOWTO is about installing rsnapshot manually from source, so users of Deb-based distros can safely skip these sections. For Debian users it is therefore useful to read the HOWTO from section 4.A onwards. The Example section of man rsnapshot is very good reading too, as it gives a lot of usage scenarios needed in more complicated backup situations.

3. Configuring Rsnapshot – Setting Data Directories to Backup

Rsnapshot is configured through the /etc/rsnapshot.conf file. There are plenty of comments in the file, so opening it in a text editor and taking a few minutes to read the commented lines is necessary. Just like with most Linux tool config files, options are set through the uncommented config directives.

debian:~# cat /etc/rsnapshot.conf |grep -v "#"|uniq

 

 

config_version    1.2

snapshot_root    /var/cache/rsnapshot/

cmd_rm        /bin/rm

cmd_rsync    /usr/bin/rsync

cmd_logger    /usr/bin/logger

interval    hourly    6
interval    daily    7
interval    weekly    4

verbose        2

loglevel    3

lockfile    /var/run/rsnapshot.pid

backup    /home/        localhost/
backup    /etc/        localhost/
backup    /usr/local/    localhost/

 

 

The above config options are easy to understand: there are the backup intervals to set (hourly, daily, weekly), the verbosity and log level of rsnapshot backup operations, the lockfile which rsnapshot uses to prevent duplicate rsnapshot runs, and last the backup directives, in which you specify what needs to be backed up. The config file also contains a commented directive for creating an rsnapshot backup once a month:

#interval   monthly 3

If you need to create backups once a month, uncomment it.

In the backup directives add all directories from the filesystem which need a routine backup. For example, I keep my Apache web server files in /var/www/, store various installation software in /root/ and keep old Qmail (Vpopmail) emails in /var/vpopmail.
To make rsnapshot back those up, I add after the rest of the backup directives:

backup  /var/www/   localhost/
backup  /var/vpopmail/  localhost/
backup  /root/  localhost/


It is good practice to change the snapshot_root directive to /root/.backups or, if you prefer to keep snapshot_root at the default /var/cache/rsnapshot, at least to create with the ln command a symlink /root/.backups -> /var/cache/rsnapshot.

debian:~# ln -sf /var/cache/rsnapshot /root/.backups

If you change snapshot_root to /root/.backups, don't forget to create /root/.backups and chmod the directory so it is accessible only to its owner, i.e.:

debian:~# mkdir /root/.backups
debian:~# chmod -R 700 /root/.backups

Note that it is important to use tab delimiters everywhere in /etc/rsnapshot.conf; if you use spaces instead of tabs as delimiters, you will end up with errors preventing rsnapshot from running.
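
A quick way to spot the most common form of this mistake before running rsnapshot is to grep for directive lines where the directive name is followed by a space instead of a tab. This is only a rough sketch: the function name find_space_delimited is hypothetical, and it checks just the first delimiter on each line, so a space between later fields will not be caught:

```shell
# Hypothetical helper: print "line-number:line" for every non-comment line
# whose directive name is followed by a space instead of a tab.
find_space_delimited() {
    grep -nE '^[^#[:space:]]+ ' "$1"
}

# Usage:
#   find_space_delimited /etc/rsnapshot.conf
```

Spaces inside a value (for example in ssh_args) are legitimate, which is why the check deliberately looks only at what follows the directive name.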

4. Testing rsnapshot configuration and launching it first time

I will say it once again: use the Tab key for delimiters in the config. On my first Rsnapshot launch I made the mistake of delimiting my config options with spaces, so when I tested my configuration rsnapshot printed an error and failed:

debian:~# rsnapshot configtest

 

----------------------------------------------------------------------------
rsnapshot encountered an error! The program was invoked with these options:
/usr/bin/rsnapshot configtest
----------------------------------------------------------------------------
ERROR: /etc/rsnapshot.conf on line 199:
ERROR: backup /var/www/ localhost/
ERROR: ---------------------------------------------------------------------
ERROR: Errors were found in /etc/rsnapshot.conf,
ERROR: rsnapshot can not continue. If you think an entry looks right, make
ERROR: sure you don't have spaces where only tabs should be.

After replacing the space delimiters with tabs and re-running rsnapshot configtest, if all is fine you get:

debian:~# rsnapshot configtest
Syntax OK

Once all is good with the config, and before launching Rsnapshot for its first complete data backup, you can display what rsnapshot will back up and the exact rsync invocations it will use by typing:


debian:~# rsnapshot -t hourly

echo 5644 > /var/run/rsnapshot.pid
mv /var/cache/rsnapshot/hourly.2/ /var/cache/rsnapshot/hourly.3/
mv /var/cache/rsnapshot/hourly.1/ /var/cache/rsnapshot/hourly.2/
native_cp_al("/var/cache/rsnapshot/hourly.0", \
    "/var/cache/rsnapshot/hourly.1")
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded /home \
    /var/cache/rsnapshot/hourly.0/localhost/
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded /etc \
    /var/cache/rsnapshot/hourly.0/localhost/
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
    /usr/local /var/cache/rsnapshot/hourly.0/localhost/
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
    /var/www /var/cache/rsnapshot/hourly.0/localhost/
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
    /var/vpopmail /var/cache/rsnapshot/hourly.0/localhost/
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded /root \
    /var/cache/rsnapshot/hourly.0/localhost/
touch /var/cache/rsnapshot/hourly.0/

To launch the backup manually for the first time:

debian:~# rsnapshot hourly

Depending on the size of the backed-up data (megabytes, gigabytes or terabytes) and the number of files to back up, the backup takes from minutes to hours.
Note that it is always a good idea to create backups on a separate hard disk configured in some kind of RAID array, preferably RAID 1 or RAID 5. Creating backups on a separate hard disk has numerous advantages; the most important one is that it doesn't put too much input/output (I/O) stress on the main hard disk and thus will not cause downtime on busy high-traffic servers, servers with slow old hard disks or servers with a big amount of HDD reads/writes.

5. Enabling Rsnapshot to create backups via scheduled cron job

On package install Rsnapshot creates a skeleton file for running it via a cron job in /etc/cron.d/rsnapshot.

debian:~# cat /etc/cron.d/rsnapshot

 

 

# This is a sample cron file for rsnapshot.
# The values used correspond to the examples in /etc/rsnapshot.conf.
# There you can also set the backup points and many other things.
#
# To activate this cron file you have to uncomment the lines below.
# Feel free to adapt it to your needs.

# 0 */4        * * *        root    /usr/bin/rsnapshot hourly
# 30 3      * * *        root    /usr/bin/rsnapshot daily
# 0  3      * * 1        root    /usr/bin/rsnapshot weekly
# 30 2      1 * *        root    /usr/bin/rsnapshot monthly
 

To make hourly, daily, weekly or monthly backups, uncomment the corresponding lines among the 4 above. For paranoid admins scared to lose even a bit of data, hourly backups are a good solution. Personally I prefer configuring weekly backups, because I routinely monitor my servers, keeping an eye on dmesg and checking the Linux smartd / smartmontools logs to find out whether a hard disk or RAID has bad blocks.
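
The number after each interval name in /etc/rsnapshot.conf is how many snapshots of that kind rsnapshot keeps, so together with the cron schedule it determines how far back you can go. A quick back-of-the-envelope calculation, assuming the default values shown earlier (hourly 6, daily 7, weekly 4) and the sample cron file running the hourly job every 4 hours; the variable names are mine:

```shell
#!/bin/sh
hours_between_runs=4   # from the cron entry "0 */4 * * *"
hourly_kept=6          # interval hourly 6
daily_kept=7           # interval daily 7
weekly_kept=4          # interval weekly 4

hourly_span=$((hours_between_runs * hourly_kept))
echo "hourly snapshots reach back roughly $hourly_span hours"
echo "daily snapshots reach back roughly $daily_kept days"
echo "weekly snapshots reach back roughly $weekly_kept weeks"
```

If you change the cron schedule or the interval counts, rerun the arithmetic to see what history you really keep.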

6. Checking backup size / backup difference and backup structure

Checking the size of backups can be done by using the standard du command on the backup directory:

debian:~# du -hsc /var/cache/rsnapshot/*
4.3G /var/cache/rsnapshot/hourly.0
4.5M /var/cache/rsnapshot/hourly.1
68M /var/cache/rsnapshot/hourly.2
4.4G total

rsnapshot also has a du argument through which backup size can be viewed:

debian:~# rsnapshot du
4.3G /var/cache/rsnapshot/hourly.0/
4.5M /var/cache/rsnapshot/hourly.1/
68M /var/cache/rsnapshot/hourly.2/
4.4G total

As you can see, each incremental backup is kept in its own numbered directory (hourly.0, hourly.1, hourly.2 and so on).

To check the difference between two backups:

debian:~# rsnapshot diff /var/cache/rsnapshot/hourly.0/ /var/cache/rsnapshot/hourly.1/
Comparing /var/cache/rsnapshot/hourly.1 to /var/cache/rsnapshot/hourly.0
Between /var/cache/rsnapshot/hourly.1 and /var/cache/rsnapshot/hourly.0:
660 were added, taking 3728377727 bytes;
492 were removed, saving 17623 bytes;

The structure of the backed-up files is identical to a normal copy of the files, without any compression:

debian:~# cd /root/.backups/hourly.0/localhost/
debian:~/.backups/hourly.0/localhost# ls

etc/ home/ root/ usr/ var/

 

7. Restoring files or directories from rsnapshot backup

To restore, let's say, the /var directory, cd into it:

debian:~/.backups/hourly.0/localhost# cd var
debian:~/.backups/hourly.0/localhost/var#

Then use rsync as follows:

debian:~/.backups/hourly.0/localhost/var# rsync -avr * /var/
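
Since rsnapshot stores every backup point under its full original path, the place of any file or directory inside a snapshot can be worked out mechanically. Below is a small sketch of that mapping; the snapshot_path function and the SNAPSHOT_ROOT variable are hypothetical names of mine, and the default value assumes snapshot_root is still /var/cache/rsnapshot:

```shell
# Hypothetical helper: map a live path to its location inside a snapshot.
# SNAPSHOT_ROOT is assumed to match snapshot_root in /etc/rsnapshot.conf.
SNAPSHOT_ROOT="${SNAPSHOT_ROOT:-/var/cache/rsnapshot}"

snapshot_path() {
    # $1 = snapshot name (e.g. hourly.0)
    # $2 = backup label from the backup directive (e.g. localhost)
    # $3 = absolute path on the live system (e.g. /var/www)
    printf '%s/%s/%s%s\n' "${SNAPSHOT_ROOT%/}" "$1" "$2" "$3"
}

# Usage: preview a restore first, then repeat without --dry-run
#   rsync -av --dry-run "$(snapshot_path hourly.0 localhost /var/www)/" /var/www/
```

Running rsync with --dry-run first lists what would be overwritten without touching the live files, which is a cheap safety net before a real restore.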
 

 

8. Creating rsnapshot backups from remote server via SSH protocol

In /etc/rsnapshot.conf you should set the SSH port on which the remote server accepts SSH connections. The standard port is 22, however it is wise to configure SSH on the backup server to listen on some other non-standard port.

The config variables to look at are ssh_args:

ssh_args -p 22

and cmd_ssh. To enable remote login via ssh, uncomment cmd_ssh in /etc/rsnapshot.conf, changing:

# cmd_ssh /usr/bin/ssh

to

cmd_ssh /usr/bin/ssh

Before starting rsnapshot to create backups of the remote host2, you need to configure automatic passwordless SSH login by generating a DSA or RSA key pair between host1 and host2, where host1 is the machine on which rsnapshot runs and to which the backups from host2 will be copied.
Once passwordless ssh to the remote host is active, to make rsnapshot on host1 create backups of host2 you need to add near the end of /etc/rsnapshot.conf:

backup  root@host2.com:/root/ host2.com/

The same way you can add a number of remote hosts from which periodic backups will be created to the central host1. The only condition is that passwordless SSH login is configured for each node (host3, host4, host5).

backup  root@host3.com:/root/root/ host3.com
backup  root@host4.com:/home/ host4.com
backup  root@host4.com:/var/ host4.com
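
Because the fields of every backup line have to be separated by tabs, it can be safer to let printf write the tabs than to type them by hand. A minimal sketch, where backup_line is a hypothetical helper name of mine and the host names are the example ones used in this article:

```shell
# Hypothetical helper: print a tab-delimited rsnapshot "backup" directive.
backup_line() {
    # $1 = source (local path or user@host:/path/)
    # $2 = destination directory under snapshot_root
    printf 'backup\t%s\t%s\n' "$1" "$2"
}

# Print the directive for the remote host2 example
backup_line "root@host2.com:/root/" "host2.com/"

# Usage: append to the config and validate it
#   backup_line "root@host2.com:/root/" "host2.com/" >> /etc/rsnapshot.conf
#   rsnapshot configtest
```

Running rsnapshot configtest afterwards confirms that the appended line is accepted.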

To create the key pair (with the public key in the id_dsa.pub file) on host1 and copy the public key to a remote host, use the commands:

debian:~# ssh-keygen -t dsa
...
....
debian:~# ssh-copy-id -i ~/.ssh/id_dsa.pub root@host3

Once this is done for all hosts that need to be backed up to the central backup host (host1), to test manually whether the backups get created issue:

debian:~# rsnapshot -v hourly
...

Rsnapshot comes with a number of other scripts, which can be easily integrated with it, in /usr/share/doc/rsnapshot/examples/utils.
Inside you can find example scripts for creating MySQL / PostgreSQL database backups, Samba share backups, CVS repository backups and so on. With a bit of tweaking the scripts can be modified to work with almost any data or protocol. A short description of each of the example scripts can be found in /usr/share/doc/rsnapshot/examples/utils/README