Posts Tagged ‘Delete Duplicate’

How to find and Delete Duplicate files in directory on Linux server with find and fdupes command

Monday, March 16th, 2015


Linux / UNIX find command is very helpful to do a lot of tasks to us admins such as Deleting empty directories to free up occupied inodes or finding and printing only empty files within a root file system within all sub-directories
There is too much of uses of find, however one that is probably rarely used known by sysadmins find command use is how to search for duplicate files on a Linux server:

find -not -empty -type f -printf “%s\n” | sort -rn | uniq -d | xargs -I{} -n1 find -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 –all-repeated=separate

If you're curious how does duplicate files finding works, they are found by comparing file sizes and MD5 signatures, followed by a byte-by-byte comparison.

Most common application of below command is when you want to search and get rid of some old obsolete files which you forgot to delete such as old /etc/ configurations, old SQL backups and PHP / Java / Python programming code files etc.

If you have to do a regular duplicate file find on multiple servers Linux servers perhaps you should install and use  fdupes command.
On Debian Linux to install it:

root@pcfreak:/# apt-cache show fdupes|grep -i descr -A 4
Description: identifies duplicate files within given directories
 FDupes uses md5sums and then a byte by byte comparison to find
 duplicate files within a set of directories. It has several useful
 options including recursion.
Homepage: apt-get install –yes fdupes

To search for duplicate files with fdupes in lets /etc/ just run fdupes without arguments:


root@pcfreak:/# fdupes /etc/



If you want to look up for all duplicate files within root directory:

root@pcfreak:/# fdupes -r /etc/
Building file list /


You can also find duplicate files for multiple directories by just passing all directories as arguments to fdupes


root@pcfreak:/# fdupes -r /etc/ /usr/ /root /disk /nfs_mount /nas

The -r argument (makes a recursive subdirectory search for duplicates), if you want to also see what is the size of duplicate files found add -S option


fdupes -r -S /etc/ /usr/ /root /disk /nfs_mount /nas


If you want to delete all duplicate files within lets say /etc/


root@pcfreak:/# fdupes -d /etc/

fdupes is also available and installable also on RPM based Linux distros Fedora / RHEL / CentOS etc., install on CentOS with:

[root@centos~ ]# yum -y install fdupes

There is also a port available for those who want to run it on FreeBSD on BSD install it from ports:


freebsd# cd /usr/ports/sysutils/fdupes
freebsd# make install clean

If you have a GUI environment installed on the server and you don't want to bother with command line to search for all duplicate files under main filesystem and other lint (junk) files take a look at FSlint


If you're looking for a GUI cross platform duplicate file finder tool that runs on all major used Operating Systems Mac OS X / Windows / Linux take a look at dupeGuru


Share this on