Thursday, January 20, 2011

Windows 7: Choosing a Duplicate File Finder

I had long used DoubleKiller to find and delete duplicate files.  The switch to Windows 7 made this seem like a good time to review the market and see if there were better alternatives.

This question acquired urgency because I came across a need to reconcile two hard drives, and discovered that I could not.  The reason was that my customized Start Menu -- containing not only links to programs but also the entire program folders for my portable apps -- was rife with duplicate files.  These were not like duplicates among my data files.  There, I would generally want to delete duplicates.  Here, deleting duplicates would mean that programs would not run.

DoubleKiller permitted me to compare entire drives and to mark tons of duplicates for deletion, all in one move.  That sort of thing could obviously be misused.  But I had gotten reasonably good at focusing it so as to delete only what I wanted to delete.  What I wanted now -- what DoubleKiller did not offer -- was a way to exclude some subfolders within those drives and top-level folders.

This was a delicate matter.  After all, in some operations (e.g., reconciling two hard drives), I would be trusting this program to accurately identify and delete thousands of duplicate files with one click.  Judging from the number of freeware duplicate detection utilities, many of which drew approval from multiple reviewers, this was a kind of task that could be done well by a good programmer.  I just didn't want any surprises.  Fortunately, at this point I was using Beyond Compare to check my drives against backups for changes on a file-by-file basis, and I had also begun using GoodSync to analyze differences between two computers (more specifically, to stop before proceeding if more than 10% of the files differed), so I had some protections.  Nonetheless, I wanted a reliable program.

So I went looking for a replacement for DoubleKiller -- something that would render it duplicative, if you will.  Ranked by frequency of download, SnapFiles said the top five general-purpose (i.e., not image-specific) duplicate detectors were (Auslogics) Duplicate File Finder, Fast Duplicate File Finder, AllDup, Duplicate Cleaner, and LookDisk.  (DoubleKiller was sixth.)  eHow gave instructions for using Duplicate Cleaner, Auslogics Duplicate File Finder, and Anti-Twin.  CNET was a muddled source for this purpose, offering general-purpose utilities (e.g., Glary) with some duplicate detection capabilities; I decided to stick with a utility dedicated to this specific task.  In two different searches of CNET, Duplicate Cleaner and Auslogics Duplicate File Finder seemed to be leaders.  A review at Gizmo named Duplicate Cleaner, Auslogics, Anti-Twin, and Fast Duplicate File Finder as the top four.  SnapFiles had not said how many downloads these programs had, so I decided its ranking was iffy.

I went to the homepages for Duplicate Cleaner, Auslogics, Anti-Twin, and Fast Duplicate File Finder, looking particularly for flexibility in excluding the files within a subfolder from deletion.  From the webpage for Duplicate Cleaner, it sounded like they had this capability; from those for Auslogics and Fast Duplicate File Finder, I could not tell.  Anti-Twin sounded rather bare-bones -- saying, for instance, that "the software ignores file names."  For some projects, I wanted the option of comparing file names, to identify nominal duplicates that I would then manually choose among.

Conveniently, then, Duplicate Cleaner seemed to be not only the leading program named by several of the sources cited above, but also the only one among these contenders that said, right on the wrapper, that it had what I wanted.  I liked that they had a manual on their website, and that they also had the beginnings of a support forum.  I downloaded and installed it.  Its interface was more user-friendly, which meant that I immediately disliked it.  Seriously, I was turned off by the options to compare for "Same Artist," "Same Album," etc.  But I figured I would learn to ignore that sort of thing.  Its user friendliness did make it much easier to select drives or folders for comparison; and when I selected a drive and then tried to select one of that drive's folders, the program asked if I was naming the folder for exclusion.  Exactly what I wanted.  In the "More Options" area, the default comparison was MD5, but it had an option for byte-to-byte and two SHA formats.
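For readers curious about what a content comparison with folder exclusions amounts to, here is a minimal Python sketch.  It is not how Duplicate Cleaner works internally, which I do not know; the function names (file_md5, find_duplicates) and the "excluded" parameter are made up for illustration.  It hashes every file with MD5, skips any folder named for exclusion, and reports only the hashes shared by two or more files.

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(roots, excluded=()):
    """Group files under `roots` by content hash, skipping `excluded` folders.

    Both names are hypothetical; `excluded` holds folder paths to leave alone.
    """
    excluded = {os.path.normcase(os.path.abspath(p)) for p in excluded}
    groups = defaultdict(list)
    for root in roots:
        for folder, subfolders, files in os.walk(root):
            # Prune excluded subfolders in place so os.walk never enters them.
            subfolders[:] = [
                name for name in subfolders
                if os.path.normcase(
                    os.path.abspath(os.path.join(folder, name))
                ) not in excluded
            ]
            for name in files:
                path = os.path.join(folder, name)
                groups[file_md5(path)].append(path)
    # Only hashes shared by two or more files count as duplicate groups.
    return {h: paths for h, paths in groups.items() if len(paths) > 1}
```

Called as find_duplicates(["D:\\"], excluded=["D:\\Start Menu"]) (example paths only), this would list duplicates across the drive while leaving the excluded folder out of the comparison entirely.  A sketch like this only reports; it deletes nothing.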

I ran a check for duplicates, using both Duplicate Cleaner (version 2.0) and DoubleKiller (version 1.6) on the same drives.  I had Duplicate Cleaner set to search for Same Content, any date, no file filters.  I wasn't sure how to set the file size criterion so I set it to Any Size.  It gave an option of setting a minimum file size of 1KB, but I didn't want it to skip files that were, say, 500 bytes; I had written many short text files, containing a note on this or that, that would be that size or smaller.  I went with the default MD5 content comparison type.  In DoubleKiller, I searched for files with identical sizes and CRC32 checksums, and I told it to ignore system files (because I didn't want a hundred copies of Thumbs.db or desktop.ini) as well as files whose size was equal to 0 KB.  I also told DoubleKiller to exclude .dll, .sys, .vxd, and .inf files.  In this area, I felt that DoubleKiller's options were better, though not ideal.
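The DoubleKiller settings just described (identical size plus CRC32, no zero-byte files, no .dll, .sys, .vxd, or .inf files) can be approximated in a short Python sketch.  This is my own illustration, not DoubleKiller's code: the names size_and_crc_duplicates and SKIPPED_EXTENSIONS are hypothetical, and the sketch does not attempt the "ignore system files" setting, which depends on Windows file attributes.

```python
import os
import zlib
from collections import defaultdict

# Extensions excluded from the search (hypothetical constant name).
SKIPPED_EXTENSIONS = {".dll", ".sys", ".vxd", ".inf"}


def file_crc32(path, chunk_size=1 << 20):
    """Return the CRC32 checksum of a file, reading it in chunks."""
    checksum = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            checksum = zlib.crc32(chunk, checksum)
    return checksum


def size_and_crc_duplicates(root, min_size=1):
    """Group files by (size, CRC32), ignoring small files and skipped types.

    min_size=1 skips only zero-byte files, so a 500-byte note still counts.
    """
    by_size = defaultdict(list)
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            if os.path.splitext(name)[1].lower() in SKIPPED_EXTENSIONS:
                continue
            path = os.path.join(folder, name)
            size = os.path.getsize(path)
            if size >= min_size:
                by_size[size].append(path)

    groups = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # A file with a unique size cannot have a duplicate.
        for path in paths:
            groups[(size, file_crc32(path))].append(path)
    return {key: paths for key, paths in groups.items() if len(paths) > 1}
```

Checking sizes first means the checksum is computed only for files that could possibly match.  CRC32 is only 32 bits, so two different files can occasionally share a size and checksum; a byte-to-byte comparison is the safer confirmation before deleting anything.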

I started both searches on the same machine at the same time.  During the searches, DoubleKiller remained visible, and slowly started adding visible duplicates to its list.  In a previous search, Duplicate Cleaner had seemed to stall.  Its title bar said "Not responding" and it got that sort of half-screwed-up look that graphics would get sometimes, when the computer was running out of memory or about to crash.  This time around, that didn't happen; still, it was hard to rouse it from the taskbar.  Duplicate Cleaner did not show any interesting specifics about duplicates while it was running.  Both used a screwy way of calculating the percentage of completion:  wait forever for the job to get to 1% completed, then soon it's at 5% and by the time it gets to 40% it's moving right along.  Perhaps because CRC32 checksums were easier to calculate, DoubleKiller was done first, scanning 198,645 files in 46 minutes.  Duplicate Cleaner carried on for a total of 57 minutes.  Both programs surely would have done the job much faster if they had not been competing against each other for the same computer resources.  Duplicate Cleaner found "56 Groups of duplicates" involving a total of 160 files.  DoubleKiller did not present a number, but by my count, it found 111 duplicative files.

I found the screen font displaying outputs to be much more readable in Duplicate Cleaner.  Duplicate Cleaner also allowed me to change the background colors behind alternating duplicate pairs, so as to make it easier to figure out where the alleged duplication was appearing.  Unfortunately, although Duplicate Cleaner had many columns of information, right-clicking on the column header row did nothing.  In other words, it did not appear possible to suppress those columns so as to focus on the ones that mattered (e.g., MD5 hash) without scrolling right and left for every duplicative pair.

I was not satisfied with Duplicate Cleaner, and I was out of time to compare the others.  I decided, for now, to continue with DoubleKiller, and to review one of the others sometime in the future.

5 comments:

Anonymous

Duplicate Files Deleter also finds and deletes duplicate. Easy to use software.

Unknown

Do you face problem to delete duplicate files. Here is me who can suggest you the right thing. Visit this website to get proper help. DuplicateFilesDeleter.com Let me know more about it. Thanks

Unknown

In this condition I used DuplicateFilesDeleter effectively. This software will let you get a huge amount of space for your use by deleting the files that were at multiple locations.

Alex

I use Duplicate Remover Free. http://manyprog.com

Unknown

Delete duplicate files with ease!
Try DuplicateFilesDeleter program and get rid of duplicate files.
Thank you!