Friday, March 30, 2012

A Backup Arrangement with Beyond Compare

I had been using Beyond Compare (BC) for a year or two.  Over that period, I had settled into what seemed like a decent backup arrangement.  This post describes that arrangement.

For a while, I had a spare internal partition to which I would make backups.  The original concept there was that I would use rsync or some other program to make backups on an hourly basis.  That setup had fallen into disrepair, mostly because I didn't quite like how it was working.  So I didn't have an hourly backup at this point.  The arrangement described here is more on the longer-term (e.g., daily, weekly, monthly) level.

My backup took place on external drives.  I had an external enclosure that I would have to open up (removing several screws) to swap drives, and I also had an inexpensive dock that I could just plug an internal SATA drive into.  Both seemed to work equally well.  The enclosure was handy for unplugging the drive and taking it with me.  Now and then -- especially when the tornado alarms sounded -- I visualized myself grabbing it and running for the basement.  I wondered if that would be one of those fateful delays that would cost me my life.

The external enclosure had an eSATA connector, but my previous motherboard had not been able to accommodate eSATA on a hot-swappable basis.  In other words, I had to reboot in order to get the system to recognize it.  It also had a USB connector.  The external dock (i.e., not the enclosure) was also a USB device.  USB was slower but very adaptable.  That was almost always what I used.  Some partitions on the external drive were compressed, to save space.  I had the impression that this did not help with the USB connection -- that the CPU would unpack the file before shipping it across the slow USB cable to the computer, resulting in at least as much data moving along the wire -- but I hadn't verified that.

For my purposes, Beyond Compare offered two key concepts.  First was the workspace.  If I plugged in the external drive that I used for daily backups, then I would open up the DAILY workspace in BC.  If I plugged in a drive that I used for weekly backups, then I would choose the WEEKLY workspace.  I also had a SIMPLE COMPARE workspace that I would use for random tasks -- say, comparing two folders on a one-shot basis.  And I had a NETBOOK workspace that I would use to synchronize my laptop.  GoodSync might have been better for that if I had been using the laptop frequently, but at this point it was mostly a case of keeping the data on the laptop current with the desktop.  That is, I was mostly doing one-way updates, from desktop to laptop.

My workspaces differed in the tabs they made available.  In the DAILY workspace, I had a tab for each day of the week, plus whatever other comparisons I would want BC to make on a daily basis.  Likewise for the WEEKLY and the other workspaces.  In other words, I used a workspace as a place where I would be able to see tabs for each comparison that I wanted BC to make, whenever I plugged in the weekly drive or the laptop or whatever.

I found that the best approach was to start BC first, let the workspace load, and only then turn on or connect the external USB drive.  That way, BC would not try to do complete comparisons for all of the open tabs.  It would do its calculations for the relevant folders on the drives inside the computer, which were already available to it, but on the external drive it would have to wait until I gave it the go-ahead within a particular tab.

Focusing on the DAILY workspace, I was writing these notes on a Friday.  So to guide my remarks, I opened BC at this point.  Somehow, I had arranged for the DAILY workspace to come up by default; or maybe BC just defaulted to the last open workspace.  I wasn't sure how I had arranged that.  When BC was up and running, I turned on the USB drive.  It took that drive a moment to become available.  (I found that AntiRun was useful, not only for protecting my system from autorun malware and such, but also for telling me when a drive really was online or offline, and for giving me a functional way of taking external drives offline.)

I went to the Friday tab.  BC had stalled because the Friday folder on the external drive had been unavailable.  I told it to retry; and now that the USB drive was connected, BC ran its comparison.  (Details on the kinds of comparisons available, and other program capabilities, are available at Beyond Compare's website.  Their forums and other tech support had been very responsive, the few times I had contacted them.)

I had modified my BC toolbar to present the red Mirror button.  This said, basically, just overwrite whatever is in the backup space (in this case, the Friday folder on the USB drive) with whatever is on drive D in the computer.  Drive D was the one that I backed up daily.  So in this case, a number of files had changed since the previous Friday.  Sometimes I would take a look at them; sometimes not.  Usually not.  It seemed pretty rare that a file would be accidentally deleted.  Daily examination of all changing files had seemed to be overkill.

When I say that I would take a look, I mean that BC showed me two panes, one for each of the folders being compared.  To keep things organized, the left-hand pane was almost always the authoritative one.  The left-hand pane would correspond, that is, to a partition inside the computer.  So I was looking at the right-hand pane, corresponding to the backup device.  If I saw a file listed in the right-hand pane, but not in the left-hand pane, that would mean that it was on the system when I made my backup a week ago, but now it was no longer on the system.  BC would also alert me, with a red font, if the file in the right-hand pane was newer.  Generally speaking, that wasn't supposed to happen.
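The left/right logic just described can be sketched in a few lines of Python. This is only an illustration of the idea, not how BC works internally: the function names, the category labels, and the two-second timestamp tolerance are all assumptions made for the example.

```python
import os

def list_files(root):
    """Map each file's path (relative to root) to its modification time."""
    found = {}
    for folder, _dirs, names in os.walk(root):
        for name in names:
            full = os.path.join(folder, name)
            found[os.path.relpath(full, root)] = os.path.getmtime(full)
    return found

def classify(left_root, right_root, tolerance=2.0):
    """Sort differences between the authoritative (left) and backup (right) folders.

    tolerance is an assumed allowance, in seconds, for timestamp rounding.
    """
    left, right = list_files(left_root), list_files(right_root)
    report = {"left_only": [], "right_only": [], "left_newer": [], "right_newer": []}
    for path in sorted(set(left) | set(right)):
        if path not in right:
            report["left_only"].append(path)    # added since the last backup
        elif path not in left:
            report["right_only"].append(path)   # gone from the system since the last backup
        elif left[path] - right[path] > tolerance:
            report["left_newer"].append(path)   # changed since the last backup
        elif right[path] - left[path] > tolerance:
            report["right_newer"].append(path)  # not supposed to happen: backup is newer
    return report
```

The "right_only" list corresponds to files that were on the system at the last backup but are gone now, and "right_newer" corresponds to the case BC flagged in red.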

I had an alternating weekly folder on this backup drive.  I used that one on Saturdays.  That's the one I examined more closely.  If I found that something was missing on Saturday, and I didn't think it should be missing, I could then click on the tabs for the other days of the week until I found the last backed-up version, and I could restore it from there.
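Clicking through the daily tabs to find the last backed-up version amounts to a small search, which could be sketched as follows. The folder names (one per weekday under a single backup root) and the function name are hypothetical, and "last" is judged here by file modification time, which is an assumption.

```python
import os

# Hypothetical layout: one folder per weekday on the backup drive.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

def latest_backup(backup_root, relative_path, day_folders=DAYS):
    """Return (day folder, full path) of the newest backed-up copy, or None."""
    best = None
    for day in day_folders:
        candidate = os.path.join(backup_root, day, relative_path)
        if os.path.isfile(candidate):
            mtime = os.path.getmtime(candidate)
            if best is None or mtime > best[0]:
                best = (mtime, day, candidate)
    return None if best is None else (best[1], best[2])
```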

Drive D contained the things that were in more active use.  I also had a separate partition, drive E, for things that took up a lot of space and didn't change very often.  Videos were the main example.  Because there were so few changes there, it was easier to look at the differences identified in BC, and verify that additions and deletions were desired.

In net terms, I liked this arrangement because it gave me some flexibility to combine automated and manual processes.  I wasn't vulnerable to one of those black-box backup solutions that would seem like they were working just fine until the moment of crisis, when I would painfully discover that I had failed to adjust some essential setting, or that the drive was malfunctioning, or whatever.

In this arrangement, if I was worried that files were missing, I could look down through lists of what was being added and deleted.  If I was confident that everything was fine, I could just click the Mirror button and the backup would happen.  I could also combine both approaches within a single tab, by telling BC to mirror only the selected folder(s).  This would gradually reduce the number of things remaining on the screen (assuming I had BC set, as usual, to Show Differences rather than Show All).  When confronted with what looked like a mess, I could thus eliminate the parts that seemed OK, and focus on the files and folders that didn't seem like they should have been getting added or deleted.
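For readers who want to see what a mirror operation amounts to, here is a minimal Python sketch. It is not BC's implementation; the function names are made up for the example, files are assumed to differ when their size or timestamp differs, and empty source folders are ignored. The dry_run flag plays the role of looking before clicking.

```python
import os
import shutil

def _differs(src, dst, tolerance=2.0):
    """Treat two files as different if their size or modification time disagree."""
    if os.path.getsize(src) != os.path.getsize(dst):
        return True
    return abs(os.path.getmtime(src) - os.path.getmtime(dst)) > tolerance

def mirror(source, backup, dry_run=True):
    """Make backup match source; return the list of (action, relative path)."""
    actions = []
    # Pass 1: copy files that are missing from, or differ in, the backup.
    for folder, _dirs, names in os.walk(source):
        rel = os.path.relpath(folder, source)
        for name in names:
            rel_file = os.path.normpath(os.path.join(rel, name))
            src = os.path.join(source, rel_file)
            dst = os.path.join(backup, rel_file)
            if not os.path.exists(dst) or _differs(src, dst):
                actions.append(("copy", rel_file))
                if not dry_run:
                    os.makedirs(os.path.dirname(dst), exist_ok=True)
                    shutil.copy2(src, dst)  # copy2 also copies the timestamp
    # Pass 2: walk the backup bottom-up, deleting what the source no longer has.
    for folder, dirs, names in os.walk(backup, topdown=False):
        rel = os.path.relpath(folder, backup)
        for name in names:
            rel_file = os.path.normpath(os.path.join(rel, name))
            if not os.path.exists(os.path.join(source, rel_file)):
                actions.append(("delete", rel_file))
                if not dry_run:
                    os.remove(os.path.join(backup, rel_file))
        for name in dirs:
            rel_dir = os.path.normpath(os.path.join(rel, name))
            if not os.path.isdir(os.path.join(source, rel_dir)):
                actions.append(("remove_folder", rel_dir))
                if not dry_run:
                    os.rmdir(os.path.join(backup, rel_dir))
    return actions
```

Calling mirror on a matching pair of subfolders, rather than on the whole drive, corresponds roughly to mirroring only a selected folder; running it first with dry_run=True gives the list of additions and deletions to look over.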

Like most other computer-related matters, my backup approach continued to evolve.  But as I say, I had been using BC for a while, at this point, and I was pretty much satisfied with the combination outlined here.

Monday, March 26, 2012


This is a companion to the Win7RegEdit.reg file.  That one is for 32-bit Windows 7; this one is for 64-bit.  There are further comments in that post and in the Windows 7 Tweaked Installation post.  It bears repeating that inappropriate tinkering can screw up a system.

Windows Registry Editor Version 5.00

; Run Ultimate Windows Tweaker first.  This adds options not available there.
; More info & restore options in 32-bit version of this file.

; ************* WINDOWS EXPLORER *************

; Disable libraries

; Turn off Details pane (at bottom of WinEx)

; Set Documents folder template as default
[HKEY_CURRENT_USER\Software\Classes\Local Settings\Software\Microsoft\Windows\Shell\Bags\AllFolders\Shell]

; Add context menu option to open files with Notepad
@="Open with Notepad"
@="notepad.exe \"%1\""

; Disable Windows from asking "Do you want to open this file?"
[HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Download]

; Disable annoying web service dialog for opening files

; Disable Windows 7 built-in CD burning

; ************* START MENU, TASKBAR, AND THUMBNAILS *************

; Make Aero Peek happen quickly (200 milliseconds; default is 500)

; Make Aero taskbar thumbnails show contents quickly when hovering

; Increase Start Menu display speed (default is 400)
[HKEY_CURRENT_USER\Control Panel\Desktop]

; ************* FILE LOCATIONS *************

; Point to W for customized Start Menu and Programs
[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders]
"Administrative Tools"="W:\\Start Menu"
"Programs"="W:\\Start Menu\\Programs"
"Startup"="W:\\Start Menu\\Programs\\Startup"
"Start Menu"="W:\\Start Menu"
[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders]
"Programs"="W:\\Start Menu\\Programs"
"Startup"="W:\\Start Menu\\Programs\\Startup"
"Start Menu"="W:\\Start Menu"

; Point to Current folder for Music, Video, Pictures, etc.
[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders]
"My Music"="D:\\Current"
"My Pictures"="D:\\Current"
"My Video"="D:\\Current"
[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders]
"My Music"="D:\\Current"
"My Pictures"="D:\\Current"
"My Video"="D:\\Current"

; Point to X:\Cache for cookies, cache, etc.
[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders]
"Cache"="X:\\Cache\\Temporary Internet Files"
[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders]
"Cache"="X:\\Cache\\Temporary Internet Files"

; Customize default places bar in Win7's common file dialog box
"Place3"="D:\\Personal Projects"
"Place4"="W:\\Start Menu\\Programs"

; ************* INTERNET EXPLORER *************

; Specify IE download directory
[HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer]
"Download Directory"="D:\\Current"

; Force IE to launch shortcuts in a new window
[HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Main]

; Disable “Speed up Browsing by Disabling Add-ons” popup notification

; ************* LOGIN, LOGOUT, SHUTDOWN *************

; Save settings on exit

; Disable automatic restart after crash so you can see error messages
"AutoReboot"=dword:00000000

; Manually generate a crash by hitting RightCtrl-ScrollLock (twice on the latter)
; Undo:  "CrashOnCtrlScroll"=dword:00000000

; ************* UNBLOCK MICROSOFT OFFICE *************
; Some older files cannot be opened in Office without these fixes
; Unblock Word

; Unblock Excel


; Unblock PowerPoint


; Unblock Corel Draw
[HKEY_LOCAL_MACHINE\Software\Microsoft\Shared Tools\Graphics Filters\Import\CDR]

; ************* OTHER TWEAKS *************

; Remove "Shortcut" from title of shortcuts

; Disable creation of Thumbs.db

; Disable beep on error
[HKEY_CURRENT_USER\Control Panel\Sound]

; Increase Internet download connections to 10
[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings]

; Disable User Account Control (UAC) - probably already done some other way

; Google Earth cache settings
[HKEY_CURRENT_USER\Software\Google\Google Earth Plus]
; Move cache to drive X
; Originally "CachePath"="C:\\Users\\Ray\\AppData\\LocalLow\\Google\\GoogleEarth"
"CachePath"="X:\\Cache\\Google Earth"
; Set disk cache to 2GB and memory cache to 1GB
[HKEY_CURRENT_USER\Software\Google\Google Earth Plus\Cache]

Friday, March 23, 2012

Synchronizing Two Computers: GoodSync and Alternatives

I had a home network with two computers, consistent with my evolving concept of how many computers a person should have.  (The third was a notebook.  The fourth was now distributed, in pieces, across the universe.)  I found it was convenient to have the second of those two computers take care of backup, construct reliable file indexes for searching, hold certain programs open, run certain batch files at regular intervals, and remain available for moments when the primary computer was doing maintenance or for some other reason was unavailable.

I was writing these words at one such moment.  The primary computer had just become dysfunctional.  And this was OK.  I was able to turn to the secondary computer and keep on working, pretty much where I had left off, because I was using a synchronization program that kept the two computers in sync via ethernet cable.

Critique of GoodSync

Except that, unfortunately, now was the moment when my synchronization program became flaky.  That program was GoodSync.  More than a year earlier, I had examined reviews of synchronization programs and had chosen GoodSync over the others.  I had purchased the pro version and worked through the details of setting it up.

During this year, I had found GoodSync largely suited to the task.  There were two principal problems, one of which prompted my search for a replacement.  That was the problem of licensing.  GoodSync's representative told me that the license allowed me only two installations per year.  The problem was that GoodSync could not be deactivated.  So, in my setup, I installed it on the secondary computer, and used it to synchronize my laptop and my primary computer with the secondary computer.  But if Windows became dysfunctional, or if my hard drive crashed, or for any other reason I had to reinstall GoodSync, that would count as one of my two permitted installations.

So I had already had one go-round with the GoodSync people about that.  Now there was another licensing issue.  They promised me that, since I had bought my copy within a certain timeframe, I would qualify for a free upgrade from version 8 to version 9.  But when the time came, the licensing did not work.  So at the moment when I was writing these words, I was looking at a notice that said this:

You have exceeded the Free version lmitations.  To continue using GoodSync you should upgrade to GoodSync Pro version or reduce your jobs to 3 or files in a job to 100.

But when I tried to activate, I got this:

FAILED:  Our records indicate that you have only GoodSync 8 licenses in the order you specified, but you are trying to activate GoodSync 9, which requires a separate license.

My options in this regard were to get on the phone and complain about the broken promise, go through the steps to uninstall version 9 and reinstall version 8, or buy an upgrade to version 9.  I did contact them and get it sorted out.  Possibly it was even my mistake in the first place.  I was not fresh on the details when writing these notes.  Point is, the matter might have been settled there, if it hadn't been for the second problem.

The second problem, which had been more persistently troublesome, was that GoodSync was a tremendous resource hog.  It would interrupt or slow down my work very noticeably, to the point that the computer would become unusable and even unresponsive.  I emailed them about this too.  Their reply was that I should upgrade to version 9.  I did that.  But that didn't fix the problem.  Version 9 did have an option, which may or may not have been present in version 8, to impose a "speed limit" on file copying.  This was obviously not an ideal solution:  it would slow things down when the computer was free, and it might not slow things down enough when the computer was busy.  This would be one more thing I would have to tinker with, in hopes of finding the right value.  Why not just have the kind of "smart" arrangement found in other programs, where it drops back to a slow jog or a standstill, depending on computer resource usage?
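To make the contrast with a fixed speed limit concrete, here is a rough Python sketch of the "smart" arrangement: the copier pauses between chunks, and the pause grows with system load. Nothing here reflects how GoodSync or any other program does it. The names pause_for and throttled_copy are invented, get_load is a hypothetical callback returning a value from 0.0 (idle) to 1.0 (saturated), and the thresholds are arbitrary.

```python
import time

def pause_for(load, max_pause=0.5, stop_above=0.9):
    """Seconds to wait before the next chunk, given load between 0.0 and 1.0."""
    if load >= stop_above:
        return max_pause                    # busy machine: back off as far as allowed
    return max_pause * load / stop_above    # idle machine: little or no delay

def throttled_copy(src, dst, get_load, chunk_size=1024 * 1024, sleep=time.sleep):
    """Copy src to dst in chunks, pausing longer when get_load() reports a busy system."""
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
            delay = pause_for(get_load())
            if delay > 0:
                sleep(delay)
    return copied
```

Unlike a fixed speed limit, this runs at full speed when the machine is free and slows to a crawl when it is not, with no single value to tune by trial and error.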

The slowdown problem was exacerbated by uncertainty as to how to get the damned thing to just stop when I was finally tired of waiting on it.  Right-clicking the system tray icon would give me options to Stop All or Disable Auto Run.  Disable Auto Run seemed to be only temporary.  A while later, I would be sitting at the computer in a stupor, waiting for it to do something, and would then find that it had sneakily started doing auto synchronizations again.  On the other hand, if I clicked Stop All, a long time could go by before I would think to check up on my supposedly automatic software.  And Stop All did not work reliably anyway.  More than once, I had to go into Task Manager (Ctrl-Alt-Del) and just kill GoodSync immediately.

Those were the big issues.  There were some others.  I didn't like that I couldn't get a clear indication of how long a job would be running.  When a program does the same thing 20 times a day, eventually it would seemingly be able to give a pretty good estimate of how long it would take to complete a synchronization.  I didn't like that GoodSync needed to distribute _gsdata_ folders all over my computer.  I had looked on that as the price of progress, but for some reason those folders sometimes contained copies of files, which could lead to confusing and potentially disastrous results when doing a sweep for unnecessary duplicates or searching for the latest copy of a file.  I didn't like that GoodSync was not able to keep computers synced on an up-to-the-moment basis.  I would have thought that, with the aid of those _gsdata_ folders, the program would have been able to know, pretty quickly, when a file had been changed, and it would ordinarily seem pretty straightforward to get a copy of that file to the other machine.  What I actually got, unfortunately, was a nearly constant hard drive workout on the secondary machine.

A Search for Alternatives

Since I was at this point of having to deal with license issues again, for the third or fourth time in the past 12 months, it occurred to me that it might make sense to see if I could find a better alternative.  I had pretty good impressions of recommendations from Gizmo's Freeware, so I started with their list of Best PC Freeware.  Their choice for best folder synchronization utility was PureSync.  Oddly, Gizmo's link to a review of PureSync actually led to a November 2011 Gizmo article that named a different program, FreeFileSync, as the best free synchronizer.  FreeFileSync didn't sound too good -- no autosync, among other things -- but the article did mention Allway Sync as a commercial alternative. 

A search for these three (i.e., GoodSync, PureSync, and Allway Sync) led to an alternativeTo webpage listing a number of possibilities.  Among those possibilities, I decided that a programmatic solution like rsync would not work for me, partly because I had not tended to devote the time needed to maintain command-line file management solutions, and partly because I wanted to be able to see what was being overwritten.  I decided against using Beyond Compare, which I used and liked for backup verification purposes, because it did not seem to provide a suitably automated solution.  The only other item that stood out, for me, among the alternativeTo entries was Microsoft SyncToy.  I decided to reserve judgment on it for now.  (For reasons of speed and reliability, I was looking solely at ethernet-capable tools.)

Seeking guidance, I looked back at the previous year's post, to see what comparison sources I had consulted then.  This led to the "sync software" category at TopTenReviews and the "data transfer & sync software" category at CNET.  MajorGeeks didn't have a sync category, and at Softpedia and elsewhere, sync software was mixed into a general category.  The top six at TopTenReviews were Syncables 360, GoodSync, SugarSync, Laplink PCsync, ViceVersa, and Allway Sync.  At CNET, filtering for Windows 7 programs, there were essentially two lists.  First were the "sponsored" (i.e., bought) reviews.  These were iffy:  ViceVersa Pro got four stars from editors and users alike, but the sponsored BeyondSync got five stars from editors but only three stars from users.  Otherwise, among programs most downloaded during the preceding week (so as to avoid some that may have been popular in the olden tymes), the only one of note seemed to be Allway Sync.

From those sources and from general browsing, certain names appeared to be emerging, including GoodSync, Allway, and ViceVersa.  A search for comparisons of those three didn't yield obvious recent sources of insight.  (Later, I would find a Wikipedia page that compared many programs on a number of features.  It seemed to indicate that GoodSync was one of the most capable programs.  I wasn't sure, at that point, whether a closer look at the other programs reviewed there would have yielded a different outcome for my purposes.  My tentative conclusion from that webpage was that, along with the programs reviewed below, I probably would have looked at FreeFileSync.)  The TopTenReviews comparison suggested that, in terms of features, the top six just listed were mostly competitive for my purposes, except for SugarSync.  According to TopTenReviews, the top three (Syncables 360, GoodSync, and Laplink) pulled away from ViceVersa and Allway Sync in terms of ease of use, support, and documentation.

I looked at individual product reviews at Softpedia.  There were some contrasts.  GoodSync got 2.9 stars from 118 users.  ViceVersa Free got 2.5 stars from 23 users.  ViceVersa Plus got 3.0 stars from 25 users.  ViceVersa Pro got 3.6 stars from 28 users.  SugarSync got 3.1 stars from 25 users.  Syncables Desktop got 1.6 stars from 3 users.  Allway Sync got 3.3 stars from 112 users.  BeyondSync got 3.2 stars from 11 users.  There was no Softpedia entry for Laplink.  In short, according to voters at Softpedia, the two leading programs were ViceVersa Pro and Allway Sync.  The lackluster score given to GoodSync seemed to be relatively well based.  That is, there seemed to be a fair number of confirming votes.  I should mention that, in a presumably different class, Microsoft SyncToy averaged 3.9 stars from 54 voters.  Of these programs, it appeared that Softpedia's editors rated only SyncToy.  They gave it four stars.

I also checked program reviews at CNET.  GoodSync got four stars from 654 user reviews, and four stars from editors.  ViceVersa Plus got no votes; ViceVersa Free got 3.5 stars from 20 users; ViceVersa Pro got four stars from 30 users, and four stars from editors.  SugarSync got five stars from editors, and 2.5 stars from 82 users.  Syncables 360 got two stars from four users.  Allway Sync got 3.5 stars from 53 users, and 4.5 stars from editors.  BeyondSync got three stars from 14 users, and five stars from editors.  Laplink had one star from one user.  SyncToy got four stars from editors and 3.5 stars from 61 users.  These results suggested that, for CNET users, the worst programs were Laplink, Syncables, and SugarSync, and the best programs were GoodSync and ViceVersa Pro, followed by ViceVersa Free, Allway Sync, and SyncToy.  There was something odd about the fact that GoodSync got at least eight to ten times as many votes as the others.  Again, there were suspicious discrepancies between the votes of editors and of users.

Overall, the reviews provided by these three sources -- TopTenReviews, Softpedia, and CNET -- were only somewhat consistent.  Their sometime inconsistency may have been at least partly due to somewhat different target markets.  At TopTenReviews, the good score for GoodSync was fairly compatible with the results at the other two sources, though the mediocre score at Softpedia was interesting.  TopTenReviews' high ratings for Syncables and Laplink seemed incongruous.  ViceVersa Pro and Allway Sync were in the top or second tier at all three sources.  SyncToy was fairly highly rated by the two sites that reviewed it.  Among the others, ViceVersa Free also seemed to be potentially worth a look.

Although Allway Sync had a free version, its license page indicated that that version had a limit of 40,000 files per month.  Although I was not sure how that would be counted, it seemed likely that a couple of full backups per month would exceed that limit for many people.  The pro version was available for $20, in comparison with $30 for GoodSync and $60 for ViceVersa Pro.  I noticed that the CNET editors described Allway Sync as "one of the best tools we've tried."  I decided to try it myself.  Also, while I now saw that SyncToy had no scheduler, it appeared likely to be useful for some purposes, so I downloaded it too.

Microsoft SyncToy 2.1

I decided to start with SyncToy for two reasons:  I believed it would be simple to install and run, and I figured it would probably be pretty severely limited, and could therefore be disposed of quickly.  In other words, it seemed likely to be a handy tool to know about, for some purposes, but not my primary machine for keeping computers reliably synchronized.

The program's interface seemed pretty straightforward.  Nonetheless, I ran a search for sources of guidance and insight.  A How-to-Geek webpage said SyncToy would require me to set up a pair of folders that I wanted to synchronize, and would then ask me which kind of synchronization I desired.  There were three options:  Synchronize (copy and delete in both directions, between the two folders, so that the latest additions, changes, and deletions appeared on both), Echo (update the right folder so it matched the left), or Contribute (add to the right so that it contained everything on the left, but don't delete anything from the right).  The How-to-Geek webpage went on to explain how SyncToy tasks could be automated through Windows Task Scheduler.
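The difference between Echo and Contribute, as described above, comes down to whether deletions are carried over to the right-hand folder. A small Python sketch of that distinction follows. It is an illustration only, not SyncToy's code: the function name and action labels are made up, folders are represented as dictionaries mapping paths to modification times, and Synchronize is left out because it needs a record of the previous state to tell a deletion from a new file.

```python
def plan(left, right, mode):
    """Return sorted (action, path) pairs to apply to the right-hand folder.

    left and right map relative paths to modification times.
    mode is "echo" or "contribute".
    """
    if mode not in ("echo", "contribute"):
        raise ValueError("this sketch covers only echo and contribute")
    actions = []
    # Both modes copy new and updated files from left to right.
    for path, mtime in left.items():
        if path not in right or mtime > right[path]:
            actions.append(("copy_to_right", path))
    # Only echo removes files from the right that are no longer on the left.
    if mode == "echo":
        for path in right:
            if path not in left:
                actions.append(("delete_from_right", path))
    return sorted(actions)
```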

A discussion thread pointed out some limitations with SyncToy:  no volume shadow copy, so locked files would not be copied (which also seemed to be the case with GoodSync); no way to back up your folder pairs, for reinstallation or for use on another computer, so they would all have to be recreated by hand; a nasty bug regarding timestamps when synchronizing NTFS to FAT drives.  Hints to a potential solution for the folder pair backup issue appeared in another thread, discussing the possibility of creating folder pairs via command line.

The "What's New" section in SyncToy's internal help file suggested that SyncToy 2.1 might have resolved some or all of these issues.  I ran another search and saw references to memory leaks and other bugs.  A discussion thread contained indications of some dissatisfaction.  I decided SyncToy was probably pretty good, but it was not the sort of limited but airtight and portable tool I had imagined.  It was more like a contender for the role of primary synchronization tool.  For that purpose, I had a sense that I would like Allway Sync more, so I turned in that direction.

Allway Sync

For some reason, after that brief look at SyncToy, I started getting cold feet.  A couple of days had passed since I had started the thread, and now, as I looked at alternatives to GoodSync, it seemed like I was getting myself into a lot of unnecessary hassle.  This was probably due to two things.  First, I was busy with other stuff and didn't want to deal with this at the time; and second, I was having an unrelated system issue that kept the secondary/server computer from seeing the primary/workstation computer.

I rebooted as an expression of my faith that this would make it all better.  It didn't.  I gave it a couple of hours and let the spiritual power of saying "to hell with it" do its magic.  When I came back to my existential mess, computer B was seeing computer A.  All was well with the world.

So I installed and ran Allway Sync -- an awkward name, by the way, referred to hereinafter as AS.  AS had distinct Analyze and Synchronize passes, like GoodSync.  I set it up to compare a folder on each machine.  The interface was OK.  I didn't like the huge chunk of real estate, at the top of the screen, devoted to nothing in particular, but found I could get rid of that with View > deselect Show Logo.  As it was running its analyze pass, it suddenly gave me this:

script error, Unspecified error., URL:
file:///C:/Program%20Files/Allway%20Sync/Skins/default/profileex.js, line: 5138.

It offered to send an error report.  I let it do that.  I wasn't sure what effect the error might have on that comparison across two machines.  There were 89 Important Messages.  The brief messages provided -- "Questionable file (manual review recommended)" -- were not as informative as GoodSync's left/right comparison messages.  I didn't like its interface.  While I was poking around with it, I got another script error.

I wasn't excited about Allway Sync.  Blame it on them, blame it on me, but for whatever reason, I decided to keep looking.


Around this time, I came across the freeware WinMerge program.  I hadn't seen much mention of it in TopTenReviews or other commercial sites.  Wikipedia said the WinMerge project was currently dormant.  The program's homepage indicated that the last stable version (2.12.4) had been released nearly three years earlier, in mid-2009.  But Softpedia contained an "experimental" v. 2.13.20, uploaded in October 2010, rated 4.5 by 139 users, and CNET boasted installed and portable versions of version 2.12.4 (4.5 stars, 29 votes).

Wikipedia's comparison of file comparison tools listed Beyond Compare but not GoodSync.  I had purchased BC Pro.  It had worked well for backup purposes, where I manually eyeballed what was going on.  I hadn't wanted to use it for automated syncing of computers, which seemed to be a less critical operation in my system.  (If GoodSync had been randomly losing stuff, I would have noticed it when I did my Beyond Compare backups.  In other words, after some early watching and tinkering, I was comfortable with letting GoodSync keep the two computers aligned, taking only an occasional look at what it was doing.)  I might have been more inclined to use Beyond Compare for that purpose if I had learned its scripting language, which might not have been that hard; but I had gotten started with GoodSync through a free trial, and just went ahead with that.

Based on the Wikipedia comparison, it now appeared that what I got for my money, in buying Beyond Compare rather than using the free WinMerge, was the ability to do three-way comparisons (which I never did) and possibly the ability to do scripting (Wikipedia drew a blank on that).  WinMerge could not do "horizontal" something (that was all Wikipedia said:  horizontal or vertical, and BC could do both).  WinMerge would have given me a "moved lines" capability that Beyond Compare did not have.  WinMerge did not have FTP support.  These things did not seem crucial to me, though further research to verify what they meant would obviously have been advisable.  But there was one real problem:  WinMerge would not do CRC checks.  Granted, I had not actually been doing those checks in GoodSync either.  My reason was that GoodSync was already taking over my secondary computer; I could hardly bear the thought of making it even more demanding.  But I did want that capability.  It had been a while since memory or other system errors had noticeably corrupted numerous files, but I'd had that experience in the past, and preferred to have some warning if, perchance, one computer quietly began to wander off course.
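For anyone unsure what a CRC check buys, the idea is to compare file contents rather than trusting size and timestamp alone. A minimal Python sketch, using the standard zlib module, might look like this. The function names are made up for the example, and this is not a description of how BC or GoodSync implements its check.

```python
import zlib

def file_crc32(path, chunk_size=1024 * 1024):
    """Compute the CRC-32 of a file's contents, reading it in chunks."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF

def same_content(path_a, path_b):
    """True if the two files have matching CRC-32 values."""
    return file_crc32(path_a) == file_crc32(path_b)
```

The cost is that every byte of both files has to be read, which is why such checks put so much more load on the machine than a size-and-timestamp comparison.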

Other programs mentioned above (e.g., Allway, SyncToy, Vice Versa) did not appear on the Wikipedia comparison list.  It seemed I was getting into an application for which a program like Beyond Compare or WinMerge was not designed, and at which it might not excel.  For instance, the comparison did not include features, like scheduling and networking, that seemed essential for synchronizing computers.

Nonetheless, I decided to try the portable version of WinMerge.  It was small, as befit a portable.  I could see right away that we weren't going to be coddling anybody.  Specifically, I couldn't figure out how to use it.  It didn't have Beyond Compare's user-friendly buttons and menus.  There were no tooltips, so I'd have to just learn what the various buttons meant.

There didn't seem to be a way to tell WinMerge to actually do anything.  I went to its online help manual.  The answer I was looking for was to go to File > Open.  This would seem obvious in another kind of program, but here I wasn't looking to do anything with any particular file.  I was looking for a way to compare folders.  But now that I did take that route, I saw a dialog allowing me to compare folders.  It was able to navigate to the other networked computer, so I compared the D:\Workspace folder on two different computers.  WinMerge seemed to be doing this comparison very slowly.  After a while, I saw that it had stalled on comparison of a particular item.  It didn't name the item; it just said it was item no. 7115.  That may have been a bug; when I killed it, I got a display showing 7142 items, which I think was the total number of items it had said it was comparing.

The WinMerge display was informative.  I could set it to show only the items that were different.  It was a plain-text display, without colors or graphics; but because it used the same tight font as Windows Explorer, with more narrowly spaced lines, it provided a lot more information per screenful than GoodSync did.  It had a collapsible tree mode, like GoodSync.  It was willing to generate a patch.  I didn't know what that was, but apparently it meant a code patch.  It seemed to be a way to help programmers offer suggestions that would improve WinMerge.  There weren't any scheduling features, so it appeared that, for my purposes, I'd be learning how to write command-line scripts and running them through Windows Task Scheduler.  But if I was going to do that, I thought it might make more sense to use Robocopy, which was already built into Windows.
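The "show only the items that are different" folder comparison described above can be approximated with a short script.  This is a minimal sketch in Python, using the standard filecmp module rather than WinMerge; the function name list_differences is made up for illustration.  One caveat: by default, filecmp treats two files as equal when their size and modification time match, without reading their contents, so it would not catch the silent corruption that a CRC check is meant to catch.

```python
import filecmp
import os


def list_differences(left, right):
    """Return sorted (status, relative_path) pairs for two folder trees.

    status is 'left only', 'right only', or 'different'.
    Items that match are omitted, like a "differences only" view.
    """
    results = []

    def walk(comparison, prefix):
        for name in comparison.left_only:
            results.append(("left only", os.path.join(prefix, name)))
        for name in comparison.right_only:
            results.append(("right only", os.path.join(prefix, name)))
        for name in comparison.diff_files:
            results.append(("different", os.path.join(prefix, name)))
        # Recurse into folders that exist on both sides.
        for name, sub in comparison.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    walk(filecmp.dircmp(left, right), "")
    return sorted(results)
```

A call such as list_differences(r"D:\Workspace", r"\\OTHERPC\Workspace") would be the networked case; that share name is hypothetical.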


In one of those uncanny cyclical events, exactly a year had passed since I had last worked with Robocopy.  Robocopy was a command-line tool in Vista and Windows 7 (and possibly elsewhere).

Back then, I had set up batch files to run relatively simple Robocopy commands to make backups.  A month later, I had tried to use Backup Maker instead, but that hadn't lasted long.  By this point, both of these tools had fallen into disuse, because I didn't like the backup configuration they were giving me, and I was too busy with other stuff to revisit the question.  I decided to revisit it now.

One thing I hadn't liked about Robocopy was that, every hour (or whenever), Windows Task Scheduler would pop up one of my old-style batch files to run my Robocopy script.  It was inelegant; it was distracting.  So I was interested in Robocopy front ends that might conceal the script execution.  There seemed to be several such front ends:  WinRoboCopy and Easy RoboCopy and Robocopy GUI and SH-Soft RoboCopy GUI and RichCopy, apparently a hybrid using some Robocopy and some other tools.  RichCopy offered ways of fine-tuning a copying process, but they didn't seem relevant to me.  When I ran across a critical report on RichCopy, I decided not to start with it.  Among the others, Robocopy GUI seemed to have the best pedigree, being a Microsoft product like Robocopy.

The link for the Robocopy GUI download was confusingly described as a link to the code for the Technet article discussing it.  I wasn't sure whether it was a link to the code for the article itself or was, rather, a link to the code (or, less nerdily and more accurately, the utility, including its accompaniments, e.g., User's Guide) discussed in the article.  Anyway, somehow I managed to get the utility itself.  I installed and ran the GUI.  It was a concise little thing that you could not possibly use safely without the aid of a Robocopy guide or manual, or a preexisting knowledge of Robocopy.  I say that because, for example, its Copy Options tab (one among seven available) included such options as /E and /ZB.  What did these mean?  The program did not say.  It did have a Help > Robocopy User's Guide menu option that led to a 35-page Word document which I promptely PDFd.  It seemed the principal value of the GUI would be to remind and organize, not really to reduce the need to understand Robocopy's options.

At this point, I started a separate post to discuss Robocopy options in detail.  I also considered using something like Beyond Compare, which would allow checksum comparisons rather than Robocopy's relatively simpleminded and potentially incorrect comparison of dates and times.  As noted in that other post, the main problem I was encountering, with both Robocopy and Beyond Compare (aside from the latter's sometime ability to bog down system processes), was that they weren't true automated synchronization solutions.

What I probably needed, for the checksum issue, was a distinct program that would do continual, low-priority background comparisons between the two computers, so as to distinguish files whose checksum inconsistency was accompanied by a change in file date and time (probably indicating a legitimate update of the file) from those whose checksums differed while timestamps remained the same (suggesting the possibility that one had become corrupt). Having this as a low-priority background function would allow this potentially resource-intensive file comparison function to run when the computer was not busy -- which, itself, would not necessarily be a good trait for a synchronizer. This background checksum comparison wouldn't ordinarily be an urgent function, though it could be an important one. I would be happy with a report, presented to me each morning, as to the results of checks during the last 24 hours, with links that would take me to individual files displaying expected and unexpected checksum inconsistencies. Ideally, such a stealth checker would (also, or perhaps instead) compare checksums against a database, which would be updated upon file creation or acquisition. In other words, as soon as I would PDF a Word document, this background tool would calculate and store that PDF's checksum, so as to facilitate future estimations of what had been happening to it (e.g., it had been subsequently edited in Acrobat, which would tend to indicate manual user verification and therefore an expected change in checksum).
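The core test in that wished-for checker (checksums differ but timestamps match, versus checksums differ and timestamps differ too) is simple to state in code.  This is only a sketch in Python, not a description of any existing tool.  The function names, the three labels, and the two-second timestamp tolerance (chosen because FAT-formatted drives store modification times in two-second steps) are all assumptions.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify_pair(path_a, path_b, tolerance=2.0):
    """Compare two copies of what should be the same file.

    'identical' : checksums match
    'updated'   : checksums differ and so do the timestamps
                  (probably a legitimate edit on one side)
    'suspect'   : checksums differ but timestamps match
                  (possible silent corruption of one copy)
    """
    if sha256_of(path_a) == sha256_of(path_b):
        return "identical"
    gap = abs(os.path.getmtime(path_a) - os.path.getmtime(path_b))
    return "suspect" if gap <= tolerance else "updated"
```

A morning report would then just be a list of the pairs labeled 'suspect'.  The checksum database idea would replace the second file with a stored hash and timestamp.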

Regarding the bogging-down of system resources, it seemed that, if I did find a way to use something like Robocopy, there was the question of how I could prevent the program from hogging system resources.  There seemed to be two parts to this question: how do I detect that the computer is busy, and how do I slow Robocopy in that case? Or was there maybe one tool that would let me slow down designated processes when computer resource usage reached a certain level? This led to a separate inquiry. At this point, that inquiry had yielded the impressions that something like Process Hacker or Process Explorer might put the brakes on how heavily a program would draw upon system resources during busy times. Those tools hadn't been especially successful against GoodSync, but possibly GoodSync had been programmed to demand priority, in a way that would not be a problem with something like Robocopy.

There was also the question of how I could run a command-line tool like Robocopy unobtrusively.  So far, I had used Robocopy in batch files that would pop up and require a manual click to get them out of the way.  Not a huge problem, but when it occurs multiple times per hour for each of several partitions, it could get to be an irritant.
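One possible answer to both questions (the pop-up window and the resource hogging) is to launch Robocopy from a small script instead of a visible batch file.  The following is a sketch in Python under several assumptions: the function names and all paths are hypothetical, and the two process-creation flags exist only on Windows, so the code falls back to 0 elsewhere.  The Robocopy switches shown are standard ones.  The low-priority flag lowers CPU priority; whether it also eases disk or network contention is not something this sketch establishes.

```python
import subprocess


def build_robocopy_command(source, dest, log_file):
    """Assemble a Robocopy command line as a list of arguments."""
    return [
        "robocopy", source, dest,
        "/E",               # copy subfolders, including empty ones
        "/R:2",             # retry a failed copy twice
        "/W:5",             # wait 5 seconds between retries
        "/NP",              # no per-file progress percentages in the log
        "/LOG:" + log_file,  # write output to a log file, not the console
    ]


def run_quietly(command):
    """Run a console program with no window and at idle CPU priority.

    Returns the program's exit code.  For Robocopy, exit codes
    below 8 mean that no copy failures were reported.
    """
    flags = (getattr(subprocess, "CREATE_NO_WINDOW", 0)
             | getattr(subprocess, "IDLE_PRIORITY_CLASS", 0))
    return subprocess.run(command, creationflags=flags).returncode
```

Task Scheduler could then start the script with pythonw.exe, which itself opens no console window, e.g. run_quietly(build_robocopy_command(r"D:\Workspace", r"E:\Backup\Workspace", r"C:\Logs\robocopy.log")).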


At this point, I was not finding good alternatives to GoodSync.  The slowness problem seemed to become less pronounced, over the several days during which I was doing these investigations and writing up these notes.  I wasn't sure if that was just because we had not again encountered a heavy load, or if GoodSync 9 had perhaps made a huge improvement over version 8.

My investigation had turned up numerous possible replacements.  Among these, I had identified several programs with potential.  The leading one would be WinMerge; and if that didn't work, I would try Microsoft SyncToy.  I decided to give GoodSync 9 a while longer, to see if it continued to take over the secondary computer.  In that event, I thought I might first make a renewed search for tools that would reliably slow down GoodSync during times when the machine was not idle.

Wednesday, March 21, 2012

Open a Search for a Certain Type of File Every Day

I was in the process of converting scattered Microsoft Word .doc files to PDF.  I had found a way to automate the conversion of large numbers of such files, provided I was willing to put them all into a single folder; but that left me with some that would have to be handled on a more piecemeal basis, one folder at a time.

To aid in this process, I wanted my computer to give me a list of where my remaining .doc files were located, so that I could choose the ones that were ripe for conversion.  I had found Everything to be a very useful file finder.  On a one-time basis, I could find my Word docs by just typing *.doc on the Everything search line.  But now I wanted a command-line solution that would run Everything and do that search automatically.

I had not previously tried using Everything with command-line options, but now I found those options in the Everything wiki.  For this purpose, I found that this command would do the trick:

Everything -filename *.doc

That command worked because I had put a copy of Everything.exe (originally named Everything- into C:\Windows.  Now, if I wanted that command to run on a regular basis (say, once a day, or once a week), I could add it to a regularly scheduled batch file.  Not that I would have to; I could also just run it from the command line as needed, or save it in a batch file that I would run by double-clicking on it (or on a shortcut to it).
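For a scheduled job, a plain list of folders and counts may be as useful as an open search window.  Here is a sketch of that alternative in Python.  It walks the folder tree directly instead of calling Everything, so it is slower on a large drive; the function name count_by_folder is made up for illustration.

```python
import os
from collections import Counter


def count_by_folder(root, extension=".doc"):
    """Count files with the given extension in each folder under root.

    The match is case-insensitive and exact, so '.doc' does not
    also count '.docx' files.
    """
    counts = Counter()
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            if name.lower().endswith(extension):
                counts[folder] += 1
    return counts
```

Printing the result of count_by_folder("D:\\") sorted by count would show which folders still held the most unconverted documents.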

Windows 7: BSOD: PROCEXP111.SYS

My computer was sailing along, when suddenly I got a Blue Screen of Death (BSOD).  The message began, oddly, with a sentence fragment:

to your computer.



If this is the first time you've seen this Stop error screen, restart your computer.  If this screen appears again, follow these steps:

Check to make sure any new hardware or software is properly installed.  If this is a new installation, ask your hardware or software manufacturer for any Windows updates you might need.

If problems continue, disable or remove any newly installed hardware or software.  Disable BIOS memory options such as caching or shadowing.  If you need to use Safe Mode to remove or disable components, restart your computer, press F8 to select Advanced Startup Options, and then select Safe Mode.

Technical information:

*** STOP: 0x00000050 (0xFFFFFA8100043F20, 0x0000000000000000, 0xFFFFF880073B6DDD,0x0000000000000000)

*** PROCEXP111.SYS - Address FFFFF880073B6DDD base at FFFFF880073B5000, DateStamp 47194089

Collecting data for crash dump ...
Initializing disk for crash dump ...
Physical memory dump complete.
Contact your system admin or technical support group for further assistance.

I probably didn't need to type out all that information, but doing so provided a constructive outlet for frustration.  Besides, you never know what archaeologists of some future civilization will find absolutely crucial for understanding what they are digging out of the rocks.

This was the second time I'd gotten this BSOD, so evidently it was not going to be good enough to simply reboot and hope for the best.  To respond to the BSOD's first bit of advice, there wasn't any new hardware or software.  On the software side, I had recently restored a backup image of drive C that I had made more than a month earlier.  The first BSOD had occurred within the past few days, prior to the restoration.  In other words, I was suddenly getting BSODs on an install that had worked fairly well for weeks, with and also without software changes made during those weeks.

I did notice something atypical on the hardware side.  I had two different USB external drives connected.  That, itself, was not unusual, though it had not happened often.  The unusual part was that the system would not reboot without one of them being turned off.  It would get as far as giving me a message, which I think was "Loading Operating System," and then it would pause until I shut one of those USB drives off.

I wasn't actually doing anything in particular on the computer when the BSOD happened.  Both computers had been up all night; I had just returned to them in the morning; and at the moment I wasn't even using the one that crashed; I was working on the other machine.  By the time I got to writing these notes, I didn't recall if I had even done anything on that computer.  Not much, anyway; nothing that would seem to have provoked the crash.

As I worked through this issue, I was guided by two posts I had written up a few months earlier, regarding a different STOP error.  One was a closer look at the "memory dump" concept mentioned toward the bottom of the BSOD; the other was a more general-purpose review of possibilities.  The memory dump investigation came to mind at this point because, on reboot, Windows 7 gave me a dialog that said, "Windows has recovered from an unexpected shutdown.  Windows can check online for a solution to the problem."  I hadn't always gotten this dialog after a crash.  It dimly seemed that something I had changed about my system, during the process of working through the prior memory dump post, had given me this information; otherwise I had to use something like BlueScreenView to see it.  The dialog gave me an option, "View problem details."  I took that option and got some technical information that I wasn't eager to read.  It pointed me toward two "Files that help describe the problem."  I copied the addresses of those files (without the actual filename), pasted them into the address bar in Windows Explorer, and looked at them.  One was an XML file that, if I just double-clicked on it (or if I pasted the full path and filename in Windows Explorer), would open as code in Internet Explorer.  This file was arguably more readable in Firefox, but I didn't see anything particularly informative in it.  The other was a minidump file that I opened in BlueScreenView.  (In Notepad, it was semi-gibberish.)  Problem is, I hadn't fared too well in interpreting the page dump, last time around, and that was still the case this time.  Following that previous guidance, I did notice that this day's minidump, and also the one from the previous BSOD, did contain lines referring specifically to PROCEXP111.SYS, named in the BSOD.  But, as before, I didn't know what else, if anything, I could do with the memory dump information.

Two other things worth noting about this crash.  First, after the crash, Glary Registry Repair found an unusually large number of registry errors.  Since I ran Glary every day, I suspected these were a result, not a cause, of the crash.  Second, in recent days the system had been functioning extremely slowly.  This seemed to depend on the number of programs running, but not entirely.  In particular, I was having the previously noted slowdowns that I had attributed to resource-hog programs (especially GoodSync and Beyond Compare).  Sometimes I noticed that, when those programs were out of the picture, the system sped up considerably; at other times, there seemed to be a lingering effect where the system continued to seem screwed up.  This was what had prompted me to restore the backup image of drive C.  In that previous post, I mentioned trying Process Hacker to put a speed limit on these resource-intensive programs; but I also noted that this had not seemed to make much difference.  I wasn't sure that there was anything particularly wrong about the Windows 7 installation as a whole, and certainly wasn't eager to reinstall from scratch.

A search led to the information -- surprising to me, but obvious once stated -- that PROCEXP111.SYS was related to Sysinternals Process Explorer.  I had just begun using Process Explorer, after several weeks of using Process Hacker, to control certain programs -- especially GoodSync -- that made excessive resource demands, to the point of making the computer unusable while they were running.  I hadn't noticed specifically whether the previous BSOD named PROCEXP111.SYS as the culprit; but since it had occurred just one day earlier, probably PROCEXP111.SYS was named in that one too.  I probably could have figured this out from the minidump, with sufficient time investment.

I wasn't sure how to interpret this information.  Generally, Microsoft Sysinternals tools like Process Explorer had seemed relatively stable.  It seemed possible that the crash named PROCEXP111.SYS because, unlike Process Hacker, Process Explorer was actually succeeding in putting the brakes on some overly grabby programs, and they didn't like it.  That is, it may have been a problem with Process Explorer, but it may instead have been a problem with these other programs -- that, basically, they would either run at their preferred speed or not at all.

I tried another search.  This led to the suggestion that I should be using a more recent version.  I hadn't checked, but now I saw that mine was v. 11.04, copyright 2007.  Oops.  Upon closer examination, I saw that they were now up to v. 15.13.  I downloaded and installed the upgrade.  I wasn't sure how long it might be until the next BSOD due to Process Explorer, so I decided to close this post at this point.
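As a footnote, the BSOD text itself contained a couple of details that can be decoded without a debugger.  STOP code 0x00000050 is the bug check Microsoft documents as PAGE_FAULT_IN_NONPAGED_AREA.  The driver line gives a load address, a base address, and a DateStamp, which is normally the driver's build time in seconds since 1970, written in hexadecimal.  The sketch below, in Python, pulls those apart.  The function name is made up, and the parsing assumes the line has exactly the layout quoted above.

```python
from datetime import datetime, timezone

# The driver line exactly as it appeared in the BSOD quoted above.
BSOD_DRIVER_LINE = ("*** PROCEXP111.SYS - Address FFFFF880073B6DDD "
                    "base at FFFFF880073B5000, DateStamp 47194089")


def decode_driver_line(line):
    """Extract the driver name, fault offset, and build time from a BSOD line."""
    words = line.replace(",", " ").split()
    address = int(words[words.index("Address") + 1], 16)
    base = int(words[words.index("at") + 1], 16)
    stamp = int(words[words.index("DateStamp") + 1], 16)
    return {
        "driver": words[1],
        # How far into the driver's image the faulting address lies.
        "offset": address - base,
        # Build time, interpreted as seconds since 1970-01-01 UTC.
        "built": datetime.fromtimestamp(stamp, tz=timezone.utc),
    }
```

For the line above, the build year comes out as 2007, which agrees with the 2007 copyright date on the outdated copy of Process Explorer.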

BIOS Problem: Bootup to Blank Screen

I had a working computer.  Then I decided to fix it.  Now the screen was completely black.  The computer seemed to have booted up nonetheless -- the hard drive light was flashing now and then, suggesting that some program was playing with itself in what I hoped was a nondestructive fashion -- but I could not see anything onscreen.  The monitor was plugged in and turned on, but I guess we were no longer on speaking terms.

What I had tried to fix was a setting in the BIOS.  I was using a Gigabyte GA-MA785GM-US2H motherboard with an Award BIOS (v. 6.00PG).  On bootup, I hit DEL and went into the BIOS settings -- specifically, into Advanced BIOS Features > IGX Configuration > UMA Frame Buffer Size.  My objective was to dedicate some system RAM to video.  So I hit Enter and changed the UMA Frame Buffer Size from Auto (the default) to 512MB and rebooted.  This gave me the aforementioned black screen, just as it had done for another poor soul.

So now the question was how to fix it.  I first tried to do it blindly.  I booted the machine and kept hitting DEL for a while, figuring that this would take me into the BIOS setup.  Then, following the sequence of steps that would have been required to change the IGX Configuration back to Auto on another machine, I went through a series of keystrokes (Down, Enter, etc.).  Those steps, done in the proper order, took me to the IGX Configuration part of the BIOS on that other machine.  But they didn't seem to work on the blacked-out machine.  When I hit the keys needed to save the settings and reboot, I found myself still looking at a black screen.

If the BIOS was fubared such as to produce a black screen immediately upon bootup, without ever showing a trace of life, then it wouldn't seem to matter whether I booted with a CD, USB drive, floppy, or hard drive.  The one exception, I figured, would be if I booted with some program designed to speak directly to the BIOS.  And for that, the candidate was presumably a BIOS flasher.

In other words, I saw an opportunity, here, to update my BIOS while fixing it.  For this solution, I went to the motherboard's BIOS upgrade download webpage.  Gigabyte had a program called Xpress Recovery2, but its purpose seemed to be to recover hard drive data, not to recover the BIOS.  They also had @BIOS, a live update utility, which would have been great if I could have booted Windows to run it.  The motherboard's manual seemed to be telling me that I needed, instead, to use its Q-Flash utility.  Q-Flash was said to be embedded in the motherboard's hardware, so I wouldn't need any particular drive to run it.  It said I could use Q-Flash to install a BIOS update that I would download on the other computer and save to a FAT32/16/12 USB flash drive.

But could I use Q-Flash if I couldn't even see it?  The way to fire up Q-Flash, according to the manual, was to hit the End key while the system was booting.  The blanked system was currently running, so I tried using WinKey-U-R to restart it, since I could see that those keystrokes were what it would have taken to reboot the other computer from Windows 7.  I gave that several minutes, since I had no idea what was running on that computer at this point.  I never got a beep, though, so I thought maybe it was waiting for me to Force Restart.  I hit the F key.  Nothing happened.  I tried Enter.  A brief hard drive flicker.  I gave it another minute and then just punched the reset button on the computer.  Then I kept hitting End for a while.  Perhaps I was now in Q-Flash.  No way of knowing:  the screen was still blank.

I thought of trying to trace my way through a BIOS flash blindly, as I had tried to trace through the reset of the IGX Configuration option.  Thinking of that gave me an obvious idea:  reset!  Maybe I could just take the steps needed to reset the entire BIOS back to its defaults.  I punched the computer's restart button again and then, after I got the reboot beep, I kept hitting Del, twice a second for about 15 seconds.  The keys I hit at this point (copied, again, from the sequence on another machine running a hopefully similar CMOS setup utility) were:  right-arrow (to take me to the option for Load Fail-Safe Defaults), Enter (to actually load those defaults), then Y to confirm, then F10 to save and exit.  That produced no results, so I hit Esc several times, in hopes of backing out to the main CMOS menu, and tried again:  Right, Enter, Y, F10.  This time I added another Enter for good measure.  And oh, my Christ, it worked.  I was able to read my screen again.  Fricking brilliant.  Amazing what you can do when you can't see a thing.

I went back into the BIOS, because of course I hadn't had enough of this, to take a look at how things were now.  The UMA Frame Buffer Size was back to Auto.  Funny, I didn't recall even seeing a UMA Frame Buffer option on the other computer.

It seemed obvious, now, that I should have just gone directly for the Fail-Safe option in the first place.  I reconfigured the BIOS as desired, saved, and rebooted.  Everything was fine.  I wasn't going to need to root around anymore in my Google search for solutions.  Although I did realize, a bit later, that I probably could have achieved the same thing, without working blindly -- resetting the BIOS (a/k/a clearing the CMOS) -- by either removing the quarter-sized battery from the motherboard for five minutes or shorting across the motherboard's "clear CMOS" jumper, which the manual would probably have helped me to find.

But no.  Not so fast.  On reboot, I was back to a black screen.  Why?  I hadn't even touched the IGX Configuration stuff this time around.  But, ah, false alarm.  Apparently the fail-safe options concealed the Power-On Self-Test (POST) information.  After a short panic, I had Windows onscreen.  I'd just have to take another look at the CMOS setup, next time I rebooted, to find the setting that would restore the POST display during bootup.  There may have been a way to do that with Gigabyte's Easy Tune utility, though if there was, it wasn't immediately obvious to me.

I decided to go ahead and deal with that now.  Unfortunately, when I rebooted and hit Del repeatedly, it just gave me a blank screen.  I hit Esc and then Enter, to exit the BIOS and reboot without saving any changes, but that didn't do anything.  I tried again, and then tried F10 and Enter.  After a blank screen, that got me back into Windows, at least.

Well.  Were the fail-safe defaults preventing me from getting into the CMOS setup?  It seemed that maybe I should go ahead and update the BIOS after all, or else open the computer and use one of those hardware BIOS-reset methods.  I ran Gigabyte's @BIOS utility and selected the "Update BIOS from Gigabyte Server" option.  I had to approve a couple of choices, and then it ran.  In a half-minute or so, it had apparently downloaded the new version.  It said this:

The screen will freeze for a few seconds while updating the BIOS.
Do you want to update the BIOS?

I clicked OK.  After a moment, it said, "BIOS Update completed!  You must restart your system to take new changes."  I said, "Restart Later."  I didn't want to lose all the stuff I had open, so I hibernated the machine (Start > Shut Down > Hibernate) and then, after it died, I pushed the power button and started it back up.  That worked:  I could now see the POST screen.  I hit Del and went into the CMOS setup.  I had to reset the clock and make other adjustments.  Then I rebooted.  And yet, once again, I was not seeing the POST screen, though once again at least the computer did proceed on into Windows.  It seemed that maybe one of my changes was responsible for this, or else perhaps that the hibernation was fouling things up.

It was hard to tell what ultimately fixed this.  Something did.  When I returned to these notes a while later to wrap up this post, I was no longer having the problem.  Possibly the steps described here did solve it on reboot, though I think in that case I would have made note of it.  It seemed I would need to have the problem again in order to comment further on it.

Monday, March 19, 2012

Batch Converting Many Microsoft Word (.doc) or WordPerfect (.wpd) Files to PDF - Streamlined

I had previously figured out a semi-automated command-line solution to the question of how to convert many Microsoft Word docs to PDF.  This process involved automatically opening, printing, and closing the files, one at a time, in Word.  Now I had another set of documents to convert.  So the first purpose of this post was to boil down what I wrote up in that previous post.

The second purpose was to see if the same approach would work for WordPerfect (.wpd) docs.  The logic was that Word could handle WPDs, and that it would automatically convert them upon opening.  So it seemed that I should be able to run an almost identical command to convert a document, regardless of whether it was a WPD or a DOC (or, presumably, an RTF, or a DOCX, etc.).

The command I used was long, but not super-complicated.  This was the one-line command I used to convert all of the DOC files in a folder:

FOR /F "usebackq delims=" %%g IN (`dir /b "*.doc"`) DO "C:\Program Files (x86)\Microsoft Office\Office11\winword.exe" "%%g" /q /n /mFilePrintDefault /mFileExit && TASKKILL /f /im winword.exe

Its success depended on several factors.  Some, such as cleaning out the %Temp% folder, are detailed in the previous post.  I structured the command with the aid of early answers to a question I posted.

It is worth noting that the command could be used to specify various kinds of files.  This example referred to *.doc (that is, to all .doc) files in the folder in which the command was run.  (It goes without saying that it would be wise to have a backup before fooling with this or any command.)  Also, I set my PDF printer (Bullzip > Options > General and Dialogs tabs) to operate without asking me any questions.  Note, further, that the precise location of winword.exe would vary, from one system to another.  Of course, saving the command as the entire contents of an executable batch file was just a matter of putting it into a text file with a .bat extension, and then saving and running that .bat file.

I tried modifying the command to refer to .wpd rather than .doc, and ran it in a folder full of WPDs.  It worked.  Then, as detailed elsewhere, I did a file count to verify that I had the right number of resulting PDFs, and ran Boxoft PDF to JPG Converter to do a quick test, highlighting files that didn't look right.  This process worked with WPDs.
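The file-count check can go one step further and name the files that did not come through.  This is a sketch in Python, not the method used in the earlier post.  It assumes that the PDF printer saves each PDF into the same folder under the source file's base name (x.wpd becomes x.pdf), which depends on how the printer is configured.  The function name is made up.

```python
import os


def missing_pdfs(folder, source_ext=".wpd"):
    """List source files in folder that have no PDF with the same base name."""
    names = os.listdir(folder)
    # Base names (without extension) of every PDF present, lowercased.
    pdf_stems = {os.path.splitext(n)[0].lower()
                 for n in names if n.lower().endswith(".pdf")}
    return sorted(n for n in names
                  if n.lower().endswith(source_ext)
                  and os.path.splitext(n)[0].lower() not in pdf_stems)
```

An empty list means the counts match; anything else is a list of files to convert again or inspect by hand.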

I wasn't sure how much further the command could be extended.  A test with PPTs (using PowerPoint rather than Word) failed.  In response to my follow-up question regarding the possibility of specifying the filetype on the command line (by typing e.g., CONVERTER.BAT DOC), it was suggested that I simply experiment to see if I could get the variable ("DOC," in that example) to work.  I didn't have any more files to work on at the moment, so that investigation would have to wait until later.
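For what it is worth, the "pass the filetype as a variable" idea is easy to sketch outside of batch.  The Python function below only builds the command lines; it does not run Word, and it has not been tested against Word itself.  The switches are the ones from the batch command above, the default winword.exe path is the one from that command (it would differ on other systems), and the function name is hypothetical.

```python
import os

# Path taken from the batch command above; adjust for the local Office install.
WINWORD = r"C:\Program Files (x86)\Microsoft Office\Office11\winword.exe"


def build_print_commands(folder, extension, winword=WINWORD):
    """Return one Word print-and-exit command per matching file in folder.

    extension may be given as 'DOC', 'doc', or '.doc'.
    """
    suffix = "." + extension.lstrip(".").lower()
    commands = []
    for name in sorted(os.listdir(folder)):
        if name.lower().endswith(suffix):
            commands.append([winword, os.path.join(folder, name),
                             "/q", "/n", "/mFilePrintDefault", "/mFileExit"])
    return commands
```

Each command could then be handed to subprocess.run, one at a time, so that Word finishes printing one file before the next one opens.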

Sunday, March 18, 2012

Batch Converting DOC to PDF with 7-PDF Maker

I had some Microsoft Word .doc files.  I wanted to convert them to PDF.  I wanted to be able to do this from the command line, so as to reach into different folders and process large numbers of them at once.

I went into Softpedia and did a search.  It came up with numerous free programs for this purpose.  I chose 7-PDF Maker.  It had a pretty good rating, as Softpedia programs go (4.0 stars; 16,001 downloads), and it did offer a command-line option.  I also downloaded its manual.  (It had a real manual!)

Once 7-PDF Maker was installed, I searched for its command-line executable, 7p.exe.  I put a copy of it into D:\Workspace (i.e., the folder where I was working).  That way, my commands that referred to 7p.exe would know where to find it.  There were other ways, but this was simplest, and 7p.exe was not a filename that would get confused with the ones I wanted to convert.

I opened a command window in D:\Workspace and typed "7p /?" to see what the command line options were.  Basically, it seemed, I could save the DOC as a PDF with a command as simple as "7p D:\Workspace\File.doc".  The /? instructions seemed to be saying that I had to specify an absolute path for the source file (i.e., not just "File.doc" without the drive and folder information).  I was not sure whether that was necessary with a copy of 7p.exe in the working folder.  There was also an option to save the resulting PDF to a different folder (e.g., "7p File.doc D:\Workspace\Output").  In addition, I could use wildcards.  For example, "7p.exe D:\Folder\*.doc" would convert all doc files in Folder to PDF.  The same command with *.* would convert all supported files to PDF.  There were many supported filetypes (manual p. 18), including Word, WordPerfect, OpenOffice, Excel, PowerPoint, and various image formats (e.g., BMP, TIF, JPG, PNG).

There were also options for overwriting and recursion (i.e., working down through subdirectories).  In both cases, the default was false (i.e., don't recurse, don't overwrite).  The default was all I needed, so I did not investigate the exact syntax.  But it appeared that one instance of the word "true" on the command line would be construed as an instruction to recurse.

I gave it a test run with x.doc.  The command I used was simply "7p x.doc".  That gave me an error, so I tried "7p D:\Workspace\x.doc".  That gave me a different error:  "Variante referenziert kein Automatisierungsobjekt."  One translation was, "Variant does not reference an automation object."  Did this mean that x.doc was not a convertible DOC file?  Or that I should have been running this in the 7-PDF installation folder on drive C?  I tried the latter with an absolute path (i.e., not just "7p x.doc").  Same "Variante referenziert" error.

I tried opening x.doc in Word.  Oh.  Now I understood.  It was called a DOC file, but it was actually just a text file with a DOC extension.  But the manual said that text files were supported.  Maybe the .doc extension was confusing 7p?  I changed it to x.txt and tried the original approach of running the command in D:\Workspace rather than in the installation folder on drive C.  Specifically, I tried just "7p x.txt."  It said, "URL seems to be an unsupported one."  Maybe it was the wrong kind of text file.  Whatever; I used a text-to-PDF converter for files like that instead.
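A file like x.doc can be spotted without opening it in Word.  The following is a minimal Python sketch, not something used in the original process.  It assumes the files of interest are binary Word 97-2003 documents, which are stored as OLE2 compound files and therefore begin with a fixed eight-byte signature.  The function names are made up for illustration.

```python
from pathlib import Path

# Signature that opens every OLE2 compound file, the container used by
# binary Word 97-2003 .doc files.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_binary_doc(path):
    """Return True if the file starts with the OLE2 compound-file signature."""
    with open(path, "rb") as handle:
        return handle.read(len(OLE2_MAGIC)) == OLE2_MAGIC


def find_mislabeled_docs(folder):
    """List .doc files in folder that do not start with the OLE2 signature."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file()
        and p.suffix.lower() == ".doc"
        and not looks_like_binary_doc(p)
    )
```

Calling find_mislabeled_docs(r"D:\Workspace") would list names like x.doc.  Note that an RTF file saved with a .doc extension would be flagged too, even though Word opens it without complaint, so the result is a list of files to look at, not a list of bad files.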

I did not proceed further with 7-PDF because, at this point, I found an alternative I liked better.  Not to say that 7-PDF was a bad program; it just was not working really well for me at this point.

Batch Converting Many Microsoft Word (.doc) Files to PDF - First Try

I had recently figured out how to batch convert many text files to PDF.  Now I was on a roll.  I wanted to know how to do the same thing with word processing documents produced in Microsoft Word 2003.

The approach used for the text files didn't seem likely to produce good results for Word files.  The text approach used Notepad on the command line.  Notepad would lose all the formatting.  It might actually create a mess.  It seemed there would probably be a better way.

In that previous approach, I had configured my default PDF printer, Bullzip, to shut up and stop asking questions.  So it sailed right through the printing task.  When producing complete garbage, I prefer not to be interrupted -- although, in that case, the output actually seemed OK.  In similar spirit, I wished for a Word conversion process that would just follow orders.

I thought of setting Bullzip in minimal-interruption mode, as in the text file approach, and just selecting a gaggle of Word docs in Windows Explorer, right-clicking, and choosing Print.  Sad to say, Windows 7 was not interested in giving me a print option when I selected more than 15 items.  So I would have to repeat the process with groups of 15 files at a time.  This did not fall within my definition of hassle-free.

Seeking some alternate approach, I did a search.  I was thinking, first, that maybe Word had command line options like Notepad.  Microsoft did not seem to offer any such option.  Others concurred that I would probably need some kind of macro, script, or other third-party solution.

Another search led to some relatively less desirable solutions:  buying Fineprint or A-PDF or easyPDF SDK (seemingly complicated); using a combination of VBScript and Automation; uploading Word docs to OCR Convert; or using AnyToPDF, which admirably employed OpenOffice but would require the system to restart OO for each document being printed.  I found a thread that yielded other possibilities, including an apparent Word command-line possibility after all.  It seemed to require something called Quiet PDF Printer, which I could not locate.

As I was browsing Wikipedia's list of PDF software, not seeing much of relevance, I realized I would much prefer a solution that would use Word, as distinct from some other program, so as to have the greatest likelihood of preserving formatting.  After all, I was not planning to inspect the resulting PDFs closely.  I didn't want to find out, a year down the line, long after I had discarded the original Word docs, that the PDFs were missing the bottom two lines of text, or that important characters were being misprinted or something.  No doubt this approach of opening Word was going to be slow, though, as in the OpenOffice alternative disparaged above.

I saw that Quiet PDF Printer suggestion repeated in another thread, but without any mention of Quiet PDF Printer.  Maybe the first person who mentioned it meant that I should just have a no-hassle PDF printer, like Bullzip with the desired settings.  Anyway, the suggestion was to run this command:

"C:\Program Files\Microsoft Office\Office\winword.exe" "C:\My Documents\doc1.doc" /mFilePrintDefault
Of course, the path to winword.exe would have to be adjusted on some systems, and doc1.doc was just an example.  But the point is, it worked.  One problem:  it left Word running, and another iteration of the command opened another instance of Word.  So unless I wished to have a couple hundred unused Word sessions lounging around, consuming system resources, I would need to kill Word after printing the PDF.  Further reading in that same thread led to a refinement:
"C:\Program Files\Microsoft Office\Office\winword.exe" test.rtf /q /n /mFilePrintDefault /mFileExit
The description seemed to say that (1) those last two items were actually Word's way of calling a macro on the command line; (2) the selection of commands available for such use was visible in this menu pick in Word 2003:  Tools > Macro > Macros > Macros in Word Commands (in ribbon versions of Word, try this key sequence:  Alt-T, M, M); (3) FilePrintDefault and FileExit were two such commands; and (4) if I went into Tools > Options > Print tab > uncheck Background Printing, I would not have Word exiting before the PDF was done printing.

I decided to try that last command line approach.  I made the stated settings changes in Word, and set Bullzip to stun.  Now it was a question of working up the list of commands, for all these Word documents that I wanted to PDF.  Ordinarily, I would have used a combination of DIR and Excel for that purpose, with one command per file, producing a batch file containing many commands.  But spring had arrived and, you know, in spring a man begins to feel powerful urges.  My social life being what it was, this translated into some recent experimentation with looping batch files.  That is, I believed I might be able to devise a batch program that would provide a simpler (or at least more direct) way to run this printing process.  So, from a command prompt in the folder containing my Word docs, I ran a batch file that I called Printit.bat.  That batch file contained just one line, though it wraps over several lines here.  The line was:
FOR /F %%g IN ('dir /b *.doc') DO "C:\Program Files (x86)\Microsoft Office\OFFICE11\WINWORD.EXE" test.rtf /q /n /mFilePrintDefault /mFileExit
Word immediately gave me a message indicating that it had encountered an error.  I wondered if that was because I had a session of Word open before running the batch file.  But that didn't seem to be the answer.  Well, maybe it was because I already had a PDF printout of the first file in the folder.  I had created that PDF during the process of testing this stuff.  Apparently my batch file and/or Word were not going to dilly-dally to ask me about overwriting.  So now I deleted that preexisting PDF and tried again.  No, that wasn't it; I still got the error.  This time, instead of guessing, I clicked its Show Help button and got an explanation:
The file you tried to open was not found. . . . [If the file exists but] does not open, it is either corrupt, locked by another application, or is protected by file permissions.
So, silly me, I looked again at my batch command.  Test.rtf?  WTF was Test.rtf?  I had copied the foolish thing verbatim, without pausing to reflect.  When your professors try to tell you how important it is to master critical thinking, believe them.  They're right.  As it turned out, there were multiple problems with that first try at a batch command.  One of those problems was that, contrary to initial hopes, Word was actually not postponing the next doc until it had closed the previous doc; therefore, it was stumbling over itself.  The solution was a batch file containing this one long line:
FOR /F "usebackq delims=" %%g IN (`dir /b "*.doc"`) DO "C:\Program Files (x86)\Microsoft Office\OFFICE11\WINWORD.EXE" "%%g" /q /n /mFilePrintDefault /mFileExit && TASKKILL /f /im winword.exe
The changes were mainly to add USEBACKQ and to change quotation marks (and use backquotes) accordingly, and also to add the "&& TASKKILL" part.  The && said that the next part (the taskkill) should proceed only after the previous command on the same line (i.e., printing) ran successfully.  From this point, the process ran pretty smoothly.  I found that it did not seem to matter if I already had a Word session active when this ran.  (If there was such a session, I would get a dialog; maybe I should have added another instance of TASKKILL before starting the FOR loop.)  Also, I found that Word would prompt me before overwriting.  I also had an interruption for a problem encountered when the batch file tried to convert a file created in an earlier version of Word.
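For anyone who would rather script the same loop in Python, here is a minimal sketch.  It reuses the switches from the batch line above.  The WINWORD path is the one from my system and will differ elsewhere, and the function names and the five-minute timeout are assumptions for illustration.  The sketch also assumes that Word's exit code says something useful about success, which I have not verified.

```python
import subprocess
from pathlib import Path

# Assumed install location, copied from the batch line above; adjust as needed.
WINWORD = r"C:\Program Files (x86)\Microsoft Office\OFFICE11\WINWORD.EXE"


def build_print_command(doc_path, winword=WINWORD):
    """Return the argument list that prints one document and then exits Word."""
    return [winword, str(doc_path), "/q", "/n", "/mFilePrintDefault", "/mFileExit"]


def print_all(folder, pattern="*.doc", winword=WINWORD, timeout=300):
    """Print each matching file in turn; return names that failed or timed out."""
    failures = []
    for doc in sorted(Path(folder).glob(pattern)):
        try:
            # run() blocks until this process exits, so documents are
            # handled one at a time rather than piling up.
            result = subprocess.run(build_print_command(doc, winword), timeout=timeout)
            if result.returncode != 0:
                failures.append(doc.name)
        except subprocess.TimeoutExpired:
            # run() kills the child process itself when the timeout expires.
            failures.append(doc.name)
    return failures
```

Because subprocess.run waits for each Word process to end and kills it on timeout, this version does not need a separate TASKKILL, so it would leave any unrelated Word session alone.  A document that makes Word stop and show a dialog would simply sit there until the timeout and then land in the returned list.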

There was another problem.  I got a dialog saying, "There is insufficient memory."  A search led to a Microsoft webpage that said this could result from a cramped paging file, or from some antivirus software or from using floppy disks.  None of these seemed to apply in my case.  Another discussion said that maybe this problem came from a corrupted  That was a possibility in my case; I had occasional error messages involving  Another potential cause:  abnormal termination of Word (such as I was doing myself, in this batch file, with TASKKILL), leaving junk in the %Temp% folder (located via Start > Run > %Temp% -- in my case, C:\Users\Ray\AppData\Local\Temp).  Cleaning out the %Temp% folder seemed to help:  there were hardly any memory error messages during the rest of the process.  The process seemed suitably restrained, whether by the "&&" device or otherwise, to the point that (judging from system tray icons) there were usually no more than one or two Bullzip processes underway at once.
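Before emptying a temp folder wholesale, it can help to see what has been sitting there for a while.  This is a minimal Python sketch that only lists files and deletes nothing.  The function name and the 24-hour cutoff are assumptions for illustration, and by default it looks at whatever folder Python reports as the temp directory.

```python
import tempfile
import time
from pathlib import Path


def stale_temp_files(temp_dir=None, older_than_hours=24, now=None):
    """List files in temp_dir last modified before the cutoff (nothing is deleted)."""
    folder = Path(temp_dir or tempfile.gettempdir())
    current = now if now is not None else time.time()
    cutoff = current - older_than_hours * 3600
    stale = []
    for entry in folder.iterdir():
        try:
            if entry.is_file() and entry.stat().st_mtime < cutoff:
                stale.append(entry)
        except OSError:
            # The file vanished or could not be read; skip it.
            continue
    return sorted(stale)
```

Running stale_temp_files() with no arguments and printing the result gives a list to review by hand.  Files that a running program still holds open may refuse to be deleted later, which is one reason to list first.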

When the process was done, most but not all of the DOCs had been converted.  I looked at the ones that had not.  (For that, I used an Excel comparison, with VLOOKUP, of filelists obtained by DIR from the input and output folders.)  All gave me an "insufficient memory" error when I tried to open them in Word.  Some seemed to be corrupted to various degrees.  I used Notepad and wReplace to slightly clean up the ones whose corruption prevented them from printing to PDF in a more or less normal fashion.  (In wReplace, the option I used was Replace Many > Open (arrow) > Diacritic to ASCII.)  Several others were printable, but I hadn't printed them.  That is, when the batch file was running, Word kept asking me if I wanted to save changes to (or to print; can't remember for sure) a document with a weird name.  The same name, over and over again.  I thought it was some kind of error, since that name wasn't in my file list.  Possibly this problem had something to do with the fact that these documents were originally created on a Mac and then converted.  So I had to PDF those manually.
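The comparison of input and output file lists can also be done in a few lines of Python instead of Excel.  This is a minimal sketch under one loud assumption:  that each PDF carries the same base name as its source document (e.g., Report.doc becomes Report.pdf).  Whether that holds depends on how the PDF printer is set to name its output.  The function name is made up for illustration.

```python
from pathlib import Path


def unconverted(input_dir, output_dir, source_ext=".doc", target_ext=".pdf"):
    """Return source file names that have no same-named target file in output_dir."""
    # Base names of finished PDFs, lowercased because Windows ignores case.
    done = {
        p.stem.lower()
        for p in Path(output_dir).iterdir()
        if p.suffix.lower() == target_ext
    }
    return sorted(
        p.name
        for p in Path(input_dir).iterdir()
        if p.suffix.lower() == source_ext and p.stem.lower() not in done
    )
```

The input and output folders can be the same folder.  The returned names are the documents to open by hand and investigate.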

Next, I wanted to take a quick look, to see whether any of the resulting PDFs were actually junk -- whether, for any reason, some of them made it through the process in garbled form.  For that, I took the approach of converting just the first page of each PDF to JPG, and then flipping through them in a photo viewer (e.g., IrfanView).  This process did turn up a few corrupted documents.  I was able to verify that they had been corrupted before I started this process; it did not appear that the steps described here had any effect.

I wasn't extremely concerned about these documents.  If I had been, I think a modified strategy would have been advisable:  take a quick look through all of the documents, as just mentioned, and then take a closer look at any that seemed important.  It would have been handy, for that purpose (and others), to have image- (and audio-) viewing (or listening) software that would not only display the item in question, but would also let me shove it into various categories with the touch of a key.  In this case, the categories would have been OK and Not OK and Examine More Closely.