Wednesday, November 30, 2011

Win7RegEdit.reg Updated

I had previously developed a .reg file that would automate a number of registry edits in Windows 7.  This post presents a slimmed-down version of that .reg file.  The tweaks in this .reg file were part of a Win7 installation that also used Ultimate Windows Tweaker and other tools, as described in another post in this blog.  The purpose of the present post is just to display the contents of Win7RegEdit.reg.

The following text will probably display better if copied and pasted into Notepad.  It may be necessary to correct some line breaks inserted by the blog website.  I wouldn't recommend using this information if you aren't sure what you're doing and haven't backed up your system.

Windows Registry Editor Version 5.00
; Run Ultimate Windows Tweaker first.  This adds options not available there.
; More info & restore options in 32-bit version of this file.

; ************* WINDOWS EXPLORER *************

; ***** Disable Libraries *****

; Set Documents folder template as default
[HKEY_CURRENT_USER\Software\Classes\Local Settings\Software\Microsoft\Windows\Shell\Bags\AllFolders\Shell]

; Add context menu option to open files with Notepad
@="Open with Notepad"
@="notepad.exe \"%1\""

; Disable Windows from asking "Do you want to open this file?"
[HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Download]

; Disable annoying web service dialog for opening files

; Disable Windows 7 built-in CD burning

; ************* START MENU, TASKBAR, AND THUMBNAILS *************

; Make Aero Peek happen instantly

; Make Aero taskbar thumbnails show contents immediately when hovering

; Increase Start Menu display speed -- default is 400
[HKEY_CURRENT_USER\Control Panel\Desktop]

; ************* FILE LOCATIONS *************

; Point to W for customized Start Menu and Programs
[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders]
"Administrative Tools"="W:\\Start Menu"
"Programs"="W:\\Start Menu\\Programs"
"Startup"="W:\\Start Menu\\Programs\\Startup"
"Start Menu"="W:\\Start Menu"
[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders]
"Programs"="W:\\Start Menu\\Programs"
"Startup"="W:\\Start Menu\\Programs\\Startup"
"Start Menu"="W:\\Start Menu"

; Point to Current folder for Music, Video, Pictures, etc.
[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders]
"My Music"="D:\\Current"
"My Pictures"="D:\\Current"
"My Video"="D:\\Current"
[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders]
"My Music"="D:\\Current"
"My Pictures"="D:\\Current"
"My Video"="D:\\Current"

; Point to X:\Cache for cookies, cache, etc.
[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders]
"Cache"="X:\\Cache\\Temporary Internet Files"
[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders]
"Cache"="X:\\Cache\\Temporary Internet Files"

; Customize default places bar in Win7's common file dialog box
"Place3"="D:\\Personal Projects"
"Place4"="W:\\Start Menu\\Programs"

; ************* INTERNET EXPLORER *************

; Specify IE download directory
[HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer]
"Download Directory"="D:\\Current"

; Force IE to launch shortcuts in a new window
[HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Main]

; ************* LOGIN, LOGOUT, SHUTDOWN *************

; Save settings on exit

; Disable automatic restart after crash so you can see error messages
"AutoReboot"=dword:00000000

; ************* OTHER TWEAKS *************

; Remove "Shortcut" from title of shortcuts

; Disable creation of Thumbs.db

; Disable beep on error
[HKEY_CURRENT_USER\Control Panel\Sound]

; Increase Internet download connections to 10
[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings]

; Google Earth cache settings
[HKEY_CURRENT_USER\Software\Google\Google Earth Plus]
; Move cache to drive X
; Originally "CachePath"="C:\\Users\\Ray\\AppData\\LocalLow\\Google\\GoogleEarth"
"CachePath"="X:\\Cache\\Google Earth"
; Set disk cache to 2GB and memory cache to 1GB
[HKEY_CURRENT_USER\Software\Google\Google Earth Plus\Cache]

Thursday, November 24, 2011

Freeware: A Thanksgiving Tradition: First Cut


I decided that Thanksgiving would be a good time to revisit, annually, the question of what freeware I was using, and what an appropriate contribution would be. I was continuing to develop my customized Start Menu as a repository of links to all of the websites, installed programs, and portables that I used. So a search of the Start Menu, plus a list of Firefox add-ons, seemed to give me a substantial if not complete list of freeware programs for which some contribution might be appropriate. I developed that list in a spreadsheet.  At least for the time being, I excluded some programs (e.g., those that I had used previously but wasn't using anymore; those provided by corporations like Google and Microsoft), so as to focus on the ones that seemed most currently entitled to compensation.  I decided on appropriate values for each program, and also decided to do writeups or reviews.  There were a number of them, so I set up my computer to reopen the spreadsheet weekly as a reminder.  I hoped to be caught up by the next Thanksgiving.


My computer, like many, was running a variety of free and paid-for programs.  The motives behind the free programs seemed to vary.  Some programmers evidently hoped their creations would become famous, at which point they could begin selling the software rather than giving it away.  Some supported their work via advertisements.  Some wanted to help others; some just shared a tool that they had invented to address their own needs.

Whether the inventor asked for payment or not, it seemed only fair to pay them something for their work.  There were, however, some problems with that thought.  One was that paying them would cost money.  Most of us, at one time or another, have been tempted not to pay even when we could easily afford it.  In a less piggish vein, there was also the reality that many of us could not afford to pay a fair price for all of the many free tools that a computer system might be running.  We might instead be inclined not to use them, with inferior results for everyone concerned.

A related problem was that it was not clear how much to pay.  Few freeware writers seemed guilty of asking too much.  To the contrary, even the developers of incredibly useful programs tended to ask far less than their programs were worth.  Maybe they were humble, or were underselling themselves; maybe they didn't want to appear too demanding or ridiculous.  For whatever reason, it appeared that freeware compensation provided on an honor-system basis would preferably draw upon an estimation of each program's comparative value, regardless of what the programmer proposed to charge for it.

There was another side to that question of how much to pay.  If I wanted to use Microsoft PowerPoint, I would have to buy a copy.  Depending on Microsoft's internal decisions, I might be able to buy PowerPoint by itself, or I might have to buy a copy of the entire Office suite.  This would be true regardless of whether I wound up using PowerPoint all day, every day, or actually only had a one-time need for it.  In the for-profit market, this issue tended to be worked out on the macro level -- Microsoft's profits depended on charging a balanced price across a large number of potential purchasers -- but not on the individual level.  That is, I would pay the same price as someone whose usage was very different from mine.  But in the honor-system freeware world, I could choose whatever payment plan made the most sense.  I could buy it outright, or set aside money on a per-use basis, or pay an annual license-like fee, or adopt some other basis, as I chose.

Over the past several years, I had written up a couple of blog posts on the question of how to calculate how much I had used various pieces of freeware.  There didn't yet seem to be a widely used system that would help me in this.  I had eventually decided that maybe this would be something to deal with once a year, during the Christmas holiday season, but that didn't work out.  That season tended to be busy, and it also wasn't usually overflowing with spare cash.  So then I came to the idea of pinning this inquiry to Thanksgiving instead.  As I thought about it, that actually seemed like a better connection.  Freeware was a gift, to be sure; but it was a gift to be thankful for.  And if I made it an annual thing, it could boil down to a couple of relatively simple questions:  how thankful am I, based on my past year's usage, and how do I express that?  The answer to the latter question could range from gratitude to cash payment, depending on the situation; I would have to work that out.

For starters, I decided that it would be OK to do this calculation just once a year.  Yes, there would be programs that I had used during the year but had then discarded, and I might forget or unintentionally minimize their importance to me as of Thanksgiving.  But I didn't think that would be a major problem, and I also felt it would be unwise to try to do it more frequently.  An annual tradition could become something to be proud of; but an expectation that I would do this every month could convert the whole thing into a chore.

The next step, I thought, would be to figure out what I was using.  In my case, there seemed to be two principal locations for freeware:  Firefox add-ons and my Start Menu.  The Firefox part was easy enough:  I could just go into Firefox Tools > Add-ons for a list of the extensions and themes in use.  Alternately, as someone advised, I could type "about:support" in the Firefox address bar to get a printable report.  The Start Menu was also easy enough to see:  I could just go to the Windows 7 Start button and write down all of the programs visible there.  My Start Menu was an especially concentrated location for the programs that I would use because I had customized it to include not only links to installed programs but also the complete program folders for portables.  I also had a project underway to convert my Firefox bookmarks to links in the Start Menu (for websites that I considered tools, such as Softpedia) or to items for my Reference list (for informational sites like Wikipedia).  So it seemed that Firefox and the Start Menu would pretty much capture the list of programs I was using. 

I decided to create the list in an Excel spreadsheet.  (I was using Excel 2003.)  I had columns for the name of the program, the version, and the serial number, if any.  I got a good start on this by copying and pasting the results of that Firefox about:support list.  But I had tons of stuff in my Start Menu.  I didn't want to copy all that information manually.  It seemed advisable to automate the process, if possible.

Next, I extracted relevant information from my customized Start Menu.  This could be done manually.  The following comments describe my attempts to automate that process somewhat.  I began by using Windows Explorer to visit the folder where my Start Menu was located.  I had moved my customized Start Menu to a drive other than drive C, so as to share it across my network and to back it up along with my other data; but as far as I could recall, the way to find the Start Menu folder in a more virgin version of Windows 7 would have been to right-click on the Start button and choose Open Windows Explorer.  Once I had the top level of the Start Menu, I went to the address bar in Windows Explorer and selected and copied its path.  Then I opened a command window (Start > Run > cmd) and typed two commands.  First, C: (or whatever the drive letter was, for where the Start Menu was located), and then "cd " followed by the pathname that I had just copied from Windows Explorer.  (To paste into a command window, I had to right-click on its top bar and then choose Edit > Paste.)  Since this pathname had spaces in it, I began and ended it with quotation marks.  Example:  cd "C:\Folder\Start Menu" and then Enter.  Now I ran a few commands.  Of course, I could save these in a batch file to simplify things in the future.  The commands were as follows:
dir *.lnk /s /b > "D:\Folder Name\SMProgs.txt"
dir *.exe /s /b >> "D:\Folder Name\SMProgs.txt"
These commands would fill SMProgs.txt with directory entries for every shortcut and executable file in my Start Menu folder.  Since the second command was almost identical to the first, the fast way to enter it was just to press the up-arrow and then use the left arrow to go back and add a second ">" symbol and change LNK to EXE.  (I chose the /s and /b options for the DIR command based on information obtained by typing "dir /?" and I was able to view the full printout of resulting information by highlighting the cmd window and pressing WinKey-LeftArrow to make the cmd window tall.)  These two commands created a file called SMProgs.txt.  I opened that file and copied and pasted its contents into an empty Excel spreadsheet.  I did search-and-replace operations to remove the .exe and .lnk extensions, and then ran a formula down an adjacent column to automatically detect exact duplicates.  (That is, sort by the column to be tested, and then use a formula like "If A1 = A2, put an X here, otherwise put nothing."  Of course, the results produced by such commands would then change if I sorted the Xs together, unless I first used an Edit-Copy, Edit-Paste Special-Values combo to convert the formulas into values.)  After deleting exact duplicates, I used a reverse-text function with FIND and MID commands to extract the filename and folder into separate columns.  I now realized that the preceding duplicate-detection step was probably unnecessary, as I now sorted on the filename column and deleted duplicates again.  So, for example, I now had only one entry for a file called Microsoft Excel.  But I still had more than 1,500 rows in the spreadsheet.  Further sorting, editing, and filtering gave me a list of about 450 actually installed programs.
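The spreadsheet steps described above (strip the extension, split each path into folder and filename, and keep one row per program name) can also be sketched in a few lines of Python.  This is only an illustration of the same idea, not what I actually ran; the function name unique_programs is made up for this example, and any sample paths fed to it would be hypothetical.

```python
import ntpath  # Windows-style path handling that works on any platform


def unique_programs(listing_lines):
    """Return one (name, folder) tuple per distinct program name.

    listing_lines is an iterable of full paths, such as the lines
    written by "dir *.lnk /s /b" and "dir *.exe /s /b".
    """
    seen = {}
    for line in listing_lines:
        path = line.strip()
        if not path:
            continue  # skip blank lines
        folder, filename = ntpath.split(path)
        name, ext = ntpath.splitext(filename)
        if ext.lower() not in (".lnk", ".exe"):
            continue  # only shortcuts and executables are of interest
        key = name.lower()  # compare names without regard to case
        if key not in seen:
            seen[key] = (name, folder)  # keep the first folder encountered
    return sorted(seen.values(), key=lambda item: item[0].lower())
```

Feeding it the lines of SMProgs.txt would produce a sorted list in which a program appearing in several Start Menu folders is listed only once.  As in the spreadsheet approach, a shortcut and an executable with the same base name are treated as the same program.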

The automated steps had helped somewhat.  I hoped the process would become faster if I did it again in subsequent years.  But from this point forward, it was a manual process.  Using that list of 450, I added spreadsheet columns to mark purchase dates for programs I had already purchased, to exclude those that I did not intend to pay for (e.g., free Microsoft utilities), and to indicate those that I had actually used, as distinct from those that I might have tried but didn't remember, or had installed because I thought I might need them someday.  In the resulting list of about 150 programs, I looked at the list of about a dozen that I had used but would probably not use anymore.  I barely remembered some of these programs, but a few had been really useful in Windows XP.

At this point, I had to decide what I owed.  I felt there was probably not much of an obligation to the people who had written programs that I had only used on a trial basis, though at least I could write reviews for the benefit of others who might use those programs, if I remembered enough to say something helpful.  So I started with that thought.  My reviews could be on sites like Softpedia or CNET (or Newegg or Amazon, for purchased programs), or perhaps a discussion here in a blog post would be appropriate.  I probably would not bother doing a writeup if there were already many reviews, especially since these programs were increasingly outmoded.  It occurred to me that it might also be helpful if I wrote reviews of purchased programs.  I decided to treat the question of hardware reviews separately.  So now I went back down my list of 150 programs I had used and, in a new spreadsheet column, marked those for which I had enough experience at least to write a brief comment or review.  The result was roughly 50-50:  I could say something about half of the programs, and not about the other half.  I looked at the latter and, not surprisingly, found that I felt no particular obligation to pay anything either.  These programs were generally on their way into my life, or out of my life, but had not yet been and might never be useful to me, aside from possibly a brief exploration at some point.  No doubt the list would change somewhat by the time another Thanksgiving rolled around; I planned to revisit them again next year.

So I focused on the 75 or 80 programs that I had used enough to write something about.  There was a question of what to write, and where to write it.  I had reviewed some commercial programs on various websites (e.g., Amazon, CNET), and had also written about my use of some programs in posts on this blog.  While any serious writeup would be better than no writeup, I preferred writing posts on my own blog, for several reasons.  One was that, here, if I added links to other sites, they would not be removed.  I could also describe a process, and the program's performance in it, in much greater detail than would be acceptable in a typically brief review on someone else's website.  Of course, I also appreciated the opportunity to build up my own site while I was discussing someone else's product.  There was the additional concern that writing on a website that might have more visitors (e.g., Amazon) could help to make it more appealing than another that I might actually consider better (e.g., Newegg).  In the past, I had sometimes posted reviews across a number of commercial and sharing websites (including e.g., TigerDirect, Major Geeks).  This could have the drawback of confusing potential users who might think that such websites were sharing reviews among themselves, seeing that they would encounter exactly the same review on multiple sites.  It could also appear that I was propagandizing.  And it could be time-consuming for me to post reviews on six to ten websites, when they requested not only a review but various stars, statements of pros and cons, bottom-line summaries, and so forth.  I decided, as I had decided previously, that the best approach, where possible, was to do a writeup on my own blog that would provide detail and information beyond what would be allowed on a commercial site, so that I could find it, link to it, and expand on it in the future.

Going down the list again, in another spreadsheet column, I marked off those programs that I had already reviewed or discussed in some detail (e.g., IrfanView) and those for which there did not seem to be much need for a review because they were already well known and I was not using them in any noteworthy way (e.g., Skype).  That cut the list in half again, down to about 40 programs that I hadn't yet reviewed and felt I probably should, for the benefit of the programmers and/or of the users.  I added a line to my WEEKLY.BAT file to open this spreadsheet, as a reminder to write something about one of these programs each week.  There would be weeks when I didn't do it, but I hoped that, come next Thanksgiving, I would have substantially reduced this particular obligation.

With the topic of reviews out of the way for the moment, I had to face the matter of money.  The first problem I tackled, in this area, was to decide which programs called for payment.  Among programs that I had used in the past but no longer used, some had been worth paying for.  I wasn't sure how to recall or reconstruct which programs those might be.  I decided to defer that question for the time being, so as to keep this project manageable, and focused strictly on those programs that I expected to continue to use, that I had not yet paid for, and that were of a type for which payment could reasonably be expected.  I excluded those for which I did not yet know enough to write much of a review, on the theory that, in those cases, I was still in something like a shareware trial period.  That is, it seemed unlikely that I would have bought these programs for which I did not have much present use.  Filtering the spreadsheet for these criteria yielded a list of about 50 programs.

Now, given this list of programs for which I should pay something, how much should I pay?  One answer would have been that I should buy the Pro version -- should upgrade from the freeware version, that is -- for those programs that offered that option.  Before reaching that conclusion, though, I decided that payment should ideally be on a sliding scale.  If I were rich, I would want to buy the company or support the individual that had done such good work, in hopes that they would do more of the same.  If I were well-off but not truly rich, I might think that I should pay five or ten times the asking price for the professional version, or maybe buy and distribute five or ten pro licenses, so as to make up for some others who had not yet gotten around to paying, or who couldn't afford it.  If I had only enough money to take care of myself but not enough to cover others, the answer might be to just buy the pro version, with one caveat:  I probably should pay more if the programmer priced it too low.  It vaguely seemed to me that the pro versions of programs I had bought in the past year or two had tended to be around $40, so I tentatively decided that my target payment + contribution for a pro version of a significant program should be in that range, if I could afford it.  In my experience, less significant shareware programs tended to cost around $10-20.  In the commercial market, of course, major programs could cost $100 or more.

Where money was tight, there would be an option of trying to pay or contribute a few dollars per program, so as to cover all programs at the same time, or instead singling out a few for more generous reimbursement.  This question was already decided, in the case of those programs where I needed the pro version and therefore just had to pay what they asked.  But for these ~50 programs, it was up to me.  I decided to start at the low end, with a target of $5 for relatively trivial and $10 or $15 for more significant Firefox add-ons -- for the ones, that is, that saved me money and/or time -- and I assigned these values to those programs in a Target Price column in my spreadsheet.  That accounted for a total of 19 rows in the spreadsheet and a value of $155.  Among the remaining programs, I decided that, on the other end of the spectrum, Firefox had been, for me, an incredibly valuable and complex program easily worth $100 to me, even though it humbly suggested contributions of $5 to $30 -- and that I would probably have had to pay $100 or more to get it, if its major competitors (i.e., Google Chrome and Microsoft Internet Explorer) weren't supported by mammoth corporations that apparently saw the browser as a way to control access to the Internet for profit.  Given that view, I probably wouldn't have paid more than around $20 for Opera, which I used only occasionally.  With thoughts like these, I proceeded to ascribe target values to each of the other ~50 programs on my list -- the question being, again, not what would be the lowest price I could get it for but, rather, what was it worth more realistically, considering such factors as my own need and encouraging software development.  Most of the other programs on my list wound up in the $10, $15, and $20 categories.  I would have to adjust those values if further investigation revealed that there were pro versions I didn't know about, at whatever price they might be selling.  
I had also kind of rushed through my estimate rather than focusing on each program.  But for purposes of rough estimation, taking account of everything from Firefox to its add-ons, I estimated a value of $750 for these ~50 programs, for an average of about $15 each.

I wasn't in a position to spend $750 on software right then.  I also didn't want to donate $20 to some program and then find out, later, that they had come out with a pro version or had gone to a for-profit model and were now going to charge me another $30.  I decided that this question of payment would be best completed item by item, as I revisited the spreadsheet on a weekly basis -- so that, hopefully, I would be caught up by the next Thanksgiving.  For now, I decided to start by paying for a few programs and add-ons that I had been using for years and for which I was most appreciative.  As I began to focus on those programs, I realized that this drawn-out, weekly approach would probably counteract a bit of stinginess that had set in as I was rushing through my valuation of those programs -- that, in other words, I would probably tend to bump the prices up closer to an average of $20 as I proceeded.  And so it seemed I was set for the coming year's worth of weekly returns to the spreadsheet, with writeups and contributions as circumstances warranted.

Wednesday, November 9, 2011

Documenting Computer Work with Screenshots and Duplicate Detectors

I was doing some work in Windows 7.  I wanted to log the changes periodically as I went along.  I had already developed a batch file, which I called Shotshooter.bat, to take screenshots and save them as .png files periodically.  (I think this will work in all versions of Windows.)

Having already done the NirCmd setup required to make that batch file work, I slightly revised it to read as follows.  (If the print is too small, use Ctrl-+ or copy and paste into Notepad.  Batch files are best not edited in a word processor like Word, since word processors sometimes change characters.)

:: Shotshooter.bat

:: Captures a series of screenshots.

:: See the NirSoft website for info on NirCmd.exe.

:: Takes two arguments:  how many shots, and how many milliseconds between shots.

:: Sample usage:  shotshooter 3600 1000
:: That example would take screenshots every second for an hour,
::   assuming the computer could work that fast.

:: To kill the program sooner, use Task Manager (Ctrl-Alt-Del) > Processes > nircmd.exe.
:: IrfanView provides a fast way to play back results.

:: Create Screenshots folder on drive D

md D:\Screenshots
:: Go to the drive and folder where you put NirCmd.exe


cd "\Start Menu\Programs\Tools\Programming and Scripting\NirCmd"
:: Run NirCmd with the desired settings

nircmd.exe loop %1 %2 savescreenshot D:\Screenshots\scr~$currdate.yyyy-MM-dd$-~$currtime.HH_mm_ss$.png

:: Go see what you've created in D:\Screenshots

start explorer.exe /e,"D:\Screenshots"

Having created Shotshooter.bat, now I wanted to run it.  I had already used Ultimate Windows Tweaker (UWT) to install a right-click option to open a command window in any folder I would select in Windows Explorer, so all I had to do was open a CMD window where I had saved Shotshooter.bat, and just type "shotshooter 1200 30000" and hit Enter.

(If I hadn't installed that option with UWT, I could also have gone to Start > Run > cmd and then navigated manually to the proper folder with change-drive (e.g., D:) and change-folder (e.g., cd "folder name") commands like those used in Shotshooter.bat, above.  Note that quotation marks are necessary if folder names contain spaces or possibly if they are too long.)

Those parameters of 1200 and 30000 meant that Shotshooter.bat would use NirCmd to take a screenshot every 30000 milliseconds (i.e., every 30 seconds), and would take a total of 1200 screenshots (i.e., would run for 10 hours).  It would save them in D:\Screenshots, and now I would have to decide what to do with them.
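The arithmetic behind those two parameters is easy to check.  The following is a minimal sketch; capture_duration is a name invented for this example, and the result ignores whatever time the computer needs to save each file.

```python
from datetime import timedelta


def capture_duration(shots: int, interval_ms: int) -> timedelta:
    """Approximate running time for a given shot count and interval."""
    return timedelta(milliseconds=shots * interval_ms)


# The two parameter pairs mentioned in this post
print(capture_duration(1200, 30000))  # 1200 shots, 30 seconds apart
print(capture_duration(3600, 1000))   # 3600 shots, 1 second apart
```

The same function makes it easy to work backwards, trying different shot counts until the duration covers the period to be logged.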

When I was writing up these notes, I didn't recall whether NirCmd could also produce JPGs or other image formats.  It probably could.  But after getting used to IrfanView, it probably would have been easier to just use IrfanView (File > Batch Conversion/Rename) to do a mass conversion of PNGs into JPGs if necessary.

The problem with a straightforward slideshow was that, if I allowed a few seconds for each screenshot, I could easily wind up with an hourlong show that would feature extended periods of no change.  On this particular day, I had gone to the store and out to lunch, so presumably nothing was happening during those periods.  The changes that I would want to see might pop up for only a few seconds in that hourlong show.

It seemed I had better get rid of the PNGs that merely repeated the same unchanging screenshot for long periods of time.  To do this, I tried a couple of approaches.  After making a backup of my Screenshots folder, I started with Awesome Duplicate Photo Finder (ADPF).  I adjusted its settings to examine PNGs and told it to search only the D:\Screenshots folder.  It felt that, out of my 1,200 screenshots, 1,160 were potential duplicates.  Closer examination revealed that, while many of those files were not what ADPF considered 100% identical, hundreds were.  Unfortunately, ADPF did not offer a way to bulk-delete the 100% matches.  I did not take the time-consuming approach of just letting ADPF guide me through the 580 pairs of images comprising those 1,160 alleged duplicates, making manual choices as to whether I should delete one of the two images it showed me.

I tried another approach.  Among the many free duplicate file finders, I had long used DoubleKiller.  (Exact Duplicate Finder gave the same results as one type of DoubleKiller comparison, but offered fewer comparison options.)  For some reason, a CRC and size comparison in DoubleKiller gave me only 161 duplicates.  I suspected DoubleKiller was being too precise.  Doing an unreliable size-only comparison, it still found only 340 duplicates.
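For readers who want to see what a size-plus-CRC comparison amounts to, here is a rough Python sketch.  It is not DoubleKiller's actual implementation, which I have not seen; the function names are made up for this example, and CRC32 is used only because it is the checksum mentioned above.  Two files are treated as exact duplicates when both their sizes and their CRC32 values match.

```python
import os
import zlib
from collections import defaultdict


def file_crc32(path, chunk_size=1 << 20):
    """Compute the CRC32 of a file, reading it in chunks."""
    crc = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc


def find_exact_duplicates(folder, extension=".png"):
    """Return the redundant files: all but the first of each identical group."""
    groups = defaultdict(list)
    for name in sorted(os.listdir(folder)):
        if not name.lower().endswith(extension):
            continue
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            groups[(os.path.getsize(path), file_crc32(path))].append(path)
    redundant = []
    for paths in groups.values():
        redundant.extend(paths[1:])  # keep the earliest file in each group
    return redundant
```

The function only reports files; it deletes nothing.  A matching size and CRC32 makes identical content very likely but not certain, so a cautious user would review the list, or work on a backup copy of the folder, before removing anything.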

Following the advice on a page that recommended five duplicate file detectors, I downloaded and installed Dup Detector.  It was not easy to understand, but with some tinkering I was able to get it to work.  The first time I ran it, I told it to search only for 100% matches.  (By default, it was set to search for "Dup if within 98.5% to 100% match.")  The thing that made it work was to go into its Options > "Automatic and Semi-auto delete setup," highlight the "Delete left image" criterion (the only criterion I needed to use in this case) and use the "Swap up" and "Swap down" buttons to make "No delete" come after "Delete left image."  Then, to make it run, I had to start with Get data > Build.  Also, because of the number of files, I thought I had better start with Find > "Find dups setup (method and restrictions)" set to find 9999 pairs.  I still wasn't sure, at the time when I was writing these remarks, whether that was a good number to put there.

When I ran Dup Detector to search for only 100% matches, it found that about half of the PNGs were duplicates.  I eyeballed some of them, using IrfanView to flip through them quickly with just a right-arrow keypress.  There did appear to be a lot of exact duplicates.  I ran an automatic delete to get rid of those dups.  Then I ran a DoubleKiller search for matches that had both identical sizes and identical CRC checksums.  DoubleKiller still turned up 79 pairs of duplicates.  Apparently there had been more than 9999 pairs, first time around.  (A group of n identical files contains n(n-1)/2 possible pairs, so a single unchanging screen captured about 142 times would by itself exceed 9999 pairs.)  To check this, I ran another Dup Detector search for 100% matches.  It found a bunch more.
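How quickly a pair limit fills up can be worked out with a short calculation.  The sketch below assumes that a duplicate finder counts each unordered pair once, so that n identical files produce n(n-1)/2 pairs; I do not know how Dup Detector counts internally.  If a tool counted each pair in both directions, n(n-1), then 101 identical files would already exceed 9999.  The function names are invented for this example.

```python
from math import comb


def pairs_in_group(n: int) -> int:
    """Number of unordered pairs among n mutually identical files."""
    return comb(n, 2)


def smallest_group_exceeding(limit: int) -> int:
    """Smallest group size whose pair count is greater than limit."""
    n = 2
    while comb(n, 2) <= limit:
        n += 1
    return n


print(pairs_in_group(100))             # pairs among 100 identical files
print(smallest_group_exceeding(9999))  # group size needed to pass 9999 pairs
```

With screenshots taken every 30 seconds, a screen left unchanged for a bit more than an hour would be enough to pass that threshold under the unordered-pair assumption.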

It seemed, then, that the best strategy would have been to run DoubleKiller first, so as to get rid of one item in each exact pair.  I did that now.  Then I ran Dup Detector, looking again only for 100% matches.  It found none.  I tried again, this time with a search for 99.9% to 100% matches.  Again, it found 9999 pairs.  Some looked identical, in the program's necessarily reduced and imperfect matchup screen, but the differences in others were visible -- more than 0.1% different, I would have thought.  I tried another search, this time adding a decimal point -- looking, that is, for 99.99% to 100.00% matches.

While that was underway, I ran another of the simple comparisons available in ADPF.  It said that, of the 851 pictures remaining (out of the original 1,200), it found 811 similar pictures.  In the bottom pane, I clicked twice on the Similarity column heading, so as to see what it considered the 100% matches first.  I couldn't tell any difference between the ones that I looked at.  I wasn't sure why ADPF had not considered them identical.

Before doing anything with that ADPF comparison, I went back to Dup Detector.  It had completed its 99.99% search.  It was still finding 9,999 pairs.  I had already noticed that, unlike ADPF, Dup Detector was not able to show the right portion (maybe one-sixth) of my widescreen screenshots.  I also noticed that the first line of the Dup Detector report, in the top left corner of the screen, said that it was comparing 99.9% (not 99.99%) matches.  So unless there was a bug in that report, apparently one decimal place was as precise as it got.

I preferred ADPF's visual comparison, so I went back to it.  In the bottom pane, it looked like about two-thirds of its similar pictures were at the 100% level.  That would apparently mean I would have to do hundreds of manual comparisons:  Picture 1 might match Picture 2, and also Pictures 3, 4, 5 . . . I went down to the 99% matches.  These, too, were identical, as far as I could tell.  I had noticed, in IrfanView, that the only thing that had changed since the previous screenshot, among some screenshots, was that the system clock, in the bottom right-hand corner of the screen, had moved ahead by one minute, so maybe that sort of thing kept ADPF from catching them at the 100% level.  ADPF wasn't showing the taskbar or other outer edges of the screenshots, so I couldn't tell for sure.

With hardly any exceptions, I found that even the 95% matches in ADPF were virtually identical.  The only difference that I could detect, in the 50% or less of matches where I did see a difference, was that a different window might be foregrounded -- that, in other words, its title bar would be a different color in one screenshot than in the other.  In other words, the ADPF matching levels seemed more realistic than those in Dup Detector:  this was the kind of difference that I would expect to be detected at the 96% level, well before the 99.9% level.  A 99.9% match, I felt, should involve no more than the tiniest flyspeck of difference.

For my purposes, I did not get regular, visible differences in matches, in ADPF, until I was down at the 91% level.  There were few matches at that level, so I decided to err on the safe side, manually deleting duplicates down to the 95% level.  But in the process, I stopped along the way to re-run Dup Detector.  It seemed I might be able to calibrate it against ADPF.  In other words, I first deleted all of the 100% matches in ADPF, and then ran Dup Detector.  There were about 360 of those 100% matches, or about 45% of the 811 similar pictures detected by ADPF.  Deleting them was pretty fast, once I got the keystroke combination worked out; it probably took 6-7 minutes.

Rerunning Dup Detector at the 99.9% setting, after deleting ADPF's 100% matches, still produced 9999 matches from the 482 screenshots remaining.  I expected it to produce matches that were extremely difficult to tell apart.  This was not the outcome I got.  There were a number of rather obvious (although still very minor) differences.  It seemed, at this point, that Dup Detector's supposed 99.9% match was not realistic and, for my purposes, not meaningful.  In general, it seemed that Dup Detector had been useful only for purposes of automated deletion of 100% matches, though possibly that function would have been served equally well by an easier DoubleKiller comparison in terms of file size and CRC.  It didn't look like Dup Detector had anything more to offer me at this point.

In the interests of automating future comparisons, I looked around for a free bulk CRC calculator, but ultimately got better results searching for an MD5sum program in CNET.  I thought about FSUM but finally went with the somewhat higher-ranked MD5summer.  Both would do batch work and yield text-file output, but FSUM was command-line.  MD5summer did give me the option of saving its calculated sums as a text file, which I then imported into Microsoft Excel.  Using text parsing functions (e.g., MID, LEFT, TRIM), I extracted the 32-character checksum, sorted, and used a formula to compare cells.  MD5summer had evidently identified only 322 duplicates in 161 pairs.  Possibly the reason the duplicates were found only in pairs was that I had run Shotshooter (above) in 30-second intervals:  there would be only two screenshots per minute, before the system clock changed, producing a screenshot with a different checksum.  That had probably been the case with the DoubleKiller CRC checksum results as well.
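The spreadsheet step (extract the checksum, sort, compare neighbors) could also be done with a short script.  The Python sketch below assumes that the saved text file uses md5sum-style lines, i.e., a 32-character hexadecimal checksum, some whitespace, an optional asterisk, and then the filename, with any other lines (such as comments) ignored.  That layout is an assumption on my part, as are the function name and the sample filenames.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Assumed line layout: <32 hex characters> <whitespace> [*]<filename>
CHECKSUM_LINE = re.compile(r"^([0-9a-fA-F]{32})\s+\*?(.+)$")


def duplicate_groups(listing: str) -> Dict[str, List[str]]:
    """Map each checksum that occurs more than once to its filenames."""
    by_sum: Dict[str, List[str]] = defaultdict(list)
    for raw in listing.splitlines():
        match = CHECKSUM_LINE.match(raw.strip())
        if match:  # lines that do not fit the assumed layout are skipped
            by_sum[match.group(1).lower()].append(match.group(2).strip())
    return {s: names for s, names in by_sum.items() if len(names) > 1}


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = "\n".join([
        "# comment line, ignored",
        "0123456789abcdef0123456789abcdef *shot_001.png",
        "0123456789abcdef0123456789abcdef *shot_002.png",
        "fedcba9876543210fedcba9876543210 *shot_003.png",
    ])
    for checksum, names in duplicate_groups(sample).items():
        print(checksum, names)
```

Counting the groups and the filenames inside them would give figures comparable to the "322 duplicates in 161 pairs" result, and groups larger than two would show up directly rather than as adjacent rows in a sorted column.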

One workaround would have been to see if I could conceal the system clock before starting, or I could batch-trim the PNGs in IrfanView (File > Batch Conversion/Rename > Advanced > Crop) before running the checksum, but in this case I wanted the clock to be visible on the output.  (I could also have used cropping, or could have drawn circles and arrows on my PNGs, using something like Photoshop, to narrow the focus to particularly interesting changes on the screen, so as to reduce the percentage match that ADPF or other duplicate detectors would calculate.)  I could have done the batch-trim with a copy of the snapshots backup folder, so as to produce a list of files to be deleted without harming the originals (or, in this case, the copies).  But what if the only thing that changed (in some future application) was a small item in the center (i.e., not at the edge) of the screenshot?  I could use the spreadsheet to identify not only the duplicate pairs but also the time periods during which every minute had a matching pair, so as to lead me toward large stretches of time when nothing would change, and then maybe a manual comparison in ADPF would be manageable for the rest.

So those were possibilities for future projects.  At present, having deleted the ADPF matches down to the 95% level (leaving a few near-duplicates where I could quickly see a difference), I continued on a bit further in ADPF, deleting some additional near-duplicates.  At the 91% level, almost every pair of near-duplicates contained visible differences, so I stopped there.  So I was done with ADPF.

I now had 442 fairly distinct screenshots, out of the original 1,200.  I would have had more if I hadn't abandoned the computer for several hours while Shotshooter was running.  I viewed a bunch of them in IrfanView, again using the right-arrow key to move quickly to the next.  I had already set IrfanView (Options > Properties > Browsing/Editing) to go to the next file after I deleted one -- or maybe it did that anyway, by default.  It now occurred to me that some of my ADPF work might have been faster, and that differences might have been easier to see, if I had just paged through the screenshots (or at least some of them) in IrfanView, using the Delete key (alternately, the X button, up by the menu bar) to delete apparent duplicates.  When I ran into a stretch where there seemed to be many duplicates, I stopped hitting Delete (in case, by deleting too fast, I would accidentally delete one after the screen did change) and instead just selected and deleted many at once in Windows Explorer.

Holding down the right-arrow key in IrfanView allowed me to page quickly through obviously similar or dissimilar screenshots.  There were still quite a few near-duplicates, requiring as many as 90 virtually duplicate screenshots to be deleted in one case.  It seemed that ADPF may have been fooled, not only by the system clock (which still seemed to be the only thing that was changing, in many cases), but also by relatively complex screenshots (e.g., showing photographic or Google Earth images rather than just text documents, spreadsheets, and Windows Explorer sessions).  That is, to my way of thinking, many of these images were 99% similar, but ADPF hadn't even considered them 91% similar, so possibly its comparison engine was miscalculating similarities in some conditions.

After these other steps, I took a final trip through the snapshots, in IrfanView, and deleted a few more that were very similar to the ones immediately preceding them.  I wound up with 207 snapshots, out of the original 1,200, that seemed to represent fairly well what I had been doing over a 10-hour period.  Again, the number could have been substantially larger -- maybe around 300-350 -- if I hadn't spent a few hours away from the computer.

Now I wanted to put these snapshots into some kind of slideshow.  I rarely made slideshows.  On a few of the screenshots, I decided to use Adobe Photoshop Elements to draw circles and lines to draw attention to changes, from one slide to the next, that might otherwise escape attention.  Then I figured I would use IrfanView (File > Slideshow > Save slideshow as EXE/SCR) to create a slideshow.  (I could also have used something like PowerPoint, except that apparently that would have required me to create 207 slides and then import a photo into each.)  But IrfanView wouldn't let me add circles and arrows or, as seemed increasingly appropriate, a voiceover, to explain what was going on.

I tried using both Adobe Premiere Elements and CyberLink PowerDirector, but neither of them wanted to let me export to a full widescreen format.  Saving it in a reduced format (e.g., 720x480) lost so much detail that it was hard to read what was being displayed.  They also created huge files.  These programs -- especially Premiere Elements -- were also pretty terrible at giving me a simple way of arranging slides.  I wound up just creating an IrfanView EXE slideshow in full-screen mode -- and you know what?  It was beautiful.  Visually, it was perfect.  It looked exactly like the regular computer screen from which I had created all those screenshots.  And it was only 91KB.  Tiny!  The only drawback was that it couldn't incorporate a voiceover and lines and arrows.  So the output side of this project was still in development.

Wednesday, November 2, 2011

Things to Do Before Killing Yourself

The bulk of this post has been relocated to my religion blog and revised there.

The part about the book by Professor Xavier Cortez, 100 Things To Do Before Killing Yourself In The Midst Of A Murderous Rampage, has been revised and converted into a review of that book on its Amazon site.

Sorry for any inconvenience.

-- RW, Dec. 12, 2012