
Saturday, February 25, 2012

Windows 7: Setting and Maintaining Accurate System Time

I wanted to keep two computers' clocks set the same, for file synchronization purposes, so that they would have an accurate sense of whether the version of File X on computer A was newer than the version of File X on computer B.  When installing Windows 7, I had added a copy of Judah Levine's portable NISTIME 32 in something of a rush, and had later vaguely recognized that it was not working right, or that I had not set it up right.  Now I decided to work out the kinks.

NISTIME-32BIT.EXE

I started with the National Institute of Standards and Technology (NIST), from which programs like NISTIME 32 would draw the current time.  It developed that NIST had a program called nistime-32bit.exe.  It turned out to be the same as NISTIME 32, just slightly updated.  The webpage's instructions were to start by going into File > Select Server and then Query Server > Now.  Somewhere I saw advice to choose a server near me.  I was tempted to choose two different ones, one for each computer, so as to have accurate time in case there was some terrible disruption of the national timekeeping system.  Then I realized that this could have the effect of making rivers run upstream, where my files were concerned:  if the two servers disagreed even slightly, my two computers' clocks would diverge, and a synchronization tool comparing timestamps could conclude that the older copy of a file was the newer one.  New could be replaced by old.  Being up-to-date on the latest developments in American chronology suddenly seemed less important than making sure I didn't accidentally overwrite today's crossword puzzle.

When I went to the Query Server > Now menu pick, I got a dialog indicating that NISTIME 32 was prepared to adjust my computer by 0.953 seconds.  I told it to go ahead.  I also went into Query Server > Periodically and told it to update the computer every 12 hours.  Query Server > Server Status confirmed these settings.  File > Help in Choosing Dirs told me to hit File > Save Config to save my settings.  This gave me "File Error:  Cannot open file to save configuration."  That problem may have been caused by nesting the program too deeply in a subfolder.  I moved it elsewhere and tried again.  Now it seemed to confirm that it had saved my settings, and it created NISTIMEW.CFG in the same folder as the program's portable executable (nistime-32bit.exe).  I exited and restarted, and it remembered what I had told it.  But I had to remember to hit File > Save Config; the program would not save settings on its own.

But then, when I did go into Query Server > Periodically, specified 12 hours, and hit File > Save Config and then File > Exit, I could not get it back.  The program refused to become visible.  I tried a couple of times, and then looked at Windows Task Manager (Start > Run > taskmgr.exe) > Processes tab.  Taskmgr showed four separate instances of "nistime-32bit.exe *32."  I selected them and clicked End Process, one by one, and then ran nistime-32bit.exe again.  It returned to taskmgr.exe, but not to the screen.  I minimized all windows, one by one, but, no, it was not lurking anywhere.  There didn't seem to be a taskbar or system tray icon for it.  It was here, and yet not here.  I killed the processes again, now that I had started one or two new ones.  I renamed NISTIMEW.CFG to be something else, and now it would start, and it saved new settings in a new NISTIMEW.CFG.  Apparently the config file had gotten corrupted.  I had originally created that file manually in lowercase (nistimew.cfg); possibly something about the program needed the uppercase filename.

But now, same thing again.  Exiting and restarting gave me a hidden program:  visible in Task Manager's Processes tab, but not visible onscreen.  When I right-clicked on nistime-32bit.exe *32 in Task Manager and selected Properties, I got an error:  "Windows cannot find [pathname] nistime-32bit.exe."  I ended the process again.  I created a shortcut to the .exe and tried starting it that way.  I had no reason to think that would make any difference, and in fact it didn't.  I tried moving all of the files from the folder where I had put nistime-32bit.exe, and placed them all instead in C:\Windows, with a shortcut to the executable in my Start Menu.  That wasn't the answer; I still got lurking program sessions that appeared in Task Manager but nowhere else.  I deleted the CFG again and tried again.  Now it ran.  I went directly to File > Save Config without making any changes.  It indicated that it had saved the config file.  I exited and restarted the program.  It ran.

Now I saw something that may have explained the config file problem.  The server list had changed.  The Colorado server that I had selected previously was no longer listed in File > Select Server.  I had previously gone into File > Update Server List, and that had generated a message:  "New server file is C:\Windows\NIST-SRV.LST."  It did that again now, when I designated a new server.  I hit File > Save Config and then File > Exit, and then restarted the program.  Now it was running normally.  I moved the three files (the exe, cfg, and nist-srv.lst files) from C:\Windows back to the folder where I really preferred to have them.  It seemed that the server list had not properly updated when the files were in that folder originally.  I restarted and went through the same steps -- update server list, choose a new server, save config -- and now I was exiting and restarting without a problem.

But no, I spoke too soon.  When I restarted, saved a 12-hour periodic refresh, and exited, it would not restart.  Deleting the config and moving the other files back to C:\Windows did not fix it.  The problem seemed to relate specifically to the attempt to set up recurrent time checks.  I was doing something wrong, or perhaps the program had a bug, or maybe it was not suited for 64-bit Win7.  I went to the NIST webpage cited in the program's Help > More Help and sent an email via the Webmaster link at the bottom of that page, pointing them here.

The Built-In Windows Time Sync Option

I decided to look for an alternative time-updating program.  I ran a search and discovered that there was apparently some kind of automatic time-updating arrangement built into Windows.  The advice there was, however, that "The W32Time service is not a full-featured NTP solution that meets time-sensitive application needs."  That squared with my observation that my two computers' clocks tended to drift apart.  I had not tried to see how far apart they could get, or how long they could stay that way.  I did see an indication somewhere that Windows defaulted to a weekly time update, so maybe it would verify that it was accurate to within a minute, or something, every week or so.

That appeared to be steered by Control Panel > Date and Time.  That dialog could also be opened by right-clicking the clock in the system tray and choosing Adjust Date/Time.  Or, as I now learned from Eric Phelps, it could also be run from the command line via "rundll32.exe shell32.dll,Control_RunDLL timedate.cpl."  The latter would make it easy to open the Date and Time dialog for manual adjustment via, say, a batch file that would open it automatically (to the correct tab) every day, week, or whatever.  (Later, I found a How-To Geek webpage that said I could just run "w32tm /resync" as administrator to resynchronize the clock without even going into the Date and Time dialog.  That, too, could be incorporated into a scheduled batch file.)
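
Putting those two pieces together, a minimal batch sketch (run from an administrator command prompt) would be as follows:
rem Resynchronize the clock against the configured Internet time server.
w32tm /resync
rem Then open the Date and Time dialog to verify the result.
rundll32.exe shell32.dll,Control_RunDLL timedate.cpl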

The Date and Time dialog > Internet Time tab > Change Settings option gave me a choice of synchronizing with time.nist.gov, which I understood to be the most accurate (though others in that list, not counting time.windows.com, appeared to be cousins of NIST).  I noticed that the dialog told me, here, that "This computer is set to automatically synchronize on a scheduled basis."  The previous sync site, as I saw on the other computer, was time.windows.com.  I wasn't sure how synchronizing with that site could have left my two computers with different times -- differing by seconds, that is, not by minutes -- unless maybe time.windows.com was just not that worried about the seconds.  Or maybe it was trying to synchronize when my router was doing its daily self-restart, and was therefore not getting access to the online clock?  I wasn't sure.  (Note:  Fouzan said that this whole process wouldn't work if the computer was on a domain.)

Curious about the timing, I went into Start > Run > taskschd.msc > Task Scheduler Library.  There were maybe 15 items in the list, and none of them were obvious time sync tasks.  So another possibility was that some bug or tweak, brought into my system somewhere along the line, was preventing the creation or execution of the scheduling function.  Another emerging possibility was that, as stated in a How-To Geek webpage, time.windows.com (which my systems had been using by default) had "a ton of problems with uptime."  So possibly I had already fixed my problem, just by switching the machines to use time.nist.gov in the Date and Time dialog.  (I did notice, as soon as I made that switch and clicked the update button, that both computers' clocks showed exactly the same time.)

Other Possibilities

I ran another search and found a Gizmo recommendation for Dimension 4 as a time correction utility.  It occurred to me, at this point, that possibly I had fixed my problem, just by switching away from time.windows.com (above), and that maybe I should just let things slide for a week or two.  I decided mostly just to record some notes, here, for possible future reference.  So instead of installing Dimension 4, I just dragged the icon for its webpage from my browser's Address bar over to the Time subfolder in my customized Start Menu.  If I ever needed it, I could follow the link at that time.

There also appeared to be more to know than I had realized, regarding Task Scheduler (taskschd.msc).  In Task Scheduler's left-hand pane, I went down the tree into Task Scheduler Library > Microsoft > Windows > Time Synchronization.  Now I saw that my machine was indeed set to synchronize time at 1 AM every Sunday.  I saw advice from Tina Sieber on a way to adjust and improve the scheduling via Task Scheduler.  Tina seemed to believe, however, that using a separate program might be the simpler and more accurate approach.  Tina pointed toward two other programs, Atomic Clock Sync and AtomTime.  The webpage for the latter seemed very old.  I was not sure how it would fare in a 64-bit Windows 7 world.
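
Presumably the same task could also be checked or triggered from the command line with schtasks.  A sketch, assuming the task path matches the Task Scheduler tree just described:
rem Show the scheduled time sync task's triggers and status.
schtasks /query /tn "\Microsoft\Windows\Time Synchronization\SynchronizeTime" /v /fo list
rem Run the task on demand, rather than waiting for Sunday at 1 AM.
schtasks /run /tn "\Microsoft\Windows\Time Synchronization\SynchronizeTime"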

For now, the solution seemed to be simply to go into the system's clock and change its time source to NIST.  My monthly batch file brought up the NIST/USNO timepage on the first of every month, so I could observe, later, whether my two computers were again diverging from one another and/or from the time on that webpage.  If they didn't stay in line, I would have two options.  One would be to add one of the foregoing command lines to my daily or weekly batch files, to permit manual and/or automatic checking and/or resynchronization.  Another would be to try one of the several freeware utilities just mentioned, particularly Dimension 4 or Atomic Clock Sync.
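
That monthly reminder amounted to a browser launch from the batch file.  Something like the following line would do it, with time.gov assumed to be the NIST/USNO page in question:
rem Open the NIST/USNO time page for a manual eyeball check.
start "" "http://www.time.gov"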

Thursday, November 24, 2011

Freeware: A Thanksgiving Tradition: First Cut

Summary

I decided that Thanksgiving would be a good time to revisit, annually, the question of what freeware I was using, and what an appropriate contribution would be. I was continuing to develop my customized Start Menu as a repository of links to all of the websites, installed programs, and portables that I used. So a search of the Start Menu, plus a list of Firefox add-ons, seemed to give me a substantial if not complete list of freeware programs for which some contribution might be appropriate. I developed that list in a spreadsheet.  At least for the time being, I excluded some programs (e.g., those that I had used previously but wasn't using anymore; those provided by corporations like Google and Microsoft), so as to focus on the ones that seemed most currently entitled to compensation.  I decided on appropriate values for each program, and also decided to do writeups or reviews.  There were a number of them, so I set up my computer to reopen the spreadsheet weekly as a reminder.  I hoped to be caught up by the next Thanksgiving.

Discussion

My computer, like many, was running a variety of free and paid-for programs.  The motives behind the free programs seemed to vary.  Some programmers evidently hoped their creations would become famous, at which point they could begin selling the software rather than giving it away.  Some supported their work via advertisements.  Some wanted to help others; some just shared a tool that they had invented to address their own needs.

Whether the inventor asked for payment or not, it seemed only fair to pay them something for their work.  There were, however, some problems with that thought.  One was that paying them would cost money.  Most of us, at one time or another, have been tempted not to pay even when we could easily afford it.  In a less piggish vein, there was also the reality that many of us could not afford to pay a fair price for all of the many free tools that a computer system might be running.  We might instead be inclined not to use them, with inferior results for everyone concerned.

A related problem was that it was not clear how much to pay.  Few freeware writers seemed guilty of asking too much.  To the contrary, even the developers of incredibly useful programs tended to ask far less than their programs were worth.  Maybe they were humble, or were underselling themselves; maybe they didn't want to appear too demanding or ridiculous.  For whatever reason, it appeared that freeware compensation provided on an honor-system basis would preferably draw upon an estimation of each program's comparative value, regardless of what the programmer proposed to charge for it.

There was another side to that question of how much to pay.  If I wanted to use Microsoft PowerPoint, I would have to buy a copy.  Depending on Microsoft's internal decisions, I might be able to buy PowerPoint by itself, or I might have to buy a copy of the entire Office suite.  This would be true regardless of whether I wound up using PowerPoint all day, every day, or actually only had a one-time need for it.  In the for-profit market, this issue tended to be worked out on the macro level -- Microsoft's profits depended on charging a balanced price across a large number of potential purchasers -- but not on the individual level.  That is, I would pay the same price as someone whose usage was very different from mine.  But in the honor-system freeware world, I could choose whatever payment plan made the most sense.  I could buy it outright, or set aside money on a per-use basis, or pay an annual license-like fee, or adopt some other basis, as I chose.

Over the past several years, I had written up a couple of blog posts on the question of how to calculate how much I had used various pieces of freeware.  There didn't yet seem to be a widely used system that would help me in this.  I had eventually decided that maybe this would be something to deal with once a year, during the Christmas holiday season, but that didn't work out.  That season tended to be busy, and it also wasn't usually overflowing with spare cash.  So then I came to the idea of pinning this inquiry to Thanksgiving instead.  As I thought about it, that actually seemed like a better connection.  Freeware was a gift, to be sure; but it was a gift to be thankful for.  And if I made it an annual thing, it could boil down to a couple of relatively simple questions:  how thankful am I, based on my past year's usage, and how do I express that?  The answer to the latter question could range from gratitude to cash payment, depending on the situation; I would have to work that out.

For starters, I decided that it would be OK to do this calculation just once a year.  Yes, there would be programs that I had used during the year but had then discarded, and I might forget or unintentionally minimize their importance to me as of Thanksgiving.  But I didn't think that would be a major problem, and I also felt it would be unwise to try to do it more frequently.  An annual tradition could become something to be proud of; but an expectation that I would do this every month could convert the whole thing into a chore.

The next step, I thought, would be to figure out what I was using.  In my case, there seemed to be two principal locations for freeware:  Firefox add-ons and my Start Menu.  The Firefox part was easy enough:  I could just go into Firefox Tools > Add-ons for a list of the extensions and themes in use.  Alternately, as someone advised, I could type "about:support" in the Firefox address bar to get a printable report.  The Start Menu was also easy enough to see:  I could just go to the Windows 7 Start button and write down all of the programs visible there.  My Start Menu was an especially concentrated location for the programs that I would use because I had customized it to include not only links to installed programs but also the complete program folders for portables.  I also had a project underway to convert my Firefox bookmarks to links in the Start Menu (for websites that I considered tools, such as Softpedia) or to items for my Reference list (for informational sites like Wikipedia).  So it seemed that Firefox and the Start Menu would pretty much capture the list of programs I was using. 

I decided to create the list in an Excel spreadsheet.  (I was using Excel 2003.)  I had columns for the name of the program, the version, and the serial number, if any.  I got a good start on this by copying and pasting the results of that Firefox about:support list.  But I had tons of stuff in my Start Menu.  I didn't want to copy all that information manually.  It seemed advisable to automate the process, if possible.

Next, I extracted relevant information from my customized Start Menu.  This could be done manually.  The following comments describe my attempts to automate that process somewhat.  I began by using Windows Explorer to visit the folder where my Start Menu was located.  I had moved my customized Start Menu to a drive other than drive C, so as to share it across my network and to back it up along with my other data; but as far as I could recall, the way to find the Start Menu folder in a more virgin version of Windows 7 would have been to right-click on the Start button and choose Open Windows Explorer.  Once I had the top level of the Start Menu, I went to the address bar in Windows Explorer and selected and copied its path.  Then I opened a command window (Start > Run > cmd) and typed two commands.  First, C: (or whatever the drive letter was, for where the Start Menu was located), and then "cd " followed by the pathname that I had just copied from Windows Explorer.  (To paste into a command window, I had to right-click on its top bar and then choose Edit > Paste.)  Since this pathname had spaces in it, I began and ended it with quotation marks.  Example:  cd "C:\Folder\Start Menu" and then Enter.  Now I ran a few commands.  Of course, I could save these in a batch file to simplify things in the future.  The commands were as follows:
dir *.lnk /s /b > "D:\Folder Name\SMProgs.txt"
dir *.exe /s /b >> "D:\Folder Name\SMProgs.txt"
These commands would fill SMProgs.txt with directory entries for every shortcut and executable file in my Start Menu folder (/s recurses through subfolders; /b produces bare pathnames without headers).  Since the second command was almost identical to the first, the fast way to enter it was just to press the up-arrow and then use the left arrow to go back, add a second ">" symbol, and change LNK to EXE.  (I chose the /s and /b options for the DIR command based on information obtained by typing "dir /?", and I was able to view the full printout of resulting information by highlighting the cmd window and pressing WinKey-LeftArrow to make the cmd window tall.)  I opened SMProgs.txt and copied and pasted its contents into an empty Excel spreadsheet.  I did search-and-replace operations to remove the .exe and .lnk extensions, and then ran a formula down an adjacent column to automatically detect exact duplicates.  (That is, sort by the column to be tested, and then use a formula like =IF(A2=A1,"X","") to mark each row that matched the one above it.  Of course, the results produced by such formulas would change if I then sorted the Xs together, unless I first used an Edit-Copy, Edit-Paste Special-Values combo to convert the formulas into values.)  After deleting exact duplicates, I used a reverse-text function with FIND and MID commands to extract the filename and folder into separate columns.  I now realized that the preceding duplicate-detection step was probably unnecessary, as I now sorted on the filename column and deleted duplicates again.  So, for example, I now had only one entry for a file called Microsoft Excel.  But I still had more than 1,500 rows in the spreadsheet.  Further sorting, editing, and filtering gave me a list of about 450 actually installed programs.
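
As mentioned above, these steps could be saved in a batch file to simplify things in the future.  A minimal sketch, reusing the placeholder paths from the commands shown above:
rem SMProgs.bat -- list every shortcut and executable under the Start Menu.
rem cd /d changes drive and directory in a single step.
cd /d "C:\Folder\Start Menu"
dir *.lnk /s /b > "D:\Folder Name\SMProgs.txt"
dir *.exe /s /b >> "D:\Folder Name\SMProgs.txt"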

The automated steps had helped somewhat.  I hoped the process would become faster if I did it again in subsequent years.  But from this point forward, it was a manual process.  Using that list of 450, I added spreadsheet columns to mark purchase dates for programs I had already purchased, to exclude those that I did not intend to pay for (e.g., free Microsoft utilities), and to indicate those that I had actually used, as distinct from those that I might have tried but didn't remember, or had installed because I thought I might need them someday.  In the resulting list of about 150 programs, I looked at the list of about a dozen that I had used but would probably not use anymore.  I barely remembered some of these programs, but a few had been really useful in Windows XP.

At this point, I had to decide what I owed.  I felt there was probably not much of an obligation to the people who had written programs that I had only used on a trial basis, though at least I could write reviews for the benefit of others who might use those programs, if I remembered enough to say something helpful.  So I started with that thought.  My reviews could be on sites like Softpedia or CNET (or Newegg or Amazon, for purchased programs), or perhaps a discussion here in a blog post would be appropriate.  I probably would not bother doing a writeup if there were already many reviews, especially since these programs were increasingly outmoded.  It occurred to me that it might also be helpful if I wrote reviews of purchased programs.  I decided to treat the question of hardware reviews separately.  So now I went back down my list of 150 programs I had used and, in a new spreadsheet column, marked those for which I had enough experience at least to write a brief comment or review.  The result was roughly 50-50:  I could say something about half of the programs, and not about the other half.  I looked at the latter and, not surprisingly, found that I felt no particular obligation to pay anything either.  These programs were generally on their way into my life, or out of my life, but had not yet been and might never be useful to me, aside from possibly a brief exploration at some point.  No doubt the list would change somewhat by the time another Thanksgiving rolled around; I planned to revisit them again next year.

So I focused on the 75 or 80 programs that I had used enough to write something about.  There was a question of what to write, and where to write it.  I had reviewed some commercial programs on various websites (e.g., Amazon, CNET), and had also written about my use of some programs in posts on this blog.  While any serious writeup would be better than no writeup, I preferred writing posts on my own blog, for several reasons.  One was that, here, if I added links to other sites, they would not be removed.  I could also describe a process, and the program's performance in it, in much greater detail than would be acceptable in a typically brief review on someone else's website.  Of course, I also appreciated the opportunity to build up my own site while I was discussing someone else's product.  There was the additional concern that posting reviews on a website with more visitors (e.g., Amazon) could help make that site more appealing than another site that I might actually consider better (e.g., Newegg).  In the past, I had sometimes posted reviews across a number of commercial and sharing websites (e.g., TigerDirect, Major Geeks).  This had the drawback of potentially confusing users who, encountering exactly the same review on multiple sites, might think that those websites were sharing reviews among themselves.  It could also appear that I was propagandizing.  And it could be time-consuming for me to post reviews on six to ten websites, when they requested not only a review but also star ratings, statements of pros and cons, bottom-line summaries, and so forth.  I decided, as I had decided previously, that the best approach, where possible, was to do a writeup on my own blog that would provide detail and information beyond what would be allowed on a commercial site, so that I could find it, link to it, and expand on it in the future.

Going down the list again, in another spreadsheet column, I marked off those programs that I had already reviewed or discussed in some detail (e.g., IrfanView) and those for which there did not seem to be much need for a review because they were already well known and I was not using them in any noteworthy way (e.g., Skype).  That cut the list in half again, down to about 40 programs that I hadn't yet reviewed and felt I probably should, for the benefit of the programmers and/or of the users.  I added a line to my WEEKLY.BAT file to open this spreadsheet, as a reminder to write something about one of these programs each week.  There would be weeks when I didn't do it, but I hoped that, come next Thanksgiving, I would have substantially reduced this particular obligation.
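
The line added to WEEKLY.BAT was nothing fancy -- something like this, with a hypothetical path and filename standing in for the actual spreadsheet:
rem Reopen the freeware spreadsheet as a weekly reminder.
start "" "D:\Data\Freeware Contributions.xls"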

With the topic of reviews out of the way for the moment, I had to face the matter of money.  The first problem I tackled, in this area, was to decide which programs called for payment.  Among programs that I had used in the past but no longer used, some had been worth paying for.  I wasn't sure how to recall or reconstruct which programs those might be.  I decided to defer that question for the time being, so as to keep this project manageable, and focused strictly on those programs that I expected to continue to use, that I had not yet paid for, and that were of a type for which payment could reasonably be expected.  I excluded those for which I did not yet know enough to write much of a review, on the theory that, in those cases, I was still in something like a shareware trial period.  That is, it seemed unlikely that I would have bought these programs for which I did not have much present use.  Filtering the spreadsheet for these criteria yielded a list of about 50 programs.

Now, given this list of programs for which I should pay something, how much should I pay?  One answer would have been that I should buy the Pro version -- should upgrade from the freeware version, that is -- for those programs that offered that option.  Before reaching that conclusion, though, I decided that payment should ideally be on a sliding scale.  If I were rich, I would want to buy the company or support the individual that had done such good work, in hopes that they would do more of the same.  If I were well-off but not truly rich, I might think that I should pay five or ten times the asking price for the professional version, or maybe buy and distribute five or ten pro licenses, so as to make up for some others who had not yet gotten around to paying, or who couldn't afford it.  If I had only enough money to take care of myself but not enough to cover others, the answer might be to just buy the pro version, with one caveat:  I probably should pay more if the programmer priced it too low.  It vaguely seemed to me that the pro versions of programs I had bought in the past year or two had tended to be around $40, so I tentatively decided that my target payment + contribution for a pro version of a significant program should be in that range, if I could afford it.  In my experience, less significant shareware programs tended to cost around $10-20.  In the commercial market, of course, major programs could cost $100 or more.

Where money was tight, there would be an option of trying to pay or contribute a few dollars per program, so as to cover all programs at the same time, or instead singling out a few for more generous reimbursement.  This question was already decided, in the case of those programs where I needed the pro version and therefore just had to pay what they asked.  But for these ~50 programs, it was up to me.  I decided to start at the low end, with a target of $5 for relatively trivial and $10 or $15 for more significant Firefox add-ons -- for the ones, that is, that saved me money and/or time -- and I assigned these values to those programs in a Target Price column in my spreadsheet.  That accounted for a total of 19 rows in the spreadsheet and a value of $155.  Among the remaining programs, I decided that, on the other end of the spectrum, Firefox had been, for me, an incredibly valuable and complex program easily worth $100 to me, even though it humbly suggested contributions of $5 to $30 -- and that I would probably have had to pay $100 or more to get it, if its major competitors (i.e., Google Chrome and Microsoft Internet Explorer) weren't supported by mammoth corporations that apparently saw the browser as a way to control access to the Internet for profit.  Given that view, I probably wouldn't have paid more than around $20 for Opera, which I used only occasionally.  With thoughts like these, I proceeded to ascribe target values to each of the other ~50 programs on my list -- the question being, again, not what would be the lowest price I could get it for but, rather, what was it worth more realistically, considering such factors as my own need and encouraging software development.  Most of the other programs on my list wound up in the $10, $15, and $20 categories.  I would have to adjust those values if further investigation revealed that there were pro versions I didn't know about, at whatever price they might be selling.  I had also kind of rushed through my estimate rather than focusing on each program.  But for purposes of rough estimation, taking account of everything from Firefox to its add-ons, I estimated a value of $750 for these ~50 programs, for an average of about $15 each.

I wasn't in a position to spend $750 on software right then.  I also didn't want to donate $20 to some program and then find out, later, that they had come out with a pro version or had gone to a for-profit model and were now going to charge me another $30.  I decided that this question of payment would be best completed item by item, as I revisited the spreadsheet on a weekly basis -- so that, hopefully, I would be caught up by the next Thanksgiving.  For now, I decided to start by paying for a few programs and add-ons that I had been using for years and for which I was most appreciative.  As I began to focus on those programs, I realized that this drawn-out, weekly approach would probably counteract a bit of stinginess that had set in as I was rushing through my valuation of those programs -- that, in other words, I would probably tend to bump the prices up closer to an average of $20 as I proceeded.  And so it seemed I was set for the coming year's worth of weekly returns to the spreadsheet, with writeups and contributions as circumstances warranted.

Thursday, April 21, 2011

Repairing Damaged JPGs

I was using Windows 7.  I had run a test and had determined that I had a bunch of damaged JPG image files.  Apparently this could happen sometimes when files were saved on a CD or other drive with an iffy file table.  In my case, it did not help to try to open the files in question on a different computer.  It was also not a case of recovering data from a damaged memory card, for which a tool like ZAR digital image recovery might be needed.  This was a situation of already having the files on the hard drive, but not being able to view them.  So:  how to repair them?

One possibility was to buy PixRecovery for $50.  They had a demo version, so I downloaded and tried that.  I had used Bulk Rename Utility to rename the corrupted JPGs so that their names ended with " - corrupted" and lacked a JPG extension, so that I could pull them back out if they got mixed in with good JPGs.  But unlike IrfanView, PixRecovery was not able to detect them until they did have a JPG extension, so I had to rename them back again.  PixRecovery did not appear to be able to process the JPGs in bulk; I would have to fix them one at a time.  On the first one I tried, I got "No data to recover detected."  They did give me an option to "order a paid file review" for $199 per file.  I tried another file.  The program didn't remember the output directory I had just specified for the first one, so I had to trace back through the directory structure to find it again.  This time, I got a message, "Recovered with demo restrictions."  It didn't show me the actual picture, though, even with a watermark or stamp on it; it just showed me a JPG saying, "The image has been repaired in demo mode by PixRecovery."  So I couldn't verify that the picture was fully restored; I would just have to take their word for it until I paid and tried it.  PixRecovery also gave me a Corrupted Data Analysis Report for the "restored" photo, with a statement of recoverability (i.e., Low, Average, or Good).  This seemed like something they could have provided on a batch basis for all files -- at least in the paid version, if not in the free -- so that the user would not have to go through the manual steps for each photo regardless of recoverability.

Among sites offering to provide a file examination for a fee, VG Jpeg-Repair offered an online service that would evaluate up to 100MB of JPGs for 1 Euro.  Alternately, it sounded like the user could pay them 20 Euros and get an evaluation of an entire set of JPGs, and then pay around $1 per JPG for the ones that they could repair.  I didn't investigate this too closely at this point; this just seemed to be the general idea.  I recalled seeing other pay-per-file sites, but didn't look into those either at this stage.  An eHow article pointed toward several other data recovery services, including Ontrack Data Recovery, Total Recall Data Recovery and ACE Data Group.  Previous exposure and brief examination of these sites suggested that they were more oriented toward recovering data from damaged drives, though no doubt they could recover photos too -- but that they could be very expensive. 

I found a review of Jpeg Repair Picture Doctor ($100) that made it sound like software to avoid, in the sense that it could trash good photos and pretend that it had restored bad ones.  On the other hand, it apparently had a batch process and a trial phase, so if there was a good backup, it seemed like a way of possibly reducing the number of corrupted JPGs to restore.  Another review said that the best JPG repair option was to use PhotoRescue.  It looked like it was for recovering lost data from drives, not for JPG repair.  They offered Wizard 3.1 for everyone ($29), Expert 2.1 for power users ($29), and Advanced 2.1 for ultimate experts ($99).  I tried the Expert 2.1 demo.  It was indeed oriented toward recovering drives.  I couldn't figure out how to use it for fixing JPGs.

Another program, JPEG Recovery Pro ($50), seemed to be offering a 15-day trial that would at least show me low-quality watermarked copies of the photos after recovery.  They also had a Basic version ($40), but it seemed to lack some features that would be useful when editing numerous JPGs.  I downloaded and tried their Pro version.  When I ran it, I got an error:  "Access violation at address 00846DC1 in module 'JPEGRec5.exe."  Possibly it was due to the fact that I installed without first shutting down all other programs.  I uninstalled and tried again.  That wasn't it.  I tried on a different folder.  It worked.  Apparently the file and folder name combination was too long.  I moved the folder to a higher-level location, so that the full pathname would be shorter, and tried again.  Nope.  I removed spaces from the folder name.  No.  I made the folder name shorter than 9 characters.  No.  I removed the files from that folder to another high-level folder with an eight-character name.  Still got the error.  I put a copy of one of the corrupted JPGs in a different folder.  The program ran on that folder, in the sense of detecting several JPGs there, but it did not detect this particular JPG.  So, hmm, this program was a possibility, but I'd have to tinker with it to make it work.

There may have been other possibilities.  I did not fully explore the results of my search.  But at this point it did start to seem that, if I wanted to download a program and do my own file recovery, it would have to be a manual, one-by-one recovery process, whether using PixRecovery or some other program.  I ran across references to JPEGsnoop and other programs that likewise seemed to require the user to do bit-by-bit editing of the JPG file in ways that were sometimes described as difficult.  It appeared that JPEGsnoop might provide a relatively easy way to locate where the errors were.

I looked at the corrupted JPGs in IrfanView again.  For the ones I looked at, the error was the same:  "Can't read file header!  Unknown file format or file not found!"  I did a search and found people going through various struggles.  One suggestion was to try opening the corrupted JPGs in a different image editing program.  IrfanView was often recommended, but these files had already failed there, so now I tried Gimp.  It allowed me to select and try opening all of the corrupted JPGs.  None opened, so Gimp did not seem superior to IrfanView in this task.  Gimp did, however, produce more detailed error messages.  It showed only the first several onscreen and then redirected the rest to stderr.  I wasn't sure where that was.  The several that did appear onscreen indicated that several files had similar problems:  "starts with 0x31 0xd9" or "starts with 0xaa 0xe9."  These seemed to mean I would have to edit the files manually to correct those starting errors.  Microsoft said that stderr meant the command prompt window.  It seemed I could capture the full log of errors by starting Gimp from the command line and redirecting that output to a text file.  Right-clicking > Properties on the Gimp icon that I normally clicked to run the program told me where the Gimp .exe file was.  I opened a command window there and typed the needed command.  I was running the portable version of Gimp, so in my case the command was:

start "" "W:\Start Menu\Programs\Multimedia\Images\Editors\GIMPPortable\GimpPortable.exe" > D:\GimpLog.txt
So then, when Gimp started, I tried again to open all those JPGs.  When Gimp was done trying and failing, I opened D:\GimpLog.txt.  Unfortunately, there was nothing in it, so apparently I hadn't done that quite right.
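
In hindsight, two things about that command would explain the empty log.  The "start" command launches the program as a separate process and returns immediately, so the redirection applied to start's own (empty) output rather than to Gimp's; and ">" captures only standard output, while the error messages were going to stderr.  Dropping "start" and redirecting stream 2 should have collected them (same portable path as above):
rem 2> redirects stderr, where Gimp was sending its error messages.
"W:\Start Menu\Programs\Multimedia\Images\Editors\GIMPPortable\GimpPortable.exe" 2> D:\GimpLog.txt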

I still didn't have a plan for what I would do if I did find out exactly what the errors were, so I paused the error output project to think about that.  I decided that these were old files, and there was really no urgency to this project.  There was always the possibility that, in the next year or however long it would be until I would get back to it, someone would come along with a cheap or free program or other great solution that would really take care of it, without all that manual editing.  Therefore, I shelved this project for the time being.

Tuesday, March 29, 2011

Windows 7: Mouse Pointer Spotlight

I was using Windows 7.  I noticed that, in some online demo videos, people were able to have their mouse cursor highlighted, as if it were a flashlight.  I wanted to be able to do that.  I ran a search to find a freeware option.  I had to filter this search to find sites that seemed to be helpful.  I found there were some shareware options, including PointerFocus, MouseLight, and SpotOnTheMouse.  There were also some freeware options that looked interesting but didn't give me quite what I wanted, including the freeware MouseShade and Sonar.

At some point, I became aware of the obvious, which was that I could probably just tinker with the mouse cursor, or maybe install a different kind of cursor, making it visible in the same way that a spotlight effect would.  Built-in options in Windows 7 (Control Panel > Mouse, Pointers tab and Pointer Options tab) included choosing a larger cursor and adding tails.  To expand on that, I ran a search, glanced at a webpage offering a free download of 7500 various kinds of mouse pointers, and then went to a Microsoft webpage explaining how to change the mouse pointer's appearance.  The basic idea here was that, in that same Control Panel > Mouse > Pointers tab, I could click Browse and see a boatload of cursor (.cur) and animated cursor (.ani) files that existed in C:\Windows\Cursors.
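
Incidentally, that same Mouse Properties dialog could also be opened directly from Start > Run or from a batch file:
rem main.cpl is the Mouse applet in Control Panel.
control main.cpl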

My guess was that I could probably do a search for additional .cur files and download them to that folder, and then they would hopefully be visible in that Pointers tab.  The Open Cursor Library looked like it might have a lot of cursor options.  I also looked cursorily at webpages on creating your own cursor, creating a cursor from an image, previewing a cursor, and using a custom cursor file.  But then I realized that -- damn, being a curser and all -- I had to get back to work.

Google Search: Freeware: Get Rid of Unwanted Sites

I was using Google to search for freeware.  My search produced a bunch of websites that did not give me what I wanted.  This post briefly describes some steps I took to get better search results.

First, I modified my search to eliminate some sites that were giving me a combination of freeware and shareware.  For purposes of this particular search, I figured that any freeware of good quality would have been noticed and commented on by a number of people.  People selling software had a number of strategies to obscure the fact that they were not offering freeware.  Not to blame them -- they worked hard on their software, and they wanted to make some money for it -- but what I was searching for wasn't important enough to buy.  If there was a freeware solution, great; if not, I'd just skip it.  So the modified search I used was this (assuming I was searching for software related to "mouse" and "cursor"):

mouse cursor freeware -shareware -"free download" -"free trial" -"free to try"
This got through one set of unwanted results, but I wasn't done.  Now I was getting a bunch of websites that offered all kinds of freeware, but none providing what I was looking for specifically.  They just seemed to put up any freeware that seemed remotely related, and that wasn't helping me.  I was doing this search in Firefox, and I knew of two ways to get rid of these sorts of sites in Firefox.  (There probably were similar solutions in Google Chrome, but I didn't check.)  One approach was to install the Web of Trust (WOT) add-on, and look for its colored rings next to the search results.  These, I had found, were helpful but sometimes alarmist. 

Another approach, which I used in conjunction with WOT, was to install the OptimizeGoogle add-on and start to build up its list of filters.  The steps here were, first, to install the add-on, and then, in Firefox, go into Tools > Add-ons > Extensions tab > OptimizeGoogle > Options > Filter.  My list of filters was still growing, but at this point it was as follows:
http://*.recipester.org/*
http://*.neevia.com/*
http://*.headkeys.com/*
http://headkeys.com/*
http://*.all-freeware.com/*
http://*.softducks.com/*
http://*.topshareware.com/*
http://*.top4download.com/*
http://*.xentrik.com/*
http://*.fileguru.com/*
http://*.software.informer.com/*
http://*.filebuzz.com/*
http://*.bestfreewaredownload.com/*
http://*.windows7download.com/*
http://*.freedownloadscenter.com/*
http://*.winsite.com/*
http://*.easyfreeware.com/*
http://*.brothersoft.com/*
http://*.filetransit.com/*
http://*.macshareware.com/*
http://*.fileheap.com/*
http://*.informer.com/*
http://mac.download3000.com/*
http://*.mac360.com/*
http://*.freemacware.com/*
http://www.downv.com/Mac-software-download/*
http://*.downloadatoz.com/*
http://*.ptf.com/*
I saved that list (Export from the filter list) for when I would have to reinstall Firefox.  With this list in place, Google searches in Firefox that found any of these websites would now give me a small, greyed-out line to let me know that my results were being filtered, but would focus on the remaining sites.  This, I found, reduced distraction and saved time in other searches.

Wednesday, September 1, 2010

Portable Applications in Windows XP: Which Ones to Use

I was assembling a list of preferred Windows XP applications.  Part of my goal was to replace installed apps with portable apps where possible.  Doing so had the advantages of getting back up and running much more quickly, whenever I would have to reinstall Windows; reducing system problems due to misbehaving applications; and having my favorite tools available, in my preferred configuration, when I had to use someone else's computer.  With the emergence of cloud computing, among other things, there had been a trend toward the virtualization of applications -- toward, that is, making apps less dependent upon a single piece of hardware.

So now I was engaged in a search for applications that would replace some of those I had traditionally installed on my Windows XP systems.  In my search, I found there were many sources of portable apps, including the categorized and somewhat ranked set in Andrew Lee's Portable Freeware Collection.  I used these sources to supplement and revise the set that I had started with, from PortableApps.com.  It seemed likely that I would continue to add preferred portable apps from various sources indefinitely.  In that sense, this post can only be a step along the way.  Even so, by this time I had accumulated enough applications to say a few things on the subject.

Microsoft Office and Alternatives 

Perhaps the single most important thing to say was that there were substitutes for Microsoft Office.  This was important because it permitted some freedom from dependence upon Microsoft programs.  I had already achieved some such freedom by switching to Ubuntu as my operating system and by running Windows XP within VMware virtual machines (VMs) on Ubuntu/WinXP dual-boot machines.  Through that step, I had multiple alternatives, whenever Microsoft Windows and/or Office failed me:  I could run Microsoft Office, or the freeware OpenOffice alternative, on either Windows or Ubuntu, either natively or virtually.  After that, there were very few times when I was substantially unable to get work done because of some software failure arising from Microsoft software.

The next step in that regard was to move away from treating either Microsoft Windows or Microsoft Office as my primary operating or office system.  Where Windows was concerned, I had recently concluded that the computing world, and I, had not yet come up with a superior end-user alternative for PC-based (as distinct from e.g., Apple) systems.  There were far more, and far more useful, software applications available on Windows than on Ubuntu or other varieties of Linux.  But in the case of Office, it did increasingly appear that there were superior alternatives -- notably OpenOffice.  Then, too, I was among those keyboard-oriented users who were put off by the mouse-oriented ribbon that debuted in Office 2007.  In any event, I had had too many experiences with malfunctioning computer installations that desperately needed to be reinstalled, but that I was currently unable to reinstall because it would take too many hours to install and adjust Office and other heavyweight programs.

The current search for portable apps pushed me further in the direction of seeking an alternative to Microsoft Office.  As described in a separate post, people were struggling with the non-portability of Office 2003 and 2007.  Some pursued the option of downloading a presumably bootleg copy of Microsoft Office that was somehow converted into portable format.  That option (as noted in that other post) entailed considerable risk of viral infection, instability, and inflexibility.  Another option, which I did pursue, was to create my own portable version from my own copy of Office.  That, too, did not turn out well, primarily because the software capable of doing it properly was still too expensive.  Since I considered portability a real benefit, I was thus even more motivated to take seriously the OpenOffice alternative, which was freely available in Windows-based portable form.

Other Top Portable Apps 

Of course, there is more to the world than office productivity software.  A complete set of portable apps must depend on the needs of the individual user.  But it may be possible to single out, from among the roughly 150 categories displayed in the Portable Freeware Collection, those from which a typical user might want to draw at least one portable app.

In preparing this list, I excluded many categories that did not interest me and/or in which I was not knowledgeable.  Among these excluded categories (to name a few) were those having to do with iPods, mp3 tags, games, and IM.  Also, in some instances, I took more than one example from a category (especially from categories with grab-bag names like "miscellaneous"), or took something other than the top-ranked program.  I have also used my own category headings, rather than those supplied by the PFC website.  In short, this list is offered for purposes of interest, not precision.

Audio and Video

XMedia Recode
Audacity
IrfanView
Duplicate Music Files Finder
VirtualDub Portable
VLC
XBlender

Images, Graphics, Scanning

Photoscape
Dia
Softi FreeOCR

CD/DVD

ImgBurn
Virtual CDRom Control Panel
WinToFlash
Folder2Iso
BonkEnc

Files and Folders

Multi File Tool
7-Zip
TreeSize
Duplicate File Finder
ICE ECC
FreeCommander
FileCommander 
Undelete Plus
Bulk Rename Utility
Index Your Files
Unlocker

Backup and Synchronization

Toucan
FastCopy
ozSync

Downloading, FTP

Free Download Manager
FileZilla
WinHTTrack
uTorrent
VDownloader

Security & Privacy

PortableTor
Blowfish Advanced CS
ClamWin Portable
KeePass
CCleaner
Magical Jelly Bean Keyfinder

Web Browsers

Firefox
Opera
FireTune
Opera Settings Import & Export Tool

PDF

Scan2PDF
Foxit Reader
Swift PDF

Registry Editing

Registry Commander
Regshot
RegFromApp
RegScanner

Remote Collaboration

ShowMyPC
TeamViewer

Bookmarks

TrayURL
AM-DeadLink

Fun & Entertainment

Raindrop
Sumotori Dreams
Wallpaper Randomizer

System Information

SIW 
PC Wizard 
Autoruns

Calendar & Time

Sunbird
TimeSync

General Reference

Convert
WeatherMate

HTML & Text Editing

KompoZer
HTML Portable Editor
Notepad++

Onscreen

ClipX
PNotes
SysExporter
Virtual Magnifying Glass

Phone & Email

Thunderbird
Skype

Math & Statistics

SpeedCrunch
R

Productivity & Desktop Layout

OpenOffice
WinTabber
VirtuaWin

Other

RamBooster
Stalled Printer Repair
EjectUSB
JkDefrag
AutoIt
Don't Sleep
PStart
Scribus
DSpeech

Monday, May 10, 2010

Tweaking Ubuntu 10.04

I had upgraded from Ubuntu 9.10 to Ubuntu 10.04 (Lucid Lynx).  Now I wanted to make some adjustments.  For this purpose, I would be drawing upon the first and second lists of adjustments I had made when I had installed 9.10.

The first of those adjustments was to install a PAE-enabled 10.04 kernel, so that I would have access to all of my system's RAM.  I also found that VMware Workstation 7.0 was requiring me to enter my serial number.  I had not entered it after installing Workstation on 9.10; I had just been using the trial serial number.  Now, unfortunately, Workstation was not cooperating:  it was saying "Unknown error entering serial number."  The solution to this problem was to run Workstation as root (i.e., type "sudo vmware" at the Terminal prompt) and enter the serial number there.

Since I was doing an upgrade of a previous installation, I did not have to reinstall packages, but I did have a problem with Software Sources.  I have described that one in a separate post.  (Even if I had needed to reinstall packages, the "installed-software" trick described in the lists of adjustments (above) would have made short work of it.)  On my laptop, I was doing the same upgrade, and there I made some problems for myself when I had to interrupt the upgrade.  Here on the desktop, the next task was (as described in separate posts) to try to get the printing and scanning features of my Brother MFC-7340 printer working from within Ubuntu.  After doing that, I tackled a problem that I had wanted to solve for years:  how to import the list of AutoCorrect entries that I had created in Microsoft Word into OpenOffice.org Writer, running on Ubuntu.  That took some time, but then I was able to get back to the project of working through those notes from previous installations.

The upgrade from Ubuntu 9.10 to 10.04 had preserved much of the configuration I had already set up.  So as I worked through the Ubuntu tweaking steps described in one of my previous posts, I had to deal with only a fraction of the issues addressed there.  One was to prevent icons for mounted drives from appearing on the desktop.  That called for Terminal:  “gconf-editor” > /apps/nautilus/desktop > uncheck volumes_visible.  But it was already unchecked, and yet I did have icons for mounted drives visible on the desktop.  I went into Applications > System Tools > Ubuntu Tweak > Desktop Icon Settings, clicked Show desktop icons, and left everything else unchecked, but this made no difference.  I tried again, this time using “sudo gconf-editor.”  Ah, yes.  The volumes_visible box was checked for root.  Unchecking it removed the icons from the desktop.

I had not noticed, but somewhere along the line, I had evidently installed nautilus-open-terminal via Synaptic, or possibly it came installed by default.  As its name suggested, this tool added a right-click (context menu) option in Nautilus, "Open in Terminal."  Unlike its counterpart in Windows Explorer, this option was available only for folders shown in the file listing in Nautilus -- in the right pane, that is, not in the left pane that would typically show the folder tree.  In that left pane, this and other options were absent.

Had another little problem.  During the upgrade to 10.04, the Firefox icon on the left-side (formerly top) panel changed to a red do-not-enter or not-allowed kind of icon; and in Applications > Internet, it changed to a grey question-mark box.  I fixed this by going to the panel, right-clicking on Applications > Edit Menus > Internet > Firefox > Properties, clicking on the icon, and selecting /usr/share/pixmaps/firefox.png.  Speaking of Firefox, I wanted Ubuntu to open Firefox and Google Chrome on startup.  I wanted to add these to a script that would run at startup, since I suspected that I would be coming up with other things that I wanted to have happen at startup too.  It seemed pretty technical – beyond my current ability, anyway – but it looked like I might be able to just write a script and put it into /etc/init.d.  I typed “sudo gedit” and then created a file called /etc/init.d/a_startup_script.sh.  I put into it the line that I got from, e.g., right-clicking on Applications and choosing Edit Menus > Applications > Internet > Firefox > Properties:  “firefox %u.”  I saved it and typed “chmod +x /etc/init.d/a_startup_script.sh.”  Then I rebooted.  This achieved nothing.  In retrospect, that was predictable:  scripts in /etc/init.d do not run unless they are registered (e.g., via update-rc.d), and in any event they run as root during boot, before any graphical session exists, so they are the wrong place to launch desktop programs like Firefox; something like System > Preferences > Startup Applications would have been the more promising route.  So creating a general-purpose startup script remained a goal for the future.

I also went down Gizmo’s Freeware list of tweaks.  I had already done most of the ones I wanted, but there were a few others.  One was to install Windows TrueType fonts in Ubuntu.  I decided to extend their advice somewhat.  I took a look at System > Preferences > Appearance > Fonts.  I could see that a lot of Windows fonts were not present on the list there.  So in Windows XP, I went to C:\Windows\Fonts.  I selected and copied everything to another, temporary folder called UbuFonts.  In UbuFonts, I sorted by file type and deleted the ones that were not TTF files.  Back in Ubuntu, I typed “sudo nautilus” and went to /usr/share/fonts/truetype.  It already had a folder called msttcorefonts, but with only a fraction of the fonts that I had just copied from C:\Windows\Fonts to UbuFonts, and the font files there seemed older and smaller.  I made a backup copy of the msttcorefonts folder and then copied everything from UbuFonts into /usr/share/fonts/truetype/msttcorefonts.  Now System > Preferences > Appearance > Fonts had a much wider selection.  I changed the fonts to Tahoma 10 and the monospace to Courier 10.  Tahoma allowed me to see more information on each line onscreen.

Another tweak from Gizmo called for some playing around with Compiz.  This was a bad idea, as described in a separate post.  Another tweak of interest was to clean up the GRUB boot menu.  I typed “uname -r” and saw that I was using the 2.6.32-22-generic-pae kernel.  In Synaptic, I searched for linux-image, clicked at the top of the left-hand column to sort by those that were installed, and marked for removal all numbered items other than that kernel.  In this case, that included just two items:  linux-image-2.6.31-21-generic-pae and linux-image-2.6.32-22-generic (i.e., not pae).  I did another search for linux-headers and marked all non-2.6.32-22-generic-pae items there too.  In this case, trying to remove linux-headers-2.6.32-22 threatened to remove linux-headers-2.6.32-22-generic-pae as well; but I wanted to keep that, so I didn’t remove linux-headers-2.6.32-22.  On restart, I saw that GRUB now listed just the 2.6.32-22-generic-pae kernel and its recovery mode, along with memtest and Windows XP (it was a dual-boot machine).  I went back into Ubuntu without a problem.  I did later have a VMware problem that might have been related to this, though.
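(The same cleanup can be done from a terminal instead of Synaptic -- a sketch, reusing the kernel version numbers above:)

uname -r                            # the running kernel: the one to keep
dpkg -l 'linux-image-*' | grep ^ii  # the kernel packages actually installed
sudo apt-get remove linux-image-2.6.31-21-generic-pae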

Gizmo also pointed me toward a number of recommended freeware apps.  These were for the KDE (not GNOME) desktop.  I thought it might be time to try KDE, if only to check out these programs.  One was the Wally wallpaper changer, which I installed through Synaptic.  Getting Wally (and the whole KDE desktop) involved a total of 91 files.  Other interesting pieces of Gizmo-recommended software I got through Synaptic:  gtk-recordMyDesktop and the Dolphin file manager.  Downloaded directly from their creators’ websites, I got Wink, FreeFileSync, and Parted Magic.  I did consider using Dropbox as well, because of its good reviews (by, e.g., PC Magazine, Online Backup Tools, Laptop, and alternativeTo); but I decided that Windows Live Sync had important advantages, even though I would have to run it in a virtual machine when I was booted into Ubuntu.

(Note:  a few days later -- possibly the first time I tried it after installing KDE -- Google Desktop search was no longer responding to its default Ctrl-Ctrl hotkey.  That is, its Quick Search Box was not coming up.  Something I saw on some webpage, as I was trying to fix that problem, made me wonder whether the KDE installation was to blame.  A tip that fixed it was to open the Google Desktop Search preferences and change the hotkey to Ctrl-F1.)

The last thing to do, in this round of tweaking, was to clean out unnecessary stuff from the drive on which I had installed Ubuntu, so that I could make a backup image in case I needed to reinstall – so that I could just restore the image, that is, instead of having to go back through all these steps.  This, I thought, called for something like the TreeSize utility that I had used in Windows to see where I might have files or folders taking up huge amounts of space.  Among what seemed to be several possibilities, I found an actual Linux version of TreeSize, so I downloaded that – but I also discovered Ubuntu’s Applications > Accessories > Disk Usage Analyzer, whose Treemap Chart was especially interesting.  These revealed that my Ubuntu installation was not presently very large, so I didn’t have to worry too much about shrinking it for this particular image.  They also revealed that by far the largest space hog, within that installation, was the Google Desktop index of stuff on my hard drive, at 5.2GB.  I did see that the Google Chrome cache was taking another 400MB.  I went into Chrome’s Settings (the wrench icon at the upper-right corner) > Options but didn’t see any way to control that, other than to just clear the cache.  I decided to leave it for now, and changed some other settings while I was there.  And that was it.  Ubuntu 10.04 was tweaked, at least for now.
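(For future reference, a rough command-line equivalent of that disk survey -- a sketch using du, reporting the biggest top-level directories in megabytes:)

sudo du -xm --max-depth=1 / | sort -rn | head -20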

Wednesday, December 30, 2009

Sorting and Manipulating a Long Text List to Eliminate Some Files

In Windows XP, I made a listing of all of the files on a hard drive.  For that, I could have typed DIR *.* /S > OUTPUTFILE.TXT, but instead I used PrintFolders.  I selected the option for full pathnames, so each line in the file list was like this:  D:\FOLDER\SUBFOLDER\FILENAME.EXT, along with date and file size information.

I wanted to sort the lines in this file list alphabetically.  They already were sorted that way, but DIR and PrintFolders tended to insert blank lines and other lines (e.g., "=======" divider lines) that I didn't want in my final list.  The question was, how could I do that sort?  I tried the SORT command built into WinXP, but it seemed my list was too long.  I tried importing OUTPUTFILE.TXT into Excel, but it had more than 65,536 lines, so Excel couldn't handle it.  It gave me a "File not loaded completely" message.  I tried importing it into Microsoft Access, but it ended with this:

Import Text Wizard

Finished importing file 'D:\FOLDER\SUBFOLDER\OUTPUTFILE.TXT' to table 'OUTPUTTXT'.  Not all of your data was successfully imported.  Error descriptions with associated row numbers of bad records can be found in the Microsoft Office Access table 'OUTPUTFILE.TXT'.

And then it turned out that it hadn't actually imported anything.  At this point, I didn't check the error log.  I looked for freeware file sorting utilities, but everything was shareware.  I was only planning to do this once, and didn't want to spend $30 for the privilege.  I did download and try one shareware program called Sort Text Lists Alphabetically Software (price $29.99), but it hung, probably because my text file had too many lines.  After 45 minutes or so, I killed it.

Eventually, I found I was able to do the sort very quickly using the SORT command in Ubuntu.  (I was running WinXP inside a VMware virtual machine on Ubuntu 9.04, so switching back and forth between the operating systems was just a matter of a click.)  The sort command I used was like this:
sort -b -d -f -i -o SORTEDFILE.TXT INPUTFILE.TXT
That worked.  I edited SORTEDFILE.TXT using Ubuntu's GEDIT program (like WinXP's Notepad).  For some reason, PrintFolders (or something) had inserted a lot of lines that did not match the expected pattern of D:\FOLDER\SUBFOLDER\FILENAME.EXT.  These may have been shortcuts or something.  Anyway, I removed them, so everything in SORTEDFILE.TXT matched the pattern.
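(For the record, an annotated version of that command:)

# -b  ignore leading blanks
# -d  dictionary order: consider only blanks, letters, and digits
# -f  fold case, so "a" and "A" sort together
# -i  ignore nonprinting characters
# -o  write the sorted result to the named file
sort -b -d -f -i -o SORTEDFILE.TXT INPUTFILE.TXT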

Now I wanted to parse the lines.  My purpose in doing the file list and analysis was to see if I had any files that had the same names but different extensions.  I suspected, in particular, that I had converted some .doc and .jpg files to .pdf and had forgotten to zip or delete the original .doc and .jpg files.  So I wanted to get just the file names, without extensions, and line them up.  But how?  Access and Excel still couldn't handle the list.

This time around, I took a look at the Access error log mentioned in its error message (above).  The error, in every case, was "Field Truncation."  According to a Microsoft troubleshooting page, truncation was occurring because some of the lines in my text file contained more than 255 characters, the maximum Access could handle.  I tried importing into Access again, but this time I chose the Fixed Width option rather than Delimited.  The preview only went as far as 111 characters, so I just removed all the column-break lines in the Import Text Wizard and clicked Finish.  That didn't give me any errors, but it still truncated the lines.  Instead of File > Get External Data > Import, I tried Access's File > Open command.  Same result.

I probably could have worked through that problem in Access, but I had not planned to invest so much time in this project, and anyway I still wasn't sure how I was going to use Access to remove file extensions and folder paths so that I would just have filenames to compare.  I generally used Excel rather than Access for that kind of manipulation.  So I considered dividing up my text list into several smaller text files, each of which would be small enough for Excel to handle.  I'd probably have done that manually, by cutting and pasting, since I assumed that a file splitter program would give me files that Excel wouldn't recognize.  Also, to compare the file names in one subfile against the file names in another subfile would probably require some kind of lookup function.
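(Had I gone that route, Ubuntu's split command could have done the cutting for me -- a sketch, with a chunk size chosen to stay under Excel's 65,536-row limit:)

split -l 60000 SORTEDFILE.TXT part_    # yields part_aa, part_ab, ..., each a plain text file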

That sounded like a mess, so instead I tried going at the problem from the other end.  I did another directory listing, this time looking only for PDFs.  I set the file filter to *.pdf in PrintFolders.  I still couldn't fit the result into Excel, so I did the Ubuntu SORT again, this time using a slightly more economical format:
sort -bdfio OUTPUTFILE.TXT INPUTFILE.TXT
This time, I belatedly noticed that PrintFolders and/or I had somehow introduced lots of duplicate lines, which would do much to explain why I had so many more files than I would have expected.  As advised, I used another Ubuntu command:
sort OUTPUTFILE.TXT | uniq -u
to remove duplicate lines.  But this did not seem to make any difference.  Regardless, after I had cleaned out the junk lines from OUTPUTFILE.TXT, it did all fit into Excel, with room to spare.  My import was giving me lots of #NAME? errors, because Excel was splitting rows in such a way that characters like "-" (which Excel treats as a mathematical operator) were the first characters in some rows, but were followed by letters rather than numbers, which did not compute.  (This would happen if, e.g., the split came at the wrong place in a file named "TUESDAY--10AM.PDF.")  So when running the Text Import Wizard, I had to designate each column as a Text column, not General.

I then used Excel text functions (e.g., MID and FIND) on each line, to isolate the filenames without pathnames or extensions.  I used Excel's text concatenation functions to work up a separate DIR command for each file I wanted to find.  In other words, I began with something like this:
D:\FOLDER\SUBFOLDER\FILENAME.EXT

and I ended with something like this:
DIR "FILE NAME."* /b/s/w >> OUTPUT.TXT

The quotes were necessary because some file names have spaces in them, which confuses the DIR command.  The /s switch searched subdirectories, and /b produced bare output -- just the pathname on each line, with no headers or file details (the /w wide-format switch is ignored when /b is present).  The >> told the command to put the results in a file called OUTPUT.TXT.  If I had used just one > sign, each DIR command in the batch file would have recreated OUTPUT.TXT, overwriting whatever the earlier commands had found.  Using two >> signs was an indication that OUTPUT.TXT should be created if it did not yet exist, but that otherwise the results of the command should just be appended to whatever was already in OUTPUT.TXT.
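(As an aside, the path-and-extension stripping could also have been done in bash rather than Excel -- a sketch, assuming one full Windows-style pathname per line in SORTEDFILE.TXT:)

# print just the bare filename, with no folders and no extension
while IFS= read -r line; do
  name="${line##*\\}"    # drop everything through the last backslash
  echo "${name%.*}"      # drop the final .EXT
done < SORTEDFILE.TXT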

In cooking up the final batch commands, I would have been helped by the MCONCAT function in the Morefunc add-in, but I didn't know about it yet.  I did use Morefunc's TEXTREVERSE function in this process, but I found that it would crash Excel when the string it was reversing was longer than 128 characters.  Following other advice, I used Excel's SUBSTITUTE function instead.

I put the thousands of resulting commands (such as the DIR FILENAME.* >> OUTPUT.TXT shown above), one for each file name (e.g., FILE NAME.*) that I was looking for, into a DOS batch file (i.e., a text file created in Notepad, with a .bat extension, saved in ANSI format) and ran it.  It began finding files (e.g., FILENAME.DOC, FILENAME.JPG) and listing them in OUTPUT.TXT.  Unfortunately, this thing was running very slowly.  Part of the slowness, I thought, was due to the generally slower performance of programs running inside a virtual machine.  So I thought I'd try my hand at creating an equivalent shell script in Ubuntu.  After several false starts, I settled on the FIND command.  I got some help from the Find page in O'Reilly's Linux Command Directory, but also found some useful tips in Pollock's Find tutorial.  It looked like I could recreate the DOS batch commands, like the example shown above, in this format:

find -name "FILE NAME.*" 2>/dev/null | tee -a found.txt

The "-name" test told find to match files by name, using the quoted pattern.  As for the redirection (the > part):  1>/dev/null would have sent the desired output (the standard output) to the null device -- i.e., to nowhere, so that it would not be visible or saved anywhere -- which was not what I wanted; 2>/dev/null sent only the error messages there instead (I was getting errors because the lost+found folder produced one every time the find command tried to search it); and &>/dev/null would have sent both the standard output and the error messages to the same place.  Then the pipe ("|") sent everything else (i.e., the standard output) to tee.  As its name suggests, tee forms a T-junction:  it sends the standard output to two places -- to the screen (so that I could see what was happening) and also to a file called found.txt.  The -a option served the same function as the DOS >> redirection, which is also available in bash:  that is, -a appended the output to an already existing found.txt, or created it if it did not yet exist.  I generated all of these commands in Excel -- one for each FILE NAME.* -- and saved them to a Notepad file, as before.  Then, in Ubuntu, I made the script executable by typing "chmod +x" followed by the script's name at the bash prompt, ran it by typing "./" followed by the script's name, and it ran.  And the trouble proved to be worthwhile:  instead of needing a month to complete the job (which was what I had calculated for the snail's-pace search underway in the WinXP virtual machine), it looked like it could be done in a few hours.

And so it was.  I actually ran it on two different partitions, to be sure I had caught all duplicates.  Being cautious, I had the two partitions' results output to two separate .txt files, and then I merged them with the concatenation command:  "cat *.txt > full.lst."  (I used a different extension because I wasn't sure whether cat would try to combine the output file back into itself.  I think I've had problems with that in DOS.)  Then I renamed full.lst to be Found.txt, and made a backup copy of it.

I wanted to save the commands and text files I had accumulated so far, until I knew I wouldn't need them anymore, so I zipped them using the right-click context menu in Nautilus.  It didn't give me an option to simultaneously combine and delete the originals.


Next, I needed to remove duplicate lines from Found.txt.  I now understood that the command I had used earlier (above) had failed to specify where the output should go.  So I tried again:

sort Found.txt | uniq -u >> Sorted.txt

This produced a dramatically shorter file - 1/6 the size of Found.txt.  Had there really been that many duplicates?  I wanted to try sorting again, this time by filename.  But, of course, the entries in Sorted.txt included both folder and file names, like this:

./FOLDER1/File1.pdf
./SomeotherFolder/AAAA.doc
Sorting normally would put them in the order shown, but sorting by their ending letters would put the .doc files before the .pdfs, and would alphabetize the files by filename rather than by folder name.  Sorting them in this way would show me how many copies of a given file there had been, so that I could eyeball the possibility that the ultimate list of unique files would really be as much shorter than Found.txt as Sorted.txt suggested.  I didn't know how to do that in bash, so I posted a question on it.
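(One plausible answer, for the record:  the rev utility reverses each line, so reversing, sorting, and reversing back amounts to sorting lines by their ending characters.  A sketch, with an invented output filename:)

rev Sorted.txt | sort | rev > ByEnding.txt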


Meanwhile, I found that the whole of Found.txt would fit into Excel.  When I sorted it, I found that each line was duplicated - but only once.  So plainly I had done something wrong in arriving at Sorted.txt.  From this point, I basically did the rest of the project in Excel, though it belatedly appeared that there were some workable answers in response to my post.
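(In hindsight, the culprit was almost certainly uniq's -u switch:  uniq -u prints only the lines that occur exactly once, silently discarding every line that has a duplicate, whereas plain uniq -- or sort -u -- keeps one copy of each distinct line.  A sketch of the difference, with invented output names:)

sort Found.txt | uniq -u > Singletons.txt   # only the lines that never repeat
sort Found.txt | uniq    > Deduped.txt      # one copy of every distinct line
sort -u Found.txt -o Deduped.txt            # the same deduplication in one step

(With most lines of Found.txt appearing twice, uniq -u would have discarded them all and kept only the odd singletons -- which would explain why Sorted.txt shrank so dramatically.)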