Showing posts with label test. Show all posts

Saturday, June 2, 2012

Batch Verifying or Validating Scattered WAV Files

I had previously looked for ways to test MP3 files.  Now I wanted to test some WAVs.  This post describes one approach that seemed to work.

Initially, I tried using IrfanView, before rediscovering that IrfanView did not do audio file conversions.  After some searching around, I tried the command-line option in Boxoft WAV to MP3 Converter (freeware).  This looked promising.

To run the Boxoft converter, I used this syntax:

WavToMp3Cmd "D:\Folder\File Name.wav" -O"D:\Test\Output Filename.mp3"
This syntax seemed a little awkward, because (as shown) there was no space after the -O. (That's an oh, not a zero.)  Of course, I had to use quotation marks, as shown, because I had spaces in the folder and/or file names.  I had to run this command in the folder where the Boxoft program was installed (i.e., where WavToMp3Cmd.exe was located). That is, I had to run my batch file from there. That installation folder, on my x64 system, was C:\Program Files (x86)\Boxoft Wav to MP3 (freeware).

The foregoing command was pretty bare-bones.  I would have had to add more parameters, as described in the Boxoft help file, if I had wanted to produce high-quality output MP3s.  I was just creating the output MP3s to see whether the input WAVs were valid.  I would be deleting the output MP3s when the test was done.

I wasn't sure how the Boxoft converter would deal with a flawed WAV file.  To find out, I created a TXT file, changed its extension to WAV, and ran the foregoing command on it.  Boxoft produced an unplayable Test.mp3 in the output folder, so in that sense the test failed:  I would have preferred that it would not produce junk files.  Fortunately, Boxoft did register an error message on the command line. I captured that error to a log file by adding some stuff to the foregoing command.  That command now looked like this:
WavToMp3Cmd "D:\Folder\File Name.wav" -O"D:\Test\Output Filename.mp3" >> D:\Test\Log.txt
That was all on one command line.  Now I would have a Log.txt file that would capture error messages produced by the converter.  The next step was to work up my list of commands.  My WAV files were scattered in various folders around the computer, so as described in more detail in another post (and in several prior posts listed at the start of that one), I used a DIR command to identify the files I wanted to test (in this case, they were of the form 2010*.wav), and then I used Microsoft Excel formulas to create the relevant commands.  So now, for each file to be tested, I had an Excel row containing something like the command shown above.  I copied all those commands into a text file in Notepad, renamed that file's extension to be BAT (so that it would run), saved it in the Boxoft folder as noted above, and ran it.
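For anyone who would rather script that step than build it in Excel, here is a minimal sketch in Python. It was not part of my original process. The function names are made up for illustration, and the only Boxoft syntax it relies on is the command shown above. It uses Python's ntpath module so that Windows-style paths are handled the same way on any machine.

```python
import ntpath  # Windows path rules, regardless of the OS running this sketch


def boxoft_command(wav_path, out_dir, log_path):
    """Return one batch line that converts wav_path and appends console output to log_path."""
    stem = ntpath.splitext(ntpath.basename(wav_path))[0]
    mp3_path = ntpath.join(out_dir, stem + ".mp3")
    # No space between -O and the quoted output path, matching the command above.
    return f'WavToMp3Cmd "{wav_path}" -O"{mp3_path}" >> "{log_path}"'


def build_batch(wav_paths, out_dir, log_path):
    """Join one command per WAV into the text of a .bat file (CRLF line endings)."""
    lines = [boxoft_command(p, out_dir, log_path) for p in wav_paths]
    return "\r\n".join(lines) + "\r\n"
```

The text returned by build_batch would be saved as a .bat file in the Boxoft folder. Note that two WAVs with the same name in different folders would map to the same output MP3 here, so the list would need to be checked for duplicates first.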

In 10 or 15 minutes, the batch file was done; I had tested about 2,800 WAVs.  I verified that the number of output MP3 files equaled the number of input WAV files.  I wasn't sure how Log.txt would register errors, so I copied and pasted the entire log into a Microsoft Word document and reduced its size by doing global search-and-replace operations to remove the lines pertaining to the Title, Artist, and other settings that didn't matter to me.  (Word's special character to remove line breaks was ^p.)  I changed the remaining output to fit on one line (that is, I changed "^pSave to" to be just "Save to"), so that the file name and the result (e.g., "Success") would all be on one line.  I pasted this remaining text into Excel, did a FIND for lines containing "Success," and sorted the list according to whether "Success" appeared.  There was only one non-Success line.  It said, "Set the error bit rate. Please run WavToMp3Cmd to get help."  In the Word doc, I searched for that text.  I played the original WAV file whose name appeared just before that text.  It played OK.  I looked for the output MP3 file of similar name.  It wasn't there.  In this case, Boxoft hadn't created the failing file.  But then why did my file count agree?  Oh, of course, because the Log.txt file was in the output folder.  I wasn't sure why Boxoft had balked at the input WAV; but now that I had manually tested it and found it was OK, I didn't really care.
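The file-count confusion suggests comparing names rather than counts. The following is a rough Python sketch of that idea, not something I actually ran: it assumes each output MP3 carries the same base name as its input WAV, and the function name is hypothetical.

```python
import ntpath


def missing_mp3s(wav_paths, output_names):
    """Return the input WAVs that have no same-named MP3 among output_names."""
    # Only real MP3 outputs count; this skips Log.txt or anything else in the folder.
    produced = {
        ntpath.splitext(name)[0].lower()
        for name in output_names
        if name.lower().endswith(".mp3")
    }
    return [
        path
        for path in wav_paths
        if ntpath.splitext(ntpath.basename(path))[0].lower() not in produced
    ]
```

With a check like this, a stray Log.txt in the output folder could not mask a missing MP3 the way it did in my file count.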

It occurred to me that Boxoft had produced an error when confronted with the obvious impossibility of converting a TXT file to MP3; but what about if some of the input files were mislabeled as WAVs when, in fact, they were MP3s or WMAs or some other kind of audio file?  I wasn't sure whether Boxoft would produce an error message or warning for such files.  I ran a version of the foregoing command on an input WMA, not renamed to WAV.  That crashed Boxoft.  I tried again after renaming the WMA to WAV.  That crashed Boxoft too.  I also tried with an input MP3.  Same results:  crashes.  I re-checked that the command did work with a genuine input WAV.  So apparently Boxoft would have crashed if my list of WAVs to be tested had contained an MP3 or WMA that was accidentally misnamed as a WAV.
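One way to avoid those crashes would be to screen the list before handing it to the converter. A standard WAV file is a RIFF container: it begins with the bytes "RIFF", followed by a four-byte size, followed by "WAVE". The Python sketch below checks only those twelve bytes. It is a pre-filter under that assumption, not a test of whether the audio itself is intact, and the function name is my own.

```python
def looks_like_wav(path):
    """Return True if the file starts with a RIFF/WAVE container header."""
    with open(path, "rb") as handle:
        header = handle.read(12)
    # Bytes 0-3 are the RIFF tag, 4-7 the chunk size, 8-11 the WAVE form type.
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"
```

A misnamed MP3 or WMA would fail this check, as would a renamed TXT file, so such files could be set aside instead of being fed to Boxoft.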

To sum up, it seemed that the Boxoft command line converter, using a command with syntax shown above, was able to test large numbers of WAV files by converting them to MP3s, with error messages captured in a log file.

Monday, February 13, 2012

Windows 7: Testing/Verifying/Validating PDFs

I had previously gotten the impression that I could test PDFs by using IrfanView to convert them to JPGs.  (This was different from the approach I had recently taken to merge scattered JPGs into multipage PDFs.)

In a dry run, IrfanView had balked at a bad PDF, but had converted the good ones.  So the scenario was that I would run the conversion; check the output folder; verify that it had the correct number of files; and if the numbers of files didn't match up, I would go hunting for whatever was missing.

Now I had a bunch of PDFs that I wanted to test, so it was time to try out that theory.  I had found it helpful to use an Excel spreadsheet to work up the list of files to test.  (The post on multipage PDFs, above, contained some discussion of how I used Excel to create and massage lists of files.  More information appeared in an earlier post on renaming thousands of files and in another recent post on using a batch file to sort files.)

The PDFs that I wanted to test were scattered across different folders.  The conversion scenario would have me create JPGs from these PDFs, where the JPGs would all be in one folder.  That way, I could easily count them, keep them from cluttering up other folders, and delete them after I was done counting them.  A problem with all those JPGs converging into one folder was that I might have two identically named source PDFs in different folders.  For instance, there might be something called File001.pdf in D:\FolderB, and another completely different File001.pdf in E:\FolderQ.  The JPGs resulting from these two different PDFs would either overwrite each other or fail to come into existence, depending on how I set IrfanView's conversion process.  This would screw up my count and would potentially fail to test some PDFs.  I could surmount this problem by batch renaming those files into unique names, as long as I kept the list of what I had renamed so that I could rename them back when I was done screwing around.  That approach would involve time-consuming extra steps, though, so I was hesitant.  (For more information, go to this webpage and search for "ZZZ_00001.jpg.")
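The renaming idea could be planned in a script before touching any files. This is a hypothetical Python sketch, not what I did: it gives each source file a numbered, collision-free name and keeps the mapping so the renaming can be undone. The numbering scheme and function name are my own invention.

```python
import ntpath


def unique_output_names(pdf_paths, ext=".jpg"):
    """Map each source path to a numbered output name that cannot collide."""
    mapping = {}
    for index, path in enumerate(pdf_paths, start=1):
        stem = ntpath.splitext(ntpath.basename(path))[0]
        # The running number makes the name unique; the stem keeps it recognizable.
        mapping[path] = f"{index:05d}_{stem}{ext}"
    return mapping
```

How those names would actually be applied depends on the conversion tool. The point is only that the mapping, saved to a file, is the list needed to rename everything back afterwards.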

There was another problem, as I thought about it.  A three-page PDF would presumably convert into three one-page JPGs.  So my file count would get messed up that way too.  I could probably opt to convert instead from multipage PDFs into multipage TIFs, but I wasn't sure what would happen if one page of a PDF was junk.  I would have to experiment to see if the TIF would swallow it or barf.

These reveries were interrupted by an actual test.  I tried using IrfanView to batch-convert three PDFs to JPG.  It gave me errors:  "Can't load D:\Current\Text\x1.pdf" (and likewise for x2.pdf and x3.pdf).  One was a single-page PDF, so multipage issues weren't the problem.  I couldn't understand it.  I had previously used IrfanView for this purpose.  The PDFs opened OK in Adobe Acrobat (and would presumably do so in a free or less expensive alternative to Acrobat).

It occurred to me that maybe I could use Acrobat > Advanced > Document Processing > Batch Processing.  I tried that on my test files, saving them as RTFs rather than JPGs to circumvent multipage issues.  It was surprisingly slow, and the resulting RTF files were empty.  Not a promising start.  Going back to somewhere near the original plan, I tried using Acrobat to convert to JPG instead of RTF, and that worked.  As expected, each PDF page became a distinct JPG.  For instance, x2.pdf became x2_Page_01.jpg and x2_Page_02.jpg and so forth.  I would have to do further filename massaging in Excel, or maybe run a series of DEL commands (e.g., DEL *02.jpg, DEL *03.jpg, etc.) to see if the number of output JPGs (or groups thereof) matched the number of PDFs tested.
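The filename massaging could also be done in a few lines of Python instead of Excel or DEL commands. This sketch assumes the _Page_NN suffix seen in my test output and simply strips it, so that each source PDF is counted once no matter how many pages it produced. The function name is hypothetical.

```python
import re

# Matches the page suffix Acrobat added in the test above, e.g. "_Page_02".
PAGE_SUFFIX = re.compile(r"_Page_\d+$", re.IGNORECASE)


def sources_with_output(jpg_names):
    """Collapse per-page JPG names into the set of source PDF base names."""
    stems = set()
    for name in jpg_names:
        stem = name.rsplit(".", 1)[0]  # drop the extension
        stems.add(PAGE_SUFFIX.sub("", stem).lower())
    return stems
```

The size of the returned set is the number to compare against the number of PDFs tested, and any PDF whose base name is absent from the set produced no JPG at all.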

I wondered, at this point, why I couldn't just batch print the PDFs being tested -- print them to PDFs in another folder, that is, and do the file count and then delete the prints.  Would a junk PDF print?  I created a junk PDF by taking a copy of one of my test files, opening it in Hexedit, looking a little ways down in the ASCII column for Root # 0 R (in this case, it was Root 124 0 R), changing it to Root 00 0 R (inserting 30 as the hex value for zero), and saving it.  Then I made Bullzip my default PDF printer, changed its settings so that it would print without stopping to ask questions (via the Options shortcut in the Bullzip program folder), selected all four of my test files, and went to right-click > Print.  It printed three of the four.  I didn't have the settings right -- it still asked for filenames -- but the test worked:  for the file I had just made into a junk PDF, Bullzip (or, actually, Acrobat, my default PDF reader) gave me an error message ("There was an error opening this document.  The root object is missing or invalid" -- which was, of course, exactly what I had changed), and no output PDF was created.  So this approach of trying to print to PDF would work to identify at least some kinds of defective PDFs.
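The particular damage I inflicted on that junk PDF is something a script could look for directly. In a PDF, object number 0 is reserved as the head of the free-object list, so a /Root reference pointing at it cannot be valid. The Python sketch below is a shallow check under that assumption: it looks for the %PDF- header, an %%EOF marker near the end, and a plausible /Root reference. It would miss many other kinds of corruption, and the function name is my own.

```python
import re

# A trailer entry such as "/Root 124 0 R" (object number, generation, R).
ROOT_REF = re.compile(rb"/Root\s+(\d+)\s+(\d+)\s+R")


def shallow_pdf_problems(data):
    """Return a list of obvious structural problems found in raw PDF bytes."""
    problems = []
    if not data.startswith(b"%PDF-"):
        problems.append("missing %PDF- header")
    if b"%%EOF" not in data[-1024:]:
        problems.append("no %%EOF marker near end")
    roots = ROOT_REF.findall(data)
    if not roots:
        problems.append("no /Root reference found")
    elif any(int(number) == 0 for number, _generation in roots):
        problems.append("/Root points at object 0")
    return problems
```

An empty list means only that these few checks passed, not that the PDF is sound. Unlike the print test, though, this reports which file failed.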

Unfortunately, that error message didn't specify which file was defective.  So that approach would require me to subtract the files that had successfully printed from the larger set of files that I had requested to be printed.  That might be a pretty fast process, if I used the same output filenames, paused for five or ten minutes, and then used Windows Explorer to copy the output PDFs over the original PDFs and then sorted by timestamp.  The PDFs with the visibly older timestamp, after that maneuver, would presumably be the ones that had failed to produce anything that would overwrite them.  This approach would wipe out my originals, which I would not want, so it would probably be best done using copies of the originals.  If there was some need to reverse the timestamp, I could probably fiddle with the system clock before step 2, so as to produce artificially ancient output PDFs.

Another approach might have been to use an Acrobat-type program to concatenate many if not all of the PDFs being tested into one PDF.  I wasn't sure if junk PDFs would concatenate.  I selected my test files > right-click > Combine supported files in Acrobat.  Acrobat said, "There was an error encountered while combining files.  Do you want to open the combined file or return to the file list and try again?"  I told it to open the combined file.  Acrobat's Bookmarks pane showed bookmarks for each of the good files, but no bookmark for the bad one.  So that would be one way of getting a list of good PDFs.  Of course, the concatenation process could be slow, especially because the resulting document could be huge.  The size of that document might also cause the Acrobat-type program to crash.

But this still wasn't giving me a testing approach that would test PDFs in place, without requiring me to relocate them to a single folder where I could manipulate them.  I could try to work up a batch command that would print the PDFs on my list to a common output folder from where they were, but in that case I wouldn't have two simple lists to compare.  Unless I could persuade the batch command to report its errors to a log, I would apparently have to go through the printing process manually, making sure to write down or attend to each PDF that didn't print.

I ran a search, to see if Bullzip could escort me out of this situation.  This stratagem led, strangely enough, to the Bullzip manual, to which I probably already had at least a link in my Bullzip program folder.  But the manual -- besides being no fun -- seemed to be oriented toward installation rather than usage.  A search in the Bullzip forum led to the suggestion that I look at a bioPDF webpage.  bioPDF seemed to be telling me that I could use a program called PrintTo.exe to do the job.  But where could I find this PrintTo.exe program?  I wasn't seeing a link to it there on the bioPDF site.  A search of files on my computer turned up nothing.  It didn't register when I typed "printto /?" on the command line.  Softpedia didn't have it.  And yet a search produced indications that Bullzip users were using PrintTo too.  Baffling.  Another search led, directly or indirectly, to a FineThatFile webpage where I was able to download printto.exe as part of a zip file containing other stuff.  I unzipped it and ran "printto /?" in the folder where printto.exe was unzipped to.  Turns out it was a biopdf.com product after all.  The syntax was simply "printto filetoprint printername" -- using the default printer if printername was not specified.

So, alright.  Bullzip was already my default printer, so I would be test-printing those PDFs to some temporary folder with a simple command:  printto filetoprint.  There didn't seem to be an option to specify an output folder on the command line.  Apparently I would have to do that in Bullzip.  It took some tinkering, but eventually it came together.  It didn't look like printto.exe was eager to print JPGs, but that was alright; I didn't need that now.  Right now, I was just doing PDFs.  I did get it to print designated PDFs to a designated output folder from the command line without pausing, except in case of overwrite; I did want to be notified about that.  Printto.exe had to be present in the folder where the command or batch file was running, I assumed, but that was manageable.  When it got to my bad PDF, printto.exe gave a command-line error:  "ERROR:  Invalid file name specified."  I had forgotten to put that file's name into quotes.  (Unlike the others, that name had a space in it.)  I added quotation marks and tried again.  This time, when it got to the bad PDF, it gave me the error (above), "There was an error opening this document."  When I clicked OK, my little test batch file continued to print the next file in line.  So it looked like this was going to work.

Regrouping, then, the situation was as follows:  I had set up Bullzip to print PDFs to a specified folder called Bullzip Output, without pausing for any dialogs except error messages and overwrites.  I had downloaded printto.exe, and it was now sitting in the folder where I had also saved a file, created in Notepad, called Printer.bat.  Printer.bat contained commands of this nature:

printto "D:\Folder Name\File to Test Number 1.pdf"
Printer.bat contained one line like that for each of the PDF files I wanted to test.  What was supposed to happen next was that I was supposed to be able to double-click on Printer.bat, or type "Printer.bat" on the command line, and it would try to print the PDFs I was testing.  It would put the resulting PDFs (that is, the Bullzip printouts of the PDF files being tested) into the Bullzip Output folder.  Unless it encountered corrupt files or potential overwrites, it would work -- slowly -- through the list of PDFs that I wanted it to test.  I hadn't seen an option to steer the error messages to a log instead of showing them onscreen.  A log would have been better:  the batch file wouldn't sit idle, waiting half the night for me to wake up and fix a problem.  And maybe Bullzip or some other PDF printer offered that.  I just hadn't seen it.  It would be something to look into next time.  Hopefully Printer.bat would not encounter many corrupt PDFs.

I decided to run Printer.bat from the command line, so that I could watch what was going on.  One problem emerged almost immediately:  after Bullzip created a PDF, Acrobat would open up, even though I had checked the Bullzip option that said, basically, do not open the document after creation.  So, fine, the document would not open, but Acrobat would.  It would just sit there with a blank page, and that was fine, except printto.exe would not proceed with the next file to be processed until I killed Acrobat.  I wondered if things would work differently if I designated a different PDF default reader.  To test that, I right-clicked on a PDF file at random and went into Open With > Choose Default Program > Always use the selected program to open this kind of file > Browse.  I browsed to FoxitReader.exe (which may not have been its original name) and selected it.  I double-clicked on a random PDF to make sure that it would now open in Foxit rather than in Acrobat.  When I tried running Printer.bat again, I got a rapid series of error messages.  The gist of these messages was, "No application is associated with the specified file for this operation."  There were no files in the Bullzip Output folder.  Operation failed.  Now what?

I guessed that I was getting those error messages, not because Foxit was not registered as the default PDF reader at this point, but because something about its status as a portable rather than an installed program was confusing printto.exe.  I was surmising, in other words, that printto.exe needed a PDF reader to be installed.  This suggested that Acrobat was opening after each printing of a PDF, not because of some failure in Bullzip, but because printto.exe needed that.  So then could I perhaps insert a batch file command that would kill Acrobat after printto.exe gratuitously started it up?  Or could I install some other (non-portable) PDF reader that would respond differently than Acrobat had done?  Or would it perhaps help, somehow, if I left Acrobat open to another PDF file before running Printer.bat?

Trying that last possibility, I restored Acrobat as the default PDF reader, opened a PDF in it, and tried Printer.bat again.  This time, Printer.bat took off like a shot.  It processed a couple dozen PDFs almost instantly.  Then it slowed way down, but it seemed this was only because Bullzip was printing the PDFs much more slowly than printto.exe was submitting them.  Evidently having a PDF already open in Acrobat was the solution.  Don't ask me why.

Well, now that we had worked out the terms of engagement, printto.exe and Bullzip seemed to be poised to execute the balance of their little pas de deux with grace.  Every few seconds, another PDF would be printed into the output folder.  Ah, but then the overwrite warnings began popping up.  I didn't have time to rename one existing PDF, so as to make room for its brother, before another duplicate warning would interfere with the manual renaming process.  I could have let them go until the end, but I was afraid there might be a lot of overwrite warnings, and the computer would crash or I would get corrupted results.  This was a pretty clumsy operation in the end.  It did seem that it would have been advisable to detect duplicate filenames before starting, and to assign duplicates a temporary or visibly alternate filename for this process.
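Detecting those duplicates in advance is a small job for a script. Here is a minimal Python sketch, with a function name of my own choosing, that groups the paths to be tested by filename and reports any name that occurs more than once. It compares names case-insensitively, since Windows would treat File001.pdf and file001.PDF as the same output file.

```python
import ntpath
from collections import defaultdict


def colliding_names(paths):
    """Return {lowercased filename: [paths]} for names that appear more than once."""
    groups = defaultdict(list)
    for path in paths:
        groups[ntpath.basename(path).lower()].append(path)
    return {name: group for name, group in groups.items() if len(group) > 1}
```

Any path listed in the result would need a temporary alternate name before printing, so that no overwrite warnings come up mid-run.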

But anyway, when this was done, I had 147 items in the output folder.  There had been no error messages.  Sadly, Printer.bat contained 148 lines.  Somewhere, we had a laggard.  I took a stab at finding it.  Failed.  I guessed I had allowed something to get overwritten, but my approach was way too sloppy to figure out which test output PDF got wiped.  I decided that I probably already knew it was a good PDF, since I'd gotten no error message while printing.  So that was the end of this test.

Tuesday, February 7, 2012

Windows 7: Verify or Validate MP3s

I had a folder full of MP3s. I wondered whether any of them were bad.  In a brief previous search, I found MP3 Diags as a possible MP3 tester.  This post describes what I found when I tried it out.

Trying Out MP3 Diags

I was using MP3 Diags 1.0.07.  It had a Mac-type interface that I disliked.  That is, instead of a menu with words that had meaning to me, it showed big, gaudy icons that mostly meant nothing until I hovered my mouse over them to get the tooltip.  (The icons shrank to a more pleasant size when I shrank the program to fill only part of the screen.)  The tooltips were visible only when MP3 Diags was in focus onscreen; I couldn't see them while I was typing these words.

With those initial reactions, I placed myself squarely among those who plunge into a program without first reading its documentation.  I was admittedly more inclined toward the WFM philosophy than toward the RTFM philosophy.  Windows programs did generally seem capable of using menus, including cautionary pop-ups, putting risky functionality into Advanced tabs, and otherwise steering stressed users into safe or pre-warned channels of behavior.  I appreciated that the MP3 Diags programmer had provided many pages of documentation, among which he warned that the program was not really designed for those who were looking for a pushbutton solution.  But it had to be clear that many users would never see that warning, or would perhaps mistakenly think they understood it when they did not, and that was the basis on which I approached the program.

After hovering over all of the icons, I chose two that seemed relevant for starting purposes.  First, when I hovered my mouse over the gearlike icon at its top left corner, the tooltip said, "Scan folders for MP3 files [Ctrl+S]."  From the Windows world, a gear icon would normally mean Tools; it seemed to me that a different icon would have been better, with a tooltip that said, "Select folders to scan."  That option let me use checkboxes to designate a particular folder.  The MP3s in that folder were not large; they averaged about 400K each.  When I checked the boxes, the program was ready to begin scanning.  I canceled out of the gear icon and went to the wrench-and-screwdriver icon, at the top right corner of the program, that seemed more fitting to its purpose:  adjust settings.

Then I had to go back into the gear icon to run the program.  On a fairly up-to-date computer, it seemed to diagnose about 12 MP3s per second.  When it was done, it put up this notice:

Your files are not fully supported by the current version of MP3 Diags. The main reason for this is that the developer is aware of some MP3 features but doesn't have actual MP3 files to implement support for those features and test the code.

You can help improve MP3 Diags by making files with unsupported notes available to the developer. The preferred way to do this is to report an issue on the project's Issue Tracker at http://sourceforge.net/apps/mantisbt/mp3diags/, after checking if others made similar files available. To actually send the files, you can mail them to ciobi@inbox.com or put them on a file sharing site. It would be a good idea to make sure that you have the latest version of MP3 Diags.

You can identify unsupported notes by the blue color that is used for their labels.
At that point, the MP3 Diags screen consisted of three panes.  In the top third of the screen, MP3 Diags listed the MP3s that it had tested, with columns indicating which Notes applied to them.  In the middle third of the screen, MP3 Diags provided explanations of those notes.  In the bottom third, MP3 Diags seemed to be showing me details about the single MP3 that was currently highlighted at the top.  For instance, the bottom pane said this about one file:
1:16, MPEG-2 Layer III, Single channel, 22050Hz, 40000bps CBR, CRC=yes, frame count=2923; last frame located at 0x5d2d1
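Those numbers can be cross-checked against each other with a little arithmetic, shown here as a short Python sketch. It relies on two facts about the format rather than anything MP3 Diags reported: an MPEG-2 Layer III frame holds 576 samples, and its average size in bytes is 72 times the bitrate divided by the sample rate. The variable names are mine.

```python
SAMPLES_PER_FRAME = 576       # MPEG-2 Layer III (MPEG-1 Layer III uses 1152)
frames = 2923                 # frame count reported above
sample_rate = 22050           # Hz
bitrate = 40000               # bits per second, constant bitrate
last_frame_offset = 0x5d2d1   # byte offset of the last frame, as reported

# Playing time follows from the number of samples and the sample rate.
duration = frames * SAMPLES_PER_FRAME / sample_rate
minutes, seconds = divmod(int(duration), 60)

# Average frame size; individual frames differ by a padding byte.
frame_bytes = 72 * bitrate / sample_rate
audio_bytes = frames * frame_bytes

print(f"{minutes}:{seconds:02d}")                    # 1:16
print(round(audio_bytes))                            # estimated size of the audio data
print(last_frame_offset + round(frame_bytes))        # where the last frame should end
```

The computed playing time agrees with the 1:16 shown, and the two byte figures land within a frame of each other at roughly 380 KB, in line with the average file size of about 400K in that folder. Internally consistent figures like these are at least a weak sign that the file is intact from start to finish.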
That last pane didn't seem too important for present purposes, so I focused on the middle pane.  It looked like MP3 Diags had found four kinds of things worth commenting on, in the files that I had submitted for diagnosis.  None of these were error messages; they were all just informational.  They read as follows:
fa  -  No ID3V2.3.0 tag found, although this is the most popular tag for storing song information.

ob  -  No supported tag found that is capable of storing song information.

ab  -  Low quality MPEG audio stream.  (What is considered "low quality" can be changed in the configuration dialog, under "Quality thresholds".)

an  -  No normalization undo information found.  The song is probably not normalized by MP3Gain or a similar program.  As a result, it may sound too loud or too quiet when compared to songs from other albums.
(I was not able to copy and paste these messages from MP3 Diags, and therefore had to retype them here.)  I didn't know where the letters (e.g., "fa") came from; I would have found it more helpful to see error codes grouped into areas of concern (e.g., quality, tags, playability).  The point seemed to be that there were no problems with the MP3s per se, as distinct from their tags and their quality and their normalization:  they would play without errors.  That was my concern.  These weren't songs, as MP3 Diags seemed to assume; they were just old recordings of speech, and as such did not need to be recorded at a high bitrate.  I just wanted to know whether anything had gotten corrupted.

I had been reading that middle pane with its "File Info" button clicked.  I clicked on its "All Notes" button instead.  It showed me errors like these:
aa  -  Two MPEG audio streams found, but a file should have exactly one.

ac  -  No MPEG audio stream found.
These were obviously much more worrisome, and I didn't seem to have them, so that was good.  Ah, but now that I focused on the top pane, I suspected that the four notes shown above (fa, ob, ab, and an) might apply only to the one MP3 that was highlighted in the top pane.  Was I supposed to page down through all of the MP3s to do a manual check of which ones might have which errors?  I tool-tipped the icons at the top of the screen again, looking for some kind of reporting function.  I tried the "Filter by Notes" option.  Unfortunately, since the error notes weren't grouped or hierarchically arranged under main topics (e.g., playability), it appeared that I would have to read all 23 notes and choose the ones that were worrisome -- as distinct from, say, just clicking on a main category and optionally selecting or deselecting subcategories.  Some of the categories were presented in colored print, for unknown reasons.  For instance, "aa" was brown, "an" was black, and "dj" was blue.  The brown wasn't much different from the black; I had to look close to make sure.  I wasn't entirely sure which categories to worry about:  was "dj" ("Unsupported ID3V2 version") ominous, or did it just mean something to do with tags, which were irrelevant for purposes of these MP3s?  I had to look at Wikipedia to see that ID3V2 related to information about the file, and apparently not its actual audio contents.

The error codes I chose to filter by were aa, ac, ad, ae, ak, bg, cb, ia, ib, ja, kb, kc, kd, and of.  As that last example illustrates, it was awkward to write about some of these codes (e.g., an, of) without using quotation marks, since their letters formed English-language words.  I wasn't sure, but a quick search suggested that these codes might be peculiar to MP3 Diags, and therefore easily changed, rather than being promulgated by some official MP3 authority.

Once I had selected those codes to filter by, the top pane in MP3 Diags changed.  Now I could see that various files had various issues.  It was also more obvious, now, that both the middle and bottom panes were providing information specifically about the file highlighted in the top pane.  I would have liked to see a count of how many files had error "aa," for instance.  Another useful feature would have been an option to display files in order of the number and/or seriousness of their errors, so that those with six or eight errors would come before those with just one.  It would also have been interesting to see whether all of the files having a certain error were clustered in the same folder, in which case I would think maybe I should just replace that whole folder with good copies from a backup.  I could refilter by folders, but apparently I couldn't filter, sort, or output a report by both folders and error codes.  In fact, it seemed I couldn't output a report at all, which meant that I wouldn't be able to write a batch file to mass-delete or mass-zip any files that might be bad.  If I selected a group of files in the top pane, it seemed that the only error codes I would see in the middle pane would be those pertaining to the first file in the group.  There were no right-click options for selected files.  And if I clicked on an error code column (e.g., ab), the program would not re-sort the files according to their values in that column.
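To make that wish list concrete, here is a Python sketch of the kind of report I had in mind. It is purely hypothetical: as far as I could tell, MP3 Diags offered no export, so the input here (a dictionary mapping each file path to its set of note codes) is an assumed format that would have to come from somewhere else, and the function name is mine.

```python
import ntpath
from collections import Counter


def summarize(notes_by_file):
    """Summarize {path: set of note codes} three ways.

    Returns (files per code, paths sorted worst first, flagged files per folder).
    """
    per_code = Counter(code for codes in notes_by_file.values() for code in codes)
    # Most notes first; ties broken alphabetically by path.
    worst_first = sorted(notes_by_file, key=lambda p: (-len(notes_by_file[p]), p))
    per_folder = Counter(
        ntpath.dirname(path) for path, codes in notes_by_file.items() if codes
    )
    return per_code, worst_first, per_folder
```

The first result would answer how many files had "aa," the second would put the files with the most notes at the top, and the third would show whether the flagged files clustered in one folder.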

Later, I would look in more detail at some of the error results (below).  But at this point, MP3 Diags had helped me to clarify my thinking about Mac-like software.  A while back, someone had asked me why I didn't just use Apple hardware and avoid the hassles I was experiencing with PC stuff.  I had my reasons, but it was still a question worth keeping in mind.  I knew I didn't really care for Macs, but hadn't thought much about why not, exactly.  If MP3 Diags was any indication, it now seemed to me that one answer to that question would be that much Mac stuff is lacking in what would be considered basic functionality, among PC programs (although, as this post demonstrates, that's a problem in much PC software too).  At its worst, Mac software seems to promote the idea that simplicity is superior.  And it is, if you don't need to do anything complex.  But again, that was just a passing reaction, based (at the moment) on one program that was surely not a fair representation of Mac software at its best.

It did seem quite possible that the programmer of MP3 Diags would find such remarks puzzling if not bizarre.  I realized that my reactions could well be very far from his intentions.  Again, my purpose was to convey a sense of how the program felt in use, a sort of walk-through from a new user's perspective.  Another way to phrase the message was that idiosyncratic design may be best saved for those situations where it is really necessary.

Alternatives to MP3 Diags

While I had my objections to MP3 Diags, I was not seeing any immediately obvious alternatives.  Not to say there were no contenders.  My previous search had also led to MP3val, Checkmate MP3 Checker, and MP3Utility.  I did some random flailing around, looking at these and following leads to others.  An eHow webpage suggested Dr. Tag's MP3 Repair Tool.  In a search for "MP3 validation," the SnapFiles list of "Misc. MP3 Software" included MP3 Diags, MP3val, and also MP3Test ($17 trial).  Another search led to an old thread that mentioned Foobar 2000, a highly recommended MP3 player that apparently had some kind of MP3 checking capability (or maybe it just wouldn't play bad MP3s), as well as dBpowerAmp ($14), Audiotester, Mr. Question Man (also sometimes called Burrrn), and EncSpot.

Among all those programs, there didn't seem to have been much testing.  Some were old and had not been updated.  In most cases, the ratings at sites like CNET and Softpedia were based on just a few votes (sometimes just one).  There were miscellaneous accounts -- for instance, MrSinatra reported questionable results from Audiotester.  But I felt I was swimming in very murky waters.  It seemed my best strategy might be to look at the ratings in Softpedia and CNET, supplemented by a search for reviews and a general sense of which of these tools had been most extensively used and recommended.  Based on various comments I had seen, I decided to focus on MP3val, Checkmate, MP3Utility, MP3 Repair Tool, Foobar 2000, and Mr. Question Man.

Of those six programs, only Foobar2000 was listed on CNET (rated 4 out of 5 stars by 129 voters, excellent by editors).  Softpedia listed MP3val (3.5 stars by 26 voters), Checkmate (3.0 stars by 24 users), MP3Utility (3.6 stars by 21 voters), Portable MP3 Repair Tool (2.7 stars by 30 users), and Mr. Question Man (3.5 stars by 24 users) -- and Foobar2000 (4.6 stars by 1646 voters).  I decided to start with Foobar2000.  If that didn't lead where I wanted, I would try looking at MP3val, MP3Utility, and/or Mr. Question Man.

But then I ran another search.  I was curious as to whether Winamp would provide at least the same functionality as Foobar2000.  The search inadvertently pointed me toward two other MP3 checking programs, both on Softpedia, that looked like they might actually be more widely used than some listed in the previous paragraph.  Those two programs were MP3-Check (3.8 stars by 63 voters) and MP3 Checker (3.9 stars, 25 users).  As for Winamp itself, a couple of searches seemed to indicate that it did not share Foobar2000's ability to test MP3s.  These results suggested that, if Foobar2000 didn't do the job, I should look next at MP3 Checker and MP3-Check, before turning to the others listed above.

Verifying MP3s in Foobar2000

I had expected Foobar2000 to give me a funky, multicolored player interface like Winamp.  Instead, I got a straightforward space where files would be listed.  I didn't see where it would have the ability to verify MP3s, so I went back to that old thread and saw that I would first have to load some files and then right-click on them and select the appropriate option.  I did that.  Loading files took a while.  Foobar2000 indicated that it was "processing" them.  It seemed to be handling only a few per second.  When it was done, unfortunately, a right click revealed no testing options.

On the other hand, a right-click on an MP3 in Foobar2000 did present the possibility of doing a mass conversion.  Foobar2000 offered ten output conversion formats, including WAV.  (It also offered ways of modifying the output, e.g., crossfade, skip silence.)  This raised the possibility that I could use a mass conversion as a test.  Presumably a bad file would not be capable of being converted to another format.

Or at least a bad file would not produce good sound upon conversion.  I visualized an MP3 testing program of the future.  It would extract sound samples from several different points in the MP3, including beginning, middle, and end, and would compare them against designated reference files containing samples of the kinds of noise that the tested file should contain.  Perhaps this imaginary program would concatenate copies of MP3s not fitting the profile, display the waveform, and underneath it show the name of the file, and the location within the file, from which the presently viewed sound sample was taken.  That way, users could eyeball the program's judgments as to which MP3s were conforming or nonconforming, and could have some hands-on assurance that the program was accurately detecting acceptable vs. screwed-up MP3s.

I didn't have a program like that.  But, as I say, I did have the option of doing a bulk conversion.  Apparently Foobar2000 and, as it turned out, Winamp would do it.  Cool Edit 2000 (no longer available) would do it, and so, probably, would some other audio editors.  I was afraid that conversion could take a long time, but Foobar2000 converted a test group of 20 of these small MP3s within just a few seconds, and the files played successfully.  So this was one possible route.  I guessed that Foobar2000 would convert a blank file without objecting, though, and would otherwise fail to provide some of the warnings that I had seen in MP3 Diags (above).  I decided to look at other possibilities.

MP3 Checker and MP3-Check

I looked at the Softpedia and CNET pages for Convivea's MP3 Checker.  They essentially repeated what I saw on the MP3 Checker homepage.  It sounded like a simple and capable program.  The latest version was 1.08, released on July 22, 2006.  I also looked at the homepage and the Softpedia and CNET pages for MP3-Check by AudioMoves.  On this very preliminary basis, I was leaning toward MP3-Check over MP3 Checker because the MP3-Check homepage provided a more detailed description of what it did, and because its webpage and the product both seemed to have been updated within the past year or less.

So I downloaded MP3-Check 1.40 from Softpedia and installed it.  It appeared to be designed primarily to check for MP3s that might have problems with quality or with their tags.  These were not my concerns at present; I knew that some of the MP3s I would be testing might have bad or nonexistent tags, low bitrates, low sample rates, low volume, or might use joint stereo -- to cite the five criteria that MP3-Check allowed me to select and, to varying degrees, to adjust.  I turned off all of those options except the tag check, which did not clearly appear capable of being turned off.  I checked the option to Create Status Logfile.  Then I ran the program on the same folder that I had tested with MP3 Diags (above).  I could see indications in the program's status bar that it was checking files very quickly.

When MP3-Check was done, it put up a little notice indicating how many files it had checked.  It calculated an average time of about 55ms per MP3, for these little MP3s that I had it check.  When I clicked OK on that notice, it opened its logfile.  The log showed me the names of the files it had checked, and their sample and bit rates.  The log was tab delimited, so I could copy and paste directly into Excel and sort by the various columns (e.g., bitrate), to see if anything looked odd.  There were several files with very low bitrates.  I dug those out and listened to them.  Two of them were corrupted, and I was able to replace them with backups.  But those were the only bad files I was able to find this way.  MP3-Check did indicate that a large number of my files had tag problems, as I expected.  It also indicated that there was one "Unordinary MP3" in my list.  I wasn't sure what was the matter with that file.  It seemed to play OK.  Otherwise, I was done with MP3-Check.
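
The spreadsheet step could presumably be scripted as well.  What follows is a minimal sketch, not a feature of MP3-Check:  it assumes the log has been saved as tab-delimited text with a header row, and the column names File and Bitrate are my guesses, to be changed to match whatever the real log uses.

```python
import csv
import io

def low_bitrate_files(log_text, threshold_kbps=16, name_col="File", rate_col="Bitrate"):
    """Return (name, bitrate) pairs below the threshold, lowest bitrate first.

    The column names are hypothetical; adjust them to the real log's header.
    """
    reader = csv.DictReader(io.StringIO(log_text), delimiter="\t")
    flagged = []
    for row in reader:
        try:
            rate = int(row[rate_col])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without a usable numeric bitrate
        if rate < threshold_kbps:
            flagged.append((row[name_col], rate))
    return sorted(flagged, key=lambda pair: pair[1])

# Made-up sample data in the assumed layout
sample = "File\tBitrate\tSampleRate\na.mp3\t64\t22050\nb.mp3\t8\t8000\nc.mp3\t16\t11025\n"
print(low_bitrate_files(sample))
```

The default threshold of 16 reflects my own situation, in which 16kbps files were expected but 8kbps files were suspect.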

MP3-Check vs. MP3 Diags vs. Mr. Question Man

I wondered if MP3 Diags had detected the two or three bad files that I had just identified using MP3-Check.  I hadn't closed MP3 Diags yet, so now I went back to look at the results it had shown me.  Looking through those results was not easy.  There was no Ctrl-F option to find a specific file by name.  There was also no Ctrl-A option to select all files in the list.  PgUp worked, but Home didn't, so I would have to page or scroll to get to the top of the list.  Once I was there, there didn't seem to be any Shift-End or other key combination that would let me select the whole list that way either.  I could select the whole list by starting at the top and doing a Shift-PgDn until I got to the bottom, or by scrolling all the way down, but of course this would take a long time with a large list.  And what was the point?  Once I had the whole list selected, Ctrl-C worked for only one item at a time, and there was also no right-click option, by which I could copy the list and paste it into a spreadsheet or Notepad for further searching and comparing.  Worse, my attempts to highlight the whole list with Shift-PgDn caused MP3 Diags to freeze up.  It came back to life after five or ten minutes, but this was not encouraging.

I thought maybe I could find those few bad files, if MP3 Diags had detected them, by narrowing my filter to just those categories that might have caught a corrupted file.  But I thought I had already done that, when I decided to disregard tags and such (above) and focus instead on audio stream issues. 

It seemed that MP3 Diags was a good start on a potentially great program.  But I was uncomfortable with what seemed to be its core idea:  never mind about the details; just click on the proper selection and we will take care of fixing your MP3s.  It was a nice dream, and their colorful GUI seemed to support it, but so far the program had not won my confidence.  I knew it was quite possible for a magical, black-box program to make things worse.  I appreciated that these were all freeware programs, and that their creators had kindly made them available to the rest of us.  It was just that I was looking for one really good program to do this job right and not make more headaches for me.

At some point in the process, I took a look at Mr. Question Man.  Its webpage had not been active since 2006, and its description on Softpedia did not make clear whether it would repair MP3s, as distinct from merely providing information about them. But I went ahead with it anyway, on the strength of its positive user votes, few but mighty.  It did turn out to be informational only.  I appreciated its option for configuring the Isolinear Optical Chips Latency, on the Settings tab devoted to PapalaPapIHaveToCustomizeEverything.  Funny program.  I was thinking the writer should try political journalism.  S/he might have a positive impact there.

MP3 Checker

I downloaded and ran MP3 Checker.  It made a good initial impression, with practical options like "Do not report MP3s with minor glitches as BAD" and "Move MP3s with errors to the quarantine directory" (typos corrected here).  I didn't like that I couldn't resize its window.  And in the time it took me to write that last sentence, it had finished its scan of a rather substantial number of MP3s; it took the focus onscreen; and since I was in the middle of typing, my keystrokes seemed to be just what it needed to shut itself down.  It was like a miniature tornado had ripped across my screen, kicking up a little dust but apparently not doing anything significant.

I started it up and ran it again.  It had remembered all of my previous settings except the folder where I wanted it to look for MP3s.  This time I made a point of finishing my typing before I started it, so I wouldn't inadvertently shut down its closing announcement again.  What I saw this time -- again, after running for less than a minute -- was not good.  It said that it had completed the scan and had scanned 0 files, processing 0 total bytes of data, verified 0 good MP3s, and detected 0 bad MP3s.

There had to be something wrong.  A couple dozen users had given this thing fairly good marks.  It had seen my MP3s -- I could see it listing their names in its status bar.  It believed it was doing something with them.  I was missing something somewhere.  Maybe its users had been running Windows XP -- maybe somehow that made a difference?  I had no idea.  A discussion thread conveyed the impression that MP3 Checker was a casual project and seemed to produce erroneous results.  With all due respect to other reviewers who liked it, I concluded that this was not the program for me.

MP3val

As noted above, several other MP3 validation tools had received ratings in the vicinity of 3.5 stars from a couple dozen users each.  MP3val was one of those.  It came with the executable file mp3val.exe, which would run from the command line with several options including -f (try to fix errors) and -si (suppress info messages).  I chose to run it as a portable GUI program.  There were virtually no settings options.  Basically, I just pointed it to the folder containing the MP3s and told it to scan them.  It loaded the list of files in that folder, and then gave me options of scanning or repairing all or selected files within that folder.  I told it to scan them.  It ran down the list in a spreadsheet-like layout with two columns:  file name and state (i.e., condition).  In a small pane at the bottom, it displayed what I guessed I would have seen if I had run the command-line version:  the name of the file it was analyzing, a warning that the file contained no supported tags, and a statement of the file's properties (e.g., its number of frames, type of MPEG, number of tags, CBR).  It was proceeding quickly but not instantly.  It did appear to be doing a genuine scan.

When it was done, I tried to sort the list of files by clicking on the header of the State column.  That didn't work.  The procedure was, instead, to go into View > Scanned Files with Problems.  It showed me a list of troubled MP3s.  When I selected one, the pane at the bottom told me what the problem was.  Here were some of the warnings displayed there:
MPEG stream error, resynchronized successfully.

No supported tags in the file.

VBR detected, but no VBR header is present.  Seeking may not work properly.

It seems that file is truncated or there is garbage at the end of the file
The manual (a simple HTML file included in the portable folder) listed a number of other possible error messages.  I would soon be comparing these error messages against those generated by MP3Utility (below).

The manual did not provide an indication of how I might save the scan results into a file.  Ctrl-A didn't work, but I was able to select all of the files in the Problem list by using Ctrl-Shift-End from the top.  But once they were selected, Ctrl-C didn't work to copy them, and a right-click just gave me options to delete, scan, or repair the selected files.  I looked in vain for a log file that it might have created in its program folder; apparently its list of problem files was saved either in RAM or in a temporary directory somewhere.  (It belatedly occurred to me that I could perhaps access a log to get the information I had been unable to extract from MP3 Diags, above, but it was not clear to me what its MP3Diags.dat file was trying to say, so for practical purposes that workaround didn't work.)  It probably would have been possible to get the list of problem files from the screen, using a capture-and-OCR program like Aqua Deskperience (which I had bought) or JOCR or SysExporter, though that route would be painful with a long list of files.  It would probably be possible, and surely easier, to use the command-line version of the program to get a list of files.
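
I did not actually try the command-line route, but a rough sketch of it might look like the following.  The assumptions are mine:  that mp3val.exe accepts a file name followed by the -si option mentioned above, that it writes its messages to standard output, and that problem reports begin with the word WARNING or ERROR.  The function names are hypothetical.

```python
import subprocess
from pathlib import Path

def build_command(exe, mp3_path):
    # -si (suppress info messages) is the option described in the text above
    return [str(exe), str(mp3_path), "-si"]

def problem_lines(output):
    # Assumption: problem reports start with WARNING or ERROR
    return [line.strip() for line in output.splitlines()
            if line.lstrip().startswith(("WARNING", "ERROR"))]

def scan_folder(exe, folder, log_path):
    """Run the checker on every .mp3 under folder and log any problem lines."""
    with open(log_path, "w", encoding="utf-8") as log:
        for mp3 in sorted(Path(folder).rglob("*.mp3")):
            result = subprocess.run(build_command(exe, mp3),
                                    capture_output=True, text=True)
            for line in problem_lines(result.stdout):
                # One tab-delimited row per problem, ready to paste into a spreadsheet
                log.write(f"{mp3}\t{line}\n")
```

Something like scan_folder("mp3val.exe", r"D:\Folder", r"D:\Test\mp3val-log.txt") would then yield a tab-delimited list of problem files that could be sorted and compared in Excel.  Those paths are only illustrations.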

MP3Utility

As noted above, I had found two other MP3 validators that had averaged 3.5 stars or better from at least 20 users.  These were MP3Utility and Mr. Question Man.  The Softpedia page for MP3Utility, a portable, seemed to indicate that the program's last revision was in 2009.  The webpage and, even more, its Readme.txt also provided an encouraging amount of detail.  I got the impression of a careful, thoughtful effort to identify and handle flawed MP3s.

MP3Utility offered few but potentially useful options, such as the possibility of adding it to the right-click context menu.  When I ran it on my test folder, it identified 16 bad MP3s, moved them to a designated folder, and allowed me to save its log file for later reference.

I took a look at those results.  For 15 of those 16 bad MP3s, the logged error was of this type:  "First sync error at approx.1:31 (80% through audio)."  For the other one, the error was, "Can't locate first valid frame header within 5,000 bytes of beginning of file."  The MP3Utility Readme said that the program would identify several kinds of errors, which I summarize as follows:
Unable to open file (file is protected by another application or was moved after being initially loaded into MP3Utility)

File too short, or End of file encountered in first audio frame

Can't locate first frame header

Last audio frame truncated (can be ignored in almost all cases)

Last audio frame too long (can probably be safely ignored)

Error reading frame header xxx (i.e., sync error; serious error except possibly when it occurs at end of file)
So the logged results of the search of my MP3s did not match up exactly with this list, but apparently the only errors in my MP3s were in the last category:  I had 16 files with sync/header errors.  The log stated that it had "found errors/warnings in 16 files," so evidently MP3Utility didn't think any of my files had any of the other errors listed here.

Comparison of Errors Found

How did the results from MP3Utility compare against the results of other programs?  I had made a slight mistake, for comparison purposes:  I had gone ahead and replaced two bad files after running MP3-Check, as noted above.  But otherwise the errors identified by MP3Utility should have matched up exactly with the errors identified by the other programs (above) that did produce a list of bad files.  The programs of particular interest, at this point, were MP3 Diags, MP3-Check, and MP3val.

I looked at MP3 Diags first.  I wondered how its relatively extensive list of "notes" would match up with the list of errors just given.  It seemed that MP3Utility was superior on the first one, "Unable to open file," insofar as there was no acknowledgement of any such possible error in MP3 Diags.  In other words, MP3 Diags would apparently give the user the impression that all files had been checked, even if some of them were locked or not found.  This incorrect information may have seemed of no concern from the MP3 Diags perspective, since the program appeared to be oriented toward giving the user a complete (long) list of errors and then, after running a fix, presumably declaring most if not all of them to be repaired.

The second MP3Utility error, "file too short," appeared at first to be divided into at least three categories in the MP3 Diags errors:  "no MPEG audio stream found" (which MP3 Diags labeled an "ac" type error), "invalid MPEG stream - fewer than 10 frames" (type "ak"), and "File contains null streams" (type "kd").  But possibly I had misunderstood what a "stream" was.  I thought a stream was the audio data, as distinct from some sort of header and/or trailer that would contain non-audio data (e.g., tags).  But it turned out that, at least in mp3HD format, you could have an MP3 that would have two data streams.  Apparently that was not possible when MP3 Diags was created, else its programmer would not have included the "aa" error message, which stated that a file should have exactly one audio stream.  Then again, some MP3 Diags error descriptions (e.g., "kc") did seem aware of this.  Anyway, MP3 Diags identified a number of MP3s with these problems:  many with an "ac" error, many with an "ak" error, and six with a "kd" error.

MP3 Diags did not provide a way to right-click or double-click on a file listed in its onscreen error report, so I searched manually to check some of the files listed under those three error categories.  First, I noticed that three of the "kd" files were reported as having "ak" errors as well.  These seemed to be exceptionally troubled MP3s.  All three of these were in the folder containing Bad MP3s that MP3Utility had segregated.  So, good, the programs seemed to agree about those.  How about the "kd" files not containing "ak" errors?  Those were in the Bad MP3s folder too.  So MP3 Diags and MP3Utility seemed to agree that "kd" errors were bad (though possibly MP3Utility had moved one or more of those files to the Bad MP3s folder for some other reason).  But obviously MP3Utility did not share the MP3 Diags concern with many other files, else the Bad MP3s folder would have contained far more than just 16 files.  It did not appear that MP3 Diags was trying to produce a careful technical analysis that researchers and others could use for multiple purposes.  If that had been the case, it would presumably have been possible to export the MP3 Diags error results to a log file.  Fairly or not, I was reminded of those anti-malware programs that seemed to exaggerate the number and significance of threats to one's computer security.  Spot checks of several other files containing both "ac" and "ak" (but not "kd") errors did not lead to any obvious problems:  the files seemed to play OK.

As noted above, the next MP3Utility error, "Can't locate first frame header," was apparently serious enough to qualify a file as a Bad MP3.  I couldn't tell which MP3 Diags error would be similar to this one.  The only MP3 Diags errors that referred explicitly to headers were "bg" and "cb," but both of those were described as being issues that would matter only to "some players."  The one Bad MP3 in which MP3Utility found this error was not included among those listed by MP3 Diags as having either "bg" or "cb" errors.

I decided not to examine the next two errors in the foregoing list of errors that MP3Utility would identify, since the Readme said those could probably be safely ignored.  This took me to the last item in the list.  The idea there seemed to be that MP3Utility had found a frame header, but it couldn't be read, and this caused or was related to a sync error, and that was serious.  MP3 Diags did not have any error messages referring to "sync errors" per se.
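
To make the "first frame header" idea concrete:  an MPEG audio frame header begins with eleven set bits (the sync word), followed by fields for version, layer, bitrate, and sample rate, some values of which are reserved or invalid.  The sketch below looks for the first plausible header within a byte limit.  It is not MP3Utility's actual algorithm, which I have not seen; the 5,000-byte default merely echoes its error message, and the function name is hypothetical.

```python
def find_first_frame_header(data, limit=5000):
    """Return the offset of the first plausible MPEG audio frame header, or None."""
    start = 0
    # Skip an ID3v2 tag if present: 10-byte header whose size field
    # uses 7 bits per byte ("syncsafe"). An optional tag footer is ignored here.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = ((data[6] & 0x7F) << 21) | ((data[7] & 0x7F) << 14) \
             | ((data[8] & 0x7F) << 7) | (data[9] & 0x7F)
        start = 10 + size
    end = min(len(data) - 2, start + limit)
    for i in range(start, end):
        b1, b2, b3 = data[i], data[i + 1], data[i + 2]
        if b1 != 0xFF or (b2 & 0xE0) != 0xE0:
            continue  # the 11 sync bits are not all set
        if (b2 >> 3) & 0x03 == 0x01:
            continue  # reserved MPEG version
        if (b2 >> 1) & 0x03 == 0x00:
            continue  # reserved layer
        if (b3 >> 4) == 0x0F or (b3 >> 2) & 0x03 == 0x03:
            continue  # invalid bitrate index or reserved sample-rate index
        return i
    return None
```

A text file renamed to MP3 would normally return None here, since plain text contains no 0xFF bytes.  A real checker would go further, e.g., by confirming that another valid header follows at the expected frame length.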

Since I was not doing too well in an attempt to compare apples to apples, between MP3 Diags and MP3Utility, I decided to try another strategy.  My idea was to filter the MP3 Diags output for those errors that sounded most serious, according to the MP3 Diags error descriptions, and then compare the results against the set of files that MP3Utility had moved to the Bad MP3s folder.  But as I went down the MP3 Diags list, I couldn't really tell if any were serious.  The one exception was "ac" ("No MPEG audio stream found") -- but as noted above, I had sampled some "ac" files and they played, so I didn't understand how MP3 Diags could say that it found no MPEG audio streams in them.  I mean, they were MP3 files, and MP3 is a kind of MPEG.  I tried renaming one of them as a WAV and playing that in IrfanView, and that produced an error ("Can't read file header"), and the error went away when I changed it back to an MP3 extension.  It did seem to have an MPEG stream.

In a modification of that alternate strategy, I filtered the MP3 Diags list for all error messages that sounded like they could involve significant problems in playback (except for "kd," which I had already examined separately, above).  The ones I selected were "aa," "ac," "ad," "ak," "bg," "cb," "ib," and "kc."  MP3 Diags didn't give me a union/intersection choice -- that is, I couldn't indicate whether I wanted the program to display only those files that had *all* of these problems -- so instead it showed me all of the files that had *any* of these problems.  And as noted earlier, MP3 Diags also didn't give me a way to sort this list according to the numbers or types of errors.  I paged down through the list and manually selected a half-dozen files that had at least seven of these problems.  They all played OK -- including a couple that MP3Utility had placed into the Bad MP3s folder.
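
The any-versus-all filter I was wishing for is simple to express, if the notes could be exported.  This is only a sketch under that assumption:  the data layout (a dictionary mapping each file name to its set of note codes) and the function name are mine, and the sample data is made up.

```python
def filter_files(notes_by_file, wanted, mode="any"):
    """Return files whose note codes match `wanted`, most matches first.

    mode="any" keeps files with at least one wanted code (union);
    mode="all" keeps only files with every wanted code (intersection).
    """
    wanted = set(wanted)
    matches = {}
    for name, codes in notes_by_file.items():
        hits = wanted & set(codes)
        if (mode == "all" and hits == wanted) or (mode == "any" and hits):
            matches[name] = len(hits)
    # Sort by number of matching codes (descending), then by name
    return sorted(matches, key=lambda n: (-matches[n], n))

# Made-up example data, not actual MP3 Diags output
notes = {"a.mp3": {"ac", "ak", "kd"}, "b.mp3": {"ac"}, "c.mp3": {"bg"}}
print(filter_files(notes, {"ac", "ak"}, mode="any"))
print(filter_files(notes, {"ac", "ak"}, mode="all"))
```

Sorting by the number of matching codes would have put the most heavily flagged files at the top, sparing me the paging and manual selection.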

My conclusion about MP3 Diags, at this point, was that -- for whatever reason -- it was displaying large numbers of error messages that didn't seem to have significance for purposes of playback (as distinct from, say, tag editing).  In so doing, its very limited options meant that users would be at considerable risk of missing the potentially small number of files with real problems.  As noted above, this could make sense for the user whose available time was commensurate with the number and length of files being checked, but it could be overwhelming for others.

Possibly users would eventually see the files with major problems, if they proceeded to let MP3 Diags repair their files.  That is, maybe there would be only a few seriously troubled files left in the list, after the repair process ran.  But I didn't want to let MP3 Diags have its way with my files if I didn't actually need to do that.  After all, the problematic files seemed to be working OK, and these other programs weren't telling me that I needed repairs on large numbers of MP3s.  So I didn't get to that stage of seeing what files would remain on the MP3 Diags list after it ran a repair.

Moving along, then, how about a comparison among MP3Utility, MP3-Check, and MP3val?  As noted above, MP3-Check had identified only a couple of troubled files, and I had replaced them.  I hadn't kept their names, but I had a backup of the tested folder, so I ran MP3-Check there.  Ah, yes, now I remembered which files they were -- now that I was looking at them again.  There were four of them.  The MP3-Check log showed them as being recorded at the very low bitrate of 8kbps; and when I compared them against copies from an old backup, I could hear that there was something very wrong with them.  And in part, maybe that was the point of programs like MP3 Diags -- maybe there were lots of little problems that you wouldn't notice.  Maybe the file would sound like it was fine, until you compared it against another version, or played it with a different player.  (The minimum bitrate threshold in MP3 Diags could not be set low enough to distinguish those 8kbps files from others recorded at 16kbps, which was a setting that some old or otherwise limited audio devices would use.  That is, it was unlikely that those four files should have been stored at 8kbps, and my listening test had revealed that there was something wrong with them; but it was quite likely that I would have some files recorded at 16kbps.)  I was listening to these files in IrfanView, which was able to play almost anything; maybe I would have been having a very different reaction if I'd had to use other software.

Anyway, I had to run MP3Utility and MP3val against that backup folder too, to get a good comparison among the three programs.  The MP3Utility log showed me the same list of 16 bad files as before.  It contained none of the four files that MP3-Check said were recorded at 8kbps -- files that I had manually confirmed were corrupted.  MP3val gave me a far larger list of problem files, but it identified only one of those four.  It seemed, then, that if I used either MP3Utility or MP3val, I might want to supplement it with a program, like MP3-Check, that would determine each file's bitrate.

Next, I compared the list of 16 problem files identified by MP3Utility against the larger list generated by MP3val.  All were included in that larger list.  I suspected that the remarks about the philosophy of MP3 errors, as laid out in the Readme for MP3Utility, might give me some guidance in adjusting the MP3Utility options, such that it would detect more errors -- possibly the very same ones as MP3val had identified.  I was not feeling any particular need to look into this at present.  It appeared that MP3val would do everything that MP3Utility would do, so I tentatively decided to go with MP3val.
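
I did these comparisons by eye.  With the lists saved as text, they could be reduced to set arithmetic, roughly as follows.  The function name is hypothetical and the file names in the example are placeholders, not my actual results.

```python
def compare(tool_results):
    """tool_results maps a tool name to the set of file names it flagged.

    Returns a dict keyed by (a, b) listing files flagged by tool a but not by tool b.
    """
    report = {}
    for a, files_a in tool_results.items():
        for b, files_b in tool_results.items():
            if a != b:
                report[(a, b)] = sorted(files_a - files_b)
    return report

# Placeholder data for illustration only
results = {
    "MP3Utility": {"x.mp3"},
    "MP3val": {"x.mp3", "y.mp3"},
}
for (a, b), missed in compare(results).items():
    print(f"flagged by {a} but not {b}: {missed}")
```

An empty list for (MP3Utility, MP3val) would correspond to what I found:  everything MP3Utility flagged was also on the MP3val list.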

The Conversion Alternative

As just noted, MP3val identified only one of the four low-bitrate files listed by MP3-Check.  MP3val said that the problem with that file was, "This is a RIFF file, not MPEG stream."  This gave me an idea.  I checked a few of the other files in which MP3val had identified errors.  The "no VBR header" problem seemed very common.  It seemed that there might be advantages, for some purposes, in developing a concept of the ideal MP3 file -- what it would need to have in terms of tags and so forth -- and then building that into a bulk conversion process.  Then all of the files would pass almost any test, assuming I disregarded inappropriate tests like the bitrate threshold, suitable for music but not text, found in MP3 Diags.  In other words, I would run the conversion, and thereafter I would not have this motley collection of all sorts of errors, produced by various pieces of hardware.  Maybe truly flawed files would stand out more obviously in that sort of arrangement.
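
The RIFF message suggested a quick check that does not depend on any of these programs.  A WAV file is a RIFF container:  it starts with the bytes RIFF, and bytes 8 through 11 read WAVE.  The sketch below lists files with an MP3 extension that begin that way.  The function names are hypothetical, and this is only a header test, not a validation of the audio inside.

```python
from pathlib import Path

def is_riff_wave(head):
    """True if the first 12 bytes look like a RIFF/WAVE container header."""
    return len(head) >= 12 and head[:4] == b"RIFF" and head[8:12] == b"WAVE"

def mislabeled_wavs(folder):
    """Return .mp3 files under folder that actually start with a RIFF/WAVE header."""
    found = []
    for path in sorted(Path(folder).rglob("*.mp3")):
        with open(path, "rb") as f:
            if is_riff_wave(f.read(12)):
                found.append(path)
    return found
```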

I decided that the conversion approach was interesting but unnecessary at this point.  It would perhaps be more appealing if, someday, I investigated the question of what standards were best for archival purposes.  That is, there were probably people out there, somewhere, who had decided that 64kbps WMA was the most stable, durable, reliable format for long-term voice data archives.  Or something like that.  Standards and formats would change and become obsolete from time to time.  Some such conversion might make sense, once I knew what I was doing.  But for right now, there was no point doing a mass conversion to 56kbps, or some other number drawn from a hat.

Fixing the Problems

The MP3 Diags manual contained some advice:  "If you like your files and they don't bother you, then you probably shouldn't change them."  Such advice, seemingly reasonable, rested on the assumption, consistent with some aspects of that program's design and help system, that people would be using MP3 Diags to do detailed exploration of a limited number of individual music files that they would be listening to in their full length.  Like most advice, unfortunately, there could be many situations to which it would not apply.  One need only visualize a paralegal who was under time pressure to verify and sort dozens of multihour deposition recordings, or a corporate peon who was expected to clean up thousands of tech support call recordings, not to mention myriad casual users who had the belief, perhaps mistaken, that it was possible to have a trustworthy program that could identify and fix major errors in their song files.  It seemed that users would be better protected by programs that used a standard design, so as to help them recognize when they were getting in too deep.  As this post demonstrates, most users would probably find it prohibitively time-consuming to try to read the help files and master the eccentricities of various unfamiliar MP3 validation programs, in a search for one that did what they wanted.  In other words, it could seem rather misanthropic to provide a big red "Fix" button along with a buried warning, "Never use the Fix button."

It presently appeared that MP3val had done a good if imperfect job of identifying problem files.  Given its decent reputation (and the fact that I had a backup), I put the Bad MP3s (removed by MP3Utility) back into the main folder, along with the other MP3s, and told MP3val to scan that folder.  When it was done, I told MP3val to repair all files.  When that was done, I went into MP3val's View > Scanned Files with Problems.  All of the bad files were reported as fixed.  I spot-checked a few.  They seemed to be fine.

I was surprised to see that, after MP3val ran, it still listed a number of files in the PROBLEM category.  The problem, in every case, was that "no VBR header is present."  Evidently this was something that MP3val could not fix.  I thought that a conversion approach (above) might solve that sort of problem.  There did not seem to be any urgency about it -- by this point, I was getting the sense that I had almost no MP3s that were absolutely unplayable and thus needed to be restored from a backup -- so I figured I could postpone repair of the "no VBR header" problem until I had learned more about archival formats (above).

I noticed, also, that MP3val had created .bak (backup) copies of the files that it had repaired.  This was apparently the meaning of the Preferences > Delete backup files option:  apparently MP3val worked by creating a backup file in the same folder as the MP3 file, making its changes, and then optionally deleting the backup.
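
Before deleting those backups, it might be useful to list which repaired files still had one.  This sketch assumes, without my having confirmed it, that the backup is named by appending .bak to the original name (e.g., song.mp3.bak); the function name is hypothetical.

```python
from pathlib import Path

def backup_pairs(folder):
    """Return (repaired_file, backup_file) pairs found under folder."""
    pairs = []
    for bak in sorted(Path(folder).rglob("*.bak")):
        original = bak.with_suffix("")  # "song.mp3.bak" -> "song.mp3" (assumed naming)
        if original.exists():
            pairs.append((original, bak))
    return pairs
```

Backups without a matching file are left out of the list, since those would deserve a closer look rather than routine cleanup.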

Conclusion

I looked at a number of MP3 validation programs.  I had limited time and expertise in which to do a comparison.  I was able to eliminate some programs from consideration fairly easily.  Others were closer to the mark, and deserved a more careful look.  If I returned to this project in the future, I thought I might try to invest the time in reading the entire MP3 Diags help file and trying to get past its inflexible and alien interface.

For various reasons, MP3val presently seemed closest to what I needed.  It was possibly overkill in the sense of identifying numerous problems, apparently minor, beyond the apparently more significant problems identified by MP3Utility.  Even those more significant problems were not truly huge:  the files seemed to play normally in IrfanView, both before and after I used MP3val to fix them.  As far as I could tell, MP3 Diags was very much overkill, in the sense of identifying numerous problems that the help file then advised me to ignore, as long as my files were playing correctly.

MP3val and others did not draw my attention to a few files that had somehow gotten corrupted into a lower bitrate -- or something.  I was not sure what had happened to those files, but I could see and hear that they had lost quality compared to old backup copies.  MP3-Check provided bitrate information that I was able to copy and paste into a spreadsheet, where I could sort by bitrate to highlight such potentially problematic files.  There may have been a faster approach to this particular issue through an MP3 player/information program like Foobar2000.
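The sort-by-bitrate step described above could also be sketched in Python rather than a spreadsheet. This is only an illustration under assumptions: the tab-separated "path, bitrate" layout, the function name low_bitrate_files, and the 128 kbps cutoff are all hypothetical, and none of them reflects MP3-Check's actual output format.

```python
import csv
import io

def low_bitrate_files(report_text, threshold_kbps=128):
    """Return (bitrate, path) pairs below the threshold, lowest bitrate first.

    Assumes each line is "path<TAB>bitrate"; this layout is hypothetical.
    """
    rows = csv.reader(io.StringIO(report_text), delimiter="\t")
    flagged = []
    for row in rows:
        if len(row) < 2:
            continue  # skip blank or incomplete lines
        path, bitrate = row[0], row[1]
        try:
            kbps = int(bitrate)
        except ValueError:
            continue  # skip a header row or a malformed bitrate
        if kbps < threshold_kbps:
            flagged.append((kbps, path))
    return sorted(flagged)

# Made-up sample data, standing in for text pasted from a checker's report.
sample = "D:\\Music\\a.mp3\t192\nD:\\Music\\b.mp3\t64\nD:\\Music\\c.mp3\t96\n"
for kbps, path in low_bitrate_files(sample):
    print(kbps, path)
```

Files that land at the top of this list would be the first candidates to compare against old backup copies.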

Foobar2000, Winamp, and some audio editors were capable of doing batch conversions of large numbers of files.  If I did return to this project at some point, I thought it might be worthwhile to see if I could use a conversion program to fill in missing tags, homogenize bitrates, and otherwise convert these MP3s, created on different pieces of hardware, into a single, consistent format that would eliminate most if not all errors.  I would probably use MP3 Diags, MP3val, MP3-Check, and/or some other informational program to guide me in arriving at the ideal form for such a conversion.

I felt that it would probably make sense to postpone that conversion inquiry until some future point when I would investigate archival formats.  That is, I knew that formats were capable of becoming extinct, and I suspected that there was probably research and perhaps a consensus on what filetype, bitrate, and other characteristics were most likely to be supported into the indefinite future.  Ideally, I would be able to do just one conversion, check it with just one capable MP3 checker by that point, and put such concerns to rest.

Sunday, May 8, 2011

Windows 7: Verify That Data Files Are in Working Condition: JPG, MP3, PDF

I wondered if there was a way to test my data files, to make sure they would actually open without errors.  I posted a question on it, but that didn't get too far.

Eventually, I did find a couple of ways to test JPGs and other image files.  IrfanView was my favorite tool for this purpose.  I decided I wasn't really too concerned about corrupt spreadsheet and document files, since I rarely encountered anything like that.  I was in the habit of converting my documents into PDFs for storage.  JPGs, MP3s, and PDFs were probably the most numerous file types on my system, so I decided to focus on those for now.

For MP3s, a search led to a thread that identified a number of possible testing utilities.  I ran a search for several (i.e., MP3 Checker and Mpck, MP3 Diags, MP3Utility, and MP3 Validator) and came away with a preliminary impression that MP3 Diags and MP3val were relatively popular.  None seemed to be listed on CNET.com, but I found MP3 Checker (2,111 downloads), MP3 Diags (3,387 downloads), and MP3val (1,925 downloads) on Softpedia.  It looked like MP3 Diags was being actively developed and had relatively good file correction possibilities, so I downloaded that.  The developer warned of potential data loss, so I decided I wouldn't necessarily use it to edit any MP3s until I was in a position to test them after the changes and make sure things had gone OK.  I ran it on a folder containing about 120 MP3s.  It ran for just a minute or so and identified problems with various songs (e.g., certain tags not found, low quality, two ID3V1 tags found when there should be no more than one).  It made these errors graphically visible, so that I could quickly see which files had the more worrisome kinds of errors (e.g., "Unknown stream found.  Since other streams follow, it is possible that players and tools will have problems using the file.")  In short, I liked MP3 Diags.  Granted, I had not used it to fix anything.  But it made a good impression.

Now, how about testing PDFs?  A search did not yield much immediate help.  (A different search, later, was a whole different story, but I didn't get that in time for this.)  One suggestion was to automate printing them and see which ones printed.  It would have been possible to copy them all from various subfolders to a single subfolder, assuming duplicate names had first been resolved using something like DoubleKiller.  A Windows search would achieve this; so would a batch command using XCOPY.  From there, I could batch print to other PDFs, and perhaps the printing process would identify bad files.  I wondered whether an IrfanView conversion (as used in JPG testing) would do the same thing.  I tried opening a PDF in IrfanView and got an error:

Decode error !
Can't load Ghostscript or Ghostscript error.
Install Ghostscript from:
http://sourceforge.net/projects/ghostscript/
or
http://sourceforge.net
The latter appeared to be the current master site for Ghostscript.  I thought I did already have it installed, but perhaps not the latest version.  I tried using Irfanview (File > Batch Conversion/Rename) to convert several PDFs to JPGs, but got an error that way too:  "Error!  Can't load [filename].pdf."  To update Ghostscript, I downloaded and installed what looked like the Windows 32-bit version.  I tried the Irfanview conversion again, and that worked.  So it wasn't going to be necessary for me to explore the alternative of using PDFtoHTML, which would apparently require me to install Windows versions not only of Ghostscript and PDFtoHTML but also PDF2HTMLgui, which looked like it might be hard to find -- never mind the alternative of installing Xpdf, apparently an alternative to Ghostscript, or the approach of installing some relevant program (PDFtoHTML, I think) via GnuWin, which was going to be simplified by installing GetGnuWin.

Fortunately, I didn't even have to think about all that.  I just ran an IrfanView batch process on a bunch of PDFs.  To test if this was going to work properly, I inserted a bad PDF among those being processed.  To create a bad PDF, I searched for a hex editor, downloaded HexEdit, opened a copy of a small PDF file, looked in the ASCII pane (the right-hand one, in HexEdit; the one with occasional text rather than all numbers) for a reference to Root ## 0 R. (in my case, it was Root 9 0 R.), changed ## to 00 (so in my case, it came out being Root 00 0 R.), and then saved and tried it out.  Sure enough, when I tried to open the bad PDF, I got "There was an error opening this document.  The root object is missing or invalid."  So now I ran an IrfanView batch conversion of PDF to JPG, including that bad file (putting the output in a folder called X, which would cue me that I could delete the whole thing without looking at it).  Of the four files I tried to convert, three converted OK.  The bad PDF gave me an error in IrfanView and nothing in the output folder.  So then I would be able to just save the IrfanView report and examine the error messages; or if that failed, I could hunt around for a suitable folder comparison tool, or use a spreadsheet to compare folders, so as to work up a list of the PDFs that had failed the conversion process.
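The folder comparison mentioned above, in which a source PDF with no matching JPG in the output folder is treated as a failed conversion, might be sketched in Python along these lines. The function name and the sample paths are hypothetical, and the sketch assumes that each PDF yields one output file with the same base name, which may not hold for multi-page PDFs.

```python
from pathlib import PureWindowsPath

def failed_conversions(source_paths, output_names):
    """List source files that have no same-named file among the outputs.

    Names are compared by base name (no extension), ignoring case,
    since Windows filenames are case-insensitive.
    """
    produced = {PureWindowsPath(name).stem.lower() for name in output_names}
    return [src for src in source_paths
            if PureWindowsPath(src).stem.lower() not in produced]

# Made-up example: three PDFs were submitted, two JPGs came out.
sources = ["D:\\Docs\\Good1.pdf", "D:\\Docs\\Bad.pdf", "D:\\Docs\\Good2.pdf"]
outputs = ["Good1.jpg", "good2.jpg"]
print(failed_conversions(sources, outputs))
```

In practice the outputs list could come from something like os.listdir on the output folder (the folder called X, in the test above).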

The spreadsheet approach, used with a command line process, might be best for those situations where the files I wanted to test were scattered across multiple folders.  What I would want, in that case, would be the ability to execute a batch command containing commands of this general form:
convert D:\Folder1\File35.pdf to D:\TestFolder\File35.jpg
convert D:\Folder2\File18.pdf to D:\TestFolder\File18.jpg
A problem there was that File35.jpg might already exist in TestFolder.  This could happen because there could be files named File35 in two different source folders, and now that would become evident when I tried to put them both (in JPG format) into one target folder.  One way to avoid that would be to begin with a DoubleKiller search for JPGs with duplicative names (having previously done a DoubleKiller search for files of any sort that had identical sizes and CRCs).  Alternately, I could test my spreadsheet-generated commands to see if they were going to produce duplicate filenames, and add a formula to change them as needed.  As for the "convert" part of that ideal command (above), I posted a question in the IrfanView forum, and then found a suggestion that the command I wanted would be like this:
i_view32 D:\Folder1\File35.pdf /c=d:\TestFolder\File35.jpg 
They said this (specifically, the "c=" option) would work to convert among all formats that IrfanView could handle except AVI, MOV, MPG, WAV, and MID.  (There would presumably have been PATH problems if I'd been running the portable version of IrfanView; the command line presumably wouldn't have known where to look for a non-installed program executable.)  Later, in response to the question I posted, someone said that, of course, I should have just gone into IrfanView's F1 (Help) > Contents tab > Overview > Command Line Options.  Which, when I finally did that, wow, there were a lot of them.  That help piece began with the advice to "See the 'i_options.txt' (IrfanView folder) for the most recent version of all command line options."  That file said I could use "/convert=" rather than just "/c=" and also that I should "See pattern help file page for more options."  At first, I thought that referred back to the F1 help page I had just come from; it had some conversion examples.  Those examples seemed mostly to show how I could include other command-line options at the same time.  One interesting option:  /filelist=txtfile would apparently use filenames contained in a file called "txtfile," so that apparently I would not have to repeat this command in full (on the command line or in a batch file) for each file being processed.  It said the conversion command would support wildcards.  Then I noticed that the pattern page was actually in a different place in IrfanView help:  it was under Options Menu > Text/Pattern Options.  There were variables or "placeholders" for a variety of components (e.g., $D was shorthand for the full path of the file being converted).  This seemed to mean that a command like "i_view32.exe d:\Folder1\*.jpg /c=d:\$D$N.pdf" would convert all the JPGs in Folder1 into PDFs.  I was going to have to play around a bit to understand clearly how that worked.
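As a rough alternative to generating those commands in a spreadsheet, the following Python sketch builds one command per PDF in the "i_view32 source /c=target" form quoted above, renaming any output whose name would collide with an earlier one. The function name, the "_2" suffix scheme, and the sample paths are assumptions made for illustration. Paths containing spaces would presumably need quotation marks, which this sketch does not add.

```python
from pathlib import PureWindowsPath

def build_commands(pdf_paths, target_folder):
    """Build one conversion command per PDF, with unique output names."""
    used = set()       # lowercase base names already claimed in the target folder
    commands = []
    for src in pdf_paths:
        stem = PureWindowsPath(src).stem
        candidate = stem
        n = 2
        # Append _2, _3, ... until the name no longer collides.
        while candidate.lower() in used:
            candidate = f"{stem}_{n}"
            n += 1
        used.add(candidate.lower())
        dst = PureWindowsPath(target_folder) / (candidate + ".jpg")
        commands.append(f"i_view32 {src} /c={dst}")
    return commands

# Made-up example: two source folders each contain a File35.pdf.
pdfs = ["D:\\Folder1\\File35.pdf", "D:\\Folder2\\File35.pdf", "D:\\Folder2\\File18.pdf"]
for line in build_commands(pdfs, "D:\\TestFolder"):
    print(line)
```

The printed lines could be saved into a batch file and run from a folder where i_view32 can be found.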

At the time when I closed this post, this process was still underway.  Additional steps I took were to run DoubleKiller for duplicative JPG filenames and then to do a directory listing of all JPGs on my drive.  To do that, in a CMD window, I went to the root folder (i.e., D:\ ) and typed this:  DIR *.jpg /a-d /s /b > JPG-List.txt.  That gave me a text file (JPG-List.txt) showing where all the JPGs were.  I put that into a spreadsheet and tested for duplicate output filenames.  Having already worked through duplicate filenames in DoubleKiller (by exporting the list of duplicates and generating batch-renaming commands in a spreadsheet), I did not find any duplicates now.  But I could tinker with filename extensions (e.g., .bmp, .jpg, .jpeg, .tif) and perhaps find that the same file existed under multiple names.  This was not exactly the same question as whether I had duplicates of the same photo; this was more a question of whether I had duplicate files under similar names.
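The duplicate check on that directory listing, including the idea of ignoring the extension so that Pic.jpg and Pic.jpeg count as the same name, could be sketched in Python as below. The function name and the sample lines are hypothetical. The sketch assumes one full path per line, which is what DIR /s /b produces.

```python
from collections import defaultdict
from pathlib import PureWindowsPath

def duplicate_names(listing_lines):
    """Group paths by base name (no extension, case ignored); keep groups of 2+."""
    groups = defaultdict(list)
    for line in listing_lines:
        path = line.strip()
        if path:
            groups[PureWindowsPath(path).stem.lower()].append(path)
    return {stem: paths for stem, paths in groups.items() if len(paths) > 1}

# Made-up lines, standing in for the contents of a file like JPG-List.txt.
listing = ["D:\\A\\Pic.jpg\n", "D:\\B\\pic.JPEG\n", "D:\\A\\Other.jpg\n", "\n"]
for stem, paths in duplicate_names(listing).items():
    print(stem, paths)
```

Matching on names alone says nothing about whether the contents are the same photo. That would still call for a size or CRC comparison of the kind DoubleKiller performs.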

Wednesday, September 1, 2010

Portabilizing Apps with Ceedo Personal

I was trying to create a portable version of Microsoft Office 2003.  That effort had led me to discover a positive review of Ceedo Personal and a favorable contrast against PortableApps.com by PC Magazine.  I decided to take advantage of a free trial to explore Ceedo.  This post described that exploration.

I was running this test on Windows XP SP3, running in a cloned virtual machine (VM) in VMware Workstation 7.1.  This VM was running a bit slow, but a VM generally would give me the option of wiping out everything and just making another clone, where I could start the experiment over from the beginning.

In the case of Ceedo, the VM did not seem to matter.  When I tried to install Ceedo in that VM, it insisted, instead, on being installed to a removable device.  That seemed unfortunate.  I had been working on developing a folder full of portable apps that I could use on my own computers and could also copy to a USB drive.  Running them from the hard drive was much faster than running them from the USB drive, so that's what I planned to do when working at home.  It now seemed that Ceedo was not going to cooperate with that plan.  But I hoped that a solution to this problem would emerge as we went along, so I went ahead with the installation on the USB drive.

During the installation process, I got a balloon pop-up telling me that the Ceedo Tray Icon Indicator would light up whenever I was using a program that was running under the Ceedo environment.  I guessed that this was a replacement for the previous approach that I had read about, where Ceedo would surround its own programs with an orange line.  Then Ceedo installed a toolbar at the top of the screen.  When installation was done, I had the option of taking a tour, which I did.  The gist of it was that Ceedo gave me that toolbar, which I hated, with its four buttons -- three of which were completely unnecessary, since they merely opened My Documents, Internet Explorer, and Outlook Express.  The fourth button opened something that looked like the Windows Start Menu, with Ceedo-specific choices (in addition to yet another set of My Documents, Internet Explorer, and Outlook Express).  There was no entry for Ceedo in the real Windows XP Start Menu, which made sense from a no-impact perspective; apparently the top toolbar was running entirely from the USB drive.  I used the top toolbar to open My Documents and no, the orange line had not been removed; it was there after all.  Since the only thing I needed from the top toolbar was the imitation start menu, and since I could get that by clicking on the Ceedo icon in the system tray (bottom right corner of the screen), I went into Ceedo's Options and instructed it to hide the toolbar when it was not in use.  I also changed some other settings.

I felt that Ceedo needed to give that imitation start menu a name.  For present purposes, I will call it the "Ceedo menu."  I went into that menu > Add Programs > Programs Directory.  It seemed to wish to open its own session of Internet Explorer, and apparently could not tolerate the fact that I already had Internet Explorer running.  It said, "iexplore.exe is already running.  Click 'OK' to open Ceedo's Internet Explorer and close the local Internet Explorer."  So I said OK.  Ceedo could perhaps instead give users the option of searching automatically for installed programs (or at least those having Start Menu entries) in the background; then, when users actually sought to open a file, they might have the option of doing so in those installed programs rather than mandatorily running them from the USB drive.  This would have avoided both the need to shut down the running program (Internet Explorer) and the slowness that I was experiencing when Ceedo did everything from the USB drive.  My present understanding was that speeding up Ceedo (if I could not run it from the desktop instead of the USB drive) would require buying a faster USB drive, such as the Kingston Vault (presently $40+).  In any case, the Ceedo menu did not stay onscreen during this process; it vanished as soon as I chose Programs Directory.  Programs Directory, itself, turned out to be just the Ceedo webpage listing the various freeware apps that you could apparently run from Ceedo.

I was more interested in seeing if Ceedo could portabilize my apps.  I experimented, first, with IrfanView.  To portabilize Irfanview, I went to the Ceedo menu > Add Programs > Argo Application Installer.  It offered to show me a list of programs supported by Argo, so I clicked on that option.  Nothing happened.  After playing around a bit, I found that it was trying to take me to a different list of applications than the one that I had just seen.  There weren't many items on it, and it didn't seem to contain any deep, dark secrets.  So apparently Ceedo was still in the process of trying to organize its website.

So anyway, back in Argo, I tried to point toward the IrfanView .exe file.  It was very slow in identifying the .exe files in My Computer.  It occurred to me that I wasn't sure whether it wanted the setup .exe or the installed, ready-to-run .exe.  I tried the IrfanView setup .exe.  That, in itself, was a bit perplexing, because there were two IrfanView setup .exe files -- the setup itself, and the plugins -- and I would want them both included in my IrfanView installation.  There wasn't an option, as there had seemed to be in JauntePE, to include materials that had been incorporated into a previous iteration of the portable app.  But anyway, on the next screen, Argo confirmed that I had guessed right:  it said, "The wizard will now launch the following setup file."  It gave me an option of installing in "reduced machine separation mode," which a webpage said would entail some permanent installation on the host PC in order to use that machine's resources.  Another page said, somewhat obscurely, that this "reduced separation mode" would enable the portable app to "interact" with apps on the host.  The idea seemed to be that you should choose this option only if you or the program actually needed that kind of interaction.

So I went ahead with the Argo process.  It gave me the IrfanView installation screen.  I went through the IrfanView installation process.  When that was finished, Argo was gone, and I had an IrfanView installation in the designated folder on the hard drive.  I realized then that maybe I should have designated a folder on the USB drive.  I couldn't tell if Argo had done anything in particular to make IrfanView portable, since IrfanView tends to be portable anyway.  I also couldn't tell what I should do to install the IrfanView plugins, other than (I guess) just run them through the Argo process and point to the same output directory.

So, OK, maybe IrfanView wasn't the best program to experiment with.  I tried again, this time with Microsoft Word 2003.  I started Argo, browsed to the Office 2003 installation executable file, and ran it.  After a while, I got this:

Microsoft Office 2003 Setup
Error 1719.  The Windows Installer Service could not be accessed.  This can occur if you are running Windows in safe mode, or if the Windows Installer is not correctly installed.
That was odd.  I had just installed and uninstalled Office 2003 in that same VM.  The installer had worked fine then.  But OK, I created another clone VM, in which there had been no prior Office 2003 installation, and booted it up.  This time around, I did something that perhaps I should have done last time:  I rebooted when I got the message that the hardware (i.e., the USB drive) had been recognized but might not work properly until I rebooted.  After I rebooted, I got a Ceedo Action Window that gave me the option of enabling Ceedo AutoDetect.  That, according to the Ceedo help file, was a "tiny" optional component, installed on the host, to detect whenever a Ceedo drive was connected.  I said yes, do this.  It took Ceedo a long time to load; and when the "Loading Ceedo" message did finally disappear, I was surprised to see that the Ceedo icon likewise disappeared from the system tray.  I went to the USB drive in Windows Explorer and restarted Ceedo manually from there, but it said, "Ceedo already running."  Yet it did start a new "Loading Ceedo" message anyway.  If Ceedo was running, where was it?

Eventually, I did get a Ceedo icon in the system tray, and when I clicked on it, I was able to go back into Argo and start the Office 2003 installation again.  I tried again to install Office 2003 and again got that Error 1719 message.  This was occurring in a VM clone like those that I had been using repeatedly in recent days to test various programs.  To check whether the VM itself was at fault, I closed Ceedo, removed the USB drive from the system, and tried installing Office 2003 natively in that VM.  It ran without difficulty.

That concluded my test of Ceedo.  Moreover, since my investigation had not turned up any superior alternatives to Ceedo that would do the job, this concluded my search for tools that would give me an affordable, portable copy of Microsoft Office 2003.  So at this point I returned to the main project -- of developing a set of portable applications for Windows XP -- with the sense that I might need to consider alternatives to Office 2003.