End of This Blog. Transition to WordPress.
For reasons sketched out in a previous post, I do not presently plan to post any more messages in this blog, and will instead be posting my technical stuff in a new WordPress blog.
Posted by raywood 1 comments
I had a bunch of email (EML) files scattered around my hard drive. Some of them, I noticed, were displaying a lot of HTML codes. For example, when I opened one (using Thunderbird as the default EML opener), it began with this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"> <HTML> <HEAD> <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"> <META NAME="Generator" CONTENT="MS Exchange Server version 6.5.7036.0"> <TITLE>RE: Scholar Program</TITLE> </HEAD> <BODY> <!-- Converted from text/rtf format -->
findstr /r /m /s "<!DOCTYPE HTML PUBLIC" D:\*.eml > D:\findlist.txt
It produced a dozen "Cannot open" error messages. The reason seemed to be that the filenames for those files had funky characters (e.g., #, §). Also, Findlist.txt contained the names of files that did not seem to have the DOCTYPE text specified in the command. DOCTYPE may have appeared in attachments to those files, but I didn't want to be flagging that sort of EML file. So despite a number of variations with FINDSTR and several Google searches, I gave up. I returned to Copernic, searched for the DOCTYPE text (in quotation marks, as shown above), and moved them manually. Copernic had a convenient right-click Move to Folder option, so that helped a little. So now, anyway, despite the imperfections of the process, I apparently had the desired EMLs in a single folder. I would just have to re-sort them back to where they belonged manually.
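One possible contributor to the false matches: FINDSTR treats a quoted string containing spaces as several alternative search strings unless it is given with /C:, so the command above may have been matching any file containing "HTML" or "PUBLIC". For the attachment problem, a different approach is to parse each EML and look only at the message's own text parts. The following is a minimal Python sketch of that idea, not something I ran at the time. The function names (body_has_marker, find_html_emls) are hypothetical, and it assumes the files are ordinary MIME messages that Python's standard email package can parse.

```python
import email
import os
from email import policy

MARKER = "<!DOCTYPE HTML PUBLIC"


def _body_text_parts(message):
    """Yield text parts of the message, skipping anything marked as an attachment."""
    if message.is_attachment():
        return
    if message.is_multipart():
        for sub in message.iter_parts():
            yield from _body_text_parts(sub)
    elif message.get_content_maintype() == "text":
        yield message


def body_has_marker(eml_path, marker=MARKER):
    """Return True if a non-attachment text part contains the marker (case-insensitive)."""
    with open(eml_path, "rb") as handle:
        message = email.message_from_binary_file(handle, policy=policy.default)
    for part in _body_text_parts(message):
        try:
            text = part.get_content()
        except (LookupError, UnicodeError):
            continue  # unknown or broken charset: skip this part
        if marker.lower() in text.lower():
            return True
    return False


def find_html_emls(root):
    """Walk root recursively and return the sorted paths of matching .eml files."""
    hits = []
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            if name.lower().endswith(".eml"):
                path = os.path.join(folder, name)
                if body_has_marker(path):
                    hits.append(path)
    return sorted(hits)
```

Calling find_html_emls("D:\\") would play the role of the /s switch. Because the files are opened by path rather than passed through the command interpreter, characters like # in a filename should not need special handling.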
pdftk.exe - System Error
The program can't start because libiconv2.dll is missing from your computer. Try reinstalling the program to fix this problem.
I moved the two files to C:\Windows and tried again. That worked: I got documentation. It scrolled past the point where I could scroll back to the beginning. Typing "pdftk --help > documentation.txt" solved the problem, but ultimately it didn't seem to give me anything more than already existed in pdftk's docs subfolder. The next step was to put pdftk to work. It would apparently allow me to specify the files to combine, using a command of this form:
pdftk 1.pdf 2.pdf 3.pdf cat output 123.pdf
My problem was that, at least in some cases, the filenames I was working with were too long to fit on a single line like that, one after the other. I decided a solution would be to take a directory listing, put it into Excel, and use it to create commands for a batch file that would rename the emails and their accompanying attachments, with names like 0001.pdf. I would need to keep the spreadsheet for a while, so as to know what the original filenames were. The original filenames were my guide as to what files needed to be combined together. For this purpose, with one of the original filenames in spreadsheet cell A1, I put the ascending file numbers in cells B1, B2 ... (i.e., 1, 2, ...) and then, in cell C1, I put =REPT("0",4-LEN(B1))&B1&".pdf". Finally, in cell D1, I put ="ren "&CHAR(34)&A1&CHAR(34)&" "&C1. Then I copied the resulting commands from column D into Notepad, saved them as Renamer.bat, and ran it.
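For anyone who would rather not go through Excel, the same renaming commands can be generated in a few lines of Python. This is only a sketch of the spreadsheet logic described above. The function name rename_commands is hypothetical, and the list of original filenames is assumed to come from a directory listing. It also returns a lookup from new name to original name, which serves the same purpose as keeping the spreadsheet around.

```python
def rename_commands(original_names, width=4, extension=".pdf"):
    """Build 'ren' commands that give each file a zero-padded numeric name.

    Mirrors the spreadsheet: column B is the running number, column C the
    padded name (like =REPT("0",4-LEN(B1))&B1&".pdf"), column D the command.
    Returns (commands, mapping), where mapping goes from new name to original.
    """
    commands = []
    mapping = {}
    for number, original in enumerate(original_names, start=1):
        new_name = str(number).zfill(width) + extension
        mapping[new_name] = original
        commands.append(f'ren "{original}" {new_name}')
    return commands, mapping


if __name__ == "__main__":
    # Example filenames are made up for illustration.
    cmds, lookup = rename_commands(["RE Scholar Program.pdf", "RE Scholar Program attachment.pdf"])
    print("\n".join(cmds))
```

The printed lines could be saved as Renamer.bat, as with the column D output.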
Posted by raywood 2 comments
Labels: attachments, convert, Emacs, emails, EML, HTML, pdf, PDFsam, pdftk, print, TexFinderX
In a previous post, I looked at replacements for Windows Explorer ("WinEx"), including especially FreeCommander. The runner-up, at that point, was Explorer++. Further experience with FreeCommander prompted me to take a closer look at Explorer++ after all. This post provides further information on these two utilities.
As I used FreeCommander, I was surprised to find that a few right-click (context menu) options were missing. For example, I often used LockHunter to find out why Windows was not letting me move or delete a certain file or folder. But in FreeCommander, I was no longer seeing the context menu question, "What is locking this file?" That option did continue to appear in Explorer++, as it had appeared in WinEx. One possible explanation was that FreeCommander did not offer a 64-bit version, whereas Explorer++ did, and I was using the 64-bit version of LockHunter.
Another problem in both FreeCommander and Explorer++ was that I no longer had the option to create a new text file in a specified folder. That option had been available in WinEx, as I recalled, via File > New > Text File. I was pretty sure there was a way to create a new text file in FreeCommander. It seemed to me that I had done so by accident, once or twice, while trying to do something else with a familiar command from WinEx. But I was not seeing that option on the menu nor in the list of shortcuts, and likewise in Explorer++. Workarounds in either program were to open a command window in the selected folder and type one of these options:
Posted by raywood 2 comments
Labels: alternative, Explorer, Explorer++, FreeCommander, replacement, substitute, windows
Posted by raywood 9 comments
Labels: 7, alternatives, files, folders, navigator, replacement, substitute, windows
I guess I have assumed that almost everybody loves Google, and those who don't are the bad guys. Microsoft, for example. Maybe it takes a huge corporation to stand up to another huge corporation. If so, Google is a champion for those who have disliked various things about how Microsoft got its start, what it did to increase its power, and what it has done with that power.
There comes a point, however, when the good guy turns bad. Maybe it doesn't have to happen. But power tends to corrupt. And even when it doesn't actually corrupt, it tends to create an impression of corruption. That impression may be able, by itself, to make people more or less as miserable as they would be in case of actual corruption and abuse.
Case in point. I have been blogging for years, here in Blogger. I wasn't necessarily eager to see Google acquire Blogger. But they were welcome to do so, for my purposes, as long as they left me alone. The deal was that I got to use their free blogging platform to put out various things that I wanted to write, and they got to use my work, my viewers, etc. to make money from advertising and whatnot.
Gifts can make people resentful when they stop. I would be unhappy with Google if they pulled the plug on my blogging enterprise, even though they're not charging me for it. I have spent years putting stuff here, linking one post to another and so forth. It would take a lot of work -- work that I might never do -- if they were suddenly to just shut it down or screw it up. I would feel that, after all, Google does have competitors, notably WordPress. If nothing else, I'd sooner be paying for a hosted website than to do all this work and then watch it get messed up.
What's sad is that I have been warned that they are quite capable of doing exactly that. It has already happened. Circa 2000, many people were using DejaNews as a convenient gateway to Usenet. Usenet newsgroups contained tons of free, helpful information on a vast array of subjects -- especially but not only computer-related, like this blog. Google acquired DejaNews. Evidently they felt that all that information would interfere with their desire to sell advertising related to webpages. For whatever reason, they basically destroyed Deja. That was a shame, for all those people who could have continued to use it to obtain useful information. And it was irritating to me, because all the things I had put out there, thinking I would always be able to access them, were removed from access as a practical matter, by me and most everyone else.
I was pretty unhappy with Google about that. That was the first big chink in their claim that they would "do no evil," as their corporate motto ("Don't be evil") has been widely reported. They had obviously ruined something useful, for purposes of increasing profits.
That stuff would not be coming back to mind now if I weren't having an off day with Google today. Here I am, working away on my blog, and suddenly it is no longer very functional in Internet Explorer. I have a nice little desktop arrangement, with various browsers, but now Blogger has suddenly ceased to work properly when I try to post or edit. Google lets me know that, instead, I should be using its own browser, Chrome, for this purpose.
That part happened several days ago. So, OK, I have been trying to post in Chrome instead. But I am finding that Chrome is not yet up to speed for this purpose. Google was eager enough to move me over to its browser -- the statements and signals have been out there for some time -- but, lo, it develops that Chrome is inserting white backgrounds. Whole chunks of my post are whited out. Why? I don't know. Probably they don't know either. I am having to go in and manually remove whiting that I didn't put there. Why not just leave me alone, free to work on my blog in Internet Explorer, until Chrome gets its act together?
That seemed like a fair question, so I tried to present it to Google. Problem is, their "Contact Us" webpage is a lie. You cannot contact them through their webpage. Or at least I cannot. I tried today. I tried once before, with a problem so obvious and banal that it pained me to have to bring it to their attention. In that case, I gave up and wrote them a letter. It seemed ironic, and yet telling, that I had to use the U.S. Post Office to communicate a simple thought to one of the world's largest software corporations.
Like most people, I don't like being lied to. If you're not going to let me contact you, don't give me a "Contact Us" webpage. Call it "FAQs" or whatever. It's great that you can hire the best and the brightest, but that can backfire: you can create the impression that you think you're too good for the rest of us. It wouldn't be terribly smart to generate unnecessary resentment, would it?
It had never occurred to me, until today, to search for something that I have now searched for and found. Yes, as it turns out, there does exist something called IHateGoogle.org. I'm not really sure what it's about. I'm not resentful enough to dig into it. But, Google, keep it up: maybe someday I will be. You seem to be making a good start at it: today you tell me that as many as 1.4 million webpages convey that sort of feeling toward you and your actions.
Obviously, I am not the only person who has attempted to communicate with Google along these lines. People rarely get resentful when they feel they are being respected. If Google cannot make its own programs work together -- Chrome and Blogger, in this case -- it is welcome to keep them in beta. But forcing me to use them when I don't want to: at this point, that is a problem. Not just a software problem. As presented in this post, it is an indication of larger and more worrisome things.
I was using Thunderbird 11.0.1 in Windows 7. I had accumulated some emails that I wanted to export as individual EML files. An EML would still be readable in Thunderbird, and it would carry any attachments along with it. I had attacked this problem on several previous occasions. As before, I was not sure I would get all the way through from Thunderbird to EML to PDF. This post provides another contribution in the slog toward that outcome.
First Step: From Thunderbird to EML Format
Some of my previous efforts to export to EML and then convert to PDF had produced something of a mess. Exporting, itself, was easy enough. I was using ImportExportTools. It would give me EMLs with names containing some, but not all, of the information that I wanted in file names. Specifically, it would provide the date and time, the sender, and the subject; but it did not include the recipient. I could get it to produce a separate Index.csv file that would contain the full information, but that would just be a spreadsheet file. I could use that spreadsheet file to give me nice names for files; but which file was supposed to get which name? Matching them up had required a surprising amount of manual effort, last time around. I was hoping to make the process smoother, if I could.
It wouldn't help to print a PDF directly from Thunderbird. As far as I knew, that would require me to enter PDF filenames manually. I was looking for a mass-production kind of solution. About.com recommended mbx2eml, but it seemed to have some disadvantages, notably a very limited set of options for the resulting EML filenames -- which was the main problem. Generally, it did not seem that any solution had broken through into prominence, in either the T-bird to EML or T-bird to PDF category.
In my first try at this problem, I had tried Total Thunderbird Converter and Birdie EML to PDF Converter, but for various reasons had not been impressed with either. I did like Attachment Extractor, for when I got to that part of the project. My notes seemed to favor Universal Document Converter (UDC) ($69), if I wanted a direct T-bird-to-PDF solution. As I reviewed the struggles I'd had in that first try at this problem, and also in the second and third tries, I wondered if I should have focused more seriously on UDC. But it did not seem to have command-line capability or other automation features. It was basically a glorified PDF printer. Moreover, its default filenames did not include all the information I wanted.
My previous notes did not seem to mention that Thunderbird messages were apparently already in EML format, stored in Thunderbird subfolders. For instance, I had moved the messages that I was now seeking to export to a Local Folders subfolder called Export, and I could see that folder in Windows Explorer as Mail\Local Folders\Export.mozmsgs. But this was confusing: the number of EML files in that folder was not very close to the number of messages in the Export subfolder in Thunderbird. Anyway, the EMLs in Export.mozmsgs had seemingly random names that would be useless for my purposes.
So I went ahead with ImportExportTools. My first step was to eliminate duplicates. For this, I used Remove Duplicate Messages (Alternate). Then, in Thunderbird, I went to Tools > ImportExportTools > Export all messages in the folder > EML format. The first time around, this produced undesirable results (see below). But I didn't know that until I was partway through the second step.
Second Step: Adding Recipient to the EML File Name
I had my EMLs. But as noted above, I wanted to add the name of the Recipient to the filename, in the format Date-From-To-Subject. As a first step, I thought I would just try to append the Recipient's name to the end of the filename. Then I would figure out how to shuffle the words around to the desired order.
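To make the target concrete, here is a rough Python sketch of building a Date-From-To-Subject name straight from a message's headers. It is an illustration only, not what ImportExportTools does. The function name eml_filename, the date layout, and the rule for replacing characters that Windows does not allow in filenames are all my assumptions, and it assumes every message has a Date header.

```python
import re
from email import message_from_string, policy
from email.utils import getaddresses, parsedate_to_datetime


def _clean(text):
    """Replace characters Windows forbids in filenames and collapse whitespace."""
    text = re.sub(r'[\\/:*?"<>|]', "_", text)
    return re.sub(r"\s+", " ", text).strip()


def _who(header_value):
    """Return the display name of the first address, or the bare address if unnamed."""
    name, address = getaddresses([header_value])[0]
    return name or address


def eml_filename(raw_message):
    """Build a 'Date-From-To-Subject.eml' name from the raw text of one message."""
    message = message_from_string(raw_message, policy=policy.default)
    # Assumed date layout: year-month-day hour.minute, which sorts chronologically.
    when = parsedate_to_datetime(str(message["Date"])).strftime("%Y-%m-%d %H.%M")
    sender = _clean(_who(str(message.get("From", ""))))
    recipient = _clean(_who(str(message.get("To", ""))))
    subject = _clean(str(message.get("Subject", "")))
    return f"{when}-{sender}-{recipient}-{subject}.eml"
```

Only the first recipient is used here. A message sent to several people would need a decision about how many names to keep.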
Given my limited knowledge of programming and such, I decided to try to achieve this with a Windows batch file. I struggled to figure out how to write a suitable one, and finally posted a question on it. One of the early answers to that question led to a separate pursuit -- a one-line batch file that would convert Word and WordPerfect documents to PDF.
The answers that I had received, at the point when I was writing up these notes, fell into two categories. One, which I found easier to understand (and, predictably, seemed less popular among the knowledgeable respondents), involved a simple loop that would call an external process. Basically, in plain English, it went like this:
FOR each EML file, run Process.
Repeat loop.
When list of files is exhausted, quit.
Process starts here.
Do various things.
End of process
By contrast, the approach preferred by most of the answering individuals would put all the steps inside the loop, instead of having a separate process afterwards. It seemed to be a matter of style. A second difference was that, in discussing the specific steps, they seemed divided between two general possibilities: with, or without, delayed expansion. Delayed expansion was apparently a response to a complication in how the FOR command worked. As I understood it, the computer would read the entire contents of a FOR command as soon as it hit the word FOR. So assigning a value to a variable inside a FOR loop would be too late; the computer would already have decided what value that variable had. The variable would have been immediately expanded to its value. Delayed expansion would postpone definition of the variable's value until later in the game. A variable would be marked for delayed expansion by surrounding it with exclamation marks (e.g., !VAR!). I wasn't familiar with delayed expansion, so I was in accord with some advisors' feeling that it would be better to proceed without it if possible. What they (especially Aacini) suggested was:
@ECHO OFF

IF EXIST fullnames.txt DEL fullnames.txt

FOR %%f IN (*.eml) DO (

SET firstfind=

FOR /F "delims=" %%l IN ('findstr /B /C:"To: " "%%f"') DO (

IF NOT DEFINED firstfind SET firstfind=now & ECHO %%f%%l >> fullnames.txt

)

)
I have double-spaced the lines for clarity, anticipating that Blogger will wrap some long lines. I haven't indented the way a programmer would, because of apparent limitations in the formatting options here in Blogger. Basically, this batch file said, give me a fresh output file called Fullnames.txt; and on each line in Fullnames.txt, type the contents of two variables. The first variable, %%f, was the name of the EML file under consideration, in all its Date-Sender-Subject glory. There would be one such filename assignment for each EML file in the folder; hence a FOR loop. The batch file would loop through all EML files in the folder.
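For comparison, here is my reading of what that batch file does, written as a Python sketch. It is not from the people who answered my question. The function names first_to_line and write_fullnames are hypothetical, and I am assuming the EMLs can be read as Latin-1 text, which accepts any byte. Like the FINDSTR call, it matches "To: " only at the start of a line, is case-sensitive, and keeps only the first match in each file.

```python
import os


def first_to_line(eml_path):
    """Return the first line starting with 'To: ', or None if there is none."""
    with open(eml_path, "r", encoding="latin-1") as handle:
        for line in handle:
            if line.startswith("To: "):
                return line.rstrip("\r\n")
    return None


def write_fullnames(folder, output_name="fullnames.txt"):
    """Write one 'filename + To line' entry per EML file, replacing any old output."""
    entries = []
    for name in sorted(os.listdir(folder)):
        if not name.lower().endswith(".eml"):
            continue
        to_line = first_to_line(os.path.join(folder, name))
        if to_line is not None:
            # Same as ECHO %%f%%l: filename and line joined with no separator.
            entries.append(name + to_line)
    with open(os.path.join(folder, output_name), "w", encoding="latin-1") as out:
        for entry in entries:
            out.write(entry + "\n")
    return entries
```

Returning from the function at the first match does the job that the firstfind variable does in the batch file, so nothing like delayed expansion comes up.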
Posted by raywood 6 comments
Labels: attachment, convert, emails, EML, EMLs, export, pdf, thunderbird