Monday, September 1, 2008

VMware in Ubuntu: Failed NVIDIA Fix; Other Fixes

This is a continuation in a series of posts on the process of transitioning from Windows XP to Ubuntu (version 8.04, Hardy Heron), on an AMD X64 system, using 64-bit Ubuntu. I decided that virtualization, and VMware Workstation 6 in particular, would be an essential tool in making that transition successful. The most recent step involved the successful restoration of a previous backup of the Linux system using Acronis True Image 11 Home backup software. Now that I had done that, I found myself having to reinstall a few programs that I had installed since the date of that backup, and also having to install some new stuff. This post describes those continuing steps. Now that I had a working system again, I tried VMware Workstation. It ran, but its memory was still set at 4384MB, which had been insufficient to run three 1GB virtual machines and two 512MB VMs (that is, three VMs in which I had allocated 1GB of RAM each in Workstation, and two more that were given 512MB each). My experimentation had suggested that 4500MB was sufficient for that purpose. But I had seen the hard drive working really hard to load and manipulate all of that stuff. I had gravitated toward a concept where I would use a 1GB machine to save the current status of a single project, mostly containing Microsoft Word and Excel files and Adobe PDF articles. I would just suspend the current state of a virtual machine in VMware Workstation when I was done with it for the present. I didn't expect to need to have three different projects open at once, since my projects were typically independent of one another. So I decided to make more RAM available to the underlying Ubuntu layer, where VMware ran, by cutting back to just two 512MB and two 1GB VMs open at any one time. At the bare minimum, that should have required at least 3072MB (2 x 512 + 2 x 1024) of RAM. 
But by that calculation, I should have been able to run two 512MB machines and three 1GB machines with 4096MB, when in my experience 4384MB had not been enough. I didn't know exactly where the cutoff was, but VMware seemed to need more than 7% overhead, though not necessarily as much as 10%. Using the 10% figure for now, I restarted Workstation as root ("sudo vmware") and allocated 3379MB in Edit > Preferences > Memory. I noticed that VM > Settings > Options > Shared Folders showed a "The path does not exist on this host" error message for those partitions which I had not accessed in the current Ubuntu session, so I needed to review the changes I had made to /etc/fstab to mount all partitions at bootup. A quick glance suggested that I might just replace it with the copy of fstab that I had just recently saved in the files.tgz backup (above). Unfortunately, I could not just copy fstab from that files.tgz archive using Archive Manager (which had opened up when I had double-clicked on files.tgz in Nautilus, i.e., File Browser). I had to extract it somewhere and then cut and paste it into /etc. I restarted the computer. Ubuntu loaded without a problem this time. I started up Workstation and powered up two 512MB and two 1GB machines. Before I could do that, I had to enter my serial number, having upgraded from trial to registered user since the date of the Acronis backup. When I tried to power up the second 1GB machine, I got an indication that I needed 3386MB, not 3379MB. So my 10% calculation had actually fallen 7MB short; the safer figure would have been 11%, or 3409MB (i.e., 3072MB of RAM for the actual machines, plus up to 11% for overhead). I could have gone with exactly 3386MB, but the wording of the error message led me to think that this could fluctuate somewhat. But anyway, on a superficial level at least, VMware seemed to be functioning fine following the Acronis restore process. 
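The arithmetic above can be sketched in a few lines of shell. This is only an illustration: the helper names vm_total and with_overhead are made up for this example, and the 10% and 11% overhead figures are my own rough observations from this post, not numbers published by VMware.

```sh
# vm_total: add up the RAM (in MB) assigned to each virtual machine.
# (Hypothetical helper name, used only for this sketch.)
vm_total() {
    total=0
    for mb in "$@"; do
        total=$((total + mb))
    done
    echo "$total"
}

# with_overhead TOTAL_MB PERCENT: TOTAL_MB plus PERCENT overhead, rounded down.
# The percentages are rough observations from this post, not VMware figures.
with_overhead() {
    echo $(( $1 * (100 + $2) / 100 ))
}

base=$(vm_total 512 512 1024 1024)    # two 512MB VMs and two 1GB VMs
echo "VMs alone: ${base}MB"
echo "plus 10%:  $(with_overhead "$base" 10)MB"
echo "plus 11%:  $(with_overhead "$base" 11)MB"
```

With integer division rounding down, this reproduces the 3072MB, 3379MB, and 3409MB figures used above.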
It did not seem that having more memory available was making any difference in Workstation's performance. Meanwhile, I was having to sit and wait for machines to suspend or restore. So after some minor tinkering with a couple of VMs, I suspended them all, shut down Workstation, restarted as root, and upped the total RAM available to VMs to 4500MB. That was more like 10% above 4096MB, but I had already found that 4500MB was enough. These calculations were rough, and I could have researched the matter, but RAM was not that tight for me. I noticed the Desktop didn't have the right icons, so I moved whatever was on there to a holding folder and restored my desktop files and shortcuts from files.tgz. They weren't all working quite right, though. I figured I had probably better work through my notes, or at least skim them, starting from the date of the Acronis backup. Most of the relevant notes were contained in a single long post. The first thing was that my setxkbmap launcher, which I had placed on my desktop and occasionally used to fix some screen and keyboard problems caused by going into full-screen view in VMware, was not operational. I thought that would be easily fixed by clicking its Properties > Permissions > Allow executing file as program. Maybe it was; maybe it was working; but I wasn't able to move between Ubuntu desktops by scrolling my mouse wheel, and I had thought setxkbmap had been the fix for that. I didn't try to fix the Adobe Flash plugin problem this time around. It was a little confusing. I had thought that my 64-bit Ubuntu installation was unable to run it, but I had then noticed that it (or something like it) seemed to be running just fine in Ubuntu on the secondary computer. Maybe I had installed a fix there and had forgotten about it. Whatever. I felt I would have to work through that issue later on, unless the next version of Ubuntu fixed it for me. 
I hadn't yet completed the project of setting up desktop icons I could click on to run specified programs or to open specified files or folders each day, or week, or month. I had set that up on the secondary computer, but had not yet proceeded with it on the primary computer. So that was one more issue from the past two weeks that I would be dealing with later, but that I did not need to fool with right now in order to get my system caught back up to where it had been before it went south and I needed to restore the Acronis image. I also wanted to get Google Earth working again. I had the shortcut on my desktop, but it was not operational. It also wasn't listed in my Applications. I searched this blog for information on how I had done it last time. I found only a brief mention, with a link to a webpage that advised me to use wget to download the stable version. That reminded me to take a look in System > Administration > Synaptic Package Manager. I searched and found that I had apparently already installed googleearth-package -- which wasn't surprising, since I remembered wrestling with it several times -- with eventual success, as I recalled. I usually used it on the secondary computer, but it did seem that I had gotten it running on the primary computer as well. I hit Alt-F2 for a command box and typed googleearth. That didn't work. Neither did googleearth-package. In File Browser, I searched File System for google. Turns out there were tons of files naming Google Earth or googleearth or whatever, but I wasn't seeing anything that looked like a clickable icon. Since I wasn't too sure how to start the program, and also wasn't sure it had been functioning correctly even at the end of the previous installation, I decided to uninstall and reinstall it in Synaptic. But where had it gone? I couldn't find it in File Browser, and typing googleearth or googleearth-package accomplished nothing in Terminal. I searched File System again, this time for googleearth. 
The search turned up a googleearth-package folder containing a README.Debian file. When I opened that text file, I saw an instruction to run make-googleearth-package. I did that and, as promised, it created a Debian package which, in this case, was called googleearth-package_0.5.2_all.deb. The README said I could install that package with the command "dpkg --install." I typed "man dpkg" in Terminal to see the manual on this dpkg command. It said dpkg was the Debian package manager, and the syntax was "dpkg [options] action." "Install," it said, was an action, and the complete syntax for that part of the command was "--install package_file." So it sounded like what I actually needed to type was "dpkg --install (packagename).deb." (I had to use parentheses instead of angle brackets, in that sentence, because I had slowly learned that Blogger would interpret angle brackets as HTML commands, i.e., they would vanish from my typed text.) I tried that command in Terminal and it said, "Requested operation requires superuser privilege," so I did it again with "sudo" in front. (Just typing the up arrow, of course, would bring previous commands back to the cursor, so that I did not have to retype them in full; and Home would take me to the start of those previous commands to insert words like "sudo.") Unfortunately, this command yielded a "Cannot access archive: No such file or directory" error, because I had given dpkg only the filename while working in a different folder. In Terminal, using cd commands, I navigated to the folder where the .deb package had been saved (found by right-clicking on the item where it showed up in the File Browser search), namely, /var/cache/apt/archives, and retried the command there. This time it ran. But I couldn't figure out what it had done, or where it had done it. There weren't any Google Earth icons on the desktop or in Applications, and the README didn't say anything further. 
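In hindsight, the "Cannot access archive" error can be avoided by giving dpkg the full path to the package file instead of changing folders first. Here is a minimal sketch of that idea. The function name install_deb is made up for illustration; "dpkg --install" and "dpkg -L" are the real commands, and the path and filename in the comments are just the ones from this post.

```sh
# install_deb: hypothetical helper that checks the .deb file exists
# before handing its full path to dpkg.
install_deb() {
    deb=$1
    if [ ! -f "$deb" ]; then
        echo "No such package file: $deb" >&2
        return 1
    fi
    sudo dpkg --install "$deb"
}

# Example, using the path and filename reported in this post:
#   install_deb /var/cache/apt/archives/googleearth-package_0.5.2_all.deb
#
# Afterwards, dpkg -L lists the files that an installed package put on the
# system, which is one way to see what the installation actually did:
#   dpkg -L googleearth-package
```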
I did another search for Google across my entire File System in File Browser, and that turned up a bunch of things, but nothing I could identify as specifically the launcher for Google Earth. I typed Alt-F2 and typed googleearth in the Run Application box, but neither that nor googleearth-package would do anything except give me error messages. After scanning a few discussion threads, I went back into Synaptic and marked googleearth for complete uninstallation. Most of the googleearth entries were now gone from my search of File System, which I had left open during this process. I still had the .deb file and a GoogleEarthLinux.bin file, so I deleted both of those manually. Then I tried a different approach, which required me to type "sudo apt-get install googleearth-4.3." It didn't work, probably because I had failed to first enable the Medibuntu repository:

sudo wget http://www.medibuntu.org/sources.list.d/hardy.list -O /etc/apt/sources.list.d/medibuntu.list
sudo apt-get update && sudo apt-get install medibuntu-keyring && sudo apt-get update
(The word wrap may be confusing. There is a total of just two commands there, even if they wrap to four or more lines in this post. In other words, enter all of the first quoted paragraph, above, on one line, and then enter all of the second quoted paragraph on the next line.) I did that, and then tried again with "sudo apt-get install googleearth-4.3." That gave me a Google Earth icon under Applications > Internet, and when I clicked it, Google Earth 4.3 ran. It gave me an error message: "Unknown Graphics Card" and "Google Earth is unable to identify your graphics card." They gave me a link to a webpage that offered driver download and install instructions. But when I clicked on that link, I got another error message:
Could not launch any web browser. Please make sure you have set the $BROWSER environment variable to the filename of the web browser we should launch!
Very few webpages came up in response to my search for this error message, and the ones I saw did not have an answer. I posted a question on that. The answer, which came back shortly, was as follows:
This is how you export a variable (firefox is used in the example):
export BROWSER=/usr/bin/firefox
This variable only keeps its value until you close the terminal you typed it in. To make it permanent, add it to end of your .bashrc file.
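A minimal sketch of that advice follows. It assumes the .bashrc in question is the per-user one in the home directory (~/.bashrc), which File Browser hides by default because its name starts with a dot. The function name add_browser_export is made up for this example; the export line is the one quoted above.

```sh
# add_browser_export: hypothetical helper that appends the export line
# to a bashrc file, but only if that exact line is not already there.
# With no argument it targets ~/.bashrc (assumed to be the file meant).
add_browser_export() {
    rcfile=${1:-"$HOME/.bashrc"}
    line='export BROWSER=/usr/bin/firefox'
    if ! grep -qxF "$line" "$rcfile" 2>/dev/null; then
        echo "$line" >> "$rcfile"
    fi
}

# Example:
#   add_browser_export            # appends to ~/.bashrc
#   ls -a ~ | grep bashrc         # shows hidden files such as .bashrc
```

The change takes effect in Terminal windows opened afterwards. Whether a program started from the Applications menu would see the variable is a separate question that this sketch does not settle.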
Now I had to find .bashrc. A search of File System turned up four possible candidates. Two were named bash.bashrc and two were named dot.bashrc. I looked at all four of them in gedit (Text Editor). They were all different, but their differences did not provide obvious clues to me. I looked at their properties and noticed that one of each was in /usr/share/doc/adduser/examples/adduser.local.conf.examples. "Examples" sounded like a reference source rather than an operating folder, as compared to /etc or /usr/share/base-files, which was where the other two were located. Between those two, I guessed that bash.bashrc in /etc sounded more official than dot.bashrc in /usr/share/base-files, primarily because /etc seemed to be where fstab and other actively used bootup-type scripts were located. So I typed "sudo gedit /etc/bash.bashrc," created a new last line in it, and typed "export BROWSER=/usr/bin/firefox" there. I tried starting Google Earth, but by this point I had been having the NVIDIA low-resolution problems described below. I got a low-resolution notice from Google Earth, and when I clicked OK on it, the system crashed and restarted. So I couldn't tell, at this point, whether the suggestion had worked or not. While I was waiting for the answer to that BROWSER question, I went ahead and entered the URL manually. Their search page didn't help me, so I entered my own search. One of the few pages that came up reminded me to check the obvious thing: System > Administration > Hardware Drivers. Somehow my NVIDIA accelerated graphics driver had gotten turned off. I enabled it. It said, "Needs computer restart," so I restarted. Or at least I tried to. The system turned off, but she no start again. I punched Reset, and it started to go, but then I got "No Signal," the hard drive light flickered a bit more, and then nothing, again. I did Reset again, and this time went into Recovery Mode in GRUB. 
The dpkg process there noticed that I needed some upgrades, apparently because I had now enabled the Medibuntu program repository, so it installed those. I also ran the xfix option, and then resume. So maybe it was the Medibuntu deal that was causing the boot problem. Reboot continued normally. Now, back in Ubuntu, I double-checked the hardware drivers, and -- what's this? -- the NVIDIA graphics driver was still not enabled. So, OK, I clicked it again, restarted again, and again got "No Signal" and then a blank screen. Was this going to be a permanent feature of my system from now on? I punched reset and did the recovery thing again. I noticed, when xfix was running, that it said, "Overwriting possibly-customized configuration." After I hit "resume," the system booted normally. Checked the hardware drivers again; enabled them again; restarted again; No Signal again; punched Reset again; recovery mode again. This time, I ran only the dpkg fix, not the xfix fix in recovery mode. No Signal! So the xfix fix was fixing the damage caused by trying to use the NVIDIA drivers. This wasn't what I desired. Reset; recovery mode; this time just do xfix. Ran it twice by mistake. Back in Ubuntu: one more for the road, just to confirm that xfix and the NVIDIA driver were opposing one another. Hardware drivers enabled again; restart required; recovery mode; try Resume without running xfix; No Signal. Reset; recovery mode; xfix only; resume; good boot; NVIDIA driver was disabled. I found a thread that said I should hit Ctrl-Alt-F1. I did that, and I was at a black-and-white command-line screen. It required me to enter my ID and password. The advice was then to type "/etc/init.d/gdm stop". That stopped the GNOME Display Manager. Next, I was supposed to "Go to the directory where the driver package is stored and run 'sh ./NVIDIA(driver package name).run'." I had no idea where that directory would be. 
Here on the secondary computer, I did a search in File Browser for NVIDIA*.run, because that's how I would have done it in Windows, but of course that turned up nothing. I read the rest of that thread and it seemed this advice didn't work for everyone. I noticed they were talking about EnvyNG and downloading various NVIDIA drivers, and this was just where the system had gotten screwed up before, requiring a complete reinstall from my Acronis True Image backup. Before crossing that bridge, it was time to do another Acronis backup, so as to lock in the gains that I had achieved thus far. After doing the backup, I searched File System on the primary computer for NVIDIA- and found all kinds of stuff, but nothing in the nature of NVIDIA-*.run. With the search results still visible, I went into System > Administration > Synaptic Package Manager and searched for nvidia. I uninstalled several nvidia packages. This removed almost all of the items from the search box -- every installed nvidia package except some jockey programs. The NVIDIA drivers were no longer visible in the Hardware Drivers box. I rebooted the system. I decided to start with the EnvyNG approach, since that seemed easiest. So I searched for that in Synaptic and selected envyng-gtk for installation. I installed that and then ran it from Applications > System Tools, selecting Automatic Hardware Detection. When it was done, it wanted me to restart the computer, which I did. I got "No Signal" and then a dead computer. I did the routine: punch reset, select the recovery mode option in GRUB, run the xfix option, and then resume. I ran EnvyNG again, and this time selected manual installation of the driver and chose the second most recent one, 96.43.05, rather than 173.14.12. EnvyNG ran and allowed me to reboot again. In each of these many reboots, I would get a number of messages, some of which looked like error messages, from Ubuntu; but typically I did not understand them and/or they would flash by too quickly. 
So I had not been trying to fix each of them, though I began to think that maybe I should have been. Anyway, this NVIDIA driver failed too: No Display and a dead machine. I did the recovery mode routine again. I tried EnvyNG one last time, choosing the oldest driver shown: 71.86.04. This didn't work much better: I didn't get a dead machine, but I did get the "Ubuntu is running in low-graphics mode" message. Since the bootup did at least complete this time, I tried System > Administration > Hardware Drivers. It said "No proprietary drivers are in use on this system." So this was a bust. At this point, I got a notification, "Software updates available," so I ran that, to see what would happen. My guess was that these were packages I had incidentally uninstalled, when I was uninstalling all those nvidia packages earlier. After they installed, there was still nothing listed in Hardware Drivers. I rebooted, just for old times' sake, and I was still in the same boat: poor resolution, nothing in Hardware Drivers. I concluded that the EnvyNG approach was not going to work on this computer, so I used it to uninstall the drivers, and then I uninstalled it in Synaptic. It now occurred to me to try downloading the drivers directly from NVIDIA's website. For my EVGA 256-P2-N624-AR GeForce 7900GS video card, the recommended driver, provided by their recommended manual search, specifying 64-bit Linux, was indeed the version 173.14.12, released July 30, 2008, that EnvyNG had identified. Their recommended steps, for installing this driver, were basically (1) download the driver file, NVIDIA-Linux-x86_64-173.14.12-pkg2.run; and then (2) figure out where it went -- to the Desktop, in my case; and then (3) log in as root and navigate to that location -- in my case, it was "cd /home/ray/Desktop"; and then (4) type "sh NVIDIA-Linux-x86_64-173.14.12-pkg2.run" to install. But I wasn't out of the woods yet. 
I got an error message: "You appear to be running an X server; please exit X before installing." They directed me to "the section INSTALLING THE NVIDIA DRIVER in the README available on the Linux driver download page at www.nvidia.com," which as far as I could tell does not exist. I did a couple of Google searches and finally found chapter 2 of the NVIDIA Accelerated Linux Driver Set README and Installation Guide. Chapter 2, Installing the NVIDIA Driver, recommended that I exit the X server and kill all OpenGL applications; and for guidance in that, they directed me toward chapter 8, Tips for New Linux Users. It recommended editing /etc/inittab, which did not exist on my system. A search of the file system revealed nothing called inittab except a Perl script, "migrate-inittab.pl," which appeared to be designed to edit inittab if it existed. After several searches, I finally found someone who claimed that inittab had been replaced, in Ubuntu, by upstart. This was not too helpful: a search of my file system turned up a dozen or more files or folders whose names mentioned "upstart." Basically, these NVIDIA instructions were not providing very useful guidance to me. I looked further. A Debian website said that, aside from 0 (halt) and 6 (reboot), runlevels were either 1 (single-user, troubleshooting mode) or 2 through 5 (all the same, all multiuser mode). It said you could determine your current runlevel by just typing "runlevel" at the Terminal prompt. I did. I got back N 2, meaning I was on runlevel 2. I found another thread with more advice that wasn't crystal-clear. I posted a question on it and got a response within 10-15 minutes. The essence of the instructions, in my case, was:
(1) Use Ubuntu's shutdown button and select Log Out.
(2) Hit Ctrl-Alt-F1.
(3) Log in.
(4) Type this to stop Gnome: sudo /etc/init.d/gdm stop
(5) cd /home/ray/Desktop (which is where the downloaded NVIDIA installation file was).
(6) Type this to make sure the NVIDIA installer has executable permissions: chmod +x NVIDIA*.run
(7) Run the installer: sudo ./NVIDIA-Linux-x86_64-173.14.12-pkg2.run
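Steps (4) through (7) could also be collected into one small shell function, sketched here under some assumptions: the function name run_nvidia_installer is made up for illustration, and it is meant to be typed at the Ctrl-Alt-F1 console after logging in, not in a Terminal window inside GNOME. It checks for the installer file first, so that the display manager is not stopped for nothing.

```sh
# run_nvidia_installer DIR FILE: hypothetical wrapper around steps (4)-(7).
run_nvidia_installer() {
    dir=$1
    installer=$2
    # Make sure the downloaded installer is really there before stopping GNOME.
    if [ ! -f "$dir/$installer" ]; then
        echo "Installer not found: $dir/$installer" >&2
        return 1
    fi
    sudo /etc/init.d/gdm stop || return 1   # step 4: stop the display manager
    cd "$dir" || return 1                   # step 5: go to the download folder
    chmod +x "$installer"                   # step 6: make the installer executable
    sudo "./$installer"                     # step 7: run the installer
}

# Example, with the folder and filename from this post:
#   run_nvidia_installer /home/ray/Desktop NVIDIA-Linux-x86_64-173.14.12-pkg2.run
```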
I did that and got this message:
No precompiled kernel interface was found to match your kernel; would you like the installer to attempt to download a kernel interface for your kernel from the NVIDIA ftp site (ftp://download.nvidia.com)?
It sounded like the kind of offer I couldn't refuse. I guessed this came up because I had uninstalled so much NVIDIA stuff. Next, it said,
No matching precompiled kernel interface was found on the NVIDIA ftp site; this means that the installer will need to compile a kernel interface for your kernel.
Again, I was the very picture of agreeableness as I clicked OK, that being the only option available. I also went right along when they suggested,
Install NVIDIA's 32-bit compatibility OpenGL libraries?
that being, again, the default (although they did give me a chance to say no). And now I had this:
Would you like to run the nvidia-xconfig utility to automatically update your X configuration file so that the NVIDIA X driver will be used when you restart X? Any pre-existing X configuration file will be backed up.
And of course you know what I said. That completed the installation. The advice from my posted question now continued as follows:
(8) When it has built and installed the new kernel module and libraries you should be able to restart Gnome (if you've got the Ubuntu LRM (Linux Restricted Modules) installed you might see an error report due to a bad script in /sbin/lrm-video - if that happens, ask me for how to fix it):
sudo /etc/init.d/gdm start
I typed that last line at the prompt and, hard to imagine, I got -- would you believe -- No Signal! Dead monitor! I updated my posted question but, you know, this time I was not getting such a quick response. As of 14 hours later, nothing. It was time to review the situation. I needed to move this project ahead, as I had big plans for this whole VMware setup. I seemed to have a graphics problem that not everyone with this hardware and software combination was having. Something had evidently happened to my Ubuntu installation to make it graphically fubar. I could restore the Acronis backup again, but apparently the seeds of destruction had already been sown at the time when that backup was made. I could reinstall Ubuntu from scratch, and maybe that wasn't going to be such a big deal. I hated not to solve the problem. The previous thread had been in the x64 Ubuntu forum. I decided to post a revised version of my question in the Hardware & Laptops forum. I had a response in 40 minutes, containing a number of bits of advice. I didn't understand them all -- for example, the advisor recommended "boot the newer kernel," but I had no idea what that meant. I updated my question and then set to work on the parts that I could understand. One was to use nvidia-glx-new, so I went to Synaptic and installed that. The person had advised "remove the installed mess," but in my search for "nvidia" in Synaptic, this would now be the only installed package other than jockey-common and jockey-gtk, which were user interfaces for drivers. But Hardware Drivers still said, "No proprietary drivers are in use on this system." Nothing had changed in the display. I rebooted, to see if that would make a difference. It didn't. The person also recommended looking at a website, but I had already tried its advice and it hadn't worked. I was not having these problems with the secondary computer. It didn't appear to be a fundamental problem with Ubuntu or NVIDIA. 
The system was just messed up, and there didn't appear to be any more obvious solutions to work through. Another hour had passed and my posted questions had received no further responses. It seemed to be time to bite the bullet and reinstall Ubuntu, not from the Acronis backup, but from scratch. I reviewed the advice I had received earlier on how to do that, to make sure I had already put everything into place. It seemed safe to proceed with a reinstall. That became the subject of another post.
