Showing posts with label array. Show all posts

Wednesday, January 12, 2011

Windows 7: Upgrade Installation to Win7 Software RAID0 Array

I was trying to install an upgrade version of Windows 7 on a RAID0 array.  This post contains some notes on what I learned about the possibilities.

I had a new basic hard drive.  I started by installing Win7 on that drive.  The upgrade version of Windows 7 required a previous version of Windows to be installed.  It was not enough just to have the previous disc or serial number.  I was interested in upgrading from Windows XP.  To accomplish this installation, then, I had to install my copy of Windows XP and then upgrade from there.

Having done that, I used Disk Management (diskmgmt.msc) in Win7 to create a couple of Windows 7 software RAID0 arrays on two other empty hard drives.  Unlike other RAID solutions, Win7 was willing to create multiple arrays and single-drive partitions on a pair of drives being used in a RAID0 array.

I hoped to install Win7 into one of those arrays (which I called PROG-FUTURE), and to put my data into another.  Of course, since this was RAID0, I planned to have a good backup scheme for the data.

I went ahead and copied my data into that RAID0 data array.  Later, when it came time to try to install the Win7 upgrade to the PROG-FUTURE array, it seemed that this might have been a mistake.  An attempt to install WinXP to PROG-FUTURE got as far as the point where the installer recognized the various partitions on my drives.  It saw the entire hard drive as a single dynamic disk.  In other words, WinXP might have been willing to install to at least one of the two drives I was using for my RAID arrays.  It gave no sign that it would install itself in any array format to two drives simultaneously.

I was not sure whether an attempt to install WinXP, Win7, or any other operating system to a dynamic drive would run into problems.  There did exist a Dynamic Disk Converter program, and probably others like it, that would apparently be able to convert the dynamic disk to a basic disk format.  I could not say how well such programs would work.

It had occurred to me that perhaps I could use the Universal Restore feature of Acronis True Image Home 2011 (ATIH) to restore a working Win7 installation to the PROG-FUTURE array.  My attempts along those lines did not succeed.  As far as I could tell, ATIH was not capable of restoring a RAID0 array.

Another possibility was to use Ubuntu 10.10 to copy Windows 7 program files from a Win7 installation on a basic drive to the PROG-FUTURE array.  This did not appear feasible at this time, however, because Ubuntu evidently could not see the Win7 RAID0 array as such.  I also wasn't sure whether the resulting partition would actually boot.

An attempt to install directly from the Win7 upgrade CD to the PROG-FUTURE array failed early in the process, when I received this error message:

Windows cannot be installed to this hard disk space.  The partition contains one or more dynamic volumes that are not supported for installation.

It appeared, in other words, that Windows 7 could not be installed to a software RAID0 array created by Win7 itself.  I found a thread suggesting that there were ways to make it work, but it seemed that the process was tricky and prone to problems.  It appeared that the array would probably be better created with some other software or by using a hardware RAID controller, either on the motherboard or on a separate controller card.  Another possibility that I had not heard of previously was native virtual hard disk (VHD) boot.

Monday, January 10, 2011

Acronis True Image Home 2011: Restoring Windows 7 to RAID 0: FAIL

I had installed Windows 7 on a regular hard drive -- what Win7 calls a "basic" drive.  I had made an image using Acronis True Image Home 2011 (ATIH).  Now I was trying to restore that image -- using the ATIH Universal Restore feature installed by the Acronis Plus Pack -- to a new, empty software RAID0 array that I had just created in Windows 7.  This post describes my efforts.

For perspective, I could perhaps restore an image of a Windows 7 installation onto a hardware RAID0 array, and I might be able to restore an Acronis image of a RAID0 array to a non-RAID partition.  Acronis promised that the Plus Pack would enable me to restore to a striped software RAID0 array.  But now that I had made the purchase, a search led to a thread suggesting that the promise was false.

Here's how the effort unfolded.  After booting the ATIH CD, I went into Recover My Disks > Browse (wait) > select the .tib file to restore (or, in my case, the first of the five DVD-sized files comprising the backup).  I went on, OK, Next, and came to "Recover whole disks and partitions" and "Use Acronis Universal Restore."  It gave me the option of adding device drivers.  On the assumption, at that point, that drivers were not necessary in a software array, I clicked Next.  I indicated the partition I wanted to recover, and specified New Location as the dynamic volume that I had created for this purpose.  (Dynamic volumes were listed at the bottom of the screen -- I had to scroll to see them all.)  I clicked Next, and here is the message I got:

You are about to recover a partition containing OS files.  If the recovery destination is an existing non-active dynamic volume, then the system will be unbootable because activation of dynamic volumes is not supported.  Are you sure you want to continue?

I searched and found only a few links containing that statement about activation.  A post by an Acronis employee in an Acronis forum said, "The issue will be resolved in a future update of our software."  To see if the most recent update had resolved it, I went to the Acronis site and downloaded the latest build of ATIH.  I was doing this from an Ubuntu live CD, so I had to save the download (an .exe file) to a USB drive and jump it over to my laptop, running Vista, to install the .exe so I could burn an updated CD that would hopefully have better news for me in terms of restoring to a Win7 RAID0 array.  This was a lot of fooling around.

While that was happening, I went ahead with the next step, using my present copy of ATIH.  Alas, a new error:

There may not be enough free space on the system partition to boot up your operating system after recovery.

This was an odd message.  I had two 50GB partitions in my new RAID0 array.  The backup I had made was from just one partition of either 40GB or 50GB.  I clicked OK there and Acronis stopped.  It didn't try to see whether the restore would fit.  Evidently it was not set up to think in terms of RAID0 arrays.

I guessed that even the updated ATIH would not try to fix this.  But it was going to be a while before I could find out for sure.  The installation on the laptop was taking its sweet time.  After seemingly completing most of the installation, it aborted with this message:

Installation Incomplete

The installation was interrupted before Acronis True Image Home 2011 could be installed.

I didn't know who or what interrupted it.  The laptop was just sitting off to the side, doing its own thing.  I wasn't touching it.  I tried again.  Now I got a new error message:

The error was encountered while the installation.

That's really it.  That was what the message said.  It provided technical details that I didn't understand.  Toward the bottom, it said, "A possible reason might be that you do not have enough privileges."  OK, Vista.  Even though I was running as Administrator on the laptop, that was not enough, and possibly the good people at Acronis couldn't have noticed that until we were at the end of the installation process.  I was unfortunately not knowledgeable enough about the solution, and I was too tired and impatient to research that question just to resolve this tangent from a tangent.

I gave up on that Vista errand and just installed the upgraded ATIH on a Windows XP machine.  No permissions issues.  The upgrade took a while, but then it was successful.  Now I needed to remember how to install the Plus Pack.  My previous post about Plus Pack led me to an Acronis instructions webpage.  The basic process seemed to be to install ATIH, install the Plus Pack, and then go into the new Start Menu entry for Plus Pack and click on Acronis WinPE ISO Builder.  That required me to "Specify a path to the folder with the WinPE files."  A search for that exact statement yielded only a post where someone was trying to combine ATIH and Acronis Disk Director on one CD.  A different search led to an Acronis page instructing me to download the Windows 7 Automated Installation Kit (AIK) from Microsoft.  This was a 1.7GB download.  Was this really what people had to do if they wanted to use Acronis Universal Restore?

I went back to the search and tried a different Acronis page.  This page said that Plus Pack had three benefits, and it pointed me to three separate webpages describing those benefits:  it would support dynamic drives and GUID partition tables (GPT); it would facilitate Universal Restore between dissimilar hardware, including virtual machines; and the WinPE part would create bootable rescue media.  The page on dynamic drives led to a page on RAID support, cited above, that led in turn to a table, summarizing the kinds of RAID support provided by various versions of Windows.  It said that ATIH Plus Pack supported restoring to RAID0.

The page regarding Universal Restore said that it would work only if "You have created Acronis Bootable Media (standard, WinPE, or BartPE) after the installation of Acronis True Image Home 2011 Plus Pack."  So apparently there were three different kinds of Acronis bootable media.  To see more about that, I went back into the Start Menu, on the XP machine where I had just installed ATIH, and chose the option to start up ATIH.  In its main screen, I went to "Create bootable media."  So this was going to give me the standard variety.  I burned it to a CD.  Much easier than creating WinPE media.

With that CD, I was ready for the next step.  Acronis said that I would need drivers for the hard drive controller or the chipset, in .inf, .sys, or .oem forms -- extracted, if necessary, from .exe, .cab, or .zip files.  The last time I had played with Acronis Universal Restore, I hadn't understood this driver situation and, as I dimly recalled, part of the problem was that I didn't know which drivers I should get and how I should extract them.  It was clearer to me, now, that there was no getting around it:  I had to have exactly the right drivers.  I have written up that pursuit in a separate post, for those whom it puzzles as it puzzled me.

Not to say that I came to a clear understanding.  I just made a stab at it.

With the drivers collected in a folder, I booted the Universal Restore CD.  When I got to the Drivers Manager step, I clicked Add Search Path and pointed to that folder.  Oddly, when I did, Acronis reported, "No items to display," even though I had just put 16 driver files in there.  I guessed that this meant it had not refreshed its view at that point.  It did not have an option to do so.  I proceeded to designate what I wanted to restore and where I wanted to restore it.  Once again, sadly, I got that error message indicating that "activation of dynamic volumes is not supported," followed by that error indicating that there might not be enough space to boot the operating system after recovery, and once again that was the end.

The working conclusion, at this point, was that ATIH did not support Windows 7 program partitions in RAID0.  I could install Win7 manually in that kind of partition, and perhaps I could use ATIH to back up a manual installation, but I could not use ATIH to restore any such backup to that location.

I clicked on the Help button in Acronis and browsed its contents, to see if I could get a clearer idea about all this.  They didn't seem to have any information on it.  I posted a question about this on an Acronis forum.  Responses to that question tentatively supported the working conclusion that ATIH does not restore images to Win7 software RAID0 arrays.

Acronis Universal Restore: Getting the Drivers

As shown in another post in this blog, I was in the process of using the Universal Restore feature of Acronis True Image Home 2011 (ATIH) to restore an Acronis drive image from one drive to another.  For this purpose, Acronis made clear that I would need certain motherboard drivers to succeed:

You [must] have drivers for the hard disk drive controller or chipset drivers for the new computer. These drivers are critical for booting the operating system. You can download the drivers for your motherboard on the Vendor's web-site. Please note, if you downloaded the drivers in *.exe, *.cab, *.zip format, you should extract them first. The driver files should have the *.inf, *.sys or *.oem extensions.

My motherboard was a Gigabyte GA-MA785GM-US2H.  I went to Gigabyte's download webpage and indicated that I wanted drivers for 32-bit Windows 7.  They offered me several kinds of drivers:  audio, chipset/VGA, LAN, and SATA RAID.  Plainly, Acronis was not telling me to get the audio or LAN drivers.  I definitely needed the chipset/VGA driver.  How about the SATA RAID driver?  I wasn't using hardware RAID.  I was relying on Windows 7 software RAID, and it did not require the floppy-based pre-Windows installation procedure that those SATA RAID drivers required.  The description of the chipset driver said "AMD Chipset Driver (include chipset\sata raid\vga driver)."  That sounded like the all-purpose thing Acronis wanted.  So I guessed that I probably needed only the chipset/VGA driver.

The chipset driver was a 62MB download called motherboard_driver_chipset_amd_7series-v2.0_win7-32.exe.  My first question was, how do I extract drivers from an .exe file?  I tried running it.  Fortunately, it unzipped itself.  Apparently it was packaged as an executable whose purpose in executing was just to let itself breathe.  Next, how to find the .inf, .sys, or .oem file(s) needed?  I focused Windows Explorer on the unzipped folder and typed *.inf in the search bar at the top right-hand corner of the Windows Explorer screen.  I hit Ctrl-A to select everything that the search produced, Ctrl-C to copy them all, and then went off and created a new folder (not on drive C) and hit Ctrl-V to paste these copies into there.  I did the same thing for .sys and .oem.  There weren't any .oem files, but I came up with 16 of the other two kinds.
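For anyone who would rather script that search-and-copy step than do it by hand in Windows Explorer, here is a minimal sketch in Python.  It is not something Acronis or Gigabyte provides; the function name and the folder names in the usage note are my own placeholders, and the rule for handling two files with the same name (keep the first, skip the rest) is an assumption.

```python
import shutil
from pathlib import Path
from typing import List

# The extensions named in the Acronis instructions quoted above.
DRIVER_EXTENSIONS = {".inf", ".sys", ".oem"}


def collect_driver_files(extracted_dir: Path, target_dir: Path) -> List[Path]:
    """Copy every .inf/.sys/.oem file under extracted_dir into one flat folder."""
    # List the sources first, so files copied into target_dir are never rescanned.
    sources = sorted(
        path for path in extracted_dir.rglob("*")
        if path.is_file() and path.suffix.lower() in DRIVER_EXTENSIONS
    )
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for source in sources:
        destination = target_dir / source.name
        if destination.exists():
            # Assumption: on a name clash, keep the first copy and skip the rest.
            continue
        shutil.copy2(source, destination)
        copied.append(destination)
    return copied
```

A call might look like collect_driver_files(Path("unzipped_chipset_driver"), Path("D:/acronis_drivers")), with both paths swapped for the real folders.  The function returns the list of copied files, so its length can be compared against the count that Windows Explorer reports.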

Those may or may not have been the right drivers.  I hoped that ATIH would look at the folder, when doing its Universal Restore, and would see what it wanted in there.  Unfortunately, as described in another post in this blog under the title, "Acronis True Image Home 2011:  Restoring Windows 7 to RAID 0," this was pretty much where the matter died.  The responses to a question I had posed in Acronis's support forum were indicating that I could not restore to a Win7 software RAID0 array.

Thursday, January 6, 2011

Windows 7: Setting Up RAID

The two potential advantages of setting up hard drives in a RAID array seemed to be performance and/or safety.  Among the many types of RAID arrays, some of the more frequently mentioned were RAID 0, RAID 1, and RAID 5.

RAID 0 used a "striping" approach where data was divided between hard drives.  This was intended to speed up reading and writing data.  The drawback was that failure of either drive would mean loss of all data, since the data was stored on the drives in a puzzle-like form that could not be reconstructed without all of the pieces from both drives.  If someone had a RAID 0 array containing 100 drives, and if the whole array would fail as soon as any one drive failed, the array would be no safer than its weakest drive and, in practice, less safe than any single drive, since each added drive was one more chance for a failure.  So the performance improvements of RAID 0 meant greater risk of data loss.
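To put rough numbers on that risk, here is a minimal sketch in Python.  It assumes that each drive fails independently and with the same probability over some period; the 3% figure is a made-up illustration, not a measured failure rate.

```python
def raid0_failure_probability(per_drive_failure: float, drives: int) -> float:
    """Chance that a RAID 0 array loses its data, assuming independent drive failures."""
    if not 0.0 <= per_drive_failure <= 1.0:
        raise ValueError("per_drive_failure must be between 0 and 1")
    if drives < 1:
        raise ValueError("an array needs at least one drive")
    # The array survives only if every single drive survives.
    survival = (1.0 - per_drive_failure) ** drives
    return 1.0 - survival


if __name__ == "__main__":
    # Hypothetical 3% chance that any one drive fails during the period.
    for drive_count in (1, 2, 100):
        risk = raid0_failure_probability(0.03, drive_count)
        print(f"{drive_count} drive(s): {risk:.1%} chance of losing the array")
```

Under those assumptions, a two-drive array is nearly twice as likely to fail as a single drive, and a 100-drive array is very likely to fail.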

RAID 1 used a mirroring approach where everything was written onto disk 1, but simultaneously copied to disk 2.  This would not speed things up at all; if anything, it would slow things down a bit, as the system coped with twice as much data.  The purpose of RAID 1 was safety.  Either drive could fail, and yet no data would be lost.  If the computer was set up so as to let the user know that a drive had failed, then s/he could just replace the failed drive and let the RAID array fill in the new replacement with the current state of the data.

RAID 5 used a striping approach, for speed, combined with parity information distributed across the drives, for safety.  A RAID 5 array would have at least three drives.  Failure of one drive would reduce performance but would not result in data loss.
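The safety scheme in RAID 5 is usually described as parity:  for each stripe, one drive holds the XOR of the data blocks on the other drives, so any one missing block can be recomputed from the rest.  The following Python sketch shows the idea for a single stripe on three drives.  The block contents and variable names are invented for illustration, and a real array also rotates the parity block from drive to drive, which this sketch leaves out.

```python
from functools import reduce
from operator import xor
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(xor, column) for column in zip(*blocks))


# One stripe on a three-drive array: two data blocks plus one parity block.
block_on_drive_1 = b"WIN7"
block_on_drive_2 = b"DATA"
parity_on_drive_3 = xor_blocks([block_on_drive_1, block_on_drive_2])

# Pretend drive 2 has failed: XOR the survivors to get its block back.
rebuilt_drive_2 = xor_blocks([block_on_drive_1, parity_on_drive_3])
print(rebuilt_drive_2 == block_on_drive_2)
```

The rebuild has to read every surviving drive, which is consistent with the reduced performance after a failure.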

As of this writing, Microsoft was not providing clear information regarding which versions of Windows 7 offered which kinds of RAID.  Informal reports indicated that only RAID0 and RAID1 were available in user versions of Win7, and that RAID5 (and perhaps other kinds) were available in server versions of Win7.  Acronis said that RAID5 was available only in Windows Server 2000, 2003, and 2008.  The approach I planned to take, then, was to set up a RAID0 array with good backup.

To set up the array, I typed diskmgmt.msc in the Start button > Search programs and files box.  That opened up Disk Management.  The disks I wanted to put into my array were Disk 0 and Disk 1.  First, I wanted to set up a 50GB array called PROGRAMS to contain Windows program files.  I right-clicked on the gray part of the Disk Management window where it said "Disk 0" and chose "New striped volume."  This opened the New Striped Volume Wizard.  I added Disk 0 and Disk 1 to the Selected box and specified 25600MB of space (on each drive, for a total of 50GB).  I looked into allocation unit size briefly and decided to stick with the default.  As I had gathered elsewhere, I didn't have to worry about the question of converting the disks from basic to dynamic because that was unavoidable with striping, and now I was at the point where the wizard was going to take care of it.

There were some other things I wanted to put into RAID0 arrays for speed.  Examples included virtual machines and files on which I would be doing video editing.  I set up arrays for those.  But there were also some things that I didn't want to put into a RAID0 array.  These were items that were big and bulky, and therefore would take a lot of time to back up, but were not performance-intensive.  General-purpose storage would fall into this category.  If I wanted performance in that area, I could move those items out of storage and into a RAID0 workspace.  For storage, I considered using a spanned volume that would put half of the data on each of the two hard drives, so as to reduce the amount of data that I would have to restore if either drive did fail.  But it sounded like restoring data from a spanned volume could be tricky, so I thought I might just divide the stored data into two separate partitions, one on each disk.  To do that, I used the "New simple volume" option in Disk Management.

There didn't seem to be any point in trying to set up a RAID volume for an Ubuntu dual-boot.  This was software RAID, governed by the Windows operating system.  Windows allowed the user to allocate only part of a drive to RAID, and as far as I knew Ubuntu set aside whole disks in a hardware RAID array.  I did set aside some space for Ubuntu program installation, however.

With the partitions set up, the next goal was to restore my Windows 7 installation to the PROGRAMS array.  That became a whole ordeal in itself, as described in a separate post.

Saturday, October 30, 2010

Ubuntu 10.10: Streamlined RAID 0 Installation

I had previously installed Ubuntu 10.04 on a two-drive RAID 0 array.  I did that to make a Windows XP guest virtual machine (VM) run faster in VMware Workstation 7.1.  I had then run into some problems with that installation, and had abandoned it.  Now it was time to try again, but this time with Ubuntu 10.10.  This post describes the process in more streamlined terms, drawing from the previous post in which I logged the details of that earlier attempt.

This time, as before, I had two hard drives for the RAID 0 array, plus a third drive on which I had already installed Windows XP.  The two empty hard drives for RAID were each 320GB.  That third drive also held my /home partition (i.e., the contents of the /home partition from a previous Ubuntu installation), which contained many of my settings and adjustments for various Ubuntu programs.  In other words, my Ubuntu installation would not be like a Windows XP installation, where it would be necessary to reinstall all of my applications (except the portable ones) after reinstalling the operating system.  The third drive also held my Linux swap space, which I probably could have put into the array instead, along with a partition I called LOCAL, which would hold backup copies of the VMware virtual machines.  I was going to put the active VMs into the RAID 0 setup to make them run faster, but of course RAID 0 was riskier in the sense that failure of either of the two hard drives would mean the loss of everything in the RAID 0 array.

I started by downloading and burning the Ubuntu 10.10 alternate (or "alternative") CD.  I booted that CD and chose the "Check disc for defects" option.  This took five or ten minutes, and then it said, "Integrity test successful," and then rebooted.  So then I went through the "Install Ubuntu" option and took the basic steps (selecting my country, my keyboard type, etc.).  The meat of the RAID 0 process began about three minutes into that video by amzertech (speaking, here, of its Part 1, not Part 2), where it was time to partition the drives.  I went to Manual (i.e., not Guided), and this put me into the main "Partition disks" screen, the one beginning with "This is an overview."  I went down to the first of the two hard drives.  It referred to them as SCSI partitions, but it also recognized them as being sdb and sdc.  So it looked like I had correctly cabled that third drive to actually be the first in the system (i.e., sda, a/k/a SCSI3 according to the partitioner), so as to make Windows happy.

The general concept of the RAID setup process was that, first, you designate some free space on each drive as a physical volume for RAID, and then you combine those physical volumes from the two (or more) drives into a single software RAID device.  The following paragraphs provide the details.

First, following the video, I went down to the first of the empty 320GB drives that I was going to use for my RAID array.  In my case, unlike the video, there was not yet any "pri/log" line showing "FREE SPACE" that I could select, under the drive identification line on the screen, so I just highlighted the drive itself and hit Enter.  This gave me the option of creating a new empty partition table on the drive, and I went with that for each of the two drives.  Then I highlighted the pri/log line under the first 320GB drive, showing free space.  There, I hit Enter and chose "Create a new partition."  For its size, I typed "50GB" and made it a primary partition at the end of the drive.  I guessed that this meant the outside of the physical disc, where I believed data transfers would be faster.  Instead of leaving "Use as" at the default ext4 setting, I highlighted and hit Enter and went down to select "physical volume for RAID" (enter) > "Done setting up the partition."  I went through the same steps with the second 320GB drive, which was sdc on my system.  So now, back on the "Partition disks" screen, each of the two drives showed an entry that looked like this:

#1   primary   50.0GB   K   raid

So this would give me a total of 100GB for my Ubuntu program installation, and I would still have several hundred GB left over as free space.  Now, on the main "overview" screen, I went up to the line that said "Configure software RAID" > "Write the changes to the storage devices" > "Create MD device" > RAID0.  This put me at a list of "active devices."  I wanted sdb1 and sdc1 (i.e., I didn't want to use one of the partitions I had previously created on sda, my third hard drive).  These were the only partitions on drives sdb and sdc, so the choice was easy.  For some reason, they showed up here as being 49999MB rather than 50GB.  I highlighted each of those two, hit the spacebar to select it, and then tabbed to Continue > Finish.  This put me back in the "overview" screen, where I saw that I now had these new lines, near the top:

RAID0 device #0 - 100.0 GB Software RAID device
   #1       100.0 GB
              131.1 kB         unusable

I highlighted the line that began with #1 and hit Enter > Use as > ext3 (apparently still more reliable than ext4) > Mount point > "/ - the root file system" > "Done setting up the partition."  This put me back in the "overview" screen, where the line now looked like this:

   #1       100.0 GB     f   ext3      /

I decided to go ahead with the video's approach of putting the swap space on the RAID0 partition.  To do this, I went through the same steps as above, starting with the free space line on each of the two drives.  The only differences were:

(a) I allocated only 5GB on each drive for this partition.
(b) This time, I selected sdb2 and sdc2 (instead of sdb1 and sdc1) as my active devices for the array.
(c) Under "Use as," I chose "swap area" instead of ext3.

The result, back in the "overview" screen, was that I had these lines:

RAID0 device #0 - 100.0 GB Linux Software RAID Array
   #1       100.0 GB    F   ext3      /
RAID0 device #1 - 10.0 GB Linux Software RAID Array
   #1         10.0 GB     f  swap     swap
              131.1 kB unusable

At this point, I wanted to vary from the video by adding one more partition, where I would put my VMs and possibly other things.  I went through the same process as with the first RAID device (above), and I used all of the remaining space on the two drives except for about 1GB.  The active devices in this case (when I got to that point in the process) were, of course, sdb3 and sdc3.  Back in the "overview" screen, I saw that I now had RAID0 device #2 of 528GB.  I would never need all of that space for my VMs, but I had no other use for the space, and this RAID setup process was a one-shot deal:  designate the space in some useful form now, or leave it forever unallocated.
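As a cross-check on those sizes, the arithmetic can be sketched in a few lines of Python.  The drive size, the 50GB and 5GB members, and the roughly 1GB held back come from the steps above; the variable names are mine, and the sketch ignores the small rounding that the partitioner applied (e.g., 49999MB rather than 50GB).

```python
DRIVE_GB = 320          # size of each of the two RAID drives
DRIVE_COUNT = 2
BOOT_RESERVE_GB = 1     # roughly what was left unallocated on each drive


def raid0_capacity_gb(member_gb: int, drive_count: int = DRIVE_COUNT) -> int:
    """A RAID 0 device built from equal-size members holds the sum of those members."""
    return member_gb * drive_count


root_member_gb = 50
swap_member_gb = 5
# Whatever is left on each drive after the first two members and the reserve.
vm_member_gb = DRIVE_GB - root_member_gb - swap_member_gb - BOOT_RESERVE_GB

layout = {
    "RAID0 device #0 (/)": raid0_capacity_gb(root_member_gb),
    "RAID0 device #1 (swap)": raid0_capacity_gb(swap_member_gb),
    "RAID0 device #2 (VMs)": raid0_capacity_gb(vm_member_gb),
}

for name, size_gb in layout.items():
    print(f"{name}: {size_gb} GB")
```

The third device comes out at 2 x 264GB = 528GB, which matches what the partitioner reported.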

So now, the final step.  I needed to create a /boot partition on just one drive.  That was why I needed to save 1GB.  I could have made one of those last active devices (either sdb3 or sdc3) larger than the other, but there was no point:  as I understood it, RAID0 would use only the amount of space that they both had in common.  So I would wind up with 1GB unused on one of the two drives.  Anyway, to create the /boot partition, I selected that remaining free space on sdb (i.e., the first of my two RAID drives) and used it all up on another ext3 partition.  This time, I chose ext3 without first choosing "physical volume for RAID"; and after choosing ext3, I didn't go right to "Done setting up the partition."  Instead, I stopped first at the "Mount point" option, where I chose the "/boot" option.  Back in the "overview" screen, I saw that I now had my three RAID0 devices at the top of the list, and the /boot device as sdb4 down under the first of my two 320GB drives.

In the "overview" screen, I saw that there was too much information; some had scrolled off the bottom of the screen.  I arrowed down until I got to the very bottom of all that, where I chose the "Finish partitioning and write changes to disk" > Yes option.  This started me right into the Ubuntu installation process, where I just entered basic information (e.g., my name).  The installation was very straightforward, and it worked:  Ubuntu booted up.  I then went to System > Administration > System Monitor > File Systems tab.  There, I saw /dev/md0 as root directory and /dev/sdb4 as /boot.  (The video said that the swap would not be visible here, and it wasn't.)  So my next step, at this point, was to refine the basic installation to suit my preferences.  The description of that process appears in a separate post.

Monday, September 27, 2010

Dual-Boot RAID 0: Ubuntu 10.04 and Windows XP

I wanted to set up a SATA RAID 0 array that would function like any other dual-boot system:  I would turn on the computer; it would do its initial self-check; I would see a GRUB menu; and I would choose to go into either Windows XP or Ubuntu 10.04 from there.  This post describes the process of setting up that array.

With no drives other than my two identical, unformatted SATA drives connected, I turned on the computer.  The BIOS for my Gigabyte motherboard did not give me the obvious RAID configuration option I had hoped for.  I hit DEL to go into BIOS setup.  Nothing jumped out at me.  Desperate for guidance, I turned to the manual.  I was looking at an Award Software CMOS Setup Utility.  The manual directed me to its Integrated Peripherals section.  There, I set OnChip SATA Controller to Enabled, OnChip SATA Type to RAID, and OnChip SATA Port4/5 Type to As SATA Type.  I hit F10 to save and exit.

According to the manual, that little maneuver was supposed to give me an option, after the initial boot screen, to hit Ctrl-F and go into the RAID configuration utility.  Instead, the next thing I got was this:

Press [Space] key to skip, or other key to continue...

I didn't do anything.  It scanned my drives and then led on to the Ctrl-F option.  I rebooted and tried it again.  Hitting the space key led to the same result.  Ctrl-F opened the AMD FastBuild Utility.  I hit option 2 to define an array.  This gave me a list of my two drives, labeled as LD 1 and LD 2.  Apparently it wasn't supposed to show anything.  LD was short for "logical disk set."  It was essentially showing two separate arrays, each having one drive.  So although the manual didn't say so, it seemed that I needed to get out of here and go into option 3 to delete these arrays.  I did that and then went back into option 2.  Now I was looking at a blank list of LDs, just like in the manual.

So now I was ready to prepare my array.  In option 2, I hit Enter to select LD 1.  This defaulted to RAID 0 with zero drives assigned to it.  I arrowed down to the Assignment area and put Y next to each of the two drives listed.  Now it said there were two drives assigned.  But now I had a couple of things to research.  The screen was giving me options for Stripe Block, Fast Initialize, Gigabyte Boundary, and Cache Mode.  The manual didn't say what these were.

I did a search for information on the Stripe Block size.  I found an old AnandTech article that took the approach of choosing the lowest stripe size where performance tended to level out -- where, that is, increasing the stripe size another notch did not increase performance.  For the RAID controllers they were testing, it looked like performance kept increasing right up to the range of 256KB to 512KB, for those controllers whose options went that high.  Mine only gave me a choice between 64KB and 128KB, so I chose the latter.  A more recent discussion thread seemed to support that decision.
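To make the stripe size idea concrete, here is a minimal Python sketch of how a two-drive RAID 0 array might map data onto its drives.  It assumes the simplest possible layout, in which stripe blocks just alternate between the drives; I do not know that the AMD controller lays data out exactly this way, and the function names are my own.

```python
from typing import Set, Tuple

STRIPE_BLOCK = 128 * 1024   # the 128KB setting chosen above
DRIVES = 2


def locate(offset: int, stripe_block: int = STRIPE_BLOCK,
           drives: int = DRIVES) -> Tuple[int, int]:
    """Map a byte offset in the array to (drive index, byte offset on that drive)."""
    block_index, offset_in_block = divmod(offset, stripe_block)
    drive = block_index % drives     # assumed: blocks alternate across the drives
    row = block_index // drives      # number of full rows before this block
    return drive, row * stripe_block + offset_in_block


def drives_touched(offset: int, length: int, stripe_block: int = STRIPE_BLOCK,
                   drives: int = DRIVES) -> Set[int]:
    """Return the set of drives that a read of `length` bytes at `offset` would hit."""
    if length <= 0:
        raise ValueError("length must be positive")
    first_block = offset // stripe_block
    last_block = (offset + length - 1) // stripe_block
    return {block % drives for block in range(first_block, last_block + 1)}


if __name__ == "__main__":
    print(drives_touched(0, 64 * 1024))    # a small read stays on one drive
    print(drives_touched(0, 256 * 1024))   # a larger read is split across both
```

In this model, a read smaller than one stripe block is served by a single drive, while a read spanning several blocks is shared between the drives.  That is one way to see why the best stripe size depends on the typical size of the reads.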

Regarding the "Fast Init" option, a search led to some advice saying that slow initialize would take longer but would improve reliability.  A different webpage clarified that the difference was that slow initialize would physically check the disk and would be suitable if you had had trouble with the disk or if you suspected it had bad blocks.  I decided to stay with the default, which was Fast Init ON.

The "Gigabyte Boundary" option would reportedly make the larger of two slightly mismatched drives in an array behave as though it were the same size as the smaller one.  The concept appeared to be that, if you were backing up one drive with another (which was not the case with a RAID 0 array), you would use this so that the larger drive would never contain more data than the smaller drive could copy.  Mine was set to ON by default.  I couldn't quite understand why anyone would need to turn it off, even if the drives were the same size.

Finally, the "Cache Mode" option was apparently capable of offering different choices (e.g., write-back), but mine was fixed at WriteThru with no other options available.  So I thought about it a long time and then decided this was acceptable to me.  So then I hit Ctrl-Y to save these settings.  Now I was back at the Define LD Menu, but this time it showed a RAID 0 array with two drives and Functional status.  That seemed to be all I could do there, so I exited that menu.  I poked around the other options on the Main Menu.  I seemed to be done with the FastBuild Utility.

Next, the manual wanted me to use a floppy disk to install the SATA RAID driver.  I could have just gone ahead and done that -- I still had a floppy drive and some blank diskettes -- but I thought surely there must be a better way by now.  Apparently there was:  use Vista instead of WinXP.  But if you were determined to use XP, as I was, the choices seemed to be either to go through a complex slipstreaming process or use the floppy.

There was, however, another option.  I could buy a RAID controller card, for as little as $30 or as much as $1,000+, and it might come with SATA drivers on CD.  This raised the question of whether the RAID cards actually had some advantage beyond their included CD.  My brief investigation suggested that a dedicated RAID card could handle the processing task, taking that load off the CPU, but that there wasn't much of a processing task in the case of RAID 0.  In other words, for my purposes, a RAID controller card wouldn't likely add any performance improvement.  Someone said it could even impair performance if it was a regular PCI card (as distinct from e.g., PCIe) or if its onboard processor was slower than the computer's main CPU.  There did seem to be a portability advantage, though:  moving the array to a different motherboard would require its re-creation, in at least some cases, but bringing along the controller card would eliminate that need -- though the flip side was that the card might fail first, taking the array with it.

Further reading led to the distinction between hardware and software RAID.  An older article made me think that the essential difference (since they all use hardware and software) was that software RAID would be done by the operating system and would run on the CPU, and would therefore be dependent upon the operating system -- raising the question of whether dual-booting would be impossible in a software RAID array, as a generally informative Wikipedia article suggested.  To get more specific, I looked at the manual for a popular motherboard, the Gigabyte GA-MA785GM-US2H.  That unit's onboard RAID controller, plainly enough, was like mine:  it depended upon the operating system.  Wikipedia said that cheap controller cards provide a "fake RAID" service of handling early-stage bootup, without an onboard processor to take any of the load off the CPU.  FakeRAID seemed to get mixed reviews.

An alternative, hinted at in one or two things I read, was simply to set up the RAID 0 array for the operating system in need of speed, and install a separate hard drive for the other operating system.  I was interested in speeding up Linux, so that would be the one that would get the RAID array.  I rarely ran Windows on that machine, so any hard drive would do.  A look into older, smaller, and otherwise seemingly less costly drives led to the conclusion that I should pretty much expect to get a 300GB+ hard drive, at a new price of around $45.  Since I was planning to use Windows infrequently on that machine, it was also possible that I could come up with some kind of WinXP on USB solution, and just boot the machine with a flash drive whenever I needed access to the Windows installation there.

I decided that, for the time being, I would focus on just setting up the Ubuntu part of this dual-boot system, and would decide what to do about the Windows part after that.  I have described the Ubuntu RAID 0 setup in another post.

Sunday, May 23, 2010

How to Arrange Cells from Many Columns into One Column in Excel 2003

Suppose you're using Microsoft Excel 2003.  Suppose you have data in multiple columns, in an irregular array, like this:


And suppose you want to get all of that data into column A, like this:


How should you proceed?

Summary

To arrange an irregular table so that all of its cells are in a single column, create a separate worksheet for your calculations.  Count the number of cells containing data, so that you can be sure your process works correctly.  Use the CELL function to return the locations of the cells that actually contain data.  Copy those results into Word.  Convert that table to text.  Use Find-and-Replace to shrink that text file.  Paste it back into Excel.  Use string functions and indexing as needed to arrange the list as you wish. Use the INDIRECT function to show the contents of the referenced cells.

If this post is helpful, please add a comment below.

Step by Step

In this answer, I'll take the slow route, spelling out each step, because that may turn out to be more efficient in the end.

First, find out how big your spreadsheet is.  From anywhere in the spreadsheet, hit Ctrl-Home.  Let's say that takes you to cell A1.  That's the upper left corner of your spreadsheet.  Now hit Ctrl-End.  Let's say that takes you to FB1765.  That's the lower right corner of your spreadsheet.  (That's a pretty big spreadsheet.)

With a spreadsheet that big, things can get confusing.  If it were smaller, you could do the conversion manually.  There are a couple of ways to do that.  One would be to sort each column, by itself, and cut and paste only those cells containing data to the bottom of column A.  Another way would be to do an AutoFilter (Data > Filter > AutoFilter) and cut and paste the results from each column.

But we have a big spreadsheet, and we want a faster and safer solution than we could get with a manual cut-and-paste operation.  So now open a new worksheet within the existing file.  That's Insert (from the menu bar) > Worksheet.  Do your work here.  This will give you more space to work in, and will protect your original spreadsheet from unwanted changes.  (I'm referring to "spreadsheet" and "worksheet" interchangeably here.)  So remember:  we won't be making any changes to your original spreadsheet; all of this will take place on other spreadsheets.

Let's say the original spreadsheet is called Multiword and this new spreadsheet is called Sheet1. In Sheet1, go to cell A1 and type this:

=IF(LEN(Multiword!A1)>0,"x","")

This says, if the length of cell A1 in Multiword is greater than zero (that is, if there's something in the cell, even just a spacebar space), then give me an "x"; otherwise, give me nothing.  This is useful because sometimes formatting can cause cells in Excel to behave as though there were something in them, when there's not.

Now let's copy that formula so that it covers the same territory in Sheet1 that your data occupy in Multiword.  Move your mouse cursor to the lower right corner of cell A1.  Your cursor will change into crosshairs.  Left-click on that lower right corner of cell A1 and drag it down to A1765.  Let go, and then left-click and drag it across to column FB, and let go.  This gives you an "x" corresponding to each cell in Multiword that contains data.  Now select it all (Ctrl-A) and then go to Format > Column > Width > 1.  (Or even narrower, e.g., .5.)  This gives you a more easily visualized map of how your data is laid out.  You could have done the same thing by just making the formula say =Multiword!A1, and this would have had the advantage of showing you the actual contents of Multiword, as you moved your cursor from one cell to another; but this can be easier to think about.  Besides, we can use those consistent "x" values.

Now let's see how many entries you should wind up with at the end.  In Sheet1, go to a cell outside your data map.  In this example, let's go to A1767.  There, type this:

=COUNTIF(A1:FB1765,"x")

That will tell us how many cells contain data.  It may not display correctly in cell A1767 because the column width is too narrow, but you can see what it says by either widening the column or going to A1767 and hitting F2 and then F9.  In my example, that shows me that I have 21,242 cells containing data.  So in my final result, I should wind up with data in cells A1 through A21242, and nowhere else.
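
If it helps to see the logic outside Excel, here is a minimal Python sketch of the map-and-count idea.  The grid is a tiny stand-in for Multiword (it uses the six-value example that appears further down), and the function names are my own, not anything Excel provides.

```python
# Tiny stand-in for the Multiword sheet: rows of cells, "" meaning empty.
multiword = [
    ["",  5,  ""],
    [3,  14,  ""],
    [2,  "",   6],
    ["",  8,  ""],
]

def occupancy_map(rows):
    """Mimic =IF(LEN(cell)>0,"x","") for every cell."""
    return [["x" if len(str(cell)) > 0 else "" for cell in row] for row in rows]

def count_marked(cell_map):
    """Mimic =COUNTIF(range,"x")."""
    return sum(cell == "x" for row in cell_map for cell in row)

cell_map = occupancy_map(multiword)
for row in cell_map:
    print("".join(cell or "." for cell in row))   # "." stands for a blank cell
print(count_marked(cell_map))
```

As in the spreadsheet version, a cell holding only a space still counts as occupied, because its length is greater than zero.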

Now let's say I like that map in Sheet1, and I want to save it, but I don't want it to take up calculation time.  I can freeze it all forever -- that is, I can convert it all to values instead of formulas.  To do this, go to A1 and hit Shift End-Home (i.e., while holding Shift, hit End and then Home) to select it all.  Hit Edit > Copy and then Edit > Paste Special > Values.  Hit the Enter key a couple of times, until it looks like it's done.  Now those formulas in Sheet1 will all be converted to simple "x" entries.  Save a version of the file for backup.  For instance, let's call it BigFile 01.xls, and then save again as a newer version (BigFile 02.xls).

Alright.  On to the main event.  Let's create another spreadsheet, Sheet2.  Here, in cell A1, enter this formula:

=IF(Sheet1!A1<>"x","",CELL("address",A1))

That tells Sheet2 to enter the location of cell A1 into cell A1, provided that Sheet1!A1 contains an "x"; otherwise it leaves the cell blank.  That is, if Multiword!A1 contains data, Sheet2!A1 will now say $A$1.  (Note that, if you didn't want to keep Sheet1 as a map, you could incorporate the LEN calculation (above) into this formula, and do both at the same time on Sheet1.)

In Sheet2, copy that formula to all cells, from A1 to FB1765, as described above.  Here's a before-and-after picture of what that would look like, if I were trying to do it all in a single spreadsheet:


In that example, what I want next would look like this:

$A$2
$A$3
$B$1
$B$2
$B$4
$C$3

This would be a step on the way to getting values like this:

3
2
5
14
8
6

So how do we do that?  In a big spreadsheet like mine, it's easier to do it in Microsoft Word.  So let's get Sheet2 ready for transfer.  Freeze Sheet2 as described above (with Paste Special etc.).  Hit Ctrl-A to select it all, and then Ctrl-C to copy it all.  In an empty Word document, hit Ctrl-V to paste it all.  With a big spreadsheet, this could take a while, as Word slowly gags on a couple hundred columns.  The result could be ugly -- mine was a pinstriped thing that didn't look like it contained any data at all -- but fear not.

When Word is done figuring it out, click somewhere on the resulting table.  Choose Table > Convert > Table to Text > Paragraph marks.  It will default to a checkmark in "Convert nested tables," which is fine.  Click OK.  After a couple of years, Word will give you a very ragged document, with lots of blank lines between rows.  (Mine was more than 3,000 pages long.)

These blank lines are easy to clean up with Find-and-Replace.  In Word, ^p is the Find-and-Replace code for a paragraph mark, which is what ends each line here.  So do a Find-and-Replace (Ctrl-H) to replace two paragraph marks with one.  In other words, replace ^p^p with ^p.  (Actually, before doing that, you may want to remove spacebar spaces before or after the ^p, else some lines may not get fixed.)  Repeat the ^p^p replacement until all of your cell references are in a nice list.  Word may continue to believe that it needs to remove a couple more ^p^p duplicates, but at some point you can tell it's lying.  Save the result.  Let's call it BigFile.doc.
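
For comparison, here is a small Python sketch of what that repeated ^p^p replacement accomplishes.  It assumes the text has been saved as plain text with ordinary "\n" line breaks (Word's own paragraph marks are a different character), and `tidy` is just a name I made up.

```python
import re

def tidy(text):
    """Strip stray spaces around each line, then collapse runs of line breaks."""
    lines = [line.strip() for line in text.split("\n")]
    joined = "\n".join(lines)
    # One pass handles any number of consecutive breaks, so no repeating is needed.
    return re.sub(r"\n{2,}", "\n", joined).strip("\n")

raw = "$A$2\n\n\n $A$3 \n\n\n\n$B$1\n \n$B$2\n"
print(tidy(raw).split("\n"))
```

Stripping the spaces first matters for the same reason it does in Word: a line containing only a space would otherwise keep two line breaks from being recognized as a pair.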

When I went through these steps with one file, they worked fine.  When I went through them with another file, however, I had a problem at this point.  The problem was that Word did not convert all of the lines properly.  It jammed a bunch of Excel cell contents together on the same line, instead of giving each its own line.  (I could tell:  I did a LEN in a separate column for each imported line in Excel, sorted on that column, and found that some were very long.)  To fix this, I had to search the Excel file to find a character that did not already occur in it (e.g., @ or `), and then revise the formula (above) so that it would stick that character on the end.  (Don't use ^ or ~ or other characters that don't turn up normal search results when you try to search for them.)  Then my first step in Word, after converting table to text, was to search for that character and replace it with ^p.

Another innovation, in that second try, was to combine the text and its cell location.  Using the data shown above, this gave me this kind of result:

3$A$2
2$A$3
5$B$1
14$B$2
8$B$4
6$C$3


That way, I could tell where the number (e.g., 3) had come from (e.g., cell A2), and I could use FIND and MID functions to put the numbers and their locations in separate columns.
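
The same split can be sketched in Python, where the string's `find` method plays the role of Excel's FIND and slicing plays the role of MID.  This is only an illustration, and it assumes the cell contents themselves never contain a dollar sign; `split_entry` is a hypothetical name.

```python
def split_entry(entry):
    """Split '14$B$2' into ('14', '$B$2') at the first dollar sign."""
    pos = entry.find("$")          # like FIND("$", cell), but 0-based
    if pos < 0:
        raise ValueError(f"no cell address in {entry!r}")
    return entry[:pos], entry[pos:]

for entry in ["3$A$2", "2$A$3", "5$B$1", "14$B$2", "8$B$4", "6$C$3"]:
    print(split_entry(entry))
```

If the data might contain dollar signs (prices, say), putting the address first and the contents second would make the split safer.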

Anyway, to continue.  If cell order is important for your purposes, sort the Word doc.  If it's not too big, you can do it in Word.  Hit Ctrl-A and then Table > Sort > Sort by Paragraphs.  Mine was too big, so I created a new Excel spreadsheet, Sheet3, and pasted it back into there.  Sure enough, I had my 21,242 entries in column A.  Excel didn't sort them the way I liked, though:  it had $A$9 after $AY$896.  This called for some use of the LEN function and the & operator.  For instance, if the LEN of the cell containing $A$9 is less than the LEN of the longest reference in the same column, then insert some zeroes (using MID and FIND and &) before the 9, so that the row numbers all have the same number of digits; and to get $A before $AY, consider using an Index column to rank the entries in the order they should go (with maybe a temporary addition before the $A).
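
To show the ordering I was after, here is a short Python sketch of a sort key for these references.  It isn't what I did in Excel (zero-padding and an index column do the same job there); it just treats the column letters and the row number separately.  The name `address_key` and the sample list are made up for the example.

```python
import re

def address_key(address):
    """Sort key for absolute references like '$AY$896':
    shorter column names first, then alphabetical, then row as a number."""
    match = re.fullmatch(r"\$([A-Z]+)\$(\d+)", address)
    if match is None:
        raise ValueError(f"not an absolute cell reference: {address!r}")
    column, row = match.groups()
    return (len(column), column, int(row))

refs = ["$AY$896", "$A$9", "$B$4", "$A$10", "$AA$1", "$A$2"]
print(sorted(refs, key=address_key))
```

Comparing the column's length before its letters is what puts $B ahead of $AA, and converting the row to a number is what puts $A$9 ahead of $A$10.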

Once you have your 21,242 (or whatever) references in the order you prefer, there in column A in Sheet3, enter this in B1:

="Multiword!"&A1

and enter this in cell C1:

=INDIRECT(B1)

Then copy cells B1 and C1 all the way down to the last row of your list.  You may want to save your work as a new file (BigFile 03.xls) and then freeze column C, and then delete all other columns and worksheets.
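
As a cross-check on the whole procedure, here is a rough Python sketch that goes straight from a grid to the single-column result, column by column.  It is not a replacement for the Excel steps, just a way to see what the end product should look like; the grid is the small six-value example from above, and the function names are hypothetical.

```python
def column_letters(index):
    """Convert a 0-based column index to Excel letters (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters

def flatten_by_column(rows):
    """Return (address, value) pairs for non-empty cells, column by column."""
    result = []
    width = max(len(row) for row in rows)
    for col in range(width):
        for row_number, row in enumerate(rows, start=1):
            if col < len(row) and len(str(row[col])) > 0:
                result.append((f"${column_letters(col)}${row_number}", row[col]))
    return result

grid = [
    ["",  5,  ""],
    [3,  14,  ""],
    [2,  "",   6],
    ["",  8,  ""],
]
for address, value in flatten_by_column(grid):
    print(address, value)
```

The addresses come out in the same order as the reference list shown earlier ($A$2, $A$3, $B$1, and so on), and the values match what INDIRECT would return for each one.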

*  *  *  *  *

Again, if this post is helpful, please add a comment below.  Cheers!