Saturday, October 24, 2009

How to Put Batteries into the Jia Hao LED Bivouac Light

OK, one of the goofier topics I've blogged on, but why not?  I bought the 11-LED (that's right, eleven LEDs) lantern, model JH-2588-11, which comes in a box labeled "LED Bivouac Light" from a company called Jia Hao.  Here it sat, and I was damned if I could figure out how to put the batteries in it.

I found the place on the bottom easily enough, but how to open it?  I couldn't tell quite what the little black button was supposed to do.  Turns out the solution was just to push straight in (i.e., toward the battery compartment), not down (i.e., toward the top of the lantern).

In case you haven't yet bought it, but are wondering whether you should, I can't say.  I just got it.  But it does seem basically real and functional, has a little fold-out hanger on top, and throws a fair amount of light.  Here's how it did in a totally dark bathroom:

In case you're not sure what to make of that photo, what it basically says is that I have enough light to read pretty comfortably, when my magazine is right next to it, and I have enough dim light to read when I'm four or five feet away from it.  I didn't try climbing into the tub to read from a position further away.  I think there's probably enough light from this thing to distinguish objects from one another at least 15-20 feet away, but that remains to be tested.

Sunday, October 11, 2009

Making a Map with Epi Info

I wanted to create a map that would show changes in some kinds of data on a county-by-county basis for the state of Indiana. To create this map, I downloaded and installed the free Epi Info program from the CDC. To get the outline of Indiana and its counties, I downloaded the appropriate TIGER/Line shapefile from the Census Bureau. I had previously tried downloading one from another source, but it turned out not to be able to accommodate county-by-county data.

I had prepared some data in an Excel spreadsheet. I imported it into Epi Info (Analyze Data > Data > Read (Import)). I ran into some problems with this. One problem was that I had a hyphen in the filename, and Epi Info couldn't deal with that. I was also not quite sure what to do with this data, so I backed up and tried using some canned data from a standard source. I went to the Census Bureau's USA Counties Data Files webpage and downloaded the POP01.xls spreadsheet. I edited out the non-Indiana data and saved that spreadsheet. I imported it into Epi Info. I tried to export just one variable in Epi 2000 format and got an error message: "The name specified for the output table is reserved." A Google search turned up no references to that error. I thought that meant I was using the name of an already existing file, but when I tried a different (new) file name, I got the same thing. The file had indeed been created, but there was nothing in it.

A less restrictive search led to, among other things, an Epi Info Training Manual on the website of the Department of Food Science and Human Nutrition at Iowa State. (You can major in food? Why didn't I think of that?) Their manual said that having a space in your filename could cause problems, but that wasn't the problem for me. Finding that manual made me think of searching for Epi Info manuals specifically, and that led me to the CDC's manual for Epi Info for DOS. I couldn't figure out where that had downloaded itself to, but I finally found it (epi6man.exe) in My Documents\Downloads. When I ran it from a DOS prompt, though, it gave me "Access Denied." Apparently it was created in 1992 (or maybe 1994) and, as such, was designed for old-style DOS systems, not for my CMD window in Windows XP. I found another, much shorter manual produced by the Great Lakes Epidemiology Center. But neither of these seemed to explain this problem.

What I finally figured out was this. I import the data from a spreadsheet or database file. I list the imported contents. So far, so good. Now I want to write the output in Epi 2000 format (which is just .mdb format, i.e., Access). Here's the trick. I didn't need to do this on the computer where I first used Epi Info, but I needed to do it on my home installation. To write the data successfully, *temporarily* choose the same format and filename as your input file, as if you were going to overwrite that file. This will make your tables show up in the Data Table drop-down box. Select the table you want to use. Don't click OK yet. Go back to the File Name box and change it to the new name that you want to create. If necessary, select the table name again. You may have to just type the file name and let it save itself wherever, in order to avoid the problem that comes from browsing to the desired location.

So that solved that problem.

Next, in Epi Info, I clicked on the Map button to open Epi Map. In the upper-left corner, I clicked on the button that looked like a stack of three sheets of paper. This opened Map Manager. I clicked Add Layer and navigated to my Indiana state and counties shapefile (above). This was where I had finally figured out that the other shapefile I had downloaded previously did not have a capacity for county data: its Add Data button was grayed out. This time around, that was not a problem. I clicked Add Data and navigated to the .mdb file I had just written from Analyze Data. After clicking a couple of buttons, it gave me a Select Relate Fields and Render Field dialog with three panes. In the left pane, it showed Shape Fields. In the center, it showed the Geographic Field from my .mdb file, which I had called the County field.

I figured I needed to put my County names in a form that the related Shape Field would recognize, so that the data for Adams County would actually appear in the Adams County part of the map. But what was the right form? I went back to the files that I had unzipped when I had downloaded the shapefile. There weren't any Read-Me or .pdf files there. There was an .xml file that looked like it contained what I wanted, but I had to try several browsers before I found that Google Chrome would at least show its contents in a normally readable (although unformatted) form. It didn't have much info after all, but it did point me toward a page on TIGER (short for Topologically Integrated Geographic Encoding and Referencing system) products. That led to the shapefiles main page, which led to their technical documentation page, where I downloaded their full 185-page Technical Documentation manual. It was already nicely bookmarked (though for some reason it didn't open up that way), and I groped around in it for a while.

As I looked at the Select Relate Fields dialog, I saw two likely candidates: CNTYIDFP00 and COUNTYFP00. One of the unzipped files accompanying the shapefile was a .dbf file, so I tried viewing it in Access, but no go. I tried opening it in Epi Analyze as a dBASE IV file, but it said, "Filenames for this data format must be in the old 8.3 style." So I copied that .dbf file and called the copy TEMP.dbf, and tried opening that in Epi Analyze. I did a Statistics > List and there we were, and I saw there was a third field I had not counted on: the NAMELSAD00 field had complete county names, just like they were in my massaged spreadsheet: Adams County, etc. So, OK, bailing out of that, back in Epi Map I related their NAMELSAD00 field to my County field and selected, as the Render Field, the first of the several (actually, ten) years' worth of data that I wanted to map. Sadly, this gave me an OpenData dialog that said, "There were no matches found between the data table field and the shapefile field." Looking again at the structure of my data table in Access, I saw that the County field was Text type, field size 255. To get comparable information on the shapefile, I went back into that TEMP.dbf file in Epi Analyze and chose Variables > Display. This told me that NAMELSAD00 was a text field, but not its length. I tried a Write (Export) to a TEMP.mdb file in Epi 2000 (probably could have used Access 2000) format. Finally, I got the information I wanted: the NAMELSAD00 field was configured exactly the same as my County field. So what was I doing wrong?

A Google search for that error message turned up nothing. A search for "no matches" in the Technical Documentation .pdf produced - you guessed it - no matches. Once again, a less precise Google search led to some possibilities, including a Cardiff Council manual, presented in Scribd format, but I wasn't getting much mileage there either. A remark in the Technical Documentation said, "Federal Information Processing Series codes will continue to serve as the key matching and joining codes for Census Bureau products." So I thought maybe I should try linking on a numeric field instead of the County name field. In Access, I created another version of my data table, with additional fields for CNTYIDFP00 (which I made primary key) and COUNTYFP00. This dropped two records, due to county names that contained spaces or differed between the two, but I manually reinserted those and marched on. Back in Epi Map's Select Relate Fields dialog, I designated CNTYIDFP00 as my Relate field. That worked. I had myself a map. The last step was to tinker with Map Manager's Properties button, which took a lot of time but yielded great improvement in appearance.
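When a name-based join finds no matches, one way to see what is going on is to compare the two key columns directly, outside of Epi Info. What follows is only a minimal sketch in Python, not anything Epi Map does internally. The sample values are made up for illustration, and the helper names compare_keys and normalize are hypothetical. It lists which keys appear on only one side, first as-is and then after trimming whitespace and ignoring case.

```python
def compare_keys(shape_keys, data_keys):
    """Report which join keys match and which appear on only one side."""
    shape_set = set(shape_keys)
    data_set = set(data_keys)
    return {
        "matched": sorted(shape_set & data_set),
        "only_in_shapefile": sorted(shape_set - data_set),
        "only_in_data": sorted(data_set - shape_set),
    }


def normalize(name):
    """Collapse runs of whitespace, trim the ends, and ignore case."""
    return " ".join(name.split()).casefold()


# Made-up sample values, for illustration only (not taken from the real files).
shape_names = ["Adams County   ", "Allen County", "St. Joseph County"]
data_names = ["Adams County", "Allen County", "St Joseph County"]

raw = compare_keys(shape_names, data_names)
clean = compare_keys(
    [normalize(n) for n in shape_names],
    [normalize(n) for n in data_names],
)

print("Unmatched before cleanup:", raw["only_in_shapefile"])
print("Unmatched after cleanup: ", clean["only_in_shapefile"])
```

In this made-up sample, trimming fixes the padded name but not the one that is spelled differently. That sort of leftover is the reason a code field such as CNTYIDFP00 tends to be a safer thing to join on than a name.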

I repeated the last few steps for each of the ten years in my study period. For each finished map, I took a screenshot (Print Screen button on keyboard) and pasted it into IrfanView, where I did some batch cropping so that the state map would remain in the same position on all ten screenshots.  The shapefile I had used also had distorted the state, making it look shorter and thicker than it normally is on maps, so I used IrfanView to batch-adjust the dimensions of each screenshot.  I then imported the images into Adobe Premiere Elements and added transitions and titles to show the year.  I posted the result on YouTube and have also posted an explanation of what it's all about, here on my blog.

Video: Ten Years of Income and Poverty Fluctuations in Indiana

I used Epi Info to make a map of some statistical information from Indiana for 1998.  I made another map of the same data for 1999, and so on through 2007.  I treated the maps as still photos and made a video of them, which is now available on YouTube.  This post explains what the video shows.

The video shows a map of Indiana and its counties over a ten-year period, from 1998 through 2007 inclusive.  The counties are represented in various colors.  The colors show whether counties fared well or poorly during each of those years.  The best outcomes are in deep green.  Nearly neutral outcomes are on the boundary between light green and yellow.  The worst outcomes range from yellow through orange to red.

The calculations behind these maps begin with year-by-year data from the U.S. Census Bureau.  The specific data sources used are described in the other post.   These maps show the relationship between two streams of county-by-county data.  One is the per capita income, adjusted for inflation.  The other is the number of people in poverty. 

For both of these data streams, I calculated rates of change.  So, for example, the video begins with 1998.  Warren County, at the left edge of the state, appears in red.  Red indicates an extreme divergence between changes in per capita income and in poverty rates.  In most if not all cases, moreover, red indicates that the divergence is undesirable.

In the case of Warren County in 1998, the situation is as follows.  Per capita income dropped very slightly (i.e., by only 0.2%), from $23,577 in 1997 to $23,527 in 1998.  Unfortunately, the number of persons in poverty rose 13.5%, from 644 in 1997 to 731 in 1998.  So there was not a general recession or other drop in earnings shared equally by everyone.  Indeed, the stable per capita income raises the question of how many people actually experienced an increase in income.

Warren County stands out, in 1998, because the ratio of its change in poverty rate to its change in per capita income was greater than 60:1.  It stands out in red because that change signals bad news for poor people.  It would have stood out in green if the ratio had been 60:1 in poor people’s favor – if, that is, there had been a slight increase in income and a dramatic decrease in poverty.  In that case, it would seem that the county channeled much of its additional prosperity, that year, into an improvement in the conditions of the poor.
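For anyone who wants to check the Warren County arithmetic, here is a small sketch in Python. It uses only the four figures quoted above. The function name pct_change is just a label I chose for this example, and taking the absolute value of the ratio is my simplification for display, not necessarily how the original spreadsheet handled signs.

```python
def pct_change(old, new):
    """Percent change from old to new."""
    return (new - old) / old * 100.0


# Figures quoted in the post for Warren County, 1997 -> 1998.
income_change = pct_change(23577, 23527)   # per capita income, in dollars
poverty_change = pct_change(644, 731)      # number of persons in poverty

# Size of the poverty change relative to the income change.
ratio = abs(poverty_change / income_change)

print(f"Income change:  {income_change:.1f}%")
print(f"Poverty change: {poverty_change:.1f}%")
print(f"Ratio: roughly {ratio:.0f}:1")
```

The result comes out a little above 60:1, which is what puts Warren County in the red band for 1998.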

The color scheme used on these maps, then, ranges from red down through orange to yellow, as the bad news for poor people becomes progressively less bad, and from yellowish green up through deep green as the good news for poor people becomes progressively better.  The color gradations go in steps of twenty:  that is, red is for a ratio of worse than (negative) 60:1; a dark shade of orange accounts for ratios between 40:1 and 60:1; a lighter shade of orange represents ratios between 20:1 and 40:1; and so on down to zero and then up through the deepest green at 60:1.  There is nothing magical about those particular gradations.  They were chosen for simplicity.  The seas of yellow and green that appear in a few years depicted in the video suggest that closer gradations might have provided more information.
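The steps of twenty can be written out as a simple lookup. This is only a sketch in Python, under several assumptions of mine. The score is a signed ratio, negative when the news is bad for poor people. The names of the green shades are my guesses, since only red, the two oranges, yellow, and deep green are named above. A value sitting exactly on a boundary goes to the lower band. The actual maps were colored through Map Manager's Properties dialog, not with code like this.

```python
import bisect

# Band edges in steps of twenty. Scores above 60 fall into the last band.
EDGES = [-60, -40, -20, 0, 20, 40, 60]

# One color per band, from worst to best. Green shade names are assumed.
COLORS = [
    "red",              # worse than -60
    "dark orange",      # -60 to -40
    "light orange",     # -40 to -20
    "yellow",           # -20 to 0
    "yellowish green",  # 0 to 20
    "light green",      # 20 to 40
    "medium green",     # 40 to 60
    "deep green",       # better than 60
]


def color_for(score):
    """Return the band color for a signed ratio (negative = bad news)."""
    return COLORS[bisect.bisect_left(EDGES, score)]


for score in (-63.7, -50, -30, -10, 10, 70):
    print(score, "->", color_for(score))
```

Narrowing the bands, as suggested above, would just mean adding more entries to EDGES and COLORS.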

One step I took that now appears to have been a mistake was to eliminate a half-dozen extreme values that I considered outliers.  Had I not done that, there would have been a handful of additional counties shown in the deepest reds and greens throughout this ten-year period.

This presentation does not purport to be definitive, or even scholarly.  Along the lines suggested in the refinements just mentioned, a high-quality product would call for manual analysis of a number of counties, like the analysis of Warren County provided above, so as to ensure that representative and appropriate colors were used for all counties.  Data and calculations used here have not been carefully proofread.  The spot checks that I have done do seem to indicate accuracy in the basic calculations.

One technical refinement that will become more feasible in future years, as data become more readily available, could involve a finer-grained analysis by zip code and/or census tract.  Another refinement worth considering would be to overlay an indication of population centers.  Finally, if the video were converted to, say, a PDF, it would be possible to create links or tooltips for each county, so that mousing over or clicking on a county would bring up or lead to the underlying data.

The video suggests some areas for further inquiry.  It appears, in my review thus far, that a number of the most extreme contrasts appear in counties in the regions of Chicago, Evansville, Indianapolis, and Louisville – and also around West Lafayette.  Also, it seems that some counties tend to experience the same trends:  they are the same colors as one or more of their neighbors in most if not all of the years depicted.  There also appear to be years of greater and lesser homogeneity among the counties – such as the contrast between 1998 and 1999.  Closer investigation of sharp divergences among neighboring counties (such as in 2004) could also lead to indicia of balkanization, where large employers or governmental policies yield marked departures from (and possibly distortions in) the general tendency in the state for the year.

As noted in the other post, there were some technical difficulties in the preparation of this video.  It was, nonetheless, an interesting project.  I hope the links provided in these posts, and the techniques used in the video, lead me and/or others to undertake further analyses of this kind.