Sunday, September 25, 2005

Methods: Make Stata Do-Files Faster with a Spreadsheet

Say you've got a long list of feeling thermometer results in a data set like the National Election Study and you want to run chi-square tests on all of them against your favorite independent variable. A Stata do-file makes it easy, but composing your do-file in a spreadsheet program makes it even easier. NES groups its similar variables together, so in NES 2004's post-election study, the 45 (!) feeling thermometers are in variables V045043 to V045088. One way to get results is to type your first command ...

tabulate V045043 V043116, chi2

... then copy and paste it a bunch of times ...

tabulate V045043 V043116, chi2
tabulate V045043 V043116, chi2
tabulate V045043 V043116, chi2

... then go through each line with the arrow keys and the delete key, raising the first variable's number by one from the line above, until you've done it for every thermometer (say, 45 in all).

Or, you could let a spreadsheet (such as the free, open-source, largely Microsoft Excel-compatible OpenOffice Calc) do the work for you.

  1. Fire up a blank sheet and, in the first row, type your command (tabulate) in Column 1, your dependent variable (V045043) in Column 2, your independent variable (V043116) in Column 3, and ",chi2" in Column 4.
  2. Highlight the dependent variable you just typed and drag the fill handle (the small square at the bottom right of the cell) down as many rows as you need. The spreadsheet fills in a list of variable names whose number increases by one in each row.
  3. Finish by copying each of the other three elements, highlighting the empty cells below them, and pasting. (Copy and paste rather than drag here, because dragging may increment the numbers in those cells, too.)
  4. Highlight the whole thing, copy it, and paste it into Stata's do-file editor or your favorite text editor, and you're done. Stata doesn't care about the extra spacing between the pieces.
Be careful that some "helpful" auto-capitalization feature doesn't capitalize your tabulate command. Stata is case-sensitive, so "Tabulate" won't run.
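
For reference, here is roughly what the first few pasted lines should look like. This is only a sketch: it assumes the thermometer variable names really do run consecutively from V045043, which is what the fill handle will generate, so check the names against your codebook before running it.

```stata
* First few lines of the do-file built in the spreadsheet
* (variable names after V045043 are assumed to be consecutive)
tabulate V045043 V043116 ,chi2
tabulate V045044 V043116 ,chi2
tabulate V045045 V043116 ,chi2
```

The remaining lines follow the same pattern, one per thermometer. In the text you actually paste, the gaps between the pieces will be tabs rather than single spaces.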

Something similar would probably work to generate SPSS syntax, too, but I haven't tried it yet.
