General Instructions for Opening an Extract on Your PC

The IPUMS extract system does not provide data in the format of any particular statistics package. Instead, we provide compressed data files and the command files necessary for reading the raw data into R, SPSS, Stata, and SAS.

Step 1: Download the data and command files

On the Download or Revise Extracts page, you will see your extract requests along with the creation date, optional description, and six links on the right side of the page: data, codebook, SPSS, SAS, Stata, and R.

Download the data file and the command file. To get the data, right-click on the "data" link and select "Save Link As". Do the same for the command file link that matches your statistics package: "R", "SPSS", "SAS", or "Stata".

Step 2: Decompress the data file

Note: If you are using R, you do not need to decompress the data file. Instead, make sure you download the DDI codebook (an ".xml" file).

The downloaded data file should have the suffix ".dat.gz", as in "usa_00001.dat.gz". This file needs to be decompressed. macOS will decompress the file when you double-click it, and many Windows-based computers will do the same. If your Windows-based computer cannot decompress the file, you need to download decompression software. A free option is 7-Zip; other programs are also available.

When decompression is complete, you should see a file with the suffix ".dat" (decompression removes the ".gz" part of the suffix). Note the path to the location of the ".dat" file on your computer. If you are unsure about the path to the file, right-click on the file and choose "Properties" (or, on Mac, "Get Info"). The path is shown in the "Location:" field of the Properties window or the "Where:" field of the Get Info window.
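
If no decompression tool is available, Step 2 can also be done with a short script. The sketch below is not part of the IPUMS tooling; it assumes Python 3 is installed, and the function name decompress_extract is a hypothetical helper chosen for illustration. It writes the ".dat" file next to the ".dat.gz" file and returns the full path that Step 3 asks for.

```python
import gzip
import shutil
from pathlib import Path


def decompress_extract(gz_path):
    """Decompress a .gz file into the same folder and return the absolute output path.

    Hypothetical helper for illustration; not provided by IPUMS.
    """
    gz_path = Path(gz_path)
    if gz_path.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {gz_path.name}")
    # Dropping the final suffix turns "usa_00001.dat.gz" into "usa_00001.dat".
    dat_path = gz_path.with_suffix("")
    # Copy as a stream so a large extract is never held in memory all at once.
    with gzip.open(gz_path, "rb") as src, open(dat_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dat_path.resolve()
```

For example, calling decompress_extract("usa_00001.dat.gz") from the folder that holds the download would create "usa_00001.dat" there and return its full path.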

Step 3: Modify the command file

The final step is to modify your command file to indicate the location of the ".dat" file on your computer. These instructions vary slightly for each statistics package. The examples below assume that you are working with a data file called "usa_00001.dat" that is stored in a folder on your "C:" drive called "IPUMS", so that the full path to your dataset is "C:\IPUMS\usa_00001.dat".

SPSS:

A line in the command file (the ".sps" file) will read
data list file ='usa_00001.dat'/

Change that line to read
data list file ='C:\IPUMS\usa_00001.dat'/

Pull down the "Run" menu and select "All". SPSS will then read in your data.
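
The path edit described above could also be scripted before you run the ".sps" file, which may help if you work with many extracts. This is only a sketch under stated assumptions: it assumes Python 3 is installed, that the command file contains the bare file name in single quotes exactly as shown on this page, and the function name point_sps_at_data is hypothetical.

```python
from pathlib import Path


def point_sps_at_data(sps_path, dat_name, dat_full_path):
    """Replace the quoted bare data file name in an SPSS command file with a full path.

    Hypothetical helper for illustration; not provided by IPUMS.
    Returns the number of replacements made (0 if the name was not found).
    """
    sps_path = Path(sps_path)
    text = sps_path.read_text()
    # Assumption: the file name appears in single quotes, e.g. 'usa_00001.dat'.
    old = f"'{dat_name}'"
    new = f"'{dat_full_path}'"
    count = text.count(old)
    if count:
        # Only rewrite the file when there is something to change.
        sps_path.write_text(text.replace(old, new))
    return count
```

With the example files on this page, point_sps_at_data("usa_00001.sps", "usa_00001.dat", r"C:\IPUMS\usa_00001.dat") would make the change shown above. Running it a second time changes nothing, because the quoted bare file name no longer appears in the file.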

SAS:

One line in the command file (the ".sas" file) will read
libname ipumsdat '.';

Change that line to read
libname ipumsdat 'C:\IPUMS\';

Pull down the "Run" menu and select "Submit." SAS will then read in your data.

Stata:

Open Stata. Change directories to the location containing your ".dat" and ".do" files by typing
cd "C:\IPUMS\"

Then type
do usa_00001.do

You will see "end of do-file" when Stata has finished reading in the data.

R:

Open RStudio. Change your working directory to the location containing your data file (".dat.gz" or ".dat") and your ".xml" file, either manually or by typing
setwd("C:/IPUMS/")

If you haven't already installed the ipumsr package, install it with install.packages('ipumsr'). Then run the .R file downloaded from IPUMS to load your extract into R. Alternatively, you can load your extract manually with the following code:
library('ipumsr')
ddi <- read_ipums_ddi("usa_00001.xml")
data <- read_ipums_micro(ddi)

When R is finished reading the data, you will see an object named "data" in your global environment.
