Winter 2005 Data Release

In November 2005, we added thirteen new, nationally representative samples to the IPUMS database. All of these samples were previously available on the IPUMS-Beta site, which has been shut down. Together, the new datasets add nearly 15 million cases to the IPUMS. All data are available via the IPUMS-USA extract system.

1900 General sample
1900 oversamples of Alaskans, Hawaiians, and American Indians
1930 Preliminary sample
1980 1% Labor Market Areas sample
1980 1% Detailed Metro/Nonmetro sample
1990 0.5% Labor Market Areas sample
2000 1% Unweighted sample
2001-2004 American Community Survey samples

1900 General sample

This final 1-in-100 sample of the 1900 census includes approximately 173,000 household and 760,000 person records. The smaller 1900 sample previously available (the 1-in-750 "Preston" sample) will no longer be available via the IPUMS extract system. Users wishing to access these data can still download the entire dataset and SPSS command file via the IPUMS raw data download page.

Oversamples of Alaskans, Hawaiians, and American Indians in 1900

As part of the project to create the 1900 1-in-100 sample, we created 1-in-5 oversamples of three populations: residents of Alaska, residents of Hawaii, and American Indians. All three populations were enumerated on special schedules, each of which contained several unique questions. Subsets of each of the three oversamples were included in the 1900 general sample. The 1-in-5 oversamples and codebooks are available only via the IPUMS raw data download page at this time.

1930 Preliminary sample

The original 1930 census manuscripts were made available to the public in April 2002, after the usual 72-year embargo on all census manuscripts. This preliminary 1-in-500 sample of the 1930 census includes approximately 62,000 household and 240,000 person records. This 1-in-500 sample is the first release of a 5-year project that began shortly after the 1930 census manuscripts were made public. The final 1-in-100 sample of the 1930 census will be released in 2007.

1980 1% Labor Market Areas sample

The sample design is identical to that of the 1980 1% sample, 5% sample, Urban/Rural sample, and the Detailed Metro/Nonmetro sample. Variable availability differs only slightly. What makes this sample unique is the low-level geographic identifier: the Labor Market Area (LMA). LMAs are groups of contiguous counties with a combined population of 100,000+ residents. The county composition of LMAs is based on a hierarchical cluster analysis of the journey-to-work data and county-to-county commuter flows (see the LMA variable description). Other variables unique to this sample are COMZONE, MIGLMA5, MIGCZ5, PWLMA, and PWCZ.

The sample contains approximately 862,000 household and 2,269,000 person records. Researchers interested in maximizing case counts can combine this sample with the other 1980 samples to achieve an overall 9% sample density.
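
The 9% figure can be checked with a little arithmetic. The following is a minimal sketch, not an IPUMS procedure: it assumes the 1980 samples do not overlap, that each is self-weighting, and that the Urban/Rural sample is a 1% sample (implied by the 9% total but not stated above). The dictionary and function names are hypothetical.

```python
from fractions import Fraction

# Sampling fractions of the 1980 samples named in the text.
# Assumption: the Urban/Rural sample is a 1% sample, as the 9% total implies.
SAMPLES_1980 = {
    "1% sample": Fraction(1, 100),
    "5% sample": Fraction(5, 100),
    "Urban/Rural sample": Fraction(1, 100),
    "Labor Market Areas sample": Fraction(1, 100),
    "Detailed Metro/Nonmetro sample": Fraction(1, 100),
}


def combined_density(fractions):
    """Sum the sampling fractions of non-overlapping samples."""
    return sum(fractions, Fraction(0))


def flat_weight(density):
    """Weight each case would carry if the pooled file were self-weighting."""
    return 1 / density


density_1980 = combined_density(SAMPLES_1980.values())
weight_1980 = flat_weight(density_1980)

print(f"combined density: {float(density_1980 * 100)}%")
print(f"flat weight per case: {float(weight_1980):.2f}")
```

Under these assumptions, each case in the pooled file would stand for 100/9 (roughly 11) persons rather than 100, so any weights carried over from the individual samples would need to be rescaled before producing population estimates.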

1980 1% Detailed Metro/Nonmetro sample

The sample design is identical to that of the 1980 1% sample, 5% sample, Urban/Rural sample, and the Labor Market Areas sample. Variable availability differs only slightly. What makes this sample unique is its set of low-level geographic identifiers: additional variables identify urban and rural places to a level of detail not available in any other 1980 sample (see URBRURAL, URBFRYN, OTHURBYN, RURALYN). Other variables unique to this sample are PWMET98E and MIGTYP5E.

The sample contains approximately 862,000 household and 2,270,000 person records. Researchers interested in maximizing case counts can combine this sample with the other 1980 samples to achieve an overall 9% sample density.

1990 0.5% Labor Market Areas sample

The sample design is identical to that of the 1990 1% and 5% samples, and variable availability differs only slightly. What makes this sample unique is the low-level geographic identifier: the Labor Market Area (LMA). LMAs are groups of contiguous counties with a combined population of 100,000+ residents. The county composition of LMAs is based on a hierarchical cluster analysis of the journey-to-work data and county-to-county commuter flows (see the LMA variable description). Other variables unique to this sample are MIGLMA5 and PWLMA.

The sample contains approximately 445,000 household and 1,140,000 person records. Researchers interested in maximizing case counts can combine this sample with the other 1990 samples to achieve an overall 6.5% sample density.

2000 1% Unweighted sample

To utilize the advantages of an unweighted sample design, the IPUMS provides a 1 percent unweighted sample, extracted from the 2000 5 percent file. This was created using the same method that we used to create the 1990 1 percent unweighted sample (see Sample Design notes for 1990). The result is a 1-in-100 unweighted national random sample of the population in which all cases have household and person weights of 100; the use of weights is optional with this sample.
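
Weights are optional here because a constant weight cancels out of any ratio statistic. The sketch below illustrates this with made-up ages; the values and the `weighted_mean` helper are hypothetical and are not drawn from the sample itself.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(value * weight) / sum(weight)."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical person records (ages); every case carries the same weight of 100.
ages = [34, 7, 61, 45]
weights = [100] * len(ages)

unweighted = sum(ages) / len(ages)
weighted = weighted_mean(ages, weights)

# With a constant weight, the two means agree.
print(unweighted, weighted)

# Weights still matter for population totals: each case stands for 100 persons.
population_estimate = sum(weights)
print(population_estimate)
```

Means, proportions, and similar statistics are unchanged by the weights in a sample like this; the weights are needed only to scale counts up to population totals.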

Variable availability is identical to that of the 2000 5 percent weighted sample. The sample contains 1,236,891 household and 2,808,457 person records.

2001-2004 American Community Survey samples

The American Community Survey (ACS) is a project of the U.S. Census Bureau that will eventually replace the decennial census. No place smaller than the state of residence can be identified in any ACS sample. Each sample includes data on approximately 500,000 households and 1,000,000 persons. For more information, see the ACS information page.