Descriptions of IPUMS Samples
1850-1930 Linked Representative Samples
- IPUMS 1% census samples from 1850-1930 linked to the 100% database from the 1880 census
- Contains 500,000 cases linked from the 1880 census to a surrounding census
- NOTE: this data is not available through the data extract system
- Data and documentation are available via the Linked Representative Samples page
1850 1% sample
- 1-in-100 national random sample of the free population.
- African-American slaves are not included in this dataset. Individual-level data on the 1850 slave population is available at the 1850-60 Slave PUMS website.
1860 1% sample
- 1-in-100 national random sample of the free population.
- African-American slaves are not included in this dataset. Individual-level data on the 1860 slave population is available at the 1850-60 Slave PUMS website.
1860 1% sample with black oversample
- 1-in-100 national random sample of the free population with
a 1-in-50 over-sample of the free African-American population.
- African-American slaves are not included in this dataset. Individual-level data on the 1860 slave population is available at the 1850-60 Slave PUMS website.
- This is a weighted sample.
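A minimal sketch of why an oversampled file needs weights, assuming simple inverse-probability design weights (100 for cases drawn at 1-in-100, 50 for cases drawn at 1-in-50). The `stratum` field and the `DESIGN_WEIGHTS` table are hypothetical names used only for illustration; the weights actually distributed with the sample may be constructed differently.

```python
from collections import defaultdict

# Hypothetical design weights: the inverse of each stratum's sampling rate.
# Assumption: a 1-in-100 case stands for 100 persons, a 1-in-50 case for 50.
DESIGN_WEIGHTS = {"general": 100.0, "oversampled": 50.0}


def weighted_count(records):
    """Sum the design weights per stratum to estimate population totals."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["stratum"]] += DESIGN_WEIGHTS[rec["stratum"]]
    return dict(totals)


# Toy data: 3 cases from the general stratum, 4 from the oversampled one.
sample = [{"stratum": "general"}] * 3 + [{"stratum": "oversampled"}] * 4
estimates = weighted_count(sample)
print(estimates)
```

Counting cases without weights would overstate the oversampled group's share of the population, which is why the weighted samples should not be analyzed as if every case represented the same number of persons.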
1870 1% sample
- 1-in-100 national random sample of the population.
1870 1% sample with black oversample
- 1-in-100 national random sample of the population with a 1-in-50
over-sample of the African-American population.
- This is a weighted sample.
1880 1% sample
- 1-in-100 national random sample of the population.
1880 10% sample with oversamples
- 1-in-10 national random sample of the population with a 1-in-5 minority oversample.
- This sample replaces the 1880 5% preliminary sample. It includes all cases from the 5% preliminary sample, drawn only from counties on odd-numbered microfilm reels, and adds data from counties on even-numbered reels.
- "Minorities" are defined as persons whose race was Native American or African American, whose race or birthplace indicated that they were Chinese, or whose name or birthplace indicated Hispanic origins. Households including a minority were sampled at a 1-in-5 rate.
- This is a weighted sample.
1880 100% Population Database (contains limited variables)
- Contains the complete 1880 population
- Data was entered and generously made available by the Church of Jesus Christ of Latter-day Saints.
- Name information is not available from the IPUMS website. Genealogists needing 1880 census data are advised to use FamilySearch.org. Academic researchers needing name information should apply for access via the North Atlantic Population Project (NAPP).
- The current version of the dataset was released in January 2010 and contains some revisions from an earlier version.
- Several variable groups were never entered. These include items relating to school, literacy, unemployment, disability, month of birth, marriage within the past year, and street address.
- The most detailed geographic variables are MCDSTRNG and INCSTRNG.
- Group quarters units containing more than 60 people were split into 1-person households. Researchers needing to study these units intact can use SERIAL80 and PERNUM80.
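As a rough illustration of reassembling split group quarters units, the sketch below groups person records by SERIAL80 and orders them by PERNUM80. It assumes that SERIAL80 is shared by everyone in the original unit and that PERNUM80 numbers persons within that unit; check the variable documentation before relying on this. The record layout (a list of dicts) and the `regroup_units` helper are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def regroup_units(persons):
    """Reassemble original units from records split into 1-person households.

    Assumption: SERIAL80 identifies the original unit and PERNUM80 gives
    each person's position within it.
    """
    ordered = sorted(persons, key=itemgetter("SERIAL80", "PERNUM80"))
    return {
        serial: list(members)
        for serial, members in groupby(ordered, key=itemgetter("SERIAL80"))
    }


# Toy records; AGE is included only to show that other fields are carried along.
persons = [
    {"SERIAL80": 7, "PERNUM80": 2, "AGE": 40},
    {"SERIAL80": 7, "PERNUM80": 1, "AGE": 35},
    {"SERIAL80": 9, "PERNUM80": 1, "AGE": 22},
]
units = regroup_units(persons)
for serial, members in units.items():
    print(serial, [p["PERNUM80"] for p in members])
```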
1900 1% sample
- 1-in-100 national random sample of the population, including Alaskans, Hawaiians, and American Indians.
- The 1900-1920 samples include data from Alaska and Hawaii, even though they were not states until 1959. The 1930-1950 samples do not include data from Alaska and Hawaii.
1900 1% sample with oversamples
- 1-in-100 national random sample of the population, with 1-in-5 oversamples of Alaskans, Hawaiians, and persons enumerated on the American Indian schedules.
- The 1900-1920 samples include data from Alaska and Hawaii, even though they were not states until 1959. The 1930-1950 samples do not include data from Alaska and Hawaii.
- This is a weighted sample.
1900 5% sample
- 1-in-20 national random sample of the population.
- This sample replaces the 1900 2.5% preliminary sample. It includes all cases from the 2.5% preliminary sample, drawn only from counties on odd-numbered microfilm reels, and adds data from counties on even-numbered reels.
- Alaska and Hawaii are not included in this dataset. Researchers needing data from those areas should use the 1900 1% sample with oversamples.
1900 0.1% sample
- 1-in-760 national random sample of the population, also known as the "Preston" sample.
- This sample is no longer available via the IPUMS extract system.
- It is still available via the IPUMS downloads page for those needing to reproduce previous research.
1910 1% sample
- 1-in-100 national random sample of the population, including Alaskans, Hawaiians, and persons enumerated on the American Indian schedules.
- The 1900-1920 samples include data from Alaska and Hawaii, even though they were not states until 1959. The 1930-1950 samples do not include data from Alaska and Hawaii.
1910 1.4% sample with oversamples
- 1-in-70 national random sample of the population, with large oversamples of Hispanics, Blacks, Alaskans, Hawaiians, and persons enumerated on the American Indian schedules.
- The 1900-1920 samples include data from Alaska and Hawaii, even though they were not states until 1959. The 1930-1950 samples do not include data from Alaska and Hawaii.
- This is a weighted sample.
1910 Puerto Rico Sample
- There are three combined samples in the 1910 Puerto Rican sample: a 1-in-10 sample; an over-sample of households in the municipality of Loiza (an African-descent enclave); and an over-sample of households located in coffee regions.
- This sample was originally made at the University of Wisconsin-Madison by Alberto Palloni, Halliman W. Winsborough, and Francisco Scarano. Some codes and values in the IPUMS sample differ from those in the original UW sample (which is available from ICPSR).
- This is a weighted sample.
1920 1% sample
- 1-in-100 national random sample of the population.
- The 1900-1920 samples include data from Alaska and Hawaii, even though they were not states until 1959. The 1930-1950 samples do not include data from Alaska and Hawaii.
1920 Puerto Rico Sample
- There are three combined samples in the 1920 Puerto Rican sample: a 1-in-10 sample; an over-sample of households in the municipality of Loiza (an African-descent enclave); and an over-sample of households located in coffee regions.
- This sample was originally made at the University of Wisconsin-Madison by Alberto Palloni, Halliman W. Winsborough, and Francisco Scarano. Some codes and values in the IPUMS sample differ from those in the original UW sample (which is available from ICPSR).
- This is a weighted sample.
1930 1% sample
- 1-in-100 national random sample of the population.
- The 1930-1950 samples do NOT include data from Alaska and Hawaii. Samples from 1900-1920 and 1960-present include data from Alaska and Hawaii.
1940 1% sample
- 1-in-100 national random sample of the population.
- This is a weighted sample.
- The 1930-1950 samples do NOT include data from Alaska and Hawaii. Samples from 1900-1920 and 1960-present include data from Alaska and Hawaii.
- Every household has one "sample-line" person who
answered additional census questions.
- Only places of at least 100,000 population can be identified
with any geographic variable.
1950 1% sample
- 1-in-100 national random sample of the population.
- This is a weighted sample.
- The 1930-1950 samples do NOT include data from Alaska and Hawaii. Samples from 1900-1920 and 1960-present include data from Alaska and Hawaii.
- Every household has one "sample-line" person who
answered additional census questions.
- Only places of at least 100,000 population can be identified
with any geographic variable.
1960 1% sample
- 1-in-100 national random sample of the population.
- The smallest identifiable geographic unit is state.
1970 1% Form 1 State sample
- 1-in-100 national random sample of the population.
- The smallest identifiable geographic unit is state.
- This sample was originally called the "5% state sample"
because Form 1 was given to 5% of the population.
- 1970 Form 1 samples contain a somewhat different set of variables
than Form 2 samples.
1970 1% Form 2 State sample
- 1-in-100 national random sample of the population.
- The smallest identifiable geographic unit is state.
- This sample was originally called the "15% state sample"
because Form 2 was given to 15% of the population.
- 1970 Form 2 samples contain a somewhat different set of variables
than Form 1 samples.
1970 1% Form 1 Metro sample
- 1-in-100 national random sample of the population. (Note, this
is not a sample only of metro areas.)
- The smallest identifiable geographic units are metropolitan
areas and county groups: combinations of counties totaling at
least 250,000 population.
- PLEASE NOTE: The 1970 Metro samples do not report state of residence for persons living in county
groups that straddle state boundaries. Researchers requiring complete state information should use one of the 1970 State samples.
- This sample was originally called the "5% county group
sample" because Form 1 was given to 5% of the population.
- 1970 Form 1 samples contain a somewhat different set of variables
than Form 2 samples.
1970 1% Form 2 Metro sample
- 1-in-100 national random sample of the population. (Note, this
is not a sample only of metro areas.)
- The smallest identifiable geographic units are metropolitan
areas and county groups: combinations of counties totaling at
least 250,000 population.
- PLEASE NOTE: The 1970 Metro samples do not report state of residence for persons living in county
groups that straddle state boundaries. Researchers requiring complete state information should use one of the 1970 State samples.
- This sample was originally called the "15% county group
sample" because Form 2 was given to 15% of the population.
- 1970 Form 2 samples contain a somewhat different set of variables
than Form 1 samples.
1970 1% Form 1 Neighborhood sample
- 1-in-100 national random sample of the population.
- The smallest identifiable geographic units are "neighborhoods"
of about 4000 population (approximately the size of census tracts).
The precise location of a given neighborhood is suppressed; only
its census region/division is provided. Households from 42,950
separate neighborhoods are included in the sample. Both the Form
1 and Form 2 neighborhood samples contain cases from each of the
42,950 neighborhoods (about 17 households per neighborhood from
each sample).
- PLEASE NOTE: The 1970 Neighborhood samples do not include a variable for state of residence. Researchers requiring complete state information should use one of the 1970 State samples.
- Neighborhood samples contain a set of additional variables
giving summary statistics for the neighborhood in 1970 (e.g.,
percent of population age 65 or older). The extra variables are
appended onto the end of the household record.
- This sample was originally called the "5% neighborhood sample"
because Form 1 was given to 5% of the population.
- 1970 Form 1 samples contain a somewhat different set of variables
than Form 2 samples.
1970 1% Form 2 Neighborhood sample
- 1-in-100 national random sample of the population.
- The smallest identifiable geographic units are "neighborhoods"
of about 4000 population (approximately the size of census tracts).
The precise location of a given neighborhood is suppressed; only
its census region/division is provided. Households from 42,950
separate neighborhoods are included in the sample. Both the Form
1 and Form 2 neighborhood samples contain cases from each of the
42,950 neighborhoods (about 17 households per neighborhood from
each sample).
- PLEASE NOTE: The 1970 Neighborhood samples do not include a variable for state of residence. Researchers requiring complete state information should use one of the 1970 State samples.
- Neighborhood samples contain a set of additional variables
giving summary statistics for the neighborhood in 1970 (e.g.,
percent of population age 65 or older). The extra variables are
appended onto the end of the household record.
- This sample was originally called the "15% neighborhood sample"
because Form 2 was given to 15% of the population.
- 1970 Form 2 samples contain a somewhat different set of variables
than Form 1 samples.
1970 1% Puerto Rico State sample
- 1-in-100 national random sample of the population.
- The smallest identifiable geographic unit is the state.
1970 1% Puerto Rico Municipio sample
- 1-in-100 national random sample of the population.
- The smallest identifiable geographic units are metropolitan areas and county groups: combinations of counties totaling at least 250,000 population.
1970 1% Puerto Rico Neighborhood sample
- 1-in-100 national random sample of the population.
- The smallest identifiable geographic units are "neighborhoods" of about 4000 population (approximately the size of census tracts). The precise location of a given neighborhood is suppressed; only its state is provided. Households from 380 separate neighborhoods are included in the sample.
- Neighborhood samples contain a set of additional variables giving summary statistics for the neighborhood in 1970 (e.g., percent of population age 65 or older). The extra variables are appended onto the end of the household record.
1980 5% State sample
- 1-in-20 national random sample of the population.
- No place smaller than 100,000 population can be identified
with any geographic variable. The most basic geographic variable
is the county group, which can be any combination of counties
or portions of counties that total 100,000 population. The state
sample privileges state identification over metropolitan area
identification. Where the combination of state and metropolitan
area would enable the identification of areas smaller than 100,000
population, the 1980 state sample suppresses the metropolitan
area information.
1980 1% Metro sample
- 1-in-100 national random sample of the population. (Note, this
is not a sample only of metro areas.)
- No place smaller than 100,000 population can be identified
with any geographic variable. The most basic geographic variable
is the county group, which can be any combination of counties
or portions of counties that total 100,000 population. The metro
sample privileges metropolitan area identification over state
identification. Where the combination of state and metropolitan
area would enable the identification of areas smaller than 100,000
population, the 1980 metro sample suppresses the state
information.
1980 1% Urban/rural sample
- 1-in-100 national random sample of the population.
- No place smaller than 100,000 population can be identified
with any geographic variable. This 1980 sample identifies urban
status, and the smallest geographic units provided are urbanized
areas (similar to metropolitan areas, but more specifically urban
in character). Some cities are given, but no metropolitan areas;
and many smaller states cannot be separately identified.
1980 1% Labor Market Areas sample
- 1% random sample of the state population.
- Labor Market Areas are defined by a hierarchical cluster analysis
of counties based on work-to-residence commuting patterns.
1980 1% Detailed Metro/Nonmetro sample
- 1-in-100 national random sample of the population.
- No place smaller than 100,000 population can be identified
with any geographic variable. This sample identifies urban and
rural place status within metropolitan areas, and the smallest
geographic units provided are urbanized areas (similar to metropolitan
areas, but more specifically urban in character). Some cities
are given, but no metropolitan areas; and many smaller states
cannot be separately identified.
1980 5% Puerto Rico sample
- 1-in-20 national random sample of the population.
- No place smaller than 100,000 population can be identified with any geographic variable. The most basic geographic variable is the county group, which can be any combination of counties or portions of counties that total 100,000 population.
1980 1% Puerto Rico sample
- 1-in-100 national random sample of the population.
- No place smaller than 100,000 population can be identified with any geographic variable. The most basic geographic variable is the county group, which can be any combination of counties or portions of counties that total 100,000 population.
1990 5% State sample
- 1-in-20 national random sample of the population.
- This is a weighted sample.
- No place smaller than 100,000 population can be identified
with any geographic variable. The most basic geographic variable
is the PUMA, which can be any combination of counties or portions
of counties that total 100,000 population. The state sample privileges
state identification over metropolitan area identification. Where
the combination of state and metropolitan area would enable the
identification of areas smaller than 100,000 population, the 1990
state sample suppresses the metropolitan area information.
- The Census Bureau re-released all 1990 PUMS data in 1993. There was a subsequent re-release of 1990 PUMS Group Quarters cases in 1996. The IPUMS is based on these re-released data.
1990 1% Metro sample
- 1-in-100 national random sample of the population. (Note, this
is not a sample only of metro areas.)
- This is a weighted sample.
- No place smaller than 100,000 population can be identified
with any geographic variable. The most basic geographic variable
is the PUMA, which can be any combination of counties or portions
of counties that total 100,000 population. The metro sample privileges
metropolitan area identification over state identification. Where
the combination of state and metropolitan area would enable the
identification of areas smaller than 100,000 population, the 1990
metro sample suppresses the state information.
- The Census Bureau re-released all 1990 PUMS data in 1993. There was a subsequent re-release of 1990 PUMS Group Quarters cases in 1996. The IPUMS is based on these re-released data.
1990 3% Elderly sample
- 1-in-33 national random sample of households containing at
least one person age 60 or older.
- This is a weighted sample.
- No place smaller than 100,000 population can be identified
with any geographic variable. The elderly sample follows the same
geographic identification system as the 5% state sample. The most
basic geographic variable is the PUMA, which can be any combination
of counties or portions of counties that total 100,000 population.
Like the 5% state sample, the elderly sample privileges state
identification over metropolitan area identification. Where the
combination of state and metropolitan area would enable the identification
of areas smaller than 100,000 population, the 1990 elderly sample
suppresses the metro area information.
- In addition to PUMAs, the elderly sample identifies the state
Planning Service Area (PSA) in which a household resided.
- The Census Bureau re-released all 1990 PUMS data in 1993. There was a subsequent re-release of 1990 PUMS Group Quarters cases in 1996. The IPUMS is based on these re-released data.
1990 1% Unweighted state sample
- 1-in-100 national random sample of the population.
- No place smaller than 100,000 population can be identified
with any geographic variable. The most basic geographic variable
is the PUMA, which can be any combination of counties or portions
of counties that total 100,000 population. The state sample privileges
state identification over metropolitan area identification. Where
the combination of state and metropolitan area would enable the
identification of areas smaller than 100,000 population, the 1990
state sample suppresses the metropolitan area information.
- The Census Bureau re-released all 1990 PUMS data in 1993. There was a subsequent re-release of 1990 PUMS Group Quarters cases in 1996. The IPUMS is based on these re-released data.
1990 0.5% Labor Market Areas sample
- An approximate 1-in-200 national random sample of the population.
- No place smaller than 100,000 population can be identified with
any geographic variable. The most basic geographic variable is
the Labor Market Area (LMA), which can be any combination of counties.
- The Census Bureau re-released all 1990 PUMS data in 1993. There was a subsequent re-release of 1990 PUMS Group Quarters cases in 1996. The IPUMS is based on these re-released data.
1990 5% Puerto Rico sample
- 1-in-20 national random sample of the population.
- This is a weighted sample.
- No place smaller than 100,000 population can be identified with any geographic variable. The most basic geographic variable is the PUMA, which can be any combination of counties or portions of counties that total 100,000 population.
1990 1% Puerto Rico sample
- 1-in-100 national random sample of the population. (Note, this is not a sample only of metro areas.)
- This is a weighted sample.
- No place smaller than 100,000 population can be identified with any geographic variable. The most basic geographic variable is the PUMA, which can be any combination of counties or portions of counties that total 100,000 population.
2000 5% sample
- 1-in-20 national random sample of the population.
- This is a weighted sample.
- The smallest identifiable geographic unit is the PUMA, containing
at least 100,000 persons. PUMAs do not cross state boundaries.
2000 1% sample
- 1-in-100 national random sample of the population.
- This is a weighted sample.
- The smallest identifiable geographic unit is the Super-PUMA
containing at least 400,000 persons. Super-PUMAs do not cross
state boundaries.
2000 1% Unweighted sample
- 1-in-100 national random sample of the population.
- The smallest identifiable geographic unit is the PUMA, containing
at least 100,000 persons. PUMAs do not cross state boundaries.
2000 5% Puerto Rico sample
- 1-in-20 national random sample of the population.
- This is a weighted sample.
- The smallest identifiable geographic unit is the PUMA, containing at least 100,000 persons.
2000 1% Puerto Rico sample
- 1-in-100 national random sample of the population.
- The smallest identifiable geographic unit is the PUMA, containing at least 100,000 persons.
American Community Survey 2000 sample
- 1-in-750 (approximately) national random sample of the population.
- The data do not include persons in group quarters.
- This is a weighted sample.
- No place smaller than state can be identified.
- The ACS questionnaire was nearly identical to the 2000 census
long form (the source for the 2000 census samples). The ACS contains
several questions on involvement in government programs and a
fertility question not asked in the census.
American Community Survey 2001 sample
- 1-in-232 (approximately) national random sample of the population.
- The data do not include persons in group quarters.
- This is a weighted sample.
- No place smaller than state can be identified.
American Community Survey 2002 sample
- 1-in-261 (approximately) national random sample of the population.
- The data do not include persons in group quarters.
- This is a weighted sample.
- No place smaller than state can be identified.
American Community Survey 2003 sample
- 1-in-236 (approximately) national random sample of the population.
- The data do not include persons in group quarters.
- This is a weighted sample.
- No place smaller than state can be identified.
American Community Survey 2004 sample
- 1-in-239 (approximately) national random sample of the population.
- The data do not include persons in group quarters.
- This is a weighted sample.
- No place smaller than state can be identified.
American Community Survey 2005 sample
- 1-in-100 national random sample of the population.
- The data do not include persons in group quarters.
- This is a weighted sample.
- The smallest identifiable geographic unit is the PUMA, containing
at least 100,000 persons. PUMAs do not cross state boundaries.
Puerto Rican Community Survey 2005 sample
- 1-in-100 national random sample of the population.
- The data do not include persons in group quarters.
- This is a weighted sample.
- The smallest identifiable geographic unit is the PUMA, containing at least 100,000 persons.
American Community Survey 2006 sample
- 1-in-100 national random sample of the population.
- The data include persons in group quarters.
- This is a weighted sample.
- The smallest identifiable geographic unit is the PUMA, containing
at least 100,000 persons. PUMAs do not cross state boundaries.
Puerto Rican Community Survey 2006 sample
- 1-in-100 national random sample of the population.
- The data include persons in group quarters.
- This is a weighted sample.
- The smallest identifiable geographic unit is the PUMA, containing at least 100,000 persons.
American Community Survey 2007 sample
- 1-in-100 national random sample of the population.
- The data include persons in group quarters.
- This is a weighted sample.
- The smallest identifiable geographic unit is the PUMA, containing
at least 100,000 persons. PUMAs do not cross state boundaries.
Puerto Rican Community Survey 2007 sample
- 1-in-100 national random sample of the population.
- The data include persons in group quarters.
- This is a weighted sample.
- The smallest identifiable geographic unit is the PUMA, containing at least 100,000 persons.
American Community Survey 2005-2007 3-Year sample
- 3-in-100 national random sample of the population.
- Contains all households and persons from the 1% ACS samples for 2005, 2006, and 2007, identifiable by year.
- The data include persons in group quarters except for the 2005 cases.
- This is a weighted sample.
- The smallest identifiable geographic unit is the PUMA, containing
at least 100,000 persons. PUMAs do not cross state boundaries.
- Users should read the FAQ on the 3-year data.
Puerto Rican Community Survey 2005-2007 3-Year sample
- 3-in-100 national random sample of the population.
- Contains all households and persons from the 1% PRCS samples for 2005, 2006, and 2007, identifiable by year.
- The data include persons in group quarters except for the 2005 cases.
- This is a weighted sample.
- The smallest identifiable geographic unit is the PUMA, containing at least 100,000 persons.
- Users should read the FAQ on the 3-year data.
American Community Survey 2008 sample
- 1-in-100 national random sample of the population.
- The data include persons in group quarters.
- This is a weighted sample.
- The smallest identifiable geographic unit is the PUMA, containing
at least 100,000 persons. PUMAs do not cross state boundaries.
- Users should read the summary of the 2008 ACS/PRCS.
Puerto Rican Community Survey 2008 sample
- 1-in-100 national random sample of the population.
- The data include persons in group quarters.
- This is a weighted sample.
- The smallest identifiable geographic unit is the PUMA, containing at least 100,000 persons.
- Users should read the summary of the 2008 ACS/PRCS.
American Community Survey 2006-2008 3-Year sample
- 3-in-100 national random sample of the population.
- Contains all households and persons from the 1% ACS samples for 2006, 2007, and 2008, identifiable by year.
- The data include persons in group quarters.
- This is a weighted sample.
- The smallest identifiable geographic unit is the PUMA, containing
at least 100,000 persons. PUMAs do not cross state boundaries.
- Users should read the FAQ on the 3-year data.
Puerto Rican Community Survey 2006-2008 3-Year sample
- 3-in-100 national random sample of the population.
- Contains all households and persons from the 1% PRCS samples for 2006, 2007, and 2008, identifiable by year.
- The data include persons in group quarters.
- This is a weighted sample.
- The smallest identifiable geographic unit is the PUMA, containing at least 100,000 persons.
- Users should read the FAQ on the 3-year data.