IPUMS Complete Count Data

Completed Datasets

1940 Preliminary Complete Count Data: This file is the first public 100% 1940 file released through the IPUMS extract system. New versions of the data with additional variables and various improvements will be released over the next three years. A few notes about the preliminary 1940 data are below:

  • Households with more than 60 people in the original data were broken up for processing purposes. Each person in these large households is treated as a separate household. The original large households can be identified using the variable SPLIT40 and reconstructed using the variable SERIAL40; the original person count is found in the variable NUMPREC40.
  • Some variables are missing from this data set for specific enumeration districts. The enumeration districts with missing data can be identified using the variable EDMISS. These variables will be added in a future release.
  • Coded variables derived from string variables are still in progress. These variables include: occupation, industry and migration status.
  • We have allocated missing observations and edited some inconsistencies for the following variables: SURSIM, SEX, SCHOOL, RELATE, RACE, OCC1950, MTONGUE, MBPL, FBPL, BPL, MARST, EMPSTAT, CITIZEN, OWNERSHP. The flag variables indicating an allocated observation for the associated variables can be included in your extract by clicking the 'Select data quality flags' box on the extract summary page.
  • Most inconsistent information was not edited for this release, so some observations fall outside the universe for many variables.
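
As a rough illustration of the first note above, the sketch below regroups split person records into their original large households and checks the regrouped size against the recorded count. It is only a sketch under stated assumptions: the record layout (one dictionary per person), the coding SPLIT40 == 1 for a person from a split household, and the helper names rebuild_large_households and size_mismatches are hypothetical and not taken from the IPUMS documentation. Check the variable descriptions before relying on any of them. The toy households are far smaller than 60 people to keep the example short.

```python
from collections import defaultdict


def rebuild_large_households(persons):
    """Group persons from split households by their original household serial.

    Assumed (hypothetical) coding: SPLIT40 == 1 marks a person whose original
    household was broken up; SERIAL40 holds the original household serial.
    """
    households = defaultdict(list)
    for person in persons:
        if person["SPLIT40"] == 1:
            households[person["SERIAL40"]].append(person)
    return dict(households)


def size_mismatches(households):
    """Return the serials whose rebuilt size differs from NUMPREC40."""
    return sorted(
        serial
        for serial, members in households.items()
        if any(m["NUMPREC40"] != len(members) for m in members)
    )


# Toy records, made up for illustration (not real census data).
sample = [
    {"SPLIT40": 0, "SERIAL40": 1, "NUMPREC40": 1},
    {"SPLIT40": 1, "SERIAL40": 900, "NUMPREC40": 3},
    {"SPLIT40": 1, "SERIAL40": 900, "NUMPREC40": 3},
    {"SPLIT40": 1, "SERIAL40": 900, "NUMPREC40": 3},
    {"SPLIT40": 1, "SERIAL40": 901, "NUMPREC40": 3},
    {"SPLIT40": 1, "SERIAL40": 901, "NUMPREC40": 3},
]

households = rebuild_large_households(sample)
incomplete = size_mismatches(households)
```

In this toy input, household 901 records three original members but only two appear, so it is the one reported by size_mismatches. A mismatch like this would suggest that an extract does not contain every member of the original household.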

1880: This dataset was developed through a collaboration between the Minnesota Population Center and the Church of Latter-Day Saints; these complete count data have been available since 2008. Data are available through the North Atlantic Population Project (with names) and IPUMS-USA (without names).

1880 Linked Representative Samples: This database links records from the 1880 complete count dataset to 1% samples of the 1850 to 1930 U.S. Censuses. Linked samples are available for seven pairs of years: 1850-1880, 1860-1880, 1870-1880, 1880-1900, 1880-1910, 1880-1920, and 1880-1930. Each pair contains three independent linked samples: one of men, one of women, and one of married couples. See the Linked Samples page for details.

Datasets in Progress

1850: The result of a recent collaboration between the Minnesota Population Center and the Church of Latter-Day Saints, the 1850 complete count database is under development. The data will be released on a rolling, state-by-state basis over the next 9 months. These data, with names, will be available from the North Atlantic Population Project (18 states and the District of Columbia are available now). When the project is complete, a version without names will be available through the IPUMS-USA extract system.

1790-1930: Our largest new microdata collection will capitalize on donations of digitized census data, on an unprecedented scale, from both Ancestry.com and FamilySearch. The 1790-1930 microdata include a core set of variables for every census year: geographic location, age, sex, race, and name. Birthplace information is available in all but a few of the early years. From 1880 forward the data also include marital status, the relationship of each individual to the household head, and the birthplace of each individual's mother and father, which allows the identification of second-generation Americans. Other key variables, such as year of immigration, duration of marriage, literacy, occupation, children ever born, children surviving, and disability, are available only in some census years.