Linked Data Home | Data Description | Linking Method | Get Data


Linked Census Data Extracts

- MLP version 1.1: released March 2023 -

The IPUMS dissemination system produces data files designed for cross-census linking. On the sample selection tab for full-count data, select the "link census data" checkbox to create a data extract that identifies individuals across multiple censuses from 1850 to 1940. The extract system will automatically include the Historical Identification Key (HIK) variable, which uniquely identifies persons across census years. The census links were created by the IPUMS Multigenerational Longitudinal Panel (MLP) project. Roughly one-quarter to one-third of individuals are linked between censuses. See the hyperlinks at the top of this page for more information on the MLP project and the construction of the census links.

Note: The extract system is not yet updated with version 1.2 links, which include 1950 census links and direct 20- and 30-year intercensus links. Those links are currently available only through crosswalk files.

Number of linked individuals across censuses

Census years    Linked persons
1850-1860            5,997,363
1860-1870            8,925,067
1870-1880           13,798,054
1880-1900           10,705,233
1900-1910           30,313,883
1910-1920           37,205,366
1920-1930           44,642,307
1930-1940           52,480,797

Data format. The linked data extract will be produced in standard IPUMS format: all the person records for census year X followed by all person records for census year Y, etc. Users may use HIKs to sort this into a long file, restructure the data to a wide file, or otherwise manipulate the data as appropriate for their application.
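The sorting and restructuring described above can be sketched in a few lines of plain Python. This is only an illustration on toy records, not the extract system's own output or tooling: HIK is the linking key named on this page, while the YEAR and AGE column names, the sample values, and the helper functions to_long and to_wide are assumptions made for the example.

```python
from collections import defaultdict

# Toy person records in stacked order: all of 1900, then all of 1910.
# HIK comes from the extract; YEAR and AGE are assumed column names.
records = [
    {"YEAR": 1900, "HIK": "A1", "AGE": 12},
    {"YEAR": 1900, "HIK": "B2", "AGE": 30},
    {"YEAR": 1910, "HIK": "B2", "AGE": 40},
    {"YEAR": 1910, "HIK": "A1", "AGE": 22},
]

def to_long(rows):
    """Sort so each person's observations are adjacent, in census order."""
    return sorted(rows, key=lambda r: (r["HIK"], r["YEAR"]))

def to_wide(rows, value_columns):
    """Build one row per HIK with year-suffixed columns such as AGE_1900."""
    wide = defaultdict(dict)
    for r in rows:
        for col in value_columns:
            wide[r["HIK"]][f"{col}_{r['YEAR']}"] = r[col]
    return [{"HIK": hik, **cols} for hik, cols in sorted(wide.items())]

long_rows = to_long(records)
wide_rows = to_wide(records, ["AGE"])
```

The long layout keeps one row per person per census, which suits panel-style models; the wide layout keeps one row per person, which makes comparisons between two census years straightforward. For full-count extracts, a data-frame or database tool would likely be needed in place of in-memory lists.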

Sample selection. Invoking the "link census data" option on the sample selection screen constrains your choice of censuses to the full-count datasets. Any other datasets in your data cart will be dropped and cannot subsequently be added.

Case selection. Linked extracts, like all full-count extracts, are large. By default, for linked census extracts, the extract system applies case selection to include only persons who link across ALL of the selected censuses. For example, if you select the 1900, 1910, and 1920 censuses for linking, only persons who are linked across all three censuses will be included in your data extract. You can edit those automated case selection choices on the final screen of the extract process by clicking the "Select Cases" button. Note that removing any of those selections can yield significantly larger data extracts that may prove challenging to process.
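For users working with a manually built extract, the default rule (keep only persons linked across every selected census) can be approximated after download. The sketch below is a rough illustration, not the extract system's implementation: the YEAR column name, the convention that an unlinked person has a blank HIK, and the helper linked_in_all are all assumptions.

```python
from collections import defaultdict

# Toy stacked records. YEAR is an assumed column name, and a blank HIK is
# assumed here to mark a person with no link.
records = [
    {"YEAR": 1900, "HIK": "A1"},
    {"YEAR": 1910, "HIK": "A1"},
    {"YEAR": 1920, "HIK": "A1"},
    {"YEAR": 1900, "HIK": "B2"},
    {"YEAR": 1910, "HIK": "B2"},
    {"YEAR": 1920, "HIK": ""},
]

def linked_in_all(rows, years):
    """Keep rows for persons whose HIK appears in every requested year."""
    wanted = set(years)
    years_by_hik = defaultdict(set)
    for r in rows:
        if r["HIK"] and r["YEAR"] in wanted:
            years_by_hik[r["HIK"]].add(r["YEAR"])
    keep = {hik for hik, seen in years_by_hik.items() if seen == wanted}
    return [r for r in rows if r["HIK"] in keep and r["YEAR"] in wanted]

all_three = linked_in_all(records, [1900, 1910, 1920])
first_two = linked_in_all(records, [1900, 1910])
```

In this toy data only person A1 survives the three-census requirement, while relaxing it to 1900 and 1910 also retains B2, which mirrors how removing a selection enlarges the extract.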

You can also change the default case selection option from including only linked persons to also include everyone who resided with the linked person. The choice to include non-linked household members will result in a more complicated dataset for analysis while providing more contextual information for the linked persons.

Users should be cautious about adding case selections beyond those that the system applies automatically to identify linked individuals across census years. Performing case selection on time-variant characteristics (such as age, marital status, or state of residence) risks excluding some of a person's observations. An individual may be linked in a given census year, but that observation will be dropped if it does not meet the additional selection criteria in that specific census.
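A small example may make the risk concrete. The sketch below contrasts filtering individual rows on a time-variant characteristic with selecting persons by their value in one base year and then keeping all of their observations. The STATE and YEAR column names, the sample values, and both helper functions are assumptions for illustration; the second approach is one possible workaround applied after download, not a feature of the extract system.

```python
# Toy linked records: person A1 moves from MN to WI between censuses.
# YEAR and STATE are assumed column names.
records = [
    {"YEAR": 1900, "HIK": "A1", "STATE": "MN"},
    {"YEAR": 1910, "HIK": "A1", "STATE": "WI"},
    {"YEAR": 1900, "HIK": "B2", "STATE": "MN"},
    {"YEAR": 1910, "HIK": "B2", "STATE": "MN"},
]

def select_rows(rows, state):
    """Row-level selection: drops any observation outside the state."""
    return [r for r in rows if r["STATE"] == state]

def select_persons(rows, state, base_year):
    """Person-level selection: pick persons by their base-year state,
    then keep every observation for those persons."""
    keep = {r["HIK"] for r in rows
            if r["YEAR"] == base_year and r["STATE"] == state}
    return [r for r in rows if r["HIK"] in keep]

row_level = select_rows(records, "MN")
person_level = select_persons(records, "MN", 1900)

# Persons left with fewer observations than they started with.
incomplete = {r["HIK"] for r in records} - {
    r["HIK"] for r in row_level if r["YEAR"] == 1910
}
```

Row-level selection leaves A1 with only the 1900 observation, so the person can no longer be followed over time, whereas person-level selection preserves both of A1's records.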

Manual linked extracts. You can create extracts that include the HIK linking key without invoking the "link census data" checkbox on the sample selection screen and its associated automated case selection. The HIK variable and the flags identifying linked persons in each dataset are accessible in the Linking Tools group in the drop-down list in the variable browsing system.

Citation and terms of use

Cite both IPUMS MLP and the Full Count IPUMS Ancestry data:

Jonas Helgertz, Steven Ruggles, John Robert Warren, Catherine A. Fitch, J. David Hacker, Matt A. Nelson, Joseph P. Price, Evan Roberts, and Matthew Sobek. IPUMS Multigenerational Longitudinal Panel: Version 1.1 [dataset]. Minneapolis, MN: IPUMS, 2023. https://doi.org/10.18128/D016.V1.1

Steven Ruggles, Catherine A. Fitch, Ronald Goeken, J. David Hacker, Matt A. Nelson, Evan Roberts, Megan Schouweiler, and Matthew Sobek. IPUMS Ancestry Full Count Data: Version 3.0 [dataset]. Minneapolis, MN: IPUMS, 2021. https://doi.org/10.18128/D014.V3.0

Note that the version — and hence the citation — can differ between the extract system and crosswalk files.

Publications and research reports making use of IPUMS should be added to our Bibliography.

Contact us

For questions about IPUMS MLP, contact ipumsres@umn.edu.
