Multigenerational Longitudinal Panel:
Linked Census Data
The IPUMS Multigenerational Longitudinal Panel (MLP) project links individuals' records across censuses from 1850 to 1950. Each linked individual is assigned a unique identifier that is consistent across censuses. We also link records from the Social Security Administration Numident file to the twentieth-century censuses; the Numident records supply women's birth surnames. In total, MLP contains 645 million links between census pairs, tracking over 175 million people across two or more censuses. The linked data can be accessed directly via the IPUMS data extract system.
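Because the identifier is consistent across censuses, an individual's records can be gathered into a single history by grouping on it. The following is a minimal sketch of that idea, not the project's code: the field names `person_id`, `census_year`, and `age` are hypothetical stand-ins (not the variable names used in IPUMS extracts), and the records are invented for illustration.

```python
from collections import defaultdict

# Hypothetical census records. `person_id` stands in for the consistent
# identifier assigned to linked individuals; all values are invented.
records = [
    {"person_id": "P001", "census_year": 1910, "age": 14},
    {"person_id": "P001", "census_year": 1900, "age": 4},
    {"person_id": "P002", "census_year": 1900, "age": 30},
    {"person_id": "P001", "census_year": 1920, "age": 24},
]


def build_histories(records):
    """Group records by identifier and keep people seen in 2+ censuses."""
    by_person = defaultdict(list)
    for rec in records:
        by_person[rec["person_id"]].append(rec)
    # Sort each person's records chronologically; drop unlinked singletons.
    return {
        pid: sorted(recs, key=lambda r: r["census_year"])
        for pid, recs in by_person.items()
        if len(recs) >= 2
    }


histories = build_histories(records)
for pid, recs in histories.items():
    print(pid, [r["census_year"] for r in recs])
```

In this toy input only `P001` appears in more than one census, so it is the only person for whom a history is built.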
The underlying full count data, covering over 800 million individual records, are the fruit of collaboration between IPUMS and the world's two largest genealogical organizations — Ancestry.com and FamilySearch — to leverage genealogical data for scientific purposes. Full count census data have opened the possibility of automated record linkages across census years to construct millions of individual life histories and trace millions of families over multiple generations. Our aim is for IPUMS MLP to eventually serve as a general framework that incorporates records from a wide range of sources.
Version 2.0 Changes
The newest version of MLP includes the following changes:
- Women are now included in Step 1 (the individual linking stage), resulting in substantially more female links.
- Birth and married surnames for women from Social Security Administration records are used to link women's records across the change from single to married status.
- The XGBoost (Extreme Gradient Boosting) algorithm replaces the logistic regression model previously used for linking.
- All conflicting links have been removed.
- Overall, the new version has 15-20% more links, due mostly to improved linking rates for women.
- The IPUMS extract system has been improved to allow creation of custom datasets containing records linked across any combination of censuses.
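To illustrate what removing conflicting links can look like in practice, here is a minimal sketch under a stated assumption: a link is treated as conflicting when a record in one census is linked to more than one record in the other. This definition, the function `remove_conflicting_links`, and the record labels are all hypothetical; the rule MLP actually applies may differ.

```python
from collections import Counter


def remove_conflicting_links(links):
    """Keep only one-to-one links between two censuses.

    `links` is a list of (record_in_census_a, record_in_census_b) pairs.
    Assumption: a link conflicts if either endpoint appears in more than
    one distinct link. All links involving that endpoint are dropped.
    """
    # Exact duplicate pairs are not conflicts; collapse them first.
    unique_links = list(dict.fromkeys(links))
    uses_a = Counter(a for a, _ in unique_links)
    uses_b = Counter(b for _, b in unique_links)
    return [
        (a, b)
        for a, b in unique_links
        if uses_a[a] == 1 and uses_b[b] == 1
    ]


# Invented candidate links between a 1900 and a 1910 census.
candidate_links = [
    ("1900-A", "1910-X"),
    ("1900-B", "1910-Y"),
    ("1900-B", "1910-Z"),  # 1900-B is linked twice: conflict
    ("1900-C", "1910-X"),  # 1910-X is linked twice: conflict
    ("1900-D", "1910-W"),
]
clean_links = remove_conflicting_links(candidate_links)
print(clean_links)
```

Dropping every link that touches a contested record, rather than picking a winner, trades some coverage for precision. That is a design choice of this sketch rather than a documented property of MLP.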
Next steps
We plan to continue improving the census links, add death certificate information, and begin assisting several modern survey projects in linking their records to the censuses, which will provide information on early life conditions.
Project team
Investigators
Steven Ruggles, Principal Investigator (MPI)
John Robert Warren, Principal Investigator (MPI)
Julia A. Rivera Drew, Principal Investigator (MPI)
Catherine Fitch, Co-investigator
Matthew Sobek, Co-investigator
Research staff
Nesile Ozder, Staff Data Analyst, Data Linking Specialist
Cheyenne Lonobile, Senior Data Analyst, Project Coordinator
Matt A. Nelson, Research Scientist
Faculty advisors
J. David Hacker, History
Evan Roberts, History of Medicine
Consultants
Martha Bailey (University of California, Los Angeles)
Leah Boustan (Princeton University)
Casey Breen (Oxford University)
Joseph Ferrie (Northwestern University)
Joshua Goldstein (University of California, Berkeley)
Jonas Helgertz (Lund University)
Sam Hwang (University of British Columbia)
Joseph Price (Brigham Young University)
Citation
Steven Ruggles, Nesile Ozder, Catherine A. Fitch, Matthew Sobek, Julia A. Rivera Drew, J. David Hacker, Jonas Helgertz, Cheyenne Lonobile, Matt A. Nelson, Evan Roberts, and John Robert Warren. IPUMS Multigenerational Longitudinal Panel: Version 2.0 [dataset]. Minneapolis, MN: IPUMS, 2025. https://doi.org/10.18128/D016.V2.0
Funding
IPUMS MLP is funded by National Institute on Aging grant R01AG057679.
Contact us
For questions about IPUMS linked data, contact ipumsres@umn.edu.