

Crosswalk: World War II Enlistments to 1940 Census

The World War II enlistment file contains 2.6 million records from 1938-1947 linked to the 1940 Census. Compilation and cleaning of the enlistment data and linking to the census were performed by the CenSoc Project at the University of California at Berkeley. The data include records for men only. The data contain over 20 variables documenting the characteristics of the persons at the time of their enlistment, including height and weight information (see table below). Use the variable HISTID (listed as histid in the table below) to link the data to records in the 1940 census.
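As an illustration of the HISTID link, here is a minimal sketch in Python using only the standard library. The inline sample rows, the identifier values, and the census column census_var are hypothetical placeholders; only the histid, byear, and height column names come from the variable table. Column names are lowercased before joining, on the assumption that a census extract may spell the identifier HISTID.

```python
import csv
import io

# Hypothetical sample rows; real HISTID values and census columns will differ.
ENLISTMENT_CSV = """histid,byear,height
A001,1915,68
A002,1920,70
"""

CENSUS_CSV = """HISTID,census_var
A001,x
A002,y
A003,z
"""


def read_rows(text):
    """Parse CSV text into dicts with lowercased column names."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key.lower(): value for key, value in row.items()} for row in reader]


def link_on_histid(enlistment_rows, census_rows):
    """Attach census fields to each enlistment record that shares a histid."""
    census_by_id = {row["histid"]: row for row in census_rows}
    linked = []
    for rec in enlistment_rows:
        match = census_by_id.get(rec["histid"])
        if match is not None:
            # Enlistment fields take precedence if a column name collides.
            linked.append({**match, **rec})
    return linked


linked = link_on_histid(read_rows(ENLISTMENT_CSV), read_rows(CENSUS_CSV))
for row in linked:
    print(row)
```

Census records with no enlistment match (A003 in the sample) are simply left out of the result, which mirrors an inner join.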

Variables in the dataset

Variable                  Label
histid                    1940 Census unique identifier
byear                     Year of birth
sex                       Sex
date_of_enlistment        Date of enlistment
bpl                       Place of birth
residence_state           State of residence at enlistment
residence_county          County of residence at enlistment
place_of_enlistment       Place of enlistment
education                 Education
grade_code                Army grade (rank)
branch_code               Army branch
term_of_enlistment        Term of enlistment
race                      Race
citizenship               Citizenship
civilian_occupation       Civilian occupation
marital_status            Marital status at enlistment
height                    Height at enlistment (inches)
weight_before_march_1943  Weight (pounds)
weight_or_AGCT            Weight (pounds) or AGCT score*
component                 Army component
source                    Source of army personnel

* Weight data starting in 1943 are commingled with mental aptitude (AGCT) scores and sometimes other data.
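Because weight_or_AGCT can hold either a weight or an aptitude score, one cautious option is to take weight only from weight_before_march_1943 and treat everything else as unknown. The sketch below illustrates that choice in Python. It assumes records are dictionaries of strings (as produced by csv.DictReader) and that missing values are empty strings; check the codebook for the actual missing-value coding. The sample values are invented.

```python
def usable_weight(record):
    """Return weight in pounds from the pre-March-1943 field, or None.

    The commingled weight_or_AGCT field is deliberately ignored, since a
    value there may be an AGCT score rather than a weight.
    """
    raw = (record.get("weight_before_march_1943") or "").strip()
    if not raw:
        return None
    try:
        return float(raw)
    except ValueError:
        # Non-numeric codes are treated as missing (an assumption).
        return None


# Hypothetical records for illustration only.
sample = [
    {"weight_before_march_1943": "150", "weight_or_AGCT": "150"},
    {"weight_before_march_1943": "", "weight_or_AGCT": "112"},
]
weights = [usable_weight(rec) for rec in sample]
print(weights)
```

This discards some genuine weights recorded in weight_or_AGCT, trading sample size for a variable with a single meaning.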

Download data

Dataset (CSV)
Codebook

Citation

Users of the linked enlistment data should cite it as follows:

Goldstein, Joshua R.; Breen, Casey; Alexander, Monica; Miranda Gonzalez, Andrea; Menares, Felipe; Osborne, Maria; Snyder, Mallika; Yildirim, Ugur; Wikle, Anna, 2023, "CenSoc Army Enlistment Records", https://doi.org/10.7910/DVN/ZFVVNA, Harvard Dataverse, V2

CenSoc Methodology

The development of the WWII enlistment data by the CenSoc Project is described in this technical report, which also covers related datasets and discusses data limitations. Linking of enlistment data to the 1940 census was performed using a conservative variant of the ABE approach developed by Abramitzky, Boustan, and Eriksson, as described in the report.
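For readers unfamiliar with this style of linking, the sketch below shows the general idea of exact matching with a uniqueness requirement: a pair is kept only when the matching key identifies exactly one person in each file. It is a simplified illustration in Python, not the CenSoc implementation. It omits name standardization and any tolerance in birth year, and the field names first, last, and id, along with all sample records, are hypothetical (bpl and byear follow the variable table).

```python
from collections import Counter


def match_key(rec):
    """Build the exact-match key: name, birthplace, and birth year."""
    return (
        rec["first"].strip().lower(),
        rec["last"].strip().lower(),
        rec["bpl"],
        rec["byear"],
    )


def unique_exact_links(file_a, file_b):
    """Link records whose key is unique in both files; drop ambiguous cases."""
    count_a = Counter(match_key(r) for r in file_a)
    count_b = Counter(match_key(r) for r in file_b)
    b_by_key = {match_key(r): r for r in file_b}
    links = []
    for rec in file_a:
        key = match_key(rec)
        if count_a[key] == 1 and count_b[key] == 1:
            links.append((rec["id"], b_by_key[key]["id"]))
    return links


# Hypothetical records for illustration only.
enlistment = [
    {"id": "E1", "first": "John", "last": "Smith", "bpl": "Ohio", "byear": 1915},
    {"id": "E2", "first": "James", "last": "Brown", "bpl": "Texas", "byear": 1920},
    {"id": "E3", "first": "Carl", "last": "Olsen", "bpl": "Iowa", "byear": 1918},
]
census = [
    {"id": "C1", "first": "john", "last": "smith", "bpl": "Ohio", "byear": 1915},
    {"id": "C2", "first": "James", "last": "Brown", "bpl": "Texas", "byear": 1920},
    {"id": "C3", "first": "James", "last": "Brown", "bpl": "Texas", "byear": 1920},
]

links = unique_exact_links(enlistment, census)
print(links)
```

In the sample, E2 is dropped because two census records share its key, and E3 is dropped because no census record matches. Discarding ambiguous cases lowers the match rate but reduces false links, which is the sense in which such a procedure is conservative.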

CenSoc offers several datasets that were developed for the study of mortality. These datasets include Social Security data and the full World War II enlistment dataset from which the linked data provided on this web page were drawn. The full enlistment file is considerably larger and includes women. All CenSoc data are available free of charge.

MLP Next Steps

In future releases, MLP will link the enlistment records to additional census years, provide machine-readable variable and value labels, and attempt to separate the mental aptitude data that are mixed with weight information.