1940 Sampling Procedures [1]
Sample Selection from Microfilmed Population Schedules
Population schedules from the 1940 census are preserved on 4,576
100-foot reels of 35mm microfilm. The original schedules of the 1940
Census of Population and Housing and the punch cards produced from them
were destroyed. Copies of the microfilm are stored at the National Archives
in Washington, D.C., and at the Personal Records Service Branch of the Bureau
of the Census in Pittsburg, Kansas. All microfilm processing for the public
use sample project was performed by the Census Bureau at its Pittsburg,
Kansas, facility. The 4,576 microfilm reels were randomly assigned to 20
subsamples. Final sampling of cases for transcription was conducted independently
within each of the 20 subsamples. This was done because cost estimates
were uncertain and working with 20 independent replicates provided a means
of coping with a premature end of the sampling and data collection. The
microfilm is organized alphabetically by State, within States alphabetically
by county, and within counties numerically by enumeration district. Because
each reel covers a contiguous run of enumeration districts and reels were
assigned to subsamples at random, each subsample is a representative,
albeit geographically clustered, sample of the United States population.
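
The random assignment of reels to subsamples can be illustrated with a
minimal sketch. The source does not say how the assignment was carried out
or whether the subsamples were equal in size; the function name, the seed,
and the round-robin dealing below are assumptions made for illustration.

```python
import random
from collections import Counter

def assign_reels_to_subsamples(n_reels=4576, n_subsamples=20, seed=1940):
    """Randomly assign reel numbers 1..n_reels to subsamples 1..n_subsamples.

    Assumption: reels are shuffled and dealt round-robin, so subsample
    sizes differ by at most one reel. The actual 1940 procedure may differ.
    """
    rng = random.Random(seed)
    reels = list(range(1, n_reels + 1))
    rng.shuffle(reels)
    return {reel: (i % n_subsamples) + 1 for i, reel in enumerate(reels)}

assignment = assign_reels_to_subsamples()
sizes = Counter(assignment.values())  # reels per subsample
```

Under this assumption every subsample receives 228 or 229 reels, so each
replicate carries roughly one twentieth of the film.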
The 20 subsamples were processed separately and public use sample records
are sequenced by subsample. The household record item SUBSAMPL identifies
each subsample. The value in SUBSAMPL does not indicate the order of processing;
the original subsample numbers have been reassigned to protect the confidentiality
of the microfilm reels. On the microfilm, the population schedules within
each enumeration district are arranged by sheet number, with the "A" side followed
by the "B" side. Sheets numbered 1 through 60 list households that were
contacted during the original enumeration canvass. Sheets 61 through 80
contain individuals and entire households that were missed during the original
tour. Sheets 81 through 100 were used to enumerate persons living in transient
types of dwellings (hotels, tourist facilities, flophouses). (See chapter
1 of the technical documentation for a summary of enumeration procedures
and chapter III of the Procedural History of the 1940 Census for a detailed
description.)
Within an enumeration district, households were numbered in order of
visitation. This number is recorded in column 3 ("household visitation
number") of the population schedule. Persons listed on sheets 61 through
80 who were members of households that were listed on sheets 1 through
60 were listed with the same household visitation number as the originally
listed household.
Sampling Procedures for Household Selection
The sampling procedure was designed to produce a household sample with
each sample household containing one member who answered the supplementary
questions at the bottom of the population schedule. A systematic random
sample of population schedules within an enumeration district was made
to select a particular population schedule. A random selection of one of
the two supplementary question lines at the bottom of the population schedule
was then made. This systematic random sample of 1 in 5 of all supplementary
question lines, i.e., 20% of the 5% census sample, provided an overall
sample of 1 in 100 persons, each listed on a supplementary question line. The household of the
person listed on the selected supplementary line was made the "target"
household for selection.
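
The sampling rate described above can be checked with a short sketch. It
is a simplification: it draws a systematic sample directly from the list
of supplementary question lines rather than reproducing the two-stage
procedure (schedule first, then line), whose schedule interval the source
does not state. The enumeration district of 30 schedule sides and the
helper name `systematic_sample` are hypothetical.

```python
import random
from fractions import Fraction

def systematic_sample(items, interval, rng):
    """Take every `interval`-th item after a random start in [0, interval)."""
    start = rng.randrange(interval)
    return items[start::interval]

# Hypothetical enumeration district: 30 schedule sides, each with two
# supplementary question lines (slot 1 and slot 2).
lines = [(side, slot) for side in range(1, 31) for slot in (1, 2)]
selected = systematic_sample(lines, interval=5, rng=random.Random(0))

# 5% of enumerated persons fell on supplementary lines, and 1 in 5 of
# those lines was sampled, giving 1 in 100 persons overall.
overall_rate = Fraction(5, 100) * Fraction(1, 5)
```

Here `selected` holds 12 of the 60 lines whatever the random start, and
`overall_rate` equals 1/100.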
The probability of including the target household was calculated as
the inverse of the number of persons included in the original listing of
the household, i.e., a target household of size "h" was selected for the
sample with probability 1/h for h = 1, 2, ..., 7. Households with eight
or more persons were selected with a probability of one in seven to ensure
an adequate number of observations of large households. Since the chance
that a household had one of its members listed on the supplementary question
line is proportional to the household’s size, this 1 in h selection probability
provided an overall 1 in 100 sample of households and their members for
households with seven or fewer persons. As an illustration of the selection
procedure, five-person households that contained a person on a selected
supplementary question line were retained in the public use sample at a
rate of one in five after a random start for the first selection. For example,
if the random start value were two, then the second, seventh, twelfth,
etc. five-person households were included in the sample.
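
The arithmetic behind the size-dependent selection can be sketched as
follows. The function names are hypothetical. The sketch assumes, as the
text does, that a household of size h becomes a target with probability
proportional to h (approximately h/100). The rate it returns for
households of eight or more persons is derived from that assumption and
is not stated in the source.

```python
from fractions import Fraction

LINE_RATE = Fraction(1, 100)  # chance a given person is on a selected line
MAX_INTERVAL = 7              # households of 8+ persons are kept 1 in 7

def retention_interval(household_size):
    """Interval k such that target households are retained 1 in k."""
    return min(household_size, MAX_INTERVAL)

def overall_household_rate(household_size):
    """Approximate overall inclusion probability of a household."""
    target_rate = household_size * LINE_RATE
    return target_rate / retention_interval(household_size)

def retained_positions(n_targets, household_size, random_start):
    """Positions (1-based) of retained target households of a given size.

    `random_start` is assumed to lie between 1 and the retention interval.
    """
    k = retention_interval(household_size)
    return [i for i in range(1, n_targets + 1)
            if i >= random_start and (i - random_start) % k == 0]

print(retained_positions(15, 5, 2))   # [2, 7, 12]
print(overall_household_rate(5))      # 1/100
print(overall_household_rate(10))     # 1/70
```

Under these assumptions households of seven or fewer persons are included
at 1 in 100, while a household of h >= 8 persons is included at roughly
h/700, so large households are overrepresented relative to smaller ones.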
All single-person target households were selected with a probability
of one. If the person on the target supplementary line lived not in a household
but in "group quarters" (institutions, transient-type dwelling units, or
households with five or more persons unrelated to the household head),
the selection probability was also one.
Operation of the sampling procedure was directed by a computer program
which provided instructions to the sampling clerks. The clerks, sitting
at stations with a microfilm reader and video display terminal, were instructed
to find a designated population schedule (based on the enumeration district
number and the "sheet number" entry in the heading section of the population
schedule) and a designated supplementary line. The clerk then determined
the type of household of the person on the supplementary line. If there
were five or more unrelated individuals in the household listing, it was
designated as group quarters. If the person on the target supplementary line
lived in an institution (based on the entry in the "institution" item in
the heading of the population schedule or the relationship description
in column 7), or was a transient (listed on sheets numbered
81 through 100), the person was designated as a resident of group quarters.
For target line persons in group quarters, the person record was automatically
selected for inclusion in the public use sample. For persons in regular
private households, the clerk entered into the computer the number of lines
used to list the private household. The computer calculated the number
of persons listed in the household and determined whether the household
was selected for inclusion in the sample. If not, the computer instructed
the clerk to proceed to another designated population schedule and supplementary
line.
If the person on the target supplementary line was selected into the
sample, the computer instructed the clerk to transcribe the items
from the population schedule for all members of a selected private household,
or for the target person alone in a nonregular household (group quarters,
institutions, transients).
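
The clerk's classification step can be summarized in a short sketch. The
field names, the order of the checks, and the returned labels are
assumptions for illustration; the source describes the rules but not the
program that applied them.

```python
from dataclasses import dataclass

@dataclass
class TargetPerson:
    sheet_number: int        # sheet on which the person is listed
    in_institution: bool     # from the schedule heading or column 7
    unrelated_persons: int   # persons in the listing unrelated to the head

def classify(target):
    """Classify the living arrangement of the target-line person."""
    if 81 <= target.sheet_number <= 100:
        return "transient"
    if target.in_institution:
        return "institution"
    if target.unrelated_persons >= 5:
        return "group quarters"
    return "private household"

def transcription_scope(target):
    """What would be transcribed if the case is selected into the sample."""
    if classify(target) == "private household":
        return "all household members"
    return "target person only"

print(transcription_scope(TargetPerson(12, False, 1)))  # all household members
print(transcription_scope(TargetPerson(85, False, 0)))  # target person only
```

Only private households pass through the size-based selection step;
nonregular cases are selected automatically and contribute a single
person record.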
Upon reaching the end of the microfilmed population schedules for an
entire enumeration district, the clerk was instructed to make a second
pass through the population schedule sheets numbered 61 through 80. The
purpose of the second pass was to identify and transcribe data for persons
from selected households who were not enumerated with the main body of
the household on sheets 1 through 60. Household visitation numbers (or
surnames, where the household visitation number was missing) for sample
households on sheets 1 through 60 were matched against those on sheets 61
through 80 to determine whether any persons listed on sheets 61 through 80
were part of selected households.
If persons on sheets 61 through 80 were found to be members of previously
selected households, the data for these persons were transcribed and later
merged with the data for the rest of the household.
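
The second-pass match can be sketched as below. The record layout
(dictionaries with `id`, `visitation_number`, and `surname` keys) and the
case-insensitive surname comparison are assumptions; the source states
only that visitation numbers were matched, with surnames used where the
number was missing.

```python
def find_household(person, selected_households):
    """Return the id of the selected household a late-listed person
    belongs to, or None if no selected household matches."""
    for hh in selected_households:
        p_num = person["visitation_number"]
        h_num = hh["visitation_number"]
        if p_num is not None and h_num is not None:
            # Both numbers present: match on household visitation number.
            if p_num == h_num:
                return hh["id"]
        elif person["surname"].strip().lower() == hh["surname"].strip().lower():
            # A number is missing on either side: fall back to surname.
            return hh["id"]
    return None

selected = [
    {"id": "A", "visitation_number": 12, "surname": "Miller"},
    {"id": "B", "visitation_number": None, "surname": "Okafor"},
]
late = {"visitation_number": 12, "surname": "Smith"}
print(find_household(late, selected))  # A
```

Matching on the number first lets the sketch attach household members
whose surname differs from the head's, such as lodgers or married
daughters, which a surname-only match would miss.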
ENDNOTES:
1. U.S. Department of Commerce, Bureau of the Census,
"Chapter 2, Sample Selection and Data Processing Procedures," Census
of Population, 1940: Public Use Microdata Sample, Technical Documentation,
Washington, D.C., 1983, pp. 2.1-2.3.