PUMAs and Mini-PUMAs in the 1960 5% Sample
Each 1960 PUMA corresponds exactly to a set of 1960 counties or census tracts, which in turn corresponds approximately to one or more 2000 PUMAs. Both the 1960 PUMAs and mini-PUMAs have a minimum population threshold of 50,000. Mini-PUMAs subdivide larger 1960 PUMAs (those with populations over 100,000) into smaller areas.
This page provides links to relationship files and boundary files for these areas, followed by complete documentation of the process and guidelines IPUMS used to create them.
- Relationship between 1960 PUMAs and mini-PUMAs and their component counties and tracts (.xlsx)
- Relationship between 1960 PUMAs and 2000 PUMAs (.xlsx)
All boundary files are provided as shapefiles within .ZIP files.
Creation Process and Guidelines
One of the primary goals of the IPUMS 1960 Data Restoration Project was the creation of new geographic areas modeled on Public Use Microdata Areas (PUMAs), the smallest identifiable geographic areas in modern public-use microdata. The new 1960 PUMAs have a minimum population of 50,000, as opposed to the standard 100,000 population threshold used in later public-use samples. The project also created "mini-PUMAs", which subdivide large 1960 PUMAs (those with more than 100,000 residents) into smaller areas with populations of at least 50,000.
The process for creating these new geographies involved four phases. First, we created the input units using 1960 census tracts and counties from IPUMS NHGIS. Second, we created 1960-based approximations of 5%-sample 2000 PUMA boundaries. This allows users to identify consistent geographic areas between 1960 and 2000 samples. Third, we took all the 1960-based approximations of 2000 PUMAs that exceeded 100,000 people and subdivided them into mini-PUMAs. Finally, we altered the mini-PUMA boundaries where necessary and appropriate to match the boundaries of Universal Area Codes, a geographic coding system unique to the 1960 census.
Creating Input Units
The first phase in the process was to construct the input geographic units. We started with the 1960 census tract and county shapefiles and the corresponding total population counts from NHGIS. NHGIS delineated census tract and county boundaries as defined for the 1960 Census of Population and Housing, and it obtained population data from the Inter-university Consortium for Political and Social Research (2000, 2005, 2007). We joined the population data to the shapefiles, and then we unioned the census tract and county shapefiles together. Because census tracts were not defined in all parts of the country, unioning them with the counties produced a shapefile that exhaustively covered the entire United States.
Our next step was to adjust the population totals for counties that were partially tracted in 1960. For each partially tracted county, we summed the populations of its tracts and subtracted that total from the county population. We assigned the difference to the county remainder polygon, that is, the part of the county not covered by any tract.
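The remainder calculation can be illustrated with a minimal Python sketch. It is not the code IPUMS used. The function name `county_remainders`, the dictionary layout, and the population figures are all hypothetical and chosen only to show the arithmetic.

```python
from collections import defaultdict


def county_remainders(county_pop, tract_pop):
    """Return the population left in each tracted county after removing its tracts.

    county_pop: {county_id: total county population}
    tract_pop:  {(county_id, tract_id): tract population}
    Counties with no tracts do not appear in the result.
    """
    tracted_total = defaultdict(int)
    for (county_id, _tract_id), pop in tract_pop.items():
        tracted_total[county_id] += pop

    remainders = {}
    for county_id, tracted in tracted_total.items():
        remainder = county_pop[county_id] - tracted
        if remainder < 0:
            raise ValueError(f"tract total exceeds county total in {county_id}")
        remainders[county_id] = remainder
    return remainders


# Hypothetical figures, for illustration only.
county_pop = {"A": 120_000, "B": 40_000}
tract_pop = {("A", "0001"): 30_000, ("A", "0002"): 45_000}

remainders = county_remainders(county_pop, tract_pop)
print(remainders)  # {'A': 45000}
```

In this example, county A is partially tracted, so its remainder polygon receives 45,000 people. County B has no tracts and keeps its full county total.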
Creating 1960 PUMAs
The second phase of the process was creating Public Use Microdata Areas (PUMAs) from the 1960 input units. Our initial goal was to create 1960 PUMAs that aligned with 2000 PUMAs, which would allow users to compare consistent geographic units over time.
As a first step, we created spatial approximations of 2000 PUMAs from the 1960 input units (tracts, counties, and county remainders). To create the spatial approximations, we carried out the following steps:
- Unioned the 1960 input units with the 2000 PUMAs using a 10-meter tolerance to minimize sliver polygons.
- For each 1960 input unit, identified the 2000 PUMA with which it shared the largest area of overlap (plurality area).
- Assigned the 2000 PUMA code identified in the previous step to the 1960 input unit.
- Dissolved the 1960 input units on that 2000 PUMA code to create 1960-based approximations of 2000 PUMAs. During the dissolve, we summed the population of the 1960 input units to yield the 1960 population of the 1960-based approximations of 2000 PUMAs.
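The plurality-area assignment and the population sum performed during the dissolve can be sketched in plain Python. This is an illustration under stated assumptions, not the GIS workflow itself. It assumes the union has already been run and its attribute table exported as rows of (input unit, 2000 PUMA code, overlap area). The function names, identifiers, areas, and populations below are hypothetical.

```python
from collections import defaultdict


def assign_plurality_puma(overlaps):
    """Map each 1960 input unit to the 2000 PUMA with the largest overlap area.

    overlaps: iterable of (unit_id, puma_code, overlap_area) rows.
    Ties keep whichever PUMA was encountered first.
    """
    area = defaultdict(float)
    for unit_id, puma_code, overlap_area in overlaps:
        # A unit can intersect the same PUMA in several pieces, so add them up.
        area[(unit_id, puma_code)] += overlap_area

    best = {}
    for (unit_id, puma_code), total in area.items():
        if unit_id not in best or total > best[unit_id][1]:
            best[unit_id] = (puma_code, total)
    return {unit_id: puma_code for unit_id, (puma_code, _) in best.items()}


def dissolve_population(assignment, unit_pop):
    """Sum 1960 input-unit populations by assigned 2000 PUMA code."""
    totals = defaultdict(int)
    for unit_id, puma_code in assignment.items():
        totals[puma_code] += unit_pop[unit_id]
    return dict(totals)


# Hypothetical overlap rows and populations, for illustration only.
overlaps = [
    ("t1", "00100", 8.0), ("t1", "00200", 2.0),
    ("t2", "00100", 1.5), ("t2", "00200", 3.0), ("t2", "00200", 1.0),
    ("c3", "00200", 20.0),
]
unit_pop = {"t1": 4_000, "t2": 6_000, "c3": 55_000}

assignment = assign_plurality_puma(overlaps)
puma_pop = dissolve_population(assignment, unit_pop)
print(assignment)  # {'t1': '00100', 't2': '00200', 'c3': '00200'}
print(puma_pop)    # {'00100': 4000, '00200': 61000}
```

Because each input unit is assigned whole to a single PUMA, the resulting areas are built exactly from 1960 units but only approximate the 2000 PUMA boundaries.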
This first step provided the three pieces of information required to create 1960 PUMAs: the 1960-based approximations of 2000 PUMAs, the 1960 populations of those approximations, and the 2000 PUMAs themselves. In total, we created 1,345 PUMAs for 1960 that matched 2000 PUMAs.
Creating 1960 Mini-PUMAs
This phase created units we call mini-PUMAs. These are geographic units that nest within 1960 PUMAs and have a minimum population of 50,000. Many of the 1960 PUMAs created in the second phase had populations greater than 100,000. Thus, we sought to subdivide those PUMAs, where feasible, into mini-PUMAs.
We used Python, ArcGIS, and REDCAP to create the mini-PUMAs. REDCAP is a software program that aggregates spatial units into spatially contiguous regions while optimizing an objective function (Guo 2008). We used Ward's method for agglomerative clustering with a rook contiguity constraint, set the minimum population of an output region to 50,000, and used population density as the variable in the objective function. REDCAP therefore minimized within-region variation in population density, maximized between-region variation, and produced output regions with at least 50,000 people each.
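To make the objective and the population constraint concrete, the sketch below scores two candidate partitions of a single PUMA. It is not REDCAP and does not reproduce its search procedure or the contiguity constraint. It only assumes a Ward-style criterion, the within-region sum of squared deviations of unit density from the unweighted regional mean, alongside the 50,000 minimum. All names and figures are hypothetical.

```python
def within_region_ssd(regions, density):
    """Sum of squared deviations of each unit's density from its region's mean."""
    total = 0.0
    for units in regions.values():
        values = [density[u] for u in units]
        mean = sum(values) / len(values)
        total += sum((v - mean) ** 2 for v in values)
    return total


def meets_minimum(regions, population, minimum=50_000):
    """True if every region has at least `minimum` residents."""
    return all(
        sum(population[u] for u in units) >= minimum
        for units in regions.values()
    )


# Hypothetical input units inside one large PUMA (density in persons per sq. mile).
population = {"a": 30_000, "b": 25_000, "c": 40_000, "d": 20_000}
density = {"a": 1000.0, "b": 1200.0, "c": 100.0, "d": 140.0}

# Two candidate ways to split the PUMA into two regions.
dense_vs_sparse = {"r1": ["a", "b"], "r2": ["c", "d"]}
mixed = {"r1": ["a", "c"], "r2": ["b", "d"]}

for name, regions in [("dense_vs_sparse", dense_vs_sparse), ("mixed", mixed)]:
    print(name, within_region_ssd(regions, density), meets_minimum(regions, population))
```

Grouping the two dense units together and the two sparse units together gives a much lower within-region variation and satisfies the minimum, whereas the mixed split has higher variation and leaves one region below 50,000.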
References
Inter-university Consortium for Political and Social Research. Historical, Demographic, Economic, and Social Data: The United States, 1790-1970. ICPSR00003-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 197. http://doi.org/10.3886/ICPSR00003.v1
Bogue, Donald. Census Tract Data, 1960: Elizabeth Mullen Bogue File. ICPSR02932-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2000. http://doi.org/10.3886/ICPSR02932.v1
United States Department of Commerce. Bureau of the Census. Census Tract-Level Data, 1960. ICPSR07552-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2007-12-13. http://doi.org/10.3886/ICPSR07552.v1
Guo, D. (2008). "Regionalization with Dynamically Constrained Agglomerative Clustering and Partitioning (REDCAP)". International Journal of Geographical Information Science. 22(7), pp. 801-823.