## Description

The CPUMA0010 variable supplies codes for the 0010 version of ConsPUMAs (Consistent Public Use Microdata Areas). Each 0010 ConsPUMA is an aggregation of one or more 2010 U.S. Census PUMAs (Public Use Microdata Areas) that, in combination, align closely (within a 1% population mismatch tolerance) with a corresponding set of 2000 PUMAs.

The 0010 ConsPUMAs are effectively the smallest geographic units that can be consistently identified from the geographic codes available in U.S. Census PUMS from 2000 and later (until 2020 PUMAs take effect sometime after the 2020 Census).

See the 0010 ConsPUMA Geographic Tools page for boundary files and detailed composition tables.

### PUMAs and ConsPUMAs

PUMAs are the smallest geographic units identified in U.S. Census Public Use Microdata Samples (PUMS) since 1990. PUMA definitions are altered after each decennial census, so PUMA codes are not consistently comparable across time.

To support spatio-temporal analysis of PUMS data, IPUMS defines ConsPUMAs as minimally aggregated sets of PUMAs that, when consolidated, align well across samples.

Different versions of ConsPUMAs correspond to different vintages of PUMAs. The 0010 version represents areas that are consistent between the 2000 and 2010 PUMAs.

A separate variable, CONSPUMA, identifies sets of 1980 county groups and 1990 and 2000 PUMAs that comprise comparable populations for samples from 1980 through 2011.

### Construction Process and Mismatch Errors

To construct 0010 ConsPUMAs, we applied an aggregation algorithm that groups together 2010 PUMAs iteratively until the total population mismatch between each set of 2010 PUMAs and its closest matching set of 2000 PUMAs falls below 1% for both the 2000 and 2010 populations.

Specifically, to compute mismatch errors, we first sum, for each intersection between 2000 and 2010 PUMAs, the populations of the census blocks whose centers fall within that intersection according to 2010 Census TIGER/Line files. Then, for each ConsPUMA, we compute the percent omission error (the percent of the population of its 2010 PUMAs that resides outside its 2000 PUMAs) and the percent commission error (the percent of the population of its 2000 PUMAs that resides outside its 2010 PUMAs). The sum of these two statistics is the final mismatch score.

We compute mismatch separately for 2000 and 2010 populations in order to ensure that the mismatch between the 2000 and 2010 PUMAs associated with each ConsPUMA is acceptably small (below 1%) at both times.
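The mismatch computation described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the IPUMS implementation: the `Intersection` record, the function names, the PUMA labels, and the toy populations are all hypothetical, and the sketch assumes the block populations have already been summed for each 2000/2010 PUMA intersection.

```python
from typing import Iterable, NamedTuple, Set


class Intersection(NamedTuple):
    """Block-based population of one 2000-PUMA / 2010-PUMA overlap (hypothetical layout)."""
    puma2000: str
    puma2010: str
    pop2000: int
    pop2010: int


def mismatch_pct(rows: Iterable[Intersection], set2000: Set[str],
                 set2010: Set[str], year: int) -> float:
    """Percent omission error plus percent commission error for one candidate ConsPUMA."""
    if year not in (2000, 2010):
        raise ValueError("year must be 2000 or 2010")
    field = "pop2000" if year == 2000 else "pop2010"
    pop_2010_set = outside_2000 = 0  # population of the 2010 PUMAs / part outside the 2000 PUMAs
    pop_2000_set = outside_2010 = 0  # population of the 2000 PUMAs / part outside the 2010 PUMAs
    for row in rows:
        pop = getattr(row, field)
        in_2000 = row.puma2000 in set2000
        in_2010 = row.puma2010 in set2010
        if in_2010:
            pop_2010_set += pop
            if not in_2000:
                outside_2000 += pop
        if in_2000:
            pop_2000_set += pop
            if not in_2010:
                outside_2010 += pop
    omission = 100.0 * outside_2000 / pop_2010_set if pop_2010_set else 0.0
    commission = 100.0 * outside_2010 / pop_2000_set if pop_2000_set else 0.0
    return omission + commission


def is_acceptable(rows, set2000, set2010, tolerance_pct=1.0):
    """True when the mismatch is below the tolerance for both 2000 and 2010 populations."""
    rows = list(rows)
    return all(mismatch_pct(rows, set2000, set2010, year) < tolerance_pct
               for year in (2000, 2010))


# Toy data, invented for illustration only.
rows = [
    Intersection("A1", "X1", 1000, 1100),
    Intersection("A1", "X2", 5, 4),      # small sliver of A1 that falls in X2
    Intersection("A2", "X2", 2000, 2100),
    Intersection("A3", "X3", 900, 950),
    Intersection("A3", "X4", 100, 120),  # sizeable part of A3 that falls in X4
    Intersection("A4", "X4", 1500, 1600),
]

print(is_acceptable(rows, {"A1"}, {"X1"}))              # sliver is under 1% in both years
print(is_acceptable(rows, {"A3"}, {"X3"}))              # too much of A3 lies outside X3
print(is_acceptable(rows, {"A3", "A4"}, {"X3", "X4"}))  # merging removes the mismatch
```

In the toy data, pairing A3 with X3 alone leaves a large share of A3's population outside X3, so the candidate fails the tolerance; grouping X3 with X4 (and A3 with A4) brings the mismatch to zero, which is the kind of merge the aggregation step is described as making.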

The CPUMA0010 Summary, available via the 0010 ConsPUMA Geographic Tools page, provides the mismatch errors for each ConsPUMA. That page also provides a 2000-2010 PUMA crosswalk file that includes the block-based 2000 and 2010 populations for each intersection between PUMAs.

The algorithmic approach we use for 0010 ConsPUMAs differs from the process used to construct the original CONSPUMA variable. In that case, researchers visually inspected boundaries and "hand selected" ConsPUMA sets whose boundaries were closely (if not exactly) in alignment. The visual approach can ensure minimal levels of spatial mismatch, but small areas of mismatch may occasionally contain substantial populations, and large areas of mismatch may contain very small populations. Therefore, the visual approach may occasionally merge PUMAs unnecessarily or fail to merge PUMAs where the population mismatch is in fact large. In contrast, the new population-based algorithm is more consistent and reliable with respect to population mismatch.

More information on the exact steps of the algorithm will be provided in a forthcoming paper.

## Codes

CPUMA0010 is a 4-digit numeric variable identifying aggregations of one or more 2010 PUMAs that, in combination, align closely (within a 1% population mismatch tolerance) with a corresponding set of 2000 PUMAs. Its values range in consecutive sequence from 1 to 1085, and each code is unique for the entire U.S. and Puerto Rico. Therefore, unlike PUMA and county group codes, CPUMA0010 codes are not state-dependent.
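The practical consequence of nationally unique codes can be shown with a short Python sketch. The records below are invented for illustration, including the pairing of PUMA and CPUMA0010 values; the use of `STATEFIP` as the state identifier and `PERWT` as the person weight is an assumption about the extract, and the helper names are hypothetical.

```python
from collections import defaultdict

CPUMA0010_MIN, CPUMA0010_MAX = 1, 1085  # code range stated in this section


def is_valid_cpuma0010(code):
    """Check that a code is an integer within the documented 1-1085 range."""
    return (isinstance(code, int) and not isinstance(code, bool)
            and CPUMA0010_MIN <= code <= CPUMA0010_MAX)


def weighted_totals(records, key_fields):
    """Sum the person weight within each combination of the given key fields."""
    totals = defaultdict(float)
    for record in records:
        key = tuple(record[field] for field in key_fields)
        totals[key] += record["PERWT"]
    return dict(totals)


# Invented records: the same PUMA code (100) appears in two different states.
records = [
    {"STATEFIP": 1, "PUMA": 100, "CPUMA0010": 1, "PERWT": 40.0},
    {"STATEFIP": 2, "PUMA": 100, "CPUMA0010": 20, "PERWT": 25.0},
    {"STATEFIP": 2, "PUMA": 200, "CPUMA0010": 20, "PERWT": 35.0},
]

assert all(is_valid_cpuma0010(r["CPUMA0010"]) for r in records)

by_puma_only = weighted_totals(records, ["PUMA"])             # wrongly pools two states
by_state_puma = weighted_totals(records, ["STATEFIP", "PUMA"])  # PUMA needs the state
by_cpuma = weighted_totals(records, ["CPUMA0010"])            # CPUMA0010 stands alone

print(by_puma_only)
print(by_state_puma)
print(by_cpuma)
```

Grouping on PUMA alone pools unrelated areas from different states, so a PUMA key has to include the state, whereas CPUMA0010 can serve as a grouping key by itself.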

See the 0010 ConsPUMA Geographic Tools page for boundary files and detailed composition tables.

## Comparability

The 0010 ConsPUMA construction algorithm allows for a small degree of mismatch between the 2000 PUMAs and 2010 PUMAs associated with each ConsPUMA, but in no case may the degree of mismatch (measured as the sum of percent omission and percent commission errors) exceed 1% in terms of either 2000 or 2010 population. Therefore, changes in the PUMAs associated with ConsPUMAs should have only a very small effect on comparability across samples.
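As an illustration of cross-sample use, the following Python sketch compares weighted totals for the same ConsPUMA in two sample years. It is only a sketch under assumptions: the records are invented, `YEAR` and `PERWT` are assumed to be the sample year and person weight in the extract, and the function names are hypothetical.

```python
from collections import defaultdict


def totals_by_cpuma_year(records):
    """Sum person weights for each (CPUMA0010, YEAR) pair."""
    totals = defaultdict(float)
    for record in records:
        totals[(record["CPUMA0010"], record["YEAR"])] += record["PERWT"]
    return dict(totals)


def change_between(totals, start_year, end_year):
    """Change in weighted total per ConsPUMA, for codes present in both years."""
    codes = sorted({code for code, _ in totals})
    return {
        code: totals[(code, end_year)] - totals[(code, start_year)]
        for code in codes
        if (code, start_year) in totals and (code, end_year) in totals
    }


# Invented records for two ConsPUMAs observed in 2000 and 2010 samples.
records = [
    {"YEAR": 2000, "CPUMA0010": 1, "PERWT": 100.0},
    {"YEAR": 2000, "CPUMA0010": 1, "PERWT": 50.0},
    {"YEAR": 2010, "CPUMA0010": 1, "PERWT": 180.0},
    {"YEAR": 2000, "CPUMA0010": 2, "PERWT": 70.0},
    {"YEAR": 2010, "CPUMA0010": 2, "PERWT": 60.0},
]

totals = totals_by_cpuma_year(records)
change = change_between(totals, 2000, 2010)
print(change)
```

Because each code refers to nearly the same population footprint in both years, differences computed this way mostly reflect change within the area; the residual boundary mismatch is bounded by the 1% tolerance described above.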

To identify the mismatch error for any 0010 ConsPUMA, see the CPUMA0010 Summary available via the 0010 ConsPUMA Geographic Tools page.

## Universe

- All households and group quarters.

## Availability

### United States

- 2020: All samples
- 2019: All samples
- 2018: All samples
- 2017: All samples
- 2016: All samples
- 2015: All samples
- 2014: All samples
- 2013: All samples
- 2012: All samples
- 2011: All samples
- 2010: All samples
- 2009: All samples
- 2008: All samples
- 2007: All samples
- 2006: All samples
- 2005: All samples
- 2004: --
- 2003: --
- 2002: --
- 2001: --
- 2000: 5%; 1% unwt
- 1990: --
- 1980: --
- 1970: --
- 1960: --
- 1950: --
- 1940: --
- 1930: --
- 1920: --
- 1910: --
- 1900: --
- 1880: --
- 1870: --
- 1860: --
- 1850: --

### Puerto Rico

- 2020: All samples
- 2019: All samples
- 2018: All samples
- 2017: All samples
- 2016: All samples
- 2015: All samples
- 2014: All samples
- 2013: All samples
- 2012: All samples
- 2011: All samples
- 2010: All samples
- 2009: All samples
- 2008: All samples
- 2007: All samples
- 2006: All samples
- 2005: All samples
- 2000: PR 5%
- 1990: --
- 1980: --
- 1970: --
- 1930: --
- 1920: --
- 1910: --

## Flags

This variable has no flags.

## Editing Procedure

There is no editing procedure available for this variable.