Description
HISPAN identifies persons of Hispanic/Spanish/Latino origin and classifies them according to their country of origin when possible. Origin is defined by the Census Bureau as ancestry, lineage, heritage, nationality group, or country of birth. People of Hispanic origin may be of any race; see RACE for a discussion of coding issues involved. Users should note that race questions were not asked in the Puerto Rican censuses of 1970, 1980 and 1990. They were asked in the 1910 and 1920 Puerto Rican censuses, in the 2000 and 2010 Puerto Rican censuses, and in the PRCS. However, questions assessing Spanish/Hispanic origin were not asked in the Puerto Rican censuses prior to 2000.
The HISPAN general code covers country-of-origin classifications common to all years; the detailed code distinguishes additional groups and subgroups. See HISPRULE for details on how country of origin information was assigned prior to 1980.
In 2020, the Census Bureau updated the questionnaire text, processing, and coding of the race and Hispanic origin questions, resulting in major changes to the distribution of race and Hispanic origin categories. As a result, users should proceed with caution when comparing HISPAN and RACE in 2019-prior samples with 2020-onward samples. See the comparability tab for more details.
Codes and Frequencies
Comparability
The collection, processing, and coding of the Hispanic origin question have varied over time, affecting the comparability of HISPAN across samples and years. Notable changes occurred in the 1980, 2000, and 2020 decennial censuses and in the 2008 ACS/PRCS; they are described below.
Between 1970 and 1980, there is a major break in the way HISPAN was derived. Before 1980, all HISPAN codes were imputed by IPUMS using the method described below and in HISPRULE. In 1980 and in later years, HISPAN is based on a question on the enumeration form asking about Hispanic origin.
To impute Hispanic origin prior to 1980, we used the method described in "Hispanics in the United States, 1850-1990: Estimates of Population Size and National Origin" by Brian Gratton and Myron Gutmann, in Historical Methods (2000) 33:137-153. An individual could be identified as Hispanic through any one of eight criteria based on Hispanic birthplace, parental birthplace, grandparental birthplace, Spanish surname, and/or family relationship to a person with one of these characteristics. See HISPRULE for a detailed explanation of these criteria and how they were applied.
Not all source variables are available in all years or in all cases in the years before 1980. SPANNAME was only coded in five southwestern states in 1960 and 1970. Information about parents' birthplaces is only available for sample-line respondents in 1940 and 1950, for native-born individuals in the 1960 1 percent sample, and for the Form 2 samples in 1970. Because of uneven coverage, we do not use parental birthplace information to impute HISPAN in 1940 or 1950.
Although Hispanic origin was assessed through a direct question in the 1970 Form 1 samples, as it was in 1980-2010, the ACS and the PRCS, the Census Bureau has since found that the 1970 data is not comparable with the data from the later years. Several factors contribute to this lack of comparability, including changes in the order and availability of response categories and changes in the Census Bureau's public relations strategies (see page B-13 of the original 1990 codebook for more details). The original 1970 data is available in the HISP1970 variable. HISPAN contains a version of the variable that is imputed using the same procedures as were used for all pre-1980 samples.
The Hispanic origin question was similar across the 1980-2010 U.S. censuses, the 2000-2010 Puerto Rican censuses, the ACS, and the PRCS. The Census Bureau reported that the 1980 and 1990 data were "generally comparable." One category (Paraguayan, code 425) was suppressed in the 2000 5 percent sample but not in the 1 percent sample; in the former, Paraguayans are included among other South Americans in code 431.
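When the 2000 5 percent sample is pooled with samples that keep code 425, one possible approach is to fold 425 into 431 everywhere so that the detailed codes line up. The sketch below assumes detailed HISPAN codes are held as plain integers; the function name `harmonize_paraguayan` is hypothetical and is not part of any IPUMS tool.

```python
# Detailed HISPAN codes taken from the text above.
PARAGUAYAN = 425
SOUTH_AMERICAN_NEC = 431  # where the 2000 5 percent sample places Paraguayans


def harmonize_paraguayan(hispan_detailed: int) -> int:
    """Fold code 425 into 431 so every sample matches the 2000 5 percent coding."""
    if hispan_detailed == PARAGUAYAN:
        return SOUTH_AMERICAN_NEC
    return hispan_detailed


# Only codes mentioned in this document are used in the example.
codes = [425, 431, 417, 499]
harmonized = [harmonize_paraguayan(c) for c in codes]
```

This trades detail for consistency: after the recode, Paraguayans can no longer be separated from other South Americans in any sample.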
Response rates in 2000 may have been especially high because of two changes in the questionnaire. First, the Hispanic origin question was placed before the race question. Second, a new instruction asked respondents to answer both the race question and the Hispanic origin question. Beginning in 2000, the procedure for allocating missing values also changed; the respondent's race(s) and Spanish surname were taken into account, unlike in previous years.
In the 2008 ACS, wording changes to the questionnaire (e.g., switching to "Hispanic origin" instead of "Hispanic") likely identified Hispanics--mostly native-born--who would not have been captured before. For more information, see this Census Bureau research note.
The classification of people in smaller Hispanic groups has varied. The 1980 census simply used an "other" check-box, while the 1990 census, 2000 census, 2010 census, the ACS and the PRCS asked respondents to write in their specific origin if it was not listed. Additionally, the 2008 ACS and PRCS gave six examples of these origins: Argentinean, Colombian, Dominican, Nicaraguan, Salvadoran, and Spaniard. Detailed write-in responses that were not classified are coded 499 "Other, not elsewhere classified" in 1980 and 1990. In 2000, the ACS, and the PRCS, codes 417 ("Central American, n.e.c.") and 431 ("South American, n.e.c.") were added. Respondents in 1980-2010, the ACS and the PRCS who marked the "other" box but did not write anything further are coded 498 "Other, not specified." In earlier years, code 498 consists of people who were allocated Hispanic origin through their Hispanic surname but whose country of origin could not be imputed through their state of residence (see HISPRULE).
In 2020, the Census Bureau updated the questionnaire text, processing, and coding of the race and Hispanic origin questions, resulting in major changes to the frequency distributions of the race and Hispanic origin categories. These updates were first implemented in the 2020 decennial census and were then implemented in the 2020-onward ACS and PRCS samples. These changes reflect a major revision to the race and Hispanic origin questions, and as a result, users should proceed with caution when comparing HISPAN and RACE in 2019-prior samples with 2020-onward samples.
The Hispanic origin question was updated to include a fourth response option, “Yes, another Hispanic, Latino, or Spanish origin,” and a write-in space for additional responses. In addition, the instruction to “Print origin, for example” was changed to “Print, for example.” The example groups were revised from “Argentinean, Colombian, Dominican, Nicaraguan, Salvadoran, Spaniard, and so on.” to “Salvadoran, Dominican, Colombian, Guatemalan, Spaniard, Ecuadorian, etc.” to represent the largest Hispanic origin population groups and the geographic diversity of the Hispanic or Latino category, as defined by the U.S. Office of Management and Budget 1997 standards. Response processing and coding for race and Hispanic origin questions was also updated. This was done by increasing the number of coded write-in responses recorded in each write-in area from two to six responses, increasing the number of characters coded in each write-in area from 30 to 200 characters, and using a single code list for both the Hispanic origin and race questions rather than separate coding lists.
For more detail about these changes, see the Census Bureau’s blog post on changes to the questionnaire text, processing, and coding as well as compliance with standards set by the U.S. Office of Management and Budget.
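For pooled analyses, it can help to tag each record with the measurement regime its HISPAN value comes from. The sketch below uses only the two largest breaks described in this section (imputation before 1980, and the 2020 revision); the function names `hispan_era` and `same_era` are hypothetical, and the smaller 2000 and 2008 changes are deliberately not modeled.

```python
def hispan_era(year: int) -> str:
    """Classify a sample year by how HISPAN was obtained."""
    if year < 1980:
        return "imputed"  # derived by IPUMS, see HISPRULE
    if year < 2020:
        return "direct question, pre-2020"
    return "direct question, 2020 revision"


def same_era(year_a: int, year_b: int) -> bool:
    """True when two sample years fall on the same side of both major breaks.

    This is a coarse screen only: the 2000 and 2008 questionnaire changes
    still apply within the middle era.
    """
    return hispan_era(year_a) == hispan_era(year_b)
```

A `False` result from `same_era(2019, 2020)` is a prompt to review the comparability notes above, not a statement that the samples cannot be compared at all.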
Universe
- All persons.
Availability
- 2023: All samples
- 2022: All samples
- 2021: All samples
- 2020: All samples
- 2019: All samples
- 2018: All samples
- 2017: All samples
- 2016: All samples
- 2015: All samples
- 2014: All samples
- 2013: All samples
- 2012: All samples
- 2011: All samples
- 2010: All samples
- 2009: All samples
- 2008: All samples
- 2007: All samples
- 2006: All samples
- 2005: All samples
- 2004: All samples
- 2003: All samples
- 2002: All samples
- 2001: All samples
- 2000: All samples
- 1990: All samples
- 1980: All samples
- 1970: All samples
- 1960: All samples
- 1950: 1%
- 1940: All samples
- 1930: All samples
- 1920: All samples
- 1910: All samples
- 1900: All samples
- 1880: All samples
- 1870: All samples
- 1860: All samples
- 1850: All samples
Puerto Rico samples:
- 2023: All samples
- 2022: All samples
- 2021: All samples
- 2020: All samples
- 2019: All samples
- 2018: All samples
- 2017: All samples
- 2016: All samples
- 2015: All samples
- 2014: All samples
- 2013: All samples
- 2012: All samples
- 2011: All samples
- 2010: All samples
- 2009: All samples
- 2008: All samples
- 2007: All samples
- 2006: All samples
- 2005: All samples
- 2000: All samples
- 1990: --
- 1980: --
- 1970: --
- 1930: --
- 1920: --
- 1910: --
Flags
QHISPAN
Editing Procedure
HISPAN (Hispanic origin) and RACE (Race)
ACS Years: 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015
ACS editing procedure:
There are two questions on race and ethnicity in the ACS. The first asks respondents if they are Spanish/Hispanic/Latino. There is a "No" checkbox and three subgroups identified in separate checkboxes: 1. Mexican, Mexican-American, Chicano, 2. Puerto Rican, and 3. Cuban. Then there is a final checkbox for "other Spanish/Hispanic/Latino" and a box to write in a specific group.
The question on race has multiple checkboxes for different races, and four checkboxes also have a write-in box. The American Indian or Alaska Native box includes a write-in box for the enrolled or principal tribe. The "Other Asian," "Other Pacific Islander" and "Other race" checkboxes likewise include a write-in box for a more specific race.
IPUMS reports both detailed codes for RACE and broader codes. The edits below refer to the detailed codes. The flag variables (QHISPAN and QRACE) will indicate when the values were edited or allocated.
Donald Duck cases
If a respondent checks all race boxes (a "Donald Duck" case), RACE will be replaced with a missing value and later allocated. If all of the Hispanic checkboxes are selected, they will be blanked.
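The all-boxes-checked rule can be sketched as a simple set comparison. The checkbox names below are only the ones mentioned in this document, not the full list on the ACS form, and the function name `blank_if_all_checked` is hypothetical.

```python
from typing import Optional

# Illustrative subset of race checkboxes named in this document; the real
# form has more boxes than these.
RACE_BOXES = frozenset({
    "White",
    "Black",
    "American Indian or Alaska Native",
    "Other Asian",
    "Other Pacific Islander",
    "Other race",
})


def blank_if_all_checked(checked: set, all_boxes: frozenset = RACE_BOXES) -> Optional[set]:
    """Return None (missing, to be allocated later) when every box is marked."""
    if checked >= all_boxes:  # superset test: every available box is checked
        return None
    return checked
```

Returning `None` rather than an empty set keeps "blanked by the edit" distinct from "respondent checked nothing," which matters if the two are flagged differently downstream.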
Geographically dependent editing
Some editing of race and Hispanic origin depends on where the respondent is currently living. For example, if a person lives in South Carolina and reports their race as "Turk," it will be replaced with "White." If a person reports their race as "Wales" and doesn't live in Alaska, it will be replaced with "White." If a person in Humboldt County, CA reports being "Trinidad", their race will be replaced with "American Indian." In 2010 and later, if a person in Humboldt County, CA reports being "Tobago", their race will be replaced with "American Indian."
If a person reports their race as "Moor" and American Indian or Alaska Native and lives in Cumberland County, New Jersey, RACE will be replaced with "Moor" as an American Indian tribe. If they do not live in Delaware or New Jersey, RACE will be replaced with "Black."
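The single-write-in examples in this subsection ("Turk," "Wales," "Trinidad," "Tobago") can be expressed as a small rule function. This is a sketch under assumed inputs: the state abbreviations, the county string, and the function name `geo_edit_race` are illustrative choices, and the "Moor" rules are left out because they depend on more than one reported value.

```python
from typing import Optional


def geo_edit_race(write_in: str, state: str, county: Optional[str], year: int) -> str:
    """Apply the geography-dependent examples above to one race write-in."""
    in_humboldt = state == "CA" and county == "Humboldt"
    if write_in == "Turk" and state == "SC":
        return "White"
    if write_in == "Wales" and state != "AK":
        return "White"
    if write_in == "Trinidad" and in_humboldt:
        return "American Indian"
    # The "Tobago" rule applies only in 2010 and later.
    if write_in == "Tobago" and in_humboldt and year >= 2010:
        return "American Indian"
    return write_in  # no geographic edit applies
```

For instance, `geo_edit_race("Tobago", "CA", "Humboldt", 2009)` leaves the write-in unchanged, while the same call with `2010` returns `"American Indian"`.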
American Indian
If a person reports being American Indian and also reports being Cajun or Wesort in one of the write-in boxes, and none of the checkboxes for "Other race," "Other Asian," or "Other Pacific Islander" are checked, RACE will be replaced with just "American Indian."
If a person reports being American Indian and Mexican, RACE will be replaced with "Mexican American Indian."
If a person selects "American Indian" or "Indian" but offers no specific tribe or ethnicity, other household members are used to distinguish between Asian Indian and American Indian. If no other members of the household are American Indian, and all other members of the household are Asian, that person's RACE will be replaced with "Asian Indian." If the person reports being "Indian," no other members of the household are Asian Indian, and all other members of the household are American Indian, that person's RACE will be replaced with "American Indian." If the person reports being "American Indian," no other members of the household are either Asian Indian or American Indian, and all other members of the household are Indian, that person's RACE will be replaced with "Indian."
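The household-based disambiguation of "Indian" and "American Indian" can be sketched as follows. The set of labels treated as Asian is an illustrative assumption (only "Asian Indian" and "Filipino" appear in this document), a person living alone is assumed to be left unchanged, and the function name `resolve_indian` is hypothetical.

```python
# Assumed, illustrative set of race labels counted as "Asian".
ASIAN_LABELS = frozenset({"Asian Indian", "Chinese", "Filipino"})


def resolve_indian(reported: str, others: list, asian_labels=ASIAN_LABELS) -> str:
    """Resolve an unspecific 'Indian' / 'American Indian' report.

    `others` holds the race labels of the other household members.
    """
    if not others:
        return reported  # assumption: nothing to infer from
    if reported in ("American Indian", "Indian"):
        no_american_indian = "American Indian" not in others
        if no_american_indian and all(o in asian_labels for o in others):
            return "Asian Indian"
    if reported == "Indian" and all(o == "American Indian" for o in others):
        return "American Indian"
    if reported == "American Indian" and all(o == "Indian" for o in others):
        return "Indian"
    return reported
```

The three branches follow the order of the rules in the paragraph above; households that match none of them keep the reported value.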
If a person reports their race as "Half-breed," it will be blanked.
General responses will be blanked when detailed response is available
If there are more detailed responses given in the write-in boxes, the checkbox for "Other race" will be blanked. For non-relatives (RELATE), the checkboxes for "American Indian or Alaska Native," "Other Asian," and "Other Pacific Islander" will be blanked when there is a more detailed write-in value given.
When a checkbox appears to contradict the write-in value, the checkbox will be blanked. For example, if the "White" checkbox is selected and a black ethnicity is written in the write-in box, the "White" checkbox will be blanked. Likewise, if the "Black" checkbox is selected but a white ethnicity is written in, the checkbox will be blanked.
If the write-in value is more detailed than the checkbox, the checkbox will also be blanked (e.g., the "White" checkbox is selected and then a specific white ethnicity is reported in the write-in box). This also applies to the HISPAN variable. For example, if a person selects the "Mexican" checkbox and then reports a more specific Mexican value in the write-in box, the checkbox will be blanked and the more specific value will be used.
If a person reports two write-in values where one is specific and the other is a more general version of that same ethnicity, the less specific value will be blanked. For example, if a person reports two American Indian or Alaska Native write-in values and one is specific while the other is not, the less specific value will be blanked. This also applies to the HISPAN variable. For example, if a person writes in a specific value for a Hispanic ethnicity and also a general version of the same ethnicity, the general version will be blanked.
If a person includes multiple write-in codes that are all white ethnicities, RACE will be replaced with "Multiple white." If a person includes multiple write-in codes that are all black ethnicities, RACE will be replaced with "Multiple black." If a person includes multiple write-in codes that are all other races, RACE will be replaced with "Multiple other." If a person has multiple checkboxes and a "Multiple..." value, the "Multiple..." value will be removed.
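The "Multiple ..." collapse for write-in codes can be illustrated with a short function. It assumes each write-in has already been classified into one of three broad groups ("white", "black", "other") by some earlier step that is not shown; the function name `collapse_write_ins` is hypothetical.

```python
from typing import Optional

MULTIPLE_LABEL = {
    "white": "Multiple white",
    "black": "Multiple black",
    "other": "Multiple other",
}


def collapse_write_ins(groups: list) -> Optional[str]:
    """Return a 'Multiple ...' label when two or more write-ins share one group.

    `groups` holds the broad group of each write-in code. None means the
    rule does not apply (a single write-in, or write-ins from mixed groups).
    """
    if len(groups) > 1 and len(set(groups)) == 1:
        return MULTIPLE_LABEL.get(groups[0])
    return None
```

The final rule in the paragraph, dropping the "Multiple ..." value when several checkboxes are also marked, would be applied after this step and is not sketched here.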
Too many values
If a person has more than 8 races selected, RACE will be replaced with missing and later allocated.
If a person reports not being Hispanic and writes in a value in the write-in box that is also not Hispanic, the write-in value will be blanked.
If a person has multiple Hispanic values reported, HISPAN will be replaced with "Multiple Hispanic." If there are mixed Hispanic and non-Hispanic values, and the person's surname is Hispanic, HISPAN will be replaced with "Mixed Hispanic." If there are mixed Hispanic and non-Hispanic values, and the person's surname is not Hispanic, HISPAN will be randomly replaced with "Mixed Hispanic" or "Non-Hispanic."
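The multiple and mixed Hispanic rules can be sketched as below. Several points are assumptions, since the text does not specify them: the random choice is taken to be an even split, the mixed-value check is applied before the multiple-value check, and each reported value arrives already tagged as Hispanic or not. The function name `collapse_hispan` is hypothetical.

```python
import random
from typing import Optional


def collapse_hispan(values: list, surname_is_hispanic: bool,
                    rng: random.Random) -> Optional[str]:
    """Collapse a person's reported origin values into one HISPAN label.

    `values` is a list of (label, is_hispanic) pairs.
    """
    if not values:
        return None  # nothing reported; left for later edits
    hispanic = [label for label, is_h in values if is_h]
    non_hispanic = [label for label, is_h in values if not is_h]
    if hispanic and non_hispanic:
        if surname_is_hispanic:
            return "Mixed Hispanic"
        # Assumed 50/50 split; the actual probabilities are not documented here.
        return rng.choice(["Mixed Hispanic", "Non-Hispanic"])
    if len(hispanic) > 1:
        return "Multiple Hispanic"
    if hispanic:
        return hispanic[0]
    return "Non-Hispanic"
```

Passing in a seeded `random.Random` instance keeps the random branch reproducible across runs.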
Starting in 2010, among those with more than 8 race codes, the reported races are prioritized and certain race groups are kept. For example, if a person reports more than 8 races and more than 3 of them are American Indian or Alaska Native races, two detailed race groups will be prioritized and kept, and the next will be replaced with "Multiple American Indian and Alaska Native responses." The remaining American Indian or Alaska Native values will be blanked. A similar process occurs for other racial groups until everyone has 8 or fewer race codes.
Missing or inconsistent Hispanic Origin and Race
If a person is missing a value for HISPAN but has a detailed value for RACE that indicates a Hispanic origin, HISPAN will take on the detailed race value. For example, if a person reports their race as "Argentinian" and has a missing value for Hispanic origin, HISPAN will be replaced with "Argentinian."
Starting in 2010, if a person reports their detailed race (RACE) as a Hispanic group, but selects "Non-Hispanic" for HISPAN, HISPAN will be replaced with the value from RACE. For example, if a person reports their race as "Argentinian" but their Hispanic origin as "Non-Hispanic," HISPAN will be replaced with "Argentinian."
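The two race-to-HISPAN rules above differ only in whether HISPAN is missing or explicitly "Non-Hispanic," and in the year from which they apply. A minimal sketch follows, assuming missing values are represented as `None` and that the caller supplies the set of race write-ins that indicate a Hispanic origin; only "Argentinian" comes from the text, and the function name `fill_hispan_from_race` is hypothetical.

```python
from typing import Optional

# Only "Argentinian" is named in this document; a real list would be longer.
HISPANIC_RACE_LABELS = frozenset({"Argentinian"})


def fill_hispan_from_race(hispan: Optional[str], race: Optional[str], year: int,
                          hispanic_race_labels=HISPANIC_RACE_LABELS) -> Optional[str]:
    """Copy a Hispanic-indicating detailed race into HISPAN where the rules allow."""
    if race not in hispanic_race_labels:
        return hispan
    if hispan is None:
        return race  # missing HISPAN is filled from RACE
    if hispan == "Non-Hispanic" and year >= 2010:
        return race  # inconsistent report, overridden from 2010 onward
    return hispan
```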
If a person has a value for Hispanic origin, but not race, RACE will be the value of HISPAN. There are some exceptions when a Hispanic ethnicity is white, in which case RACE will be replaced with "White." This includes many values, for example "Portuguese," "Azorean," "Andalusian," and "Asturian."
If, after the above edits, the values for RACE and HISPAN remain missing, they will be replaced with the values from another household member. If RACE is missing, but not Hispanic origin, RACE will be replaced with the value of another household member who has the same value for HISPAN. If HISPAN is missing, but not race, HISPAN will be replaced with the value of another household member who has the same value for RACE.
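Borrowing values from another household member can be sketched as a single pass over the other members. The text does not say which member is chosen when several qualify, so taking the first match in list order is an assumption; records are assumed to be dictionaries with `None` for missing values, and the function name `borrow_from_household` is hypothetical.

```python
def borrow_from_household(person: dict, others: list) -> dict:
    """Fill missing RACE and/or HISPAN from the first suitable household member."""
    race, hispan = person.get("RACE"), person.get("HISPAN")
    for other in others:
        o_race, o_hispan = other.get("RACE"), other.get("HISPAN")
        if race is None and hispan is None:
            # Both missing: take both values from a member who has both.
            if o_race is not None and o_hispan is not None:
                return {**person, "RACE": o_race, "HISPAN": o_hispan}
        elif race is None:
            # RACE missing: the donor must share the person's HISPAN.
            if o_hispan == hispan and o_race is not None:
                return {**person, "RACE": o_race}
        elif hispan is None:
            # HISPAN missing: the donor must share the person's RACE.
            if o_race == race and o_hispan is not None:
                return {**person, "HISPAN": o_hispan}
    return dict(person)  # nothing to fill, or no suitable donor
```

Records that come back still missing a value would then go on to the allocation step.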
Write-in value inconsistent with checkbox
If a person selects "American Indian or Alaska Native," "Other Asian," or "Other Pacific Islander" but has a write-in value that is inconsistent with the checkbox, other household members will be used to infer the correct RACE. If no relative in the household has a value for RACE that is consistent with the checkbox, it will be blanked.
Allocation of missing values
For those who are still missing both RACE and HISPAN after the above edits, the values will be allocated. For the reference person (RELATE), the allocated values will be drawn from another reference person of similar AGE whose surname has a similar Spanish-surname status. If the surname is missing or is not clearly Hispanic or non-Hispanic, the values will be drawn from another reference person of similar age.
If a person is missing RACE but has a value for HISPAN, the allocated value of RACE will be drawn from someone with a similar value for HISPAN and AGE.
If a person is missing HISPAN but has a value for RACE, the allocated value of HISPAN will be drawn from someone with a similar Spanish-surname status, RACE, and AGE. If the person is Filipino or American Indian, the value will be drawn from another person with a similar race and age. If the surname is missing or is not clearly Hispanic or non-Hispanic, the value will be drawn from another person with a similar race and age.
If a person has only "Some other race" as their race, but another family member with the same value for HISPAN has a more detailed value, that family member's value will replace "Some other race."
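The allocation rules in this section for a record missing only RACE or only HISPAN amount to drawing a value from a donor with similar characteristics. The sketch below is one possible reading, not the Census Bureau's procedure: "similar" is interpreted as an exact match on the listed characteristics plus the smallest age difference, donors are assumed to be complete records, and the field name `SPANISH_SURNAME` (with `None` for a missing or unclear surname) and all function names are hypothetical.

```python
from typing import Optional


def nearest_age_donor(recipient: dict, donors: list, match_keys: tuple) -> Optional[dict]:
    """Return the donor that matches on match_keys and is closest in AGE."""
    pool = [
        d for d in donors
        if all(d.get(k) == recipient.get(k) for k in match_keys)
    ]
    if not pool:
        return None
    return min(pool, key=lambda d: abs(d["AGE"] - recipient["AGE"]))


def allocate_race(recipient: dict, donors: list) -> Optional[str]:
    """RACE missing, HISPAN present: match on HISPAN, then nearest AGE."""
    donor = nearest_age_donor(recipient, donors, ("HISPAN",))
    return donor["RACE"] if donor else None


def allocate_hispan(recipient: dict, donors: list) -> Optional[str]:
    """HISPAN missing, RACE present.

    Match on surname status and RACE, except when the surname status is
    unknown or the person is Filipino or American Indian, where only RACE
    (and nearest AGE) is used.
    """
    race_only = (
        recipient.get("SPANISH_SURNAME") is None
        or recipient.get("RACE") in ("Filipino", "American Indian")
    )
    keys = ("RACE",) if race_only else ("SPANISH_SURNAME", "RACE")
    donor = nearest_age_donor(recipient, donors, keys)
    return donor["HISPAN"] if donor else None
```

A production hot deck would typically use age bands and a rotating donor pool rather than a nearest-age search; the simpler form is used here to keep the matching logic visible.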