Revisions Made to IPUMS

Below is a list of significant changes made to the IPUMS-USA database and documentation since 1995. Earlier versions of the IPUMS database and documentation are available via the IPUMS archive page.

For changes and data releases that we have planned for the future, see our data release schedule.  


April 11, 2008.  Posted IPUMS Version 4.0, the first major revision of the IPUMS files since 2004. It includes revised versions of all samples from 1850-1930, a new 1880 5% sample, and 13 new samples from the Puerto Rican Censuses of 1910-2000 and the Puerto Rican Community Survey. IPUMS 4.0 contains many new variables, including long-term Hispanic identification back to 1850 (HISPAN), a consistent single-race identification variable from 1850-2006 (RACESING), a battery of socioeconomic indices, original strings for occupation (OCCSTRNG) and industry (INDSTRNG), new detailed weight variables for the historical samples (HHWTDET and PERWTDET), and new standardized low-level geographic identifiers (MCD and INCORP). More information is available on the IPUMS 4.0 release page.

The most recent previous version of IPUMS data and documentation (IPUMS 3.0) is still available via the IPUMS archive page at ICPSR. The archive page permits users to revise old extracts, create new extracts, and download data and documentation. The link titled "IPUMS-USA website as of March, 2008" leads to a fully functioning mirror of the IPUMS website as it existed prior to the release of IPUMS 4.0. The archive page contains versions of the website from previous years as well.

February 14, 2008.  Posted a new version of the 1950 census sample, with a correction made to the BPL variable.  Several cases that had been erroneously coded "Missing/blank" are now coded correctly as follows: 94 cases coded "Israel," 9 coded "Byelorussia," and 3 coded "Pakistan." In the 2000 census samples, changed the MIGMET5 code for Hattiesburg, MS from 3285 to 3300 to be consistent with our METAREA coding.

Re-released VALUEH for the 2006 ACS sample; during a recent website update, VALUEH had inadvertently been removed from the data extract system.

December 14, 2007.  Posted new versions of the 2005 and 2006 ACS samples: released CITYPOP for both samples. Fixed a small error in QCONDOFE and QVALUEH in the 2006 sample. Prior to the update, a small number of cases had missing values for these two variables.

Posted new versions of the 1% and 5% census samples for 2000: fixed PUMALAND, PUMAAREA, and ACREPROP.  Prior to this correction, these three variables contained incorrect data.

November 15, 2007.  Posted new versions of all ACS samples; a correction was made to the BUILTYR2 variable. Previously, households built in 1939 or earlier (BUILTYR2 = 10) were grouped with those reported as being built in 2005 or later (BUILTYR2 = 1).

October 15, 2007. Added a 1% sample from the 2006 American Community Survey (ACS). The sample has approximately 2,970,000 person records. The 2006 ACS is the first ACS sample to provide information on group quarters, which can be identified in the GQ variable. Researchers analyzing multiple ACS samples over time should remove group quarters cases, since they are available only in the 2006 data.
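As a rough illustration of the advice above, the following Python sketch drops group quarters records before pooling ACS samples. It assumes the extract rows have already been parsed into dictionaries, and it assumes that GQ codes 3 and 4 mark group quarters; that coding is an assumption here and should be checked against the GQ variable description. The helper name and the sample rows are hypothetical.

```python
# Assumption: these GQ codes mark group quarters; verify against the GQ codebook.
ASSUMED_GROUP_QUARTERS_CODES = {3, 4}


def drop_group_quarters(records, gq_codes=ASSUMED_GROUP_QUARTERS_CODES):
    """Return only the records whose GQ value is not a group quarters code."""
    return [row for row in records if row["GQ"] not in gq_codes]


# Hypothetical person records pooled from two ACS samples.
pooled = [
    {"YEAR": 2005, "GQ": 1, "AGE": 34},
    {"YEAR": 2006, "GQ": 1, "AGE": 51},
    {"YEAR": 2006, "GQ": 3, "AGE": 29},  # assumed group quarters code
    {"YEAR": 2006, "GQ": 4, "AGE": 19},  # assumed group quarters code
]

comparable = drop_group_quarters(pooled)
print(len(comparable))
```

Passing the set of codes as a parameter keeps the filter easy to adjust if the codebook lists different values.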

The lowest level of geographic identifier in the 2006 ACS is the PUMA; 2006 PUMAs have the same boundaries as those in the 2005 ACS and the 2000 census samples. The IPUMS version of the 2006 ACS provides the following additional geographic identifiers: CITY, METAREA, METRO, PUMASUPR, MIGTYPE1, MIGMET1, MIGCITY1, MIGPUMS1, PWTYPE, PWMETRO, PWCITY, and PWPUMAS. These variables were constructed at the University of Minnesota and are not available via the Census Bureau.

More information on the background and future plans for the American Community Survey is available at the ACS information page.

August 17, 2007. Added HHTYPE for all samples from 1940 to 2005. In the future, HHTYPE will also be made available for all samples from 1850-1930.

July 25, 2007. Added RACESING to all samples from 1900, 1910, and 1940. Added HISPAN and HISPRULE to all samples from 1910.

July 19, 2007. Posted updated versions of all samples from 1900, 1910, and 1930. Corrections were made to the CHSURV variable in the 1900 and 1910 samples. All values were previously 0 or 1. The updated samples contain correct values. Corrections were made to the NSIBS variable in the 1900, 1910, and 1930 samples. Previously, a small number of persons identified as "siblings" in the RELATE variable (code 701) incorrectly received a value of 0 for NSIBS. This error has been corrected.

July 10, 2007. Posted updated versions of all samples from 1970 and 1980, and the ACS samples from 2000, 2001, and 2002. The updated samples include fixes to COSTELEC, COSTGAS, COSTFUEL, and COSTWATR.  These variables did not properly identify cases having values of greater than 9990. All cases in this range are in the universe but have unreported values, usually because utility costs were included in rent payments. The old versions of the datasets incorrectly identified these cases as not being in the universe. COSTELEC and COSTGAS had the additional problem of presenting monthly values instead of annual values. These problems are now fixed.
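One way an analyst might handle these cases is sketched below in Python: values above 9990 stay in the universe but are treated as unreported when computing an average. The threshold comes from the note above; the function names and the example values are hypothetical, and the handling of not-in-universe codes is left out because it depends on each variable's codebook.

```python
UNREPORTED_THRESHOLD = 9990  # values above this are in universe but unreported


def reported_cost(value):
    """Return the cost, or None when the value is an unreported code."""
    return None if value > UNREPORTED_THRESHOLD else value


def mean_reported_cost(values):
    """Average only the reported costs; return None if none are reported."""
    reported = [v for v in map(reported_cost, values) if v is not None]
    return sum(reported) / len(reported) if reported else None


# Hypothetical COSTELEC values; 9997 stands in for any value above the threshold.
costelec = [600, 1200, 9997, 900]
print(mean_reported_cost(costelec))
```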

The new 1980 5% sample additionally fixes a problem in the CITY variable. In the old sample, San Francisco was incorrectly identified. It has been corrected.

June 21, 2007. Posted an updated version of the 1930 1% sample. The updated sample includes fixes of minor problems in OCCSCORE (missing occupation data was not being allocated), YRSUSA2 (some allocated values were inconsistent with YRSUSA1), and QMARST (this variable indicated that we made more logical edits than were actually made).

Released new data extraction system with the "Attach Variables" feature, which allows researchers to create variables specifying characteristics of respondents' spouses, mothers, fathers, and household heads.

Released new version of the 2005 American Community Survey sample that includes 160 replicate weight variables (see REPWT and REPWTP).

Released CITYPOP for 1850-1930 samples. Due to a technical problem, we had not been offering CITYPOP in these samples since February 2007. The CITYPOP values that we are providing now are not different from the values that were available prior to February.

June 7, 2007. Posted 1930 1% sample (up from previous 0.5% sample for 1930). The new sample includes several new occupation and industry variables - OCC, OCC1930, IND, IND1930 - as well as HISPAN and RACESING.

April 26, 2007. Added new occupation crosswalks (OCC to OCCSOC) for the 2000 census samples and the ACS samples; these are available via links from our Occupation and Industry documentation page.  Also improved our OCC and OCCSOC code lists (available from the respective variable descriptions) for the 2000 census and ACS samples.

April 24, 2007. Posted a new version of the 2005 ACS; a correction was made to the MORTAMT1 variable.

April 9, 2007. Added Consistent PUMA variable and shapefiles (see CONSPUMA). CONSPUMA reconciles differences in low-level geographic identifiers in the 5% samples from 1980, 1990, 2000, and the 2005 ACS. Also released all new shapefiles for low-level geographic identifiers from 1970-2005. Changes to the previous shapefiles were minor: numerous "holes" in the maps were assigned to their appropriate PUMA, County Group, or SEA. All files are accessible via the links on our geographic tools page.

March 27, 2007. Changed the name of the RACHIST variable to RACESING.

March 21, 2007. Added QHISPAN, the data quality flag for HISPAN, to the 2000 and ACS samples. Posted new versions of the 1940 and 1950 samples: a minor correction was made to the CHBORN variable.

Posted new versions of the 1910 samples: we corrected a problem with SERIAL so that households within multi-household dwellings are now uniquely identified.  The problem had affected less than .13% of households in the 1910 1.4% sample.

February 15, 2007. Created HISPAN and HISPRULE variables for the 1900 and 1930 samples. A later data release will create these variables for the 1850-1880 and 1910-1920 samples.

Created a new harmonized version of the TRIBE variable, which is now available in 1900-1910, 1990-2000, and the ACS.

Posted a new version of the 1940 sample: a minor correction was made to the VET1940 variable.

January 31, 2007. Released new harmonized occupation and industry variables for 1950-2005: OCC1990 and IND1990. The OCC1990 variable was created in collaboration with researchers at the Bureau of Labor Statistics. Both variables are available only via the IPUMS.

Added metropolitan area designations to the 2003 ACS, in METAREA and MET2003. Metropolitan areas are also identified in the 2005 ACS IPUMS sample.

Created HISPAN and HISPRULE variables for the 1940-1970 samples.

Added RACHIST values to the 1950-1990 samples and the 2005 ACS IPUMS sample. RACHIST adapts an algorithm developed at the National Center for Health Statistics to assign single races to persons who reported more than one race from 2000 onward.

December 19, 2006. Replaced the 1-in-250 1910 sample with two new samples: the 1910 1% sample and the 1910 1.4% sample with oversamples. The 1% sample includes a 1-in-100 national population sample, including Alaskans, Hawaiians, and persons enumerated on the American Indian Schedules. The 1.4% sample with oversamples includes a 1-in-70 national population sample that has been combined with large oversamples of Blacks, Hispanics, Alaskans, Hawaiians, and persons enumerated on the American Indian schedules. The 1910 Weighted sample must be used with weighting variables (see PERWT and HHWT).

Replaced the 1900 General sample with two new samples: the 1900 1% sample and the 1900 1% sample with oversamples. The 1900 1% sample is a 1-in-100 national sample, including Alaskans, Hawaiians, and persons enumerated on the American Indian Schedules. This sample has the same cases as the former "1900 General sample" did, though some variables and values have been modified in minor ways. The 1900 1% sample with oversamples is a 1% national sample that has been merged with 1-in-5 oversamples of Alaskans, Hawaiians, and persons enumerated on the American Indian schedules. The 1900 1% sample with oversamples must be used with weighting variables (see PERWT and HHWT).
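To see why the weighting variables matter for samples with oversamples, consider the following Python sketch. It compares an unweighted share with a PERWT-weighted share on made-up records. The weights of 100 and 5 are simplified stand-ins for a 1-in-100 and a 1-in-5 sampling rate, not the actual PERWT values in the data, and the `oversampled` field and function names are hypothetical.

```python
def weighted_share(records, predicate):
    """Share of the PERWT-weighted population for which predicate is true."""
    total = sum(row["PERWT"] for row in records)
    matching = sum(row["PERWT"] for row in records if predicate(row))
    return matching / total


def unweighted_share(records, predicate):
    """Share of raw sample records for which predicate is true."""
    return sum(1 for row in records if predicate(row)) / len(records)


def is_oversampled(row):
    return row["oversampled"]


# Hypothetical sample: 95 general cases and 100 cases from an oversample.
sample = (
    [{"PERWT": 100, "oversampled": False}] * 95
    + [{"PERWT": 5, "oversampled": True}] * 100
)

print(unweighted_share(sample, is_oversampled))
print(weighted_share(sample, is_oversampled))
```

In this toy sample the oversampled group makes up more than half of the raw records but only a small fraction of the weighted population, which is the distortion the weights are meant to remove.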

More information about these samples is available in the 1900 and 1910 sections of the sample descriptions page. We expect to release a revised version of these samples in March 2007. The revised samples will include detailed geography at the minor civil division level and integrated versions of variables specific to the Alaskan, Hawaiian, and American Indian populations.

December 10, 2006. Posted new version of the 2005 ACS sample that includes the following new geographic identifiers: CITY, METAREA, METRO, PUMASUPR, MIGTYPE1, MIGMET1, MIGCITY1, MIGPUMS1, PWTYPE, PWMETRO, PWCITY, and PWPUMAS. These variables were constructed at the University of Minnesota and are not available via the Census Bureau.

November 29, 2006. Posted new versions of all samples from 1940 through 2005. The new samples include several minor improvements to SPLOC, MOMLOC, and POPLOC. These modifications have resulted in minor changes to the constructed household variables, the family interrelationship variables, POVERTY, and FTOTINC. Detailed information on these variables can be found in the family interrelationships documentation.

An error was corrected in the POVERTY variable for all samples. In two-person families where one person was over age 65 and the other person was under age 65, we sometimes used slightly different poverty thresholds for each member of the family. We should have applied the same threshold to both members of the family. This resulted in several thousand cases in each sample having a poverty value that was off by an average of two percent (10 points on POVERTY's 1-500 scale). We have corrected the problem.
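The corrected logic can be sketched in Python as follows: one threshold is chosen per family, and every member receives the same value, expressed as family income as a percentage of that threshold. The dollar amounts below are illustrative only and are not official poverty thresholds; the function names are hypothetical, and any top or bottom coding of the real POVERTY variable is omitted.

```python
def poverty_percent(family_income, threshold):
    """Family income as a whole-number percentage of the poverty threshold."""
    return round(100 * family_income / threshold)


def assign_family_poverty(members, family_income, family_threshold):
    """Give every family member the same value from one family-level threshold."""
    value = poverty_percent(family_income, family_threshold)
    return {name: value for name in members}


# Hypothetical two-person family with one family-level threshold.
result = assign_family_poverty(["head", "spouse"], 18000, 9000)
print(result)

# The old error in miniature: slightly different thresholds within one family
# give the two members different values.
old_head = poverty_percent(18000, 9000)
old_spouse = poverty_percent(18000, 9180)
print(old_head, old_spouse)
```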

The new samples also include a small number of corrected income values in the 1950, 1960, and 1970 samples. The majority of cases affected have negative income values.

November 20, 2006. Corrected a problem in the RACE variable in the 2005 ACS sample. There were approximately 3,000 cases with missing values. All of the cases were multi-racial persons. All cases are now assigned to the appropriate categories.

October 11, 2006. Posted 1% sample from the 2005 American Community Survey (ACS). The 2005 sample is the first ACS microdata to identify sub-state geography, including PUMA, MIGPUMA1, and PWPUMA00. The IPUMS version of the 2005 ACS also identifies metropolitan status (METRO). A December 2006 release of the IPUMS 2005 ACS sample will identify CITY, METAREA, PUMASUPR, MIGTYPE1, MIGMET1, MIGCITY1, MIGPUMS1, PWMETRO, PWCITY, PWTYPE, and PWPUMAS. These variables are being constructed at the Minnesota Population Center and are not available via the Census Bureau.

The base data for the IPUMS 2005 sample is the ACS data that the Census Bureau released on October 5th, 2006. The Census Bureau had originally released a version of the dataset on September 11th, 2006. The September release contained several small errors, so the Census Bureau updated the dataset in October. The erroneous dataset was never available via the IPUMS data extraction system.

More information on the background and future plans for the American Community Survey is available at the ACS information page.

October 1, 2006. Posted 0.5% sample from the 1930 census (up from the previous 0.2% 1930 sample).

September 6, 2006. Posted new version of IPUMS-USA website. The website has a new design, and the content of most variable descriptions has changed at least slightly. Users can still access all extract requests made on the old website.

June 30, 2006. Posted new versions of the 2000 1%, 5%, and Unweighted samples: a correction was made to the MIGPLAC5 variable.

April 27, 2006. Posted new versions of all ACS samples: a correction was made to the INCBUS00 variable.

April 7, 2006. Posted new versions of all 2000 Census samples and all ACS samples: a correction was made to the OCC variable.

January 20, 2006. Posted new versions of the 2000 1%, 5%, and Unweighted samples, as well as the 2000 ACS: a correction was made to the MARST variable.

November 30, 2005. Posted 13 new samples on the IPUMS-USA website. All samples were previously available on the IPUMS-USA Beta site, which was shut down. The new samples combined add nearly 15 million cases to the IPUMS database. For more details on this data release, see the sample information page.

October 7, 2005. Posted new versions of the 2000 1%, 5%, and Unweighted samples: a correction was made to the VET55x64 variable. New versions of the 2000-2004 ACS samples were also posted. In all eight samples above, improvements were made to the INDNAICS and OCCSOC variables.

Corrections were made to the YRIMMIG, YRSUSA1, and YRSUSA2 variables in the ACS 2001-2004 samples.

September 16, 2005. Released the 2004 American Community Survey (ACS) sample on the IPUMS Beta site.

September 9, 2005. Released a new 2000 1% flat sample on the IPUMS Beta site. This is a national random sample drawn from the 2000 5% Census sample.

September 1, 2005. Posted a new version of the 1930 1-in-500 sample. Corrections were made to the VET1930 variable and the AGEMARR variable.

June 27, 2005. Posted new versions of the 2000 1% and 5% samples, and the 2000-2003 ACS samples. Added the following variables: RACHIST, PROBAI, PROBAPI, PROBBLK, PROBOTH, and PROBWHT. RACHIST is an historically compatible race variable which 'bridges' multiple-race responses into their most likely single race category. The other variables give detailed probabilities of each single-race response and are best used in combination with one another.
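The two ways of using these variables can be sketched in Python. The first function picks the single race with the largest probability, which is one reading of "most likely single race category"; whether RACHIST is derived exactly this way is an assumption. The second sums the probabilities across people to get expected single-race counts. The example assumes the probabilities are scaled to sum to 1 per person, which may not match their scaling in an extract, and the records are made up.

```python
PROB_VARS = ("PROBAI", "PROBAPI", "PROBBLK", "PROBOTH", "PROBWHT")


def most_likely_race(person):
    """Return the name of the probability variable with the largest value."""
    return max(PROB_VARS, key=lambda name: person[name])


def fractional_counts(people):
    """Sum each probability across people to get expected single-race counts."""
    totals = dict.fromkeys(PROB_VARS, 0.0)
    for person in people:
        for name in PROB_VARS:
            totals[name] += person[name]
    return totals


# Hypothetical multiple-race respondents (probabilities assumed to sum to 1).
people = [
    {"PROBAI": 0.1, "PROBAPI": 0.0, "PROBBLK": 0.6, "PROBOTH": 0.0, "PROBWHT": 0.3},
    {"PROBAI": 0.0, "PROBAPI": 0.7, "PROBBLK": 0.0, "PROBOTH": 0.0, "PROBWHT": 0.3},
]

print([most_likely_race(p) for p in people])
print(fractional_counts(people))
```

Using the probabilities together, as in `fractional_counts`, spreads each person across categories instead of forcing a single assignment.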

Removed RACGEN00, RACDET00, and SPANAMER from the data and documentation. The variables RACGEN00 and RACDET00 were redundant with RACE. A variable similar to SPANAMER can be created using the IPUMS variables MTONGUE, BPL, MBPL, FBPL, SPANNAME, and STATEFIP.

May 20, 2005. Released a revised version of the 1-in-100 sample of the 1900 census (see the August 21, 2003 revision note for information on the previous version of this sample). The revised dataset includes records extracted from Alaska, Hawaii, and the American Indian 1-in-5 oversamples (the complete oversample datasets are available via the IPUMS raw data download page).

Users should also be aware that the smaller 1900 sample previously available (the 1-in-750 "Preston" sample) will no longer be available via the IPUMS extract system. Users wishing to access this data can still download the entire dataset and SPSS command file via the IPUMS raw data download page.

May 13, 2005. Released a revised version of the preliminary 1-in-500 sample of the 1930 census. Corrected a major error in the race variable. The April 25th sample gave the "White" code (detailed race code 100) to all persons who reported their race as "Mexican." The revised sample gives these persons the new "Mexican" race code (detailed race code 140). The revised sample also corrects minor coding and labelling errors in the following variables: RENT30, GQTYPED, NUMHHTAK, FARMSCHD, ENUMMO, RADIO, HOMEMKR, VET1930, IND1950, MTONGUE, FBPL, MBPL, CITY, METRO, METAREA, URBAREA, and MDSTATUS.

April 25, 2005. Released preliminary 1-in-500 sample of the 1930 census. We expect to release a final 1-in-100 sample of the 1930 census by late 2007.

February 23, 2005. Posted new versions of the 2000-2003 ACS samples: a correction was made to the STATEICP variable.

February 1, 2005. Removed the POV2000 variable from the documentation and data. POV2000 was redundant with the IPUMS POVERTY variable. Both variables use the poverty matrix developed by the Social Security Administration in 1964 (and revised twice in the years since). The Office of Management and Budget's Directive 14 prescribes this definition as the official poverty measure for federal agencies to use in their statistical work.

November 23, 2004. Released the following samples on the IPUMS Beta site: the 2003 American Community Survey (ACS) sample, the 1990 Labor Market Areas sample, the 1980 Labor Market Areas sample, and the 1980 Detailed Metro/Nonmetro sample.

October 13, 2004. Posted new versions of the 2000 1% and 5% samples, and the 2000-2002 ACS samples. The following variables were improved: OCC1950, SEI, OCCSCORE, and IND1950. The new variables utilize the Census Bureau's recently published occupation and industry crosswalks between the 1990 and 2000 censuses.

Made a slight correction to the multipliers used to construct the POVERTY variable in the 2000-2002 samples (for more information see the 1990 poverty status definition).

August 27, 2004. Posted a new version of the 2000 5% sample: a correction was made to the METAREA variable.

August 6, 2004. Posted a new version of the 2000 5% sample: a correction was made to the PWCITY variable.

June 28, 2004. Posted new versions of all the 2000 and ACS samples. The RACE variable has been expanded to incorporate all information from the new multiple-race variables. Details about multiple-race responses are now included, some value labels were clarified, and a few other categories were added. Also, CITYPOP was added to the 2000 1% and 5% samples, and corrections were made to MOBLHOME and METAREA.

June 17, 2004. Released American Community Survey (ACS) samples for 2000, 2001, and 2002 on the IPUMS Beta site.

May 6, 2004. Made 2000 5% sample available via the main IPUMS-USA site.

May 1, 2004. Posted new versions of all of the 2000 samples. The 2000 5% sample now includes variables for Super-PUMA of Work (PWPUMAS) and Super-PUMA of Migration (MIGPUMAS). For the 2000 1% sample, Super-PUMA information that was previously in the PWPUMA00 and MIGPUMA variables is now in the new PWPUMAS and MIGPUMAS variables. A new version of the INCRETIR variable in all three 2000 samples now includes retirement incomes of greater than $99,998 (the previous top code). All three samples include a corrected version of the POV2000 variable.

Posted new versions of all 1990 samples that account for the greater width of INCRETIR (see above).

April 22, 2004. Posted a new version of the 2000 1% sample: a correction has been made to the MIGCITY5 variable.

March 10, 2004. Posted new versions of the 2000 1% sample and the 2000 5% sample. Both samples now include the PWCITY variable. For those living in group quarters, the variable HHWT now has the PERWT value, rather than a value of 0. In addition, corrections were made to the following variables: BPL, STEPMOM, STEPPOP, MARST, and PUMASUPR.

Posted new versions of the 1990 State, Metro, Elderly, and Unweighted samples. A problem in the MORTGAGE variable was corrected in the new samples.

January 30, 2004. Posted new versions of the 2000 1% sample, the 2000 5% sample, and the Census 2000 Supplementary Survey (C2SS). The 2000 1% and 5% samples now include variables for CITY and MIGCITY5. Minor problems in PWPUMAS, PWPUMA00, MIGPUMA, YRIMMIG, and MORTGAGE have also been corrected in the new samples. The new C2SS sample includes corrected values for INCBUS00 (all values were 0 before).

September 9, 2003. Posted new versions of the 1990 State, Metro, Elderly, and Unweighted samples. FTOTINC and HHINCOME now contain negative values for families and households having a net loss of income. A problem in the PERWT variable was corrected in all samples. These were the only affected variables.

August 21, 2003. Penultimate 1-in-100 version of the 1900 Minnesota sample released on the IPUMS Beta site. The dataset includes 170,438 households containing 754,631 individuals. This version has a number of flaws that will be corrected for the final version of the 1900 Minnesota sample, which we anticipate releasing in the spring of 2004. The older 1-in-200 preliminary sample is still available via the data extract system at the main IPUMS-USA site.

  • No cases from Alaska and Hawaii are included in the current sample.
  • Data quality flags are not yet available.
  • Detailed geographic variables are not yet available (these include MDSTATUS, METDIST, URBAREA, MCIVDIV, INCPLACE, and INCORP).
  • Coding is not yet complete on the occupation variable (OCC).
  • Native Americans enumerated on the special 1900 Indian Schedules are not included in the current sample (although the current version does contain Native Americans enumerated as part of the general population). The 1900 Indian Schedules contained questions not asked on the general schedule, including tribe, percentage Indian blood, and tax status, among others.
  • Detailed German birthplaces in the current 1900 sample are coded according to the new scheme developed for the 1860 and 1870 samples. Users of this data should note that these codes do NOT correspond to those listed in the BPL variable description. Detailed German birthplace codes for the 1860-70 and 1900 samples are available here.

Users should also be aware that the smaller 1900 sample previously available (the 1-in-750 "Preston" sample) will no longer be available via the IPUMS extract system. Users wishing to access this data can still download the entire dataset and SPSS command file via the IPUMS raw data download page.

October 11, 2002. Reposted preliminary version of 1900 Minnesota sample. The previous version had incorrect values for children ever born (CHBORN). The new dataset contains corrected values. No other variables have been changed.

July 11, 2002. Final versions of the 1860 and 1870 samples released. The final 1-in-100 1860 IPUMS sample includes 54,094 households containing 273,947 free individuals and an additional 1,343 unoccupied dwellings. (A preliminary sample of the nation’s slave inhabitants is available separately at the 1860 Slave Schedule page.) The final 1-in-100 1870 IPUMS sample includes 79,023 households containing 383,308 individuals and an additional 1,447 unoccupied dwellings. Frequencies in the on-line documentation will be updated in the next few months. Both the 1860 and 1870 IPUMS samples are also available with oversamples of the black population. Sample weights for the flat and black oversamples have been adjusted to be representative of the total population.

The final 1860 and 1870 IPUMS samples now include occupation codes based on the U.S. Census Office’s 1880 classification system and detailed birthplace codes for individuals born in Germany. Several other changes have also been made, including a slightly modified urban/rural definition, minor changes in birthplace and occupation coding, and small changes in personal estate and real estate values. In addition, the final samples incorporate a few data additions and subtractions from the preliminary samples. For details of these changes and a listing of the new Germany detailed birthplace codes, click here.

May 7, 2002. Released preliminary version of the 1900 Minnesota sample. This 1900 Minnesota sample is a 1-in-200 nationally representative sample of dwellings taken from the 1900 U.S. Census of Population. The final version is scheduled to be released in 2004 and will have a 1-in-100 sampling density. Frequencies for this sample will be added to the documentation in summer 2002. Currently both the 1900 Minnesota sample and the 1-in-760 1900 Preston sample are available. Ultimately the 1900 Minnesota sample will replace the 1900 Preston sample, although the Preston sample will be available by request.

The fundamental difference between the two 1900 samples pertains to sample design. In the 1900 Preston sample nonfamily individuals--boarders, lodgers, inmates, and military personnel--were sampled as individuals regardless of household size. In contrast, the 1900 Minnesota sample follows the general sample design used for the 1850-1880 and 1920 samples. For a discussion of issues relating to sample design see Chapter 2 of the IPUMS documentation.

July 11, 2001 -- The IPUMS extract system upgrade was successfully installed on Wednesday, July 11, 2001. No changes were made to the IPUMS data. The new extract system will process user data requests faster than the previous system and will prevent small jobs from being continually sidetracked by large data requests in the queue. Since this upgrade affects only the behind-the-scenes data extraction system, users will notice little change in the request process itself. Re-registration is not required; previous jobs will be available for revision; and new jobs will begin numbering from the user’s last completed job in the old system.

March 7, 2001. Released new preliminary (penultimate) versions of the 1860 and 1870 samples. Frequencies in the documentation will not be changed until release of final versions of these datasets, scheduled for summer 2002. Two versions of the 1860 and 1870 samples are now available:

  • a flat 1-in-100 sample of all dwellings, and
  • a black oversample containing a 1-in-50 sample of dwellings containing one or more blacks and a 1-in-100 sample of all other dwellings.

The sample weights in both the flat and black oversamples of the preliminary 1860 and 1870 PUMS have been adjusted to be representative of the total population. Although we believe that the new samples are near their final form (we expect only minor changes in the number of cases and the coding of a few variables between the current and final versions of the samples), users are advised that the current releases have a few known problems. In particular, the occupation ("OCC") variable in 1860/1870 is not coded. Users should rely on the occupation 1950 basis ("OCC1950") variable for studying occupation and labor force participation. In addition, detailed birthplace codes are not available for individuals born in Germany. Users may still use the birthplace variable (BPL), but no detail will be returned for German birthplaces.
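The reasoning behind weights of this kind can be sketched with simple arithmetic: a dwelling sampled at 1-in-50 stands for 50 dwellings, and one sampled at 1-in-100 stands for 100. The Python sketch below uses those inverse sampling fractions as base weights for the black oversample design described above. The adjusted weights actually distributed with the samples may differ, and the function name and the example dwellings are hypothetical.

```python
def base_weight(has_black_resident):
    """Inverse of the sampling fraction: 1-in-50 or 1-in-100 dwellings."""
    return 50 if has_black_resident else 100


# Hypothetical sampled dwellings: (has_black_resident, number_of_persons).
dwellings = [(True, 4), (True, 6), (False, 5), (False, 3), (False, 2)]

estimated_population = sum(
    base_weight(flag) * persons for flag, persons in dwellings
)
print(estimated_population)
```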

July 1, 1999.  Released new versions of 1850, 1860, 1870, 1880, 1900, and 1910 samples, containing the following enhancements and corrections: 

  • New geographic variables (METDIST, MDSTATUS, MCIVDIV, INCPLACE, INCORP, URBAREA) were added to 1850, 1880, and 1910 samples.  
  • Minor fixes to OCC1950, IND1950, CITIZEN, LIT, COUNTY, SEA, GQTYPE, GQFUNDS, NATIVITY, VOTE, MARRINYR, NAMEFRST, and NAMELAST.
  • Missing age allocation procedures fixed to allow age 0 to be allocated. Improved rules for spouse imputation (IMPSP).  
  • Added cases from Bradley county, TN to 1850 that had been inadvertently dropped from the 1850 sample.  PERWT adjusted slightly. 

Friday, August 18, 2000 -- The old IPUMS extract system was replaced by a new system incorporating enhanced features requested by users. One of the key features of the new system is the ability to modify and resubmit previous jobs. Data files from the two systems have been combined on a user-specific summary site. IPUMS data users previously registered in either extract system will not have to reregister to use the new extract system. Extract requests in the new system will begin numbering jobs from the highest numbered job in a user's personal extract summary.

January 22, 1999. Major error in the November 25 version of 1860 and 1870 samples corrected.  The 1860/70 samples had an error in SURSIM, which in turn created errors in all the family interrelationship variables (IMPMOM, IMPPOP, IMPSP) and in the variables constructed from them (NCHILD, NCHLT, FAMSIZE, ELDCH, and so on).  The error could also have implications for missing data allocation; we recommend discarding any previous versions of the 1860 and 1870 samples.

November 25, 1998 -- PERWT, NUMHHTAK, and GQFUNDS fixed on the 1860 and 1870 samples.

November 6, 1998 -- Revised preliminary samples of the 1860 and 1870 census released.  Two versions of both the 1860 and 1870 PUMS are now available: (1) a flat 1-in-200 sample of all dwellings, and (2) a black oversample containing a 1-in-100 sample of dwellings containing one or more blacks and a 1-in-200 sample of all other dwellings. 

The sample weights in both the flat and black oversamples of the preliminary 1860 and 1870 PUMS have been adjusted to be representative of the total population. 

August 20, 1998 -- Revised IPUMS-98 database released. 

  • AGE Allocations 1850-1920. There was an error in the missing data allocation procedure for AGE affecting all pre-1940 samples. Since age is used as a predictor in many other allocations, constructed variables, and universe checks, the frequencies for many variables in the earlier samples have changed slightly from the original iteration of IPUMS-98. 
  • Split YRSINUSA into two separate variables--YRSUSA1 and YRSUSA2-- to enhance compatibility over time. YRSUSA1 (columns 145-146 in the raw data files) contains the unrecoded continuous measure of years in the U.S. from the 1900-1920 samples. YRSUSA2 recodes 1900-1920 and 1970-1990 into five intervals compatible among all sample intervals. Users desiring greater detail on the original 1970-1990 intervals can refer to YRIMMIG, which retains all of the original detail recorded in the variable discussion. Documentation change: the universe for 1980 should have excluded foreign-born persons who were citizens at birth. 
  • OCCSCORE, SEI. In 1850-1870, laborers who were changed via logical edit to farm laborers (i.e., they lived on a farm) continued to receive the OCCSCORE and SEI for laborers. They now receive the scores for farm laborers. The original 1900 sample incorrectly classified many domestics as "service workers, nec" in their original 1950 occupation classification. The IPUMS fixed the occupational code but neglected to assign the appropriate SEI and OCCSCORE values for the new occupation. This has been rectified. 
  • RACE. In 1990, persons who indicated Hispanic origin were recoded out of "other race, nec" in the race variable into the category "Spanish write-in." Persons of Mexican origin were mistakenly excluded from this recode. This is now fixed. 
  • PERWT and HHWT in 1990. Previously, the IPUMS adjusted the 1990 weights so that the total weighted sample would yield the same population count as the published census returns. We removed this programming, since users could not reverse this change if they desired to, and because there seemed no reason to assert the accuracy of the 1990 count at this level of detail. 
  • CITYPOP, SIZEPL. In 1980, households in New York City received the code for "not identifiable" (codes 00000, 00) in the city population variables. New York can be identified, and we have changed the population codes accordingly. 
  • ANCESTR1 and ANCESTR2. An error in the 1990 PUMS documentation slipped into the IPUMS. Anyone with a code of 0324 (West German) should have been coded 0460 (Greek). This is now fixed. 
  • MBPL, FBPL. In 1970, recoded "U.S. possessions, n.s." to match the documentation (code 12091); it was incorrectly coded 13000 in the data.
  • YRIMMIG documentation change: the universe for 1980 excludes foreign-born persons who were citizens at birth. Changed 969 code to 970; it refers to 1965-1970, not 1965-1969. Added 914, which refers to the period before 1915 in the 1970 sample. We also changed the data, recoding 969 to 970. 
  • EDUCREC and HIGRADE. In 1980, N/A (under age 3) and "no schooling" were combined. We have separated them. 
  • BPL. In 1850, some persons with a birthplace of Iowa should have been coded as being born in Indiana (a confusion over the interpretation of the abbreviation "IA"). We have added programming to separate these codes. 
  • CLASSWKR. Removed new workers (persons looking for work but who have never obtained their first job) from the universe for 1940 and 1950 in order to increase compatibility. In 1990, reassigned unemployed persons who last worked over five years ago to the N/A category. In all years, the relevant information is preserved in other variables (EMPSTAT and YRSLASTWK). 
  • IND1950. The original 1940 sample contained an undocumented industry category. We determined that this is the category for "miscellaneous machinery" (code 358). The IPUMS had coded this category to "office and store machines" (code 357); we have recoded it to 358. In addition, the IND (contemporary industry classification) appendix for 1940 did not document this category. It has been added to the documentation.
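
Several of the items above are simple code-for-code corrections within a single census year. The sketch below shows how such corrections could be expressed as a lookup table, using only the ANCESTR1/ANCESTR2 and MBPL/FBPL codes stated in the list. The record layout (a dictionary keyed by variable name, with a YEAR field and string codes) is an assumption made for illustration; it is not how the IPUMS itself applied the fixes, and it is no substitute for downloading the revised data.

```python
# Hypothetical sketch: applying a few of the listed code corrections to
# records from an older extract. The dict-based record layout is assumed.

# (variable, census year, old code, new code), taken from the list above
RECODES = [
    ("ANCESTR1", 1990, "0324", "0460"),
    ("ANCESTR2", 1990, "0324", "0460"),
    ("MBPL", 1970, "13000", "12091"),
    ("FBPL", 1970, "13000", "12091"),
]


def apply_recodes(record):
    """Return a copy of the record with any matching corrections applied."""
    fixed = dict(record)
    for var, year, old, new in RECODES:
        # Only touch the variable in the census year the correction names.
        if fixed.get("YEAR") == year and fixed.get(var) == old:
            fixed[var] = new
    return fixed


print(apply_recodes({"YEAR": 1990, "ANCESTR1": "0324", "ANCESTR2": "0999"}))
```

Corrections that depend on more than a single code, such as the IND1950 change for the undocumented 1940 category, cannot be expressed this way.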

May 20, 1998 -- OCC, OCC1950, FARM. Fixed a significant error in occupation coding in the 1860 sample (which also affected 1870, though to a much lesser degree). The missing data allocation procedure changed most persons with a blank response (no occupation) to having an occupation. This greatly overstated female occupational responses in 1860, particularly for married women. Since FARM status is inferred from occupation, and many of the allocated cases were farmers, the 1860 and 1870 samples overstated the number of farms. Both the 1860 and 1870 samples have been reconstructed to rectify this problem. 

March 24, 1998 -- Made a significant, if somewhat subtle, change to the way the extraction system works. Altered the extraction system to zero out any variables that were "stacked" in the same column location as a requested variable. Previously, if you selected a variable that was not available in every sample chosen for extraction, the system would include whatever other variable was located in those columns in the raw IPUMS data files. For example, if you selected 1880 along with more modern samples and requested the variable Migration Status, 5 Years, the system would include the alphabetic data from the 1880 variable Last Name in those same extract columns. This caused considerable confusion among users.
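
The behavior described above can be sketched as follows. This is a minimal illustration, not the extraction system's actual code: the function name, the fixed-width record layout, and the availability set are all hypothetical.

```python
# Hypothetical sketch of zeroing out "stacked" columns in an extract.
# Names, column positions, and availability data are illustrative only.

def extract_field(raw_line, year, start, width, available_years):
    """Return one requested variable's field from a raw fixed-width record.

    If the variable is not available in this sample, return zeros instead
    of whatever other variable happens to share those columns.
    """
    if year not in available_years:
        return "0" * width
    return raw_line[start:start + width]


# Toy records: the same columns hold a surname in the 1880 sample and a
# migration code in the 1990 sample.
rec_1880 = "SMITH"
rec_1990 = "00012"
migration_years = {1990}  # assumed availability for the requested variable

print(extract_field(rec_1880, 1880, 0, 5, migration_years))
print(extract_field(rec_1990, 1990, 0, 5, migration_years))
```

Under the previous behavior, the first call would have returned the surname text from the 1880 record.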

Early March, 1998 -- Changed weights in "small" and "tiny" samples to be representative of total population. 

Early March, 1998 -- Created a new Flat 1990 sample. 

February 17, 1998 -- Changed the weights in the 1860 and 1870 files to account for oversample of blacks. 

January, 1998 -- IPUMS-98 is available. For prior revisions, see Changes from IPUMS-95 to IPUMS-98.