Errata in and Revisions Made to IPUMS

Below is a list of upcoming changes to the IPUMS, along with significant changes made to the IPUMS since 1998. The IPUMS archive page contains versions of earlier IPUMS websites.


Jump to beginning of year (scroll up to move forward in time):
2016 2015 2014 2013 2012 2011 2010 2009 2008
2007 2006 2005 2004 2003 2002 2001 2000 1999
1998

Revisions Made Previously

November 2, 2016 The 2015 1-year American Community Survey and Puerto Rican Community Survey data are now available.

September 12, 2016 Added new 1920 and 1930 100% databases. Added new 1930 Puerto Rico 5% Decennial Census sample.

March 8, 2016 Added a new 1960 5% sample. Added consistently identified geography variables to the 2012-2014 multi-year ACS/PRCS samples. Minor fixes to the 1850 and 1940 100% databases.

February 17, 2016 Added new 2013 and 2014 Multi-year ACS and PRCS samples. The Multi-year files differ in several ways from the single-year files. Most importantly, weights have been re-calculated, incomes and other dollar amounts have been standardized to the most recent interview year, and different topcodes have been applied. For more information, please see this FAQ on multi-year PUMS files.

February 8, 2016 Added new 1850 100% population database, updated 1940 100% database, and integrated small data corrections.

December 17, 2015 IPUMS-USA posted new Health Insurance Unit variables for the 2014 1-year American Community Survey.

November 17, 2015 IPUMS-USA posted new 2014 1-year American Community Survey and Puerto Rican Community Survey data as well as new Decennial Census data from 2010:

July 1, 2015 IPUMS-USA posted new versions of data with the following improvements and additions:

December 17, 2014 Posted preliminary version of 1940 100% Population Database. Added Health Insurance Unit variables to the 2013 ACS sample. Improved family pointer variables for same-sex married couples in the 2013 ACS/PRCS samples.

November 05, 2014 Posted new 2013 1-year American Community Survey and Puerto Rican Community Survey data.

August 13, 2014. Geography variables were updated for the 2012 1-year ACS sample using the new 2010 Decennial Census Based PUMAs. The updated variables include: METRO, CITY, CITYERR, CITYPOP, MET2013, MET2013ERR, COUNTY, HOMELAND, PWPUMA00, MIGPUMA1.

In addition, several other improvements were made to existing IPUMS-USA variables and data files:

April 8, 2014. Added new 2008-2012 ACS/PRCS 5-Year files. These files include all cases in the previously-released single-year files from the 2008, 2009, 2010, 2011, and 2012 ACS/PRCS. The 5-year file differs in several ways from the single-year files. Most importantly, weights have been re-calculated, incomes and other dollar amounts have been standardized to 2012 dollars, and different topcodes have been applied. For more information, please see this FAQ on multi-year PUMS files. Information specific to the new 2008-2012 release follows:

In addition, several other improvements were made to other IPUMS-USA samples:

Feb. 17, 2014. Posted new 2010-2012 ACS/PRCS 3-Year data. These files include all cases in the previously-released single-year files from the 2010, 2011, and 2012 ACS/PRCS. The new 3-year file differs in several ways from the single-year files. Most importantly, weights have been re-calculated, incomes and other dollar amounts have been standardized to 2012 dollars, and different topcodes have been applied. For more information, please see this FAQ on multi-year PUMS files.

The 2010-2012 ACS/PRCS are generally similar to the 2009-2011 ACS/PRCS data, with several noteworthy differences:

Dec. 27, 2013. Posted new 2012 1-year American Community Survey and Puerto Rican Community Survey data. Together, the 2012 samples contain over three million person records. The 2012 ACS is the seventh ACS sample to provide information on group quarters, which can be identified in the GQ variable. Researchers analyzing multiple ACS samples over time should remove group quarters cases, since they are available only in the 2006-2012 data.
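As a rough illustration of the advice above, the following Python sketch drops group quarters cases from a set of person records before ACS years are pooled. The record layout (plain dictionaries keyed by variable name) and the set of GQ codes treated as group quarters (3 and 4 here) are assumptions made for this example; the GQ variable description remains the authority on the codes.

```python
# Sketch only: records are plain dicts keyed by IPUMS variable names.
# ASSUMPTION: GQ codes 3 and 4 mark group quarters; verify this against
# the GQ variable description before relying on it.
ASSUMED_GROUP_QUARTERS_CODES = {3, 4}


def drop_group_quarters(records, gq_codes=ASSUMED_GROUP_QUARTERS_CODES):
    """Return only the records whose GQ code is not in gq_codes."""
    return [rec for rec in records if rec["GQ"] not in gq_codes]


# Hypothetical records; the values are invented for illustration.
sample = [
    {"YEAR": 2005, "GQ": 1, "AGE": 34},
    {"YEAR": 2012, "GQ": 3, "AGE": 71},
    {"YEAR": 2012, "GQ": 1, "AGE": 28},
]
household_only = drop_group_quarters(sample)
```

Passing the code set as a parameter keeps the filter usable if a different definition of group quarters is appropriate for a given analysis.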

The 2011 and 2012 ACS releases are similar, but there are a couple of notable differences:

May 7, 2013. New, final versions of 1930 sample data are available. These include 5% and 1% samples. The 1% sample was drawn from the 5%, but there are minor differences in allocated values. Modifications to the previous version of the 1930 sample data focused on the following areas:

Detailed Geography. Much of this work related to reassessing breaks between enumeration districts (ENUMDIST). This also resulted in corrected values for minor civil division and incorporated municipality information (MCDSTR and INCSTR).

Home and Rental Value (VALUEH and RENT30). Outliers were flagged and erroneous values corrected.

Occupation and Industry Codes (OCC, OCC1930, OCC1950, IND, IND1930, and IND1950). Most of this work involved assigning codes to previously unclassified records. Consistency checks were also applied that resulted in the correction of some misclassified records. Changes to the occupation codes also resulted in modifications to variables that rely on the occupation codes as input (e.g., occupational standing variables such as OCCSCORE).

Apr. 30, 2013. A new version of the 1940 1% sample is now available and corrections have been made to other IPUMS-USA samples.

The new 1940 release includes corrections as well as new data. Corrections were made to 374 person records for individuals who had been identified as living in Missouri but actually lived in Detroit, Michigan. Necessary changes were made to the relevant geographic and migration variables.

New geographical variables that are no longer restricted by confidentiality requirements were added to the 1940 1% data: COUNTY, METDIST, CITYMETD, URBAN, and URBPOP are now available. County, city, minor civil division, ward, tract, and enumeration district information has also been added as two new sets of string variables, one that contains "clean", standardized strings (STDCNTY, STDCITY, STDMCD, STDWARD, STDTRACT, STDED) and one that records the strings exactly as they were entered (CNTYSTR, MCDSTR, WARDSTR, INCSTR). Also, on the household record, the string variable GQSTR has been added, which contains the original group quarters response as it was entered.

New string variables have also been entered for all but 137,588 person-level records in the 1940 1% data. The records with the missing data can be identified using the SUBS4050 variable and selecting subsamples 2 and 20. The remaining data for those two subsamples will be added in the future. The new person-level string variables are: occupation and industry (OCCSTR and INDSTR), usual occupation and industry (UOCCSTR and UINDSTR), where the respondent was living in 1935 (MST5STR, MCNY5STR, and MCIT5STR), and five other demographic variables (RELSTR, BPLSTR, FBPLSTR, MBPLSTR, MTONGSTR).
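The SUBS4050 selection described above can be sketched in Python as follows. This is a minimal example that assumes person records are held as plain dictionaries keyed by variable name; the function name and the sample values are hypothetical.

```python
# Subsamples whose person-level string variables have not yet been added,
# as stated in the note above.
SUBSAMPLES_MISSING_STRINGS = {2, 20}


def split_by_string_availability(person_records):
    """Split records into (strings available, strings still missing)."""
    available, missing = [], []
    for rec in person_records:
        if rec["SUBS4050"] in SUBSAMPLES_MISSING_STRINGS:
            missing.append(rec)
        else:
            available.append(rec)
    return available, missing


# Hypothetical records; the values are invented for illustration.
people = [
    {"SUBS4050": 1, "OCCSTR": "FARMER"},
    {"SUBS4050": 2, "OCCSTR": ""},
    {"SUBS4050": 20, "OCCSTR": ""},
    {"SUBS4050": 7, "OCCSTR": "TEACHER"},
]
with_strings, without_strings = split_by_string_availability(people)
```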

Two corrections were made to other IPUMS-USA samples:

The variable EDUC has been updated to reflect corrections by the Census to ACS 2001 and 2002 single-year files. The educational attainment question changed on the 1999 ACS questionnaire, which modified the response categories and eliminated the choice of "Vocational, technical, or business school degree." Previously the 2001 and 2002 single-year IPUMS data dictionary incorrectly showed labels for categories 65, 71 and 82 as "1 or more years of college credit, no degree," "2 years of college: Associate's degree - occupational program," and "2 years of college: Associate's degree - academic program," respectively. The correct data dictionary labels for categories 65, 71, and 82 are "Some college, but less than 1 year," "1 or more years of college credit, no degree," and "2 years of college: Associate's degree, type not specified," respectively.

For the 2007-2011 American Community Survey 5-year file, the variable IND had incorrect values for the cases from 2011 due to a programming error. This error has been fixed.
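For users who labeled EDUC values using an older data dictionary for the 2001 and 2002 ACS, the corrected labels listed in the EDUC note above can be expressed as a small lookup. This Python sketch is illustrative only: the function name and the way other labels are passed in are assumptions, and only the three categories named in the note are covered.

```python
# Corrected data dictionary labels for the 2001 and 2002 ACS single-year
# files, taken from the EDUC note above.
CORRECTED_EDUC_LABELS = {
    65: "Some college, but less than 1 year",
    71: "1 or more years of college credit, no degree",
    82: "2 years of college: Associate's degree, type not specified",
}


def educ_label(code, other_labels):
    """Prefer the corrected label; otherwise fall back to other_labels."""
    if code in CORRECTED_EDUC_LABELS:
        return CORRECTED_EDUC_LABELS[code]
    return other_labels.get(code)


# The previously published (incorrect) labels, also quoted in the note above.
old_labels = {
    65: "1 or more years of college credit, no degree",
    71: "2 years of college: Associate's degree - occupational program",
    82: "2 years of college: Associate's degree - academic program",
}
fixed = {code: educ_label(code, old_labels) for code in old_labels}
```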

Feb. 4, 2013. Posted new 2007-2011 ACS/PRCS 5-Year files. These files include all cases in the previously-released single-year files from the 2007, 2008, 2009, 2010, and 2011 ACS/PRCS. The new 5-year file differs in several ways from the single-year files. Most importantly, weights have been re-calculated, incomes and other dollar amounts have been standardized to 2011 dollars, and different topcodes have been applied. For more information, please see this FAQ on multi-year PUMS files.

Jan. 23, 2013. POVERTY thresholds were incorrect for individuals in families with over 7 people due to a programming error in all 1950-2011 files. This error has now been fixed.

YRSUSA1 was calculated improperly and REPWTP45 contained errors in the 2009-2011 ACS 3-year file. These errors have now been fixed.

Dec. 13, 2012. Posted new 2009-2011 ACS/PRCS 3-Year files. These files include all cases in the previously-released single-year files from the 2009, 2010, and 2011 ACS/PRCS. The new 3-year file differs in several ways from the single-year files. Most importantly, weights have been re-calculated, incomes and other dollar amounts have been standardized to 2011 dollars, and different topcodes have been applied. For more information, please see this FAQ on multi-year PUMS files. The 2009-2011 ACS/PRCS are quite similar to the 2008-2010 ACS/PRCS data, except that the variables WRKLSTWK, DEGFIELD, and DEGFIELD2 are now included in the 2009-2011 ACS/PRCS files.

In addition, the supplementary health insurance variables have been added to the 2011 ACS 1-year file. These five new variables are: HIURULE, HIUFPGBASE, HIUFPGINC, HIUID, and HIUNPERS. These summary health insurance variables were constructed by SHADAC. For more detailed information, consult the variable descriptions.

Oct. 30, 2012. Posted new 2011 American Community Survey and Puerto Rican Community Survey data. Together, the 2011 samples contain over three million person records. The 2011 ACS is the sixth ACS sample to provide information on group quarters, which can be identified in the GQ variable. Researchers analyzing multiple ACS samples over time should remove group quarters cases, since they are available only in the 2006-2011 data.

The 2010 and 2011 ACS releases are remarkably similar, but there are a couple of notable differences:

September 7, 2012 QMARINYR and QYRNATUR were not updated to include data from ACS/PRCS 2008 to 2010. These flags are now available.

YRSUSA2 was wrong for a substantial number of cases in the 2006-2010 ACS file due to a programming error. This error has now been fixed.

DEGFIELD and DEGFIELD2 were updated to include new codes for 2010 ACS/PRCS. Users interested in comparing DEGFIELD or DEGFIELD2 over time should know that there may be different codes for the same field of degree across samples. For example, Neuroscience changed from code 4003 in 2009 to 3611 in 2010. In DEGFIELD and DEGFIELD2, IPUMS preserves each sample's full range of codes.
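Because DEGFIELD codes can differ across samples, comparisons over time are safer when made on the field of degree rather than on the raw code. The Python sketch below shows one way to do this with a lookup keyed by sample year and code. It contains only the Neuroscience example given above; the function names are hypothetical, and a real lookup would have to be built from the DEGFIELD documentation for each sample.

```python
# (sample year, DEGFIELD code) -> field of degree.
# Only the single pair documented above is included; this table is
# deliberately incomplete and serves as an illustration.
DEGFIELD_FIELDS = {
    (2009, 4003): "Neuroscience",
    (2010, 3611): "Neuroscience",
}


def degfield_field(year, code):
    """Return the field name for a year-specific code, or None if unlisted."""
    return DEGFIELD_FIELDS.get((year, code))


def same_field(year_a, code_a, year_b, code_b):
    """True only when both year-specific codes resolve to the same listed field."""
    field_a = degfield_field(year_a, code_a)
    field_b = degfield_field(year_b, code_b)
    return field_a is not None and field_a == field_b


matches = same_field(2009, 4003, 2010, 3611)
```

A result of None from the lookup means only that the pair is missing from this illustrative table, not that the code is invalid in that sample.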

July 9, 2012 Released supplementary health insurance variables for the 2008-2010 ACS 1-year files. These five new variables are: HIURULE, HIUFPGBASE, HIUFPGINC, HIUID, and HIUNPERS. These summary health insurance variables were constructed by SHADAC. For more detailed information, consult the variable descriptions.

April 23, 2012 Some of the codes for the variable PUMARES2MIG displayed incorrect values due to a programming error. This error has now been fixed. The PUMARES2MIG codes were previously incorrect for the following states and PUMAs: Arkansas - 1000; California - 2601, 2602, 6701, 6702, 8101-8116, 8200; Kansas - 1401-1403, 1500; New Jersey - 701-703; Oklahoma - 1100, 1200; Washington - 2001-2009.

March 27, 2012 Released a preliminary version of the 1930 5% sample. Coding of string variables is still ongoing, with much of this work focused on the occupation and industry variables. We expect to release the final version in July.

Also, FARMSCHD in the 1930 1% sample had been coded incorrectly. The error has been corrected.

March 21, 2012 American Community Survey and Puerto Rican Community Survey samples from 2006-2010 have been updated to include minor revisions to the POVERTY variable. For individuals with a group quarters (GQ) code of 4, about 4.5% of individuals were incorrectly omitted from the universe. This error has been fixed.

March 13, 2012 IPUMS USA samples from 1960 to the present have been updated to include CLUSTER and STRATA variables. For the 1960-2000 samples, strata were created based on the stratification criteria used to select Public Use Microdata Samples, such as household size, age, race, ethnicity, home ownership, group quarters membership, and vacancy status. For the American Community Survey (ACS) samples, strata were created based on the lowest level of geography available in each sample. For the 2000-2004 samples, each state forms a stratum. In the 2005 onward ACS samples, strata were defined as unique Public Use Microdata Areas (PUMAs). For more information on the creation of STRATA, see this page: Construction of Strata in the IPUMS Samples.

In addition, the US and Puerto Rican 2000 1% samples had incorrect OCC codes due to a programming error. This error has now been fixed. Extracts including OCC made between November 2, 2011 and March 12, 2012 should be revised.
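The ACS stratification rule described above can be sketched in Python as a function that returns a grouping key for each record. This is an illustration of the rule only, not the procedure used to build STRATA, and it does not reproduce the values of the released variable. The function name is hypothetical, and pairing PUMA with STATEFIP rests on the assumption that PUMA codes are unique only within a state.

```python
def acs_stratum_key(year, statefip, puma):
    """Return a grouping key that mirrors the ACS stratification rule above.

    ASSUMPTION: PUMA codes repeat across states, so a unique PUMA is
    identified by the (STATEFIP, PUMA) pair.
    """
    if 2000 <= year <= 2004:
        # 2000-2004 ACS samples: each state forms a stratum.
        return (statefip,)
    if year >= 2005:
        # 2005 onward: each unique PUMA forms a stratum.
        return (statefip, puma)
    raise ValueError("rule shown here covers ACS samples from 2000 onward only")


# Hypothetical inputs for illustration.
key_2003 = acs_stratum_key(2003, 27, 1400)
key_2008 = acs_stratum_key(2008, 27, 1400)
```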

Feb. 20, 2012 IPUMS USA samples from 1980 to the present have been updated to fix the following errors:

Jan. 23, 2012. Added new 2006-2010 ACS/PRCS 5-Year files. These files include all cases in the previously-released single-year files from the 2006, 2007, 2008, 2009, and 2010 ACS/PRCS. The new 5-year file differs in several ways from the single-year files. Most importantly, weights have been re-calculated, incomes and other dollar amounts have been standardized to 2010 dollars, and different topcodes have been applied. For more information, please see this FAQ on multi-year PUMS files. Information specific to the new 2006-2010 release follows:

In addition, imputed relationship variables (IMPREL, IMPMOM, IMPPOP, IMPSP) were previously unavailable for the 1860 and 1870 samples with oversamples. They are now available.

Dec. 21, 2011. Added new 2008-2010 ACS/PRCS 3-Year files. These files include all cases in the previously-released single-year files from the 2008, 2009, and 2010 ACS/PRCS. The new 3-year file differs in several ways from the single-year files. Most importantly, weights have been re-calculated, incomes and other dollar amounts have been standardized to 2010 dollars, and different topcodes have been applied. For more information, please see this FAQ on multi-year PUMS files. One notable difference compared to the 2007-2009 3-Year file is that health insurance and disability variables are now included. Also, we have enhanced the data from the Census Bureau in a couple of important ways. We have applied health insurance edits to the 2008 and 2009 cases, and we provide integrated occupation codes. The data from the Census Bureau contained two different sets of occupation codes for the variables OCC and OCCSOC. The 2008-2009 cases contain the 2005-2009 ACS occupation codes, whereas the 2010 cases contain the 2010 ACS occupation codes (a crosswalk of these changes is available here: Occupation and Industry Variables). We provide a harmonized version in OCC1990. The original values can be found in the OCC and OCCSOC variables, although users should note that those variables contain codes that differ by the survey year, as described above.

In addition, the following errors have been fixed:

Nov. 17, 2011. The following error has been fixed:

Nov. 2, 2011. Posted new 2010 American Community Survey and Puerto Rican Community Survey data. Together, the 2010 samples contain over three million person records. The 2010 ACS is the fifth ACS sample to provide information on group quarters, which can be identified in the GQ variable. Researchers analyzing multiple ACS samples over time should remove group quarters cases, since they are available only in the 2006-2010 data.

The lowest level of geographic identifier in the 2010 ACS is the PUMA; 2010 PUMAs have the same boundaries as those in the 2005-2009 ACS and the 2000 census samples. The IPUMS version of the 2010 ACS provides the following additional geographic identifiers: CITY, COUNTY, METAREA, METRO, PUMASUPR, MIGTYPE1, MIGMET1, MIGCITY1, MIGPUMS1, PWTYPE, PWMETRO, PWCITY, and PWPUMAS. Additionally, information on unrelated subfamilies, a category not measured by the Census Bureau, is available. These variables were constructed at the University of Minnesota and are not available via the Census Bureau.

The 2009 and 2010 ACS releases are quite similar, but there are some differences:

New versions of every sample have also been posted. The migration variables (MIGRATE1 and MIGRATE5) have been fundamentally revised. These variables have been available since 1940, but the original Census Bureau variables have contained progressively less information over time. For example, in the 2000-onward ACS/PRCS, individuals are simply coded "same house," "different house in the U.S.," or "different house outside the U.S." in the original census data. However, it is possible to construct additional detail about these movers from other census variables, in particular MIGPLAC1 and MIGPLAC5. In the past, users interested in additional migration detail in later samples have needed to manually recode these other variables. Users interested in comparing the migration variables across time have confronted a non-harmonized coding scheme and comparability differences across both years and samples. The current revised versions of MIGRATE1 and MIGRATE5 simplify both tasks by:

See MIGRATE1 and MIGRATE5 for more information about these changes.

There are three other major changes to the data:

August 18, 2011. Posted new versions of the 1900 5% sample and all 2000-onward samples:

August 10, 2011. The 2007-2009 and 2005-2009 multi-year American Community Survey data are now available, along with the 2007-2009 multi-year Puerto Rican Community Survey data. (Technical problems prevented the release of the 2005-2009 Puerto Rican Community Survey data; it will be released by August 18, 2011.) The 2007-2009 3-year file includes all cases in the previously-released single-year files from the 2007, 2008, and 2009 ACS/PRCS; the 2005-2009 5-year file includes all cases in the previously-released single-year files from the 2005, 2006, 2007, 2008, and 2009 ACS/PRCS. Yet the new multi-year files differ in several ways from the single-year files. Most importantly, weights have been re-calculated, incomes and other dollar amounts have been standardized to 2009 dollars, and different topcodes have been applied. For more information, please see this FAQ on multi-year PUMS files. Information specific to the new 2007-2009 and 2005-2009 files follows:

All other IPUMS samples have been updated:

Finally, our extract system has been streamlined so that users see a summary of their extract first. If you just want a standard rectangular extract with no extra features, you can make it immediately. Or you can change aspects of your extract (data structure, customized sample sizes, case selection, attached characteristics of other household members, and data quality flags).

July 12, 2011. The linked representative samples have been updated. The update primarily affects variables that were not present in the original Church of Jesus Christ of Latter-day Saints complete-count database for 1880: DEAF, BLIND, MAIMED, IDIOTIC, INSANE, SICKNESS, MARRINYR, SCHOOL, LIT, MOUNEMP, and QTRUNEMP. Previous versions of the data contained information for these variables if the record was part of the 1880 10% sample. The updated versions now contain information for these variables for all 1880 records.

June 15, 2011. The IPUMS-USA extract system now allows users to customize their sample sizes. This is useful for researchers who do not need or cannot use the large number of cases contained in some IPUMS-USA samples. It can also be used to obtain small testing datasets before running a program on a large dataset. For more information, see the FAQ.

The following errors have also been fixed:


May 19, 2011. Posted revised versions of all datasets.

Most notably, the Census Bureau's November 2010 revisions to the ACS/PRCS samples are now incorporated into the IPUMS:

IPUMS now offers integrated versions of the original Census Bureau subfamily variables that parallel the subfamily variables constructed by IPUMS. Newly available variables include CBSFRELATE (relationship within the subfamily), CBSFTYPE (type of subfamily), CBSUBFAM (subfamily number), and CBNSUBFAM (number of subfamilies in the household). Users should note that the Census Bureau's procedures for classifying subfamilies have changed dramatically over time, so these variables are useful mainly for the comparability they offer with the Census Bureau's summary files. See our subfamilies page for more information.

Information on the TRIBE of American Indians has been improved as well. Most notably, persons who were previously classified (incorrectly) as "Alaska Native, tribe not reported" in all 2000-onward samples are now classified correctly as "American Indian or Alaska Native, tribe not reported." Recoding improvements were made to the 1990, 2000, and ACS samples; labeling improvements were made to the 1900 and 1910 samples.

Several improvements were made to the 1900 census 5% sample:

Finally, because of improvements to our data construction, editing, and allocation procedures, many variables have been refined. In particular:



Nov. 18, 2010. Posted new 2008-2009 ACS data. Because of miscommunications with Census Bureau staff, the health insurance edit for VA (HINSVA) and Indian Health Service (HINSIHS) insurance was performed incorrectly. These variables and their accompanying flags are now correct, and documentation of the edit has been updated on the ACS health insurance page. Additionally, the edits for all health insurance variables have been applied to Puerto Rico data (this was not done before).

Nov. 10, 2010. Posted new 2008-2009 ACS data. Because of a programming error, the data posted on Nov. 9 contained incorrectly edited variables for Medicaid (HINSCAID), Medicare (HINSCARE), and military insurance (HINSTRI) coverage, which affected the summary variables for any (HCOVANY), private (HCOVPRIV), and public (HCOVPUB) coverage. These variables and their accompanying flags are now correct.

Nov. 9, 2010. Posted new 2009 American Community Survey and Puerto Rican Community Survey data, along with revised data for 2008. Together, the 2009 samples contain over three million person records. The 2009 ACS is the fourth ACS sample to provide information on group quarters, which can be identified in the GQ variable. Researchers analyzing multiple ACS samples over time should remove group quarters cases, since they are available only in the 2006-2009 data.

The lowest level of geographic identifier in the 2009 ACS is the PUMA; 2009 PUMAs have the same boundaries as those in the 2005-2008 ACS and the 2000 census samples. The IPUMS version of the 2009 ACS provides the following additional geographic identifiers: CITY, COUNTY, METAREA, METRO, PUMASUPR, MIGTYPE1, MIGMET1, MIGCITY1, MIGPUMS1, PWTYPE, PWMETRO, PWCITY, and PWPUMAS. Additionally, information on unrelated subfamilies, a category not measured by the Census Bureau, is available. These variables were constructed at the University of Minnesota and are not available via the Census Bureau.

The 2008 and 2009 ACS releases are quite similar, but there are some differences:

Additionally, incorrect data in QDIFPHYS, the data allocation flag for DIFFPHYS, has been corrected in all 2000-present samples.

October 20, 2010. ENUMDIST from the 1880 IPUMS complete count database was updated in all 21 of the linked representative samples. In the previous versions, ENUMDIST from 1880 had a large proportion of missing values. This has been corrected.

October 13, 2010. Weights in the linked representative sample for males, 1880-1930 have been revised. In the previous version of the male file, PERWT was constructed with erroneous age proportions for 1930. The problem was corrected and PERWT recalculated.

September 7, 2010. Posted new versions of all IPUMS-USA samples. IPUMS-USA now offers several new geographic tools:

Several miscellaneous errors have also been corrected:

June 8, 2010. An error in the programming for the June 4 revision resulted in PERWT values of 0 for all cases in the 1880 1% and 1880 10% samples. These samples now contain the correct PERWT values.

June 4, 2010. Posted final versions of the IPUMS Linked Representative Samples. More information is available here.

Posted new versions of all IPUMS samples. There are several new variables:

Other variables have been modified:

March 4, 2010. Posted new versions of 2003-onward files.

Future revisions of the IPUMS data will be posted on a quarterly schedule (the beginning of March, June, September, and December). The next update will be on June 1, 2010.

February 10, 2010. Posted new versions of 1940-onward files.

Posted new versions of 1870 and 1900 datasets. In the new 1870 data, OCC and OCC1950 values are included for people under the age of 16 who reported an occupation. The previous version coded these people as "not in universe." In 1900, several cases that previously had invalid SPEAKENG values are now coded properly.

January 28, 2010. Posted new versions of 1900-1910 and 1940-onward files. Two minor changes have been made:

Other changes are limited to the ACS/PRCS:

January 12, 2010. Added 2006-2008 ACS/PRCS 3-Year files. These files include all cases in the previously-released single-year files from the 2006, 2007, and 2008 ACS/PRCS. The new 3-year file differs in several ways from the single-year files. Most importantly, weights have been re-calculated, incomes and other dollar amounts have been standardized to 2008 dollars, and different topcodes have been applied. For more information, please see this FAQ on multi-year PUMS files. Another notable difference is that some values of AGE in the 2006 portion of the 3-year file differ from those in the 2006 single-year file because of a change to the Census Bureau's disclosure avoidance methods. The Census Bureau has re-released the 2006 single-year file with the revised AGE variable, and it will be added to the IPUMS database soon.

An expanded version of the 1880 100% database is now available. The new release contains a variety of improvements over the previously-available version of this data:

Also posted new versions of all IPUMS-USA samples. There are several notable changes to variable availability and ease of use:

There are also several more minor changes:

November 9, 2009. Posted new versions of the 2008 ACS/PRCS to correct erroneous data in REPWT14. Additionally, cases in the 1920 1% sample that should have been coded as "9997" (unknown) in YRNATUR instead received codes of "ZZZZ". This has been corrected.

November 6, 2009. Posted new versions of the 2008 ACS/PRCS. HHINCOME, FTOTINC, INCEARN, and INCTOT were not provided before due to errors in the original Census Bureau data; they are now available because the Census Bureau has released new data. POVERTY was previously provided just as the Census Bureau released it, with different values for each nonrelative of the householder. The POVERTY variable is now calculated as in all previous samples, where people in unrelated subfamilies have the same value.

The Census Bureau has not documented their data update, so users should know that any 2008 PUMS files downloaded from the Census Bureau's website between October 30 and November 3 have incorrect data for the four income summary variables with the Census Bureau names of FINCP, HINCP, PERNP, and PINCP. And, as of November 6, the DataFerrett data had not been updated; they still contain incorrect codes of -$59,999 (HINCP and FINCP), -$10,000 (PERNP), and -$19,999 (PINCP).
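Users who hold a 2008 PUMS file obtained directly from the Census Bureau may want a quick check for the codes listed above. The Python sketch below counts records carrying those values in the four income summary variables. It assumes records are plain dictionaries keyed by the Census Bureau variable names; the function name and sample values are hypothetical. A nonzero count is only a prompt to re-download the file, not proof that it is outdated.

```python
# Incorrect codes reported above, keyed by Census Bureau variable name.
SUSPECT_CODES = {
    "HINCP": -59999,
    "FINCP": -59999,
    "PERNP": -10000,
    "PINCP": -19999,
}


def count_suspect_values(records):
    """Count, per variable, the records that carry the listed code."""
    counts = {name: 0 for name in SUSPECT_CODES}
    for rec in records:
        for name, bad_value in SUSPECT_CODES.items():
            if rec.get(name) == bad_value:
                counts[name] += 1
    return counts


# Hypothetical records; the values are invented for illustration.
rows = [
    {"HINCP": 52000, "FINCP": 52000, "PERNP": 30000, "PINCP": 31000},
    {"HINCP": -59999, "FINCP": -59999, "PERNP": -10000, "PINCP": -19999},
    {"HINCP": 18000, "PINCP": 18000},
]
suspect_counts = count_suspect_values(rows)
```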

There are two other changes to IPUMS-USA data:

November 4, 2009. Added 1% samples from the 2008 American Community Survey (ACS) and the 2008 Puerto Rico Community Survey (PRCS). Together, the samples contain approximately three million person records. The 2008 ACS is the third ACS sample to provide information on group quarters, which can be identified in the GQ variable. Researchers analyzing multiple ACS samples over time should remove group quarters cases, since they are available only in the 2006-2008 data.

The lowest level of geographic identifier in the 2008 ACS is the PUMA; 2008 PUMAs have the same boundaries as those in the 2005-2007 ACS and the 2000 census samples. The IPUMS version of the 2008 ACS provides the following additional geographic identifiers: CITY, METAREA, METRO, PUMASUPR, MIGTYPE1, MIGMET1, MIGCITY1, MIGPUMS1, PWTYPE, PWMETRO, PWCITY, and PWPUMAS. These variables were constructed at the University of Minnesota and are not available via the Census Bureau.

There are several noteworthy changes to the IPUMS data that stem from the Census Bureau's modifications to the ACS/PRCS questionnaire; users should consult our page on the 2008 ACS/PRCS.

We have also made a number of other improvements to the IPUMS data. Specifically:

We also made several changes that were not specifically related to the 2008 ACS/PRCS data release:

October 9, 2009. Added two new variables for 1850-1930. In these samples, dwellings could include more than one household. Households have always been uniquely identified by SERIAL; the new variable DWELLING is a unique identifier for dwellings. Within each value of DWELLING, there may be more than one SERIAL. The new variable DWSEQ indicates the order in which households were enumerated within the dwelling. For more information, please see the variable descriptions.

October 8, 2009. Added higher-density samples for 1880 and 1900. The 1880 10% sample has replaced the preliminary 1880 5% sample, while the 1900 5% sample has replaced the preliminary 1900 2.5% sample. The final samples contain all cases from the preliminary samples (which came from odd-numbered microfilm reels only) as well as new cases from even-numbered microfilm reels. For more details, see the sample description page for 1880 and 1900.

August 11, 2009. Posted new versions of all linked data samples. The LINKWT variable has been corrected in all samples. Due to a processing error, LINKWT values were too low by a factor ranging from 2 to 50. Any data downloaded previously should be replaced with these new data.
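The relationship among DWELLING, SERIAL, and DWSEQ can be illustrated with a short Python sketch that groups household records by dwelling and lists each dwelling's households in enumeration order. The dictionary-based record layout, the function name, and the sample values are assumptions made for this example.

```python
from collections import defaultdict


def households_by_dwelling(household_records):
    """Map each DWELLING to its SERIAL values, ordered by DWSEQ."""
    grouped = defaultdict(list)
    for rec in household_records:
        grouped[rec["DWELLING"]].append(rec)
    return {
        dwelling: [h["SERIAL"] for h in sorted(hhs, key=lambda h: h["DWSEQ"])]
        for dwelling, hhs in grouped.items()
    }


# Hypothetical household records; the values are invented for illustration.
households = [
    {"SERIAL": 102, "DWELLING": 50, "DWSEQ": 2},
    {"SERIAL": 101, "DWELLING": 50, "DWSEQ": 1},
    {"SERIAL": 103, "DWELLING": 51, "DWSEQ": 1},
]
by_dwelling = households_by_dwelling(households)
```

In this example, dwelling 50 contains two households and dwelling 51 contains one.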

June 17, 2009. Posted new versions of 1950-2007 data.

June 11, 2009. Minor correction to 2007 ACS/PRCS data. In the 2007 ACS (and the 2007 cases in the 2005-2007 ACS 3-year file), some cases in Florida had missing values of PROPINSR. These are now coded as 9999, which is the correct PROPINSR topcode for Florida. The documentation of topcodes has also been updated to reflect this change.

May 29, 2009. Corrected minor inaccuracies in 2000 and ACS/PRCS data.

May 13, 2009. Added four new variables describing subfamilies to 1880-2007 IPUMS samples: SFTYPE (subfamily type), SFRELATE (relationship within subfamily), SUBFAM (subfamily membership), and NSUBFAM (total number of subfamilies in the household). For more information, see the subfamilies overview page.

Also, documentation for other family interrelationship variables has been updated to conform to longstanding IPUMS procedures:

See the variable descriptions for more information.

April 21, 2009. Corrected missing values and other minor inaccuracies in several samples. First, several variables contained missing data for some cases. Missing data has been assigned to the proper codes as follows:

Second, all housing units with 10 or more persons unrelated to the household head have been re-classified as group quarters in all American Community Survey and Puerto Rican Community Survey samples, consistent with the treatment of such households in the 2000 census. For more information, see GQ. The cases in such housing units are now coded as 5 in the GQ variable and 9 in the GQTYPE variable (900 in the detailed version GQTYPED).

Third, information on variable availability has been updated as follows:

Finally, the IPUMS variables FDSTPAMT and OWNCOST are now adjusted to calendar-year dollars in all ACS/PRCS samples; see the ACS income variables note for more information.
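As a rough illustration of the group-quarters reclassification described in this entry, the sketch below applies the 10-or-more-unrelated-persons rule to a single housing unit. The `unrelated_to_head` field and the function name are hypothetical, and this is not the IPUMS processing code; only the threshold and the GQ, GQTYPE, and GQTYPED codes are taken from the note above.

```python
from typing import Optional

# "10 or more persons unrelated to the household head" (from the note above)
UNRELATED_THRESHOLD = 10


def reclassify_unit(persons: list[dict]) -> Optional[dict]:
    """Return the new GQ codes for a housing unit, or None if it is unchanged.

    Each person is a dict with a hypothetical boolean key 'unrelated_to_head'.
    """
    unrelated = sum(1 for p in persons if p["unrelated_to_head"])
    if unrelated >= UNRELATED_THRESHOLD:
        # Codes stated in the revision note for reclassified units.
        return {"GQ": 5, "GQTYPE": 9, "GQTYPED": 900}
    return None
```

A unit with a head and ten unrelated persons would be reclassified; a unit with a head and nine unrelated persons would not.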

April 1, 2009. Improved and updated the coding of in-laws in the 2000-2007 American Community Survey (ACS) and 2005-2007 Puerto Rican Community Survey (PRCS) samples. In these samples, the Census Bureau's relationship variable includes only a global "in-law" category. IPUMS attempts to provide a more detailed classification of parents-in-law, siblings-in-law, and children-in-law in the RELATE variable. The new release of the ACS and PRCS datasets improves the procedures for making these detailed in-law assignments. More information on the new procedures is available here . Additionally, users should take note of three coding errors in the old classification scheme that have been corrected and/or no longer apply in the new classification scheme:

Additionally, in the 2005-2007 ACS and PRCS 3-Year samples, the person weights (PERWT) for individuals in group quarters were not copied to the household weight variable (HHWT). This has been corrected.

March 5, 2009. Posted the 2005-2007 American Community Survey/Puerto Rican Community Survey 3-year file. This file includes all cases from the previously-released single-year files from the 2005-2007 ACS/PRCS. The new 3-year file differs in several ways from the single-year files. Most importantly, weights have been re-calculated, incomes and other dollar amounts have been standardized to 2007 dollars, and different topcodes have been applied. For more information, please see this FAQ on multi-year PUMS files .

March 1, 2009. PRCS data from 2005-2007 have been altered to resolve small coding differences across survey years. All 1,093 cases previously coded as 2 on the ABSENT variable in the 2006 and 2007 PRCS single-year files are now coded as 3, and one individual previously coded as 899 on the RACE variable in the 2005 PRCS single-year file is now coded as 943.

There was also a slight change in the PRCS's main immigration variable. PRCS data previously available in YRSUSA1 has been shifted to YRSPR, and the flag associated with this variable has changed from QYRIMM to QYRSPR .

February 9, 2009. Posted new versions of linked data samples for males from 1860-1880 and 1870-1880. A middle initial mismatch filter that previously had not been applied properly was applied, removing 303 cases from the 1860 data and 302 cases from the 1870 data.

December 20, 2008. Posted new versions of the linked data samples for males from 1850-1880, 1880-1900, and 1880-1910.

Changes to the 1850-1880 data: the dataset was increased by 49 records. Some records were removed and some added as the result of 1) rerunning one of the classifiers and 2) properly applying a middle initial mismatch filter.

Changes to the 1880-1900 and 1880-1910 data: 282 cases were removed from 1900 and 215 cases were removed from 1910, after applying a middle initial mismatch filter that previously had not been applied properly.

December 11, 2008. Posted remaining linked data samples . Also posted new versions of samples linking couples from 1870-1880 and 1880-1910. In the 1870 data, 10 cases that previously had a LINKWT of 0 were given the correct non-0 LINKWT values. In the 1910 data, 140 cases that had LINKWT values greater than 5 were assigned values of 5 (the maximum allowable LINKWT).
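The LINKWT cap mentioned above amounts to a simple ceiling on each weight. The following is a minimal sketch of that idea, not the code used to build the linked samples; the function name is hypothetical and the weights are treated as plain floats.

```python
# Maximum allowable LINKWT, as stated in the revision note above.
MAX_LINKWT = 5.0


def cap_linkwt(weights: list[float]) -> list[float]:
    """Cap each linkage weight at MAX_LINKWT, leaving smaller values unchanged."""
    return [min(w, MAX_LINKWT) for w in weights]
```

For example, `cap_linkwt([0.5, 5.0, 7.2])` leaves the first two weights alone and lowers the third to 5.0.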

November 11, 2008. Posted new versions of samples for 1970-2007. Improvements were made to the 1970 samples to correct the variable INCOTHER. Samples from 1980-2007 were expanded to include the variable OWNCOST.

Posted new version of the 1880 100% database. Fixed problems with the MCDSTR and PAGENO variables. Group quarters units containing more than 60 people were split into 1-person households. Researchers needing to study these units intact can use SERIAL80 and PERNUM80.

October 11, 2008. Added 1880 100% population database. This dataset was originally entered for genealogical purposes by the Church of Jesus Christ of Latter Day Saints (LDS). Data cleaning and harmonization took place at the Minnesota Population Center (MPC). Versions of this data are also available from the MPC's North Atlantic Population Project and the LDS's genealogical website FamilySearch.org.

The IPUMS-USA version of the data contains fully integrated codes and labels, newly-constructed family inter-relationship variables, and missing data allocation for key demographic variables. Since the dataset was first constructed for genealogy, several variable groups were never entered. Excluded variables include items relating to school, literacy, unemployment, disability, month of birth, marriage within the past year, and street address. The most detailed geographic variables are MCDSTR and INCSTR .

Added 2.5% preliminary sample of the 1900 census. This sample is "preliminary" because the final version will contain 5% of the population. The preliminary sample includes data only from odd-numbered microfilm reels. Counties on even-numbered reels are not represented in this dataset. Alaska and Hawaii are also excluded from the preliminary dataset. The final 5% dataset will be released in early 2009.

September 26, 2008. Added 1% samples from the 2007 American Community Survey (ACS) and the 2007 Puerto Rico Community Survey (PRCS). The samples have approximately three million person records. The 2007 ACS is the second ACS sample to provide information on group quarters, which can be identified in the GQ variable. Researchers analyzing multiple ACS samples over time should remove group quarters cases, since they are available only in the 2006-2007 data.
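For researchers pooling ACS samples across years, the advice above comes down to filtering on the GQ variable before analysis. Here is a minimal sketch, assuming extract records have been loaded as dictionaries with a "GQ" key; the function name is hypothetical, and the set of codes that denote group quarters is passed in by the caller and should be taken from the GQ variable description rather than from this example.

```python
def drop_group_quarters(records: list[dict], gq_codes: set[int]) -> list[dict]:
    """Keep only records whose GQ value is not one of the group-quarters codes.

    gq_codes is supplied by the caller; consult the GQ variable description
    for the codes that apply to the samples being analyzed.
    """
    return [r for r in records if r["GQ"] not in gq_codes]


# Illustrative data only: the codes {3, 4} are an assumption for this example.
sample = [
    {"SERIAL": 1, "GQ": 1},
    {"SERIAL": 2, "GQ": 3},
    {"SERIAL": 3, "GQ": 4},
]
households_only = drop_group_quarters(sample, {3, 4})
```

Passing the codes explicitly keeps the filter honest about which definition of group quarters is being applied.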

The lowest level of geographic identifier in the 2007 ACS is the PUMA ; 2007 PUMAs have the same boundaries as those in the 2005-2006 ACS and the 2000 census samples. The IPUMS version of the 2007 ACS provides the following additional geographic identifiers: CITY, METAREA, METRO, PUMASUPR, MIGTYPE1, MIGMET1, MIGCITY1, MIGPUMS1, PWTYPE, PWMETRO, PWCITY, and PWPUMAS . These variables were constructed at the University of Minnesota and are not available via the Census Bureau.

Note that the name of the IPUMS variable describing military service in September 2001 or later has been changed from VET01X03 to VET01LTR. This name more accurately reflects the information contained in the variable.

More information on the background of and future plans for the American Community Survey is available at the ACS information page .

April 11, 2008. Posted IPUMS Version 4.0, the first major revision of the IPUMS files since 2004. Includes revised versions of all samples from 1850-1930, a new 1880 5% sample, and 13 new samples from the Puerto Rican Censuses of 1910-2000 and the Puerto Rican Community Survey. IPUMS 4.0 contains many new variables, including long-term Hispanic identification back to 1850 (HISPAN), a consistent single-race identification variable from 1850-2006 (RACESING), a battery of socioeconomic indices, original strings for occupation (OCCSTR) and industry (INDSTR), new detailed weight variables for the historical samples (HHWTDET and PERWTDET), and new standardized low-level geographic identifiers (MCD and INCORP). More information is available on the IPUMS 4.0 release page.

The most recent previous version of IPUMS data and documentation (IPUMS 3.0) is still available via the IPUMS archive page at ICPSR. The archive page permits users to revise old extracts, create new extracts, and download data and documentation. The link titled "IPUMS-USA website as of March, 2008" leads to a fully-functioning mirror of the IPUMS website as it existed prior to the release of IPUMS 4.0. The archive page contains versions of the website from previous years as well.

February 14, 2008. Posted a new version of the 1950 census sample, with a correction made to the BPL variable. Several cases that had been erroneously coded "Missing/blank" are now coded correctly as follows: 94 cases coded "Israel," 9 coded "Byelorussia," and 3 coded "Pakistan." In the 2000 census samples, changed the MIGMET5 code for Hattiesburg, MS from 3285 to 3300 to be consistent with our METAREA coding.

Re-released VALUEH for the 2006 ACS sample; during a recent website update, VALUEH had inadvertently been removed from the data extract system.

December 14, 2007. Posted new versions of the 2005 and 2006 ACS samples: released CITYPOP for both samples. Fixed a small error in QCONDOFE and QVALUEH in the 2006 sample. Prior to the update, a small number of cases had missing values for these two variables.

Posted new versions of the 1% and 5% census samples for 2000: fixed PUMALAND, PUMAAREA, and ACREPROP . Prior to this correction, these three variables contained incorrect data.

November 15, 2007. Posted new versions of all ACS samples; a correction was made to the BUILTYR2 variable. Previously, households built in 1939 or earlier (BUILTYR2 = 10) were grouped with those reported as being built in 2005 or later (BUILTYR2 = 1).

October 15, 2007. Added a 1% sample from the 2006 American Community Survey (ACS). The sample has approximately 2,970,000 person records. The 2006 ACS is the first ACS sample to provide information on group quarters, which can be identified in the GQ variable. Researchers analyzing multiple ACS samples over time should remove group quarters cases, since they are available only in the 2006 data.

The lowest level of geographic identifier in the 2006 ACS is the PUMA ; 2006 PUMAs have the same boundaries as those in the 2005 ACS and the 2000 census samples. The IPUMS version of the 2006 ACS provides the following additional geographic identifiers: CITY, METAREA, METRO, PUMASUPR, MIGTYPE1, MIGMET1, MIGCITY1, MIGPUMS1, PWTYPE, PWMETRO, PWCITY, and PWPUMAS . These variables were constructed at the University of Minnesota and are not available via the Census Bureau.

More information on the background and future plans for the American Community Survey is available at the ACS information page .

August 17, 2007. Added HHTYPE for all samples from 1940 to 2005. In the future, HHTYPE will also be made available for all samples from 1850-1930.

July 25, 2007. Added RACESING to all samples from 1900, 1910, and 1940. Added HISPAN and HISPRULE to all samples from 1910.

July 19, 2007. Posted updated versions of all samples from 1900, 1910, and 1930. Corrections were made to the CHSURV variable in the 1900 and 1910 samples. All values were previously 0 or 1. The updated samples contain correct values. Corrections were made to the NSIBS variable in the 1900, 1910, and 1930 samples. Previously, a small number of persons identified as "siblings" in the RELATE variable (code 701) incorrectly received a value of 0 for NSIBS. This error has been corrected.

July 10, 2007. Posted updated versions of all samples from 1970 and 1980, and the ACS samples from 2000, 2001, and 2002. The updated samples include fixes to COSTELEC, COSTGAS, COSTFUEL, and COSTWATR . These variables did not properly identify cases having values of greater than 9990. All cases in this range are in the universe but have unreported values, usually because utility costs were included in rent payments. The old versions of the datasets incorrectly identified these cases as not being in the universe. COSTELEC and COSTGAS had the additional problem of presenting monthly values instead of annual values. These problems are now fixed.
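When summarizing these utility cost variables, values above 9990 should be treated as codes rather than dollar amounts. The sketch below shows one way to set them aside before averaging. The function names are hypothetical, it assumes the input contains only in-universe cases, and it does not handle any other special codes the variables may carry; see the variable descriptions for the full code lists.

```python
from typing import Optional

# Per the note above, values greater than 9990 are in universe but unreported.
SPECIAL_CODE_FLOOR = 9990


def reported_costs(values: list[int]) -> list[int]:
    """Return only the values that represent reported annual dollar amounts."""
    return [v for v in values if v <= SPECIAL_CODE_FLOOR]


def mean_reported_cost(values: list[int]) -> Optional[float]:
    """Average the reported amounts, or return None if none were reported."""
    reported = reported_costs(values)
    if not reported:
        return None
    return sum(reported) / len(reported)
```

Returning None when nothing was reported avoids a misleading average of zero.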

The new 1980 5% sample additionally fixes a problem in the CITY variable. In the old sample, San Francisco was incorrectly identified. It has been corrected.

June 21, 2007. Posted an updated version of the 1930 1% sample. The updated sample includes fixes of minor problems in OCCSCORE (missing occupation data was not being allocated), YRSUSA2 (some allocated values were inconsistent with YRSUSA1 ), and QMARST (this variable indicated that we made more logical edits than were actually made).

Released new data extraction system with the "Attach Variables" feature, which allows researchers to create variables specifying characteristics of respondents' spouses, mothers, fathers, and household heads.

Released new version of the 2005 American Community Survey sample that includes 160 replicate weight variables (see REPWT and REPWTP ).

Released CITYPOP for 1850-1930 samples. Due to a technical problem, we had not been offering CITYPOP in these samples since February 2007. The CITYPOP values that we are providing now are not different from the values that were available prior to February.

June 7, 2007. Posted 1930 1% sample (up from previous 0.5% sample for 1930). The new sample includes several new occupation and industry variables - OCC, OCC1930, IND, IND1930 - as well as HISPAN and RACESING .

April 26, 2007. Added new occupation crosswalks (OCC to OCCSOC) for the 2000 census samples and the ACS samples; these are available via links from our Occupation and Industry documentation page. Also improved our OCC and OCCSOC code lists (available from the respective variable descriptions) for the 2000 census and ACS samples.

April 24, 2007. Posted a new version of the 2005 ACS; a correction was made to the MORTAMT1 variable.

April 9, 2007. Added Consistent PUMA variable and shapefiles (see CONSPUMA). CONSPUMA reconciles differences in low-level geographic identifiers in the 5% samples from 1980, 1990, 2000, and the 2005 ACS. Also released all new shapefiles for low-level geographic identifiers from 1970-2005. Changes to the previous shapefiles were minor: numerous "holes" in the maps were assigned to their appropriate PUMA, County Group, or SEA. All files are accessible via the links on our geographic tools page.

March 27, 2007. Changed the name of the RACHIST variable to RACESING .

March 21, 2007. Added QHISPAN, the data quality flag for HISPAN, to the 2000 and ACS samples. Posted new versions of the 1940 and 1950 samples: a minor correction was made to the CHBORN variable.

Posted new versions of the 1910 samples: we corrected a problem with SERIAL so that households within multi-household dwellings are now uniquely identified. The problem had affected less than .13% of households in the 1910 1.4% sample.

February 15, 2007. Created HISPAN and HISPRULE variables for the 1900 and 1930 samples. A later data release will create these variables for the 1850-1880 and 1910-1920 samples.

Created a new harmonized version of the TRIBE variable, which is now available in 1900-1910, 1990-2000, and the ACS.

Posted a new version of the 1940 sample: a minor correction was made to the VET1940 variable.

January 31, 2007. Released new harmonized occupation and industry variables for 1950-2005: OCC1990 and IND1990. The OCC1990 variable was created in collaboration with researchers at the Bureau of Labor Statistics. Both variables are available only via the IPUMS.

Added metropolitan area designations to the 2003 ACS, in METAREA and MET2003 . Metropolitan areas are also identified in the 2005 ACS IPUMS sample.

Created HISPAN and HISPRULE variables for the 1940-1970 samples.

Added RACHIST values to the 1950-1990 samples and the 2005 ACS IPUMS sample. RACHIST adapts an algorithm developed at the National Center for Health Statistics to assign single races to persons who reported more than one race from 2000 onward.

December 19, 2006. Replaced the 1-in-250 1910 sample with two new samples: the 1910 1% sample and the 1910 1.4% sample with oversamples. The 1% sample includes a 1-in-100 national population sample, including Alaskans, Hawaiians, and persons enumerated on the American Indian Schedules. The 1.4% sample with oversamples includes a 1-in-70 national population sample that has been combined with large oversamples of Blacks, Hispanics, Alaskans, Hawaiians, and persons enumerated on the American Indian schedules. The 1910 Weighted sample must be used with weighting variables (see PERWT and HHWT ).

Replaced the 1900 General sample with two new samples: the 1900 1% sample and the 1900 1% sample with oversamples. The 1900 1% sample is a 1-in-100 national sample, including Alaskans, Hawaiians, and persons enumerated on the American Indian Schedules. This sample has the same cases as the former "1900 General sample" did, though some variables and values have been modified in minor ways. The 1900 1% sample with oversamples is a 1% national sample that has been merged with 1-in-5 oversamples of Alaskans, Hawaiians, and persons enumerated on the American Indian schedules. The 1900 1% sample with oversamples must be used with weighting variables (see PERWT and HHWT ).

More information about these samples is available in the 1900 and 1910 sections of the sample descriptions page. We expect to release a revised version of these samples in March 2007. The revised samples will include detailed geography at the minor civil division level and integrated versions of variables specific to the Alaskan, Hawaiian, and American Indian populations.

December 10, 2006. Posted new version of the 2005 ACS sample that includes the following new geographic identifiers: CITY, METAREA, METRO, PUMASUPR, MIGTYPE1, MIGMET1, MIGCITY1, MIGPUMS1, PWTYPE, PWMETRO, PWCITY, and PWPUMAS . These variables were constructed at the University of Minnesota and are not available via the Census Bureau.

November 29, 2006. Posted new versions of all samples from 1940 through 2005. The new samples include several minor improvements to SPLOC, MOMLOC, and POPLOC . These modifications have resulted in minor changes to the constructed household variables, the family interrelationship variables, POVERTY, and FTOTINC . Detailed information on these variables can be found in the family interrelationships documentation .

An error was corrected in the POVERTY variable for all samples. In two-person families where one person was over age 65 and the other person was under age 65, we sometimes used slightly different poverty thresholds for each member of the family. We should have applied the same threshold to both members of the family. This resulted in several thousand cases in each sample having a poverty value that was off by an average of two percent (10 points on POVERTY's 1-500 scale). We have corrected the problem.

The new samples also include a small number of corrected income values in the 1950, 1960, and 1970 samples. The majority of cases affected have negative income values.

November 20, 2006. Corrected a problem in the RACE variable in the 2005 ACS sample. There were approximately 3,000 cases with missing values. All of the cases were multi-racial persons. All cases are now assigned to the appropriate categories.

October 11, 2006. Posted 1% sample from the 2005 American Community Survey (ACS). The 2005 sample is the first ACS microdata to identify sub-state geography, including PUMA, MIGPUMA1, and PWPUMA00. The IPUMS version of the 2005 ACS also identifies metropolitan status (METRO). A December 2006 release of the IPUMS 2005 ACS sample will identify CITY, METAREA, PUMASUPR, MIGTYPE1, MIGMET1, MIGCITY1, MIGPUMS1, PWMETRO, PWCITY, PWTYPE, and PWPUMAS. These variables are being constructed at the Minnesota Population Center and are not available via the Census Bureau.

The base data for the IPUMS 2005 sample is the ACS data that the Census Bureau released on October 5th, 2006. The Census Bureau had originally released a version of the dataset on September 11th, 2006. The September release contained several small errors, so the Census Bureau updated the dataset in October. The erroneous dataset was never available via the IPUMS data extraction system.

More information on the background and future plans for the American Community Survey is available at the ACS information page .

October 1, 2006. Posted 0.5% sample from the 1930 census (up from the previous 0.2% 1930 sample).

September 6, 2006. Posted new version of IPUMS-USA website. The website has a new design, and the content of most variable descriptions has changed at least slightly. Users can still access all extract requests made on the old website.

June 30, 2006. Posted new versions of the 2000 1%, 5%, and Unweighted samples: a correction was made to the MIGPLAC5 variable.

April 27, 2006. Posted new versions of all ACS samples: a correction was made to the INCBUS00 variable.

April 7, 2006. Posted new versions of all 2000 Census samples and all ACS samples: a correction was made to the OCC variable.

January 20, 2006. Posted new versions of the 2000 1%, 5%, and Unweighted samples, as well as the 2000 ACS: a correction was made to the MARST variable.

November 30, 2005. Posted 13 new samples on the IPUMS-USA website. All samples were previously available on the IPUMS-USA Beta site, which was shut down. The new samples combined add nearly 15 million cases to the IPUMS database. For more details on this data release, see the sample information page.

October 7, 2005. Posted new versions of the 2000 1%, 5%, and Unweighted samples: a correction was made to the VET55x64 variable. New versions of the 2000-2004 ACS samples were also posted. In all eight samples above, improvements were made to the INDNAICS and OCCSOC variables.

Corrections were made to the YRIMMIG, YRSUSA1, and YRSUSA2 variables in the ACS 2001-2004 samples.

September 16, 2005. Released the 2004 American Community Survey (ACS) sample on the IPUMS Beta site.

September 9, 2005. Released a new 2000 1% flat sample on the IPUMS Beta site. This is a national random sample drawn from the 2000 5% Census sample.

September 1, 2005. Posted a new version of the 1930 1-in-500 sample. Corrections were made to the VET1930 variable and the AGEMARR variable.

June 27, 2005. Posted new versions of the 2000 1% and 5% samples, and the 2000-2003 ACS samples. Added the following variables: RACHIST, PROBAI, PROBAPI, PROBBLK, PROBOTH, and PROBWHT . RACHIST is an historically compatible race variable which 'bridges' multiple-race responses into their most likely single race category. The other variables give detailed probabilities of each single-race response and are best used in combination with one another.

Removed RACGEN00, RACDET00, and SPANAMER from the data and documentation. The variables RACGEN00 and RACDET00 were redundant with RACE . A variable similar to SPANAMER can be created using the IPUMS variables MTONGUE, BPL, MBPL, FBPL, SPANNAME, and STATEFIP .

May 20, 2005. Released a revised version of the 1-in-100 sample of the 1900 census (see the August 21, 2003 revision note for information on the previous version of this sample). The revised dataset includes records extracted from Alaska, Hawaii, and the American Indian 1-in-5 oversamples (the complete oversample datasets are available via the IPUMS raw data download page ).

Users should also be aware that the smaller 1900 sample previously available (the 1-in-750 "Preston" sample) will no longer be available via the IPUMS extract system. Users wishing to access this data can still download the entire dataset and SPSS command file via the IPUMS raw data download page.

May 13, 2005. Released a revised version of the preliminary 1-in-500 sample of the 1930 census. Corrected a major error in the race variable. The April 25th sample gave the "White" code (detailed race code 100) to all persons who reported their race as "Mexican." The revised sample gives these persons the new "Mexican" race code (detailed race code 140). The revised sample also corrects minor coding and labelling errors in the following variables: RENT30, GQTYPED, NUMHHTAK, FARMSCHD, ENUMMO, RADIO, HOMEMKR, VET1930, IND1950, MTONGUE, FBPL, MBPL, CITY, METRO, METAREA, URBAREA, and MDSTATUS.

April 25, 2005. Released preliminary 1-in-500 sample of the 1930 census. We expect to release a final 1-in-100 sample of the 1930 census by late 2007.

February 23, 2005. Posted new versions of the 2000-2003 ACS samples: a correction was made to the STATEICP variable.

February 1, 2005. Removed the POV2000 variable from the documentation and data. POV2000 was redundant with the IPUMS POVERTY variable. Both variables use the poverty matrix developed by the Social Security Administration in 1964 (and revised twice in the years since). The Office of Management and Budget's Directive 14 prescribes this definition as the official poverty measure for federal agencies to use in their statistical work.

November 23, 2004. Released the following samples on the IPUMS Beta site: the 2003 American Community Survey (ACS) sample, the 1990 Labor Market Areas sample, the 1980 Labor Market Areas sample, and the 1980 Detailed Metro/Nonmetro sample.

October 13, 2004. Posted new versions of the 2000 1% and 5% samples, and the 2000-2002 ACS samples. The following variables were improved: OCC1950, SEI, OCCSCORE, and IND1950 . The new variables utilize the Census Bureau's recently published occupation and industry crosswalks between the 1990 and 2000 censuses.

Made a slight correction to the multipliers used to construct the POVERTY variable in the 2000-2002 samples (for more information see the 1990 poverty status definition ).

August 27, 2004. Posted a new version of the 2000 5% sample: a correction was made to the METAREA variable.

August 6, 2004. Posted a new version of the 2000 5% sample: a correction was made to the PWCITY variable.

June 28, 2004. Posted new versions of all the 2000 and ACS samples. The RACE variable has been expanded to incorporate all information from the new multiple-race variables. Details about multiple-race responses are now included, some value labels were clarified, and a few other categories were added. Also, CITYPOP was added to the 2000 1% and 5% samples, and corrections were made to MOBLHOME and METAREA .

June 17, 2004. Released American Community Survey (ACS) samples for 2000, 2001, and 2002 on the IPUMS Beta site.

May 6, 2004. Made 2000 5% sample available via the main IPUMS-USA site.

May 1, 2004. Posted new versions of all of the 2000 samples. The 2000 5% sample now includes variables for Super-PUMA of Work (PWPUMAS) and Super-PUMA of Migration (MIGPUMAS). For the 2000 1% sample, Super-PUMA information that was previously in the PWPUMA00 and MIGPUMA variables is now in the new PWPUMAS and MIGPUMAS variables. A new version of the INCRETIR variable in all three 2000 samples now includes retirement incomes of greater than $99,998 (the previous topcode). All three samples include a corrected version of the POV2000 variable.

Posted new versions of all 1990 samples that account for the greater width of INCRETIR (see above).

April 22, 2004. Posted a new version of the 2000 1% sample: a correction has been made to the MIGCITY5 variable.

March 10, 2004. Posted new versions of the 2000 1% sample and the 2000 5% sample. Both samples now include the PWCITY variable. For those living in group quarters, the variable HHWT now has the PERWT value, rather than a value of 0. In addition, corrections were made to the following variables: BPL, STEPMOM, STEPPOP, MARST, and PUMASUPR .

Posted new versions of the 1990 State, Metro, Elderly, and Unweighted samples. A problem in the MORTGAGE variable was corrected in the new samples.

January 30, 2004. Posted new versions of the 2000 1% sample, the 2000 5% sample, and the Census 2000 Supplementary Survey (C2SS). The 2000 1% and 5% samples now include variables for CITY and MIGCITY5 . Minor problems in PWPUMAS, PWPUMA00, MIGPUMA, YRIMMIG, and MORTGAGE have also been corrected in the new samples. The new C2SS sample includes corrected values for INCBUS00 (all values were 0 before).

September 9, 2003. Posted new versions of the 1990 State, Metro, Elderly, and Unweighted samples. FTOTINC and HHINCOME now contain negative values for families and households having a net loss of income. A problem in the PERWT variable was corrected in all samples. These were the only affected variables.

August 21, 2003. Penultimate 1-in-100 version of the 1900 Minnesota sample released on the IPUMS Beta site. The dataset includes 170,438 households containing 754,631 individuals. This version has a number of flaws that will be corrected for the final version of the 1900 Minnesota sample, which we anticipate releasing in the Spring of 2004. The older 1-in-200 preliminary sample is still available via the data extract system at the main IPUMS-USA site.

Users should also be aware that the smaller 1900 sample previously available (the 1-in-750 "Preston" sample) will no longer be available via the IPUMS extract system. Users wishing to access this data can still download the entire dataset and SPSS command file via the IPUMS raw data download page .

October 11, 2002. Reposted preliminary version of 1900 Minnesota sample. The previous version had incorrect values for children ever born (CHBORN ). The new dataset contains corrected values. No other variables have been changed.

July 11, 2002. Final versions of the 1860 and 1870 samples released. The final 1-in-100 1860 IPUMS sample includes 54,094 households containing 273,947 free individuals and an additional 1,343 unoccupied dwellings. The final 1-in-100 1870 IPUMS sample includes 79,023 households containing 383,308 individuals and an additional 1,447 unoccupied dwellings. Frequencies in the on-line documentation will be updated in the next few months. Both the 1860 and 1870 IPUMS samples are also available with oversamples of the black population. Sample weights for the flat and black oversamples have been adjusted to be representative of the total population.

The final 1860 and 1870 IPUMS samples now include occupation codes based on the U.S. Census Office's 1880 classification system and detailed birthplace codes for individuals born in Germany. Several other changes have also been made, including a slightly modified urban/rural definition, minor changes in birthplace and occupation coding, and small changes in personal estate and real estate values. In addition, the final samples incorporate a few data additions and subtractions from the preliminary samples. For details of these changes and a listing of the new Germany detailed birthplace codes, click here .

May 7, 2002. Released preliminary version of the 1900 Minnesota sample. This 1900 Minnesota sample is a 1-in-200 nationally representative sample of dwellings taken from the 1900 U.S. Census of Population. The final version is scheduled to be released in 2004 and will have a 1-in-100 sampling density. Frequencies for this sample will be added to the documentation in summer 2002. Currently both the 1900 Minnesota and the 1-in-760 1900 Preston sample are available. Ultimately the 1900 Minnesota sample will replace the 1900 Preston sample, although the Preston sample will be available by request.

The fundamental difference between the two 1900 samples pertains to sample design. In the 1900 Preston sample nonfamily individuals--boarders, lodgers, inmates, and military personnel--were sampled as individuals regardless of household size. In contrast, the 1900 Minnesota sample follows the general sample design used for the 1850-1880 and 1920 samples. For a discussion of issues relating to sample design see Chapter 2 of the IPUMS documentation.

July 11, 2001 -- The IPUMS extract system upgrade was successfully installed on Wednesday, July 11, 2001. No changes were made to the IPUMS data. The new extract system will process user data requests faster than the previous system and will prevent small jobs from being continually sidetracked for large data requests in the queue. Since this upgrade affects only the behind-the-scenes data extraction system, users will notice little change in the request process itself. Re-registration is not required; previous jobs will be available for revision; and new jobs will begin numbering from the user's last completed job in the old system.

March 7, 2001. Released new preliminary (penultimate) versions of the 1860 and 1870 samples. Frequencies in the documentation will not be changed until release of final versions of these datasets, scheduled for summer 2002. Two versions of the 1860 and 1870 samples are now available:

The sample weights in both the flat and black oversamples of the preliminary 1860 and 1870 PUMS have been adjusted to be representative of the total population. Although we believe that the new samples are near their final form--we expect only minor changes in the number of cases and the coding of a few variables between the current and final versions of the samples--users are advised that the current releases have a few known problems. In particular, the occupation ("OCC") variable in 1860/1870 is not coded. Users should rely on the occupation 1950 basis ("OCC1950") variable for studying occupation and labor force participation. In addition, detailed birthplace codes are not available for individuals born in Germany. Users may still use the birthplace variable (BPL), but no detail will be returned for German birthplaces.

Friday, August 18, 2000 -- The old IPUMS extract system was replaced by a new system incorporating enhanced features requested by users. One of the key features of the new system is the ability to modify and resubmit previous jobs. Data files from the two systems have been combined on a user-specific summary site. IPUMS data users previously registered in either extract system will not have to reregister to use the new extract system. Extract requests in the new system will begin numbering jobs from the highest numbered job in a user's personal extract summary.

January 22, 1999. Corrected a major error in the November 25 version of the 1860 and 1870 samples. The 1860/70 samples had an error in SURSIM, which in turn created errors in all the family interrelationship variables (IMPMOM, IMPPOP, IMPSP) and in the variables constructed from them (NCHILD, NCHLT, FAMSIZE, ELDCH, and so on). The error could also have implications for missing data allocation; we recommend discarding any previous versions of the 1860 and 1870 samples.

July 1, 1999. Released new versions of 1850, 1860, 1870, 1880, 1900, and 1910 samples, containing the following enhancements and corrections:

November 25, 1998 -- PERWT, NUMHHTAK, and GQFUNDS fixed on the 1860 and 1870 samples.

November 6, 1998 -- Revised preliminary samples of the 1860 and 1870 censuses released. Two versions of both the 1860 and 1870 PUMS are now available: (1) a flat 1-in-200 sample of all dwellings, and (2) a black oversample containing a 1-in-100 sample of dwellings containing one or more blacks and a 1-in-200 sample of all other dwellings.

The sample weights in both the flat and black oversamples of the preliminary 1860 and 1870 PUMS have been adjusted to be representative of the total population.
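
As a rough illustration of why an oversample needs adjusted weights, the sketch below assumes the simplest possible scheme, in which each sampled dwelling is weighted by the inverse of its sampling fraction. This is an assumption for illustration only; the actual IPUMS weighting procedure is not described here and may differ. The function names are hypothetical.

```python
# Minimal sketch, assuming weights are the inverse of the sampling fraction.
# The 1-in-100 and 1-in-200 rates come from the entry above; everything else
# (names, the simple inverse-fraction rule) is a hypothetical illustration.

OVERSAMPLED_RATE = 100  # dwellings with one or more blacks: 1-in-100
DEFAULT_RATE = 200      # all other dwellings: 1-in-200


def dwelling_weight(oversampled: bool) -> int:
    """Number of dwellings in the population that one sampled dwelling stands for."""
    return OVERSAMPLED_RATE if oversampled else DEFAULT_RATE


def estimated_total(sampled_flags) -> int:
    """Weighted count of dwellings; each flag says whether the dwelling was oversampled."""
    return sum(dwelling_weight(flag) for flag in sampled_flags)
```

Under this assumption, two oversampled dwellings and one other dwelling together stand for 100 + 100 + 200 dwellings. An unweighted count would overstate the share of oversampled dwellings.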

August 20, 1998 -- Revised IPUMS-98 database released.

May 20, 1998 -- OCC, OCC1950, FARM. Fixed a significant error in occupation coding in the 1860 sample (which also affected 1870, though to a much lesser degree). The missing data allocation procedure changed most persons with a blank response (no occupation) to having an occupation. This greatly overstated female occupational responses in 1860, particularly for married women. Since FARM status is inferred from occupation, and many of the allocated cases were farmers, the 1860 and 1870 samples overstated the number of farms. Both the 1860 and 1870 samples have been reconstructed to rectify this problem.

March 24, 1998 -- Made a significant, if somewhat subtle, change to the way the extraction system works. Altered the extraction system to zero out any variables that were "stacked" in the same column location as a requested variable. Previously, if you selected a variable that was not available in every sample chosen for extraction, the system would include whatever other variable was located in those columns in the raw IPUMS data files. For example, if you selected 1880 along with more modern samples and requested the variable Migration Status, 5 Years, the system would include the alphabetic data from the 1880 variable Last Name in those same extract columns. This caused considerable confusion among users.
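
The change can be pictured with a small sketch. The column positions, the availability table, the sample years listed, and the function name are all invented for illustration; MIGRATE5 is used here only as a stand-in name for Migration Status, 5 Years. This is not the extraction system's actual code.

```python
# Hypothetical fixed-width layout: variable -> (start offset, width) in a raw record.
COLUMNS = {"MIGRATE5": (4, 6)}

# Hypothetical availability table: samples in which each variable exists.
AVAILABLE = {"MIGRATE5": {"1980", "1990"}}


def extract_field(raw_record: str, sample: str, variable: str) -> str:
    """Return the requested variable, or zeros if the sample does not carry it."""
    start, width = COLUMNS[variable]
    if sample in AVAILABLE[variable]:
        return raw_record[start:start + width]
    # The variable is not available in this sample, so a different variable may
    # be "stacked" in the same columns. Emit zeros rather than copying it.
    return "0" * width
```

With this layout, a made-up 1880 record such as "1880SMITH xyz" yields "000000" for the requested variable instead of the surname text that occupies the same columns.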

Early March, 1998 -- Changed weights in "small" and "tiny" samples to be representative of total population.

Early March, 1998 -- Created a new Flat 1990 sample.

February 17, 1998 -- Changed the weights in the 1860 and 1870 files to account for oversample of blacks.

January, 1998 -- IPUMS-98 is available. For prior revisions, see Changes from IPUMS-95 to IPUMS-98.