User’s Guide

The Public Use Microdata Samples of the Slave Population are a series of six samples taken from the 1850 and 1860 Censuses of Slave Inhabitants. The questions included in the Census of Slave Inhabitants were relatively few: schedules from both years asked slaves’ age, sex, color, and disabilities including whether deaf, dumb, blind, insane or idiotic. Enumerators were allowed to record slaves’ names in the 1860 census if the slave was over 100 years of age. Slaveholders also provided information about the number of slaves who had been manumitted or escaped in the prior year. In 1860, the number of houses provided to slaves on the slaveholding was also tallied.

We have provided three samples from 1850 and three samples from 1860.

The 1850 samples include:

  • a Flat sample, where all slaveholdings had an equal probability of being included,
  • an Urban/Group Quarters sample, where slaves from urban areas and those living in large holdings were sampled under different rules (see below), and
  • a Linked sample, which links the 1850 slave data to the 1850 free population data available in IPUMS-USA.

The 1860 samples include:

  • a Flat sample,
  • a Group Quarters sample, where slaves living in large holdings were sampled under different rules, and
  • a Complete Count sample that includes the complete population of slaves in selected southern counties.

All samples other than the 1850 linked sample have the same file structure, and all except for the 1860 Complete Count sample are nationally representative (when weights are applied).

We recommend that most researchers use the Flat samples. They are adequate for almost any purpose, and the use of weights is optional. The 1850 Urban/Group Quarters sample and the 1860 Group Quarters sample are intended for more specialized purposes. In particular, they provide a much more precise sample of slaves living in large holdings. The 1850 Urban/Group Quarters sample also provides an oversample of those living in cities. The final sample, the 1860 Complete Count dataset, should be used with great caution because it is not nationally representative.

Data Format
All six datasets are rectangular, column-format files. The unit of analysis is the individual slave (or free person, in the case of the 1850 linked sample). Since the file structure is rectangular, all holding-level variables (such as slaveholder type or number of fugitives from the holding) are attached to every individual slave record within each holding.

Users making tabulations of any holding-level variables should do so with caution. To make an accurate tabulation of the number of slaveholders from any of the slave samples, for instance, one would need to select only the first slave record in each holding (using the variable SLAVENUM).
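As a rough illustration of the point above, the following sketch counts slaveholders by keeping only the first slave record in each holding. It assumes that SLAVENUM numbers slaves from 1 within each holding; HOLDING_ID and HOLDER_TYPE are placeholder names invented for this example, not documented variables, and the rows are made-up data.

```python
from collections import Counter

# Made-up rows: one record per slave, with holding-level fields repeated
# on every record. HOLDING_ID and HOLDER_TYPE are hypothetical names;
# SLAVENUM is assumed to start at 1 within each holding.
records = [
    {"HOLDING_ID": 1, "SLAVENUM": 1, "HOLDER_TYPE": "individual"},
    {"HOLDING_ID": 1, "SLAVENUM": 2, "HOLDER_TYPE": "individual"},
    {"HOLDING_ID": 1, "SLAVENUM": 3, "HOLDER_TYPE": "individual"},
    {"HOLDING_ID": 2, "SLAVENUM": 1, "HOLDER_TYPE": "individual"},
    {"HOLDING_ID": 3, "SLAVENUM": 1, "HOLDER_TYPE": "estate"},
    {"HOLDING_ID": 3, "SLAVENUM": 2, "HOLDER_TYPE": "estate"},
]


def first_records(rows):
    """Keep only the first slave record of each holding."""
    return [row for row in rows if row["SLAVENUM"] == 1]


# One record per holding, so each slaveholder is counted once.
holders = first_records(records)
holder_counts = Counter(row["HOLDER_TYPE"] for row in holders)

# Tabulating every record instead counts slaves, not slaveholders.
slave_counts = Counter(row["HOLDER_TYPE"] for row in records)

print(dict(holder_counts))
print(dict(slave_counts))
```

Tabulating all records would count each holder once per slave, which overstates the share of holders with large holdings.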

Sample weights must be applied for all analyses of the 1850 Urban/Group Quarters sample. Sample weights are optional when using the 1850 and 1860 Flat samples and the 1860 Group Quarters sample. The 1860 Complete Count sample covers only selected southern counties and therefore cannot be made representative even with the use of weights.

1850 Flat Sample
This is an approximately 1-in-20 sample of the slaves enumerated in 1850. The sampling strategy for this sample is based on the manuscript page from the slave schedule. Each page contains 84 lines in the 1850 schedules. Each line describes one or more slaves of a particular age, sex, color and disability. (Typically, each line contained information for only one slave.) We randomly generated a “window” of 4 sample points on every page (e.g., lines 24, 25, 26, and 27).

To ensure that holdings had an equal probability of being included in the sample regardless of their size, a holding was entered only if a sample point fell on the line containing the holding’s first slave. When the sample point fell on any other line, the holding was skipped. If the holding was included, all slaves were entered. Because this sample is a flat sample, the use of sample weights is not required.
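The selection rule described above can be sketched as follows. This is an illustration only, not the procedure actually used to draw the sample: it assumes the window is four consecutive lines that do not wrap across pages, and the function names are invented for this example. Four sample points on an 84-line page give a rate of 4/84 = 1/21, which is the approximately 1-in-20 rate quoted above.

```python
import random

LINES_PER_PAGE = 84  # 1850 schedules, per the text
WINDOW_SIZE = 4      # sample points per page, per the text


def draw_window(rng, lines_per_page=LINES_PER_PAGE, size=WINDOW_SIZE):
    """Return `size` consecutive 1-based line numbers on one page.

    Assumption: the window never wraps onto the next page.
    """
    start = rng.randint(1, lines_per_page - size + 1)
    return list(range(start, start + size))


def selected_holdings(first_lines, window):
    """Return the holdings (identified by first line) entered on a page.

    `first_lines` lists the line numbers on which a holding's first
    slave appears. A holding is entered only if a sample point falls
    on that first line, so large and small holdings are equally likely.
    """
    return [line for line in first_lines if line in window]


# The example window from the text: lines 24 through 27.
window = [24, 25, 26, 27]
# Hypothetical page with holdings starting on lines 1, 25, and 40.
print(selected_holdings([1, 25, 40], window))

# A randomly drawn window, seeded for repeatability.
print(draw_window(random.Random(0)))
```

In this example only the holding that begins on line 25 is entered; the holding that begins on line 1 is skipped even if it extends into the window.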

1850 Urban/Group Quarters Sample
For most of the slaveholding United States, this sample is also an approximately 1-in-20 sample. Also included, however, is an oversample of cases from the counties that contained the South’s ten largest cities: Baltimore, Maryland; New Orleans, Louisiana; St. Louis, Missouri; Louisville, Kentucky; Charleston, South Carolina; Washington, DC; Richmond, Virginia; Mobile, Alabama; Savannah, Georgia; and Norfolk, Virginia. In these ten counties, the sampling rate is 1-in-4.

Additionally, we employed a different sampling strategy for slaveholdings of 100 or more slaves. As in the 1850 Flat sample, a holding with fewer than 100 slaves was entered only if a sample point fell on the holding’s first line.

If the slaveholding had 100 or more slaves, however, a modified group-quarters sampling strategy was followed, much like that used in the IPUMS-USA samples. When the sampling window fell within a slaveholding with 100 or more slaves, data were recorded for the slaves within the sampling window. All slaves in the sampling window were included, even if no slave in the window was on the first line of a holding. When the sampling window included the first slave in a holding of 100 or more, only those in the window were included (i.e., we did not include ANY intact holdings of 100 or more slaves in the group quarters samples).
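The two rules can be contrasted in a short sketch. It is a simplified model under stated assumptions: lines are numbered consecutively across pages (a holding of 100 or more slaves necessarily spans more than one page), each line holds exactly one slave, and the function name is invented for this example.

```python
GQ_THRESHOLD = 100  # holdings of this size or larger use the group-quarters rule


def lines_entered(first_line, size, window):
    """Return the schedule lines entered for one holding.

    Assumptions: one slave per line, and lines numbered consecutively
    across pages so that a holding occupies first_line .. first_line+size-1.
    """
    holding_lines = range(first_line, first_line + size)
    if size >= GQ_THRESHOLD:
        # Group-quarters rule: enter only the slaves inside the window,
        # whether or not the window covers the holding's first line.
        return [line for line in window if line in holding_lines]
    if first_line in window:
        # Flat rule: a sample point on the first line enters the whole holding.
        return list(holding_lines)
    return []


window = [24, 25, 26, 27]
print(lines_entered(25, 3, window))    # small holding starting inside the window
print(lines_entered(20, 10, window))   # small holding starting before the window
print(lines_entered(1, 150, window))   # large holding spanning the window
print(lines_entered(26, 120, window))  # large holding starting inside the window
```

The last case shows why no large holding appears intact: even when the window covers the first slave, only the lines inside the window are entered.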

The chief advantage of this approach is that the number of large slaveholdings that are represented in the sample is far higher than would be the case otherwise. This increased representation ultimately increases precision of any analysis of slaves living in large holdings. Because the 1850 Urban/Group Quarters Sample contains two distinct sampling rates, users must apply sample weights to obtain representative statistics of the slave population (see the WEIGHT variable for details).
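To see why weighting matters here, consider a minimal sketch of a weighted proportion. The weight values 20 and 4 are assumed for illustration as the inverses of the two sampling rates (1-in-20 and 1-in-4); the actual values should be taken from the WEIGHT variable. The URBAN field and the rows themselves are hypothetical.

```python
def weighted_share(rows, predicate, weight_key="WEIGHT"):
    """Share of the weighted population for which `predicate` is true."""
    total = sum(row[weight_key] for row in rows)
    matching = sum(row[weight_key] for row in rows if predicate(row))
    return matching / total


# Made-up sample: 2 records drawn at 1-in-20 and 5 drawn at 1-in-4.
# URBAN is a placeholder field, and the weights are assumed values.
rows = (
    [{"URBAN": False, "WEIGHT": 20} for _ in range(2)]
    + [{"URBAN": True, "WEIGHT": 4} for _ in range(5)]
)

unweighted = sum(1 for row in rows if row["URBAN"]) / len(rows)
weighted = weighted_share(rows, lambda row: row["URBAN"])

print(round(unweighted, 3))
print(round(weighted, 3))
```

In this toy data the oversampled urban records make up 5 of 7 sample records but only 20 of 60 weighted persons, so an unweighted tabulation would overstate the urban share.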

1850 Linked Slave/Free Population Sample
This dataset is a 1-in-100 sample of the slave and free population in 1850. The dataset contains all persons present in the original 1850 IPUMS sample, regardless of whether or not they were slaveholders. For all free persons, there are variables identifying serial and person number in the original 1850 IPUMS data as well as slaveholder status. For all slave persons, there are variables indicating serial number of the slaveholding household in which they lived, location of the slaveholding on the slave schedules, and age, race, and sex of slave. Using the SERIAL and PERNUM variables, this data can easily be integrated with the 1850 Free Population data available at the IPUMS-USA website.
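A minimal sketch of that integration, for free persons, might look like the following. Only SERIAL and PERNUM come from the description above; the SLAVEHOLDER and AGE field names and all values are placeholders, and the rows stand in for records read from the two files.

```python
def link_records(linked_rows, ipums_rows):
    """Join linked-sample free persons to IPUMS-USA records.

    Rows are matched on the (SERIAL, PERNUM) pair; linked-sample rows
    without a match in `ipums_rows` are dropped.
    """
    index = {(row["SERIAL"], row["PERNUM"]): row for row in ipums_rows}
    merged = []
    for row in linked_rows:
        match = index.get((row["SERIAL"], row["PERNUM"]))
        if match is not None:
            # Combine the fields from both sources into one record.
            merged.append({**match, **row})
    return merged


# Hypothetical free-person rows from the linked sample.
linked_free = [
    {"SERIAL": 10, "PERNUM": 1, "SLAVEHOLDER": True},
    {"SERIAL": 10, "PERNUM": 2, "SLAVEHOLDER": False},
    {"SERIAL": 11, "PERNUM": 1, "SLAVEHOLDER": False},
]

# Hypothetical rows from an IPUMS-USA 1850 extract.
ipums_1850 = [
    {"SERIAL": 10, "PERNUM": 1, "AGE": 45},
    {"SERIAL": 10, "PERNUM": 2, "AGE": 40},
]

merged = link_records(linked_free, ipums_1850)
for row in merged:
    print(row)
```

Slave records carry the serial number of the slaveholding household rather than a person number of their own, so attaching household characteristics to them would use SERIAL alone.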

1860 Flat Sample
This is an approximately 1-in-20 sample of the slaves enumerated in 1860. The sampling strategy for this sample is based on the manuscript page from the slave schedule. Each page contains 80 lines in the 1860 schedules. Each line describes one or more slaves of a particular age, sex, color and disability. (Typically, each line contained information for only one slave.) As with the other samples, we randomly generated a “window” of 4 sample points on every page. A holding was included only if a sample point fell on the holding’s first line. When the sample point fell on any other line, the holding was skipped. If the holding was included, all slaves were entered. Because this sample is a flat sample, the use of sample weights is not required.

1860 Group Quarters Sample
This is also a 1-in-20 sample, but unlike the 1860 Flat Sample, group quarters sampling rules are used here for slaveholdings that contained at least 100 slaves. When the sampling window fell within a slaveholding with 100 or more slaves, data were recorded for the slaves within the sampling window. All slaves in the sampling window were included, even if no slave in the window was on the first line of a holding. When the sampling window included the first slave in a holding of 100 or more, only those in the window were included (i.e., we did not include ANY intact holdings of 100 or more slaves in the group quarters samples). Like the flat samples, this sample does not require the use of weights.

1860 Complete Count Sample
This sample contains all data from a random series of microfilm reels from across the South. Within the selected reels, data for all slaves were recorded and appear in this sample. The selected reels include one reel from each of 12 states and the District of Columbia. As this sample is not representative, it will be primarily useful for scholars analyzing aspects of slavery in the places that appear in this sample.

 