Occupational Coding in the 1900 Public Use Microdata Sample 1
An occupational title was reported for any person ten years of age or older who was engaged in "gainful labor." The enumerator reported the occupation, trade, or profession on which the person depended for support, or was engaged in the larger part of the time. The distinction between occupation and industry, more commonly in use now, was not made in 1900. Instead the enumerators were told to "Indicate in every case the kind of work done or character of service rendered. Do not state merely the article made or worked upon, or the place where the work is done." (1900 Enumerator Instructions, paragraph 156)
Children in school who were not also working were reported as "At school" even if they were under 10. Persons living on income from land, stocks, etc., were to be reported as "Capitalist." For those with regular occupations, the list of distinctions and exceptions that the enumerator was supposed to make was imposing (see Volume III: 1900 Enumerator Instructions, paragraphs 153-223). It is not surprising to find that enumerators often failed to make the proper distinctions, or reported information that they were not supposed to report.
A common error was to report the industry only, e.g., "Groceries," "Hardware," or to report a very general occupation without giving the industry, e.g., "Laborer," "Machinist." For this reason, the "not specified" and "uncodable" categories are relatively large. Some enumerators reported occupations that were not supposed to be reported, e.g., "Housewife," "Retired," and sometimes occupations were reported which were inappropriate for some individuals, e.g., inmates of prisons, children under 10.
The occupational titles presented our most difficult coding task. Our overall strategy was to record the occupational title just the way it was written, and then assign occupation codes later on using an interactive computer system. The data entry staff were instructed to take down the occupation just the way the enumerator wrote it, including any spelling errors. Illegible portions were indicated with parentheses. No abbreviations were used, unless the occupational title would not fit into 25 characters.
After all data had been collected, the non-blank occupational titles were selected and sorted into alphabetic sequence. A file of titles was made up, with each unique occupational title making up one record. There were 8,128 unique occupational titles in all, but some of these were non-occupations or variant spellings of the same title.
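The construction of the title file can be illustrated with a short Python sketch. This is not the project's original program; the function name `build_title_file` and the sample titles are hypothetical, and the sketch assumes titles are compared exactly as transcribed.

```python
def build_title_file(titles):
    """Return the unique non-blank titles in alphabetical order.

    Titles are compared exactly as transcribed, so variant spellings
    remain separate records, as in the description above.
    """
    non_blank = [t for t in titles if t.strip()]
    return sorted(set(non_blank))


# Hypothetical transcribed titles, including blanks and a variant spelling.
sample = ["Farmer", "", "Laborer", "Farmer", "Labourer", "   "]
title_file = build_title_file(sample)
```

Because no normalization is applied, "Laborer" and "Labourer" stay distinct, which is why the count of unique titles overstates the number of distinct occupations.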
The list of titles was reviewed by the staff, and alphanumeric codes called detail codes were assigned to each title. The detail codes separated the occupational titles into about 950 categories, the finest meaningful breakdown that we could manage. When all occupations had been assigned their detail codes, they were aggregated in different ways to produce the final coding structures. We combined the detail codes into a 1900 coding scheme, to replicate as closely as possible the Bureau of Census tables, and a 1950 coding scheme, with a more standard occupational breakdown, that makes the distinction between occupation and industry.
The rules that the Bureau of Census used in producing its tables were not always explicit, so various approaches to the 1900 scheme were tried, with the objective of matching as closely as possible the tables given in census publications. The 1950 scheme was more straightforward, but in many cases we were unable to make the kinds of distinctions necessary, so some categories had to be collapsed or eliminated. There were also some changes in the occupational structure between 1900 and 1950, so categories had to be added to the 1950 structure or eliminated from it to make it more appropriate to 1900 data. When the most acceptable coding scheme had been decided upon, the occupation codes were assigned to each individual record that carried the corresponding title.
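The two-stage assignment (title to detail code, detail code to each final scheme) might be pictured as a pair of table lookups. The sketch below is only an illustration under stated assumptions: the detail codes (`DET-A` and so on) and the 1950 codes (`P-1` and so on) are invented placeholders, and only the 1900 codes 006, 016 and 946 come from this chapter.

```python
# Hypothetical lookup tables for illustration only. The detail codes and the
# 1950 codes are invented placeholders; 006, 016 and 946 are 1900 codes
# mentioned in this chapter.
TITLE_TO_DETAIL = {
    "Farmer": "DET-A",
    "Farm laborer": "DET-B",
    "Groceries": "DET-C",
}
DETAIL_TO_1900 = {"DET-A": "016", "DET-B": "006", "DET-C": "946"}
DETAIL_TO_1950 = {"DET-A": "P-1", "DET-B": "P-2", "DET-C": "P-3"}


def assign_codes(title):
    """Look up the detail code for a title, then both aggregate codes."""
    detail = TITLE_TO_DETAIL[title]
    return {
        "detail": detail,
        "occ1900": DETAIL_TO_1900[detail],
        "occ1950": DETAIL_TO_1950[detail],
    }


# Every record carrying a given title receives the same codes.
records = [{"title": "Farmer"}, {"title": "Groceries"}]
for record in records:
    record.update(assign_codes(record["title"]))
```

Keeping the detail code as an intermediate step means a change to either final scheme only requires editing one aggregation table, not recoding the titles.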
But occupation codes could not be assigned solely on the basis of title. There were many cases in which titles were inappropriately used, conflicted with other information available for the individual, or needed to be refined by reference to other information. A number of correction runs were made to identify these kinds of problems and refine the occupational coding system to more closely approximate the true situation. We used the information contained in the individual case to distinguish farm labor (family) from ordinary farm labor, identify child laborers, identify conflicts between relation to head code and occupation, etc. Once this had been done, final frequencies were computed for the 1900 occupation coding scheme and compared to expected values based on the tabulations reported by census. Most of the occupational categories were reasonably close to the expected value, but some showed large differences. These reflected the different coding rules and assumptions used by our project and the Bureau of Census. The codes that were particularly prone to differences are discussed in detail below.
006-007 & 016-017-Farming codes. These presented a number of problems, since Census did quite a lot of recoding of these categories, and the rules that they used were not clear. Enumerators did not always distinguish correctly between farmers and farm laborers, so Census changed a number of respondents from one category to another. Members of a farmer's family who worked on the farm were recoded to the family categories (007 & 017).
We used a computer program to recode these cases. Farmers were recoded to farm labor when the head of household was a farm laborer. Both farmers and farm laborers were recoded to the family category when they were wives, sons or daughters of a head who gave a farm occupation, and they also gave a farm occupation. Other relatives were not recoded, only members of the immediate family.
This procedure gave results which closely approximated the Census results, but there are enough differences that we suspect that Census used slightly different procedures. The farm labor category (006) is larger than expected by about 100 cases, while the farm labor family category (007) is about 150 short. Census apparently recoded more individuals into the family category than we did, but we were unable to arrive at an algorithm that would give closer results, so we settled on the immediate family approach.
The reverse problem was encountered with the farmer category. Farmers (016) were understated by about 125 cases, and the farmers' family category (017) was overstated by about the same amount. We used the same algorithm for both farmers and farm laborers, which Census apparently did not.
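The farm recoding rules described above can be sketched as a single function. This is a reconstruction from the prose, not the original program. The relation labels are hypothetical, and two points are assumptions the text does not settle: the farmer-to-farm-labor step is applied to any non-head member before the family step, and the family category is chosen to match the member's own (possibly recoded) category.

```python
FARM_LABOR, FARM_LABOR_FAMILY = "006", "007"
FARMER, FARMER_FAMILY = "016", "017"
FARM_CODES = {FARM_LABOR, FARMER}
IMMEDIATE_FAMILY = {"wife", "son", "daughter"}  # hypothetical relation labels


def recode_farm(occ, relation, head_occ):
    """Return the recoded 1900 occupation code for one household member.

    occ      -- the member's code as assigned from the title alone
    relation -- relation to head ("head", "wife", "son", ...)
    head_occ -- the head's code as assigned from the title alone
    """
    if relation == "head" or occ not in FARM_CODES:
        return occ
    # Step 1 (order is an assumption): a farmer in a household headed by
    # a farm laborer becomes a farm laborer.
    if occ == FARMER and head_occ == FARM_LABOR:
        occ = FARM_LABOR
    # Step 2: immediate family of a head with a farm occupation moves to
    # the matching family category; other relatives are left alone.
    if relation in IMMEDIATE_FAMILY and head_occ in FARM_CODES:
        return FARMER_FAMILY if occ == FARMER else FARM_LABOR_FAMILY
    return occ
```

For example, under these assumptions `recode_farm("016", "son", "016")` yields the farmer family code 017, while a nephew with the same title keeps 016 because he is not immediate family.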
126 & 127-Nursing Occupations. Census attempted to distinguish between trained medical nurses and household servants or wet nurses, but the enumerators frustrated this attempt, and most nurses wound up in the unspecified category. Our unspecified category was much larger than expected, but this is due, we think, to the chance selection of a few large hospital staffs in our sample (see Volume I, Chapter 4: Occupation Codes and Income Scores). These cases, made up largely of nurses, inflated our nursing category beyond the expected size.
241 & 254-Merchants. Merchants were broken down by industry in the 1900 system, with a not specified category containing the remainder. We showed fewer respondents in each specific category of merchant, with the exception of produce and provisions. Our not specified category is also lower, and this results in our merchant category being smaller than expected by about 160 individuals. We suspect that Census assigned individuals that gave only industry to the merchant category. Occupations such as "Grocery" or "Hardware" were not specific enough for us to code, but may have been put into the merchant category by Census. Our code for these cases was 946 (Uncodable employment).
307-Wholesale Merchants and Dealers. We show substantially more individuals than expected in this category, for reasons which are not clear. We included feed and grain dealers, and livestock dealers, as well as wholesale merchants in this category. Some of these individuals may have been coded as ordinary merchants (see paragraph above), although most of our titles used the word "wholesale."
311-Salesmen and Saleswomen. Individuals who specifically mentioned sales work were put into this category. Our total is short of the expected by about 175 individuals. Census may have recoded some clerical employees into sales work (see Volume III: 1900 Enumerator Instructions, paragraph 192), and sales work may also have been assumed when individuals gave only an industry.
506-508-Mining. The mining categories were separated by whether the miner worked on coal or gold and silver. We coded an individual to a specific category if their occupational title specifically mentioned the ore, otherwise they were placed in the not specified category. Our specific categories were smaller than expected, while the "not specified" category was larger. Census coders apparently made use of geographical information to supply the ore when it was not mentioned. We did not have the time and resources to attempt this.
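Our title-only rule for the mining categories can be sketched as a keyword check. The function name and the category labels below are hypothetical; the labels are used in place of the numeric codes because this section does not say which of 506-508 corresponds to which category.

```python
import re


def mining_category(title):
    """Classify a mining title by the ore it explicitly mentions.

    No geographic information is consulted, so a bare "Miner" falls
    into the not specified category.
    """
    words = set(re.findall(r"[a-z]+", title.lower()))
    if "coal" in words:
        return "coal"
    if "gold" in words or "silver" in words:
        return "gold and silver"
    return "not specified"
```

A rule of this kind explains the pattern noted above: every title that omits the ore lands in the not specified category, whereas Census coders apparently used location to fill it in.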
732, 856, 943 & 946-Unspecified Occupations. Our treatment of unspecified employment was different from the Census approach. We were generally more conservative and did not assign individuals to a category unless the occupational title definitely indicated that they belonged there. Our unspecified categories were larger than those in the Census reports.
We used two unspecified categories for employment, code 943 and code 946. The first, called "Unspecified," was used generally for occupations which were intelligible but had no specific category and for occupations which were too vague to categorize properly. Examples of the first are "Laborer, peanut factory" and "Fur skin dresser," while the second included such occupations as "Millwright" and "Machine operator."
Code 946, on the other hand, was used for occupations that reported industry only, and for illegible or unintelligible occupations. Examples of these are "Groceries," "Drugs," or "Hardware" for industry-only titles, and "Stale Picker," "Tail Plainer," or "Tipslaff in Count" for uncodable ones. This category was larger than 943 above, and the combination of the two categories contained about 800 individuals, or about 2 per cent of those reporting occupations.
Besides the categories used by the Census Bureau, we devised some new categories to handle some of the problems that occurred:
947-Housekeepers and Stewards, Domestic Residents. The enumerators were instructed not to record occupations for women engaged in keeping house for their families (see Volume III: 1900 Enumerator Instructions, paragraph 185). But it was clear that in some cases they recorded such domestic activity. Occupational titles such as "Housewife," "At Home," and "Housework" were often found, and were coded 962, Housewife. In other cases, it was clear that occupations such as "Housekeeper" indicated actual employment in domestic service, since the relation to head code indicated a domestic servant or employee, or the individual lived as a boarder or lodger employed by someone else. These cases were coded 156, Housekeepers and Stewards.
There was a large residual category, however, for which it was impossible to determine whether individuals were gainfully employed as housekeepers, or whether they were performing housekeeping duties for their own families on an unpaid basis. Individuals in this category were living at home with a family and listed their occupations as something like "Keeps house" or "Housekeeping." These individuals were coded into the 947 category, indicating an ambiguous occupational status.
951-Officials, Societies, and Institutions. Non-governmental officials were coded into this category.
952-Managers, Officials, and Proprietors, (Not Elsewhere Classified). General Managers, Secretaries and Superintendents were coded in this category, if their industry was not indicated.
953-Government Officials, Level Unknown. This was a group of minor officials, tax assessors, road commissioners, etc.
954-Inspectors. When the type was unknown, inspectors were coded here.
955-Foremen. Foremen with no industry given or with an unclassifiable industry.
956-Box Makers (Type Unknown).
957-Vehicle Makers, Repairers (Type Unknown). Boat and ship builders, car and engine repairers with the type unspecified.
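The three-way treatment of housekeeping titles described under code 947 above (962 for housewives, 156 for employed housekeepers, 947 for ambiguous cases) can be sketched as follows. The title lists and relation labels are hypothetical, and the sketch simplifies one point: any boarder or lodger is treated as employed, whereas the actual rule required that they be employed by someone else.

```python
HOUSEWIFE_TITLES = {"housewife", "at home", "housework"}
HOUSEKEEPING_TITLES = {"housekeeper", "keeps house", "housekeeping"}
# Hypothetical relation-to-head labels taken to indicate paid employment.
EMPLOYED_RELATIONS = {"servant", "employee", "boarder", "lodger"}


def code_domestic_title(title, relation):
    """Return 962, 156 or 947 for a domestic title, or None otherwise."""
    t = title.strip().lower()
    if t in HOUSEWIFE_TITLES:
        return "962"  # Housewife: should not have been recorded at all
    if t in HOUSEKEEPING_TITLES:
        if relation in EMPLOYED_RELATIONS:
            return "156"  # Housekeepers and Stewards: actual employment
        return "947"  # living with family: status is ambiguous
    return None  # not a domestic title; handled by the general scheme
```

The residual 947 code is the fall-through case: it is assigned only when the title suggests housekeeping and nothing in the record indicates paid service.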
The following occupational categories were used for non-occupational responses recorded under the occupational title.
961-At school. This included all types of schools: theological schools, law schools, Indian schools, etc.
962-Housewife. These domestic occupations should not have been recorded by enumerators, but when they were, the individual was coded into this category (see also 947 above).
966-Landlord. This was not recognized as an occupation in the 1900 tabulations, but it was often recorded by enumerators.
967-Inmate. A number of inmates in group quarters situations reported gainful occupations. For our coding purposes, group quarters primaries who were considered to be inmates of involuntary institutions were automatically given the occupation code 967 for inmate. Any occupation reported for them was ignored. Group quarters codes for which this was done were 00, 01, 02, 03, 04, 06, 14, and 19.
969-Other Non-occupation. Includes non-occupational titles such as "Infant," "Pensioner," "Pauper," and "Tourist," as well as underworld titles such as "Prostitute" and "Gambler."
976-Child labor-farm. Children under 10 reporting farm occupations were given this code.
977-Child labor-domestic. Children under 10 in domestic service.
978-Child labor-other. Children under 10 with other occupations such as factory work.
999-No occupation reported (blank).
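Two of the codes above, 967 (Inmate) and 976-978 (child labor), depend on information in the record other than the title. A minimal sketch of these overrides follows. The function names, the `is_gq_primary` flag and the `group` argument are hypothetical, and the two functions are kept separate because the text does not say which took precedence.

```python
# Group quarters codes treated as involuntary institutions (from the text).
INMATE_GQ_CODES = {"00", "01", "02", "03", "04", "06", "14", "19"}
CHILD_LABOR_CODES = {"farm": "976", "domestic": "977", "other": "978"}


def inmate_override(occ, gq_code, is_gq_primary):
    """Code 967 replaces any reported occupation for inmates."""
    if is_gq_primary and gq_code in INMATE_GQ_CODES:
        return "967"
    return occ


def child_labor_override(occ, age, group):
    """Children under 10 with a gainful occupation get a child-labor code.

    group is a hypothetical broad grouping of the title-based code:
    "farm", "domestic", "other", or None for non-occupational responses.
    """
    if age < 10 and group in CHILD_LABOR_CODES:
        return CHILD_LABOR_CODES[group]
    return occ
```

Passing `group=None` for responses such as "At school" leaves the original code in place, so a child under 10 coded 961 is not turned into a child laborer.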
- Stephen N. Graham, "Chapter 8: Occupation Coding," 1900 Public Use Sample: User's Handbook (Draft Version), Seattle: Center for Studies in Demography and Ecology, University of Washington, 1980, pp. 72-77.