The Data-PASS Project: Preserving the Past of Survey Research
Partnership of the Library of Congress and six archives
The Data-PASS (Data Preservation Alliance for the Social Sciences) project is a partnership of the Library of Congress and six archives.
Its goal is to locate and archive all digital social science data on American society ever produced. We define "social science data" broadly—not only data gathered by social scientists, but data on any topic that might be of interest to social scientists. While the major focus will be on surveys, we expect to acquire other kinds of data as well.
The Roper Center has primary responsibility for three groups of data: United States Information Agency (USIA) surveys conducted around the world between 1949 and 1999, surveys from the National Opinion Research Center (NORC), and all other surveys for which the Roper Center has reports but lacks the original data.
The USIA conducted several thousand surveys in scores of countries around the world. Many were carried out in Western Europe, but the agency also gathered survey data in areas where such data were otherwise scarce, including a number of nations in Southeast Asia and Africa. These surveys focused on views of the United States and its foreign policies but touched on many other topics as well. The USIA often asked identical questions in different countries, providing a valuable resource for comparative analysis.
The Roper Center already holds several hundred USIA surveys, but many more, probably over a thousand, are preserved in the National Archives. While the surveys at the National Archives are secure, they lack effective finding aids and have not been converted to modern formats, which makes the data difficult for researchers to use. The Roper Center plans to catalog and process this important collection so that it is readily available to the public.
The National Opinion Research Center began in the early 1940s, making it one of the oldest survey organizations, and it has conducted many important studies over the ensuing sixty years. NORC has always donated data to the Roper Center, but many of its surveys did not find their way there or into any other archives. Sometimes the data were given to the survey sponsor, and sometimes they were simply set aside and forgotten. Fortunately, some data have been preserved in boxes and filing cabinets stored in a warehouse near NORC's Chicago headquarters, and we are working with NORC to locate and preserve them. Several of these datasets are already well known to social scientists, such as the early studies of occupational prestige from the 1940s and 1950s. Others, although less familiar, are of interest because they provide information on important topics. For example, during the early 1960s, NORC did a number of studies of happiness and mental well-being. These questions were not prominent in social science research at the time, but they have become more popular in recent years, so that the NORC data provide an important baseline for historical comparison. There are also many other NORC studies whose location is unknown—they may be in the warehouse, or at other locations, or lost entirely. We are hoping that examination of the codebooks and correspondence related to these studies may shed light on their location.
Data-PASS Punched Card Data Recovery Poster from the 2011 IASSIST conference.
The third body of data is the most broadly defined, since it includes data collected by many different organizations, some of which no longer exist. Until the mid-1960s, there were relatively few survey firms. Most of those that existed regularly sent their data to the Roper Center, but even during this time some surveys were missed. In the days of punched cards, for example, if the survey sponsor wanted the original data, there was not necessarily a backup copy to send to the Roper Center. Beginning in the 1970s, the number of firms increased dramatically, and many did not regularly archive their data. As a result, a large fraction of the surveys done in the last few decades remains outside of any archives.
We have used the Roper Center's iPOLL database, which includes records for about five hundred thousand survey questions, to identify surveys of particular interest. The list includes over two hundred surveys and continues to grow. Unlike the USIA and NORC surveys, these data are highly dispersed—there is no central source to which we can go. Our primary approach will be to contact the survey organizations and survey sponsors. When possible, we will also contact individual researchers who obtained the data for their own work and may still have copies.
We know that we will be able to obtain and archive much of the USIA and NORC data. Whether we will be able to obtain much from the third group is uncertain. It is possible that most of these data have already been discarded and that the rest are too scattered to be recovered. If we do not try to find them, however, they will certainly be lost, so the attempt is worth making.
The pioneering figures of survey research were aware of the value their data would have to future generations. The Roper Center was founded in 1947, only about ten years after modern surveys began, and a great deal of survey data has been preserved there and in other archives. It has been difficult, however, to keep up with the growing volume of data produced by the expansion of academic social science and the survey industry. The Data-PASS project is the first systematic effort to go back and identify data that have been missed. We hope that in addition to recovering older data, the project will establish standards and procedures that will help archives effectively manage data produced in the future.