Cornell University CISER

CISER Data Archive

Internet Data Sources for Social Scientists

Online Reference Tools

BLS Handbook of Methods
     Describes the methodology, evolution, and coverage of the major economic and labor force surveys produced by the Bureau of Labor Statistics. An important reference for users of the CPS, PPI and CPI statistics, Consumer Expenditure Surveys, and many others.

Carnegie Classification Codes
     A system for representing accredited, degree-granting colleges in the U.S. based on size and character of enrollment, scope of degrees offered, research focus, and other characteristics. Created by the Carnegie Foundation for the Advancement of Teaching, the scheme was first developed in 1970 and has recently been overhauled.

Cartographic Boundary Files
     For use with Census data products. Includes many specialized geographies; for example, traffic zones, school districts, state legislative districts, voting districts.

Dictionary of Occupational Titles
     A classification system commonly used in datasets and compiled at the Department of Labor. The most recent edition was published in 1991. The DOT has been replaced by O*Net, which defines and describes occupations.

Eagle Geocode
     Provides geocodes (Census geography, latitude/longitude) for standard addresses. A few addresses can be searched for free for evaluation purposes; large-scale use is available for a fee. Batch processing options are useful for coding survey respondents.

FIPS (ANSI) codes
     These standardized codes identify U.S. geographic areas. States are assigned 2-digit codes, counties have 3-digit codes, and there are also codes for metropolitan areas and places. FIPS codes are fairly ubiquitous in data files and useful for joining geographic records from different files. FIPS stands for Federal Information Processing Standards, and codes are assigned by the National Institute of Standards and Technology (NIST).

      bullet FIPS (ANSI) codes

      bullet geographic terms
          Terms and concepts: Federal Information Processing Series (FIPS) and American National Standards Institute (ANSI).

      bullet for states and counties

      bullet for metropolitan areas

      bullet for places

      bullet Geographic Names Information System (GNIS)
          Codes for places (including unincorporated places), primary county divisions, and other entities. This link connects to the online search mechanism to display or download results. The entire file is also available for download. (The GNIS features database incorporates and supersedes the FIPS55 files.) Maintained by the U.S. Board on Geographic Names, which was created to maintain uniform geographic name usage.

      bullet Cure for the Common Codes
          One page for each state, listing FIPS codes for every geography one could imagine, plus school districts (for which there are no FIPS codes). Handy beyond your wildest dreams.
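
     As a rough illustration of joining geographic records on FIPS codes, the sketch below combines a 2-digit state code and a 3-digit county code into a single 5-character key and uses it to match records from two files. The function names, field names, and measure values are hypothetical; only the two FIPS codes themselves are real. The main point is that codes stored as numbers lose their leading zeros and need to be padded before matching.

```python
def county_key(state_fips, county_fips):
    """Combine a 2-digit state code and a 3-digit county code into one
    5-character string, restoring leading zeros lost to numeric storage."""
    return f"{int(state_fips):02d}{int(county_fips):03d}"


def join_on_fips(left, right):
    """Inner-join two lists of records on their 5-character county key.

    `left` rows carry separate 'state' and 'county' fields (hypothetical
    layout); `right` rows carry a ready-made 'fips' string.
    """
    lookup = {row["fips"]: row for row in right}
    joined = []
    for row in left:
        key = county_key(row["state"], row["county"])
        if key in lookup:
            joined.append({**row, **lookup[key], "fips": key})
    return joined


# Illustrative values only: the FIPS codes are real, the measures are made up.
file_a = [
    {"state": 1, "county": 1, "measure_a": 10},     # Autauga County, Alabama
    {"state": 36, "county": 109, "measure_a": 20},  # Tompkins County, New York
]
file_b = [
    {"fips": "01001", "measure_b": 0.5},
    {"fips": "36109", "measure_b": 0.7},
]

merged = join_on_fips(file_a, file_b)
for row in merged:
    print(row["fips"], row["measure_a"], row["measure_b"])
```

     Keeping the key as a string rather than an integer avoids losing the leading zero again when the merged file is written out.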

Geographic changes to counties
     Boundary changes to counties or their equivalents deemed "substantial" by Census.

Geography Tools
     A wealth of codes for use with historical Census products, displayed in tabular format or ready to download as ASCII files. Includes labor market areas and commuting zones, state economic areas, PUMAs of migration, county composition of metro areas back to the mid-1800s, and much more. Collected by the nice people at IPUMS.

Glossary of Decennial Census Terms and Acronyms
     Maintained by the Census Bureau, defines every imaginable term used within the Census context, both current and superseded.

Glossary of Social Science Computing Terms
     Although aimed at those who staff data archives, this list is handy for anyone working with varying formats of research data. Compiled by Jim Jacobs.

     Generates equivalency files for geographic areas used in the 1980 and 1990 Censuses.

     Creates files or reports of equivalency codes for Census 2000 geographies and others (state legislative and Congressional districts, school districts, voter tabulation districts, and more). Cannot create correlations with previous Census geographies.

Master Area Geographic Glossary of Terms
     Definitions for geographic entities used by Census products and many corresponding SAS format label programs. Maintained by the Missouri Census Data Center and OSEDA.

Metropolitan areas and codes

      bullet Lists of Metropolitan and Micropolitan Statistical Areas
          Delineations based on Census 2000 and Census 2010 data.

National Crosswalk Service Center
     Delivers many occupational and educational crosswalk files, including DOT-to-1980 Census occupations, 1970-to-1980 Census, and OES-to-CIP classifications.

North American Industry Classification System (NAICS)

      bullet U.S. Census

      bullet Statistics Canada, NAICS 2012

Public Use Microdata Areas (PUMAs)
     A geographic concept used with Census microdata files. The composition of PUMAs varies according to the microdata sample. For the 2000 Census, PUMAs for the 5% sample must contain at least 100,000 people. PUMAs for the 1% sample (also called Super PUMAs) have a population threshold of 400,000 people. PUMAs are not compatible across decennial Censuses.

      bullet PUMA Equivalency Files
          Lists PUMAs and their component parts for 1980-2000 Censuses.

      bullet State maps of PUMAs
          Handy maps of 1970-2000 PUMAs. PUMA maps for 2000 are also available from the Census Bureau.

Rural Urban Continuum Codes
     Classifies counties or county equivalents by degree of urbanization and proximity to urban areas. Also known as Beale codes.

      bullet 1983 and 1993
          This link downloads an Excel file.

      bullet 1993 and 2003
          Look up individual counties or download the entire file in Excel format.

SIC (Standard Industrial Classification) Codes

      bullet 1987 and 1972 versions
          Page also links to a 1972/1987 SIC concordance.

      bullet 1987 version
          Search and browse by keyword or code.

Standard Industry Classifications
     Links to classification systems such as NAICS 1997 and 2002, SIC, ISIC, and their revisions.

Using, Documenting, and Citing Data

Bibliographic Citations for Data Files
     Dedicated to citing numeric files, with liberal use of Canadian datasets as examples.

Citing Electronic Data Files
     Uses examples based on ICPSR studies.

Guide to Social Science Data Preparation and Archiving
     Although tailored to the needs of those preparing datasets for archiving at ICPSR, this document is handy for any researcher who collects, manages, and shares data. It takes a "life cycle" approach to archiving, in that the very first steps in the process begin well before data collection. The PDF version of the document links from this page.

How to Cite Electronic Media
     Don't forget to accurately and appropriately cite data you use! This page has examples for datafiles, web and FTP sites, e-mails, e-lists, and more.

How to Use a Codebook
     Detailed instructions for translating codebook and record information into SAS, SPSS, and Stata programs to read and prepare data for analysis.
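
     The linked guide covers SAS, SPSS, and Stata; as a loose parallel, the sketch below shows the same idea in Python for a fixed-width file. The column layout, variable names, and value labels are all hypothetical stand-ins for what a real codebook would supply, and columns are assumed to be 1-based and inclusive, as codebooks usually print them.

```python
# Hypothetical codebook layout: (variable name, start column, end column),
# with 1-based inclusive columns as codebooks usually print them.
LAYOUT = [
    ("state_fips", 1, 2),
    ("age", 3, 5),
    ("sex", 6, 6),
]

# Hypothetical value labels for one coded variable.
SEX_LABELS = {"1": "Male", "2": "Female"}


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width line into a dict of raw string fields."""
    return {name: line[start - 1:end].strip() for name, start, end in layout}


# One made-up record: state "36", age " 42", sex "1".
record = parse_record("36 421")
record["age"] = int(record["age"])  # numeric variable
record["sex_label"] = SEX_LABELS.get(record["sex"], "Unknown")
print(record)
```

     Geographic codes such as state_fips are deliberately left as strings so that leading zeros survive.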

Introduction to Data Handling
     Introduction to data structures (rectangular, hierarchical, and others), how to use a codebook, and merging files. The nuts and bolts of preparing data for analysis. Not updated recently but still useful. Compiled by Social Sciences Computing Services, University of Chicago.
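
     To make the distinction between hierarchical and rectangular files concrete, here is a minimal sketch that flattens a household/person file into one row per person. The record layout is an assumption made for illustration (a leading 'H' or 'P' record type, comma-separated fields) and the data are made up; real hierarchical files follow whatever layout their codebook defines.

```python
def flatten(lines):
    """Turn a household/person hierarchical file into person-level rows.

    Assumes each line starts with a record type: 'H' for a household record,
    'P' for a person record belonging to the most recent household.
    """
    rows = []
    household = None
    for line in lines:
        rectype, *fields = line.split(",")
        if rectype == "H":
            household = {"hh_id": fields[0], "region": fields[1]}
        elif rectype == "P":
            if household is None:
                raise ValueError("person record found before any household record")
            # Copy the household's attributes onto each of its members.
            rows.append({**household, "person_id": fields[0], "age": int(fields[1])})
    return rows


# Made-up hierarchical file: two households with two and one members.
raw = ["H,001,NE", "P,1,34", "P,2,36", "H,002,SO", "P,1,71"]
rectangular = flatten(raw)
for row in rectangular:
    print(row)
```

     The result is rectangular: every row has the same variables, with household attributes repeated for each person.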

Suggested Citation Styles for Internet Information
     Recommended citation formats for static and dynamic products provided on US Census sites.

Tools and Guidelines for managing household survey microdata
     These pages cover metadata creation, file formats and organization, data editing, principles of archiving, and minimizing disclosure risk. Compiled by the International Household Survey Network.