Cornell University CISER

CISER Computing

How do I Create a SAS Data Set with Compressed Observations?

To create a compressed SAS data set, specify COMPRESS=YES either as a data set option on the output data set or in an OPTIONS statement. Compression reduces the size of a data set by replacing runs of repeated consecutive characters or numbers with 2-byte or 3-byte representations. To uncompress the observations, use a DATA step to copy the data set, specifying COMPRESS=NO for the new data set.
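
As a minimal sketch of the uncompress step, the DATA step below copies a compressed data set into a new, uncompressed one. It assumes the libref ssd has already been assigned and that ssd.income is an existing compressed data set; the output name income_u is hypothetical.

```sas
/* Assumption: libref ssd is already assigned and ssd.income exists. */
/* income_u is a made-up name for the uncompressed copy.             */
data ssd.income_u (compress=no);
    set ssd.income;   /* read every observation from the compressed data set */
run;
```

The original data set is left unchanged, so delete it separately once the copy has been checked if the goal is to free the space.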

The advantages of a compressed SAS data set are reduced storage requirements and fewer input/output operations needed to read from and write to the data set during processing. The disadvantages are that you cannot use the SAS observation number to access an observation, and that more CPU time is required to prepare observations for input/output operations because of the overhead of compressing and expanding them. (Note: If there are few repeated characters, a data set can occupy more space in compressed form than in uncompressed form, due to the higher overhead per observation.) For more details on SAS compression, see SAS Language: Reference, Version 6, First Edition (Cary, NC: SAS Institute Inc., 1990).
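
Because compression does not always save space, it can be worth checking a data set after creating it. One way to do this, sketched below, is PROC CONTENTS, whose output includes a "Compressed" attribute along with size information that can be compared against an uncompressed copy. The example assumes the libref ssd is already assigned and that ssd.income exists.

```sas
/* Assumption: libref ssd is already assigned and ssd.income exists. */
/* The listing reports whether the data set is compressed.           */
proc contents data=ssd.income;
run;
```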

There are two ways to compress data sets in SAS:

  • To compress a single data set, use the data set option in the DATA statement:
    data ssd.income (compress=yes);
  • To compress all data sets created within a SAS session:
    options compress=yes;
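
The two approaches above can be sketched in one short program. Everything specific in it is made up for illustration: the library location ('.', the current directory), the data set name income2, and the variables and data values.

```sas
/* Hypothetical setup: point libref ssd at the current directory. */
libname ssd '.';

/* Way 1: compress a single data set with the data set option. */
data ssd.income (compress=yes);
    input id region $ income;   /* made-up variables */
    datalines;
1 NE 42000
2 NE 42000
3 SW 38500
;
run;

/* Way 2: compress every data set created after this statement. */
options compress=yes;

data ssd.income2;               /* compressed because of the system option */
    set ssd.income;
run;

/* Restore the default so later data sets are not compressed. */
options compress=no;
```

The data set option affects only the data set it is attached to, while the OPTIONS statement stays in effect until it is changed or the session ends.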