Cornell University Cornell University CISER

CISER Computing

User Written Formats

I. How to Create User Written Formats
II. How to Create or Read a SAS Data Set with User Written Formats



I. How to Create User Written Formats

The appearance of values in a SAS data set is controlled by formats. Use PROC FORMAT with VALUE statements to create customized formats.   To avoid re-creating formats each time a job is run, these formats can be permanently stored in a catalog. The general form of a PROC FORMAT is:

PROC FORMAT library=libref.catalogname ;
    VALUE  numfmt  value1='formatted-value-1'   value2='formatted-value-2'
        ........ valuen='formatted-value-n' ;
    VALUE  $charfmt  'value1'='formatted-value-1'   'value2'='formatted-value-2'
        ........ 'valuen'='formatted-value-n' ;
RUN;

PROC FORMAT Statement:  Without the LIBRARY=option, formats are stored in a catalog called FORMATS in the temporary WORK library and exist only for the duration of the SAS session. If the LIBRARY= option specifies only a libref, formats are permanently stored in that library in a catalog called FORMATS.

VALUE Statement:  When defining your own character or numeric formats, a VALUE statement is required for each format. Format names follow SAS naming conventions, except that character formats begin with a $ (the $ is included in the length of the name). A numeric format can be applied only to a numeric variable and a character format to a character variable. The input values for a character format are quoted. An input value for a numeric or character format cannot be mapped to more than one formatted value. Formatted values for all formats are quoted, and can be up to 200 characters long.
 

To obtain a detailed report about the formats in a catalog, use the FMTLIB option in the PROC FORMAT statement. This PROC step can also include either the SELECT or EXCLUDE statement to process specific formats rather than an entire catalog. General form for PROC FORMAT with the FMTLIB option is:

PROC FORMAT LIBRARY=libref.catalogname  FMTLIB;
      /* SELECT or  EXCLUDE statement ; */
RUN;

 
Sample SAS programs to create permanent SAS formats and to obtain details about a format:

/*The following example creates a numeric format called EDU_CAT and a character format called $GEN and stores those formats permanently in MY.FORMATS catalog (here MY is the libref assigned to a data library and FORMATS is the name of the catalog saved in that library). */

libname my 'u:\myproj\';

proc format library=my;  /*SAS will use the default catalog name FORMATS*/
  value edu_cat  1-11='Less than High School'
                  12='High School'
            12<-high='More than High School' ;
  value $gen  'M','m'='Males'
              'F','f'='Females' ;
run;

/* The following example creates a numeric format called EMPST and a character format called $JOB and stores those formats permanently in IN.NEWFORM catalog (here IN is the libref assigned to a data library and NEWFORM is the name of a catalog saved in that library). */

libname in 'u:\';

proc format library=in.newform;   value $job  '01'='Teacher'
             '31'-'33'='Computing Consultant'
       '41'-'49','51'-'59'='Medical Professional'
              other='N/A' ;
  value empst  0='NotEmployed'
              1='Part-time Employed'
              2='Full-time Employed' ;
run;

/* Example of obtaining detailed reports about formats in a catalog.  If either a SELECT or an EXCLUDE statement is not included, SAS outputs definitions of all the formats in the catalog. */

libname in 'u:\';
proc format library=in.newform FMTLIB ;
    select $job ;
run;


II. How to Create or Read SAS Data Sets with User Written Formats

When you either create or process a SAS data set with user-written formats attached to it's variables, the SAS system looks for those formats in a catalog named WORK.FORMATS and then in one named LIBRARY.FORMATS (here LIBRARY is the libref assigned to a data library and FORMATS is the name of a catalog saved in that library). If all your formats are saved in a catalog named FORMATS, simply assign the special libref LIBRARY to that folder and SAS will have access to those formats.  However, if you need to search multiple formats catalogs, you can use the FMTSEARCH= system option to identify the catalogs to be searched. The general form of this options is:

OPTIONS  FMTSEARCH=(mem1  mem2  ....memn) ;
    /* Here mem is either libref or libref.catalog.If mem is only a libref, FORMATS is assumed as name of the formatscatalog.*/

To permanently attach formats with variables in a SAS data set, use a DATA step with FORMAT statement. A FORMAT statement can also be used in a PROC step to temporarily attach the formats for processing. The general form of a FORMAT statement is:

FORMAT varname1  formatnamevarname2 formatname.  ---  varnamen  formatname. ;
        /* A format name should end with a period */

Note: In a DATA step, when you read a data set with formats attached to the variables, the variables in the output data set also inherit thoseformats. If you wish to remove the formats from the variables in your output data set, you can use a FORMAT statement with blanks in places of format names.

 
Examples of DATA and PROC steps that show working with user-written formats. Examples use formats created in Section I of this document.

/* Use the special libref LIBRARY for a folder where formats are saved in a catalog named FORMATS. Use FMTSEARCH= option to identify other catalogs for searching. */

libname LIBRARY 'u:\myproj\';
libname  saslib 'u:\';
options nodate formdlim='-' FMTSEARCH=(saslib.newform);

/* Now the following catalogs will be searched for formats in this order:  System Formats Catalog -> WORK.FORMATS -> LIBRARY.FORMATS -> SASLIB.NEWFORM */

/* Creating a data set with numeric and character variables */

data temp;
 infile cards dlm=',';
 input id $ sex $ emp_statyr_edu jobcat $ ;
 cards;
 A, m, 2, 18, 42
 B, F, 0, 16, 00
 C, f, 2, 16, 32
 D, M, 1, 12, 52
 E, f, 1, 18, 01
 ;
run;

/* Use a DATA step to permanently attach formats with variables */

 data saslib.rec;
   set temp;
   FORMAT emp_statempst.

/* Attach character formats to character variables and numeric formats to numeric variables */

   jobcat $job.;
 run;

 proc contents data=saslib.rec;
  title ;
  title1 "Data with User-written Formats" ;
 run;

 proc print data=saslib.rec;
   title2 "OUTPUT: The PRINT Procedure" ;
 run;

/* Using formats temporarily in a PROC step */

proc freq data=saslib.rec;
   title2 "OUTPUT: The FREQ Procedure";
  tables sex  yr_edu ;
  format sex $gen. yr_eduedu_cat. ;
 run;

/* Example of removing formats from the variables */

data new;
   set saslib.rec;

/* Detaching formats from these variables */

   FORMAT emp_stat jobcat ;
run;

proc contents data=new;
 title "Data with NO user-written formats" ;
run;


SAS OUTPUTS:

                         Data with User-written Formats
           Partial Output: The CONTENTS Procedure        1

     ----Alphabetic List of Variables and Attributes----

          #    Variable    Type    Len   Pos    Format
          --------------------------------------------
          3    emp_stat    num      8      0    EMPST.
          1    id          Char     8     16

          5    jobcat      Char     8     32    $JOB.

          2    sex         Char     8     24

          4    yr_edu      Num      8      8


                   Data with User-written Formats
             OUTPUT: The PRINT Procedure                 2

Obs id sex emp_stat        yr_edu  jobcat

1  A   m  Full-time Employed  18  Medical Professionals
2  B   F  Not Employed        16  N/A

3  C   f  Full-time Employed  16  Computing Consultants

4  D   M  Part-time Employed  12  Medical Professionals

5  E   f  Part-time Employed  18  Teacher


         Data with User-written Formats
           OUTPUT: The FREQ Procedure                    3

                                     Cumulative Cumulative
   sex        Frequency    Percent    Frequency Percent
   ----------------------------------------------------
   Females           3      60.00         3       60.00
   Males             2      40.00         5      100.00

 

                                     Cumulative Cumulative
      yr_edu     Frequency   Percent  Frequency Percent
  -----------------------------------------------------
  High School             1    20.00      1       20.00
  More than High School   4    80.00      5      100.00

 


          Data with NO user-written formats
       Partial Output: The CONTENTS Procedure            4

   ----Alphabetic List of Variables and Attributes----

          #    Variable    Type    Len   Pos
          ------------------------------------
          3    emp_stat    Num      8      0
          1    id          Char     8     16

          5    jobcat      Char     8     32

          2    sex         Char     8     24

          4    yr_edu      Num      8      8