Cornell University CISER

CISER Computing

How to Read Space-Delimited ASCII Files Into SAS

To read a space-delimited file, you can use either the Import Wizard or the DATA-INFILE-INPUT statement combination. The Import Wizard is practical only for text (ASCII) data files with just a few variables. If you are dealing with hundreds, if not thousands, of variables, the DATA-INFILE-INPUT combination is the better choice.

Reading a Space-Delimited File (POVERTY-LIST.DAT)

Simple list input is usually used to read a space-delimited file: you list the variables and their properties in the INPUT statement in the order in which the data values appear in the file being read. To read the data file called poverty-list.dat:

DATA mylib.inputlist;
     INFILE 'c:\sasworkshop\poverty-list.dat';
     INPUT state $ medinchh medincfam percapinc pctuspov pctfampov;
RUN;
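
The program above writes to a library called mylib, which has to be assigned before the DATA step runs. The following is a minimal sketch of a complete run, assuming the library is kept in the same c:\sasworkshop folder as the raw data file (the folder choice is an assumption, not something stated above). The PROC PRINT step is added only as a quick check that the values were read as expected.

```sas
/* Assumption: the permanent library mylib is stored in c:\sasworkshop. */
LIBNAME mylib 'c:\sasworkshop';

/* Read the space-delimited raw file with simple list input. */
DATA mylib.inputlist;
     INFILE 'c:\sasworkshop\poverty-list.dat';
     INPUT state $ medinchh medincfam percapinc pctuspov pctfampov;
RUN;

/* Print the first 10 observations to confirm the data were read correctly. */
PROC PRINT DATA=mylib.inputlist (OBS=10);
RUN;
```

Note that with simple list input a character variable such as state gets a default length of 8, so longer values would be truncated unless a LENGTH statement or an informat is supplied.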
 

 

Data Requirements for Simple List Input

  • No field may be skipped. All variables in the external file must be included.
  • Data values must be separated by at least one blank (space delimited).
  • No blanks may be included within a data value.
  • Every record must contain all values, in the same order.
  • Missing values must be represented by a placeholder such as a single period (.).

The statements in the program above work as follows:

  • The DATA statement is where you specify the name of the SAS data file to be created and where it will be stored.
  • The INFILE statement indicates the name and location of the external (raw data) file to be read. It may include information about the logical record length ("LRECL") of the raw data file. It may also include, or be followed by, instructions to read selected observations.
  • The INPUT statement is where you define the properties of your variables. The variables should be listed in the order in which their corresponding data values appear in the external raw data file. The INPUT statement:
      ◦ May include information about where the data values for each variable are located in the raw data file.
      ◦ Includes information about the characteristics of the variables (numeric versus character values, implied decimals, etc.).
      ◦ Marks variables that contain character values with a dollar sign ($) after the variable name.
      ◦ Treats variables listed without the dollar sign as numeric.
      ◦ Requires variable names that begin with a letter or underscore. The remaining characters can be letters, numbers, or underscores. Variable names can be up to 32 characters long.
  • The RUN statement tells SAS that this is the step boundary and that it should begin processing the DATA step.
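
To make the requirements and statement roles concrete, here is a small sketch in two parts. The first DATA step uses in-stream data (DATALINES) instead of an external file so it can be run anywhere; its three records are invented for illustration and are not taken from poverty-list.dat. The second step shows INFILE options for record length and for reading selected records; the option values (256 and 10) and the data set names work.listdemo and work.first10 are arbitrary assumptions.

```sas
/* Part 1: illustrative records only (invented values). */
DATA work.listdemo;
     /* state is character ($); medinchh and pctuspov are numeric. */
     INPUT state $ medinchh pctuspov;
     /* Values are separated by blanks; the period in the     */
     /* second record is the placeholder for a missing value. */
     DATALINES;
AL 34135 16.1
AK 51571 .
AZ 40558 13.9
;
RUN;

PROC PRINT DATA=work.listdemo;
RUN;

/* Part 2: INFILE options on the external file.               */
/* LRECL= sets the logical record length; FIRSTOBS= and OBS=  */
/* give the first and last record to read (here records 1-10). */
DATA work.first10;
     INFILE 'c:\sasworkshop\poverty-list.dat' LRECL=256 FIRSTOBS=1 OBS=10;
     INPUT state $ medinchh medincfam percapinc pctuspov pctfampov;
RUN;
```

In the first step, the second observation ends up with a missing value for pctuspov. Had the period been left out, SAS would have moved to the next line to look for the value, which is why list input needs a placeholder for every missing field.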