Taking Random Samples of Observations from a SAS Data Set
Q. I am looking for a program that will let me take a random sample from a very large data set (for example, a sample of 300 observations from a data set of 10000).
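A. One way to select a random sample from a data set is to first use a DATA step to generate a random vector (one uniform random number per observation), then use PROC SORT to reorder the observations by that random vector, and finally keep the first k observations. Because the sort order is random, the first k observations form a simple random sample drawn without replacement. Below is a sample program.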
DATA dummy ; /* CREATE A DATA SET */
input var1 @@;
cards;
2.1 3.1 4 6 2.2 4.9 4 5 3 3.3 4 5 3 4.3 2.3 4 5 7 3 3 9 11 2
;
run;

%let k=10; /* DEFINE SAMPLE SIZE */
DATA dummy ;
SET dummy ;
random=RANUNI(-1); /* GENERATE A RANDOM VECTOR; A SEED <= 0 USES THE CLOCK, SO EACH RUN DIFFERS */
run;

PROC SORT DATA=dummy;
BY random; /* SORT OBSERVATIONS BY THE RANDOM VECTOR */
run;

DATA sample;
SET dummy(drop=random);
IF _N_ le &k; /* SELECT THE FIRST K OBSERVATIONS */
run;

proc print data=sample; /* PRINT THE SAMPLE */
run;
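
For a very large data set, sorting every observation can be costly. The sketch below is an alternative that is not part of the program above: it draws the sample in a single pass by keeping each observation with probability (number still needed) / (number not yet examined). It assumes the data set dummy and the variable var1 from the program above; the names sample2, total, needed and remaining are made up for this illustration.

```sas
%let k=10;                            /* SAMPLE SIZE, AS ABOVE */

DATA sample2(drop=needed remaining);
SET dummy(keep=var1) nobs=total;      /* total = NUMBER OF OBSERVATIONS IN dummy */
retain needed remaining;
IF _N_ = 1 then do;
  needed = &k;                        /* OBSERVATIONS STILL TO BE SELECTED */
  remaining = total;                  /* OBSERVATIONS NOT YET EXAMINED */
end;
IF RANUNI(-1) < needed / remaining then do;
  output;                             /* KEEP THIS OBSERVATION */
  needed = needed - 1;
end;
remaining = remaining - 1;
run;

proc print data=sample2;
run;
```

Provided k does not exceed the number of observations, sample2 ends up with exactly k observations, and they stay in their original order. As in the program above, RANUNI(-1) takes its seed from the clock; replacing -1 with a positive integer should make the sample reproducible from run to run.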