Taking Random Samples of Observations from a SAS Data Set
Q. I am looking for a program that will let me take a random sample from a very large data set (for example, a sample of 300 observations from a data set of 10000).
A. One way to select a random sample from a data set is to use a DATA step to assign each observation a uniform random number, sort the data by that number with PROC SORT, and then keep the first k observations. Because the sort order is random, the first k observations form a simple random sample drawn without replacement. Below is a sample program.
DATA dummy; /* CREATE A DATA SET */
INPUT var1 @@;
DATALINES;
2.1 3.1 4 6 2.2 4.9 4 5 3 3.3 4 5 3 4.3 2.3 4 5 7 3 3 9 11 2
;
RUN;
%LET k=10; /* DEFINE SAMPLE SIZE */
DATA dummy;
SET dummy;
random=RANUNI(-1); /* GENERATE A RANDOM VECTOR */
RUN;
PROC SORT DATA=dummy;
BY random; /* SORT OBSERVATIONS BY THE RANDOM VECTOR */
RUN;
DATA sample; /* SELECT THE FIRST K OBSERVATIONS */
SET dummy;
IF _N_ le &k;
RUN;
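For a very large data set, sorting every observation can be slow. If SAS/STAT is installed (an assumption, since the question does not say), PROC SURVEYSELECT can draw a simple random sample directly, with no sort. The following is only a sketch: the data set names big and sample, the variable id, and the seed value are made up for illustration, and the sizes are taken from the example in the question.

```sas
/* Hypothetical input: 10000 observations with one variable, id */
DATA big;
  DO id = 1 TO 10000;
    OUTPUT;
  END;
RUN;

/* Draw 300 observations by simple random sampling without replacement. */
/* A positive SEED= value makes the sample reproducible across runs.    */
PROC SURVEYSELECT DATA=big OUT=sample METHOD=SRS N=300 SEED=12345;
RUN;
```

The output data set sample should contain 300 observations. Change N= to set the sample size, and leave out SEED= if the sample does not need to be reproducible.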