Taking Random Samples of Observations from a SAS Data Set
Q. I am looking for a program that will let me take a random sample from a very large data set (for example, a sample of 300 observations from a data set of 10000).
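A. One way to select a random sample from a data set is to first use a DATA step to generate a random vector (one uniform random number per observation), then use PROC SORT to rearrange the data by that random vector, and finally keep the first k observations. Because each observation can appear only once in the sorted data, this draws a sample without replacement. Below is a sample program.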
DATA dummy;                    /* CREATE A DATA SET */
  INPUT var1 @@;
  CARDS;
2.1 3.1 4 6 2.2 4.9 4 5 3 3.3 4 5 3 4.3
2.3 4 5 7 3 3 9 11 2
;
RUN;

%LET k=10;                     /* DEFINE SAMPLE SIZE */

DATA dummy;
  SET dummy;
  random=RANUNI(-1);           /* GENERATE A RANDOM VECTOR (A SEED <= 0 USES THE CLOCK) */
RUN;

PROC SORT DATA=dummy;
  BY random;                   /* SORT OBSERVATIONS BY THE RANDOM VECTOR */
RUN;

DATA sample;
  SET dummy(DROP=random);
  IF _N_ LE &k;                /* SELECT THE FIRST K OBSERVATIONS */
RUN;

PROC PRINT DATA=sample;
RUN;
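
As an alternative to the sort-based approach, a simple random sample without replacement can also be drawn in a single step with PROC SURVEYSELECT, assuming SAS/STAT is available at your site. The sketch below reuses the data set dummy and the macro variable k from the program above; the output data set name sample2 and the seed value 12345 are illustrative choices only.

```sas
/* Assumes the data set dummy and the macro variable k already exist. */
/* sample2 and the seed 12345 are hypothetical, illustrative values.  */
PROC SURVEYSELECT DATA=dummy        /* input data set                  */
                  OUT=sample2       /* output data set with the sample */
                  METHOD=SRS        /* simple random sampling          */
                  SAMPSIZE=&k       /* number of observations to draw  */
                  SEED=12345;       /* fixed seed for a repeatable draw */
RUN;

PROC PRINT DATA=sample2;
RUN;
```

A fixed positive seed makes the sample repeatable from run to run, which the clock-based RANUNI(-1) call does not. The output data set keeps every variable in dummy, so it will include random if the earlier steps have already been run.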