The script first creates a manual dataset. It calculates the observed mean difference via a classic PROC TTEST. Then, it uses PROC IML to generate 1000 permutations of the response variable ('Money') while keeping the groups ('School') fixed. These permutations are analyzed en masse by PROC TTEST to generate a null distribution of mean differences. Finally, the script compares the observed difference to this distribution to estimate an empirical p-value.
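In symbols, and assuming the two-sided comparison used in the script's filtering step, the estimate can be written as below. The notation is introduced here for illustration and does not appear in the script: $d_{\text{obs}}$ is the observed mean difference, $d_b$ is the mean difference from permutation $b$, and $B = 1000$ is the number of permutations.

$$\hat{p} = \frac{1}{B}\sum_{b=1}^{B} \mathbf{1}\left\{\,|d_b| \ge |d_{\text{obs}}|\,\right\}$$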
Data Analysis
Type : INTERNAL_CREATION
The data is defined directly in the code via a Data Step using DATALINES (School and Money variables).
1 Code Block
DATA STEP Data
Explanation : Creation of the 'cash' dataset containing observations for two schools (School 0 and 1) and associated amounts.
2 Code Block
PROC TTEST
Explanation : Execution of the initial t-test on the real data to obtain the observed statistic (the actual mean difference).
proc ttest data=cash;
class School;
*may need to convert School to numeric;
var Money;
run;
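The last step of the script hardcodes the observed difference. As a hedged alternative, the sketch below captures it from the same ConfLimits ODS table that the script reads later and stores it in a macro variable. The data set name 'obs' and the macro variable 'obs_diff' are hypothetical names introduced here, and the sketch assumes the 'cash' dataset already exists.

```sas
/* Sketch only: 'obs' and 'obs_diff' are illustrative names, not part of the original script. */
ods output conflimits=obs;            /* capture the confidence-limits table */
proc ttest data=cash plots=none;
   class School;
   var Money;
run;

data _null_;
   set obs;
   where method = "Pooled";           /* row holding the pooled mean difference */
   call symputx("obs_diff", abs(mean));
run;

%put NOTE: Observed absolute mean difference = &obs_diff;
```

With this in place, the threshold in the final Data Step could be written as abs(mean) >= &obs_diff instead of a literal value.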
3 Code Block
PROC IML Data
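Explanation : Uses the IML matrix language to read the data, generate 1000 random permutations of the 'Money' column with the ranperm function, and save the result in a wide table 'newds' (one column per permutation). ODS display output is suppressed beforehand so that the t-tests that follow do not print their results, which improves performance. The final 'ods output conflimits=diff' statement applies to the PROC TTEST in the next block.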
ods output off;
ods exclude all;
*borrowed code from internet ... randomizes observations and creates a matrix ... one row per randomization ;
proc iml ;
use cash;
read all var{School Money} into x;
*change variable names here ... make sure it is class then var ... in that order.;
p=t(ranperm(x[, 2], 1000));
*Note that the "1000" here is the number of permutations. ;
paf=x[, 1]||p;
create newds from paf;
append from paf;
quit;
*calculates differences and creates a histogram;
ods output conflimits=diff;
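To see what the ranperm call produces, a toy example can be run on its own, separately from the steps above (while 'ods exclude all' is active its printed output would be suppressed). The vector values and the seed are arbitrary choices made for this illustration; the original script does not set a seed, so its permutations differ from run to run.

```sas
/* Toy example, independent of the 'cash' data; the seed and values are arbitrary. */
proc iml;
   call randseed(12345);      /* fix the seed so the permutations are reproducible */
   v  = {10, 20, 30, 40};     /* column vector with 4 values */
   p  = ranperm(v, 3);        /* 3 x 4 matrix: one permutation per row */
   pt = t(p);                 /* 4 x 3 matrix: one permutation per column, as in the script */
   print p, pt;
quit;
```

The transpose is what lets the script attach the fixed 'School' column on the left, so that each of the remaining columns is one shuffled copy of 'Money'.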
4 Code Block
PROC TTEST Data
Explanation : Runs the t-test on all 1000 permuted columns in a single call. The 'newds' table contains the class variable in col1 and the permutations in col2-col1001. The results (confidence limits and mean differences) are captured in the 'diff' table by the 'ods output conflimits=diff' statement declared just before the procedure. Display output is then re-enabled with 'ods exclude none'.
proc ttest data=newds plots=none;
class col1;
var col2 - col1001;
run;
ods output on;
ods exclude none;
5 Code Block
PROC UNIVARIATE
Explanation : Analysis of the distribution of simulated mean differences (stored in the 'mean' variable of the 'diff' table) to visualize the null distribution.
proc univariate data=diff;
where method="Pooled";
var mean;
histogram mean;
run;
6 Code Block
DATA STEP Data
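Explanation : First step of the empirical p-value calculation: the Data Step keeps only the permutations whose simulated difference (in absolute value) is greater than or equal to the observed difference (hardcoded here at 114.6). The number of rows kept, divided by the 1000 permutations, is the two-sided empirical p-value.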
data numdiffs;
set diff;
where method="Pooled";
if abs(mean) >=114.6;
*if abs(mean) >=44.0667;
*you will need to put the observed difference you got from t test above here. note if you have a one or two tailed test.;
run;
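The script stops at filtering the extreme rows, so the proportion itself still has to be computed. A minimal sketch, assuming the 'diff' table from the PROC TTEST step is available and reusing the hardcoded threshold of 114.6, is shown below. The column aliases n_perm, n_extreme and p_empirical are names introduced here for illustration.

```sas
/* Sketch: share of permuted differences at least as extreme as the observed one. */
/* 114.6 is the hardcoded observed difference used in the script.                 */
proc sql;
   select count(*)                 as n_perm,       /* number of permutations       */
          sum(abs(mean) >= 114.6)  as n_extreme,    /* comparisons evaluate to 0/1  */
          avg(abs(mean) >= 114.6)  as p_empirical   /* two-sided empirical p-value  */
   from diff
   where method = "Pooled";
quit;
```

Reading the proportion from 'diff' rather than from 'numdiffs' avoids hardcoding the number of permutations in the denominator.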
7 Code Block
PROC PRINT
Explanation : Display of identified extreme cases for verification.
proc print data=numdiffs;
where method="Pooled";
run;