The script starts by creating an internal dataset, then runs an initial t-test to obtain the observed difference in means. Next, it uses PROC IML to generate 1000 random permutations of the target variable 'Money'. PROC TTEST analyzes these permutations to build an empirical distribution of mean differences under the null hypothesis. Finally, PROC UNIVARIATE plots that distribution, and a DATA step keeps the permutations whose absolute difference is at least as large as the observed one; the number of rows kept, divided by 1000, is the empirical p-value.
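In symbols, the empirical p-value described above can be written as follows. The notation is ours rather than the script's: B is the number of permutations, d_obs is the observed difference in means, and d_b is the difference obtained under permutation b.

```latex
\hat{p} = \frac{\#\{\, b : |d_b| \ge |d_{\mathrm{obs}}| \,\}}{B}, \qquad B = 1000
```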
Data Analysis
Type : INTERNAL_CREATION
Data is defined directly in the script via the datalines statement in the 'cash' dataset.
1 Code Block
DATA STEP Data
Explanation : Creation of the initial 'cash' dataset containing the variables 'School' and 'Money' with embedded data.
data cash;
input School Money;
datalines;
0 34
0 1200
...
1 3
1 0
;
2 Code Block
PROC TTEST
Explanation : Execution of the initial Student's t-test to obtain the observed mean difference on the real data.
proc ttest data=cash;
class School;
var Money;
run;
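The later filtering step hard-codes the observed difference (114.6). As a hedged alternative, the same value could be captured from this t-test instead of being typed in by hand. The sketch below assumes the ConfLimits table that the script itself exports later; the dataset name obs_cl and the macro variable obs_diff are hypothetical names introduced here.

```sas
/* Capture the observed result instead of hard-coding it (sketch). */
ods output ConfLimits=obs_cl;        /* obs_cl: hypothetical dataset name */
proc ttest data=cash plots=none;
  class School;
  var Money;
run;

data _null_;
  set obs_cl;
  where method = "Pooled";           /* the pooled difference-in-means row */
  call symputx("obs_diff", abs(mean));   /* obs_diff: hypothetical macro variable */
run;

%put NOTE: observed absolute difference = &obs_diff;
```

With this in place, a later comparison could use &obs_diff rather than the literal 114.6.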
3 Code Block
PROC IML Data
Explanation : Using the IML matrix language to read data, generate 1000 random permutations of the 'Money' column (variable x[,2]) while keeping 'School' fixed, and save the result in 'newds'.
ods output off;
ods exclude all;
proc iml ;
use cash;
read all var{School Money} into x;
p=t(ranperm(x[, 2], 1000));
paf=x[, 1]||p;
create newds from paf;
append from paf;
quit;
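The PROC IML step as written does not appear to set a random seed, so each run would produce a different set of permutations and a slightly different p-value. A minimal sketch of a reproducible variant follows; the seed value is an arbitrary assumption, and the CLOSE statements are tidy-up additions that are not in the original script.

```sas
proc iml;
  call randseed(20240101);            /* arbitrary seed (assumption) for reproducibility */
  use cash;
  read all var {School Money} into x;
  close cash;
  p = t(ranperm(x[, 2], 1000));       /* one permuted copy of Money per column */
  paf = x[, 1] || p;                  /* School in col1, permutations in col2-col1001 */
  create newds from paf;
  append from paf;
  close newds;
quit;
```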
4 Code Block
PROC TTEST
Explanation : Calculation of t-tests for the 1000 permuted columns (col2 to col1001) relative to the group variable (col1). The ConfLimits output table, which holds the group means and the mean differences with their confidence limits, is exported to the 'diff' table.
ods output conflimits=diff;
proc ttest data=newds plots=none;
class col1;
var col2 - col1001;
run;
ods output on;
ods exclude none;
5 Code Block
PROC UNIVARIATE
Explanation : Analysis of the distribution of simulated mean differences (stored in the 'mean' variable of the 'diff' table).
proc univariate data=diff;
where method="Pooled";
var mean;
histogram mean;
run;
6 Code Block
DATA STEP Data
Explanation : Filtering of simulated results to keep only those whose absolute mean difference is greater than or equal to the observed value (114.6). The share of the 1000 permutations that pass this filter is the empirical p-value.
data numdiffs;
set diff;
where method="Pooled";
if abs(mean) >= 114.6; /* 114.6 is the observed difference from the initial t-test */
run;
7 Code Block
PROC PRINT
Explanation : Display of permutations that satisfy the extremeness criterion for visual verification.
proc print data=numdiffs;
where method="Pooled";
run;
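The steps above stop at listing the extreme permutations, so the p-value itself has to be read off as a row count. A minimal sketch that computes it directly from the 'diff' table is shown here; the dataset name pvalue and the variables n_perm, n_extreme and p_value are hypothetical names, and 114.6 is the observed value quoted in the script.

```sas
data pvalue;                          /* pvalue: hypothetical dataset name */
  set diff end=last;
  where method = "Pooled";            /* one pooled-difference row per permutation */
  n_perm + 1;                         /* count permutations */
  n_extreme + (abs(mean) >= 114.6);   /* count those at least as extreme as observed */
  if last then do;
    p_value = n_extreme / n_perm;     /* empirical two-sided p-value */
    output;
  end;
  keep n_perm n_extreme p_value;
run;

proc print data=pvalue noobs;
run;
```

Dividing by the counted number of rows rather than a literal 1000 keeps the result correct if the number of permutations is changed.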
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. WeAreCAS is an independent community site and is not affiliated with SAS Institute Inc.