This script performs a randomization test (permutation test) to evaluate the age difference between fired and non-fired employees. It first calculates the observed mean difference with PROC TTEST. Then it uses PROC IML to generate 1000 random permutations of the Age values while leaving the Status column fixed. A single PROC TTEST call analyzes all 1000 permuted columns to build the null distribution of the mean difference. Finally, an empirical p-value is estimated as the proportion of permuted differences that are at least as extreme, in absolute value, as the observed one.
Data Analysis
Type : INTERNAL_CREATION
Data is created directly within the script via a DATA step using the DATALINES statement (Status and Age variables).
1 Code Block
DATA STEP Data
Explanation : Creation of the initial 'discriminate' dataset containing individual status and age via internal data (datalines).
data discriminate;
input Status Age;
/* Status = 0 = Fired
Status = 1 = Not Fired */
datalines;
0 34
0 37
...
1 54
;
run;
2 Code Block
PROC TTEST
Explanation : Execution of Student's T-test on real data to obtain the observed mean difference (reference).
proc ttest data=discriminate;
class Status;
/* Status is the grouping variable: 0 = Fired, 1 = Not Fired */
var Age;
run;
3 Code Block
PROC IML Data
Explanation : Use of the IML matrix language to generate 1000 random permutations of the 'Age' variable. Creation of a wide table 'newds' where the first column is the status and the following ones are the permutations.
ods output off;
ods exclude all;
proc iml ;
use discriminate;
read all var{Status Age} into x;
p=t(ranperm(x[, 2], 1000));
paf=x[, 1]||p;
create newds from paf;
append from paf;
quit;
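The step above draws a fresh set of permutations on every run, so the resulting p-value changes slightly each time. The following is a minimal sketch, not part of the original script, showing on a toy vector what RANPERM returns and how a seed could make the draw reproducible. The seed value 12345, the three-element vector, and the choice of 4 permutations are arbitrary assumptions made for illustration.

```sas
proc iml;
   /* Assumption: any fixed seed works; 12345 is arbitrary */
   call randseed(12345);

   /* Toy column vector standing in for x[, 2] (the Age column) */
   age = {34, 37, 54};

   /* RANPERM returns one permutation per row (4 x 3 here);  */
   /* transposing gives one permutation per column (3 x 4),  */
   /* which is the layout the script appends next to Status. */
   p = t(ranperm(age, 4));
   print p;
quit;
```

Run this outside the `ods exclude all` region, otherwise the PRINT output is suppressed. Adding the same CALL RANDSEED line at the top of the script's PROC IML step should make the 1000 permutations repeatable across runs.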
4 Code Block
PROC TTEST Data
Explanation : Massive execution of T-tests on the 1000 permuted columns (col2 to col1001) against the status (col1). ODS outputs are suppressed for performance, except for the 'conflimits' table which is saved in 'diff'.
ods output conflimits=diff;
proc ttest data=newds plots=none;
class col1;
var col2 - col1001;
run;
ods output on;
ods exclude none;
5 Code Block
PROC UNIVARIATE
Explanation : Analysis of the distribution of randomly generated mean differences (Pooled method) and display of a histogram.
proc univariate data=diff;
where method="Pooled";
var mean;
histogram mean;
run;
6 Code Block
DATA STEP Data
Explanation : Filtering results to count how many permutations produced an absolute difference greater than or equal to the observed value (here hardcoded to 1.9238, which should correspond to the result of the first PROC TTEST).
data numdiffs;
set diff;
where method="Pooled";
if abs(mean) >=1.9238;
run;
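Because the threshold 1.9238 is typed in by hand, it goes stale as soon as the input data change. One possible way to avoid this, sketched here under stated assumptions, is to capture the ConfLimits table of the observed-data t-test and pass the pooled difference through a macro variable. The dataset name `observed` and the macro variable name `obs_diff` are hypothetical and do not appear in the original script.

```sas
/* Save the ConfLimits table of the t-test on the real data */
ods output conflimits=observed;
proc ttest data=discriminate plots=none;
   class Status;
   var Age;
run;

/* Store the absolute pooled mean difference in a macro variable */
data _null_;
   set observed;
   where method="Pooled";
   call symputx('obs_diff', abs(mean));
run;

/* Same filter as above, with the threshold taken from the data */
data numdiffs;
   set diff;
   where method="Pooled";
   if abs(mean) >= &obs_diff;
run;
```

The value passes through text when it is stored in the macro variable, so a permutation whose difference ties exactly with the observed one could in principle fall on either side of the comparison.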
7 Code Block
PROC PRINT
Explanation : Display of extreme permutations to allow manual calculation of the p-value (number of extreme observations / 1000).
proc print data=numdiffs;
where method="Pooled";
run;
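The manual count can also be done in code. The sketch below, which is an addition and not part of the original script, tallies the permutations in `diff` whose absolute pooled difference reaches the hardcoded threshold 1.9238 and divides by the number of permutations read. The dataset name `pvalue` and the variables `n_perm`, `n_extreme`, and `p_value` are hypothetical names chosen for illustration.

```sas
data pvalue;
   set diff end=last;
   where method="Pooled";

   /* Running totals; sum statements retain their values across rows */
   n_perm + 1;
   n_extreme + (abs(mean) >= 1.9238);

   /* On the last pooled row, compute the empirical two-sided p-value */
   if last then do;
      p_value = n_extreme / n_perm;
      output;
   end;
   keep n_perm n_extreme p_value;
run;

proc print data=pvalue;
run;
```

Dividing by the number of rows actually read, rather than by a literal 1000, keeps the estimate correct if the number of permutations is changed in the PROC IML step.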
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
Copyright Info : Mention 'borrowed code from internet' present in comments.
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. WeAreCAS is an independent community site and is not affiliated with SAS Institute Inc.