Statistical CREATION_INTERNE

Comparison of Regression Models with PROC GLM and Macros

Awaiting validation
The script begins by creating a 'comp2010' dataset from in-line data (datalines/cards) and computing three derived variables. It then defines a macro `%hw6problem1` that wraps `PROC GLM` to fit a linear regression model and save its statistics. A second macro, `%hw6problem2`, uses the first to fit two models (a full one and a reduced one), combines their error sums of squares, and manually calculates an F-test to compare the two models. Finally, the script runs this comparison twice, with the two models passed in opposite order, and writes the printed output to a PDF report.
Data Analysis

Type : CREATION_INTERNE


The 'comp2010' dataset is created directly in the code using a DATA step with a 'cards' statement for data input.

1 Code Block
ODS
Explanation :
Opens an ODS destination to generate an output file in PDF format.
ods pdf file="HW6NickLipanovich.pdf";
2 Code Block
DATA STEP Data
Explanation :
Creates the 'comp2010' table. Reads the three variables 'winper', 'score', and 'save' from in-line data (cards) and calculates three additional variables: 'save2', 'scoresave', and 'differential'.
DATA comp2010;
 INPUT winper score save;
 save2 = save*save; /* squared save term */
 scoresave = score + save; /* sum of the two predictors */
 differential = 100*(save - (1 - score)); /* same as 100*(score + save - 1) */
CARDS;
0.661016949 0.409090909 0.706730769
0.631578947 0.369158879 0.720379147
0.571428571 0.317596567 0.729613734
0.593220339 0.352040816 0.712871287
0.615384615 0.365714286 0.729885057
0.596153846 0.359605911 0.694581281
0.517241379 0.339622642 0.690821256
0.576923077 0.340909091 0.714285714
0.568627451 0.365979381 0.680412371
0.5625 0.388571429 0.649122807
0.519230769 0.349726776 0.650793651
0.465517241 0.256281407 0.705882353
0.446428571 0.293532338 0.668367347
0.568181818 0.404580153 0.65942029
0.510638298 0.315068493 0.68627451
0.436363636 0.308080808 0.663212435
0.442307692 0.337423313 0.62962963
0.5 0.283950617 0.707006369
0.55 0.324675325 0.70625
0.44 0.345945946 0.60989011
0.456521739 0.365853659 0.628930818
0.416666667 0.259259259 0.680851064
0.434782609 0.305389222 0.650887574
0.434782609 0.337423313 0.625
0.358490566 0.234972678 0.653631285
0.386363636 0.309859155 0.582733813
0.457142857 0.345132743 0.617391304
0.365853659 0.311258278 0.623287671
0.365853659 0.27480916 0.603053435
0.378378378 0.307692308 0.580357143
;
RUN;
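Since 'differential' is computed as 100*(save-(1-score)), it is algebraically 100*(scoresave-1), an exact linear function of 'score' and 'save'. That is what makes the model with 'differential' a special case of the model with 'score' and 'save' in the comparison later in the script. The following is a minimal sketch of a check on that relationship; the variable name `maxdiff` is introduced here for illustration and is not part of the original script.

```sas
/* Sketch: largest gap between differential and 100*(scoresave - 1).  */
/* maxdiff is a hypothetical helper variable, not in the original.    */
DATA _null_;
 SET comp2010 END=last;
 RETAIN maxdiff 0;
 maxdiff = max(maxdiff, abs(differential - 100*(scoresave - 1)));
 IF last THEN PUT 'Largest absolute discrepancy: ' maxdiff;
RUN;
```

The value written to the log should be zero up to floating-point rounding.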
3 Code Block
PROC PRINT
Explanation :
Displays the content of the 'comp2010' table in the results output.
PROC PRINT DATA=comp2010;
RUN;
4 Code Block
MACRO
Explanation :
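Defines a macro `%hw6problem1` that runs PROC GLM (General Linear Models) to fit a model for the dependent variable 'outcome' on the predictors 'modelx'. When 'classx' is non-empty, it is declared as a classification variable in a CLASS statement. NOPRINT suppresses the displayed output, and the model statistics are saved to the OUTSTAT= table 'myoutstat'. The empty-parameter check uses `%length(&classx) = 0`, because a macro comparison against `' '` treats the quotation marks as literal characters and never matches an empty argument.
%macro hw6problem1 (outcome,classx,modelx,myoutstat,indata);
 /* An empty macro parameter has length 0 */
 %IF %length(&classx) = 0 %THEN
 %DO;
 /* No classification variable: fit the model without a CLASS statement */
 PROC GLM noprint DATA=&indata outstat=&myoutstat;
 model &outcome = &modelx;
 RUN;
 QUIT;
 %END;
 %ELSE
 %DO;
 /* Classification variable supplied: declare it before the MODEL statement */
 PROC GLM noprint DATA=&indata outstat=&myoutstat;
 class &classx;
 model &outcome = &modelx;
 RUN;
 QUIT;
 %END;
%mend hw6problem1;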
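As a usage sketch, the macro can also be called on its own to see what the OUTSTAT= table contains before it is used for the model comparison. The table name `ssout_demo` is a hypothetical name chosen for this example; the second argument is left empty because 'comp2010' has no classification variable.

```sas
/* Fit winper on score and save; classx is left empty */
%hw6problem1(winper,,score save,ssout_demo,comp2010);

/* Inspect the saved statistics (ssout_demo is an illustrative name) */
PROC PRINT DATA=ssout_demo;
RUN;
```

The printed table should include one row with `_TYPE_` equal to ERROR, which carries the error degrees of freedom and error sum of squares, alongside rows for the sums of squares of each effect.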
5 Code Block
MACRO Data
Explanation :
Defines a macro `%hw6problem2` that compares two nested models (a full one and a reduced one). It calls `%hw6problem1` twice, once per model, keeps the ERROR row of each OUTSTAT table, and sorts the two rows by error degrees of freedom so that the full model (fewer error df) comes first. A final DATA step computes the F statistic and its p-value to assess whether the full model fits significantly better than the reduced model, and the result is printed.
%macro hw6problem2 (outcome,classx1,classx2,modelx1,modelx2,myoutstat1,myoutstat2,indata);
 /* Fit both models and save their statistics */
 %hw6problem1 (&outcome,&classx1,&modelx1,&myoutstat1,&indata);
 %hw6problem1 (&outcome,&classx2,&modelx2,&myoutstat2,&indata);
 /* Keep only the ERROR row (error df and error SS) of each model */
 DATA fullvsreduced;
 SET &myoutstat1 &myoutstat2;
 IF _type_="ERROR";
 RUN;
 /* Full model has fewer error df, so it sorts first */
 PROC SORT DATA=fullvsreduced;
 BY df;
 RUN;
 DATA fullvsreduced2;
 SET fullvsreduced;
 /* On the second row, lag() returns the full model's values */
 fullss=lag(ss);
 fulldf=lag(df);
 num = (ss-fullss)/(df-fulldf);
 den = fullss/fulldf;
 f = num/den;
 pvalue=1-cdf('f', f, df-fulldf, fulldf);
 keep f pvalue;
 /* The first row has no lagged values, so drop it */
 IF _n_=1 THEN delete;
 RUN;
 PROC PRINT DATA=fullvsreduced2;
 RUN;
%mend hw6problem2;
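The arithmetic in the final DATA step corresponds to the usual F-test for nested linear models. Written out, with SSE and df denoting the error sum of squares and error degrees of freedom of the reduced (R) and full (F) models (notation introduced here, not used in the script):

```latex
F = \frac{(\mathrm{SSE}_R - \mathrm{SSE}_F) / (df_R - df_F)}{\mathrm{SSE}_F / df_F},
\qquad
p = 1 - \Pr\left( F_{df_R - df_F,\; df_F} \le F \right)
```

In the code, `ss` and `df` on the second sorted row play the role of SSE_R and df_R, while `fullss` and `fulldf` are SSE_F and df_F. With the 30 observations listed above, a full model with an intercept and two predictors has df_F = 27, and a reduced model with an intercept and one predictor has df_R = 28, so the test has 1 and 27 degrees of freedom.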
6 Code Block
MACRO CALL
Explanation :
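Calls the model comparison macro `%hw6problem2` twice. The first call passes the model with 'score' and 'save' as predictors first and the simpler model with 'differential' second; the second call swaps the two. Because the macro sorts the ERROR rows by degrees of freedom before computing the test, both calls report the same F statistic and p-value. The per-model OUTSTAT tables are written to 'ssout' (score and save) and 'dout' (differential).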
%hw6problem2(winper,,,score save,differential,ssout,dout,comp2010);
%hw6problem2(winper,,,differential,score save,dout,ssout,comp2010);
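Because 'differential' equals 100*(score+save-1), the reduced model is the full model with the coefficients of 'score' and 'save' constrained to be equal. Under that reading, the manual F-test can be cross-checked with the TEST statement of PROC REG. This is a sketch of an alternative check, not part of the original script; the label `EqualSlopes` is an arbitrary name chosen for this example.

```sas
/* Cross-check: test whether score and save share one coefficient */
PROC REG DATA=comp2010;
 MODEL winper = score save;
 EqualSlopes: TEST score = save;
RUN;
QUIT;
```

The F value and p-value reported for the test should agree with the 'f' and 'pvalue' columns printed by `%hw6problem2`, up to rounding.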
7 Code Block
ODS
Explanation :
Closes the ODS PDF destination, thus finalizing the file creation.
ods pdf close;
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.