Comparison of Regression Models with PROC GLM and Macros

The script begins by creating the 'comp2010' dataset from in-line data (datalines/cards) and deriving three additional variables. It then defines a macro, `%hw6problem1`, that wraps `PROC GLM` to fit a linear regression model. A second macro, `%hw6problem2`, calls the first to fit two models (a full and a reduced one), combines their output statistics, and manually computes an F-test comparing the two. Finally, the script runs this comparison twice, passing the two models in opposite order, and writes the results to a PDF report.

Data Analysis

Type: INTERNAL_CREATION

The 'comp2010' dataset is created directly in the code using a DATA step with a 'cards' statement for data input.

1 Code Block

ODS

Explanation :
Opens an ODS destination to generate an output file in PDF format.

ods pdf file="HW6NickLipanovich.pdf";

2 Code Block

DATA STEP Data

Explanation :
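Creates the 'comp2010' table. Reads the three variables 'winper', 'score', and 'save' from in-line data (cards) and calculates three additional variables: 'save2' (the square of 'save'), 'scoresave' (the sum of 'score' and 'save'), and 'differential' (100 times the amount by which 'save' exceeds 1 - 'score').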

DATA comp2010;
INPUT winper score save;
save2=save*save;
scoresave = score+save;
differential = 100*(save-(1-score));
CARDS;
0.661016949 0.409090909 0.706730769
0.631578947 0.369158879 0.720379147
0.571428571 0.317596567 0.729613734
0.593220339 0.352040816 0.712871287
0.615384615 0.365714286 0.729885057
0.596153846 0.359605911 0.694581281
0.517241379 0.339622642 0.690821256
0.576923077 0.340909091 0.714285714
0.568627451 0.365979381 0.680412371
0.5625 0.388571429 0.649122807
0.519230769 0.349726776 0.650793651
0.465517241 0.256281407 0.705882353
0.446428571 0.293532338 0.668367347
0.568181818 0.404580153 0.65942029
0.510638298 0.315068493 0.68627451
0.436363636 0.308080808 0.663212435
0.442307692 0.337423313 0.62962963
0.5 0.283950617 0.707006369
0.55 0.324675325 0.70625
0.44 0.345945946 0.60989011
0.456521739 0.365853659 0.628930818
0.416666667 0.259259259 0.680851064
0.434782609 0.305389222 0.650887574
0.434782609 0.337423313 0.625
0.358490566 0.234972678 0.653631285
0.386363636 0.309859155 0.582733813
0.457142857 0.345132743 0.617391304
0.365853659 0.311258278 0.623287671
0.365853659 0.27480916 0.603053435
0.378378378 0.307692308 0.580357143
;
RUN;
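The derived variable 'differential' is an affine function of the sum of 'score' and 'save', which is what makes the model comparison later in the script a comparison of nested models. Writing out the DATA step assignment (the coefficients \gamma_0 and \gamma_1 are notation introduced here, not names from the script):

```latex
\begin{align*}
\text{differential} &= 100\,\bigl(\text{save} - (1 - \text{score})\bigr)
                     = 100\,(\text{score} + \text{save}) - 100, \\
\gamma_0 + \gamma_1\,\text{differential}
  &= (\gamma_0 - 100\,\gamma_1) + 100\,\gamma_1\,\text{score} + 100\,\gamma_1\,\text{save}.
\end{align*}
```

A model with 'differential' as its only predictor is therefore the model with 'score' and 'save' under the single constraint that the two slopes are equal.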

3 Code Block

PROC PRINT

Explanation :
Displays the content of the 'comp2010' table in the results output.

PROC PRINT DATA=comp2010;
RUN;
QUIT;

4 Code Block

MACRO

Explanation :
Defines a macro `%hw6problem1` that runs the GLM (General Linear Models) procedure. The macro fits a model for the dependent variable ('outcome') on the predictors ('modelx') using the input table ('indata'). If 'classx' is not empty, a CLASS statement declares those variables as classification variables. The NOPRINT option suppresses printed output, and the model statistics are saved to the OUTSTAT= table ('myoutstat').

%macro hw6problem1 (outcome,classx,modelx,myoutstat,indata);
  /* Empty classx: fit the model without a CLASS statement. */
  %IF %length(&classx) = 0 %THEN
  %DO;
    PROC GLM noprint DATA=&indata outstat=&myoutstat;
      model &outcome = &modelx;
    RUN;
    QUIT;
  %END;
  %ELSE
  %DO;
    PROC GLM noprint DATA=&indata outstat=&myoutstat;
      class &classx;
      model &outcome = &modelx;
    RUN;
    QUIT;
  %END;
%mend;
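Macro parameters are plain text, and quotation marks are ordinary characters to the macro processor, so comparing a parameter against a quoted blank such as `' '` does not detect an omitted argument. One common way to check for an empty parameter is `%length`. The sketch below uses a hypothetical helper macro, `%show_blank`, that is not part of the original script and only writes a message to the log:

```sas
/* Hypothetical helper, not part of the original script: reports in the
   log whether its parameter was left empty. */
%macro show_blank(classx);
  %if %length(&classx) = 0 %then %put NOTE: classx is empty;
  %else %put NOTE: classx has the value &classx;
%mend show_blank;

%show_blank();        /* parameter omitted */
%show_blank(team);    /* 'team' is an illustrative variable name */
```

The first call should take the `%then` branch and the second the `%else` branch, which is the behavior a blank-parameter check in `%hw6problem1` needs in order to decide whether to emit a CLASS statement.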

5 Code Block

MACRO Data

Explanation :
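Defines a macro `%hw6problem2` that compares two nested models (a full and a reduced one). It calls `%hw6problem1` once per model, stacks the two OUTSTAT tables, and keeps only the ERROR rows, which hold each model's error sum of squares and error degrees of freedom. The rows are then sorted by error degrees of freedom, so the full model comes first regardless of the order of the arguments. A final DATA step uses LAG to compute the F statistic and its p-value, which assess whether the full model fits significantly better than the reduced model. The result is displayed with PROC PRINT.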

%macro hw6problem2 (outcome,classx1,classx2,modelx1,modelx2,myoutstat1,myoutstat2,indata);
  /* Fit both models; each call writes its own OUTSTAT table. */
  %hw6problem1 (&outcome,&classx1,&modelx1,&myoutstat1,&indata);
  %hw6problem1 (&outcome,&classx2,&modelx2,&myoutstat2,&indata);
  /* Keep only the ERROR rows (error SS and error df) of the two fits. */
  DATA fullvsreduced;
    SET &myoutstat1 &myoutstat2;
    IF _type_="ERROR";
  RUN;
  /* The full model has fewer error df, so it sorts first. */
  PROC SORT DATA=fullvsreduced;
    BY df;
  RUN;
  DATA fullvsreduced2;
    SET fullvsreduced;
    /* On the second row (reduced model), LAG returns the full model's values. */
    fullss=lag(ss);
    fulldf=lag(df);
    num = (ss-fullss)/(df-fulldf);
    den = fullss/fulldf;
    f = num/den;
    pvalue=1-cdf('f', f, df-fulldf, fulldf);
    keep f pvalue;
    /* The first row has no lagged values; drop it. */
    IF _n_=1 THEN delete;
  RUN;
  PROC PRINT DATA=fullvsreduced2;
  RUN;
%mend;
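In standard notation, the quantity computed in the last DATA step is the partial F statistic for nested models. The symbols are introduced here for illustration: SSE_F and df_F are the error sum of squares and error degrees of freedom of the full model (`fullss`, `fulldf`), SSE_R and df_R are those of the reduced model (`ss`, `df` on the second row), and G is the cumulative distribution function of the F distribution with the stated degrees of freedom.

```latex
\[
F = \frac{(\mathrm{SSE}_R - \mathrm{SSE}_F)\,/\,(df_R - df_F)}{\mathrm{SSE}_F\,/\,df_F},
\qquad
p = 1 - G_{\,df_R - df_F,\; df_F}(F)
\]
```

With the 30 observations in 'comp2010', a model with two predictors has 30 - 3 = 27 error degrees of freedom and a model with one predictor has 28, so the comparison of those two models has 1 numerator and 27 denominator degrees of freedom.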

6 Code Block

MACRO CALL

Explanation :
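Calls the model comparison macro `%hw6problem2` twice. Both calls compare the full model, with 'score' and 'save' as predictors, against the reduced model with the single predictor 'differential'. The second call only swaps the order in which the two models are passed. Because the macro sorts by error degrees of freedom before computing the test, both calls should print the same F statistic and p-value. The classification arguments are left empty, and the OUTSTAT results are written to 'ssout' (the 'score' and 'save' model) and 'dout' (the 'differential' model).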

%hw6problem2(winper,,,score save,differential,ssout,dout,comp2010);
%hw6problem2(winper,,,differential,score save,dout,ssout,comp2010);
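As a possible cross-check on the hand-computed F-test, the same hypothesis can be stated directly as a linear constraint. Because 'differential' depends on 'score' and 'save' only through their sum, the reduced model corresponds to equal slopes for the two predictors in the full model. A minimal sketch, assuming PROC REG from SAS/STAT is available (the label `equal_slopes` is made up for this example and is not part of the original script):

```sas
/* Cross-check sketch, not part of the original script:
   test H0: slope(score) = slope(save) in the full model.
   The F value reported for the labeled test should agree with the
   full-versus-reduced comparison printed by %hw6problem2. */
PROC REG DATA=comp2010;
  MODEL winper = score save;
  equal_slopes: TEST score - save = 0;
RUN;
QUIT;
```

If the two approaches disagree, the first thing to inspect would be the ERROR rows of 'ssout' and 'dout', since those are the only inputs to the manual calculation.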

7 Code Block

ODS

Explanation :
Closes the ODS PDF destination, thus finalizing the file creation.

ods pdf close;

This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
