Regression Analysis on Imputed Data with PROC MIANALYZE

The process begins with the creation of a 'Fitness1' dataset containing physical fitness measurements with intentionally introduced missing values. Then, PROC MI is used to generate several complete datasets through multiple imputation. A regression analysis (PROC REG) is performed on each imputed dataset. Finally, PROC MIANALYZE combines the results of these regressions to produce valid statistical estimates and tests that account for the uncertainty associated with imputation.

Data Analysis

Type: INTERNAL_CREATION

The data is created directly within the script via a DATA step with a DATALINES statement. The dataset is named Fitness1.

1 Code Block

DATA STEP Data

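Explanation:
This block creates the 'Fitness1' table from inline data supplied via 'datalines'. Each data line holds two observations (the last line holds one), for 31 observations in total. Missing values (represented by periods '.') are intentionally included in all three variables: Oxygen, RunTime, and RunPulse.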

DATA Fitness1;
   /* The trailing @@ holds the input line so that several observations are read from each data line */
   INPUT Oxygen RunTime RunPulse @@;
   DATALINES;
44.609 11.37 178 45.313 10.07 185
54.297 8.65 156 59.571 . .
49.874 9.22 . 44.811 11.63 176
. 11.95 176 . 10.85 .
39.442 13.08 174 60.055 8.63 170
50.541 . . 37.388 14.03 186
44.754 11.12 176 47.273 . .
51.855 10.33 166 49.156 8.95 180
40.836 10.95 168 46.672 10.00 .
46.774 10.25 . 50.388 10.08 168
39.407 12.63 174 46.080 11.17 156
45.441 9.63 164 . 8.92 .
45.118 11.08 . 39.203 12.88 168
45.790 10.47 186 50.545 9.93 148
48.673 9.40 186 47.920 11.50 170
47.467 10.50 170
;
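Before imputing, it can help to see which variables are missing and in what combinations. The following is a minimal sketch, assuming the Fitness1 dataset created above. Running PROC MI with NIMPUTE=0 skips the imputation itself and displays only summary tables such as the missing data patterns, while PROC MEANS with the N and NMISS statistics gives per-variable counts.

```sas
/* Sketch: inspect missingness only; NIMPUTE=0 means no imputed datasets are created */
PROC MI DATA=Fitness1 nimpute=0;
   var Oxygen RunTime RunPulse;
RUN;

/* Per-variable counts of nonmissing (N) and missing (NMISS) values */
PROC MEANS DATA=Fitness1 n nmiss;
   var Oxygen RunTime RunPulse;
RUN;
```

Neither step modifies Fitness1, so they can be run or dropped without affecting the rest of the workflow.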

2 Code Block

PROC MI Data

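Explanation:
The MI (Multiple Imputation) procedure handles the missing data. It generates several 'complete' datasets by replacing each missing value with a plausible value. The imputed datasets are stacked in the 'outmi' table and distinguished by the '_Imputation_' variable. The SEED= option fixes the random number stream so the imputations are reproducible, and NOPRINT suppresses the displayed output.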

PROC MI DATA=Fitness1 seed=3237851 noprint out=outmi;
   var Oxygen RunTime RunPulse;
RUN;

3 Code Block

PROC REG Data

Explanation:
A linear regression (PROC REG) is executed to model 'Oxygen' as a function of 'RunTime' and 'RunPulse'. The 'by _Imputation_' statement forces the execution of a distinct regression for each dataset imputed by PROC MI. The OUTEST= option saves the parameter estimates from each model in the 'outreg' table, and COVOUT adds the covariance matrix of those estimates, which PROC MIANALYZE needs in order to compute the within-imputation variance.

PROC REG DATA=outmi outest=outreg covout noprint;
   model Oxygen= RunTime RunPulse;
   BY _Imputation_;
RUN;
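To see what PROC MIANALYZE will read, the first rows of the regression output can be printed. This is a sketch, assuming the outreg dataset produced by the PROC REG step. With OUTEST= and COVOUT, each imputation contributes one row of parameter estimates (_TYPE_ = 'PARMS') followed by one covariance row (_TYPE_ = 'COV') per parameter, so eight rows cover the first two imputations of this three-parameter model. The title text is an illustrative choice.

```sas
/* Sketch: show the estimates and covariance rows for the first two imputations */
PROC PRINT DATA=outreg(obs=8);
   var _Imputation_ _Type_ _Name_ Intercept RunTime RunPulse;
   title 'Parameter Estimates and Covariance Matrices (first two imputations)';
RUN;
title;   /* clear the title so it does not carry over to later steps */
```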

4 Code Block

PROC MIANALYZE

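Explanation:
The MIANALYZE procedure combines the results of the regressions performed on the multiple imputed datasets. It reads the estimates and covariance matrices from 'outreg' and produces a final statistical inference (parameter estimates, standard errors, tests) that is valid because it accounts for the variability due to imputation. EDF=28 supplies the complete-data error degrees of freedom (31 observations minus 3 regression parameters). The TEST statement tests the hypotheses Intercept = 0 and RunTime = RunPulse, with the MULT option requesting the joint multivariate inference.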

PROC MIANALYZE DATA=outreg edf=28;
   modeleffects Intercept RunTime RunPulse;
   test Intercept, RunTime=RunPulse / mult;
RUN;
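The combination step follows Rubin's rules. As a sketch in standard notation (the symbols are introduced here for illustration and do not appear in the code): for a single parameter, let $\hat{Q}_i$ be the estimate and $\hat{U}_i$ its estimated variance from imputation $i$, with $m$ imputations in total.

```latex
\begin{align*}
\bar{Q} &= \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i
  && \text{(combined point estimate)} \\
\bar{U} &= \frac{1}{m}\sum_{i=1}^{m}\hat{U}_i
  && \text{(within-imputation variance)} \\
B &= \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^2
  && \text{(between-imputation variance)} \\
T &= \bar{U} + \Bigl(1+\frac{1}{m}\Bigr)B
  && \text{(total variance)} \\
v_m &= (m-1)\left[1+\frac{\bar{U}}{\bigl(1+\frac{1}{m}\bigr)B}\right]^{2}
  && \text{(degrees of freedom)}
\end{align*}
```

Inference is then based on $(Q-\bar{Q})/\sqrt{T}$ referred to a $t$ distribution. The between-imputation term $B$ is what carries the extra uncertainty due to imputation. When complete-data degrees of freedom are supplied through EDF=, the procedure replaces $v_m$ with a small-sample adjusted value, which matters for a dataset as small as this one.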

This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
