
Regression Analysis on Imputed Data with PROC MIANALYZE

The process begins with the creation of a 'Fitness1' dataset containing physical fitness measurements with intentionally introduced missing values. Then, PROC MI is used to generate several complete datasets through multiple imputation. A regression analysis (PROC REG) is performed on each imputed dataset. Finally, PROC MIANALYZE combines the results of these regressions to produce valid statistical estimates and tests that account for the uncertainty associated with imputation.
Data Analysis

Type: internally created data


The data is created directly within the script via a DATA step with a DATALINES statement. The dataset is named Fitness1.

1 Code Block
DATA STEP Data
Explanation :
This block creates the 'Fitness1' table from internal data provided via 'datalines'. Missing values (represented by periods '.') are intentionally included in all three variables: Oxygen, RunTime, and RunPulse.
DATA Fitness1;
   /* The trailing @@ holds the input line, so each data line supplies two observations */
   INPUT Oxygen RunTime RunPulse @@;
   DATALINES;
44.609 11.37 178 45.313 10.07 185
54.297 8.65 156 59.571 . .
49.874 9.22 . 44.811 11.63 176
. 11.95 176 . 10.85 .
39.442 13.08 174 60.055 8.63 170
50.541 . . 37.388 14.03 186
44.754 11.12 176 47.273 . .
51.855 10.33 166 49.156 8.95 180
40.836 10.95 168 46.672 10.00 .
46.774 10.25 . 50.388 10.08 168
39.407 12.63 174 46.080 11.17 156
45.441 9.63 164 . 8.92 .
45.118 11.08 . 39.203 12.88 168
45.790 10.47 186 50.545 9.93 148
48.673 9.40 186 47.920 11.50 170
47.467 10.50 170
;
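Before imputing, it can help to see which variables tend to be missing together. The following is a minimal sketch, assuming the 'Fitness1' table created above: with NIMPUTE=0, PROC MI creates no imputations and limits its output to descriptive tables such as the missing data patterns.

```sas
/* Sketch: inspect the missing data patterns without imputing anything */
PROC MI DATA=Fitness1 nimpute=0;
   var Oxygen RunTime RunPulse;
RUN;
```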
2 Code Block
PROC MI Data
Explanation :
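The MI (Multiple Imputation) procedure handles the missing data. It generates several complete datasets by replacing each missing value with a plausible value. Because NIMPUTE= is not specified, the procedure's default number of imputations is used. The imputed datasets are stacked in the 'outmi' table and distinguished by the '_Imputation_' variable. SEED= makes the imputations reproducible, and NOPRINT suppresses the displayed output.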
PROC MI DATA=Fitness1 seed=3237851 noprint out=outmi;
   var Oxygen RunTime RunPulse;
RUN;
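A quick way to check the imputation step is to summarize 'outmi' by imputation. This is a minimal sketch, assuming the PROC MI step above has run and produced the '_Imputation_' index variable; if the imputation succeeded, NMISS should be 0 for every variable in every group.

```sas
/* Sketch: confirm that each imputed dataset is complete */
PROC MEANS DATA=outmi n nmiss mean;
   BY _Imputation_;
   var Oxygen RunTime RunPulse;
RUN;
```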
3 Code Block
PROC REG Data
Explanation :
A linear regression (PROC REG) models 'Oxygen' as a function of 'RunTime' and 'RunPulse'. The 'by _Imputation_' statement runs a separate regression for each dataset imputed by PROC MI. OUTEST= saves the parameter estimates from each model in the 'outreg' table, and COVOUT adds the corresponding covariance matrices, which PROC MIANALYZE needs in order to compute standard errors.
PROC REG DATA=outmi outest=outreg covout noprint;
   model Oxygen = RunTime RunPulse;
   BY _Imputation_;
RUN;
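To see what PROC MIANALYZE will read, the 'outreg' table can be printed. This is a minimal sketch, assuming the PROC REG step above has run: with COVOUT, each imputation contributes one PARMS row of estimates followed by one COV row per parameter, so the first 8 observations cover two imputations of this three-parameter model.

```sas
/* Sketch: list the estimates and covariance rows for the first two imputations */
PROC PRINT DATA=outreg(obs=8);
   var _Imputation_ _Type_ _Name_ Intercept RunTime RunPulse;
RUN;
```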
4 Code Block
PROC MIANALYZE
Explanation :
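The MIANALYZE procedure combines the regression results from the imputed datasets. It reads the parameter estimates and covariance matrices from 'outreg' and produces a single set of inferences (parameter estimates, standard errors, tests) that accounts for the variability due to imputation. MODELEFFECTS lists the parameters to combine. EDF=28 supplies the complete-data error degrees of freedom (31 observations minus 3 regression parameters). The TEST statement tests the hypotheses Intercept=0 and RunTime=RunPulse, and the MULT option requests multivariate inference for the two hypotheses jointly.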
PROC MIANALYZE DATA=outreg edf=28;
   modeleffects Intercept RunTime RunPulse;
   test Intercept, RunTime=RunPulse / mult;
RUN;
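The combination step follows the standard multiple-imputation combining rules (Rubin's rules), sketched below for a single parameter. The notation is an assumption introduced here for illustration and does not appear in the SAS output: m is the number of imputations, \hat{Q}_i and \hat{W}_i are the estimate and its variance from the i-th imputed dataset.

```latex
\begin{aligned}
\bar{Q} &= \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i
  && \text{(combined point estimate)} \\
\bar{W} &= \frac{1}{m}\sum_{i=1}^{m}\hat{W}_i
  && \text{(within-imputation variance)} \\
B &= \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^{2}
  && \text{(between-imputation variance)} \\
T &= \bar{W} + \Bigl(1+\frac{1}{m}\Bigr)B
  && \text{(total variance)} \\
\nu_m &= (m-1)\left[1+\frac{\bar{W}}{\bigl(1+\frac{1}{m}\bigr)B}\right]^{2}
  && \text{(degrees of freedom for the } t \text{ reference distribution)}
\end{aligned}
```

The reported standard error is the square root of T. When EDF= is specified, as in the step above, the procedure applies a small-sample adjustment to these degrees of freedom that takes the complete-data degrees of freedom into account; that adjustment is not written out here.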
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
Copyright Info: SAS SAMPLE LIBRARY