
Example of multiple imputation with PROC MI

The script first creates a dataset named 'Fitness1' containing fitness measurements (oxygen intake, run time, and pulse rate while running) in which some values have been arbitrarily set to missing. The `PROC MI` procedure then uses an MCMC (Markov chain Monte Carlo) method to generate multiple imputations for the missing values and writes the result to a new dataset, 'outmi'. Finally, the first 10 observations of the imputed dataset are printed for verification.
Data Analysis

Type : INTERNAL_CREATION


Data are created directly in the script via a DATA step with the `datalines` statement. The 'Fitness1' dataset contains fitness measurements.

1 Code Block
DATA STEP Data
Explanation :
This code block creates the 'Fitness1' table. It reads data embedded directly in the program after the 'datalines' statement. The double trailing at sign ('@@') at the end of the INPUT statement holds the current data line across iterations of the DATA step, so several observations can be read from the same line. A period ('.') in the data marks a missing value.
DATA Fitness1;
   INPUT Oxygen RunTime RunPulse @@;
   DATALINES;
44.609 11.37 178 45.313 10.07 185
54.297 8.65 156 59.571 . .
49.874 9.22 . 44.811 11.63 176
. 11.95 176 . 10.85 .
39.442 13.08 174 60.055 8.63 170
50.541 . . 37.388 14.03 186
44.754 11.12 176 47.273 . .
51.855 10.33 166 49.156 8.95 180
40.836 10.95 168 46.672 10.00 .
46.774 10.25 . 50.388 10.08 168
39.407 12.63 174 46.080 11.17 156
45.441 9.63 164 . 8.92 .
45.118 11.08 . 39.203 12.88 168
45.790 10.47 186 50.545 9.93 148
48.673 9.40 186 47.920 11.50 170
47.467 10.50 170
;
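Before imputing, it can be useful to check how much data is actually missing in each variable. The following is a minimal sketch that only assumes the 'Fitness1' table created above; it is not part of the original sample.

```sas
/* Sketch: count non-missing (N) and missing (NMISS) values  */
/* for each analysis variable, plus basic summary statistics */
/* computed from the observed values only.                   */
proc means data=Fitness1 n nmiss mean std;
   var Oxygen RunTime RunPulse;
run;
```

The NMISS column shows how many values PROC MI will have to fill in for each variable.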
2 Code Block
PROC MI Data
Explanation :
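This procedure performs multiple imputation on the 'Fitness1' dataset. The 'seed' option sets the random number seed so that the imputations are reproducible. The 'mu0' option supplies null-hypothesis values for the means of the analysis variables (50, 10, and 180, in the order of the 'var' statement); PROC MI uses them in the t tests it reports for the combined mean estimates, not as starting values for the imputation algorithm. The 'mcmc' statement requests the Markov chain Monte Carlo method. The 'var' statement lists the variables to analyze and impute: 'Oxygen', 'RunTime', and 'RunPulse'. The imputed data sets are stacked in the 'outmi' table and indexed by the automatic variable '_Imputation_'.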
PROC MI DATA=Fitness1 seed=501213 mu0=50 10 180 out=outmi;
   mcmc;
   var Oxygen RunTime RunPulse;
RUN;
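The call above relies on the procedure's defaults for the number of imputations and for the chain settings. As a hedged sketch, the same imputation can be written with those settings spelled out. The values chosen for 'nimpute', 'nbiter', and 'niter' are illustrative assumptions, and 'outmi5' is a hypothetical output name used so that the original 'outmi' table is not overwritten.

```sas
/* Sketch: same model, with the number of imputations and the  */
/* MCMC chain settings stated explicitly (illustrative values). */
proc mi data=Fitness1 seed=501213 nimpute=5 mu0=50 10 180 out=outmi5;
   /* single chain, 200 burn-in iterations before the first     */
   /* imputation, 100 iterations between successive imputations */
   mcmc chain=single nbiter=200 niter=100;
   var Oxygen RunTime RunPulse;
run;
```

Stating 'nimpute' explicitly may be worthwhile because the default number of imputations has differed between SAS/STAT releases.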
3 Code Block
PROC PRINT
Explanation :
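This block prints the first 10 rows ('obs=10' data set option) of the 'outmi' dataset, which holds the observed values together with the values imputed by PROC MI. Because the imputations are stacked and each one contains all 31 observations, these rows all belong to the first imputation. A title labels the output.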
PROC PRINT DATA=outmi (obs=10);
   title 'First 10 Observations of the Imputed Data Set';
RUN;
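Printing ten rows only shows part of the first imputation. A complementary check, sketched here under the assumption that 'outmi' was produced by the PROC MI step above, is to summarize each completed data set separately using the '_Imputation_' index variable.

```sas
/* Sketch: per-imputation summary of the completed data.       */
/* _Imputation_ is the index variable PROC MI adds to OUT=.    */
title 'Summary Statistics by Imputation';
proc means data=outmi n mean std;
   class _Imputation_;
   var Oxygen RunTime RunPulse;
run;
title;   /* clear the title so it does not carry over */
```

Each imputation should report N = 31 for every variable, and the means should vary only modestly from one imputation to the next.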
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
Copyright Info : SAS SAMPLE LIBRARY