
Multiple Imputation Example with PROC MI

The script begins by creating the 'Fitness1' dataset from inline data (datalines): fitness measurements with some values deliberately set to missing. PROC MI is then run with the MCMC method and the impute=monotone option, which imputes only as many missing values as are needed to give each imputed dataset a monotone missing pattern, and writes the result to the 'outex10' table. A second PROC MI call on 'outex10' uses nimpute=0, so no further imputation is performed and the procedure only reports on the data it is given, such as the missing data patterns.
Data Analysis

Type : INTERNAL_CREATION


The data is created directly in the script via a DATA step with a DATALINES statement. The 'Fitness1' dataset contains fitness measurements (Oxygen, RunTime, RunPulse) for several individuals.

1 Code Block
DATA STEP Data
Explanation :
This DATA step creates the 'Fitness1' table. The 'input' statement reads three numeric variables (Oxygen, RunTime, RunPulse). The '@@' (double trailing at sign) holds the current input line across iterations of the DATA step, so several observations can be read from a single data line; here each full line carries two observations. A period in the data marks a missing value. The data is supplied directly in the code via the 'datalines' statement.
/* Fitness measurements; a period (.) marks a missing value. */
data Fitness1;
   /* @@ holds the line so two observations are read per data line. */
   input Oxygen RunTime RunPulse @@;
   datalines;
44.609  11.37  178     45.313  10.07  185
54.297   8.65  156     59.571    .      .
49.874   9.22    .     44.811  11.63  176
  .     11.95  176       .     10.85    .
39.442  13.08  174     60.055   8.63  170
50.541    .      .     37.388  14.03  186
44.754  11.12  176     47.273    .      .
51.855  10.33  166     49.156   8.95  180
40.836  10.95  168     46.672  10.00    .
46.774  10.25    .     50.388  10.08  168
39.407  12.63  174     46.080  11.17  156
45.441   9.63  164       .      8.92    .
45.118  11.08    .     39.203  12.88  168
45.790  10.47  186     50.545   9.93  148
48.673   9.40  186     47.920  11.50  170
47.467  10.50  170
;
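Before imputing, it can help to confirm that the DATA step read the data as intended. The following is a minimal sketch, not part of the original sample: it assumes 'Fitness1' was created as above and uses PROC MEANS to report the number of non-missing (N) and missing (NMISS) values per variable.

```sas
/* Sketch (not in the original sample): count non-missing and */
/* missing values for each variable in Fitness1.              */
proc means data=Fitness1 n nmiss;
   var Oxygen RunTime RunPulse;
run;
```

By a hand count of the data lines above, the table holds 31 observations, with 3 missing values each for Oxygen and RunTime and 9 for RunPulse.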
2 Code Block
PROC MI
Explanation :
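This PROC MI step processes the 'Fitness1' dataset. The 'mcmc impute=monotone' statement uses the MCMC (Markov Chain Monte Carlo) method to impute only enough missing values for each imputed dataset to have a monotone missing pattern; the remaining missing values are left in place. The 'seed' option fixes the random number seed so that the imputation is reproducible. The imputed datasets are stacked in the 'outex10' output table and indexed by the automatic variable _Imputation_. Because 'nimpute' is not specified, the number of imputations is the procedure's default.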
proc mi data=Fitness1 seed=17655417 out=outex10;
   /* Impute only enough values to reach a monotone missing pattern. */
   mcmc impute=monotone;
   var Oxygen RunTime RunPulse;
run;
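Since impute=monotone fills in only part of the missing data, 'outex10' should still contain missing values. One way to see this is sketched below; it is not part of the original sample and assumes only that 'outex10' carries the _Imputation_ index variable that PROC MI writes to its OUT= dataset.

```sas
/* Sketch (not in the original sample): missing-value counts */
/* within each imputed dataset stored in outex10.            */
proc means data=outex10 n nmiss;
   class _Imputation_;
   var Oxygen RunTime RunPulse;
run;
```

Each level of _Imputation_ gets its own row group, so any values that remain missing after the MCMC step are visible per imputation.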
3 Code Block
PROC MI
Explanation :
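PROC MI is called again, this time on the 'outex10' table, which holds the imputed data. The 'nimpute=0' option means that no imputation is performed, so the 'seed' value is not needed for any imputation here; the procedure only prints its descriptive tables, such as the missing data patterns. In this script that serves to inspect the missing pattern left by the first step. Because there is no BY statement, all imputed datasets stacked in 'outex10' are treated as a single dataset. This call does not combine results across imputations; that is the job of PROC MIANALYZE.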
proc mi data=outex10 seed=15541 nimpute=0;
   var Oxygen RunTime RunPulse;
run;
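To inspect each imputed dataset separately rather than the stacked table as a whole, a BY statement on _Imputation_ can be added. This variant is a sketch and not part of the original sample; it relies on 'outex10' being ordered by _Imputation_, which is how PROC MI writes its OUT= dataset.

```sas
/* Sketch (not in the original sample): report missing data  */
/* patterns separately for each imputed dataset in outex10.  */
proc mi data=outex10 nimpute=0;
   var Oxygen RunTime RunPulse;
   by _Imputation_;
run;
```

The 'seed' option is omitted because no values are imputed when nimpute=0.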
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
Copyright Info : SAS SAMPLE LIBRARY