
Binomial Data Modeling with PROC HPGENSELECT

The script analyzes data on ingots to model the probability that an ingot is not ready for rolling as a function of heating time (Heat) and soaking time (Soak). It illustrates two approaches: first fitting the grouped data with the events/trials (r/n) syntax, then expanding the data into binary format (one row per ingot) and fitting an equivalent binary model.
Data Analysis

Type : INTERNAL_CREATION


Data is created directly within the script via a DATALINES statement in a DATA step. There are no dependencies on external files or SASHELP tables.

1 Code Block
DATA STEP Data
Explanation :
This block creates the 'Ingots' table by reading embedded data with a DATALINES statement. The variables are the heating time (Heat), the soaking time (Soak), the number of events (r, ingots not ready for rolling), and the total number of trials (n). The trailing '@@' on the INPUT statement holds the current data line across iterations of the DATA step, so several observations can be read from the same line. An 'Obsnum' identifier is set from the automatic variable _n_ so that rows can be matched back to the input data later.
DATA Ingots;
   INPUT Heat Soak r n @@;
   Obsnum= _n_;
   DATALINES;
7 1.0 0 10 14 1.0 0 31 27 1.0 1 56 51 1.0 3 13
7 1.7 0 17 14 1.7 0 43 27 1.7 4 44 51 1.7 0 1
7 2.2 0 7 14 2.2 2 33 27 2.2 0 21 51 2.2 0 1
7 2.8 0 12 14 2.8 0 31 27 2.8 1 22 51 4.0 0 1
7 4.0 0 9 14 4.0 0 19 27 4.0 1 16
;

2 Code Block
PROC HPGENSELECT
Explanation :
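This procedure fits a logistic regression model to the grouped data. The events/trials syntax 'model r/n' declares a binomial response, where 'r' is the number of events and 'n' is the number of trials. The model includes the main effects of Heat and Soak and their interaction. The distribution is set explicitly with 'dist=Binomial'; the logit link is the default for this distribution. The OUTPUT statement saves the linear predictor (xbeta) and the predicted values (named 'Pred') to the 'Out' table, and the ID statement carries 'Obsnum' into that table so that its rows can be matched back to the input data.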
PROC HPGENSELECT DATA=Ingots;
   model r/n = Heat Soak Heat*Soak / dist=Binomial;
   id Obsnum;
   OUTPUT out=Out xbeta predicted=Pred;
RUN;
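For reference, the model specified by this step can be written out as follows. The index i (one row of 'Ingots'), the probability p_i, and the coefficient names β0 to β3 are notation introduced here for illustration; they do not appear in the SAS code.

```latex
r_i \sim \mathrm{Binomial}(n_i,\, p_i), \qquad
\log\frac{p_i}{1 - p_i}
  = \beta_0
  + \beta_1\,\mathrm{Heat}_i
  + \beta_2\,\mathrm{Soak}_i
  + \beta_3\,\mathrm{Heat}_i \cdot \mathrm{Soak}_i
```

Here p_i is the probability that an ingot in row i is not ready for rolling, and the right-hand side is the linear predictor requested by the xbeta keyword.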
3 Code Block
DATA STEP Data
Explanation :
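This block merges the 'Out' output table (containing the predictions) with the original 'Ingots' table by the 'Obsnum' identifier, so that each prediction is matched to its input row rather than relying on row order. A BY merge requires both tables to be in 'Obsnum' order; if 'Out' is not, sort it with PROC SORT first.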
DATA Out;
   MERGE Out Ingots;
   BY Obsnum;
RUN;
4 Code Block
PROC PRINT
Explanation :
Displays observations from the merged 'Out' table for a specific combination of 'Heat' and 'Soak', allowing verification of model results for a particular case.
PROC PRINT DATA=Out;
   where Heat=14 & Soak=1.7;
RUN;

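One way to read the printed row is to place the observed proportion r/n next to the fitted value. The sketch below assumes the merged 'Out' table from the previous steps and assumes that the linear predictor was written under the default name 'Xbeta'; the table 'Compare' and the variables 'Observed' and 'InvLogit' are hypothetical names introduced here.

```sas
/* Sketch only: relies on the merged table Out created above.        */
/* Assumption: the xbeta keyword produced a variable named Xbeta.    */
data Compare;
   set Out;
   Observed = r / n;                    /* observed proportion not ready  */
   InvLogit = 1 / (1 + exp(-Xbeta));    /* inverse logit of linear score  */
run;

proc print data=Compare;
   where Heat=14 & Soak=1.7;
   var Heat Soak r n Observed Pred InvLogit;
run;
```

If 'Pred' is reported on the probability scale, 'InvLogit' should agree with it; that is worth confirming on your own output rather than taking for granted.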
5 Code Block
DATA STEP Data
Explanation :
This DATA step transforms grouped data (events/trials format) into an individual binary format. For each row in the 'Ingots' table, it generates 'n' new rows. The 'Y' variable is set to 1 for the first 'r' cases (events) and to 0 for the others, creating one row per ingot tested.
DATA Ingots_binary;
   SET Ingots;
   DO i=1 to n;
      IF i <= r THEN Y=1; ELSE Y = 0;
      OUTPUT;
   END;
RUN;
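A quick sanity check on the expansion is to confirm that the binary table has as many rows as there are trials and as many Y=1 rows as there are events. This is a minimal sketch that assumes the 'Ingots' and 'Ingots_binary' tables created above; the column aliases are hypothetical names.

```sas
/* Sketch only: relies on Ingots and Ingots_binary from the steps above. */
proc sql;
   /* Totals in the grouped data */
   select sum(n) as total_trials, sum(r) as total_events
      from Ingots;
   /* Matching totals in the expanded data */
   select count(*) as n_rows, sum(Y) as n_events
      from Ingots_binary;
quit;
```

With the data listed above, both queries should report 387 trials and 12 events.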
6 Code Block
PROC HPGENSELECT
Explanation :
This procedure fits a logistic model equivalent to the first one, but on the data in binary format. The response variable is 'Y', and the response option event='1' specifies that Y=1 is the event being modeled. The distribution is set to 'Binary'. The parameter estimates and standard errors should match those of the grouped-data model; likelihood-based fit statistics may differ, because the two likelihoods differ by a constant that does not depend on the parameters.
PROC HPGENSELECT DATA=Ingots_binary;
   model Y(event='1') = Heat Soak Heat*Soak / dist=Binary;
RUN;

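The equivalence of the two fits can be seen by comparing the likelihoods. The notation below is introduced here for illustration: i indexes rows of 'Ingots', j indexes the n_i expanded rows within row i, y_ij is the value of Y, and p_i is the event probability implied by the linear predictor.

```latex
L_{\text{grouped}}(\beta)
  = \prod_i \binom{n_i}{r_i}\, p_i^{\,r_i} (1 - p_i)^{\,n_i - r_i},
\qquad
L_{\text{binary}}(\beta)
  = \prod_i \prod_{j=1}^{n_i} p_i^{\,y_{ij}} (1 - p_i)^{\,1 - y_{ij}}
  = \prod_i p_i^{\,r_i} (1 - p_i)^{\,n_i - r_i}

\log L_{\text{grouped}}(\beta)
  = \log L_{\text{binary}}(\beta) + \sum_i \log \binom{n_i}{r_i}
```

The last term does not involve β, so both likelihoods are maximized by the same coefficients, which is why the two procedure calls are expected to agree on the estimates.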
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
Copyright Info : SAS SAMPLE LIBRARY