regression

logistic

Description

The logistic action fits logistic regression models for binary, binomial, and multinomial response data in SAS Viya. It provides a comprehensive set of tools for statistical modeling, including various link functions (Logit, Probit, Cloglog), and supports both classification and continuous variables. The action is highly customizable, offering multiple model selection methods like Forward, Backward, and Stepwise, as well as modern techniques like LASSO and Elastic Net for handling high-dimensional data. It can generate a rich set of output tables, including parameter estimates, odds ratios, fit statistics, and scoring code for model deployment.

regression.logistic <result=results> <status=rc> / alpha=double, applyRowOrder=TRUE | FALSE, association=TRUE | FALSE, attributes={{casinvardesc-1} <, {casinvardesc-2}, ...>}, binEps=double, class={{classStatement-1} <, {classStatement-2}, ...>}, classGlobalOpts={classopts}, classLevelsPrint=TRUE | FALSE, clb=TRUE | FALSE | "WALD" | "PL", code={aircodegen}, collection={{collection-1} <, {collection-2}, ...>}, corrB=TRUE | FALSE, covB=TRUE | FALSE, ctable={ctableOptions}, display={displayTables}, fitData=TRUE | FALSE, freq="variable-name", inputs={{casinvardesc-1} <, {casinvardesc-2}, ...>}, lackfit={lackfitOptions}, lsmeans={{lsmeansStatement-1} <, {lsmeansStatement-2}, ...>}, maxOptBatch=64-bit-integer | "AUTO", maxResponseLevels=integer, model={logisticModel}, multimember={{multimember-1} <, {multimember-2}, ...>}, multipass=TRUE | FALSE, nClassLevelsPrint=integer, noCheck=TRUE | FALSE, nominals={{casinvardesc-1} <, {casinvardesc-2}, ...>}, normalize=TRUE | FALSE, nostderr=TRUE | FALSE, noxpx=TRUE | FALSE, oddsratio={oddsratioOptions}, optimization={optimizationStatement}, output={logisticOutputStatement}, outputTables={outputTables}, parmEstLevDetails="NONE" | "RAW" | "RAW_AND_FORMATTED", partByFrac={partByFracStatement}, partByVar={partByVarStatement}, partFit=TRUE | FALSE, plConv=double, plMaxIter=integer, plSingular=double, polynomial={{polynomial-1} <, {polynomial-2}, ...>}, repeated={{logisticModelRepeated-1} <, {logisticModelRepeated-2}, ...>}, restore={castable}, seed=64-bit-integer, selection={selectionStatement}, spline={{spline-1} <, {spline-2}, ...>}, ss3=TRUE | FALSE, stb=TRUE | FALSE, store={casouttable}, storetext={"string-1" <, "string-2", ...>}, table={castable}, target="string", useLastIter=TRUE | FALSE, weight="variable-name", weightNorm=TRUE | FALSE ;
Settings
Parameter: Description
alpha: Specifies the significance level for constructing all confidence intervals.
class: Specifies the classification variables to be used as explanatory variables in the analysis.
model: Defines the model to be fit, including the dependent variable(s) and explanatory effects.
selection: Specifies the method for model selection, such as FORWARD, BACKWARD, STEPWISE, or LASSO.
output: Creates an output CAS table containing observation-wise statistics like predicted values and residuals.
store: Saves the fitted model to a CAS table as a binary object for later scoring or analysis.
ctable: Generates a classification table to evaluate model performance, including statistics like accuracy, sensitivity, and specificity.
oddsratio: Computes and displays odds ratios for specified variables, which is useful for interpreting the effect of predictors.
lackfit: Performs the Hosmer and Lemeshow goodness-of-fit test to assess how well the model fits the data.
repeated: Specifies options for analyzing repeated measures data, defining subject and correlation structures.
weight: Specifies a variable to use for weighting the observations in the analysis.
freq: Specifies a variable that contains the frequency of occurrence for each observation.
partByFrac: Partitions the input data by specifying fractions for training, validation, and testing sets.
partByVar: Partitions the data based on the values of a specified variable.
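As a rough illustration of how a few of these settings combine in one call, the sketch below sets alpha and saves the fitted model with store. It assumes the 'getstarted' sample table from the Data Preparation section has already been loaded; the table name 'logistic_model' is a hypothetical placeholder, and the name/replace sub-options are assumed from the usual CAS output-table form rather than confirmed here.

```sas
proc cas;
   /* Sketch only: 'logistic_model' is a placeholder name for the stored model */
   regression.logistic /
      table={name='getstarted'},
      class={'Sex'},
      model={depvars={{name='Status', options={event='Dead'}}},
             effects={'Sex', 'Age', 'Smoking'}},
      alpha=0.1,                                    /* 90% confidence intervals */
      store={name='logistic_model', replace=true};  /* save the model as a binary object */
run;
quit;
```

The stored table is meant for later scoring or analysis, as described for the store setting above.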
Data Preparation
Creating Sample Data for Logistic Regression

This SAS code snippet creates a sample CAS table named 'getstarted'. The table contains information about patients, including their survival status, gender, age, cholesterol level, and smoking habits. This dataset is suitable for demonstrating how to fit a logistic regression model to predict a binary outcome.

DATA casuser.getstarted;
   INPUT STATUS $ Sex $ Age Cholesterol Smoking;
   DATALINES;
Dead Male 55 220 20
Alive Female 55 180 10
Dead Male 65 240 30
Alive Female 45 170 5
Dead Female 70 260 15
Alive Male 48 210 0
;
RUN;
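
The DATA step writes to a libref named casuser, so a CAS session and that libref need to exist before it runs. A minimal setup sketch follows, assuming no session is active yet; the session name mysess is an arbitrary placeholder, and many environments already perform this step automatically.

```sas
/* Start a CAS session; "mysess" is a placeholder session name */
cas mysess;

/* Point a libref named casuser at the CASUSER caslib so that
   DATA casuser.getstarted creates an in-memory CAS table */
libname casuser cas caslib=casuser;
```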

Examples

This example demonstrates a basic logistic regression analysis. It uses the 'getstarted' table and models the binary 'Status' variable, with 'Dead' as the event of interest. The model includes 'Sex' as a classification variable and 'Age' and 'Smoking' as continuous explanatory variables.

SAS® / CAS Code (awaiting community validation)
PROC CAS;
   /* Parameters follow the slash after the action name */
   regression.logistic /
      TABLE={name='getstarted'},
      class={'Sex'},
      /* Model the probability that Status is 'Dead' */
      model={depvars={{name='Status', options={event='Dead', order='FORMATTED'}}},
             effects={'Sex', 'Age', 'Smoking'}};
RUN;
Result:
The action returns several tables, including 'ModelInfo' describing the model settings, 'ConvergenceStatus' indicating if the optimization succeeded, and 'ParameterEstimates' showing the estimated coefficients, standard errors, and p-values for each predictor in the model.
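
To work with these tables programmatically, the result= and status= options shown in the syntax can capture what the action returns. The sketch below is one possible way to do that; the variable names r and rc are arbitrary, and the table name 'ParameterEstimates' is taken from the description above.

```sas
proc cas;
   /* Capture the returned tables in r and the return status in rc */
   regression.logistic result=r status=rc /
      table={name='getstarted'},
      class={'Sex'},
      model={depvars={{name='Status', options={event='Dead', order='FORMATTED'}}},
             effects={'Sex', 'Age', 'Smoking'}};

   describe r;                   /* list the result tables that came back */
   print r.ParameterEstimates;   /* print a single table by name */
   print rc;                     /* show the status information for the call */
run;
quit;
```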

This example performs a logistic regression with stepwise model selection to identify the most significant predictors. It also demonstrates how to generate an output dataset with predicted probabilities and request odds ratios for the final selected model's variables.

SAS® / CAS Code (awaiting community validation)
PROC CAS;
   /* Parameters follow the slash after the action name */
   regression.logistic /
      TABLE='getstarted',
      class={'Sex'},
      model={depvar={{name='Status', options={event='Dead'}}},
             effects={'Sex', 'Age', 'Cholesterol', 'Smoking'}},
      /* Stepwise selection with full per-step details */
      selection={method='STEPWISE', details='ALL'},
      /* Write predicted probabilities and observation roles to a CAS table */
      OUTPUT={casOut={name='logistic_output', replace=true}, pred='predProb', role='role'},
      oddsratio={vars={'Sex', 'Age', 'Smoking'}};
RUN;
Result:
The output will include detailed step-by-step information from the selection process, showing which variables are added or removed. It also produces tables for odds ratios and a new CAS table named 'logistic_output' containing the predicted probabilities ('predProb') and the role of each observation.
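
One way to inspect the new table is to fetch its first rows with the table.fetch action. This is a small sketch that assumes 'logistic_output' was written to the active caslib; the row limit of 10 is arbitrary.

```sas
proc cas;
   /* Retrieve up to the first 10 rows of the output table */
   table.fetch / table={name='logistic_output'}, to=10;
run;
quit;
```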

This example fits a generalized logit model for a multinomial response variable. It uses the 'getstarted' table and models 'Smoking', treated as a categorical response, as a function of 'Age' and 'Sex'. This approach is useful when the response variable has more than two unordered categories.

SAS® / CAS Code (awaiting community validation)
PROC CAS;
   /* Parameters follow the slash after the action name */
   regression.logistic /
      TABLE='getstarted',
      class={'Sex'},
      /* GLOGIT requests the generalized logit link for an unordered response */
      model={depvar={{name='Smoking'}},
             effects={'Sex', 'Age'},
             link='GLOGIT'};
RUN;
Result:
The results will include parameter estimates for each level of the 'Smoking' response variable compared to a reference level. It provides insights into how 'Age' and 'Sex' influence the odds of being in one smoking category versus another.

FAQ

What is the primary function of the 'logistic' action in SAS Viya?
How do you specify the model to be fitted using the 'logistic' action?
What model selection methods are available in the 'logistic' action?
Can I save the results of a logistic regression model for later use?
How can I handle classification variables within the 'logistic' action?
Is it possible to perform a repeated measures analysis with the 'logistic' action?