regression

logistic

Description

The logistic action fits logistic regression models for binary, binomial, and multinomial response data in SAS Viya. It supports several link functions (logit, probit, and complementary log-log) and accepts both classification and continuous explanatory variables. Model selection methods include forward, backward, and stepwise selection, as well as LASSO and elastic net for high-dimensional data. Output tables include parameter estimates, odds ratios, and fit statistics, and the action can generate scoring code for model deployment.
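For reference, the three link functions named above can be written out as follows. This is the standard textbook form, not text taken from the SAS documentation; the symbols $\pi$ (event probability), $\mathbf{x}$ (covariate vector), and $\boldsymbol{\beta}$ (coefficients) are notation introduced here for illustration.

```latex
% pi = Pr(event | x), eta = x' beta is the linear predictor
\begin{align*}
\text{logit:}   \quad & \log\frac{\pi}{1-\pi} = \mathbf{x}^{\top}\boldsymbol{\beta} \\
\text{probit:}  \quad & \Phi^{-1}(\pi) = \mathbf{x}^{\top}\boldsymbol{\beta} \\
\text{cloglog:} \quad & \log\bigl(-\log(1-\pi)\bigr) = \mathbf{x}^{\top}\boldsymbol{\beta}
\end{align*}
```

Here $\Phi^{-1}$ is the inverse of the standard normal cumulative distribution function.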

regression.logistic <result=results> <status=rc> / alpha=double, applyRowOrder=TRUE | FALSE, association=TRUE | FALSE, attributes={{casinvardesc-1} <, {casinvardesc-2}, ...>}, binEps=double, class={{classStatement-1} <, {classStatement-2}, ...>}, classGlobalOpts={classopts}, classLevelsPrint=TRUE | FALSE, clb=TRUE | FALSE | "WALD" | "PL", code={aircodegen}, collection={{collection-1} <, {collection-2}, ...>}, corrB=TRUE | FALSE, covB=TRUE | FALSE, ctable={ctableOptions}, display={displayTables}, fitData=TRUE | FALSE, freq="variable-name", inputs={{casinvardesc-1} <, {casinvardesc-2}, ...>}, lackfit={lackfitOptions}, lsmeans={{lsmeansStatement-1} <, {lsmeansStatement-2}, ...>}, maxOptBatch=64-bit-integer | "AUTO", maxResponseLevels=integer, model={logisticModel}, multimember={{multimember-1} <, {multimember-2}, ...>}, multipass=TRUE | FALSE, nClassLevelsPrint=integer, noCheck=TRUE | FALSE, nominals={{casinvardesc-1} <, {casinvardesc-2}, ...>}, normalize=TRUE | FALSE, nostderr=TRUE | FALSE, noxpx=TRUE | FALSE, oddsratio={oddsratioOptions}, optimization={optimizationStatement}, output={logisticOutputStatement}, outputTables={outputTables}, parmEstLevDetails="NONE" | "RAW" | "RAW_AND_FORMATTED", partByFrac={partByFracStatement}, partByVar={partByVarStatement}, partFit=TRUE | FALSE, plConv=double, plMaxIter=integer, plSingular=double, polynomial={{polynomial-1} <, {polynomial-2}, ...>}, repeated={{logisticModelRepeated-1} <, {logisticModelRepeated-2}, ...>}, restore={castable}, seed=64-bit-integer, selection={selectionStatement}, spline={{spline-1} <, {spline-2}, ...>}, ss3=TRUE | FALSE, stb=TRUE | FALSE, store={casouttable}, storetext={"string-1" <, "string-2", ...>}, table={castable}, target="string", useLastIter=TRUE | FALSE, weight="variable-name", weightNorm=TRUE | FALSE ;
Settings
Parameter Description
alpha Specifies the significance level for constructing all confidence intervals.
class Specifies the classification variables to be used as explanatory variables in the analysis.
model Defines the model to be fit, including the dependent variable(s) and explanatory effects.
selection Specifies the method for model selection, such as FORWARD, BACKWARD, STEPWISE, or LASSO.
output Creates an output CAS table containing observation-wise statistics like predicted values and residuals.
store Saves the fitted model to a CAS table as a binary object for later scoring or analysis.
ctable Generates a classification table to evaluate model performance, including statistics like accuracy, sensitivity, and specificity.
oddsratio Computes and displays odds ratios for specified variables, which is useful for interpreting the effect of predictors.
lackfit Performs the Hosmer and Lemeshow goodness-of-fit test to assess how well the model fits the data.
repeated Specifies options for analyzing repeated measures data, defining subject and correlation structures.
weight Specifies a variable to use for weighting the observations in the analysis.
freq Specifies a variable that contains the frequency of occurrence for each observation.
partByFrac Partitions the input data by specifying fractions for training, validation, and testing sets.
partByVar Partitions the data based on the values of a specified variable.
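As a rough illustration of how a few of these parameters combine in one call, the sketch below sets alpha and store alongside class and model. It assumes the 'getstarted' table from the Data Preparation section already exists in the active caslib; the table name 'logistic_model' is a hypothetical name chosen for this example.

```sas
proc cas;
   regression.logistic /
      table={name='getstarted'},
      alpha=0.1,                     /* 90% confidence intervals instead of the usual 95% */
      class={'Sex'},
      model={depvars={{name='Status', options={event='Dead'}}},
             effects={'Sex', 'Age'}},
      /* 'logistic_model' is a hypothetical table name for the stored model */
      store={name='logistic_model', replace=true};
quit;
```

The stored table holds the fitted model as a binary object, so it is meant to be consumed by scoring or follow-up analysis rather than read directly.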
Data Preparation
Creating Sample Data for Logistic Regression

This SAS code snippet creates a sample CAS table named 'getstarted'. The table contains information about patients, including their survival status, gender, age, cholesterol level, and smoking habits. This dataset is suitable for demonstrating how to fit a logistic regression model to predict a binary outcome.

/* Status and Sex are character variables; the remaining columns are numeric. */
DATA casuser.getstarted;
   INPUT Status $ Sex $ Age Cholesterol Smoking;
   DATALINES;
Dead Male 55 220 20
Alive Female 55 180 10
Dead Male 65 240 30
Alive Female 45 170 5
Dead Female 70 260 15
Alive Male 48 210 0
;
RUN;
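The DATA step above writes to a libref named casuser, so it presumably relies on a CAS session and a CAS engine libref that were set up beforehand. A minimal sketch of that setup follows, with a check that can be run after the DATA step; the session name 'mysess' is an arbitrary choice for this example, and the setup at your site may differ.

```sas
/* Run before the DATA step: start a CAS session.
   The session name 'mysess' is arbitrary (an assumption). */
cas mysess;

/* Bind the libref CASUSER to the CASUSER caslib so that
   DATA casuser.getstarted writes an in-memory CAS table. */
libname casuser cas caslib=casuser;

/* Run after the DATA step: confirm that the table exists
   and inspect its row and column counts. */
proc cas;
   table.tableInfo / caslib='casuser', name='getstarted';
quit;
```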

Examples

This example demonstrates a basic logistic regression analysis. It uses the 'getstarted' table and models the binary 'Status' variable, with 'Dead' as the event of interest. The model includes 'Sex' as a classification variable and 'Age' and 'Smoking' as continuous explanatory variables.

SAS® / CAS Code
PROC CAS;
   /* Model the probability that Status = 'Dead' */
   regression.logistic /
      table={name='getstarted'},
      class={'Sex'},
      model={depvars={{name='Status', options={event='Dead', order='FORMATTED'}}},
             effects={'Sex', 'Age', 'Smoking'}};
RUN;
Result:
The action returns several tables, including 'ModelInfo' describing the model settings, 'ConvergenceStatus' indicating if the optimization succeeded, and 'ParameterEstimates' showing the estimated coefficients, standard errors, and p-values for each predictor in the model.
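To work with these tables programmatically rather than only viewing them, the results can be captured with the result= option shown in the syntax. The sketch below is one way to do that; the variable name r is arbitrary, and the table key 'ParameterEstimates' is taken from the description above.

```sas
proc cas;
   /* Capture all result tables in the CASL dictionary r. */
   regression.logistic result=r /
      table={name='getstarted'},
      class={'Sex'},
      model={depvars={{name='Status', options={event='Dead'}}},
             effects={'Sex', 'Age', 'Smoking'}};

   /* List the keys (table names) that the action returned. */
   describe r;

   /* Print a single result table by its key. */
   print r.ParameterEstimates;
quit;
```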

This example performs a logistic regression with stepwise model selection to identify the most significant predictors. It also demonstrates how to generate an output dataset with predicted probabilities and request odds ratios for the final selected model's variables.

SAS® / CAS Code
PROC CAS;
   regression.logistic /
      table='getstarted',
      class={'Sex'},
      model={depvar={{name='Status', options={event='Dead'}}},
             effects={'Sex', 'Age', 'Cholesterol', 'Smoking'}},
      /* Stepwise selection with full per-step details */
      selection={method='STEPWISE', details='ALL'},
      /* Write predicted probabilities to a new CAS table */
      output={casOut={name='logistic_output', replace=true}, pred='predProb', role='role'},
      oddsratio={vars={'Sex', 'Age', 'Smoking'}};
RUN;
Result:
The output will include detailed step-by-step information from the selection process, showing which variables are added or removed. It also produces tables for odds ratios and a new CAS table named 'logistic_output' containing the predicted probabilities ('predProb') and the role of each observation.
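Once the action has run, the scored table can be inspected like any other CAS table. A minimal sketch, assuming 'logistic_output' was created in the active caslib by the call above:

```sas
proc cas;
   /* Fetch the first rows of the table written by the output parameter. */
   table.fetch / table={name='logistic_output'}, to=6;
quit;
```

The to=6 limit matches the six observations in the sample data; raise it for larger tables.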

This example fits a generalized logit model for a multinomial response variable. It uses the 'getstarted' table and treats 'Smoking' as a categorical response, modeled on 'Age' and 'Sex'. The generalized logit link is appropriate when the response variable has more than two unordered categories.

SAS® / CAS Code
PROC CAS;
   regression.logistic /
      table='getstarted',
      class={'Sex'},
      /* link='GLOGIT' requests a generalized logit model for a nominal response */
      model={depvar={{name='Smoking'}},
             effects={'Sex', 'Age'},
             link='GLOGIT'};
RUN;
Result:
The results will include parameter estimates for each level of the 'Smoking' response variable compared to a reference level. It provides insights into how 'Age' and 'Sex' influence the odds of being in one smoking category versus another.
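The comparison against a reference level can be made concrete with the standard form of the generalized logit model. The notation below is introduced for illustration and is not taken from the SAS documentation: $J$ is the number of response levels with level $J$ as the reference, and $\alpha_j$, $\boldsymbol{\beta}_j$ are the intercept and coefficients for level $j$.

```latex
\begin{align*}
\log\frac{\Pr(Y=j \mid \mathbf{x})}{\Pr(Y=J \mid \mathbf{x})}
  &= \alpha_j + \mathbf{x}^{\top}\boldsymbol{\beta}_j,
  \qquad j = 1, \dots, J-1, \\
\Pr(Y=j \mid \mathbf{x})
  &= \frac{\exp\bigl(\alpha_j + \mathbf{x}^{\top}\boldsymbol{\beta}_j\bigr)}
          {1 + \sum_{k=1}^{J-1}\exp\bigl(\alpha_k + \mathbf{x}^{\top}\boldsymbol{\beta}_k\bigr)}.
\end{align*}
```

Each non-reference level therefore gets its own set of parameter estimates, and exponentiating a coefficient gives the odds ratio for that level relative to the reference.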

FAQ

What is the primary function of the 'logistic' action in SAS Viya?
How do you specify the model to be fitted using the 'logistic' action?
What model selection methods are available in the 'logistic' action?
Can I save the results of a logistic regression model for later use?
How can I handle classification variables within the 'logistic' action?
Is it possible to perform a repeated measures analysis with the 'logistic' action?