glm

Description

The `glm` action fits linear regression models using the method of least squares. It allows specifying various model effects, selection methods, and output options. It supports confidence intervals, diagnostic statistics, and saving model output to CAS tables. Classification variables can be defined with global options or individually, and polynomial and spline effects are also supported. Data can be partitioned for training, validation, and testing.
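
For reference, the least-squares criterion mentioned above can be written as follows. This is the standard textbook formulation, given as background rather than as a description of the action's internal algorithm; the symbols $y_i$ (response), $\mathbf{x}_i$ (row of the design matrix $X$), and $\boldsymbol{\beta}$ (parameter vector) are notation introduced for this sketch.

```latex
\hat{\boldsymbol{\beta}}
  = \arg\min_{\boldsymbol{\beta}} \sum_{i=1}^{n} \left( y_i - \mathbf{x}_i^{\top} \boldsymbol{\beta} \right)^2 ,
\qquad
\hat{\boldsymbol{\beta}} = \left( X^{\top} X \right)^{-1} X^{\top} \mathbf{y}
\quad \text{when } X^{\top} X \text{ is invertible.}
```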

Syntax

regression.glm <result=results> <status=rc> /
    alpha=double,
    attributes={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}},
    byLimit=64-bit-integer,
    class={{countMissing=TRUE | FALSE, descending=TRUE | FALSE, ignoreMissing=TRUE | FALSE, levelizeRaw=TRUE | FALSE, maxLev=integer, order="FORMATTED" | "FREQ" | "FREQFORMATTED" | "FREQINTERNAL" | "INTERNAL", param="BTH" | "EFFECT" | "GLM" | "ORDINAL" | "ORTHBTH" | "ORTHEFFECT" | "ORTHORDINAL" | "ORTHPOLY" | "ORTHREF" | "POLYNOMIAL" | "REFERENCE", ref="FIRST" | "LAST" | double | "string", split=TRUE | FALSE, *vars={"variable-name-1" <, "variable-name-2", ...>}}, {...}},
    classGlobalOpts={countMissing=TRUE | FALSE, descending=TRUE | FALSE, ignoreMissing=TRUE | FALSE, levelizeRaw=TRUE | FALSE, maxLev=integer, order="FORMATTED" | "FREQ" | "FREQFORMATTED" | "FREQINTERNAL" | "INTERNAL", param="BTH" | "EFFECT" | "GLM" | "ORDINAL" | "ORTHBTH" | "ORTHEFFECT" | "ORTHORDINAL" | "ORTHPOLY" | "ORTHREF" | "POLYNOMIAL" | "REFERENCE", ref="FIRST" | "LAST" | double | "string", split=TRUE | FALSE},
    classLevelsPrint=TRUE | FALSE,
    clb=TRUE | FALSE,
    code={casOut={caslib="string" compress=TRUE | FALSE indexVars={"variable-name-1" <, "variable-name-2", ...>} label="string" lifetime=64-bit-integer maxMemSize=64-bit-integer memoryFormat="DVR" | "INHERIT" | "STANDARD" name="table-name" onDemand=TRUE | FALSE promote=TRUE | FALSE replace=TRUE | FALSE replication=integer tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE" threadBlockSize=64-bit-integer timeStamp="string" where={"string-1" <, "string-2", ...>}}, comment=TRUE | FALSE, fmtWdth=integer, indentSize=integer, intoCutPt=double, iProb=TRUE | FALSE, labelId=integer, lineSize=integer, noTrim=TRUE | FALSE, pCatAll=TRUE | FALSE, tabForm=TRUE | FALSE},
    collection={{details=TRUE | FALSE, *name="string", *vars={"variable-name-1" <, "variable-name-2", ...>}}, {...}},
    display={caseSensitive=TRUE | FALSE, exclude=TRUE | FALSE, excludeAll=TRUE | FALSE, keyIsPath=TRUE | FALSE, names={"string-1" <, "string-2", ...>}, pathType="LABEL" | "NAME", traceNames=TRUE | FALSE},
    freq="variable-name",
    inputs={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}},
    maxParameters=integer,
    model={addlaststopstep=TRUE | FALSE, clb=TRUE | FALSE, depVars={{name="variable-name"}}, effects={{interaction="BAR" | "CROSS" | "NONE", maxInteract=integer, nest={"string-1" <, "string-2", ...>}, *vars={"string-1" <, "string-2", ...>}}, {...}}, entry="variable-name", include=integer | {{effect-1} <, {effect-2}, ...>}, informative=TRUE | FALSE, noint=TRUE | FALSE, ridge={double-1 <, double-2, ...>}, ss3=TRUE | FALSE, start=integer | {{effect-1} <, {effect-2}, ...>}, stb=TRUE | FALSE, tol=TRUE | FALSE, vif=TRUE | FALSE, xpx=TRUE | FALSE, xpxScaled=TRUE | FALSE, xpxUnscaled=TRUE | FALSE},
    multimember={{details=TRUE | FALSE, *name="string", noEffect=TRUE | FALSE, stdize=TRUE | FALSE, *vars={"variable-name-1" <, "variable-name-2", ...>}, weight={"variable-name-1" <, "variable-name-2", ...>}}, {...}},
    nClassLevelsPrint=integer,
    nominals={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}},
    output={casOut={caslib="string" compress=TRUE | FALSE indexVars={"variable-name-1" <, "variable-name-2", ...>} label="string" lifetime=64-bit-integer maxMemSize=64-bit-integer memoryFormat="DVR" | "INHERIT" | "STANDARD" name="table-name" promote=TRUE | FALSE replace=TRUE | FALSE replication=integer tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE" threadBlockSize=64-bit-integer timeStamp="string" where={"string-1" <, "string-2", ...>}}, cooksD="string", copyVars="ALL" | "ALL_MODEL" | "ALL_NUMERIC" | {"variable-name-1" <, "variable-name-2", ...>}, covRatio="string", dffits="string", h="string", lcl="string", lclm="string", likeDist="string", pred="string", press="string", resid="string", role="string", rStudent="string", stdi="string", stdp="string", stdr="string", student="string", ucl="string", uclm="string"},
    outputTables={groupByVarsRaw=TRUE | FALSE, includeAll=TRUE | FALSE, names={"string-1" <, "string-2", ...>} | {key-1={casouttable-1} <, key-2={casouttable-2}, ...>}, repeated=TRUE | FALSE, replace=TRUE | FALSE},
    parmEstLevDetails="NONE" | "RAW" | "RAW_AND_FORMATTED",
    partByFrac={seed=integer, test=double, validate=double},
    partByVar={*name="variable-name", test="string", train="string", validate="string"},
    polynomial={{degree=integer, details=TRUE | FALSE, labelStyle={expand=TRUE | FALSE, exponent="string", includeName=TRUE | FALSE, productSymbol="NONE" | "string"}, mDegree=integer, *name="string", noSeparate=TRUE | FALSE, standardize={method="MOMENTS" | "MRANGE" | "WMOMENTS", options="CENTER" | "CENTERSCALE" | "NONE" | "SCALE", prefix="NONE" | "string"}, *vars={"variable-name-1" <, "variable-name-2", ...>}}, {...}},
    selection={adaptive=TRUE | FALSE, bestSubsetOptions={best=integer, computeBeta=TRUE | FALSE, displayAIC=TRUE | FALSE, displayBIC=TRUE | FALSE, displayGMSEP=TRUE | FALSE, displayJP=TRUE | FALSE, displayMSE=TRUE | FALSE, displayPC=TRUE | FALSE, displayRMSE=TRUE | FALSE, displaySBC=TRUE | FALSE, displaySP=TRUE | FALSE, displaySSE=TRUE | FALSE, sigma=double}, candidates=integer | "ALL", choose="ADJRSQ" | "AIC" | "AICC" | "CP" | "CV" | "DEFAULT" | "NONE" | "PRESS" | "RSQUARE" | "SBC" | "VALIDATE", competitive=TRUE | FALSE, details="ALL" | "NONE" | "STEPS" | "SUMMARY", elasticNetOptions={absFConv=double, fConv=double, gConv=double, lambda={double-1 <, double-2, ...>}, mixing={double-1 <, double-2, ...>}, numLambda=integer, rho=double, solver="ADMM" | "BFGS" | "LBFGS" | "NLP"}, enscale=TRUE | FALSE, ensteps=integer, fcpSelectionOptions={alpha=double, bigM=double, coefTol=double, intTol=double, lambda=double, lambdaGrid="DEFAULT" | "LINSPACE" | "LOGSPACE", maxAlpha=double, maxIterAlpha=integer, maxIterLambda=integer, maxLambda=double, maxTime=double, minAlpha=double, minLambda=double, scale=TRUE | FALSE, solver="DEFAULT" | "MILP" | "NLP"}, gamma=double, hierarchy="DEFAULT" | "NONE" | "SINGLE" | "SINGLECLASS", kappa={double-1 <, double-2, ...>}, L2=double, L2HIGH=double, L2LOW=double, lsCoeffs=TRUE | FALSE, maxEffects=integer, maxSteps=integer, method="BACKWARD" | "BESTSUBSET" | "ELASTICNET" | "FORWARD" | "FORWARDSWAP" | "LAR" | "LASSO" | "MCP" | "NONE" | "SCAD" | "STEPWISE", minEffects=integer, orderSelect=TRUE | FALSE, plots=TRUE | FALSE, relaxed=TRUE | FALSE, select="ADJRSQ" | "AIC" | "AICC" | "CP" | "DEFAULT" | "RSQUARE" | "SBC" | "SL", slEntry=double, slStay=double, stop="ADJRSQ" | "AIC" | "AICC" | "CP" | "CV" | "DEFAULT" | "NONE" | "PRESS" | "RSQUARE" | "SBC" | "SL" | "VALIDATE", stopHorizon=integer},
    spline={{basis="BSPLINE" | "TPF_DEFAULT" | "TPF_NOINT" | "TPF_NOINTANDNOPOWERS" | "TPF_NOPOWERS", dataBoundary=TRUE | FALSE, degree=integer, details=TRUE | FALSE, knotMax=double, knotMethod={equal=integer, list={double-1 <, double-2, ...>}, listWithBoundary={double-1 <, double-2, ...>}, multiscale={endScale=integer, startScale=integer}, rangeFractions={double-1 <, double-2, ...>}}, knotMin=double, *name="string", naturalCubic=TRUE | FALSE, separate=TRUE | FALSE, split=TRUE | FALSE, *vars={"variable-name-1" <, "variable-name-2", ...>}}, {...}},
    ss3=TRUE | FALSE,
    store={caslib="string", compress=TRUE | FALSE, indexVars={"variable-name-1" <, "variable-name-2", ...>}, label="string", lifetime=64-bit-integer, maxMemSize=64-bit-integer, memoryFormat="DVR" | "INHERIT" | "STANDARD", name="table-name", promote=TRUE | FALSE, replace=TRUE | FALSE, replication=integer, tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE", threadBlockSize=64-bit-integer, timeStamp="string", where={"string-1" <, "string-2", ...>}},
    *table={caslib="string", computedOnDemand=TRUE | FALSE, computedVars={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}}, computedVarsProgram="string", dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>}, groupBy={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}}, groupByMode="NOSORT" | "REDISTRIBUTE", importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}, *name="table-name", orderBy={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}}, singlePass=TRUE | FALSE, vars={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression", whereTable={casLib="string", dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}, importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}, *name="table-name", vars={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression"}},
    target="string",
    weight="variable-name";

Settings

| Parameter | Description |
|---|---|
| `alpha` | Specifies the significance level to use for the construction of all confidence intervals. Default: 0.05. Range: (0, 1). |
| `attributes` | Changes the attributes of variables used in this action. Subparameters: format, formattedLength, label, *name, nfd, nfl. |
| `byLimit` | Specifies that the analysis not be performed if the number of BY groups exceeds the specified value. Minimum value: 1. |
| `class` | Names the classification variables to be used as explanatory variables in the analysis. Subparameters: countMissing, descending, ignoreMissing, levelizeRaw, maxLev, order, param, ref, split, *vars. |
| `classGlobalOpts` | Lists options that apply to all classification variables. Subparameters: countMissing, descending, ignoreMissing, levelizeRaw, maxLev, order, param, ref, split. |
| `classLevelsPrint` | When set to False, suppresses the display of class levels. Default: TRUE. |
| `clb` | When set to True, displays upper and lower confidence limits for the parameter estimates. Default: FALSE. |
| `code` | Writes SAS DATA step code for computing predicted values of the fitted model. Subparameters include casOut (for output table settings), comment, fmtWdth, indentSize, intoCutPt, iProb, labelId, lineSize, noTrim, pCatAll, tabForm. |
| `collection` | Defines a set of variables that are treated as a single effect that has multiple degrees of freedom. Subparameters: details, *name, *vars. |
| `display` | Specifies a list of results tables to send to the client for display. Subparameters: caseSensitive, exclude, excludeAll, keyIsPath, names, pathType, traceNames. |
| `freq` | Names the numeric variable that contains the frequency of occurrence of each observation. |
| `inputs` | Specifies variables to use for analysis. Subparameters: format, formattedLength, label, *name, nfd, nfl. |
| `maxParameters` | Specifies that models not be fit if the number of parameters exceeds the specified value. Minimum value: 0. |
| `model` | Names the dependent variable, explanatory effects, and model options. Subparameters: addlaststopstep, clb, depVars, effects, entry, include, informative, noint, ridge, ss3, start, stb, tol, vif, xpx, xpxScaled, xpxUnscaled. |
| `model.depVars` | Subparameter of `model`. Specifies one or more variables to use as response variables in the model. Subparameter: name. |
| `model.effects` | Subparameter of `model`. Specifies a list of effects that define the model. Subparameters: interaction, maxInteract, nest, *vars. |
| `model.include` | Subparameter of `model`. Specifies effects to include at the start of the selection process. Can be an integer or a list of effects. |
| `model.informative` | Subparameter of `model`. When set to True, models missing values using extra model effects. Default: FALSE. |
| `model.noint` | Subparameter of `model`. When set to True, does not include the intercept term in the model. Default: FALSE. |
| `model.ridge` | Subparameter of `model`. Specifies the ridge constant values for ridge regression. |
| `model.ss3` | Subparameter of `model`. When set to True, performs a model analysis of variance based on type III sums of squares. Default: FALSE. |
| `model.start` | Subparameter of `model`. Specifies effects to use to begin the selection process in FORWARD, FORWARDSWAP, and STEPWISE methods. Can be an integer or a list of effects. |
| `model.stb` | Subparameter of `model`. When set to True, produces standardized regression coefficients. Default: FALSE. |
| `model.tol` | Subparameter of `model`. When set to True, produces tolerance values for the estimates. Default: FALSE. |
| `model.vif` | Subparameter of `model`. When set to True, produces variance inflation factors with the parameter estimates. Default: FALSE. |
| `model.xpx` | Subparameter of `model`. Crossproducts. Default: FALSE. |
| `model.xpxScaled` | Subparameter of `model`. Scaled Crossproducts. Default: FALSE. |
| `model.xpxUnscaled` | Subparameter of `model`. Unscaled Crossproducts. Default: FALSE. |
| `multimember` | Uses one or more classification variables specified in the vars parameter such that each observation can be associated with one or more levels. Subparameters: details, *name, noEffect, stdize, *vars, weight. |
| `nClassLevelsPrint` | Limits the display of class levels. The value 0 suppresses all levels. Minimum value: 0. |
| `nominals` | Specifies nominal variables to use for analysis. Subparameters: format, formattedLength, label, *name, nfd, nfl. |
| `output` | Creates a table on the server that contains observationwise statistics, computed after fitting the model. Subparameters: *casOut (for output table settings), cooksD, copyVars, covRatio, dffits, h, lcl, lclm, likeDist, pred, press, resid, role, rStudent, stdi, stdp, stdr, student, ucl, uclm. |
| `outputTables` | Lists the names of results tables to save as CAS tables on the server. Subparameters: groupByVarsRaw, includeAll, names, repeated, replace. |
| `parmEstLevDetails` | Specifies whether to add raw and formatted values of classification variables in the ParameterEstimates table. Options: NONE, RAW, RAW_AND_FORMATTED. Default: RAW. |
| `partByFrac` | Specifies the fractions of the data to be used for validation and testing. Subparameters: seed, test, validate. |
| `partByVar` | Names the variable and its values used to partition the data into training, validation, and testing roles. Subparameters: *name, test, train, validate. |
| `polynomial` | Specifies a polynomial effect. All specified variables must be numeric. Subparameters: degree, details, labelStyle, mDegree, *name, noSeparate, standardize, *vars. |
| `selection` | Specifies the method and options for performing model selection. Subparameters: adaptive, bestSubsetOptions, candidates, choose, competitive, details, elasticNetOptions, enscale, ensteps, fcpSelectionOptions, gamma, hierarchy, kappa, L2, L2HIGH, L2LOW, lsCoeffs, maxEffects, maxSteps, method, minEffects, orderSelect, plots, relaxed, select, slEntry, slStay, stop, stopHorizon. |
| `selection.bestSubsetOptions` | Subparameter of `selection`. Specifies options to perform best-subset selection. Subparameters: best, computeBeta, displayAIC, displayBIC, displayGMSEP, displayJP, displayMSE, displayPC, displayRMSE, displaySBC, displaySP, displaySSE, sigma. |
| `selection.elasticNetOptions` | Subparameter of `selection`. Specifies options to use in performing elastic net selection methods. Subparameters: absFConv, fConv, gConv, lambda, mixing, numLambda, rho, solver. |
| `selection.fcpSelectionOptions` | Subparameter of `selection`. Specifies options to use in performing the folded concave penalized (FCP) selection methods. Subparameters: alpha, bigM, coefTol, intTol, lambda, lambdaGrid, maxAlpha, maxIterAlpha, maxIterLambda, maxLambda, maxTime, minAlpha, minLambda, scale, solver. |
| `spline` | Expands variables into spline bases whose form depends on the specified parameters. Subparameters: basis, dataBoundary, degree, details, knotMax, knotMethod, knotMin, *name, naturalCubic, separate, split, *vars. |
| `ss3` | When set to True, performs a model analysis of variance based on type III sums of squares. Default: FALSE. |
| `store` | Stores regression models to a binary large object (BLOB). Subparameters: caslib, compress, indexVars, label, lifetime, maxMemSize, memoryFormat, name, promote, replace, replication, tableRedistUpPolicy, threadBlockSize, timeStamp, where. |
| `table` | Specifies the input data table. Subparameters: caslib, computedOnDemand, computedVars, computedVarsProgram, dataSourceOptions, groupBy, groupByMode, importOptions, *name, orderBy, singlePass, vars, where, whereTable. |
| `target` | Specifies the target variable to use for analysis. |
| `weight` | Names the numeric variable to use to perform a weighted analysis of the data. |

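
The `partByFrac` and `selection` parameters can be combined so that a held-out validation partition chooses the final model. The following is a minimal sketch and has not been validated. It assumes an active CAS session with a `casuser` libref bound to the CAS engine. The table name `cars_sel`, the seed, the 30% validation fraction, and the predictors taken from `sashelp.cars` are illustrative choices, not part of the reference.

```sas
/* Assumption: a CAS session is active and CASUSER is a CAS-engine libref */
data casuser.cars_sel;          /* hypothetical table name */
    set sashelp.cars;
run;

proc cas;
    regression.glm /
        table={name='cars_sel', caslib='casuser'},
        /* Hold out 30% of the rows for validation; the seed value is arbitrary */
        partByFrac={validate=0.3, seed=12345},
        model={
            depVars={{name='MSRP'}},
            /* One effect per variable, so only main effects are candidates */
            effects={{vars={'Horsepower'}},
                     {vars={'EngineSize'}},
                     {vars={'Weight'}},
                     {vars={'Wheelbase'}}}
        },
        /* Forward selection; the final model is chosen on the validation partition */
        selection={method='FORWARD', choose='VALIDATE'};
run;
quit;
```

Because `choose='VALIDATE'` relies on validation data, it is paired here with `partByFrac`; `partByVar` could be used instead when the table already has a partition-role column.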
Data Preparation
Example Data Creation

This example shows how to create a simple CAS table for use with the `glm` action.

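/* Assumes an active CAS session and a CASUSER libref bound to the CAS engine */
DATA casuser.mydata;
    INPUT x y z @@;   /* @@ reads several observations from each data line */
    CARDS;
1 10 100 2 12 110 3 15 120 4 18 130 5 20 140
6 22 150 7 25 160 8 28 170 9 30 180 10 33 190
;
RUN;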

Examples

This example performs a simple linear regression using `x` as the independent variable and `y` as the dependent variable.

SAS / CAS Code (awaiting community validation)

PROC CAS;
    /* Regress y on x using the table created above */
    regression.glm /
        table={name='mydata'},
        model={depVars={{name='y'}}, effects={{vars={'x'}}}};
RUN;
QUIT;

Result:
Results tables for the regression analysis, including parameter estimates for the intercept and `x`.
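
The `alpha` and `clb` settings can be added to the same call to request confidence limits for the parameter estimates. The sketch below has not been validated and assumes the `mydata` table from the data preparation step is available in the active caslib. The value `alpha=0.1` is an arbitrary illustration, corresponding to 90% limits.

```sas
proc cas;
    regression.glm /
        table={name='mydata'},
        alpha=0.1,     /* significance level, so the limits are 90% limits */
        clb=true,      /* display lower and upper confidence limits */
        model={depVars={{name='y'}}, effects={{vars={'x'}}}};
run;
quit;
```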

This example fits a linear regression model with multiple predictors: a continuous variable (`Horsepower`), a classification variable (`type_cat`), and their interaction. It also writes an output table containing predicted values and residuals.

SAS / CAS Code (awaiting community validation)

/* Load the data. The DATA step must run before PROC CAS, not inside it. */
/* Assumes a CASUSER libref bound to the CAS engine. */
DATA casuser.cars;
    LENGTH type_cat $8;   /* long enough to hold 'Japanese' without truncation */
    SET sashelp.cars;
    IF make='Audi' THEN type_cat='German';
    ELSE IF make='BMW' THEN type_cat='German';
    ELSE IF make='Toyota' THEN type_cat='Japanese';
    ELSE IF make='Honda' THEN type_cat='Japanese';
    ELSE type_cat='Other';
RUN;

/* Run the glm action */
PROC CAS;
    regression.glm /
        table={name='cars'},
        model={depVars={{name='MSRP'}}, effects={{vars={'Horsepower'}}, {vars={'type_cat'}}, {vars={'Horsepower', 'type_cat'}, interaction='CROSS'}}},
        class={{vars={'type_cat'}}},
        output={casOut={name='predicted_cars', replace=true}, pred='PredictedMSRP', resid='Residuals'};
RUN;
QUIT;

Result:
Detailed regression analysis results, parameter estimates, and an output table named `predicted_cars` containing the predicted MSRP (`PredictedMSRP`) and the residuals (`Residuals`).
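
The `output` parameter also accepts diagnostic statistics such as `cooksD` and `h`, and `copyVars` carries identifying columns into the output table. The following sketch has not been validated. The table names `cars_in` and `cars_diag` and the output column names are hypothetical, and `table.fetch` comes from the separate `table` action set, used here only to preview a few rows.

```sas
/* Assumption: a CAS session is active and CASUSER is a CAS-engine libref */
data casuser.cars_in;           /* hypothetical table name */
    set sashelp.cars;
run;

proc cas;
    regression.glm /
        table={name='cars_in', caslib='casuser'},
        model={depVars={{name='MSRP'}},
               effects={{vars={'Horsepower'}}, {vars={'Weight'}}}},
        output={casOut={name='cars_diag', replace=true},
                copyVars={'Make', 'Model'},   /* keep identifying columns */
                pred='PredictedMSRP',
                resid='Residual',
                cooksD='CooksD',              /* Cook's D influence statistic */
                h='Leverage'};                /* leverage */

    /* Preview the first rows of the diagnostics table */
    table.fetch / table={name='cars_diag'}, to=5;
run;
quit;
```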