glm

Description

The `glm` action fits linear regression models using the method of least squares. It allows specifying various model effects, selection methods, and output options. It supports confidence intervals, diagnostic statistics, and saving model output to CAS tables. Classification variables can be defined with global options or individually, and polynomial and spline effects are also supported. Data can be partitioned for training, validation, and testing.
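For reference, the least-squares criterion mentioned above can be written in standard textbook notation. The symbols $y$ (response vector), $X$ (design matrix built from the model effects), and $\beta$ (parameter vector) are introduced here for illustration and are not names used by the action:

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\, \lVert y - X\beta \rVert_2^2 ,
\qquad\text{so that}\qquad
X^{\top} X\, \hat{\beta} \;=\; X^{\top} y .
```

The second relation is the set of normal equations; $X^{\top}X$ is the crossproducts matrix of the design columns.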

regression.glm <result=results> <status=rc> /
    alpha=double,
    attributes={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}},
    byLimit=64-bit-integer,
    class={{countMissing=TRUE | FALSE, descending=TRUE | FALSE, ignoreMissing=TRUE | FALSE, levelizeRaw=TRUE | FALSE, maxLev=integer, order="FORMATTED" | "FREQ" | "FREQFORMATTED" | "FREQINTERNAL" | "INTERNAL", param="BTH" | "EFFECT" | "GLM" | "ORDINAL" | "ORTHBTH" | "ORTHEFFECT" | "ORTHORDINAL" | "ORTHPOLY" | "ORTHREF" | "POLYNOMIAL" | "REFERENCE", ref="FIRST" | "LAST" | double | "string", split=TRUE | FALSE, *vars={"variable-name-1" <, "variable-name-2", ...>}}, {...}},
    classGlobalOpts={countMissing=TRUE | FALSE, descending=TRUE | FALSE, ignoreMissing=TRUE | FALSE, levelizeRaw=TRUE | FALSE, maxLev=integer, order="FORMATTED" | "FREQ" | "FREQFORMATTED" | "FREQINTERNAL" | "INTERNAL", param="BTH" | "EFFECT" | "GLM" | "ORDINAL" | "ORTHBTH" | "ORTHEFFECT" | "ORTHORDINAL" | "ORTHPOLY" | "ORTHREF" | "POLYNOMIAL" | "REFERENCE", ref="FIRST" | "LAST" | double | "string", split=TRUE | FALSE},
    classLevelsPrint=TRUE | FALSE,
    clb=TRUE | FALSE,
    code={casOut={caslib="string" compress=TRUE | FALSE indexVars={"variable-name-1" <, "variable-name-2", ...>} label="string" lifetime=64-bit-integer maxMemSize=64-bit-integer memoryFormat="DVR" | "INHERIT" | "STANDARD" name="table-name" onDemand=TRUE | FALSE promote=TRUE | FALSE replace=TRUE | FALSE replication=integer tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE" threadBlockSize=64-bit-integer timeStamp="string" where={"string-1" <, "string-2", ...>}}, comment=TRUE | FALSE, fmtWdth=integer, indentSize=integer, intoCutPt=double, iProb=TRUE | FALSE, labelId=integer, lineSize=integer, noTrim=TRUE | FALSE, pCatAll=TRUE | FALSE, tabForm=TRUE | FALSE},
    collection={{details=TRUE | FALSE, *name="string", *vars={"variable-name-1" <, "variable-name-2", ...>}}, {...}},
    display={caseSensitive=TRUE | FALSE, exclude=TRUE | FALSE, excludeAll=TRUE | FALSE, keyIsPath=TRUE | FALSE, names={"string-1" <, "string-2", ...>}, pathType="LABEL" | "NAME", traceNames=TRUE | FALSE},
    freq="variable-name",
    inputs={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}},
    maxParameters=integer,
    model={addlaststopstep=TRUE | FALSE, clb=TRUE | FALSE, depVars={{name="variable-name"}}, effects={{interaction="BAR" | "CROSS" | "NONE", maxInteract=integer, nest={"string-1" <, "string-2", ...>}, *vars={"string-1" <, "string-2", ...>}}, {...}}, entry="variable-name", include=integer | {{effect-1} <, {effect-2}, ...>}, informative=TRUE | FALSE, noint=TRUE | FALSE, ridge={double-1 <, double-2, ...>}, ss3=TRUE | FALSE, start=integer | {{effect-1} <, {effect-2}, ...>}, stb=TRUE | FALSE, tol=TRUE | FALSE, vif=TRUE | FALSE, xpx=TRUE | FALSE, xpxScaled=TRUE | FALSE, xpxUnscaled=TRUE | FALSE},
    multimember={{details=TRUE | FALSE, *name="string", noEffect=TRUE | FALSE, stdize=TRUE | FALSE, *vars={"variable-name-1" <, "variable-name-2", ...>}, weight={"variable-name-1" <, "variable-name-2", ...>}}, {...}},
    nClassLevelsPrint=integer,
    nominals={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}},
    output={casOut={caslib="string" compress=TRUE | FALSE indexVars={"variable-name-1" <, "variable-name-2", ...>} label="string" lifetime=64-bit-integer maxMemSize=64-bit-integer memoryFormat="DVR" | "INHERIT" | "STANDARD" name="table-name" promote=TRUE | FALSE replace=TRUE | FALSE replication=integer tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE" threadBlockSize=64-bit-integer timeStamp="string" where={"string-1" <, "string-2", ...>}}, cooksD="string", copyVars="ALL" | "ALL_MODEL" | "ALL_NUMERIC" | {"variable-name-1" <, "variable-name-2", ...>}, covRatio="string", dffits="string", h="string", lcl="string", lclm="string", likeDist="string", pred="string", press="string", resid="string", role="string", rStudent="string", stdi="string", stdp="string", stdr="string", student="string", ucl="string", uclm="string"},
    outputTables={groupByVarsRaw=TRUE | FALSE, includeAll=TRUE | FALSE, names={"string-1" <, "string-2", ...>} | {key-1={casouttable-1} <, key-2={casouttable-2}, ...>}, repeated=TRUE | FALSE, replace=TRUE | FALSE},
    parmEstLevDetails="NONE" | "RAW" | "RAW_AND_FORMATTED",
    partByFrac={seed=integer, test=double, validate=double},
    partByVar={*name="variable-name", test="string", train="string", validate="string"},
    polynomial={{degree=integer, details=TRUE | FALSE, labelStyle={expand=TRUE | FALSE, exponent="string", includeName=TRUE | FALSE, productSymbol="NONE" | "string"}, mDegree=integer, *name="string", noSeparate=TRUE | FALSE, standardize={method="MOMENTS" | "MRANGE" | "WMOMENTS", options="CENTER" | "CENTERSCALE" | "NONE" | "SCALE", prefix="NONE" | "string"}, *vars={"variable-name-1" <, "variable-name-2", ...>}}, {...}},
    selection={adaptive=TRUE | FALSE, bestSubsetOptions={best=integer, computeBeta=TRUE | FALSE, displayAIC=TRUE | FALSE, displayBIC=TRUE | FALSE, displayGMSEP=TRUE | FALSE, displayJP=TRUE | FALSE, displayMSE=TRUE | FALSE, displayPC=TRUE | FALSE, displayRMSE=TRUE | FALSE, displaySBC=TRUE | FALSE, displaySP=TRUE | FALSE, displaySSE=TRUE | FALSE, sigma=double}, candidates=integer | "ALL", choose="ADJRSQ" | "AIC" | "AICC" | "CP" | "CV" | "DEFAULT" | "NONE" | "PRESS" | "RSQUARE" | "SBC" | "VALIDATE", competitive=TRUE | FALSE, details="ALL" | "NONE" | "STEPS" | "SUMMARY", elasticNetOptions={absFConv=double, fConv=double, gConv=double, lambda={double-1 <, double-2, ...>}, mixing={double-1 <, double-2, ...>}, numLambda=integer, rho=double, solver="ADMM" | "BFGS" | "LBFGS" | "NLP"}, enscale=TRUE | FALSE, ensteps=integer, fcpSelectionOptions={alpha=double, bigM=double, coefTol=double, intTol=double, lambda=double, lambdaGrid="DEFAULT" | "LINSPACE" | "LOGSPACE", maxAlpha=double, maxIterAlpha=integer, maxIterLambda=integer, maxLambda=double, maxTime=double, minAlpha=double, minLambda=double, scale=TRUE | FALSE, solver="DEFAULT" | "MILP" | "NLP"}, gamma=double, hierarchy="DEFAULT" | "NONE" | "SINGLE" | "SINGLECLASS", kappa={double-1 <, double-2, ...>}, L2=double, L2HIGH=double, L2LOW=double, lsCoeffs=TRUE | FALSE, maxEffects=integer, maxSteps=integer, method="BACKWARD" | "BESTSUBSET" | "ELASTICNET" | "FORWARD" | "FORWARDSWAP" | "LAR" | "LASSO" | "MCP" | "NONE" | "SCAD" | "STEPWISE", minEffects=integer, orderSelect=TRUE | FALSE, plots=TRUE | FALSE, relaxed=TRUE | FALSE, select="ADJRSQ" | "AIC" | "AICC" | "CP" | "DEFAULT" | "RSQUARE" | "SBC" | "SL", slEntry=double, slStay=double, stop="ADJRSQ" | "AIC" | "AICC" | "CP" | "CV" | "DEFAULT" | "NONE" | "PRESS" | "RSQUARE" | "SBC" | "SL" | "VALIDATE", stopHorizon=integer},
    spline={{basis="BSPLINE" | "TPF_DEFAULT" | "TPF_NOINT" | "TPF_NOINTANDNOPOWERS" | "TPF_NOPOWERS", dataBoundary=TRUE | FALSE, degree=integer, details=TRUE | FALSE, knotMax=double, knotMethod={equal=integer, list={double-1 <, double-2, ...>}, listWithBoundary={double-1 <, double-2, ...>}, multiscale={endScale=integer, startScale=integer}, rangeFractions={double-1 <, double-2, ...>}}, knotMin=double, *name="string", naturalCubic=TRUE | FALSE, separate=TRUE | FALSE, split=TRUE | FALSE, *vars={"variable-name-1" <, "variable-name-2", ...>}}, {...}},
    ss3=TRUE | FALSE,
    store={caslib="string", compress=TRUE | FALSE, indexVars={"variable-name-1" <, "variable-name-2", ...>}, label="string", lifetime=64-bit-integer, maxMemSize=64-bit-integer, memoryFormat="DVR" | "INHERIT" | "STANDARD", name="table-name", promote=TRUE | FALSE, replace=TRUE | FALSE, replication=integer, tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE", threadBlockSize=64-bit-integer, timeStamp="string", where={"string-1" <, "string-2", ...>}},
    *table={caslib="string", computedOnDemand=TRUE | FALSE, computedVars={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}}, computedVarsProgram="string", dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>}, groupBy={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}}, groupByMode="NOSORT" | "REDISTRIBUTE", importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}, *name="table-name", orderBy={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}}, singlePass=TRUE | FALSE, vars={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression", whereTable={casLib="string", dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}, importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}, *name="table-name", vars={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression"}},
    target="string",
    weight="variable-name";

Settings
Parameter Description
alpha Specifies the significance level to use for the construction of all confidence intervals. Default: 0.05. Range: (0, 1).
attributes Changes the attributes of variables used in this action. Subparameters: format, formattedLength, label, *name, nfd, nfl.
byLimit Specifies that the analysis not be performed if the number of BY groups exceeds the specified value. Minimum value: 1.
class Names the classification variables to be used as explanatory variables in the analysis. Subparameters: countMissing, descending, ignoreMissing, levelizeRaw, maxLev, order, param, ref, split, *vars.
classGlobalOpts Lists options that apply to all classification variables. Subparameters: countMissing, descending, ignoreMissing, levelizeRaw, maxLev, order, param, ref, split.
classLevelsPrint When set to False, suppresses the display of class levels. Default: TRUE.
clb When set to True, displays upper and lower confidence limits for the parameter estimates. Default: FALSE.
code Writes SAS DATA step code for computing predicted values of the fitted model. Subparameters include casOut (for output table settings), comment, fmtWdth, indentSize, intoCutPt, iProb, labelId, lineSize, noTrim, pCatAll, tabForm.
collection Defines a set of variables that are treated as a single effect that has multiple degrees of freedom. Subparameters: details, *name, *vars.
display Specifies a list of results tables to send to the client for display. Subparameters: caseSensitive, exclude, excludeAll, keyIsPath, names, pathType, traceNames.
freq Names the numeric variable that contains the frequency of occurrence of each observation.
inputs Specifies variables to use for analysis. Subparameters: format, formattedLength, label, *name, nfd, nfl.
maxParameters Specifies that models not be fit if the number of parameters exceeds the specified value. Minimum value: 0.
model Names the dependent variable, explanatory effects, and model options. Subparameters: addlaststopstep, clb, depVars, effects, entry, include, informative, noint, ridge, ss3, start, stb, tol, vif, xpx, xpxScaled, xpxUnscaled.
model.depVars Subparameter of `model`. Specifies one or more variables to use as response variables in the model. Subparameter: name.
model.effects Subparameter of `model`. Specifies a list of effects that define the model. Subparameters: interaction, maxInteract, nest, *vars.
model.include Subparameter of `model`. Specifies effects to include at the start of the selection process. Can be an integer or a list of effects.
model.informative Subparameter of `model`. When set to True, models missing values using extra model effects. Default: FALSE.
model.noint Subparameter of `model`. When set to True, does not include the intercept term in the model. Default: FALSE.
model.ridge Subparameter of `model`. Specifies the ridge constant values for ridge regression.
model.ss3 Subparameter of `model`. When set to True, performs a model analysis of variance based on type III sums of squares. Default: FALSE.
model.start Subparameter of `model`. Specifies effects to use to begin the selection process in FORWARD, FORWARDSWAP, and STEPWISE methods. Can be an integer or a list of effects.
model.stb Subparameter of `model`. When set to True, produces standardized regression coefficients. Default: FALSE.
model.tol Subparameter of `model`. When set to True, produces tolerance values for the estimates. Default: FALSE.
model.vif Subparameter of `model`. When set to True, produces variance inflation factors with the parameter estimates. Default: FALSE.
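model.xpx Subparameter of `model`. When set to True, requests the crossproducts (X'X) matrix. Default: FALSE.
model.xpxScaled Subparameter of `model`. When set to True, requests the scaled crossproducts matrix. Default: FALSE.
model.xpxUnscaled Subparameter of `model`. When set to True, requests the unscaled crossproducts matrix. Default: FALSE.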
multimember Uses one or more classification variables specified in the vars parameter such that each observation can be associated with one or more levels. Subparameters: details, *name, noEffect, stdize, *vars, weight.
nClassLevelsPrint Limits the display of class levels. The value 0 suppresses all levels. Minimum value: 0.
nominals Specifies nominal variables to use for analysis. Subparameters: format, formattedLength, label, *name, nfd, nfl.
output Creates a table on the server that contains observationwise statistics, computed after fitting the model. Subparameters: *casOut (for output table settings), cooksD, copyVars, covRatio, dffits, h, lcl, lclm, likeDist, pred, press, resid, role, rStudent, stdi, stdp, stdr, student, ucl, uclm.
outputTables Lists the names of results tables to save as CAS tables on the server. Subparameters: groupByVarsRaw, includeAll, names, repeated, replace.
parmEstLevDetails Specifies whether to add raw and formatted values of classification variables in the ParameterEstimates table. Options: NONE, RAW, RAW_AND_FORMATTED. Default: RAW.
partByFrac Specifies the fractions of the data to be used for validation and testing. Subparameters: seed, test, validate.
partByVar Names the variable and its values used to partition the data into training, validation, and testing roles. Subparameters: *name, test, train, validate.
polynomial Specifies a polynomial effect. All specified variables must be numeric. Subparameters: degree, details, labelStyle, mDegree, *name, noSeparate, standardize, *vars.
selection Specifies the method and options for performing model selection. Subparameters: adaptive, bestSubsetOptions, candidates, choose, competitive, details, elasticNetOptions, enscale, ensteps, fcpSelectionOptions, gamma, hierarchy, kappa, L2, L2HIGH, L2LOW, lsCoeffs, maxEffects, maxSteps, method, minEffects, orderSelect, plots, relaxed, select, slEntry, slStay, stop, stopHorizon.
selection.bestSubsetOptions Subparameter of `selection`. Specifies options to perform best-subset selection. Subparameters: best, computeBeta, displayAIC, displayBIC, displayGMSEP, displayJP, displayMSE, displayPC, displayRMSE, displaySBC, displaySP, displaySSE, sigma.
selection.elasticNetOptions Subparameter of `selection`. Specifies options to use in performing elastic net selection methods. Subparameters: absFConv, fConv, gConv, lambda, mixing, numLambda, rho, solver.
selection.fcpSelectionOptions Subparameter of `selection`. Specifies options to use in performing the folded concave penalized (FCP) selection methods. Subparameters: alpha, bigM, coefTol, intTol, lambda, lambdaGrid, maxAlpha, maxIterAlpha, maxIterLambda, maxLambda, maxTime, minAlpha, minLambda, scale, solver.
spline Expands variables into spline bases whose form depends on the specified parameters. Subparameters: basis, dataBoundary, degree, details, knotMax, knotMethod, knotMin, *name, naturalCubic, separate, split, *vars.
ss3 When set to True, performs a model analysis of variance based on type III sums of squares. Default: FALSE.
store Stores regression models to a binary large object (BLOB). Subparameters: caslib, compress, indexVars, label, lifetime, maxMemSize, memoryFormat, name, promote, replace, replication, tableRedistUpPolicy, threadBlockSize, timeStamp, where.
table Specifies the input data table. Subparameters: caslib, computedOnDemand, computedVars, computedVarsProgram, dataSourceOptions, groupBy, groupByMode, importOptions, *name, orderBy, singlePass, vars, where, whereTable.
target Specifies the target variable to use for analysis.
weight Names the numeric variable to use to perform a weighted analysis of the data.
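The `model.tol` and `model.vif` entries above refer to two closely related collinearity diagnostics. Their usual textbook definitions are given below; the symbol $R_j^2$ is introduced here for illustration, and it is an assumption that the action follows these standard definitions exactly:

```latex
\mathrm{TOL}_j \;=\; 1 - R_j^2 ,
\qquad
\mathrm{VIF}_j \;=\; \frac{1}{\mathrm{TOL}_j} \;=\; \frac{1}{1 - R_j^2} ,
```

where $R_j^2$ is the R-square obtained by regressing the $j$-th regressor on all the other regressors in the model. A tolerance near 0 (equivalently, a large variance inflation factor) indicates that the regressor is nearly a linear combination of the others.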
Data Preparation
Example Data Creation

This example shows how to create a simple CAS table for use with the `glm` action.

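/* Assumes the libref casuser is assigned to a CAS caslib */
DATA casuser.mydata;
   INPUT x y z @@;   /* @@ reads several observations per data line */
   CARDS;
1 10 100 2 12 110 3 15 120 4 18 130 5 20 140
6 22 150 7 25 160 8 28 170 9 30 180 10 33 190
;
RUN;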
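The DATA step above assumes that a CAS session is already running and that the libref `casuser` points to a caslib. A minimal setup sketch, to be run beforehand; the session name `mysess` is a placeholder chosen for illustration, and the caslib name may differ at your site:

```sas
/* Start a CAS session; mysess is a hypothetical session name */
cas mysess;

/* Bind the libref casuser to the Casuser caslib so that
   casuser.mydata resolves to an in-memory CAS table */
libname casuser cas caslib="casuser";
```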

Examples

This example performs a simple linear regression using `x` as the independent variable and `y` as the dependent variable.

PROC CAS;
   regression.glm /
      TABLE={name='mydata'},
      model={depVars={{name='y'}}, effects={{vars={'x'}}}};
RUN;
QUIT;
Result:
Result tables for the regression analysis, including parameter estimates for the intercept and `x`.
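The display options listed under Settings can be layered onto the same fit. The following is an untested sketch that uses only parameters from the syntax above (`alpha`, `clb`, and `model.stb`); the explicit `caslib='casuser'` is an assumption about where `mydata` was created. In the sample data `z` is an exact linear function of `x` (z = 10x + 90), so `z` is left out of the model to avoid a singular fit.

```sas
proc cas;
   regression.glm /
      table={name='mydata', caslib='casuser'},   /* caslib is an assumption */
      alpha=0.1,                                 /* 90% confidence limits instead of the default 95% */
      clb=true,                                  /* show confidence limits for the parameter estimates */
      model={depVars={{name='y'}},
             effects={{vars={'x'}}},
             stb=true};                          /* also report standardized coefficients */
run;
quit;
```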

This example demonstrates fitting a linear regression model with multiple predictors, including a classification variable, and generating an output table with predicted values and residuals.

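/* Load the data. A DATA step cannot run inside PROC CAS, so it comes first. */
/* Assumes the libref casuser is assigned to a CAS caslib. */
DATA casuser.cars;
   SET sashelp.cars;
   /* Declare the length up front; otherwise the first assignment */
   /* ('German') fixes it at 6 and 'Japanese' is truncated.       */
   LENGTH type_cat $ 8;
   IF make='Audi' THEN type_cat='German';
   ELSE IF make='BMW' THEN type_cat='German';
   ELSE IF make='Toyota' THEN type_cat='Japanese';
   ELSE IF make='Honda' THEN type_cat='Japanese';
   ELSE type_cat='Other';
RUN;

PROC CAS;
   /* Run the glm action */
   regression.glm /
      TABLE={name='cars'},
      model={depVars={{name='MSRP'}},
             /* main effects plus the Horsepower*type_cat interaction */
             effects={{vars={'Horsepower'}},
                      {vars={'type_cat'}},
                      {vars={'Horsepower', 'type_cat'}, interaction='CROSS'}}},
      class={{vars={'type_cat'}}},
      /* write predicted values and residuals to a CAS table */
      OUTPUT={casOut={name='predicted_cars', replace=true}, pred='PredictedMSRP', resid='Residuals'};
RUN;
QUIT;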
Result:
Detailed regression analysis results, parameter estimates, and an output table named `predicted_cars` containing the predicted MSRP and the residuals.
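The description also mentions data partitioning and selection methods, which the examples above do not exercise. The following untested sketch combines `partByFrac` with forward selection, choosing the final model by validation error. It assumes the `cars` table from the previous example exists in the `casuser` caslib; the seed value and the 30% validation fraction are arbitrary illustrative choices.

```sas
proc cas;
   regression.glm /
      table={name='cars', caslib='casuser'},     /* caslib is an assumption */
      class={{vars={'type_cat'}}},
      model={depVars={{name='MSRP'}},
             /* candidate effects, listed one per entry */
             effects={{vars={'Horsepower'}},
                      {vars={'EngineSize'}},
                      {vars={'Weight'}},
                      {vars={'Wheelbase'}},
                      {vars={'type_cat'}}}},
      partByFrac={seed=12345, validate=0.3},     /* hold out 30% of rows for validation */
      selection={method='FORWARD',               /* add effects one step at a time */
                 choose='VALIDATE'};             /* pick the step with the best validation fit */
run;
quit;
```

Each candidate is given as its own entry in `effects` so that only main effects are considered, without relying on the default value of `interaction`.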