decisionTree

forestCode

Description

The forestCode action generates SAS DATA step scoring code from a trained forest model. The generated code can be used to score new data in SAS or in any environment that can run DATA step code. Options control the format of the generated code, select the voting method (majority or probability), and limit the number of trees used for scoring, which can simplify the model or reduce scoring cost.

decisionTree.forestCode <result=results> <status=rc> /
   code={
      casOut={
         caslib="string",
         compress=TRUE | FALSE,
         indexVars={"variable-name-1" <, "variable-name-2", ...>},
         label="string",
         lifetime=64-bit-integer,
         maxMemSize=64-bit-integer,
         memoryFormat="DVR" | "INHERIT" | "STANDARD",
         name="table-name",
         promote=TRUE | FALSE,
         replace=TRUE | FALSE,
         replication=integer,
         tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE",
         threadBlockSize=64-bit-integer,
         timeStamp="string",
         where={"string-1" <, "string-2", ...>}
      },
      comment=TRUE | FALSE,
      fmtWdth=integer,
      indentSize=integer,
      labelId=integer,
      lineSize=integer,
      noTrim=TRUE | FALSE,
      tabForm=TRUE | FALSE
   },
   encodeName=TRUE | FALSE,
   modelTable={
      caslib="string",
      computedOnDemand=TRUE | FALSE,
      computedVars={{...}, {...}},
      computedVarsProgram="string",
      dataSourceOptions={key-1=value-1 <, key-2=value-2, ...>},
      importOptions={fileType="ANY" | ...},
      name="table-name",
      singlePass=TRUE | FALSE,
      vars={{...}, {...}},
      where="where-expression",
      whereTable={...}
   },
   nTree=integer,
   vote="MAJORITY" | "PROB";

Settings
Parameter: Description
modelTable: Specifies the in-memory table that contains the trained forest model. This parameter is required.
code: Specifies options for the generated code, including the output table (casOut) and formatting options such as indentation (indentSize) and line width (lineSize).
nTree: Specifies the number of trees to use for scoring. If not specified, all trees in the model are used.
vote: Specifies the voting strategy for classification: 'MAJORITY' (default) uses the majority class, while 'PROB' uses the average probability.
encodeName: If set to TRUE, encodes the predicted variable names (e.g., using P_Target instead of _DT_P_Target) in the generated code.

Data Preparation
Model Training for Code Generation

Load the 'Cars' dataset and train a forest model to predict the 'Origin' of the car. This model table is required for the forestCode action.

PROC CAS;
   /* Load sample data */
   table.loadTable / path="cars.csv" caslib="samples" casOut={name="cars", replace=true};

   /* Train a forest model; the target is listed in nominals so that it is treated as a class variable */
   decisionTree.forestTrain /
      table={name="cars"}
      target="Origin"
      inputs={"MSRP", "EngineSize", "Cylinders", "Horsepower", "MPG_City"}
      nominals={"Origin"}
      casOut={name="cars_model", replace=true};
RUN;

Examples

Generates the SAS scoring code from the 'cars_model' table and saves it to a table named 'scoring_code'.

SAS® / CAS Code
PROC CAS;
   decisionTree.forestCode /
      modelTable={name="cars_model"}
      code={casOut={name="scoring_code", replace=true}};
RUN;
Result:
A table named 'scoring_code' is created in the active caslib. It contains the generated DATA step code.
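
The code table can then be applied to data. The following is a minimal sketch, not a validated recipe: it assumes the 'cars' and 'scoring_code' tables from the steps above exist in the active caslib, and that the code table written by forestCode can be consumed by the dataStep.runCodeTable action. The output table name 'cars_scored' is hypothetical.

```sas
proc cas;
   /* Run the DATA step code stored in 'scoring_code' against the 'cars' table.
      'cars_scored' is a hypothetical output table name chosen for illustration. */
   dataStep.runCodeTable /
      table={name="cars"}
      codeTable={name="scoring_code"}
      casOut={name="cars_scored", replace=true};

   /* Preview a few scored rows */
   table.fetch / table={name="cars_scored"} to=5;
run;
```

The names of the prediction columns in the scored table depend on the target variable and on the encodeName setting, so inspect the fetched rows rather than assuming specific column names.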

Generates scoring code using probability voting ('PROB'), limiting the model to the first 10 trees, and encoding variable names.

SAS® / CAS Code
PROC CAS;
   decisionTree.forestCode /
      modelTable={name="cars_model"}
      code={
         casOut={name="prob_scoring_code", replace=true},
         noTrim=true,
         indentSize=4
      }
      nTree=10
      vote="PROB"
      encodeName=true;
RUN;
Result:
The table 'prob_scoring_code' is created. It contains SAS code that calculates probabilities from 10 trees, uses encoded variable names, and follows the requested formatting.
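
To check how options such as indentSize and noTrim affected the output, the code table can be inspected like any other CAS table. This is a small sketch that assumes 'prob_scoring_code' exists in the active caslib; the column layout of the code table is not documented here, so the example lists the columns before fetching rows.

```sas
proc cas;
   /* List the columns of the generated code table */
   table.columnInfo / table={name="prob_scoring_code"};

   /* Display the first rows, which hold the generated DATA step text */
   table.fetch / table={name="prob_scoring_code"} to=5;
run;
```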
