decisionTree

forestScore

Description

The forestScore action scores an input table using a previously trained forest model. It generates predicted values and can optionally produce misclassification rates (for classification) or mean squared errors (for regression). The action supports various scoring options including generating predicted probabilities for assessment, handling missing values, calculating variable interaction importance, and outputting the results to a new CAS table.

Settings
Parameter | Description
applyRowOrder | Specifies whether to use a prespecified row ordering. This requires using the orderby and groupby parameters on a preliminary table.partition action call. Alias: reproducibleRowOrder.
assess | When set to True, predicted probabilities are added to the result table for the event levels, enabling use with the assess action.
assessOneRow | When set to True, predicted probabilities for all event levels are included as separate columns (named with prefix _DT_P_) in the result table.
casOut | Specifies the output table settings to store the scored results. If not specified, the action only computes statistics.
copyVars | Specifies the variables to copy from the input table to the output table. Alias: copyVar.
encodeName | Specifies whether to encode the variable names, such as using the prefix P_ instead of _DT_P_ for predicted probabilities.
impute | Specifies how to handle observations with non-missing values for the target. When True, observed values are assumed known without error and used as predicted values.
includeMissing | Specifies whether to include observations with missing values. When False, observations with missing values for model variables are ignored.
isolation | Specifies isolation forest scoring options. Default is FALSE.
modelId | Specifies the variable name for the model ID in the scored table.
modelTable | Specifies the table containing the trained forest model. This is a required parameter. Alias: model.
nTree | Specifies the number of trees to use during scoring. Alias: nTrees.
rbaImp | Specifies whether to calculate variable importance using the random branch assignments (RBA) method.
seed | Specifies the random number generator seed. Set to a positive value for reproducibility.
table | Specifies the input table to be scored. This is a required parameter.
target | Specifies the target variable name. Not required if the target name in the model matches the input table.
treeError | When set to True, computes the error for each tree.
treeVotes | Requests that the scored table be enhanced with information about the votes of individual trees.
varIntImp | Requests variable interaction importance and specifies the maximum degree of interaction (Range: 0-3).
vote | Specifies the voting strategy for classification: 'MAJORITY' (majority vote) or 'PROB' (average probability).
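
Several of these settings can be combined in one call. The following is a minimal sketch, assuming an input table named hmeq and a trained model table named forest_model (the names used in the examples on this page). The output table name hmeq_votes and the values chosen for nTree and seed are illustrative assumptions, not recommendations.

```sas
proc cas;
   decisionTree.forestScore /
      table="hmeq"
      modelTable="forest_model"
      casOut={name="hmeq_votes", replace=true}   /* hypothetical output table name */
      copyVars={"BAD"}
      nTree=10               /* assumes the model contains at least 10 trees */
      includeMissing=true    /* keep rows with missing model variables */
      treeVotes=true         /* add per-tree vote information to the scored table */
      seed=12345;            /* illustrative positive seed for reproducibility */
   /* Look at the first rows to see which columns treeVotes added. */
   table.fetch / table="hmeq_votes" to=5;
run;
```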
Data Preparation
Data Loading and Model Training

Loads the HMEQ sample dataset and trains a forest model to be used in the scoring examples.

proc cas;
   session casauto;
   /* Load the HMEQ sample data into CAS. */
   table.loadTable result=r status=rc / caslib="samplibrary" path="hmeq.csv" casOut={name="hmeq", replace=true};
   /* Train the forest. In forestTrain, casOut names the table that stores the model; forestScore reads it through modelTable. */
   decisionTree.forestTrain / table={name="hmeq", where="BAD is not null"} target="BAD"
      inputs={"LOAN", "MORTDUE", "VALUE", "REASON", "JOB", "YOJ", "DEROG", "DELINQ", "CLAGE", "NINQ", "CLNO", "DEBTINC"}
      nominals={"BAD", "REASON", "JOB"} casOut={name="forest_model", replace=true};
run;

Examples

Scores the HMEQ table using the trained forest model and prints the misclassification rate.

SAS® / CAS Code
proc cas;
   decisionTree.forestScore / table="hmeq" modelTable="forest_model";
run;
Result:
The action output will display the scoring information, including the number of observations read and used, and the misclassification rate for the model.
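
To work with the scoring statistics in code rather than only reading them in the output, the results of the action can be captured in a CASL variable. This is a minimal sketch under the same assumptions (tables hmeq and forest_model already exist). The variable names r and rc are arbitrary, and the names of the result tables are not assumed here, which is why describe is used to list them.

```sas
proc cas;
   /* Capture the results and the status of the action in CASL variables. */
   decisionTree.forestScore result=r status=rc /
      table="hmeq" modelTable="forest_model";
   describe r;   /* show the structure of the results, including result table names */
   print r;      /* print the contents of the results */
   print rc;     /* print the status returned by the action */
run;
```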

Scores the table while generating an output table with probabilities, encoded names, and copied variables. It also computes variable interaction importance.

SAS® / CAS Code
proc cas;
   decisionTree.forestScore /
      table="hmeq"
      modelTable="forest_model"
      casOut={name="hmeq_scored", replace=true}
      copyVars={"LOAN", "BAD"}
      assess=true
      encodeName=true
      vote="PROB"
      varIntImp=1;
   table.fetch / table="hmeq_scored" to=5;
run;
Result:
The action creates a table named 'hmeq_scored' containing the original LOAN and BAD variables, along with the predicted probabilities (prefixed with P_ due to encodeName) and prediction results. The output will also include tables for variable interaction importance.
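
Scored probabilities like these are typically passed on to the assess action. The sketch below is a hedged illustration, not a validated recipe. It scores into a separate, hypothetically named table hmeq_assess with assessOneRow and encodeName enabled, and it assumes the resulting probability columns are named P_BAD1 and P_BAD0 (target BAD with levels 1 and 0). The percentile.assess parameters shown are not described on this page, so the table.columnInfo call is included to confirm the actual column names before relying on them.

```sas
proc cas;
   decisionTree.forestScore /
      table="hmeq"
      modelTable="forest_model"
      casOut={name="hmeq_assess", replace=true}   /* hypothetical output table name */
      copyVars={"BAD"}
      assessOneRow=true
      encodeName=true;
   /* Confirm the probability column names before using them below. */
   table.columnInfo / table="hmeq_assess";
   /* P_BAD1 and P_BAD0 are assumed names; adjust them to match columnInfo. */
   percentile.assess /
      table="hmeq_assess"
      response="BAD"
      event="1"
      inputs={"P_BAD1"}
      pVar={"P_BAD0"}
      pEvent={"0"};
run;
```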

FAQ

What is the primary function of the forestScore action?
What does the 'casOut' parameter specify?
How does the 'assess' parameter affect the output?
What is the effect of setting 'assessOneRow' to True?
How can variables be copied from the input table to the output table?
What is the purpose of the 'impute' parameter?
How are missing values handled by default in the forestScore action?
Which parameter is required to identify the model used for scoring?
What are the options for the 'vote' parameter?
How can I obtain information about the votes of individual trees?
What does the 'varIntImp' parameter control?
How can I ensure reproducible results when using random number generation?