decisionTree

forestScore

Description

The forestScore action scores an input table using a previously trained forest model. It generates predicted values and can optionally produce misclassification rates (for classification) or mean squared errors (for regression). The action supports various scoring options including generating predicted probabilities for assessment, handling missing values, calculating variable interaction importance, and outputting the results to a new CAS table.

Settings
Parameter       Description
applyRowOrder   Specifies whether to use a prespecified row ordering. This requires using the orderby and groupby parameters on a preliminary table.partition action call. Alias: reproducibleRowOrder.
assess          When set to True, predicted probabilities are added to the result table for the event levels, enabling use with the assess action.
assessOneRow    When set to True, predicted probabilities for all event levels are included as separate columns (named with prefix _DT_P_) in the result table.
casOut          Specifies the output table settings to store the scored results. If not specified, the action only computes statistics.
copyVars        Specifies the variables to copy from the input table to the output table. Alias: copyVar.
encodeName      Specifies whether to encode the variable names, such as using the prefix P_ instead of _DT_P_ for predicted probabilities.
impute          Specifies how to handle observations with non-missing values for the target. When True, observed values are assumed known without error and used as predicted values.
includeMissing  Specifies whether to include observations with missing values. When False, observations with missing values for model variables are ignored.
isolation       Specifies isolation forest scoring options. Default is FALSE.
modelId         Specifies the variable name for the model ID in the scored table.
modelTable      Specifies the table containing the trained forest model. This is a required parameter. Alias: model.
nTree           Specifies the number of trees to use during scoring. Alias: nTrees.
rbaImp          Specifies whether to calculate variable importance using the random branch assignments (RBA) method.
seed            Specifies the random number generator seed. Set to a positive value for reproducibility.
table           Specifies the input table to be scored. This is a required parameter.
target          Specifies the target variable name. Not required if the target name in the model matches the input table.
treeError       When set to True, computes the error for each tree.
treeVotes       Requests that the scored table be enhanced with information about the votes of individual trees.
varIntImp       Requests variable interaction importance and specifies the maximum degree of interaction (Range: 0-3).
vote            Specifies the voting strategy for classification: 'MAJORITY' (majority vote) or 'PROB' (average probability).
Data Preparation
Data Loading and Model Training

Loads the HMEQ sample dataset and trains a forest model to be used in the scoring examples.

proc cas;
   session casauto;
   /* Load the HMEQ sample data into CAS */
   table.loadTable result=r status=rc /
      caslib="samplibrary" path="hmeq.csv" casOut={name="hmeq", replace=true};
   /* Train a forest on rows where the target BAD is present */
   decisionTree.forestTrain /
      table={name="hmeq", where="BAD is not null"} target="BAD"
      inputs={"LOAN", "MORTDUE", "VALUE", "REASON", "JOB", "YOJ", "DEROG", "DELINQ", "CLAGE", "NINQ", "CLNO", "DEBTINC"}
      nominals={"BAD", "REASON", "JOB"}
      modelTable={name="forest_model", replace=true};
run;

Examples

Scores the HMEQ table using the trained forest model and prints the misclassification rate.

SAS® / CAS Code (awaiting community validation)

proc cas;
   /* No casOut is given, so the action only computes statistics */
   decisionTree.forestScore /
      table="hmeq"
      modelTable="forest_model";
run;
Result:
The action output reports the scoring information, including the number of observations read, the number of observations used, and the misclassification rate for the model.
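To use the reported statistics in later CASL statements rather than only reading them in the output, the results can be captured in a variable, in the same way the Data Preparation step uses RESULT=r and STATUS=rc with loadTable. The following is a minimal sketch under stated assumptions: the variable names r and rc are arbitrary, a status severity of 0 or 1 is taken to mean success or warning, and the names of the result tables inside r are not assumed, so the sketch inspects them with describe instead of referencing a specific key.

```sas
proc cas;
   /* Capture the results in r and the action status in rc */
   decisionTree.forestScore result=r status=rc /
      table="hmeq"
      modelTable="forest_model";

   /* Assumption: severity 0 = OK, 1 = warning, anything higher = error */
   if rc.severity <= 1 then do;
      describe r;   /* list the result tables that came back */
      print r;      /* print their contents */
   end;
   else print "forestScore did not complete: " rc.status;
run;
```

Once describe has shown which result tables are present, individual tables can be referenced by name from r in subsequent statements.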

Scores the table while generating an output table with probabilities, encoded names, and copied variables. It also computes variable interaction importance.

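SAS® / CAS Code (awaiting community validation)

proc cas;
   decisionTree.forestScore /
      table="hmeq"
      modelTable="forest_model"
      casOut={name="hmeq_scored", replace=true}   /* store the scored rows */
      copyVars={"LOAN", "BAD"}                    /* carry these columns over */
      assess=true                                 /* add event-level probabilities */
      encodeName=true                             /* use the P_ prefix */
      vote="PROB"                                 /* average probability voting */
      varIntImp=1;                                /* maximum interaction degree */
   /* Show the first five scored rows */
   table.fetch / table="hmeq_scored" to=5;
run;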
Result:
The action creates a table named hmeq_scored that contains the copied LOAN and BAD variables, the predicted probabilities (prefixed with P_ because encodeName is set), and the predictions. The results also include the variable interaction importance tables.
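The assess and assessOneRow parameters exist so that the scored table can be handed to the assess action. The sketch below shows one way that hand-off might look. It rests on assumptions that should be checked against your own output: that assessOneRow together with encodeName yields one probability column per target level named P_BAD1 and P_BAD0, and that BAD is coded as 0 and 1. The table name hmeq_assess_in is hypothetical. The table.columnInfo call is included so the actual column names can be confirmed before they are passed to percentile.assess.

```sas
proc cas;
   /* Score with one row per observation and one probability column per level */
   decisionTree.forestScore /
      table="hmeq"
      modelTable="forest_model"
      casOut={name="hmeq_assess_in", replace=true}   /* hypothetical table name */
      copyVars={"BAD"}
      assessOneRow=true
      encodeName=true;

   /* Confirm the probability column names before relying on them */
   table.columnInfo / table="hmeq_assess_in";

   /* Assumed column names: P_BAD1 = P(BAD=1), P_BAD0 = P(BAD=0) */
   percentile.assess /
      table="hmeq_assess_in"
      inputs={{name="P_BAD1"}}
      response="BAD"
      event="1"
      pVar={"P_BAD0"}
      pEvent={"0"};
run;
```

If columnInfo reports different names, substitute them in the inputs and pVar arguments before running the assess call.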

FAQ

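What is the primary function of the forestScore action?
It scores an input table using a previously trained forest model and generates predicted values, optionally with misclassification rates (classification) or mean squared errors (regression).

What does the 'casOut' parameter specify?
The output table settings used to store the scored results. If it is not specified, the action only computes statistics.

How does the 'assess' parameter affect the output?
When set to True, predicted probabilities for the event levels are added to the result table so that it can be used with the assess action.

What is the effect of setting 'assessOneRow' to True?
Predicted probabilities for all event levels are included as separate columns in the result table, named with the prefix _DT_P_.

How can variables be copied from the input table to the output table?
List them in the copyVars parameter (alias: copyVar).

What is the purpose of the 'impute' parameter?
It controls how observations with non-missing target values are handled. When True, the observed values are assumed known without error and are used as the predicted values.

How are missing values handled by default in the forestScore action?
Missing-value handling is controlled by includeMissing. When it is False, observations with missing values for model variables are ignored.

Which parameter is required to identify the model used for scoring?
modelTable (alias: model), which names the table containing the trained forest model.

What are the options for the 'vote' parameter?
'MAJORITY' (majority vote) or 'PROB' (average probability).

How can I obtain information about the votes of individual trees?
Set treeVotes, which adds information about the votes of individual trees to the scored table.

What does the 'varIntImp' parameter control?
It requests variable interaction importance and sets the maximum degree of interaction (range: 0-3).

How can I ensure reproducible results when using random number generation?
Set seed to a positive value. A prespecified row ordering can also be requested with applyRowOrder.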