mbc

mbcScore

Description

The mbcScore action processes a model created by the mbcFit action and saved in an item store. It produces clustering weights and other observation-wise statistics. This is useful for applying a trained clustering model to new data to determine the cluster membership probabilities for each observation.

mbc.mbcScore result=<results> status=<rc> /
    allstats=TRUE | FALSE,
    casOut={<casouttable>},
    copyVars="ALL" | "ALL_NUMERIC" | {"variable-name-1" <, "variable-name-2", ...>},
    currClus="string",
    display={<displayTables>},
    loglik="string",
    maxpost="string",
    nextClus="string",
    outputTables={<outputTables>},
    pred="string",
    restore={<castable>},
    role="string",
    table={<castable>};

Settings
| Parameter | Description |
| --- | --- |
| allstats | When set to True, adds all statistics to the output table. This includes predicted values, cluster log-likelihoods, and posterior probabilities. |
| casOut | Specifies the output table to store the scoring results. |
| copyVars | Specifies a list of variables to be copied from the input data table to the output table. You can specify 'ALL' to copy all variables. |
| currClus | Specifies a prefix for naming the cluster membership probability estimates from the expectation (E) step that produced the final model estimates. |
| display | Specifies a list of results tables to be displayed. |
| loglik | Specifies a prefix for naming the cluster log-likelihood variables in the output table. |
| maxpost | Specifies a name for the variable that will contain the maximum posterior probability cluster. |
| nextClus | Specifies a prefix for naming the cluster membership probability estimates from an extra expectation (E) step that uses the final model estimates. |
| outputTables | Lists the names of results tables to be saved as CAS tables on the server. |
| pred | Specifies a prefix for naming the predicted value variables in the output table. |
| restore | Specifies the input item store that contains the model information for scoring. This item store is created by the mbcFit action. |
| role | Specifies the name of the column that contains the observation role. |
| table | Specifies the input data table to be scored. |
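
The naming parameters listed above can be combined in a single call. The following is a sketch only, not validated code: it assumes that a table named `score_data` and an item store named `modelstore` already exist in the active caslib, and that the string parameters can be set without `allstats`. The output table name `scored_named` and the values `Curr`, `Next`, `LogLik`, `MaxPost`, and `Pred` are illustrative choices, not documented defaults.

```sas
proc cas;
   mbc.mbcScore /
      table={name='score_data'},
      restore={name='modelstore'},
      casOut={name='scored_named', replace=true},
      copyVars={'x1', 'x2', 'x3'},   /* copy only the listed input variables */
      currClus='Curr',               /* prefix: weights from the final E step */
      nextClus='Next',               /* prefix: weights from an extra E step */
      loglik='LogLik',               /* prefix: cluster log-likelihoods */
      maxpost='MaxPost',             /* name: maximum posterior probability cluster */
      pred='Pred';                   /* prefix: predicted values */
run;
quit;
```

Choosing explicit names makes the output columns easier to find later than relying on whatever names the action assigns by default.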
Data Preparation
Data Creation for Scoring

To use the mbcScore action, you first need a model that has been trained using the mbcFit action and saved to an item store. Then, you need the new data that you want to score. This example first creates a sample dataset `mycas.getstarted`, trains a model, and saves it to `mycas.modelstore`. It then creates a second dataset, `mycas.score_data`, to demonstrate the scoring process.


    /* Assumes an active CAS session and a CAS libref named mycas. */
    DATA mycas.getstarted;
       keep x1-x10;
       array x{10};
       DO i = 1 to 1000;
          DO j = 1 to 10;
             x{j} = rannor(12345);
          END;
          OUTPUT;
       END;
    RUN;

    /* Fit a two-cluster model on x1-x3 and save it to an item store. */
    PROC CAS;
       mbc.mbcFit /
          TABLE='getstarted',
          effects={'x1', 'x2', 'x3'},
          nClusters=2,
          store={name='modelstore', replace=true};
    RUN;
    QUIT;

    /* New observations to score; only the model variables x1-x3 are kept. */
    DATA mycas.score_data;
       keep x1-x3;
       array x{3};
       DO i = 1 to 100;
          DO j = 1 to 3;
             x{j} = rannor(54321);
          END;
          OUTPUT;
       END;
    RUN;
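
The data preparation code above refers to a libref named `mycas` without defining it. A minimal setup sketch that could be run beforehand is shown here. The session name `mysess` is a placeholder, and whether the `mbc` action set needs to be loaded explicitly depends on the environment, so treat that step as an assumption.

```sas
/* Start a CAS session; the session name mysess is a placeholder. */
cas mysess;

/* Assign a CAS engine libref named mycas that points at the session's active caslib. */
libname mycas cas;

/* Load the mbc action set in case it is not already available in the session. */
proc cas;
   loadactionset 'mbc';
run;
quit;
```

Because the libref uses the active caslib, tables written through `mycas` are the same tables that the actions reference by name inside PROC CAS.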

Examples

This example demonstrates a simple use case of the mbcScore action. It scores the `mycas.score_data` table using the model stored in `mycas.modelstore` and saves the output to a table named `mycas.scored_output`.

SAS® / CAS Code (awaiting community validation)

    PROC CAS;
       mbc.mbcScore /
          TABLE={name='score_data'},
          restore={name='modelstore'},
          casOut={name='scored_output', replace=true};
    RUN;
    QUIT;

Result:
The action creates a new table `mycas.scored_output` that holds the scoring information for each observation in `score_data`, such as the predicted cluster. Input columns are copied to the output table only when they are requested with `copyVars`.
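
To see which columns the action actually wrote, the output table can be inspected with the `table.columnInfo` action. This is a small sketch that assumes the scoring call above has run and that `scored_output` is in the active caslib.

```sas
proc cas;
   /* List the columns that mbcScore wrote to the output table. */
   table.columnInfo / table={name='scored_output'};
run;
quit;
```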

This example performs scoring and requests all available output statistics. It uses the `allStats` parameter to generate detailed output, including cluster log-likelihoods and posterior probabilities, and copies all variables from the input table to the output table.

SAS® / CAS Code (awaiting community validation)

    PROC CAS;
       mbc.mbcScore /
          TABLE={name='score_data'},
          restore={name='modelstore'},
          casOut={name='scored_output_detailed', replace=true},
          allStats=true,
          copyVars='ALL';
    RUN;
    QUIT;

Result:
An output table named `scored_output_detailed` is created. In addition to the predicted cluster, it contains columns for the log-likelihood of each cluster, the posterior probability for each cluster, and all original variables from the `score_data` table.
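
A quick way to review the detailed output is to fetch a few rows. The sketch below assumes that the scoring call above has run and that `scored_output_detailed` is in the active caslib; the row count of 5 is an arbitrary choice.

```sas
proc cas;
   /* Show the first five rows of the detailed scoring output. */
   table.fetch / table={name='scored_output_detailed'}, to=5;
run;
quit;
```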

FAQ

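What is the primary function of the mbcScore action?
It scores data with a model that was created by the mbcFit action and saved in an item store, producing clustering weights and other observation-wise statistics.

Which parameter is mandatory for specifying the data to be scored?
The `table` parameter specifies the input data table to be scored.

How do you specify the model to use for scoring?
Use the `restore` parameter to name the item store that the mbcFit action created.

What is the purpose of the 'casOut' parameter?
It specifies the output table that stores the scoring results.

How can you add all available statistics to the output table?
Set `allstats` to TRUE. The output table then includes predicted values, cluster log-likelihoods, and posterior probabilities.

What does the 'copyVars' parameter do?
It lists the variables to copy from the input data table to the output table. Specify 'ALL' to copy all variables.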