bart

bartScore

At a glance
Deploying a machine learning model into a production environment requires a seamless mechanism to generate predictions on fresh, unseen data. The bartScore action serves as this bridge within SAS Viya, enabling analysts to execute pre-trained Bayesian Additive Regression Tree models against new input tables. This scoring phase is not merely about obtaining a target value; it provides a comprehensive statistical view, including posterior variability and credibility intervals, which are crucial for risk-aware decision-making. To assist with your deployment pipelines, we have compiled a set of Frequently Asked Questions covering syntax usage, output table schema, and performance optimization during the scoring process.

Description

The bartScore action scores a data table using a previously fitted Bayesian additive regression trees (BART) model. It generates predicted values and credible limits for each observation in the input data, and residuals when the table also contains the target variable.

bart.bartScore { alpha=double, avgOnly=boolean, casOut={...}, copyVars={...}, into="string", intoCutpt=double, lcl="string", pred="string", resid="string", restore={...}, seed=long, table={...}, ucl="string" }
Settings
Parameter Description
alpha Specifies the significance level for constructing equal-tail credible limits. Default is 0.05.
avgOnly When set to FALSE, includes predictions from each MCMC sample in the output, not just the average. Default is TRUE.
casOut Specifies the output table to store observation-wise statistics.
copyVars A list of variables to copy from the input data table to the output table.
into Specifies the name of the variable for predicted class labels in classification models.
intoCutpt Specifies the cutoff probability for classifying observations. Default is 0.5.
lcl Specifies the name for the lower credible limit variable.
pred Specifies the name for the predicted value variable. Default is 'Pred'.
resid Specifies the name for the residual variable.
restore Specifies the input CAS table that contains the fitted model information from a previous training run (e.g., from bartGauss or bartProbit).
seed Specifies the seed for the pseudorandom number generator. Default is 0, which selects a random seed.
table Specifies the input data table to be scored.
ucl Specifies the name for the upper credible limit variable.
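The settings above include two parameters, into and intoCutpt, that apply only to classification models and are not exercised by the examples on this page. The following is a minimal sketch of how they might be combined, using only parameters listed above. The table names 'loan_apps' and 'probit_model', the variable name 'Predicted_Class', and the cutoff of 0.3 are hypothetical; the sketch assumes a model was previously trained and stored with bartProbit.

```sas
PROC CAS;
   /* Hypothetical: 'probit_model' is a model table stored by bartProbit,
      and 'loan_apps' is the table to classify */
   bart.bartScore /
      table='loan_apps',
      restore='probit_model',
      casOut={name='loan_apps_scored', replace=true},
      into='Predicted_Class',   /* variable that receives the class label */
      intoCutpt=0.3;            /* cutoff probability (default is 0.5) */
QUIT;
```

Moving the cutoff away from 0.5 changes how many observations are assigned to each class, which can matter when the two kinds of misclassification have different costs.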
Data Preparation
Data Creation for Scoring

This example first creates a training dataset 'mycas.train_data' and a scoring dataset 'mycas.score_data'. A BART model is trained using the 'bartGauss' action, and the model is saved to 'mycas.bart_model'. The 'score_data' table is then used for scoring.

/* Assumes an active CAS session and a CAS libref named mycas,
   for example: libname mycas cas; */
DATA mycas.train_data;
   DO i = 1 to 100;
      x1 = rand('UNIFORM');
      x2 = rand('UNIFORM');
      x3 = rand('NORMAL');
      y = 10 * sin(3.14 * x1 * x2) + 20 * (x3 - 0.5)**2 + 10 * x1 + 5 * x2 + rand('NORMAL');
      OUTPUT;
   END;
RUN;

/* Scoring data: same inputs, no target */
DATA mycas.score_data;
   DO i = 1 to 50;
      x1 = rand('UNIFORM');
      x2 = rand('UNIFORM');
      x3 = rand('NORMAL');
      OUTPUT;
   END;
RUN;

PROC CAS;
   /* Fit a Gaussian BART model and store it for later scoring.
      The store parameter is used here as the counterpart of
      restore in bartScore. */
   bart.bartGauss /
      table='train_data',
      inputs={{name='x1'}, {name='x2'}, {name='x3'}},
      target='y',
      store={name='bart_model', replace=true};
QUIT;

Examples

Scores the 'score_data' table using the saved model 'bart_model' and saves the results to 'mycas.bart_scored_simple'.

SAS / CAS code (awaiting community validation)

PROC CAS;
   bart.bartScore /
      table='score_data',
      restore='bart_model',
      casOut={name='bart_scored_simple', replace=true};
QUIT;

Result:
An output table 'mycas.bart_scored_simple' is created containing the predicted values for each observation in 'mycas.score_data'.
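To check what the action actually wrote, the output table can be inspected with the general-purpose table actions. This is a minimal sketch, assuming the example above has run in the same CAS session and that 'bart_scored_simple' is in the active caslib.

```sas
PROC CAS;
   /* List the columns that bartScore wrote to the output table */
   table.columnInfo / table={name='bart_scored_simple'};

   /* Print the first five scored rows */
   table.fetch / table={name='bart_scored_simple'}, to=5;
QUIT;
```

Because this call uses no copyVars, the output rows carry no identifying columns from 'score_data'; the next example shows how to copy them.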

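This example scores the 'score_data' table, computes 90% credible limits (alpha=0.1), and copies the original input variables to the output. It also sets custom names for the predicted value, lower limit, upper limit, and residual variables. Because a residual is the observed target minus the prediction, the residual column is meaningful only if the scored table contains the target y, which 'score_data' as created above does not.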

PROC CAS;
   bart.bartScore /
      table='score_data',
      restore='bart_model',
      casOut={name='bart_scored_detailed', replace=true},
      copyVars={'x1', 'x2', 'x3'},
      alpha=0.1,
      pred='Predicted_Y',
      lcl='Lower_CI',
      ucl='Upper_CI',
      resid='Residuals';
QUIT;

Result:
An output table 'mycas.bart_scored_detailed' is created. It includes the original variables (x1, x2, x3), the predicted values ('Predicted_Y'), the 90% lower ('Lower_CI') and upper ('Upper_CI') credible limits, and the residuals ('Residuals').
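One way to use the credible limits is to look at how wide the interval is for each observation, since wide intervals flag predictions the model is less sure about. The sketch below assumes the mycas libref from the data preparation step and the column names chosen in the example above; the table name 'bart_scored_width' and the variable 'ci_width' are introduced here for illustration only.

```sas
/* Add the width of the 90% credible interval to each scored row */
DATA mycas.bart_scored_width;
   SET mycas.bart_scored_detailed;
   ci_width = Upper_CI - Lower_CI;
RUN;

PROC CAS;
   /* Descriptive statistics for the predictions and interval widths */
   simple.summary /
      table={name='bart_scored_width'},
      inputs={'Predicted_Y', 'ci_width'};
QUIT;
```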

This example scores the data and includes the predictions from every MCMC sample in the output table, in addition to the average prediction. This is useful for detailed posterior distribution analysis.

PROC CAS;
   bart.bartScore /
      table='score_data',
      restore='bart_model',
      casOut={name='bart_scored_allsamples', replace=true},
      avgOnly=false;
QUIT;

Result:
The output table 'mycas.bart_scored_allsamples' will contain the average predicted value ('Pred') and additional columns holding the prediction from each individual MCMC sample (for example, _S_1, _S_2).
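To make the link between per-sample predictions, the average prediction, and equal-tail credible limits concrete, here is a small self-contained sketch on made-up numbers. It does not read the action's output: the ten values in the data line stand in for posterior samples of a single observation, and the variable names s1-s10 are hypothetical. Whether bartScore uses the same percentile definition as the PCTL function is an assumption.

```sas
/* Toy illustration: ten made-up posterior samples for one observation */
DATA limits;
   INPUT s1-s10;
   alpha = 0.1;
   /* Average prediction across the samples */
   pred = mean(of s1-s10);
   /* Equal-tail limits: alpha/2 of the samples fall below lcl
      and alpha/2 fall above ucl */
   lcl = pctl(100 * alpha / 2, of s1-s10);
   ucl = pctl(100 * (1 - alpha / 2), of s1-s10);
   DATALINES;
4.1 3.8 4.4 4.0 3.9 4.3 4.2 3.7 4.5 4.1
;
RUN;

PROC PRINT DATA=limits;
   VAR pred lcl ucl;
RUN;
```

With alpha=0.1 the limits are the 5th and 95th percentiles of the samples; a smaller alpha moves them further into the tails and widens the interval.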

FAQ

What is the purpose of the bartScore action?
Which parameter is required to specify the output table for the scoring results?
How do you specify the fitted model to use for scoring?
What does the `avgOnly` parameter control?
How can I copy variables from the input data table to the output table?
What is the function of the `alpha` parameter?
How can I name the predicted values, residuals, and credible limits in the output table?

Associated Scenarios

Use Case
Standard Case: Predicting Customer Lifetime Value (CLV)

A retail company has built a BART model to predict the potential lifetime value of new customers based on their initial demographic data and first-week purchase behavior. The go...

Use Case
Performance Test: Full Posterior Distribution for Equipment Failure Analysis

An engineering team is analyzing sensor data from industrial equipment to predict time-to-failure. For a critical component, they need not just the average prediction but the en...

Use Case
Edge Case: Scoring Loan Default Risk with Missing Data

A financial institution uses a BART classification model (trained with bartProbit) to assess loan default risk. The scoring process must be robust and handle incoming applicatio...