bart bartScore

Performance Test: Full Posterior Distribution for Equipment Failure Analysis

Test Scenario & Use Case

Business Context

An engineering team is analyzing sensor data from industrial equipment to predict time-to-failure. For a critical component, they need not just the average prediction but the entire posterior distribution of predictions from the MCMC samples to perform advanced risk analysis and understand the full range of uncertainty.
About the Set: bart

Bayesian Additive Regression Trees models.

Data Preparation

Generate a large training dataset (50,000 rows) and a scoring dataset (10,000 rows) that simulate sensor readings. The training table carries the target 'time_to_failure'; the scoring table contains only the three sensor inputs.

/* Assumes an active CAS session and a CAS libref named mycas. */
/* Training data: 50,000 simulated readings, linear signal plus Gaussian noise. */
DATA mycas.sensor_train_large;
  call streaminit(789);
  DO i = 1 to 50000;
    vibration = rand('NORMAL', 5, 1.5);
    temperature = rand('NORMAL', 80, 10);
    pressure = rand('NORMAL', 100, 5);
    time_to_failure = 1000 - (temperature*5) - (vibration*20) - (pressure*1.5) + rand('NORMAL', 0, 50);
    OUTPUT;
  END;
RUN;

/* Scoring data: 10,000 readings from slightly shifted distributions, no target. */
DATA mycas.sensor_score_large;
  call streaminit(101);
  DO i = 1 to 10000;
    vibration = rand('NORMAL', 5.5, 1.5);
    temperature = rand('NORMAL', 85, 10);
    pressure = rand('NORMAL', 102, 5);
    OUTPUT;
  END;
RUN;
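Before training, it can help to confirm that the simulated data looks as intended. From the generating formula, the expected mean of time_to_failure is 1000 - 5(80) - 20(5) - 1.5(100) = 350, and its standard deviation is roughly 77 (the square root of 25(100) + 400(2.25) + 2.25(25) + 2500 = 5956.25). The following is a minimal sketch, assuming the libref mycas is assigned to the CAS caslib that holds the table:

```sas
/* Sketch: summary statistics for the simulated training table.     */
/* Assumes the libref mycas points at the caslib holding the table. */
proc means data=mycas.sensor_train_large n mean std min max;
  var vibration temperature pressure time_to_failure;
run;
```

The reported mean and standard deviation of time_to_failure should land close to the values derived above, and N should be 50000.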

Étapes de réalisation

1
Train the BART model on the large sensor dataset.
PROC CAS;
  /* Fit a Gaussian-response BART model and store it for later scoring. */
  bart.bartGauss /
    table='sensor_train_large',
    inputs={{name='vibration'}, {name='temperature'}, {name='pressure'}},
    target='time_to_failure',
    store={name='failure_model', replace=true};
QUIT;
2
Score the large scoring table, setting avgOnly=false to request the prediction from every MCMC sample rather than only their average. This tests performance and the ability to generate a wide output table.
PROC CAS;
  /* avgOnly=false adds one prediction column per MCMC sample. */
  bart.bartScore /
    table='sensor_score_large',
    restore='failure_model',
    casOut={name='failure_posterior_preds', replace=true},
    avgOnly=false;
QUIT;
3
Check the output table's columns to confirm the presence of individual sample predictions.
PROC CAS;
  table.columnInfo / table='failure_posterior_preds';
QUIT;
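The expected result also specifies a row count, which the column listing does not verify. A small sketch of that check follows, assuming the output table lives in the active caslib; the result variable name r is arbitrary:

```sas
proc cas;
  /* table.recordCount returns the number of rows in the table. */
  table.recordCount result=r / table='failure_posterior_preds';
  print r;
quit;
```

The printed count should match the 10,000 rows of the scoring table.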

Expected Result


The action should execute successfully on the 10,000-row scoring table and create the output table 'mycas.failure_posterior_preds'. The column information should show the default 'Pred' column plus one column per MCMC sample prediction (e.g., '_S_1', '_S_2', '_S_3', ...). The number of sample columns depends on how many MCMC samples the training step saved, which is the default here because step 1 sets no sampling options. The table should have 10,000 rows.
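Once the sample columns are confirmed, they can be reduced to per-row uncertainty summaries for the risk analysis described in the business context. The sketch below is illustrative only: it assumes the sample columns share the prefix _S_ (verify the actual names in the columnInfo output first), and the output table name failure_risk_summary is hypothetical.

```sas
/* Sketch: per-row 90% interval and median from the posterior samples. */
/* Assumes sample columns are named with the prefix _S_; adjust the    */
/* prefix if columnInfo shows different names.                         */
data mycas.failure_risk_summary;   /* hypothetical output table name */
  set mycas.failure_posterior_preds;
  array s{*} _S_: ;                /* every column starting with _S_ */
  p05 = pctl(5,  of s{*});         /* 5th percentile across samples  */
  p50 = pctl(50, of s{*});         /* posterior median               */
  p95 = pctl(95, of s{*});         /* 95th percentile across samples */
  keep Pred p05 p50 p95;
run;
```

Rows with a low p05 indicate equipment whose plausible time-to-failure extends well below the average prediction, which is the kind of information that the 'Pred' column alone cannot provide.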