bart bartScore

Standard Case: Predicting Customer Lifetime Value (CLV)

Test Scenario & Use Case

Business Context

A retail company has built a BART model to predict the potential lifetime value of new customers based on their initial demographic data and first-week purchase behavior. The goal is to score a new batch of customers to identify high-value prospects for a premium loyalty program.

About the Set: bart

The bart action set provides Bayesian additive regression trees (BART) models.

Data Preparation

Create a training table with customer demographics, first-week spend, and the target 'clv_actual', plus a scoring table for new customers that has the same predictors but no target. A bartGauss model is then trained to predict 'clv_actual'.

/* Training data: 2,000 simulated customers with a known CLV */
data mycas.customer_train;
   call streaminit(123);
   do customer_id = 1 to 2000;
      age = 20 + floor(rand('UNIFORM') * 45);
      first_week_spend = 50 + rand('NORMAL', 150, 75);
      num_visits = 1 + floor(rand('UNIFORM') * 10);
      clv_actual = (first_week_spend * 2.5) + (age * 15) - (num_visits * 20) + rand('NORMAL', 0, 500);
      output;
   end;
run;

/* Scoring data: 500 new customers, no clv_actual column */
data mycas.customer_score;
   call streaminit(456);
   do customer_id = 2001 to 2500;
      age = 18 + floor(rand('UNIFORM') * 50);
      first_week_spend = 60 + rand('NORMAL', 120, 60);
      num_visits = 1 + floor(rand('UNIFORM') * 8);
      output;
   end;
run;
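
Before training, it can help to confirm that the simulated tables look as intended. The following is a minimal sketch, assuming the mycas libref used above is assigned to an active CAS session; the choice of statistics is illustrative and not part of the original scenario.

```sas
/* Illustrative sanity check of the simulated training data (not from the original scenario) */
proc means data=mycas.customer_train n nmiss mean std min max;
   var age first_week_spend num_visits clv_actual;
run;

/* The scoring table should list the same predictors but no clv_actual column */
proc contents data=mycas.customer_score;
run;
```

Given the generating code, age in the training table should fall between 20 and 64, num_visits between 1 and 10, and first_week_spend should average roughly 200 (50 plus a normal draw with mean 150).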

Implementation Steps

1. Train the BART model using bartGauss and save the model state.
proc cas;
   bart.bartGauss /
      table='customer_train',
      inputs={{name='age'}, {name='first_week_spend'}, {name='num_visits'}},
      target='clv_actual',
      /* save the fitted model so that bartScore can restore it */
      saveState={name='clv_model', replace=true};
quit;

2. Score the new customers using bartScore, generating custom-named 99% credible limits and copying customer identifiers to the output table.
proc cas;
   bart.bartScore /
      table='customer_score',
      restore='clv_model',
      casOut={name='clv_predictions', replace=true},
      copyVars={'customer_id', 'age'},
      /* alpha=0.01 requests 99% credible limits */
      alpha=0.01,
      pred='Predicted_CLV',
      lcl='Lower_99_CLV',
      ucl='Upper_99_CLV',
      resid='Prediction_Error';
quit;

3. Verify the output table structure and content.
proc cas;
   table.columnInfo / table='clv_predictions';
   table.fetch / table='clv_predictions', to=10;
quit;
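
Since the business goal is to pick prospects for the premium loyalty program, one way to use the scored table is to flag customers whose lower credible limit clears a cutoff. This is only a sketch: the cutoff of 1000, the output table name clv_prospects, and the high_value flag are hypothetical and not part of the original example. It also assumes the mycas libref points at the caslib that holds clv_predictions.

```sas
/* Hypothetical follow-up: flag prospects whose 99% lower credible limit exceeds a cutoff */
data mycas.clv_prospects;
   set mycas.clv_predictions;
   /* 1000 is an assumed cutoff chosen for illustration only */
   high_value = (Lower_99_CLV > 1000);
run;

/* Count how many customers were flagged (1) versus not flagged (0) */
proc freq data=mycas.clv_prospects;
   tables high_value;
run;
```

Using the lower limit rather than the point prediction is a conservative choice: a customer is flagged only when even the pessimistic end of the credible interval is above the cutoff.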

Expected Result


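The output table 'clv_predictions' should be created in the active caslib (visible as 'mycas.clv_predictions' through the libref). It must contain the 'customer_id' and 'age' columns copied from the input table, plus the newly generated columns 'Predicted_CLV', 'Lower_99_CLV', 'Upper_99_CLV', and 'Prediction_Error', with one row for each of the 500 customers in the scoring table. Because alpha=0.01, the two limit columns hold 99% credible limits. Note that a residual depends on the observed target, and the scoring table has no 'clv_actual' column, so 'Prediction_Error' may contain missing values.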
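
The expectations above can be checked with a row count and a missing-value summary. This is a minimal sketch, assuming the simple action set is loaded in the session and that all four generated columns exist in the output table; if 'Prediction_Error' is absent, drop it from the inputs list.

```sas
proc cas;
   /* Row count: 500 scored customers are expected */
   table.recordCount / table='clv_predictions';

   /* N and NMISS per generated column show whether values are populated */
   simple.summary /
      table='clv_predictions',
      inputs={'Predicted_CLV', 'Lower_99_CLV', 'Upper_99_CLV', 'Prediction_Error'},
      subSet={'N', 'NMISS', 'MIN', 'MAX'};
quit;
```

For each row, 'Lower_99_CLV' should not exceed 'Predicted_CLV', which in turn should not exceed 'Upper_99_CLV'; the MIN and MAX columns give a quick first impression of whether the ranges are plausible.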