Scénario de test & Cas d'usage
Classification using Bayesian networks.
Discover all actions of bayesianNetClassifierCreation of a customer dataset with demographic, contract, and service information. The 'HasTechSupport' variable contains some missing values to test the handling of missing data. The target variable is 'Churn'.
| 1 | DATA casuser.telecom_churn; |
| 2 | LENGTH Contract $12. HasTechSupport $3. Churn $3.; |
| 3 | INFILE DATALINES delimiter=','; |
| 4 | INPUT CustomerID $ TenureMonths MonthlyCharges Contract $ HasTechSupport $ Churn $; |
| 5 | DATALINES; |
| 6 | C001,12,55.5,Month-to-Month,Yes,No |
| 7 | C002,24,80.0,One Year,No,No |
| 8 | C003,5,20.0,Month-to-Month,No,Yes |
| 9 | C004,55,105.2,Two Year,Yes,No |
| 10 | C005,3,45.0,Month-to-Month,,Yes |
| 11 | C006,36,95.0,One Year,Yes,No |
| 12 | C007,1,25.0,Month-to-Month,No,Yes |
| 13 | C008,68,110.5,Two Year,,No |
| 14 | C009,15,75.5,One Year,No,No |
| 15 | C010,8,30.0,Month-to-Month,No,Yes |
| 16 | ; |
| 17 | RUN; |
| 1 | PROC CAS; |
| 2 | bayesianNetClassifier.bnet |
| 3 | TABLE={name='telecom_churn'} |
| 4 | target='Churn' |
| 5 | inputs={'TenureMonths', 'MonthlyCharges', 'Contract', 'HasTechSupport'} |
| 6 | nominals={'Contract', 'HasTechSupport', 'Churn'} |
| 7 | partByFrac={train=0.7, validate=0.3, seed=1234} |
| 8 | structures={'TAN'} |
| 9 | missingNom='LEVEL' |
| 10 | saveState={name='churn_model_state', replace=true} |
| 11 | OUTPUT={casout={name='churn_scored_output', replace=true}, copyVars={'CustomerID', 'Churn'}} |
| 12 | outNetwork={name='churn_network_structure', replace=true}; |
| 13 | RUN; |
| 1 | |
| 2 | PROC CAS; |
| 3 | TABLE.tableInfo / caslib='casuser', name~='churn%'; |
| 4 | RUN; |
| 5 |
The action should successfully train a TAN model. Three tables must be created: 'churn_model_state' containing the binary model for scoring, 'churn_scored_output' with the original CustomerID, actual Churn value, and the predicted churn probabilities, and 'churn_network_structure' detailing the learned dependencies. The model should handle the missing 'HasTechSupport' values by treating them as a distinct category.