Test scenario & use case
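Creates a dataset 'clinical_trial' of 500 simulated patients in which the predictor 'biomarker_b' has approximately 30% missing values (shown as '.', the SAS numeric missing value). The target is 'treatment_response'; the code lowers it by 15 for patients whose 'biomarker_b' is missing, so the missingness itself carries signal.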
/* Simulate 500 patients; about 30% of biomarker_b values are set to missing */
DATA clinical_trial;
   call streaminit(99);
   DO patient_id = 1 to 500;
      age = 30 + rand('UNIFORM') * 40;
      drug_dosage = rand('UNIFORM') * 100;
      biomarker_a = 10 + rand('NORMAL', 0, 2);
      biomarker_b = 25 + rand('NORMAL', 0, 5);
      /* Compute the response while biomarker_b is still observed; otherwise the sum would be missing */
      treatment_response = 50 + (biomarker_a - 10)*3 + (biomarker_b - 25)*2 - (age-30)*0.5 + rand('NORMAL', 0, 10);
      /* Mask biomarker_b and lower the response so that missingness is informative */
      IF rand('UNIFORM') < 0.3 THEN DO;
         call missing(biomarker_b);
         treatment_response = treatment_response - 15;
      END;
      OUTPUT;
   END;
RUN;
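Before loading the data, it can help to confirm that the simulated missingness landed where intended. The sketch below uses the standard PROC MEANS statistics N and NMISS; with a 30% masking rate over 500 rows, NMISS for 'biomarker_b' should be roughly 150, and the target should have no missing values.

```sas
/* Count observed and missing values for the masked predictor and the target */
proc means data=clinical_trial n nmiss mean std;
   var biomarker_b treatment_response;
run;
```

If NMISS for 'treatment_response' is not zero, the response was computed after 'biomarker_b' was set to missing, and the DATA step should be revisited.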
/* Load the local data set into CAS as the in-memory table 'clinical_trial' */
PROC CASUTIL;
   load DATA=clinical_trial casout='clinical_trial' replace;
RUN;
QUIT;
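The load step only works inside an active CAS session, which the example does not show. As a minimal sketch, assuming a session already exists (or is started first with a CAS statement such as `cas mysess;`, where the session name is hypothetical), the CONTENTS statement of PROC CASUTIL can confirm that the table reached the active caslib:

```sas
/* Assumes an active CAS session and that 'clinical_trial' was loaded into the active caslib */
proc casutil;
   contents casdata='clinical_trial';   /* table and column metadata for the in-memory table */
quit;
```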
/* Fit the BART model, treating missing predictor values as a separate category */
PROC CAS;
   LOADACTIONSET 'bart';
   bart.bartGauss /
      TABLE={name='clinical_trial'},
      target='treatment_response',
      inputs={'age', 'drug_dosage', 'biomarker_a', 'biomarker_b'},
      missing='SEPARATE',
      nTree=50,    /* number of trees in the ensemble */
      nBI=500,     /* burn-in iterations */
      nMC=2000,    /* MCMC iterations after burn-in */
      seed=789,
      outputTables={names={'VarImp', 'MissingInfo'}};
RUN;
QUIT;
The action should complete without errors. The 'MissingInfo' output table should show that 'biomarker_b' had missing values and that they were handled with the 'SEPARATE' method. The 'VarImp' table should list 'biomarker_b' among the predictors, confirming that it was not dropped from the model. Together these results show that the action can fit a predictive model on incomplete data.
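One way to check these expectations is to print the two output tables. This is only a sketch: it assumes the action saved them as CAS tables named 'VarImp' and 'MissingInfo' in the active caslib, as requested through outputTables, and it uses the general-purpose table.fetch action rather than anything specific to the bart action set.

```sas
proc cas;
   /* Assumption: both tables exist in the active caslib under these names */
   table.fetch / table={name='VarImp'};        /* biomarker_b should appear among the predictors */
   table.fetch / table={name='MissingInfo'};   /* should report the missing values in biomarker_b */
run;
quit;
```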