neuralNet annScore

Scoring Patient Records with Missing Data and Pre-existing Diagnoses

Test Scenario & Use Case

Business Context

A healthcare provider needs to run a diagnostic prediction model on a batch of new patient records. The dataset is known to be 'dirty': some records have missing test results (input variables), while others are for returning patients who already have a manually entered diagnosis (target variable). The goal is to score only the patients without a diagnosis, while gracefully handling those with incomplete data.
About the Action Set: neuralNet

Training of classical artificial neural networks.

Data Preparation

Creates a `patient_records` table with four records. Patient P01 has a pre-existing diagnosis. P02 is a new patient with complete data. P03 is a new patient with a missing `BloodPressure` value. P04 is a new patient with a missing `Cholesterol` value. The model `diagnostic_model` is assumed to exist.

DATA mycas.patient_records(promote=yes);
   LENGTH Diagnosis $20.;
   INPUT PatientID Age BloodPressure Cholesterol Diagnosis $;
   DATALINES;
1 65 140 200 Type2Diabetes
2 45 120 190 .
3 55 . 220 .
4 70 160 . .
;
RUN;
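Before scoring, it can help to confirm that the table loaded with the missing values where they are expected. The following is a minimal sketch, assuming an active CAS session in which `patient_records` is visible in the active caslib. It uses only `table.fetch` with a `where` filter; the filter expressions are illustrative and are not part of the original scenario.

```sas
proc cas;
   /* Rows with at least one missing numeric input.
      Based on the DATALINES above, these are PatientID 3 and 4. */
   table.fetch /
      table={name='patient_records',
             where='BloodPressure = . or Cholesterol = .'};

   /* Rows with no diagnosis recorded, i.e. the patients to be scored.
      Based on the DATALINES above, these are PatientID 2, 3 and 4. */
   table.fetch /
      table={name='patient_records',
             where="Diagnosis = ' '"};
run;
quit;
```

If either fetch returns a different set of rows, the input data does not match the scenario and the expected results further down will not apply.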

Implementation Steps

1
Run `annScore` with `impute=true`. With this option, patients who already have a value in the target variable `Diagnosis` should keep that value as their prediction instead of being re-scored.
PROC CAS;
   neuralNet.annScore /
      TABLE={name='patient_records'},
      modelTable={name='diagnostic_model'},
      casOut={name='diagnostic_results', replace=true},
      copyVars={'PatientID'},
      impute=true;
RUN;
QUIT;
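The expected result for this scenario requires that the action complete without errors. One way to check this programmatically is to capture the action's status, as in the sketch below. It repeats the same call with CASL's `result=` and `status=` options; the variable names `r` and `rc` and the printed messages are illustrative assumptions, not part of the original scenario.

```sas
proc cas;
   /* Same call as above, capturing the results and the status dictionary */
   neuralNet.annScore result=r status=rc /
      table={name='patient_records'},
      modelTable={name='diagnostic_model'},
      casOut={name='diagnostic_results', replace=true},
      copyVars={'PatientID'},
      impute=true;

   /* A severity of 0 indicates normal completion; higher values
      indicate a warning or an error */
   if (rc.severity == 0) then
      print "annScore completed without errors.";
   else
      print "annScore reported a problem: " || rc.status;
run;
quit;
```

Checking the status explicitly makes a failed scoring run visible in a batch log, rather than leaving it to surface later as a missing output table.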
2
Fetch and examine the results to confirm the imputation logic. Patient P01's prediction should match their original diagnosis. Patients P02, P03, and P04 should have new predictions, even though P03 and P04 each had a missing input value.
PROC CAS;
   TABLE.fetch / TABLE='diagnostic_results';
RUN;
QUIT;
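Rows in a CAS table are not guaranteed to come back in insertion order, so comparing predictions patient by patient is easier with a sorted fetch. The following is a small sketch, assuming the output table contains the `PatientID` column copied by `copyVars`. The `to=10` row limit is an arbitrary illustrative value.

```sas
proc cas;
   /* Fetch the scored rows ordered by patient so each prediction
      can be compared against the source record */
   table.fetch /
      table={name='diagnostic_results'},
      sortBy={{name='PatientID', order='ascending'}},
      to=10;
run;
quit;
```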

Expected Result


The output table `diagnostic_results` will be created (referenced as `mycas.diagnostic_results` through the `mycas` libref). For PatientID=1, the predicted value column `_NN_PredName_` should be 'Type2Diabetes', matching the source data. For PatientID=2, PatientID=3, and PatientID=4, the `_NN_PredName_` column should contain a model-generated prediction. The action should complete without errors, demonstrating that the model's internal imputation handled the missing `BloodPressure` for P03 and the missing `Cholesterol` for P04.