Edge Case: Scoring Code with Custom Model ID and Missing Values

Business Context

A financial institution maintains multiple versions of a credit risk model. To avoid confusion in its model inventory, each generated scoring program needs a specific, human-readable model identifier. This scenario tests the `modelId` parameter and checks that the process is robust to missing data in the training set.

About the Action Set: neuralNet

Training of classical artificial neural networks.

Data Preparation

Create a credit application dataset with applicant information. Intentionally set `Years_At_Address` to missing (`.`) for roughly 15% of rows to test model robustness.

DATA casuser.credit_applications;
    /* Fix the random seed so the dataset is reproducible */
    call streaminit(789);
    DO i = 1 to 1000;
        Income = 30000 + rand('UNIFORM') * 150000;
        Loan_Amount = 5000 + rand('UNIFORM') * 50000;
        /* About 15% of rows get a missing Years_At_Address */
        IF rand('UNIFORM') > 0.85 THEN Years_At_Address = . ;
        ELSE Years_At_Address = rand('INTEGER', 0, 25);
        /* Binary target with a 20% default rate */
        Default_Flag = rand('BERNOULLI', 0.2);
        OUTPUT;
    END;
RUN;
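
Before training, it can help to confirm how many rows actually carry a missing `Years_At_Address`. The following is a minimal sketch, assuming the `casuser` libref used in the step above is assigned in the session. With the 0.85 threshold, roughly 15% of the 1,000 rows are expected to be missing.

```sas
/* Report the count of non-missing (N) and missing (NMISS) values */
PROC MEANS DATA=casuser.credit_applications N NMISS;
    VAR Years_At_Address;
RUN;
```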

Implementation Steps

Train a neural network on the `credit_applications` table, requesting mean imputation so that rows with a missing `Years_At_Address` still contribute to training. Store the model in `credit_risk_model_v2`.

PROC CAS;
    neuralNet.annTrain /
        table={name='credit_applications'},
        inputs={{name='Income'}, {name='Loan_Amount'}, {name='Years_At_Address'}},
        /* Treat the 0/1 target as a class label rather than an interval value */
        nominals={{name='Default_Flag'}},
        target='Default_Flag',
        /* Replace missing input values with the variable mean */
        missing='MEAN',
        casOut={name='credit_risk_model_v2', replace=true};
RUN;

Use `annCode` to generate scoring code, setting the `modelId` parameter to a specific version string: 'CreditRisk_NN_v2_Q42025'.

PROC CAS;
    neuralNet.annCode /
        modelTable={name='credit_risk_model_v2'},
        modelId='CreditRisk_NN_v2_Q42025',
        code={casOut={name='credit_risk_scoring_code_v2', replace=true}};
RUN;

Expected Result

The `annCode` action should successfully generate scoring code. An inspection of the generated DATA step code must show that the output prediction variable is named using the custom ID (e.g., `P_CreditRisk_NN_v2_Q420251`). This confirms that the `modelId` parameter correctly customizes the output code for model management and versioning purposes, even when the training data contained missing values.
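
One way to carry out that inspection is to fetch a few rows of the generated code table and look for the custom identifier in the printed output. The following is a minimal sketch using the `table.fetch` action. The column layout of the code table is not described here, so the sketch fetches all columns rather than assuming their names, and the row limit of 5 is an arbitrary choice.

```sas
PROC CAS;
    /* Print the first rows of the scoring-code table written by annCode */
    table.fetch /
        table={name='credit_risk_scoring_code_v2'},
        to=5;
RUN;
```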

See the technical documentation for annCode