mlTools crossValidate

Rare Disease Diagnosis with SVM and Specific Target Event

Test Scenario & Use Case

Business Context

A medical research facility is testing a Support Vector Machine (SVM) classifier for a rare disease. They need to validate the model using a low number of folds due to small sample size and must specify the exact target event ('Positive') to ensure correct sensitivity analysis, while keeping logs minimal.
Data Preparation

Creation of a small, specific medical dataset with a text-based target variable.

DATA casuser.rare_disease;
    INPUT patient_id biomarker_a biomarker_b diagnosis $;
    DATALINES;
1 0.5 1.2 Negative
2 0.8 1.1 Positive
3 0.2 0.9 Negative
4 0.9 1.5 Positive
5 0.4 1.0 Negative
6 0.6 1.1 Negative
7 0.9 1.4 Positive
8 0.3 0.8 Negative
;
RUN;
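With only eight patients and kFolds=2, each fold holds four rows, so it is worth confirming the class counts before validating. The following is a minimal sketch; it assumes that `casuser` is already assigned as a CAS libref in the session (for example with a `LIBNAME casuser CAS` statement), which the scenario does not show.

```sas
/* Count observations per diagnosis level.                      */
/* Assumption: the casuser libref points to the CAS session.    */
PROC FREQ DATA=casuser.rare_disease;
    TABLES diagnosis;
RUN;
```

For the data above, the frequency table should list 5 Negative and 3 Positive observations.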

Implementation Steps

1
Ensure data is loaded and promoted for access.
PROC CAS;
    /* A slash separates the action name from its parameters */
    table.promote / name="rare_disease" caslib="casuser";
QUIT;
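To check that the promotion took effect, the table's metadata can be listed with the `table.tableInfo` action. This is an optional verification sketch and not part of the original scenario; it assumes the table lives in the `casuser` caslib as in the step above.

```sas
PROC CAS;
    /* List metadata for the table; the result set should show  */
    /* one row for rare_disease if it is loaded in casuser.      */
    table.tableInfo / caslib="casuser" name="rare_disease";
QUIT;
```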
2
Execute crossValidate with 'SVM', minimal logging (logLevel=0), specific target event, and kFolds=2 (minimum allowed).
PROC CAS;
    mlTools.crossValidate /
        table={name="rare_disease"}
        modelType="SVM"
        kFolds=2
        logLevel=0
        targetEvent="Positive"
        casOut={name="cv_svm_diagnosis", replace=true}
        trainOptions={
            target="diagnosis",
            inputs={"biomarker_a", "biomarker_b"},
            nominals={"diagnosis"}
        };
QUIT;

Expected Result


The action executes silently (no detailed logs) due to logLevel=0. It successfully performs a 2-fold cross-validation using the SVM algorithm. The 'Positive' value is correctly used as the event of interest for calculating statistics like misclassification rate.
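Because logLevel=0 suppresses the detailed log, the output table is the main place to confirm what the action produced. A minimal sketch for inspecting it follows; it assumes `cv_svm_diagnosis` was written to the session's active caslib, and it makes no assumption about the table's column layout, which this scenario does not describe.

```sas
PROC CAS;
    /* Print the rows of the cross-validation output table */
    table.fetch / table={name="cv_svm_diagnosis"};
QUIT;
```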