boolRule brTrain

Edge Case: Strict Filtering for Rare Events

Test Scenario & Use Case

Business Context

In medical research, we want to identify specific symptom patterns for a rare disease. Rules must NOT be created from statistically insignificant coincidences, so this scenario tests the 'minSupports' and 'gPositive' filters.
About the Set: boolRule

Extraction of Boolean rules for classification.

Data Preparation

Create data in which each term appears very infrequently (in fewer documents than the minSupports threshold set in the training step).

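/* Assumes an active CAS session with the casuser libref already assigned. */
/* Each symptom is recorded for only 2 patients. */
DATA casuser.rare_symptoms;
    INPUT patient_id symptom_id;
    DATALINES;
1 99
2 99
3 50
4 50
;
RUN;

/* One diagnosis label per patient. */
DATA casuser.diagnosis;
    INPUT patient_id RESULT $;
    DATALINES;
1 Positive
2 Positive
3 Negative
4 Negative
;
RUN;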

Implementation Steps

1. Attempt to train rules with strict support requirements (minSupports=5)
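PROC CAS;
    /* Strict filters: a term must appear in at least 5 documents (minSupports) */
    /* and reach a g-score of 20 (gPositive) before it can be used in a rule. */
    boolRule.brTrain /
        TABLE={name='rare_symptoms'}
        docId='patient_id'
        termId='symptom_id'
        docInfo={TABLE={name='diagnosis'}, id='patient_id', targets={'result'}}
        minSupports=5
        gPositive=20
        casOut={name='strict_rules', replace=true};
RUN;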
2. Verify that the output table is empty or contains no rules
PROC CAS;
    /* Count the rows in the rules table; 0 rows means no rule passed the filters. */
    SIMPLE.numRows / TABLE={name='strict_rules'};
RUN;

Expected Result


The action runs without error, but the resulting 'strict_rules' table should be empty (0 rows): each symptom appears for only 2 patients, which is below the minSupports threshold of 5, so no term qualifies for rule creation. This confirms that the filtering logic works.
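
The premise that no symptom reaches the support threshold can be checked independently of brTrain. The sketch below is not part of the original scenario: it assumes the same CAS session and active caslib as the steps above, and uses the freq action of the simple action set to count how often each symptom_id occurs in 'rare_symptoms'.

```sas
PROC CAS;
    /* Sanity check (an addition, not part of the original test): */
    /* tabulate how many rows, i.e. patients, carry each symptom_id. */
    SIMPLE.freq /
        TABLE={name='rare_symptoms'}
        inputs={'symptom_id'};
RUN;
```

If the data loaded as intended, each symptom_id should show a frequency of 2, well under minSupports=5. A higher count would mean the input data does not match the scenario, and an empty 'strict_rules' table could no longer be attributed to the support filter alone.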