The brScore action scores text data based on a set of Boolean rules. These rules are typically generated by the brTrain action and are provided in an input table. The action processes an input document-term table and produces an output table indicating which documents satisfy which rules.
| Parameter | Description |
|---|---|
| casOut | Specifies the output data table that contains the rule-matching results, that is, whether a document satisfies a rule. |
| docId | Specifies the variable in the TABLE= data table that contains the document ID. Default is '_document_'. |
| nThreads | Specifies the number of threads to be used per node. The value must be an integer between 0 and 1024. Default is 0. |
| ruleTerms | Specifies a data table that contains the terms in each rule that is generated by the training action. This is a required parameter. |
| table | Specifies the input data table for rule scoring. This is a required parameter. |
| termId | Specifies the variable in the TABLE= data table that contains the term ID. Default is '_termnum_'. |
| useOldNames | Specifies whether to use the old variable names used in HPBOOLRULE. Aliases are legacyName and legacyNames. Default is FALSE. |

This example first creates two CAS tables. The `rule_terms` table defines two Boolean rules: rule 1 is `term1 AND term2`, and rule 2 is `term3`. The `doc_terms` table represents three documents and the terms they contain. The code assumes that the `mycas` libref is already assigned to a CAS caslib.

    /* Document-term pairs: one row per (document, term) */
    DATA mycas.doc_terms;
        INFILE DATALINES delimiter=',';
        INPUT docid termid;
        DATALINES;
    1,1
    1,2
    2,1
    3,3
    ;
    RUN;

    /* Rule-term pairs: one row per (rule, term) */
    DATA mycas.rule_terms;
        INFILE DATALINES delimiter=',';
        INPUT ruleid termid;
        DATALINES;
    1,1
    1,2
    2,3
    ;
    RUN;

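
As a sanity check on the toy data, the expected matches can be recomputed by hand before scoring. The following is a minimal sketch, not a description of how brScore works internally. It assumes that every rule is a plain AND of its listed terms (no negated terms), as the description of rule 1 suggests, and that the `mycas` libref from the DATA steps above is assigned. PROC SQL typically reads the rows back to the SAS client, which is acceptable only for toy data like this.

```sas
/* Sketch only: recompute the expected (document, rule) matches.   */
/* Assumption: a rule is satisfied when the document contains      */
/* every term listed for that rule.                                */
PROC SQL;
    SELECT m.docid, m.ruleid
    FROM (SELECT d.docid, r.ruleid,
                 COUNT(DISTINCT r.termid) AS n_matched
          FROM mycas.doc_terms AS d, mycas.rule_terms AS r
          WHERE d.termid = r.termid
          GROUP BY d.docid, r.ruleid) AS m,
         (SELECT ruleid, COUNT(DISTINCT termid) AS n_terms
          FROM mycas.rule_terms
          GROUP BY ruleid) AS t
    WHERE m.ruleid = t.ruleid
      AND m.n_matched = t.n_terms   /* all rule terms were found */
    ORDER BY m.docid, m.ruleid;
QUIT;
```

Document 1 holds both terms of rule 1 and document 3 holds the single term of rule 2, while document 2 holds only one of the two terms of rule 1. Under the stated assumption, the query should therefore list the pairs (1, 1) and (3, 2).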
This example scores the `doc_terms` table against the rules defined in `rule_terms` and stores the results in the `scored_docs` table.

    PROC CAS;
        /* Load the action set before calling its actions */
        loadactionset 'boolRule';
        /* The slash separates the action name from its parameters */
        boolRule.brScore /
            table={name='doc_terms'},
            ruleTerms={name='rule_terms'},
            docId='docid',
            termId='termid',
            casOut={name='scored_docs', replace=true};
    RUN;

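
To see what the scoring call produced, the output table can be listed with the `table.fetch` action. This is a small sketch that assumes `scored_docs` was written to the active caslib of the current session. The row limit of 20 is an arbitrary choice for illustration.

```sas
PROC CAS;
    /* Sketch: list the first rows of the scoring output.           */
    /* Assumes scored_docs lives in the session's active caslib.    */
    table.fetch /
        table={name='scored_docs'},
        to=20;
RUN;
```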
This example names the caslib explicitly for every table. It also sets the `useOldNames` parameter, which changes the output column names to a legacy format (`_BR_Rule_1_`, `_BR_Rule_2_`, etc.). The output is saved to a different table, `scored_docs_legacy`, with a descriptive label.

    PROC CAS;
        /* Load the action set before calling its actions */
        loadactionset 'boolRule';
        boolRule.brScore /
            table={name='doc_terms', caslib='mycas'},
            ruleTerms={name='rule_terms', caslib='mycas'},
            docId='docid',
            termId='termid',
            useOldNames=true,
            casOut={name='scored_docs_legacy', caslib='mycas', replace=true, label='Scoring with Legacy Names'};
    RUN;

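
One way to confirm how `useOldNames` affects the output is to compare the column metadata of the two result tables. The sketch below assumes that both scoring examples have already run, that `scored_docs` is in the active caslib, and that a caslib named `mycas` exists as in the example above.

```sas
PROC CAS;
    /* Sketch: print column names and types for both outputs */
    table.columnInfo / table={name='scored_docs'};
    table.columnInfo / table={name='scored_docs_legacy', caslib='mycas'};
RUN;
```

Comparing the two listings side by side shows which column names downstream code has to expect in each mode.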

A telecom company receives thousands of support tickets daily. They want to automatically tag tickets related to 'Billing Disputes' and 'Technical Outages' to route them to the ...

A hospital network needs to screen 100,000 patient history records overnight to identify potential candidates for a clinical trial based on specific comorbidity patterns defined...

An insurance firm is migrating from an older SAS version (HPBOOLRULE) to Viya. Their data schema uses non-standard column names, and the downstream reporting systems expect the ...