boolRule

brScore

Description

The brScore action scores text data based on a set of Boolean rules. These rules are typically generated by the brTrain action and are provided in an input table. The action processes an input document-term table and produces an output table indicating which documents satisfy which rules.

Syntax

boolRule.brScore {
   casOut={caslib="string", compress=TRUE|FALSE, indexVars={"variable-name-1" <, "variable-name-2", ...>}, label="string", lifetime=64-bit-integer, maxMemSize=64-bit-integer, memoryFormat="DVR"|"INHERIT"|"STANDARD", name="table-name", promote=TRUE|FALSE, replace=TRUE|FALSE, replication=integer, tableRedistUpPolicy="DEFER"|"NOREDIST"|"REBALANCE", threadBlockSize=64-bit-integer, timeStamp="string", where={"string-1" <, "string-2", ...>}},
   docId="variable-name",
   nThreads=integer,
   ruleTerms={caslib="string", computedOnDemand=TRUE|FALSE, computedVars={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, computedVarsProgram="string", dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>}, groupBy={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, groupByMode="NOSORT"|"REDISTRIBUTE", importOptions={fileType="ANY"|"AUDIO"|"AUTO"|"BASESAS"|"CSV"|"DELIMITED"|"DOCUMENT"|"DTA"|"ESP"|"EXCEL"|"FMT"|"HDAT"|"IMAGE"|"JMP"|"LASR"|"PARQUET"|"SOUND"|"SPSS"|"VIDEO"|"XLS", fileType-specific-parameters}, name="table-name", orderBy={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, singlePass=TRUE|FALSE, vars={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression", whereTable={casLib="string", dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}, importOptions={fileType="ANY"|"AUDIO"|"AUTO"|"BASESAS"|"CSV"|"DELIMITED"|"DOCUMENT"|"DTA"|"ESP"|"EXCEL"|"FMT"|"HDAT"|"IMAGE"|"JMP"|"LASR"|"PARQUET"|"SOUND"|"SPSS"|"VIDEO"|"XLS", fileType-specific-parameters}, name="table-name", vars={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression"}},
   table={caslib="string", computedOnDemand=TRUE|FALSE, computedVars={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, computedVarsProgram="string", dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>}, groupBy={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, groupByMode="NOSORT"|"REDISTRIBUTE", importOptions={fileType="ANY"|"AUDIO"|"AUTO"|"BASESAS"|"CSV"|"DELIMITED"|"DOCUMENT"|"DTA"|"ESP"|"EXCEL"|"FMT"|"HDAT"|"IMAGE"|"JMP"|"LASR"|"PARQUET"|"SOUND"|"SPSS"|"VIDEO"|"XLS", fileType-specific-parameters}, name="table-name", orderBy={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, singlePass=TRUE|FALSE, vars={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression", whereTable={casLib="string", dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}, importOptions={fileType="ANY"|"AUDIO"|"AUTO"|"BASESAS"|"CSV"|"DELIMITED"|"DOCUMENT"|"DTA"|"ESP"|"EXCEL"|"FMT"|"HDAT"|"IMAGE"|"JMP"|"LASR"|"PARQUET"|"SOUND"|"SPSS"|"VIDEO"|"XLS", fileType-specific-parameters}, name="table-name", vars={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression"}},
   termId="variable-name",
   useOldNames=TRUE|FALSE };
Settings
casOut: Specifies a data table to contain the rule-matching results (whether a document satisfies each rule).
docId: Specifies the variable in the TABLE= data table that contains the document ID. Default is '_document_'. A call that relies on the default names is sketched after this list.
nThreads: Specifies the number of threads to be used per node. The value must be an integer between 0 and 1024. Default is 0.
ruleTerms: Specifies a data table that contains the terms in each rule that is generated by the training action. This is a required parameter.
table: Specifies the input data table for rule scoring. This is a required parameter.
termId: Specifies the variable in the TABLE= data table that contains the term ID. Default is '_termnum_'.
useOldNames: Specifies whether to use the old variable names used in HPBOOLRULE. Aliases are legacyName and legacyNames. Default is FALSE.
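
If the document-term table already uses the default column names _document_ and _termnum_, the docId= and termId= parameters can be omitted. The sketch below is illustrative only: the table name `doc_terms_default` and its column names are hypothetical, and the rule table is assumed to have the same layout as in the data preparation example further down.

PROC CAS;
   /* Hypothetical table whose columns are already named _document_ and
      _termnum_, so docId= and termId= are left at their defaults. */
   boolRule.brScore /
      table={name='doc_terms_default'},
      ruleTerms={name='rule_terms'},
      casOut={name='scored_docs_defaults', replace=true};
RUN;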
Data Preparation
Data Creation

This example first creates two CAS tables. The `rule_terms` table defines two Boolean rules. Rule 1 is 'term1 AND term2'. Rule 2 is 'term3'. The `doc_terms` table represents three documents and the terms they contain.

DATA mycas.doc_terms;
   INFILE DATALINES delimiter=',';
   INPUT docid termid;
   DATALINES;
1,1
1,2
2,1
3,3
;

DATA mycas.rule_terms;
   INFILE DATALINES delimiter=',';
   INPUT ruleid termid;
   DATALINES;
1,1
1,2
2,3
;
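
Before scoring, you can optionally preview both tables to confirm that they loaded as expected. This minimal check assumes the `mycas` libref points at your active caslib, so the tables are addressable by name:

PROC CAS;
   /* Optional check: preview the document-term and rule-term tables */
   table.fetch / table={name='doc_terms'};
   table.fetch / table={name='rule_terms'};
RUN;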

Examples

This example scores the `doc_terms` table against the rules defined in `rule_terms` and stores the results in the `scored_docs` table.

SAS® / CAS Code
PROC CAS;
   boolRule.brScore /
      table={name='doc_terms'},
      ruleTerms={name='rule_terms'},
      docId='docid',
      termId='termid',
      casOut={name='scored_docs', replace=true};
RUN;
Result:
The `scored_docs` output table will contain columns for each rule (_Rule_1_, _Rule_2_), with a value of 1 if the document matches the rule and 0 otherwise. Document 1 matches Rule 1. Document 3 matches Rule 2. Document 2 matches neither.
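
To inspect the scoring output directly, a quick table.fetch on the casOut table works. This is a minimal sketch; the column names cited above are as described on this page and are not re-derived here:

PROC CAS;
   /* Preview the rule-matching results produced by brScore */
   table.fetch / table={name='scored_docs'};
RUN;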

This example demonstrates scoring with explicit caslib references on each table. It uses the `useOldNames` parameter, which changes the output column names to a legacy format (_BR_Rule_1_, _BR_Rule_2_, and so on). The output is written to an explicitly specified caslib rather than the session default.

SAS® / CAS Code
PROC CAS;
   boolRule.brScore /
      table={name='doc_terms', caslib='mycas'},
      ruleTerms={name='rule_terms', caslib='mycas'},
      docId='docid',
      termId='termid',
      useOldNames=true,
      casOut={name='scored_docs_legacy', caslib='mycas', replace=true, label='Scoring with Legacy Names'};
RUN;
Result:
The output table `scored_docs_legacy` will be created in the `mycas` caslib. The columns representing rule matches will be named _BR_Rule_1_ and _BR_Rule_2_ instead of the default _Rule_1_ and _Rule_2_. The matching logic remains the same: Document 1 matches Rule 1, and Document 3 matches Rule 2.
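
To verify that the legacy column names were applied, table.columnInfo lists the columns of the output table. This is a minimal sketch, assuming the table was created in the `mycas` caslib as above:

PROC CAS;
   /* List the columns of the legacy-named output table */
   table.columnInfo / table={name='scored_docs_legacy', caslib='mycas'};
RUN;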

Associated Scenarios

Use Case
Standard Ticket Classification

A telecom company receives thousands of support tickets daily. They want to automatically tag tickets related to 'Billing Disputes' and 'Technical Outages' to route them to the ...

Use Case
High Volume Medical Record Screening

A hospital network needs to screen 100,000 patient history records overnight to identify potential candidates for a clinical trial based on specific comorbidity patterns defined...

Use Case
Legacy System Migration with Custom Mappings

An insurance firm is migrating from an older SAS version (HPBOOLRULE) to Viya. Their data schema uses non-standard column names, and the downstream reporting systems expect the ...