Builds a concept model from linguistic rules defined in a CAS table. This action compiles LITI (Language Interpretation for Textual Information) rules into a binary format, which other text analytics actions can then use for tasks such as concept extraction, categorization, and sentiment analysis. Predefined entities and sentiment models can be included to enrich the custom model.
| Parameter | Description |
|---|---|
| casOut | Specifies the output CAS table to store the compiled concept model binary. This table is used as input by other actions like `tpParse` and `tmMine`. |
| config | Specifies the name of the variable in the input table that contains the concept rule definitions (LITI rules). |
| enablePredefined | When set to TRUE, includes predefined entities (like nlpPerson, nlpLocation) from the specified language's linguistic binaries in the compiled model. |
| language | Specifies the language of the linguistic binaries to use for compiling the rules. Default is 'ENGLISH'. |
| predefinedSentiment | When set to TRUE, the action extends the predefined sentiment model for the specified language with the custom rules. |
| ruleId | Specifies the name of the variable in the input table that contains the unique identifier for each rule. |
| table | Specifies the input CAS table that contains the concept rule definitions to be compiled. |
| tokenizer | Specifies the tokenizer to use. 'STANDARD' uses a language-specific tokenizer. 'BASIC' uses a simple tokenizer based on whitespace and punctuation, which is useful for Chinese, Japanese, and Korean. |
To use the `compileConcept` action, you first need a CAS table containing your concept rules. This table must have at least two columns: one for the rule ID and one for the rule definition (LITI syntax). The following code creates a simple example of such a table.
```sas
/* Each data line holds a rule ID and a LITI rule definition, separated by '|'. */
DATA mycas.concept_rules;
   LENGTH ruleid $ 50 config $ 32767;
   INFILE DATALINES delimiter='|';
   INPUT ruleid $ config $;
   DATALINES;
my_company_concept|CONCEPT_RULE:(C_CONCEPT){SAS}
my_product_concept|CONCEPT_RULE:(C_CONCEPT){Viya}
;
RUN;
```
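Before compiling, it can help to confirm that both columns were read as intended. The following is a minimal sketch using the `table.fetch` action; it assumes the `concept_rules` table created above is available in a caslib named `mycas`, and the row limit of 10 is arbitrary.

```sas
PROC CAS;
   /* Print the first rows of the rule table to check ruleid and config. */
   table.fetch /
      table={caslib="mycas", name="concept_rules"},
      to=10;
RUN;
```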
This example demonstrates the simplest use case: compiling a set of rules from an input table into a binary model stored in an output table.
```sas
PROC CAS;
   /* Load the action set that provides compileConcept. */
   loadactionset "textRuleDevelop";
   textRuleDevelop.compileConcept /
      table={caslib="mycas", name="concept_rules"},   /* input rule table */
      ruleId="ruleid",                                /* rule ID column */
      config="config",                                /* LITI rule definition column */
      casOut={caslib="mycas", name="my_concept_model", replace=true};
RUN;
```
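Once compiled, the model can be scored against documents. The sketch below uses the `applyConcept` action from the `textRuleScore` action set and is illustrative only: the input table `documents`, its columns `docid` and `text`, and the output table names `concept_matches` and `fact_matches` are all hypothetical placeholders, not names from this page.

```sas
PROC CAS;
   loadactionset "textRuleScore";
   textRuleScore.applyConcept /
      table={caslib="mycas", name="documents"},          /* hypothetical document table */
      docId="docid",                                     /* hypothetical document ID column */
      text="text",                                       /* hypothetical text column */
      model={caslib="mycas", name="my_concept_model"},   /* binary written by compileConcept */
      casOut={caslib="mycas", name="concept_matches", replace=true},  /* hypothetical output */
      factOut={caslib="mycas", name="fact_matches", replace=true};    /* hypothetical output */
RUN;
```

With this setup, concept matches would be written to the `casOut` table and fact matches to the `factOut` table.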
This example shows how to compile a concept model for Japanese text. It enables predefined entities to leverage SAS-provided concepts and uses the 'BASIC' tokenizer, which is often more effective for languages like Japanese, Chinese, and Korean.
```sas
PROC CAS;
   /* Load the action set that provides compileConcept. */
   loadactionset "textRuleDevelop";
   textRuleDevelop.compileConcept /
      table={caslib="mycas", name="japanese_rules"},
      ruleId="rule_id_jp",
      config="rule_def_jp",
      language="JAPANESE",
      enablePredefined=true,    /* include SAS-provided predefined entities */
      tokenizer="BASIC",        /* whitespace- and punctuation-based tokenizer */
      casOut={caslib="mycas", name="japanese_concept_model", replace=true};
RUN;
```