High-Volume Product Sentiment Model Extension

Business Context

A global e-commerce company monitors reviews for thousands of products. It needs to extend the standard English sentiment model with thousands of niche-specific slang terms that act as positive or negative indicators (e.g., 'DOA' is negative, 'BIFL' is positive). This test validates compilation performance on a larger rule set.

Data Preparation

Generate a synthetic dataset of 5,000 custom sentiment rules (2,500 positive and 2,500 negative) to test volume handling.


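/* Build 5,000 synthetic rules: 2,500 positive and 2,500 negative. */
/* Assumes the casuser libref is already assigned to a CAS session. */
DATA casuser.retail_sentiment_rules;
   LENGTH id $20 LITI_rule $100;
   DO i=1 TO 2500;
      /* Positive rule */
      id=cats('POS_RULE_', i);
      LITI_rule=cats('CONCEPT:AmazingProduct_', i);
      OUTPUT;
      /* Negative rule */
      id=cats('NEG_RULE_', i);
      LITI_rule=cats('CONCEPT:BrokenItem_', i);
      OUTPUT;
   END;
RUN;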

Implementation Steps

Compile the large rule set while extending the existing SAS sentiment model.


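PROC CAS;
   /* Load the action set in case it is not already available in the session. */
   loadactionset "textRuleDevelop";
   /* Compile the custom rules on top of the predefined sentiment model. */
   textRuleDevelop.compileConcept /
      table={caslib="casuser", name="retail_sentiment_rules"},
      ruleId="id",
      config="LITI_rule",
      predefinedSentiment=TRUE,
      casOut={caslib="casuser", name="retail_large_model", replace=TRUE};
RUN;
QUIT;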

Expected Result

The compilation should complete within a reasonable time frame without memory errors. The output table 'retail_large_model' is created, containing a binary that merges the 5,000 custom rules with the base English sentiment logic.
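
The expected result can be checked with a short follow-up step. The sketch below is not part of the original scenario: it assumes the default `simple` and `table` action sets are available in the session, and the result variable names `n` and `r` are illustrative. It counts the generated rules and confirms that the output table was created.

```sas
PROC CAS;
   /* Count the rows in the generated rule table (5,000 are expected). */
   simple.numRows result=n /
      table={caslib="casuser", name="retail_sentiment_rules"};
   print "Rule rows: " n.numrows;

   /* exists = 0 means the table was not found in the caslib. */
   table.tableExists result=r /
      caslib="casuser", name="retail_large_model";
   if (r.exists == 0) then print "retail_large_model was NOT created";
   else print "retail_large_model exists";
RUN;
QUIT;
```

This only confirms that the tables exist and have the expected size. Memory and elapsed-time behavior still have to be read from the SAS log of the compile step.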

See the technical documentation for compileConcept