Test Scenario & Use Case
Rule-based scoring of text documents.
Create a small dataset of customer reviews and a mock categorization model. The reviews contain typical positive, negative, and questioning language.
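
The steps in this scenario need an active CAS session before any table can be loaded or any action called. A minimal sketch of that setup follows, assuming a CAS server is already configured for the SAS client. The session name `mysess` is a hypothetical choice, and the `mycas` libref is only needed when a DATA step reads or writes CAS tables directly.

```sas
/* Assumption: connection options (host, port) are already configured for this SAS client. */
/* "mysess" is an arbitrary, hypothetical session name. */
cas mysess;

/* Optional: bind a libref to the session's active caslib so DATA steps can address CAS tables. */
libname mycas cas;
```

PROC CASUTIL and PROC CAS use the current session by default, so no further wiring should be required for the steps that follow.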
/* Build the sample reviews in WORK; the PROC CASUTIL step loads them into CAS. */
DATA WORK.customer_reviews;
   LENGTH review_id $ 20 review_text $ 500;
   INFILE DATALINES truncover dsd dlm='|';
   INPUT review_id $ review_text $;
   DATALINES;
PROD_001|The battery life on this new phone is amazing! I highly recommend it.
PROD_002|I'm very disappointed. The item arrived broken and the packaging was damaged.
PROD_003|Can you tell me if this product is compatible with model X? I can't find the information.
PROD_004|Excellent service and fast delivery. Five stars!
;
RUN;

/* Placeholder model table. _mco_ is a plain numeric here only so that the step runs; */
/* a model usable by applyCategory is normally a compiled binary, typically produced  */
/* by textRuleDevelop.compileCategory.                                                */
DATA WORK.feedback_model;
   LENGTH _mco_ 8;
   _mco_ = 112233;
RUN;

/* Load both WORK tables into the active caslib as in-memory CAS tables. */
PROC CASUTIL;
   load DATA=WORK.customer_reviews casout='customer_reviews' replace;
   load DATA=WORK.feedback_model casout='feedback_model' replace;
QUIT;
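
Before scoring, it can help to confirm that both tables actually arrived in CAS. The sketch below assumes the tables were loaded into the session's active caslib under the names used above. It relies only on the standard `table.tableInfo` and `table.fetch` actions.

```sas
proc cas;
   /* List the in-memory tables of the active caslib; */
   /* customer_reviews and feedback_model should both appear. */
   table.tableInfo;

   /* Look at the review rows to check that the pipe-delimited input was parsed as intended. */
   table.fetch / table={name='customer_reviews'}, to=4;
quit;
```

If either table is missing from the listing, the load step is the place to look before calling the scoring action.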

/* Score each review against the category model. */
PROC CAS;
   textRuleScore.applyCategory /
      TABLE={name='customer_reviews'},                       /* input documents */
      docId='review_id',                                     /* unique document identifier */
      text='review_text',                                    /* column holding the text to score */
      model={name='feedback_model'},                         /* category model table */
      casOut={name='review_categories', replace=true},       /* category results */
      matchOut={name='review_matches', replace=true};        /* rule match details */
RUN;
QUIT;
Two tables are created in CAS. 'review_categories' contains the original data with new columns for each category (e.g., 'CAT_Positive', 'CAT_Negative'), with a score indicating the number of rule matches. 'review_matches' provides a detailed log of which specific terms in each review led to a category assignment, allowing for fine-grained analysis of the model's performance.
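
The two output tables can be inspected directly once the action has run. The following is a minimal sketch, assuming the action completed and wrote `review_categories` and `review_matches` to the active caslib. The exact column names depend on the model, so `table.columnInfo` is used to list them rather than assuming them.

```sas
proc cas;
   /* Show the columns that the scoring action actually produced. */
   table.columnInfo / table={name='review_categories'};

   /* Category results for the reviews. */
   table.fetch / table={name='review_categories'}, to=10;

   /* Match details behind each category assignment. */
   table.fetch / table={name='review_matches'}, to=20;
quit;
```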