Test Scenario & Use Case
Rule-based scoring of text documents.
Create a dataset with challenging data, including a completely empty text field, a null value, and text with XML tags. A mock weighted model is also created.

    /* Input feed; FEED_3 has an empty text field */
    DATA mycas.risk_feed;
       LENGTH feed_id $ 10 content $ 1024;
       INFILE DATALINES truncover dsd dlm='|';
       INPUT feed_id $ content $;
       DATALINES;
    FEED_1|Breaking News:
    FEED_2|Market volatility is a major concern. Analysts predict a downturn.
    FEED_3|
    FEED_4|This is a neutral report on quarterly earnings.
    FEED_5|Rumors of
    ;
    RUN;

    /* Mock weighted model table */
    DATA mycas.risk_model_weighted;
       LENGTH _mco_ long;
       _mco_ = 778899;
    RUN;

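A quick sanity check on the input can be useful before scoring. The sketch below is an illustration rather than part of the original scenario; it assumes the `table` and `simple` action sets are available in the session and that the `mycas` libref points at the active caslib.

```sas
PROC CAS;
   /* List the columns of the input table with their types and lengths */
   table.columnInfo / table={name='risk_feed'};

   /* Per-column distinct and missing counts; the NMiss column shows how
      many values of 'content' CAS treats as missing */
   simple.distinct / table={name='risk_feed'}, inputs={'feed_id', 'content'};
RUN;
QUIT;
```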

    /* Step 1: weighted scoring on plain-text input */
    PROC CAS;
       textRuleScore.applyCategory /
          TABLE={name='risk_feed'},
          docId='feed_id',
          text='content',
          docType='TEXT',
          model={name='risk_model_weighted'},
          scoringAlgorithm='WEIGHTED',
          casOut={name='risk_results_weighted', replace=true},
          groupedMatchOut={name='risk_grouped_matches', replace=true},
          matchDelimiter=' || ';
    RUN;
    QUIT;

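To inspect what the call above produced, the two output tables can be fetched directly. This is a minimal sketch using the standard `table.fetch` action; the row limit of 10 is an arbitrary choice for illustration.

```sas
PROC CAS;
   /* First rows of the category output written by casOut= */
   table.fetch / table={name='risk_results_weighted'}, to=10;

   /* First rows of the grouped matches, where matched terms are
      joined with the ' || ' delimiter set by matchDelimiter= */
   table.fetch / table={name='risk_grouped_matches'}, to=10;
RUN;
QUIT;
```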

    /* Step 2: same input scored with docType='XML' */
    PROC CAS;
       textRuleScore.applyCategory /
          TABLE={name='risk_feed'},
          docId='feed_id',
          text='content',
          docType='XML',
          model={name='risk_model_weighted'},
          casOut={name='risk_results_xml', replace=true};
    RUN;
    QUIT;

For step 1, the action runs successfully. The 'risk_results_weighted' table contains weighted scores, and the empty/null records (FEED_3) are processed without error and yield zero scores. The 'risk_grouped_matches' table uses ' || ' to separate matched terms.

For step 2, the action should ideally write a warning or error to the log indicating that the content is not valid XML, which demonstrates the action's input validation. Depending on the error handling logic, the output table 'risk_results_xml' may be empty or contain partial results.
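Because the outcome of step 2 is uncertain, it can help to capture the action's status rather than rely on reading the log. The sketch below is one possible way to do that, not the scenario's own method: it uses the CASL `status=` option, and the variable names `r`, `rc`, `t`, and `n` are placeholders chosen for illustration. Note that `table.tableExists` will also report a table left over from an earlier run.

```sas
PROC CAS;
   /* Rerun step 2, capturing results in r and the return status in rc */
   textRuleScore.applyCategory result=r status=rc /
      table={name='risk_feed'},
      docId='feed_id',
      text='content',
      docType='XML',
      model={name='risk_model_weighted'},
      casOut={name='risk_results_xml', replace=true};

   /* severity 0 means no warning or error was reported */
   print "Severity: " rc.severity;
   if rc.severity > 0 then
      print "Reason: " rc.reason " Status: " rc.status;

   /* Check whether an output table exists, then count its rows */
   table.tableExists result=t / name='risk_results_xml';
   if t.exists > 0 then do;
      simple.numRows result=n / table={name='risk_results_xml'};
      print "Rows in risk_results_xml: " n.numrows;
   end;
RUN;
QUIT;
```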